fix bug in online_softmax when all loaded values in a warp are -inf #64

Open
LitLeo wants to merge 3 commits into Dao-AILab:main from LitLeo:main
Conversation

@LitLeo LitLeo commented Jan 8, 2026

If max_x == -inf, then x - max_x == nan, so exp_x == nan, which produces an incorrect result.

@cute.jit
def online_softmax_reduce(
    x: cute.TensorSSA,
    threads_per_row: cutlass.Constexpr[int],
    reduction_buffer: Optional[cute.Tensor] = None,
    mbar_ptr: Optional[cute.Pointer] = None,
    hook_fn: Optional[Callable] = None,
    phase: Optional[Int32] = None,
    return_exp_x: bool = False,
) -> [Float32, Float32, Optional[cute.TensorSSA]]:
    assert x.dtype == Float32, "x must be of type Float32"
    """reduction_buffer must have shape (num_warps / warps_per_row, (warps_per_row, cluster_n), 2)"""
    max_x = cute.arch.warp_reduction(
        x.reduce(cute.ReductionOp.MAX, init_val=-Float32.inf, reduction_profile=0),
        cute.arch.fmax,
        threads_in_group=min(threads_per_row, cute.arch.WARP_SIZE),
    )
    log2_e = math.log2(math.e)
    exp_x = cute.math.exp2(x * log2_e - (max_x * log2_e), fastmath=True)
    sum_exp_x = cute.arch.warp_reduction(
        exp_x.reduce(cute.ReductionOp.ADD, init_val=0.0, reduction_profile=0),
        operator.add,
        threads_in_group=min(threads_per_row, cute.arch.WARP_SIZE),
    )
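The failure mode can be reproduced in plain Python (a minimal sketch, not the CuTe DSL): when every value in a row is -inf, the row max is also -inf, and `x - max_x` evaluates to `(-inf) - (-inf)`, which is NaN, so the subsequent exp is NaN as well.

```python
import math

# All values in the (warp's slice of the) row are -inf.
x = [float("-inf")] * 4
max_x = max(x)                     # -inf

diff = x[0] - max_x                # (-inf) - (-inf) -> nan
exp_x = math.exp(diff)             # exp(nan) -> nan

assert math.isnan(diff)
assert math.isnan(exp_x)
```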


tridao commented Jan 11, 2026

I think it's better to have the check:

max_x_cur = Float32(0.0) if max_x == -Float32.inf else max_x

Then subtract max_x_cur (instead of max_x) before the exp.
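In plain Python, the suggested guard looks like this (an illustrative sketch, not the CuTe implementation): subtracting 0.0 instead of -inf makes every term `exp(-inf - 0.0) == 0.0` rather than `exp(nan) == nan`, while leaving ordinary inputs unchanged.

```python
import math

def online_softmax_guarded(x):
    """Sketch of the guarded reduction; names are illustrative."""
    max_x = max(x)
    # Guard: if the whole row is -inf, subtract 0.0 instead of -inf.
    max_x_cur = 0.0 if max_x == float("-inf") else max_x
    sum_exp_x = sum(math.exp(v - max_x_cur) for v in x)
    return max_x, sum_exp_x

# All -inf: sum of exponentials is exactly 0.0, not NaN.
m, s = online_softmax_guarded([float("-inf")] * 4)
assert s == 0.0

# Ordinary inputs are unaffected by the guard.
m2, s2 = online_softmax_guarded([1.0, 2.0])
assert abs(s2 - (math.exp(-1.0) + 1.0)) < 1e-12
```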


LitLeo commented Jan 12, 2026

> I think it's better to have the check:
>
> `max_x_cur = Float32(0.0) if max_x == -Float32.inf else max_x`
>
> Then subtract max_x_cur before the exp.

if max_x == -Float32.inf:
    max_x = Float32(0.0)

is OK, but I noticed that row_reduce has the same issue.
Here is a more general approach I found. What do you think of this solution?

if const_expr(not is_even_N):
    # utils.fill_oob(tXsX, tXpX, -tXsX.element_type.inf)
    utils.fill_oob(tXsX, tXpX, -BFloat16(2**15))
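The idea behind the finite fill can be sketched in plain Python (values are illustrative): padding out-of-bounds lanes with a large finite negative like -2**15 keeps the subtraction finite, so exp() simply underflows to 0.0, whereas a fully -inf-padded warp would hit `(-inf) - (-inf) == nan`.

```python
import math

real = [1.0, 2.5]
padded = [-float(2**15)] * 2        # OOB lanes filled with a finite sentinel
x = real + padded

m = max(x)                          # 2.5, finite
exps = [math.exp(v - m) for v in x] # padded lanes underflow to exactly 0.0
assert exps[2] == 0.0 and exps[3] == 0.0

# By contrast, an all -inf fill yields nan for a fully-padded warp:
assert math.isnan(math.exp(float("-inf") - float("-inf")))
```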


tridao commented Jan 12, 2026

I prefer changing max_x_cur. For the case where the input is entirely -inf (not out of bounds), we still want the output to be zero.

Remove handling for max_x being -inf to avoid NaN in exp_x.

LitLeo commented Jan 19, 2026

> I prefer changing max_x_cur. For the case where the input is entirely -inf (not out of bounds), we still want the output to be zero.

Done.


tridao commented Jan 19, 2026

Thanks, could you add a test case where the whole vector is -inf?


LitLeo commented Jan 20, 2026

> Thanks, could you add a test case where the whole vector is -inf?

A test case where the entire vector is -inf is meaningless, since the loss for such a vector would be NaN. This bug is triggered when N=150000, so I've added a case with that N to test_cross_entropy.
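This point can be illustrated in plain Python (a sketch, not the repo's test code): even with the max_x_cur guard, a row that is entirely -inf has a sum of exponentials of exactly 0, so the softmax probabilities would be 0/0 and any cross-entropy loss over that row is NaN by construction.

```python
import math

x = [float("-inf")] * 4
max_x = max(x)
max_x_cur = 0.0 if max_x == float("-inf") else max_x   # guarded, as in the fix
sum_exp = sum(math.exp(v - max_x_cur) for v in x)      # each term is exp(-inf) == 0.0

# The denominator is 0, so the probabilities are 0/0: undefined,
# which is why a fully -inf row cannot have a meaningful loss.
assert sum_exp == 0.0
```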
