Potential Early Computation Issue in compute_q_matmul_k Function #1

Thank you for your excellent work on HLS.
I've noticed a potential issue in the compute_q_matmul_k function in attention.cpp. During the initial stages of the computation, many elements of q_blocks appear to be used in calculations before they have been fully read in, which could make the computed results inaccurate. Could you please explain the rationale behind this approach?
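
To make the concern concrete, here is a minimal software sketch of the pattern I mean. It is not the code from attention.cpp: the function names, the one-element-per-iteration arrival model, and the zero-initialized buffer are all assumptions chosen for illustration. The first function evaluates a dot product over the whole buffer in every iteration, so the early results are built from slots that have not been read yet. The second waits until the block is complete.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model, NOT the code from attention.cpp.
// Assumption: one element of q arrives per loop iteration, and a dot
// product over the whole buffer is evaluated in every iteration.
std::vector<int> dot_every_iteration(const std::vector<int>& q_stream,
                                     const std::vector<int>& k) {
    assert(q_stream.size() >= k.size());
    const std::size_t n = k.size();
    std::vector<int> q_block(n, 0);  // assumption: unread slots hold 0
    std::vector<int> results;
    for (std::size_t i = 0; i < n; ++i) {
        q_block[i] = q_stream[i];    // element i becomes available now
        int acc = 0;
        for (std::size_t j = 0; j < n; ++j) {
            // For j > i this uses a slot that has not been read in yet.
            acc += q_block[j] * k[j];
        }
        results.push_back(acc);      // only the last entry uses a full block
    }
    return results;
}

// Guarded variant: read the whole block first, then compute once.
int dot_after_full_read(const std::vector<int>& q_stream,
                        const std::vector<int>& k) {
    assert(q_stream.size() >= k.size());
    const std::size_t n = k.size();
    std::vector<int> q_block(n, 0);
    for (std::size_t i = 0; i < n; ++i) {
        q_block[i] = q_stream[i];
    }
    int acc = 0;
    for (std::size_t j = 0; j < n; ++j) {
        acc += q_block[j] * k[j];
    }
    return acc;
}
```

In this model only the final entry returned by dot_every_iteration matches dot_after_full_read. If the early partial results in compute_q_matmul_k are intentionally discarded or overwritten later (for example as pipeline warm-up), that would explain the design, and it would help to have that confirmed.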
