Enhancement - Fix hipblaslt-gemm result parsing for hipBLASLt v1500+ output format #791

Open
polarG wants to merge 1 commit into main from dev/hongtaozhang/fix-hipblaslt-parse-in-new-version

polarG (Contributor) commented Mar 20, 2026

Description
The hipblaslt-gemm benchmark result parser fails with MICROBENCHMARK_RESULT_PARSING_FAILURE (return code 33) when running against hipBLASLt v1500+. The benchmark kernels execute successfully and produce valid TFLOPS data, but SuperBench cannot parse the results into structured metrics.

Root cause: The parser hardcodes len(fields) != 23 to validate the output CSV, but hipBLASLt v1500 outputs 34 columns — it added a_type, b_type, c_type, d_type, scaleA, scaleB, scaleC, scaleD, amaxD, bias_type, aux_type, and hipblaslt-GB/s. This causes two bugs:

  1. The field count check rejects every result line as invalid.
  2. Even if it passed, fields[-2] would return hipblaslt-GB/s instead of hipblaslt-Gflops.
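The two failure modes can be illustrated with a minimal sketch. The headers and rows below are shortened and made up for illustration (the real output has 23 or 34 columns); the names `m`, `n`, `k`, `us`, the helper `parse_positional`, and all values are assumptions, not actual hipBLASLt output or SuperBench code.

```python
# Hypothetical, shortened CSV layouts; real hipBLASLt output has many more columns.
OLD_HEADER = ['m', 'n', 'k', 'hipblaslt-Gflops', 'us']
NEW_HEADER = ['m', 'n', 'k', 'a_type', 'hipblaslt-Gflops', 'hipblaslt-GB/s', 'us']

OLD_ROW = ['4096', '4096', '4096', '100.5', '12.0']
NEW_ROW = ['4096', '4096', '4096', 'f16_r', '100.5', '55.5', '12.0']

# Stands in for the hardcoded 23 in the original parser.
EXPECTED_COLUMNS = len(OLD_HEADER)


def parse_positional(fields):
    """Mimic the old approach: fixed column count plus a fixed index from the end."""
    if len(fields) != EXPECTED_COLUMNS:
        return None  # bug 1: every new-format row is rejected here
    return float(fields[-2])


old_value = parse_positional(OLD_ROW)   # Gflops, as intended
new_value = parse_positional(NEW_ROW)   # None: rejected by the count check
wrong_value = float(NEW_ROW[-2])        # bug 2: index -2 is now the GB/s column
```

In this toy layout, relaxing the count check alone would not be enough: index -2 silently yields the bandwidth figure rather than the Gflops figure.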

Fix
Replace the hardcoded field-count check and positional index with header-based column lookup:

  • Parse the CSV header line to dynamically find the column index of hipblaslt-Gflops.
  • Validate that the data line has the same number of columns as the header (instead of a magic number).
  • Use the discovered column index to extract the Gflops value.

This approach is:

  • Backward compatible — works with the old 23-column format (hipBLASLt v600).
  • Forward compatible — will handle any future column additions as long as hipblaslt-Gflops remains in the header.
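The three steps above can be sketched roughly as follows. This is not the code in the PR: `extract_gflops` is a hypothetical helper, the sample headers are shortened placeholders, and the `[0]:` prefix on the first header field is an assumption inferred from the prefix-stripping line quoted in the review.

```python
def extract_gflops(header_line, data_line, target='hipblaslt-Gflops'):
    """Return the Gflops value from data_line, using header_line to find the column.

    Returns None when the header lacks the target column or when the data
    row does not have the same number of columns as the header.
    """
    header_fields = [name.strip() for name in header_line.split(',')]
    # Drop a leading "[N]:" style prefix from the first header field, if present.
    header_fields[0] = header_fields[0].split(']')[-1].lstrip(':')
    if target not in header_fields:
        return None
    gflops_col = header_fields.index(target)

    fields = data_line.split(',')
    if len(fields) != len(header_fields):
        return None
    return float(fields[gflops_col])


# Shortened, made-up stand-ins for the old and new formats.
old_header = '[0]:m,n,k,hipblaslt-Gflops,us'
old_row = '4096,4096,4096,100.5,12.0'
new_header = '[0]:m,n,k,a_type,hipblaslt-Gflops,hipblaslt-GB/s,us'
new_row = '4096,4096,4096,f16_r,100.5,55.5,12.0'

old_gflops = extract_gflops(old_header, old_row)
new_gflops = extract_gflops(new_header, new_row)
```

Because the column index comes from the header, both layouts resolve to the same metric, and a row whose width disagrees with its header is rejected rather than misread.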

@polarG polarG requested a review from a team as a code owner March 20, 2026 17:52
Copilot AI review requested due to automatic review settings March 20, 2026 17:52
Copilot AI left a comment

Pull request overview

Updates hipblaslt-gemm result parsing to support hipBLASLt v1500+ CSV output by using header-based column lookup instead of hardcoded field counts/positions.

Changes:

  • Parse the CSV header to dynamically locate the hipblaslt-Gflops column and validate row width against the header.
  • Add a unit test covering the new v1500+ (34-column) output format.
  • Clarify parsing intent via inline comments.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| tests/benchmarks/micro_benchmarks/test_hipblaslt_function.py | Adds a new test case for v1500+ output and annotates the existing old-format test. |
| superbench/benchmarks/micro_benchmarks/hipblaslt_function.py | Switches parsing to header-based hipblaslt-Gflops extraction and header/data column-count validation. |

Comment on lines +140 to +141
self.assertEqual(2, len(benchmark.result))
self.assertEqual(678.209, benchmark.result['fp16_1_4096_4096_4096_flops'][0])
Copilot AI Mar 20, 2026

self.assertEqual(2, len(benchmark.result)) is a brittle assertion because it couples the test to the total number of emitted metrics rather than the behavior under test (correct Gflops extraction). Prefer asserting that the expected key exists (and optionally that no parsing error occurred) without pinning the total metric count. Also, comparing floats with assertEqual can be flaky due to floating-point representation; use assertAlmostEqual(..., places=...) for the numeric check.

Suggested change
- self.assertEqual(2, len(benchmark.result))
- self.assertEqual(678.209, benchmark.result['fp16_1_4096_4096_4096_flops'][0])
+ self.assertIn('fp16_1_4096_4096_4096_flops', benchmark.result)
+ self.assertAlmostEqual(678.209, benchmark.result['fp16_1_4096_4096_4096_flops'][0], places=3)
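As a self-contained illustration of the suggested assertion style, the sketch below uses a plain dictionary as a stand-in for `benchmark.result`. The `other_metric` key and the small offset added to the value are invented for the example; only `assertIn` and `assertAlmostEqual` from the standard `unittest` module are relied on.

```python
import unittest


class MetricAssertionSketch(unittest.TestCase):
    def test_metric_present_and_close(self):
        key = 'fp16_1_4096_4096_4096_flops'
        # Hypothetical parsed result: one extra metric, and a value that differs
        # from the expected one by a tiny floating-point amount.
        result = {'other_metric': [0], key: [678.209 + 1e-9]}

        # Does not depend on how many metrics were emitted in total.
        self.assertIn(key, result)
        # Compares after rounding the difference to 3 decimal places.
        self.assertAlmostEqual(678.209, result[key][0], places=3)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(MetricAssertionSketch)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```

With `assertEqual`, the same value would fail the comparison even though it matches to every reported digit.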

header_fields[0] = header_fields[0].split(']')[-1].lstrip(':')
gflops_col = None
for col_idx, col_name in enumerate(header_fields):
    if 'hipblaslt-Gflops' in col_name:
Copilot AI Mar 20, 2026

The column lookup uses substring matching (in), which can accidentally match unexpected column names (e.g., a future column like hipblaslt-Gflops-peak). This should be an exact match after normalizing whitespace (e.g., col_name.strip() == 'hipblaslt-Gflops') to ensure the parser reliably targets the intended metric.

Suggested change
- if 'hipblaslt-Gflops' in col_name:
+ if col_name.strip() == 'hipblaslt-Gflops':
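A small sketch of the difference between the two matching strategies. The column `hipblaslt-Gflops-peak` is the reviewer's hypothetical example, not a real hipBLASLt column, and the header list is shortened for illustration.

```python
# 'hipblaslt-Gflops-peak' is an invented column used only to show the collision.
header_fields = ['m', 'hipblaslt-Gflops-peak', ' hipblaslt-Gflops ', 'us']

# Substring matching picks up both columns.
substring_hits = [
    idx for idx, name in enumerate(header_fields) if 'hipblaslt-Gflops' in name
]

# Exact matching after stripping whitespace picks up only the intended one.
exact_hits = [
    idx for idx, name in enumerate(header_fields) if name.strip() == 'hipblaslt-Gflops'
]
```

With substring matching, which column ends up selected would depend on column order and on whether the loop stops at the first hit; the exact comparison removes that ambiguity.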

Comment on lines +123 to +128
# This is needed because hipBLASLt output format varies across versions:
# - v600 (old): 23 columns, Gflops at index -2
# - v1500 (new): 34 columns, added a_type/b_type/c_type/scaleA-D/amaxD/
# bias_type/aux_type/GB_s columns, Gflops at index -3
# Using header-based lookup ensures compatibility with both formats
# and any future column additions.
Copilot AI Mar 20, 2026

Since the implementation no longer relies on positional indices, the comment stating exact positions (e.g., 'Gflops at index -2/-3') can become misleading as formats evolve. Consider removing the index claims and focusing the comment on the key guarantee (header-driven lookup + header/data width validation). Also, the comment mentions GB_s but the sample header uses hipblaslt-GB/s.

Suggested change
- # This is needed because hipBLASLt output format varies across versions:
- # - v600 (old): 23 columns, Gflops at index -2
- # - v1500 (new): 34 columns, added a_type/b_type/c_type/scaleA-D/amaxD/
- # bias_type/aux_type/GB_s columns, Gflops at index -3
- # Using header-based lookup ensures compatibility with both formats
- # and any future column additions.
+ # This is needed because the hipBLASLt output format varies across versions:
+ # - v600 (old): 23 columns.
+ # - v1500 (new): 34 columns, adding a_type/b_type/c_type/scaleA-D/amaxD/
+ # bias_type/aux_type/hipblaslt-GB/s columns.
+ # Using header-based lookup (plus header/data width validation) ensures
+ # compatibility across existing formats and resilience to future changes.

codecov bot commented Mar 20, 2026

Codecov Report

❌ Patch coverage is 81.81818% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 85.71%. Comparing base (6b8e810) to head (622592c).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| .../benchmarks/micro_benchmarks/hipblaslt_function.py | 81.81% | 2 Missing ⚠️ |

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #791   +/-   ##
=======================================
  Coverage   85.70%   85.71%           
=======================================
  Files         102      102           
  Lines        7703     7712    +9     
=======================================
+ Hits         6602     6610    +8     
- Misses       1101     1102    +1     

| Flag | Coverage Δ |
| --- | --- |
| cpu-python3.10-unit-test | 70.98% <81.81%> (+0.02%) ⬆️ |
| cpu-python3.12-unit-test | 70.98% <81.81%> (+0.02%) ⬆️ |
| cpu-python3.7-unit-test | 70.45% <81.81%> (+0.02%) ⬆️ |
| cuda-unit-test | 83.60% <81.81%> (+<0.01%) ⬆️ |
