Claude sandbox: speedup-fast-float#11

Merged
perazz merged 4 commits into main from claude/speedup-fast-float on Mar 27, 2026

Conversation

perazz (Owner) commented Mar 27, 2026

Commits

fd7c82a perf: add PGO build system for benchmark binary
c1b2b44 perf: optimize benchmark and ensure fair C comparison
80e2327 perf: optimize fast_float hot path for parsing performance
079e6c2 chore: update infrastructure, docs, CI, licenses, and doxygen config

Changed files

 .github/workflows/ci.yml                           |   90 +-
 .github/workflows/deploy-docs.yml                  |   86 +-
 .gitignore                                         |   18 +-
 LICENSE-APACHE                                     |  382 +-
 LICENSE-BOOST                                      |   46 +-
 LICENSE-MIT                                        |   56 +-
 README.md                                          |  298 +-
 app/test_accuracy.f90                              |  728 +--
 benchmark/benchmark_compare.f90                    |  840 +--
 benchmark/ffc.h                                    | 6484 ++++++++++----------
 benchmark/ffc_benchmark.c                          |   70 +-
 benchmark/ffc_c_bridge.f90                         |  150 +-
 benchmark/ffc_impl.c                               |    4 +-
 doc/mainpage.md                                    |  268 +-
 fpm.toml                                           |   86 +-
 project/doxygen/Doxyfile                           |  596 +-
 project/doxygen/doxygen-awesome-darkmode-toggle.js |  314 +-
 project/doxygen/doxygen-awesome.css                | 5362 ++++++++--------
 project/doxygen/header.html                        |  168 +-
 run_benchmarks.sh                                  |   89 +-
 src/fast_float_module.F90                          | 5385 ++++++++--------
 test/check.f90                                     | 1048 ++--
 22 files changed, 11338 insertions(+), 11230 deletions(-)

perazz added 4 commits March 27, 2026 20:24
- Remove default initializers from hot-path types (parsed_number,
  adjusted_mantissa, u128, stackvec) to eliminate the re-initialization
  that the Fortran standard requires for default-initialized
  intent(out) arguments on every call
- Remove dead try_fast_path call that always returned valid=false
  for PRESET_GENERAL (FMT_SKIP_WS flag set)
- Inline is_space check in whitespace skip loop
- Compute JSON flag inline, eliminating local variable
- Add explicit safety initializations for frac_start, frac_len,
  and too_many_digits on paths that skipped them
- Add parse_double_batch subroutine for batch parsing API
- Pre-compute integer offset arrays (istart/iend) to avoid c_size_t
  conversion overhead in the hot benchmark loop
- Add batch C benchmark function (benchmark_ffc_lines) to eliminate
  per-line Fortran-to-C call overhead
- Enable FMT_SKIP_WS in C ffc benchmark to match Fortran's
  PRESET_GENERAL for fair comparison
- Add ffc_parse_double_ws with whitespace-skip options
- Update Fortran bridge to use whitespace-skip C entry point
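
The batch entry point and precomputed offsets described above can be sketched in C roughly as follows. This is an illustration only, not the code from this PR: `parse_one_ws` and `parse_lines_batch` are hypothetical names, `strtod` stands in for the real ffc parser, and the offsets are assumed to be 0-based and half-open. The point is that the caller crosses the language boundary once per buffer instead of once per line, and that the whitespace check is a plain inline comparison.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a whitespace-skipping parse entry point.
   Parses the field [first, last) and returns 1 on success, 0 otherwise. */
static int parse_one_ws(const char *first, const char *last, double *out)
{
    char tmp[64];
    char *end;
    size_t n;

    *out = 0.0;

    /* Inline whitespace skip: a comparison per character, no function call.
       Covers space and the range '\t'..'\r' (tab, LF, VT, FF, CR). */
    while (first < last && (*first == ' ' || (*first >= '\t' && *first <= '\r')))
        ++first;
    if (first == last)
        return 0;

    /* strtod needs a NUL-terminated string, so copy the field.
       (Assumption: fields are shorter than 64 bytes in this sketch.) */
    n = (size_t)(last - first);
    if (n >= sizeof tmp)
        return 0;
    memcpy(tmp, first, n);
    tmp[n] = '\0';

    *out = strtod(tmp, &end);
    return end != tmp;
}

/* Parse n fields of one shared buffer in a single call.
   istart[i] and iend[i] are precomputed plain-int offsets, so the hot
   loop does no per-line size conversions. Returns the number of fields
   parsed successfully; failed fields are stored as 0.0. */
size_t parse_lines_batch(const char *buf, const int *istart, const int *iend,
                         size_t n, double *out)
{
    size_t ok = 0;
    size_t i;

    for (i = 0; i < n; ++i)
        ok += (size_t)parse_one_ws(buf + istart[i], buf + iend[i], &out[i]);
    return ok;
}
```

A caller would fill `istart` and `iend` once while splitting the input into lines, then invoke `parse_lines_batch` a single time per timing run, which keeps per-line call overhead out of the measurement.
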

Replace fpm-based benchmark build with manual compilation using
Profile-Guided Optimization (PGO). This bypasses fpm's forced -fPIC
flag and enables proper PGO profile matching across build phases.

Build process: instrument → collect profile on uniform + file data →
rebuild with -fprofile-use. Combined with LTO and aggressive inlining
flags, this achieves ~30% speedup over the fpm release build.

perazz force-pushed the claude/speedup-fast-float branch from fd7c82a to 4d7f170 on March 27, 2026 19:42
perazz merged commit 71180cb into main on Mar 27, 2026
18 checks passed