Refactor: thread-safe ErrorHint for GPU parallelism #1278
Draft
Route all ErrorHint warning calls through modState%errorstate (thread-safe) instead of module-level SAVE variables. This is a prerequisite for multi-grid parallelism via Rayon or GPU offloading.

Changes:
- Add optional modState parameter to AerodynamicResistance, SurfaceResistance, and psyc_const; pass through to ErrorHint calls
- Fix 4 ErrorHint calls in sat_vap_press_x/sat_vap_pressIce that had modState in scope but were not passing it
- Update all callers in suews_ctrl_driver to pass modState
- Remove supy_warning_count and supy_last_warning_message SAVE variables
- Remove module-level warning fallback in ErrorHint
- Convert 2 RSLProfile add_supy_warning calls to modState%errorstate%report
- Retain add_supy_warning as no-op stub for 10 call sites without modState

Remaining SAVE: supy_error_flag/code/message (fatal path only; acceptable because fatal errors terminate the run).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
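The motivation for moving warnings off module-level state can be illustrated outside Fortran. The following is a minimal Python sketch, not the SUEWS code: `ErrorState`, `ModState`, `surface_resistance`, and `run_grid` are hypothetical stand-ins for modState%errorstate and a physics routine that accepts it as an optional argument. Each grid owns its own state, so concurrent grids never write to a shared counter.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ErrorState:
    """Hypothetical per-grid warning log (stand-in for modState%errorstate)."""
    messages: List[str] = field(default_factory=list)

    def report(self, message: str) -> None:
        self.messages.append(message)


@dataclass
class ModState:
    """Hypothetical per-grid model state that carries its own error state."""
    errorstate: ErrorState = field(default_factory=ErrorState)


def surface_resistance(value: float, mod_state: Optional[ModState] = None) -> float:
    """Toy routine: clamp a negative input and warn via the caller's state."""
    if value < 0.0:
        if mod_state is not None:
            mod_state.errorstate.report(f"negative input {value} clamped to 0")
        value = 0.0
    return value


def run_grid(grid_id: int) -> ModState:
    """Run one toy grid with a private ModState."""
    state = ModState()
    # Grid n triggers n warnings, so per-grid counts are easy to check.
    for _ in range(grid_id):
        surface_resistance(-1.0, state)
    return state


# Four grids run on four threads; no state is shared between them.
with ThreadPoolExecutor(max_workers=4) as pool:
    states = list(pool.map(run_grid, range(4)))

warning_counts = [len(s.errorstate.messages) for s in states]
```

Because the state travels with the call, the optional argument mirrors the stub behaviour described above: callers that do not pass a state still get the clamped value, only without a recorded warning.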
CI Build Plan
Changed Files:
- Fortran source (7 files)
- Rust bridge (4 files)
- Python source (2 files)
Updated by CI on each push. See path-filters.yml for category definitions.
Eliminate per-grid overhead in run_suews_rust_multi:
- Serialise config dict once, patch sites[] per grid (no deep copy)
- Prepare forcing block once (shared across all grids)
- Use json.dumps instead of yaml.dump (~30x faster serialisation; valid JSON parses as valid YAML via serde_yaml)

Benchmark (20 grids x 576 timesteps):
  Before: 5.75s (0.287s/grid, 2005 grid-timesteps/s)
  After: 3.69s (0.184s/grid, 3126 grid-timesteps/s)

Also add scripts/profile_multi_grid.py for profiling.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
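One way to read "serialise once, patch sites[] per grid" is sketched below. This is an illustration under assumptions, not the code in this PR: the function name `per_grid_payloads`, the placeholder-token approach, and the example keys (`model`, `tstep`, `gridiv`) are all hypothetical. Only the top-level `sites` list and the use of `json.dumps` come from the commit message.

```python
import json
import uuid
from typing import Any, Dict, List


def per_grid_payloads(config: Dict[str, Any]) -> List[str]:
    """Return one JSON document per site, serialising the shared part once."""
    # A random token stands in for the sites list inside the template.
    token = f"__sites_{uuid.uuid4().hex}__"
    shared = dict(config)  # shallow copy: no deep copy of nested values
    shared["sites"] = token
    template = json.dumps(shared)  # the shared config is serialised once
    placeholder = json.dumps(token)  # the token as it appears in the template
    # Per grid, only the single site is serialised and spliced in.
    return [
        template.replace(placeholder, json.dumps([site]), 1)
        for site in config["sites"]
    ]


# Hypothetical two-grid configuration.
example_config = {
    "model": {"tstep": 300},
    "sites": [{"gridiv": 1}, {"gridiv": 2}],
}
payloads = per_grid_payloads(example_config)
first_grid = json.loads(payloads[0])
```

The per-grid cost then scales with the size of one site rather than the whole configuration, and the original dict is left untouched.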
Add process-based parallelism to run_suews_rust_multi using multiprocessing.Pool with spawn context (safe for Fortran SAVE). Thread serial_mode through run_suews_rust_chunked and _run_supy.

Parallel mode is available but currently has high spawn overhead for short simulations. Best suited for long runs with many grids where per-grid compute time dominates process creation cost.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
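A minimal sketch of the serial/parallel switch described in this commit is shown here. The names `simulate_grid` and `run_grids` are hypothetical, and the toy workload replaces the real per-grid simulation; only the spawn context, `multiprocessing.Pool`, and the `serial_mode` flag are taken from the commit message. The shortcut for fewer than two grids is an added assumption.

```python
import multiprocessing as mp
from typing import List, Optional


def simulate_grid(grid_id: int) -> int:
    """Hypothetical per-grid workload.

    It must be a top-level function so that spawned workers can import it.
    """
    return grid_id * grid_id


def run_grids(
    grid_ids: List[int],
    serial_mode: bool = True,
    processes: Optional[int] = None,
) -> List[int]:
    """Run all grids, serially or in a spawn-context process pool."""
    # Assumption: a pool is not worth its start-up cost for fewer than two grids.
    if serial_mode or len(grid_ids) < 2:
        return [simulate_grid(g) for g in grid_ids]
    # "spawn" starts each worker in a fresh interpreter, so workers do not
    # inherit module-level state from the parent process.
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=processes) as pool:
        return pool.map(simulate_grid, grid_ids)


serial_results = run_grids([1, 2, 3])
```

When calling `run_grids(..., serial_mode=False)` from a script, the call should sit under an `if __name__ == "__main__":` guard, because spawned workers re-import the main module.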
The thread-safe ErrorHint refactor (e560428) routed warnings through modState%errorstate%report(), which appends to a dynamically-growing array. Over a year-long simulation with frequent boundary-condition warnings, this caused unbounded memory growth and allocation overhead, timing out the Windows CI UMEP build.

Cap non-fatal entries at 512; fatal entries always stored.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
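The capping rule can be sketched in a few lines of Python. This is an illustration only: `ErrorLog` and its fields are hypothetical, and the `nonfatal_dropped` counter is an added assumption that the commit message does not mention. The cap of 512 and the "fatal entries always stored" rule come from the commit message.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

MAX_NONFATAL = 512  # cap stated in the commit message


@dataclass
class ErrorLog:
    """Hypothetical bounded log: non-fatal entries are capped, fatal ones are not."""
    entries: List[Tuple[str, bool]] = field(default_factory=list)
    nonfatal_stored: int = 0
    nonfatal_dropped: int = 0  # assumption: count what was discarded

    def report(self, message: str, is_fatal: bool = False) -> None:
        if is_fatal:
            # Fatal entries bypass the cap entirely.
            self.entries.append((message, True))
            return
        if self.nonfatal_stored < MAX_NONFATAL:
            self.entries.append((message, False))
            self.nonfatal_stored += 1
        else:
            # Past the cap, the message is discarded so memory stays bounded.
            self.nonfatal_dropped += 1


log = ErrorLog()
for step in range(1000):
    log.report(f"boundary-condition warning at step {step}")
log.report("fatal: simulation aborted", is_fatal=True)
```

With this rule, memory use for warnings is bounded regardless of simulation length, while the fatal path keeps its full record.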
M1 Max has 10 cores; 4 grids is sufficient for quick iteration.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add run_suews_multi Rust function that uses Rayon par_iter to execute grid cells concurrently in shared memory (no IPC serialisation overhead).

Changes:
- Add rayon dependency to suews_bridge Cargo.toml
- Add run_suews_multi PyO3 function: takes list of config JSONs + shared forcing, returns results from all grids in parallel
- Add -frecursive to gfortran flags (Makefile.gfortran + build.rs) so concurrent Fortran calls each get their own stack frame
- Python auto-detects run_suews_multi and uses it when serial_mode=False

Benchmark (4 grids x 17520 timesteps, full year, M1 Max):
  Serial: 59.0s (14.75s/grid, 7146 grid-timesteps/s)
  Rayon: 33.6s (8.39s/grid, 12560 grid-timesteps/s)
  Speedup: 1.76x

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
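The auto-detection step on the Python side might look like the sketch below. It is a hedged illustration, not the PR's code: `run_all_grids`, the single-grid `run_suews` fallback, the argument order, and the fake bridge objects are all hypothetical. Only the name `run_suews_multi`, its inputs (a list of config JSONs plus shared forcing), and the `serial_mode` flag come from the commit message.

```python
from types import SimpleNamespace
from typing import Any, List


def run_all_grids(
    bridge: Any,
    configs: List[str],
    forcing: Any,
    serial_mode: bool = False,
) -> List[Any]:
    """Use the bridge's multi-grid entry point when present, else loop serially."""
    multi = getattr(bridge, "run_suews_multi", None)
    if not serial_mode and callable(multi):
        return list(multi(configs, forcing))
    # Fallback: call the (hypothetical) single-grid function once per config.
    return [bridge.run_suews(cfg, forcing) for cfg in configs]


# Fake bridges standing in for builds without and with the multi-grid function.
old_bridge = SimpleNamespace(run_suews=lambda cfg, forcing: ("serial", cfg))
new_bridge = SimpleNamespace(
    run_suews=lambda cfg, forcing: ("serial", cfg),
    run_suews_multi=lambda cfgs, forcing: [("multi", c) for c in cfgs],
)

from_old = run_all_grids(old_bridge, ["a", "b"], forcing=None)
from_new = run_all_grids(new_bridge, ["a", "b"], forcing=None)
forced_serial = run_all_grids(new_bridge, ["a", "b"], forcing=None, serial_mode=True)
```

Detecting the function with `getattr` keeps the Python layer working against an older bridge build that lacks the parallel entry point.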
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
- Route all ErrorHint warning calls through modState%errorstate (thread-safe) instead of module-level SAVE variables
- Remove supy_warning_count and supy_last_warning_message SAVE variables from module_ctrl_error_state
- Add optional modState parameter to AerodynamicResistance, SurfaceResistance, and psyc_const; update all callers
- Retain add_supy_warning as no-op stub for 10 call sites that don't yet have modState in scope

Test plan
- make dev builds cleanly
- make test-smoke passes (9/9)
- make test passes (695/695)
- [...] modState

🤖 Generated with Claude Code