
mprotect-based memory boundary checking#272

Open
kateinoigakukun wants to merge 20 commits into main from katei/8698-implement-mprote

@kateinoigakukun
Member

@kateinoigakukun kateinoigakukun commented Jan 17, 2026

Use mprotect-based virtual memory for Wasm linear memory on macOS and Linux.

Instead of bounds-checking every load/store in software, reserve the full 4 GiB address space with mmap(PROT_NONE) and use mprotect to make pages accessible as the memory grows. Out-of-bounds accesses land on pages that are still PROT_NONE (the guard pages) and raise SIGSEGV or SIGBUS, which a registered signal handler converts into a Wasm trap.

This adds Unchecked variants of all memory load/store instructions. The translator emits these when the target memory uses mprotect and the memarg.offset fits within the guard region, skipping the software bounds check entirely.
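As a rough sketch of the translator's decision, the check below shows one way "the offset fits within the guard region" could be expressed. The function name `can_emit_unchecked` and the guard size `GUARD_BYTES` are assumptions made for illustration; the actual guard size and the exact condition used in this PR are not stated in the description.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical guard size (2 GiB); the real value is not given in the PR text. */
#define GUARD_BYTES ((uint64_t)1 << 31)

/*
 * Decide whether a load/store may skip the software bounds check.
 * A 32-bit index is below 2^32, so the last byte touched is at most
 * (2^32 - 1) + offset + access_bytes - 1. If offset + access_bytes is at
 * most GUARD_BYTES, that address stays inside the reservation plus guard,
 * and an out-of-bounds access is caught by the page protection instead.
 */
static bool can_emit_unchecked(bool memory_uses_mprotect,
                               uint64_t memarg_offset,
                               uint32_t access_bytes) {
    if (!memory_uses_mprotect) return false;
    /* access_bytes is at most 16 (v128), so the subtraction cannot wrap. */
    return memarg_offset <= GUARD_BYTES - access_bytes;
}
```

Under these assumptions, an `i64.load` with a small static offset qualifies for the unchecked variant, while one with an offset near `UINT64_MAX` keeps the checked form.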

New EngineConfiguration.memoryBoundsChecking option (.auto / .mprotect / .software). Default is .auto, which uses mprotect on supported platforms and falls back to software checks otherwise. Memory64 always uses software checks.
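The resolution rules above can be modelled as a small pure function. This is only a sketch in C of the logic as described; the real option is a Swift API, and the names `bounds_mode` and `resolve_bounds_mode` are hypothetical. The description does not say what an explicit .mprotect request does on an unsupported platform, so the sketch passes explicit requests through unchanged.

```c
#include <stdbool.h>

/* Hypothetical C mirror of the .auto / .mprotect / .software option. */
typedef enum { MODE_AUTO, MODE_MPROTECT, MODE_SOFTWARE } bounds_mode;

/*
 * Resolve the requested mode to the one actually used for a memory.
 * `platform_supported` stands for "macOS or Linux" in the PR description.
 */
static bounds_mode resolve_bounds_mode(bounds_mode requested,
                                       bool is_memory64,
                                       bool platform_supported) {
    if (is_memory64) return MODE_SOFTWARE;   /* Memory64 always uses software checks */
    if (requested == MODE_AUTO) {
        return platform_supported ? MODE_MPROTECT : MODE_SOFTWARE;
    }
    return requested;   /* explicit choice: passed through (assumption) */
}
```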

Signal Safety

siglongjmp from a signal handler does NOT run Swift destructors (deinit, defer, withValue closures) or ARC release operations on abandoned frames.

The direct-threaded path is safe: wasmkit_tc_start runs the dispatch loop in C trampolines, the memory.withValue lock is released before dispatch begins, and the unchecked instruction handlers (memoryLoadUnchecked/memoryStoreUnchecked) hold no resources. The only Swift scope on the stack is withUnsafeMutablePointer(to: &self) which merely pins a pointer.

The token-threaded path is problematic: runTokenThreadedImpl contains a Swift while true { try doExecute(...) } loop. A fault during an unchecked memory instruction causes siglongjmp to cut through Swift frames, which is undefined behavior in Swift. If the fault occurs mid-withValue on a Mutex-protected resource, the lock is never released.

For this reason, Engine.init throws EngineConfigurationError when token threading (whether requested by the user or selected automatically on platforms without direct threading) is combined with an explicit request for mprotect-based bounds checking.
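The validation rule amounts to rejecting one combination. The sketch below models it in C with hypothetical names (`threading_model`, `requested_bounds`, `validate_engine_config`); the real check lives in the Swift `Engine.init`. It assumes that .auto under token threading is accepted and falls back to software checks, which the description implies but does not state outright.

```c
/* Hypothetical C model of the configuration check; names are illustrative. */
typedef enum { THREADING_DIRECT, THREADING_TOKEN } threading_model;
typedef enum { REQ_AUTO, REQ_MPROTECT, REQ_SOFTWARE } requested_bounds;
typedef enum { CONFIG_OK, CONFIG_ERR_MPROTECT_WITH_TOKEN_THREADING } config_status;

/*
 * `effective` is the threading model after automatic selection, so a
 * platform without direct threading arrives here as THREADING_TOKEN.
 */
static config_status validate_engine_config(threading_model effective,
                                            requested_bounds requested) {
    if (effective == THREADING_TOKEN && requested == REQ_MPROTECT) {
        return CONFIG_ERR_MPROTECT_WITH_TOKEN_THREADING;
    }
    return CONFIG_OK;   /* REQ_AUTO with token threading is accepted (assumption) */
}
```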

@kateinoigakukun kateinoigakukun force-pushed the katei/8698-implement-mprote branch 2 times, most recently from 3e9e64f to 144c97e Compare January 17, 2026 23:05
@MaxDesiatov MaxDesiatov marked this pull request as ready for review February 28, 2026 14:04
@MaxDesiatov MaxDesiatov marked this pull request as draft February 28, 2026 14:31
@MaxDesiatov MaxDesiatov moved this to In Progress in Swift for Wasm Mar 13, 2026
@MaxDesiatov MaxDesiatov marked this pull request as ready for review March 26, 2026 16:41
@MaxDesiatov
Member

Benchmarks:

CoreMark (3 runs each, Iterations/Sec, higher is better)

  ┌─────┬────────┬─────────────────┬───────┐
  │ Run │  main  │ mprotect branch │ delta │
  ├─────┼────────┼─────────────────┼───────┤
  │ 1   │ 3384.9 │ 3358.7          │ -0.8% │
  ├─────┼────────┼─────────────────┼───────┤
  │ 2   │ 3357.8 │ 3362.8          │ +0.1% │
  ├─────┼────────┼─────────────────┼───────┤
  │ 3   │ 3324.1 │ 3344.9          │ +0.6% │
  ├─────┼────────┼─────────────────┼───────┤
  │ avg │ 3355.6 │ 3355.5          │ -0.0% │
  └─────┴────────┴─────────────────┴───────┘

CoreMark is essentially identical between the two branches (well within noise).

libsodium (WishYouWereFast) via hyperfine, mean time in ms, lower is better

  ┌────────────────────────────────┬───────────┬───────────────┬────────┐
  │           Benchmark            │ main (ms) │ mprotect (ms) │ change │
  ├────────────────────────────────┼───────────┼───────────────┼────────┤
  │ aead_chacha20poly1305          │     259.6 │         271.9 │  +4.7% │
  ├────────────────────────────────┼───────────┼───────────────┼────────┤
  │ aead_chacha20poly13052         │     502.1 │         520.6 │  +3.7% │
  ├────────────────────────────────┼───────────┼───────────────┼────────┤
  │ aead_xchacha20poly1305         │     262.5 │         273.3 │  +4.1% │
  ├────────────────────────────────┼───────────┼───────────────┼────────┤
  │ auth                           │      75.3 │          77.3 │  +2.7% │
  ├────────────────────────────────┼───────────┼───────────────┼────────┤
  │ auth2                          │      11.0 │          11.6 │  +5.5% │
  ├────────────────────────────────┼───────────┼───────────────┼────────┤
  │ auth3                          │      14.7 │          15.5 │  +5.4% │
  ├────────────────────────────────┼───────────┼───────────────┼────────┤
  │ auth6                          │      13.1 │          13.5 │  +2.6% │
  ├────────────────────────────────┼───────────┼───────────────┼────────┤
  │ box                            │     806.6 │         799.2 │  -0.9% │
  ├────────────────────────────────┼───────────┼───────────────┼────────┤
  │ box2                           │     802.7 │         802.8 │  +0.0% │
  ├────────────────────────────────┼───────────┼───────────────┼────────┤
  │ box_easy                       │    1570.4 │        1583.7 │  +0.8% │
  ├────────────────────────────────┼───────────┼───────────────┼────────┤
  │ box_seal                       │      N/A* │        4244.8 │     -- │
  ├────────────────────────────────┼───────────┼───────────────┼────────┤
  │ box_seed                       │     141.3 │         140.5 │  -0.5% │
  ├────────────────────────────────┼───────────┼───────────────┼────────┤
  │ chacha20                       │    1266.7 │        1269.3 │  +0.2% │
  ├────────────────────────────────┼───────────┼───────────────┼────────┤
  │ codecs                         │    2144.5 │        2112.7 │  -1.5% │
  ├────────────────────────────────┼───────────┼───────────────┼────────┤
  │ generichash                    │    1107.7 │        1113.2 │  +0.5% │
  ├────────────────────────────────┼───────────┼───────────────┼────────┤
  │ generichash2                   │      97.2 │          96.7 │  -0.5% │
  ├────────────────────────────────┼───────────┼───────────────┼────────┤
  │ generichash3                   │      97.0 │          99.4 │  +2.4% │
  ├────────────────────────────────┼───────────┼───────────────┼────────┤
  │ hash                           │      16.1 │          15.3 │  -4.9% │
  ├────────────────────────────────┼───────────┼───────────────┼────────┤
  │ hash3                          │       9.5 │           8.0 │ -15.4% │
  ├────────────────────────────────┼───────────┼───────────────┼────────┤
  │ kdf                            │      65.3 │          65.1 │  -0.3% │
  ├────────────────────────────────┼───────────┼───────────────┼────────┤
  │ kdf_hkdf                       │    1757.2 │        1771.2 │  +0.8% │
  ├────────────────────────────────┼───────────┼───────────────┼────────┤
  │ keygen                         │      27.7 │          27.6 │  -0.5% │
  ├────────────────────────────────┼───────────┼───────────────┼────────┤
  │ onetimeauth                    │       9.1 │           8.0 │ -11.6% │
  ├────────────────────────────────┼───────────┼───────────────┼────────┤
  │ onetimeauth2                   │       6.7 │           5.5 │ -17.7% │
  ├────────────────────────────────┼───────────┼───────────────┼────────┤
  │ randombytes                    │    4628.4 │        4543.1 │  -1.8% │
  ├────────────────────────────────┼───────────┼───────────────┼────────┤
  │ scalarmult                     │    1455.8 │        1431.3 │  -1.7% │
  ├────────────────────────────────┼───────────┼───────────────┼────────┤
  │ scalarmult2                    │     139.9 │         137.6 │  -1.7% │
  ├────────────────────────────────┼───────────┼───────────────┼────────┤
  │ scalarmult5                    │     491.2 │         397.2 │ -19.1% │
  ├────────────────────────────────┼───────────┼───────────────┼────────┤
  │ scalarmult6                    │     400.5 │         397.9 │  -0.7% │
  ├────────────────────────────────┼───────────┼───────────────┼────────┤
  │ scalarmult7                    │     800.7 │         783.2 │  -2.2% │
  ├────────────────────────────────┼───────────┼───────────────┼────────┤
  │ secretbox                      │      11.8 │          11.3 │  -3.8% │
  ├────────────────────────────────┼───────────┼───────────────┼────────┤
  │ secretbox2                     │      10.0 │           9.8 │  -1.6% │
  ├────────────────────────────────┼───────────┼───────────────┼────────┤
  │ secretbox_easy                 │      28.3 │          27.4 │  -3.2% │
  ├────────────────────────────────┼───────────┼───────────────┼────────┤
  │ secretstream_xchacha20poly1305 │     191.9 │         188.3 │  -1.9% │
  ├────────────────────────────────┼───────────┼───────────────┼────────┤
  │ shorthash                      │      10.6 │           9.7 │  -8.3% │
  ├────────────────────────────────┼───────────┼───────────────┼────────┤
  │ sign2                          │    2369.8 │        2361.5 │  -0.3% │
  ├────────────────────────────────┼───────────┼───────────────┼────────┤
  │ siphashx24                     │      10.8 │          11.4 │  +5.7% │
  ├────────────────────────────────┼───────────┼───────────────┼────────┤
  │ stream3                        │       9.6 │           7.7 │ -19.1% │
  ├────────────────────────────────┼───────────┼───────────────┼────────┤
  │ stream4                        │       8.4 │           9.2 │  +9.3% │
  └────────────────────────────────┴───────────┴───────────────┴────────┘

  * box_seal.wasm hit a transient hyperfine error on main; it runs fine manually.

libsodium: Mixed picture, but overall net-positive:

  • Clear wins (>5% faster): hash3 (-15%), onetimeauth2 (-18%), scalarmult5 (-19%), onetimeauth (-12%), stream3 (-19%), shorthash (-8%). All of these except scalarmult5 (491.2 ms on main) run for roughly 10 ms or less and have high variance, but the trend is consistently favorable.
  • Regressions (3-9% slower): aead_chacha20poly1305 family (+4-5%), auth2/auth3 (+5%), siphashx24 (+6%), stream4 (+9%). Apart from the aead family (260-520 ms), these are short-running (<15ms) with high relative variance.
  • Long-running benchmarks (>100ms, lower noise): Mostly neutral or slightly faster. codecs (-1.5%), scalarmult (-1.7%), randombytes (-1.8%), secretstream (-1.9%), scalarmult7 (-2.2%).

Member

@MaxDesiatov MaxDesiatov left a comment


IMO this is good to merge and it unblocks a lot of future work (shared memory for wasip1-threads, JIT/AOT and so on)

@kateinoigakukun kateinoigakukun force-pushed the katei/8698-implement-mprote branch from e71e628 to f9bd976 Compare March 28, 2026 17:11