libia2: keep TCB TLS page shared across compartments#671
Open
708fd0b to
da78029
ayrtonm
reviewed
Mar 24, 2026
// x18-tagged aarch64 builds do not currently use the x86_64 TCB/%fs ABI path.
// Keep this symbol for cross-arch call-site symmetry.
void ia2_unprotect_thread_pointer_page(void) {}
Contributor
Might be better to remove this no-op since it could easily get missed later on and instead just add #if defined(__x86_64__) around this function's callsite.
ayrtonm
reviewed
Mar 24, 2026
struct ia2_thread_metadata *const thread_metadata = ia2_thread_metadata_get_for_current_thread();
#endif

enum { MAX_SHARED_TLS_PAGES = 2 };
Contributor
nit: I don't think an enum provides any benefit for a single value so maybe make this const int or whatever type to avoid the implicit backing type
added 4 commits
March 31, 2026 13:01
This was encountered while running compartmentalized dav1d with IA2 libc compartmenting enabled.

Observed behavior in dav1d:
- `dav1d --version` succeeds
- actual decode (`dav1d -i /tmp/test.ivf -o /dev/null`) crashes early
- the first crash mode was a SIGSEGV on compiler-generated stack-protector access via `%fs:0x28` while running under a non-shared compartment PKRU during cross-compartment execution

Root cause:
- IA2 retags PT_TLS pages per compartment
- on x86_64, the thread pointer / TCB page is `%fs` ABI state, not compartment-private state
- if this page is retagged as compartment-private, normal function prologue/epilogue stack-canary checks can fault before intended callgate logic

This change:
- adds ia2_unprotect_thread_pointer_page() and declares it in ia2_internal.h
- updates protect_tls_pages() to treat ABI-sensitive TLS pages as carve-outs that are explicitly tagged with pkey 0:
  - ia2_stackptr_0 page (existing shared callgate stack slot)
  - x86_64 TCB page (thread pointer / %fs page)
- keeps the rest of each PT_TLS range tagged with the owning compartment pkey
- invokes ia2_unprotect_thread_pointer_page() after startup compartment setup and after per-thread TLS setup for new threads

After this change, the dav1d decode crash moves forward to a different site (`__tls_get_addr` reading `_rtld_local` in the loader), confirming this commit addresses the `%fs`/TCB failure mode specifically.
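To make the root cause concrete, here is a minimal sketch of the address arithmetic involved: the stack-protector canary is read at `%fs:0x28`, so the page that must stay accessible is the one containing the thread pointer plus that offset. The names `page_base`, `canary_page`, and `CANARY_OFFSET` are hypothetical and are not taken from libia2; the thread pointer is passed in as a plain integer so the sketch does not depend on the target architecture.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offset of the stack-protector canary from %fs, as cited in the commit
 * message (`%fs:0x28`). Hypothetical name, not a libia2 symbol. */
#define CANARY_OFFSET 0x28u

/* Round addr down to the start of its page.
 * page_size must be a nonzero power of two. */
uintptr_t page_base(uintptr_t addr, size_t page_size) {
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return addr & ~((uintptr_t)page_size - 1);
}

/* Hypothetical helper: start of the page holding the canary slot for a
 * given thread pointer value. If this page carries a compartment-private
 * pkey, a canary load under a different PKRU would fault. */
uintptr_t canary_page(uintptr_t thread_pointer, size_t page_size) {
    return page_base(thread_pointer + CANARY_OFFSET, page_size);
}
```

On an x86_64 build with GCC or Clang, one would presumably call this with `(uintptr_t)__builtin_thread_pointer()` and the value of `sysconf(_SC_PAGESIZE)`; how `ia2_unprotect_thread_pointer_page()` actually derives the page is not shown in this conversation.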
da78029 to
ad0606f
Problem
IA2 retags writable memory, including `PT_TLS`, to per-compartment pkeys. On x86_64, that model is too strict for TLS pages that hold process ABI state rather than compartment-private state.

In particular, the thread-pointer/TCB page (`__builtin_thread_pointer()`, `%fs` ABI state) can be accessed in normal execution and transition paths where the current PKRU does not yet match the compartment that originally tagged the page. When that page is left compartment-tagged, this can fault in real workloads (including decode-path crashes where stack-protector canary reads use `%fs`).

Fix
`protect_tls_pages()` now uses an x86_64 split policy:
- […] `PT_TLS` ranges with the target compartment pkey
- […] `ia2_stackptr_0` page (existing call-gate stack slot behavior)
- […] `%fs` ABI state)

Implementation details:
- […] `MAX_SHARED_TLS_PAGES = 2`) and only add carve-outs that overlap the current module TLS range
- […] `pkey 0` during process-init path (`ia2_get_compartment() == 0`)
- […] `pkey 0`; those pages are already shared and repeated retagging from compartment 1 can violate tracer policy
- […] `ia2_start()` that re-applies TCB sharing after compartment setup

Similar Patterns in IA2
- […] `ia2_stackptr_0` in `protect_tls_pages()` (see `runtime/libia2/ia2.c`, `protect_tls_pages`, around lines 466-473)
- `protect_pages()` already uses a "protect-by-default with explicit shared carve-outs" model for writable ranges (see `runtime/libia2/ia2.c`, `protect_pages`, shared-range collection around lines 603-632, and overlap trimming before `ia2_mprotect_with_tag` around lines 685-723)
- […] `runtime/libia2/ia2.c`, explicit `shared_sections` handling around lines 605-632 and `PT_GNU_RELRO` handling around lines 634-645)
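
The split policy described under Fix can be sketched as a pure range-splitting routine: given one module's TLS range and a small list of shared pages, emit sub-ranges tagged either with the owning compartment's pkey or with pkey 0. This is an illustration under stated assumptions, not the code in `runtime/libia2/ia2.c`: `struct segment`, `split_tls_range`, `SHARED_PKEY`, and `MAX_SEGMENTS` are hypothetical names, the shared-page list is assumed to be sorted in ascending order, and the actual retagging call is left out because its signature does not appear in this conversation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants: pkey 0 is the shared key; two carve-outs split a
 * range into at most 2 * 2 + 1 = 5 segments. */
enum { SHARED_PKEY = 0, MAX_SEGMENTS = 5 };

struct segment {
    uintptr_t start; /* inclusive */
    uintptr_t end;   /* exclusive */
    int pkey;
};

/* Split [start, end) around the shared pages listed in `shared`.
 * Assumptions: `shared` is sorted ascending, and `out` has room for
 * 2 * n_shared + 1 entries. Returns the number of segments written. */
size_t split_tls_range(uintptr_t start, uintptr_t end, int owner_pkey,
                       const uintptr_t *shared, size_t n_shared,
                       size_t page_size, struct segment *out) {
    size_t n = 0;
    uintptr_t cursor = start;
    for (size_t i = 0; i < n_shared; i++) {
        uintptr_t lo = shared[i];
        uintptr_t hi = lo + page_size;
        if (hi <= cursor || lo >= end) {
            continue; /* carve-out does not overlap the remaining range */
        }
        /* Clamp the carve-out to the part of the range not yet emitted. */
        if (lo < cursor) lo = cursor;
        if (hi > end) hi = end;
        if (lo >= hi) {
            continue; /* nothing left after clamping */
        }
        if (lo > cursor) {
            out[n++] = (struct segment){cursor, lo, owner_pkey};
        }
        out[n++] = (struct segment){lo, hi, SHARED_PKEY};
        cursor = hi;
    }
    if (cursor < end) {
        out[n++] = (struct segment){cursor, end, owner_pkey};
    }
    return n;
}
```

A caller would then walk the returned segments and apply each pkey to its sub-range. Skipping carve-outs that fall outside the range mirrors the "only add carve-outs that overlap the current module TLS range" point above.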