Conversation
78fb1fe to
6176db1
Compare
There was a problem hiding this comment.
Pull request overview
This PR fixes guest→host tracing breakage introduced by CoW by translating the guest-provided trace buffer pointer (GVA) via a page-table walk (using CR3) before decoding the FlatBuffer trace batch. It also refines guest-side spans around initialization/dispatch to improve trace usability and reduce mismatched span warnings.
Changes:
- Translate trace batch pointers from GVA→GPA on the host (CR3-driven page table walk) before decoding trace FlatBuffers.
- Update the guest↔host tracing ABI to pass trace batch metadata via r12/r13/r14 and adjust host handling accordingly.
- Improve guest tracing spans around
generic_initandinternal_dispatch_functionand enable Debug printing ofFunctionCall.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| src/hyperlight_host/src/sandbox/trace/context.rs | Switch trace batch extraction to read via GVA translation and new from_regs API. |
| src/hyperlight_host/src/sandbox/snapshot.rs | Expose helpers for resolving GPAs and reading page tables from shared/scratch memory. |
| src/hyperlight_host/src/mem/mgr.rs | Add read_guest_memory_by_gva that walks page tables and reads from shared/scratch backing. |
| src/hyperlight_host/src/hypervisor/hyperlight_vm.rs | Pass CR3 into trace handling and adjust when trace handling is attempted. |
| src/hyperlight_guest_tracing/src/state.rs | Send trace batch pointer/len via r12/r13/r14 in the OUT instruction. |
| src/hyperlight_guest_bin/src/lib.rs | Add a generic_init span and flush/close ordering to avoid host span warnings. |
| src/hyperlight_guest_bin/src/guest_function/call.rs | Add an internal_dispatch_function span and trace log the FunctionCall in debug builds. |
| src/hyperlight_guest/src/exit.rs | Update OUT path to attach trace batch metadata via r12/r13/r14. |
| src/hyperlight_common/src/vmem.rs | Update comment noting virt_to_phys is now used for tracing reads. |
| src/hyperlight_common/src/flatbuffer_wrappers/function_call.rs | Derive Debug for FunctionCall to support tracing output. |
6176db1 to
fb1f56a
Compare
The registers changed because I wanted to ensure the registers used couldn't be overwritten when returning from a function (currently, we set the registers in preparation for a |
fb1f56a to
bff35d2
Compare
bff35d2 to
18bbbf3
Compare
|
At a high level, I think it adds some complexity in the long term to do everything in terms of VAs here, even if it requires fewer changes in the guest immediately to not need to deal with making sure the buffer allocation is contiguous (think like how big kernels do DMA-pinned memory). The specific way that the host code accesses the trace buffers looks to me like it will be fairly slow as well. However, perhaps these are not major concerns if we are going to switch to using the same ring buffer transport as all other asyncio soon. (Side comment: we should definitely still be batching these trace records when we do that---descriptor capacity is limited and these records are tiny individually afaiui.)
|
I chose to use VAs because it seemed simpler to access the guest trace buffers that way. My plan is to definitely use the asyncio features once we have them and get rid of the logic handling the buffer info exchanges between the guest and host.
That's correct, we used to also set those registers before
Thanks for clearing this up for me, I got it backwards |
- With the new Copy-on-Write changes, the guest virtual addresses and guest physical addresses are no longer identity mapped, so we need a way to do this translation when a new guest trace batch arrives from the guest Signed-off-by: Doru Blânzeanu <dblnz@pm.me>
- The max virtual address was calculated based on the vmin (start of page), not the actual virtual address provided Signed-off-by: Doru Blânzeanu <dblnz@pm.me>
- This is currently done on the guest, but not on the host Signed-off-by: Doru Blânzeanu <dblnz@pm.me>
Signed-off-by: Doru Blânzeanu <dblnz@pm.me>
- For each call to `internal_dispatch_function` now we create a span manually because instrumenting the function would be redundant because of the call to `new_call` that resets tracing state for each new function call - The same logic is used for `generic_init` to enable it for the initialise call Signed-off-by: Doru Blânzeanu <dblnz@pm.me>
a8c31b0 to
dadb693
Compare
Description
This PR fixes #1242.
The guest tracing data received by the host assumes identity mapped addresses, the address needs to be translated to correctly decode the flatbuffers data.
This can be reviewed commit by commit.
The first commit tackles this issue, the other two improve on the spans and usability.