fix(server): use ALPN-negotiated protocol for TLS connections#574
fix(server): use ALPN-negotiated protocol for TLS connections#574cluster2600 wants to merge 1 commit intoNVIDIA:mainfrom
Conversation
The gateway server used hyper's auto-detection (byte-sniffing the HTTP/2 connection preface) for all connections, including TLS connections where ALPN already negotiated the protocol. This could misidentify h2 connections as HTTP/1.1 when the first read returned a partial preface, making gRPC unusable. Add MultiplexService::serve_h2() which starts the HTTP/2 state machine directly, bypassing auto-detection. After TLS handshake, check the ALPN-negotiated protocol and route h2 connections to serve_h2(). Plaintext connections continue using auto-detection. Also add TLS status logging (cert paths, ALPN config) on startup and per-connection ALPN protocol logging at debug level. Closes NVIDIA#535
|
Thank you for your submission! We ask that you sign our Developer Certificate of Origin before we can accept your contribution. You can sign the DCO by adding a comment below using this text: I have read the DCO document and I hereby sign the DCO. You can retrigger this bot by commenting recheck in this Pull Request. Posted by the DCO Assistant Lite bot. |
|
Thank you for your interest in contributing to OpenShell, @cluster2600. This project uses a vouch system for first-time contributors. Before submitting a pull request, you need to be vouched by a maintainer. To get vouched:
See CONTRIBUTING.md for details. |
|
I have read the DCO document and I hereby sign the DCO. |
|
Hi there — just a quick note to say we've done the needful on our end. The DCO is signed, and we've opened a vouch request as well. Happy to address any feedback on the implementation. The fix has been compiled and tested on our DGX Spark (aarch64) — 199 server lib tests passing, no regressions. Cheers! |
Summary
Resolves #535 — the gateway server ignored the ALPN-negotiated protocol on TLS connections, falling back to byte-sniffing auto-detection which could misidentify HTTP/2 connections as HTTP/1.1, rendering gRPC unusable.
MultiplexService::serve_h2()which starts the HTTP/2 state machine directly, bypassing auto-detectionh2connections toserve_h2()Root Cause
Hyper's auto-detection works by reading the first bytes of a connection and checking for the HTTP/2 connection preface (
PRI * HTTP/2.0). On TLS connections, if the firstread()returns a partial preface (due to buffering, MTU fragmentation, or TLS record boundaries), the auto-detector falls through to HTTP/1.1 — even though ALPN already negotiatedh2during the TLS handshake.graph TD subgraph "Before (broken)" A[TLS handshake<br/>ALPN negotiates h2] --> B[hyper auto-detect] B -->|"partial preface read"| C[HTTP/1.1 assumed] B -->|"full preface read"| D[HTTP/2] C --> E["gRPC fails ❌"] end subgraph "After (fixed)" F[TLS handshake<br/>ALPN negotiates h2] --> G{ALPN protocol?} G -->|"h2"| H["serve_h2() — direct HTTP/2"] G -->|"other/none"| I[hyper auto-detect fallback] H --> J["gRPC works ✅"] endChanges
crates/openshell-server/src/lib.rsh2toserve_h2(); add TLS startup loggingcrates/openshell-server/src/multiplex.rsALPN_H2constant; implementserve_h2()usinghyper::server::conn::http2::Builderdirectlycrates/openshell-server/src/tls.rsfrom_files_rejects_missing_cert,load_certs_rejects_nonexistent_file,load_key_rejects_nonexistent_file)Test Plan
cargo test -p openshell-server --lib— 199 passed, 0 failedalpn_h2_constant_matches_standard_protocol_id— verifies theALPN_H2constant equalsb"h2"tls.rssequenceDiagram participant Client participant TLS as TLS Acceptor participant Server as MultiplexService Client->>TLS: ClientHello (ALPN: h2, http/1.1) TLS->>Client: ServerHello (ALPN: h2) TLS->>Server: tls_stream + ALPN = "h2" alt ALPN == h2 Server->>Server: serve_h2(tls_stream) Note over Server: HTTP/2 state machine starts immediately else ALPN != h2 or absent Server->>Server: serve(tls_stream) Note over Server: byte-sniffing auto-detection end Client->>Server: gRPC request (HTTP/2 POST) Server->>Client: gRPC response