A high-performance, encrypted UDP tunnel implementing a SOCKS5-compatible proxy with session multiplexing, optimized for traversing restrictive network environments.
- Architecture Overview
- Key Design Decisions & Trade-Offs
- Core Components Deep Dive
- Failure Modes & Reliability
- Security & Compliance
- Performance Insights
- Extensibility & Future Roadmap
- Setup Instructions
The system implements a split-architecture proxy where:
- Client runs locally, accepting SOCKS5 connections from browsers/applications
- Server runs on a remote VPS, relaying traffic to the internet
All traffic between client and server is tunneled over a single UDP socket using a custom binary protocol with XChaCha20-Poly1305 AEAD encryption.
flowchart LR
subgraph Local["Local Machine"]
Browser["Browser/App"]
Client["proxy-vpn client"]
SOCKS5["SOCKS5 Handler"]
CMux["Multiplexer"]
CDemux["Demultiplexer"]
end
subgraph Remote["Remote VPS"]
Server["proxy-vpn server"]
SDemux["Demultiplexer"]
SMux["Multiplexer"]
Relay["TCP Relay"]
end
subgraph Internet
Target["Target Website"]
end
Browser -->|"TCP (SOCKS5)"| SOCKS5
SOCKS5 --> Client
Client --> CMux
CMux -->|"UDP (Encrypted)"| SDemux
SDemux --> Relay
Relay -->|"TCP"| Target
Target -->|"TCP"| Relay
Relay --> SMux
SMux -->|"UDP (Encrypted)"| CDemux
CDemux --> Client
Client -->|"TCP"| Browser
| Pattern | Implementation | Rationale |
|---|---|---|
| Multiplexer/Demultiplexer | Channel-based goroutines for all UDP I/O | Single UDP socket handles N concurrent sessions |
| Session-per-Connection | SessionContext with sliding window | Enables packet reordering over unreliable UDP |
| Interface-based Abstraction | Codec, Crypto interfaces | Hot-swappable serialization and encryption |
| Singleton with Lazy Init | Global codec.C(), crypto.C() accessors | Avoids dependency injection complexity |
| Object Pool | sync.Pool for 1500-byte buffers | Zero-allocation hot path |
┌─────────────────────────────────────────────────────────────────────┐
│ Encrypted Packet │
├──────────────┬─────────────────────────────────┬────────────────────┤
│ Nonce (24B) │ Ciphertext │ Poly1305 Tag (16B) │
└──────────────┴─────────────────────────────────┴────────────────────┘
Decrypted payload structure:
┌────────────┬──────┬────────────┬──────────┬────────────────────────┐
│ SessionID │ Type │ SeqID │ Length │ Payload │
│ (4B) │ (1B) │ (4B) │ (2B) │ (variable) │
└────────────┴──────┴────────────┴──────────┴────────────────────────┘
│
└─► TYPE_CONNECT=1, TYPE_DATA=2, TYPE_FIN=3, TYPE_PING=4, TYPE_PONG=5
Header Size: 11 bytes fixed
Max Payload: 1449 bytes (1500 MTU - 11 header - 24 nonce - 16 tag)
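The payload budget can be checked with a few constants. The following is a minimal sketch; the constant and function names are illustrative and not taken from the codebase. It also shows the budget under the assumption that the 1500 bytes is the link MTU, in which case the IPv4 (20 bytes) and UDP (8 bytes) headers also have to fit.

```go
package main

import "fmt"

const (
	headerSize = 11 // SessionID(4) + Type(1) + SeqID(4) + Length(2)
	nonceSize  = 24 // XChaCha20 nonce
	tagSize    = 16 // Poly1305 tag
)

// maxPayload returns the largest payload that fits in a datagram of the given
// size once the header, nonce, and tag are accounted for.
func maxPayload(datagramSize int) int {
	n := datagramSize - headerSize - nonceSize - tagSize
	if n < 0 {
		return 0
	}
	return n
}

func main() {
	fmt.Println(maxPayload(1500)) // 1449, the figure quoted above
	fmt.Println(maxPayload(1472)) // 1421, if IPv4 and UDP headers must fit in a 1500-byte MTU
}
```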
Choice: UDP transport between client and server.
Rationale:
- Avoids TCP-over-TCP meltdown (retransmission amplification)
- Lower latency for real-time applications
- Better NAT traversal characteristics
- Mimics legitimate UDP traffic patterns (VoIP, gaming)
Trade-off: Requires a custom ordering layer (sequence-based reordering) inside the application, since UDP provides neither ordering nor delivery guarantees.
Choice: XChaCha20-Poly1305 (extended nonce variant)
| Factor | XChaCha20-Poly1305 | AES-GCM |
|---|---|---|
| Nonce size | 24 bytes (safe random) | 12 bytes (requires counter) |
| Hardware accel | Software-only | AES-NI available |
| Nonce collision risk | ~2^96 birthday bound (192-bit nonce) | ~2^48 birthday bound (96-bit nonce) |
Rationale: 24-byte random nonce eliminates nonce-management complexity—critical for UDP where packet ordering isn't guaranteed. Performance difference is marginal for tunnel workloads.
Choice: Custom binary codec with fixed offsets.
// internal/protocol/codec/binary.go
binary.BigEndian.PutUint32(buf[0:4], h.SessionID)
buf[4] = h.Type
binary.BigEndian.PutUint32(buf[5:9], h.SeqID)
binary.BigEndian.PutUint16(buf[9:11], h.Length)
Rationale:
- Zero allocation on encode/decode
- Deterministic 11-byte header
- No schema evolution needed (protocol is internal)
- Protobuf/MsgPack stubs exist but are disabled—future extensibility preserved
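A round trip through the fixed-offset layout can be sketched as follows. The `Header` struct and the `EncodeHeader`/`DecodeHeader` names are assumptions made for this example; only the field widths and offsets come from the snippet above.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// HeaderSize is the fixed header length: 4 + 1 + 4 + 2 bytes.
const HeaderSize = 11

// Header mirrors the fields of the decrypted frame (hypothetical type).
type Header struct {
	SessionID uint32
	Type      uint8
	SeqID     uint32
	Length    uint16
}

// EncodeHeader writes h into the first HeaderSize bytes of buf without allocating.
func EncodeHeader(buf []byte, h Header) error {
	if len(buf) < HeaderSize {
		return errors.New("buffer shorter than header")
	}
	binary.BigEndian.PutUint32(buf[0:4], h.SessionID)
	buf[4] = h.Type
	binary.BigEndian.PutUint32(buf[5:9], h.SeqID)
	binary.BigEndian.PutUint16(buf[9:11], h.Length)
	return nil
}

// DecodeHeader parses the header and returns the payload as a sub-slice of buf.
func DecodeHeader(buf []byte) (Header, []byte, error) {
	if len(buf) < HeaderSize {
		return Header{}, nil, errors.New("frame shorter than header")
	}
	h := Header{
		SessionID: binary.BigEndian.Uint32(buf[0:4]),
		Type:      buf[4],
		SeqID:     binary.BigEndian.Uint32(buf[5:9]),
		Length:    binary.BigEndian.Uint16(buf[9:11]),
	}
	if int(h.Length) > len(buf)-HeaderSize {
		return Header{}, nil, errors.New("declared length exceeds frame")
	}
	return h, buf[HeaderSize : HeaderSize+int(h.Length)], nil
}

func main() {
	buf := make([]byte, 1500)
	n := copy(buf[HeaderSize:], "hello")
	_ = EncodeHeader(buf, Header{SessionID: 7, Type: 2, SeqID: 1, Length: uint16(n)})
	h, payload, err := DecodeHeader(buf[:HeaderSize+n])
	fmt.Println(h.SessionID, h.Type, h.SeqID, string(payload), err)
}
```

Validating `Length` against the frame size on decode keeps a malformed header from causing an out-of-range slice.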
Choice: Per-session sequence-based window instead of strict ordering.
// internal/session/session.go
func (s *SessionContext) InsertPacket(seqID uint32, payload []byte, originalBuffer []byte) {
s.Window[seqID] = item{payload, originalBuffer}
s.Signal <- struct{}{}
}
Trade-off:
- ✅ Out-of-order delivery support
- ✅ Graceful handling of packet loss (timeout-based advancement)
- ❌ No retransmission—relies on underlying reliability when needed
The 50ms ticker advances NextSeqID on timeout, accepting some packet loss for lower latency.
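The reordering behavior can be illustrated with a small single-goroutine model. This is a sketch under assumptions, not the project's implementation: `reorderWindow` and its methods are hypothetical names, sequence-number wraparound is ignored, and the timeout is assumed to skip exactly one missing packet per tick.

```go
package main

import "fmt"

// reorderWindow is a hypothetical stand-in for the SessionContext window.
type reorderWindow struct {
	pending map[uint32][]byte
	next    uint32
}

func newReorderWindow() *reorderWindow {
	return &reorderWindow{pending: make(map[uint32][]byte)}
}

// insert stores a payload unless its sequence number was already passed.
func (w *reorderWindow) insert(seq uint32, payload []byte) {
	if seq < w.next {
		return // late or duplicate packet; the window has moved on
	}
	w.pending[seq] = payload
}

// flush delivers every consecutive payload starting at next.
func (w *reorderWindow) flush(deliver func([]byte)) {
	for {
		p, ok := w.pending[w.next]
		if !ok {
			return
		}
		delete(w.pending, w.next)
		deliver(p)
		w.next++
	}
}

// onTimeout gives up on the missing packet at next when later packets are
// waiting, then flushes whatever has become deliverable.
func (w *reorderWindow) onTimeout(deliver func([]byte)) {
	if len(w.pending) == 0 {
		return
	}
	if _, ok := w.pending[w.next]; !ok {
		w.next++ // accept the loss of one packet
	}
	w.flush(deliver)
}

func main() {
	w := newReorderWindow()
	deliver := func(p []byte) { fmt.Printf("%s ", p) }
	w.insert(1, []byte("b"))
	w.insert(0, []byte("a"))
	w.flush(deliver) // delivers a then b
	w.insert(3, []byte("d"))
	w.flush(deliver)     // nothing: packet 2 is missing
	w.onTimeout(deliver) // skips 2, delivers d
	fmt.Println()
}
```

In a concurrent setting the map would need a mutex or single-owner goroutine, since one goroutine inserts while another flushes.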
Choice: All sessions share one UDP socket.
Architecture implications:
- Client: Multiplexer.SendChan aggregates all outbound packets
- Server: Demultiplexer routes incoming packets by SessionID
Trade-off: Simplifies NAT pinhole management but requires careful channel sizing (2000-5000 capacity) to prevent backpressure.
Packet → codec.Encode() → plaintext frame → crypto.Encrypt() → wire bytes
func (b *Builder) Build(p *Packet) (OutboundWork, error) {
encoded, _ := codec.C().Encode(p.Header, p.Buffer) // Header into buffer
encrypted, _ := crypto.C().Encrypt(p.Buffer, encoded) // In-place encrypt
return OutboundWork{Data: encrypted, OriginalBuffer: p.Buffer}, nil
}
Key insight: buffer reuse. p.Buffer is the single backing allocation, and all operations write into it.
wire bytes → crypto.Decrypt() → plaintext → codec.Decode() → Packet
In-place decryption: aead.Open(enc[:0], nonce, enc, nil) overwrites ciphertext.
Each browser connection produces one SessionContext:
type SessionContext struct {
TargetConn net.Conn // Browser (client) or Website (server)
Window map[uint32]item // SeqID → payload for reordering
NextSeqID uint32 // Expected sequence
Signal chan struct{} // Flush trigger
Quit chan struct{} // Shutdown signal
ClientAddr *net.UDPAddr // Server-side: client's UDP address
}
Flusher goroutine pattern:
func (s *SessionContext) runFlusher() {
ticker := time.NewTicker(50 * time.Millisecond)
for {
select {
case <-s.Signal:
s.flush() // Immediate flush on packet insert
case <-ticker.C:
s.handleTimeout() // Advance window on timeout
case <-s.Quit:
return
}
}
}
Thread-safe session lookup with sync.RWMutex:
func (r *Registry) Get(sessionID uint32) (*SessionContext, bool) {
r.mu.RLock()
defer r.mu.RUnlock()
sess, ok := r.sessions[sessionID]
return sess, ok
}
RFC 1928 implementation supporting three address types:
- IPv4 (0x01)
- Domain name (0x03)
- IPv6 (0x04)
func PerformSOCKS5Handshake(conn net.Conn) (string, error) {
// 1. Read greeting (version + methods)
// 2. Reply with no-auth (0x05, 0x00)
// 3. Read CONNECT request
// 4. Parse address type and extract target
// 5. Send success reply
return net.JoinHostPort(host, port), nil
}
No authentication is implemented, so this is suitable for local proxy use only.
func HandleBrowserSession(browserConn, registry, multiplexer, builder) {
targetAddr := PerformSOCKS5Handshake(browserConn)
sessID := GenerateSessionID() // atomic increment
sess := session.NewSession(browserConn)
registry.Add(sessID, sess)
// Send CONNECT packet
multiplexer.SendChan <- builder.Build(connectPacket)
// Relay loop: Browser → UDP
for {
n := browserConn.Read(buf[11:1460]) // Offset for header
pkt := NewPacket(sessID, TYPE_DATA, seqID++, payload, buf)
multiplexer.SendChan <- builder.Build(pkt)
}
}
func (d *Demultiplexer) handlePacket(buf []byte, n int, clientAddr *net.UDPAddr) {
pkt := d.Parser.Parse(buf[:n], buf)
sess, ok := d.Registry.Get(pkt.Header.SessionID)
switch pkt.Header.Type {
case TYPE_CONNECT:
if !ok {
go d.setupAndRelay(sessionID, targetAddr, clientAddr)
}
case TYPE_DATA:
if ok { sess.InsertPacket(seqID, payload, buf) }
case TYPE_FIN:
if ok { sess.Close(); d.Registry.Delete(sessionID) }
}
}
Synchronous relay loop per session:
func (d *Demultiplexer) runTCPRelay(sess, sessionID) {
for {
n := sess.TargetConn.Read(buf[11:1460]) // Read from website
pkt := NewPacket(sessionID, TYPE_DATA, seqID++, payload, buf)
d.Multiplexer.SendChan <- OutboundPacket{Data, Addr, Buffer}
}
}
type TokenBucket struct {
rate float64 // tokens/second
burst float64 // max capacity
tokens float64 // current
lastCheck time.Time
}
func (tb *TokenBucket) Wait(tokensToConsume int) {
// Blocks until tokens available
// Refills at `rate` tokens/second
}
Currently disabled in main.go, but the infrastructure is in place for bandwidth limiting.
| Failure | Detection | Recovery |
|---|---|---|
| Packet corruption | Poly1305 auth tag verification | Drop packet, return to pool |
| Out-of-order arrival | SeqID mismatch in window | Buffer until flush or timeout |
| Session timeout | 30s read deadline | Send FIN, cleanup |
| UDP write failure | Error from WriteToUDP | Log, continue (best-effort) |
| Crypto init failure | Key length validation | panic() at startup |
defer func() {
sess.Close()
registry.Delete(sessID)
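}()
All handlers use deferred cleanup. SessionContext.Close() guards against a double close by checking whether Quit is already closed, which approximates sync.Once semantics. The check and the close are not atomic, so two goroutines calling Close() at the same moment could still both reach close(s.Quit):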
func (s *SessionContext) Close() {
select {
case <-s.Quit:
return // Already closed
default:
close(s.Quit)
s.TargetConn.Close()
// Return all buffered payloads to pool
}
}
Logging is pervasive but unsophisticated:
log.Printf("[session %d] connection established: client=%s → target=%s (local=%s)",
sessionID, clientAddr, targetAddr, conn.LocalAddr())
Current gaps: no structured logging, no metrics, no tracing.
| Property | Implementation |
|---|---|
| Confidentiality | XChaCha20 stream cipher |
| Integrity | Poly1305 MAC (16 bytes) |
| Authenticity | AEAD construction prevents tampering |
| Nonce uniqueness | 24-byte random per packet |
| Key derivation | None: raw 32-byte key, hex-encoded, read from environment |
| Threat | Mitigation |
|---|---|
| Replay attacks | Not mitigated: no replay protection (stateless packets) |
| Traffic analysis | Partial - fixed header size, but payload length leaked |
| Key compromise | Single pre-shared key compromise is catastrophic |
| Denial of service | Rate limiting infrastructure present but unused |
# .env file
KEY="<32-byte hex string>" # 64 hex chars = 32 bytes
Weaknesses:
- No key rotation mechanism
- Plaintext in environment file
- No authentication handshake—any party with the key can impersonate
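Since the key arrives as a raw hex string, validating it at startup catches the most common misconfiguration. The following is a minimal sketch: `parseKey` is a hypothetical helper, and it returns an error where the project is described as panicking at startup.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"os"
)

// keySize is the 256-bit key length required by XChaCha20-Poly1305.
const keySize = 32

// parseKey decodes a hex-encoded pre-shared key and checks its length.
func parseKey(hexKey string) ([]byte, error) {
	key, err := hex.DecodeString(hexKey)
	if err != nil {
		return nil, fmt.Errorf("KEY is not valid hex: %w", err)
	}
	if len(key) != keySize {
		return nil, fmt.Errorf("KEY must decode to %d bytes, got %d", keySize, len(key))
	}
	return key, nil
}

func main() {
	key, err := parseKey(os.Getenv("KEY"))
	if err != nil {
		fmt.Println("refusing to start:", err)
		return
	}
	fmt.Println("loaded key of", len(key), "bytes")
}
```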
| Operation | Time | Space |
|---|---|---|
| Packet encode | O(1) | O(1) - in-place |
| Packet decrypt | O(n) | O(1) - in-place |
| Session lookup | O(1) avg | O(n) sessions |
| Window insert | O(1) | O(w) window size |
| Window flush | O(k) consecutive | O(1) per item |
var bytePool = sync.Pool{
New: func() any { return make([]byte, 1500) },
}
Critical path is allocation-free:
- pool.Get() → borrow buffer
- Read into buffer offset (preserving header space)
- Build packet referencing buffer
- Encrypt in-place
- Send via channel
- pool.Put() after UDP write
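The borrow, fill, and return steps can be sketched as follows. `fillFrame` and `release` are hypothetical helpers, and the encryption and channel send are left out to keep the example short.

```go
package main

import (
	"fmt"
	"sync"
)

const (
	maxPacketSize = 1500
	headerSize    = 11
)

var bytePool = sync.Pool{
	New: func() any { return make([]byte, maxPacketSize) },
}

// fillFrame borrows a buffer, copies data after the reserved header space,
// and returns the buffer together with the frame length. Data that does not
// fit in the remaining space is truncated by copy.
func fillFrame(data []byte) (buf []byte, n int) {
	buf = bytePool.Get().([]byte)
	n = copy(buf[headerSize:], data)
	return buf, headerSize + n
}

// release hands the buffer back once the UDP write has completed.
func release(buf []byte) {
	bytePool.Put(buf[:maxPacketSize]) // restore full length before reuse
}

func main() {
	buf, n := fillFrame([]byte("payload"))
	fmt.Println(len(buf), n, string(buf[headerSize:n]))
	release(buf)
}
```

One caveat worth knowing: putting a plain `[]byte` into a `sync.Pool` boxes the slice header in an interface value on each `Put`, which can itself allocate. Pooling a `*[]byte` is a common way to avoid that.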
const MaxPacketSize = 1500 // MTU-sized
- Header: 11 bytes
- Payload: up to 1449 bytes
- Encrypted overhead: 24 (nonce) + 16 (tag) = 40 bytes
- Max ciphertext: ~1500 bytes
| Component | Capacity | Rationale |
|---|---|---|
| Client Multiplexer | 2000 | Absorb burst from multiple sessions |
| Server Multiplexer | 5000 | Higher concurrency expected |
| Session Signal | 1 | Non-blocking notification |
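A capacity-1 channel gives a non-blocking notification only if the sender never waits on it. One common way to get that, assuming that coalescing several inserts into a single wake-up is acceptable, is a `select` with a `default` branch. The `notify` helper is hypothetical.

```go
package main

import "fmt"

// notify performs a non-blocking send on a capacity-1 channel and reports
// whether a new wake-up was queued.
func notify(signal chan struct{}) bool {
	select {
	case signal <- struct{}{}:
		return true
	default:
		return false // a wake-up is already pending; the flusher will see this insert too
	}
}

func main() {
	signal := make(chan struct{}, 1)
	fmt.Println(notify(signal), notify(signal)) // true false
	<-signal                                    // the flusher consumes the wake-up
	fmt.Println(notify(signal))                 // true
}
```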
- Codec Interface: Add CodecMsgPack, CodecProto implementations
    type Codec interface {
        Encode(h *header.Header, payload []byte) ([]byte, error)
        Decode(b []byte) (*header.Header, []byte, error)
    }
- Crypto Interface: Add CryptoAES implementation
    type Crypto interface {
        Encrypt(dst, plaintext []byte) ([]byte, error)
        Decrypt(ciphertext []byte) ([]byte, error)
    }
- Token Bucket: Pre-built rate limiting (disabled)
| Area | Improvement | Complexity |
|---|---|---|
| Reliability | ARQ with selective ACKs | High |
| Security | ECDH key exchange at session start | Medium |
| Observability | Prometheus metrics, structured logging | Low |
| Performance | UDP batch I/O (recvmmsg) | Medium |
| NAT Traversal | STUN/TURN integration | High |
| Compression | LZ4 before encryption | Low |
The codebase contains stubs for:
- MsgPack codec (internal/protocol/codec/msgpack.go)
- AES-GCM crypto (commented out in crypto.go)
- Session manager with rate limiting (commented out in server/main.go)
- Go 1.21+ (uses golang.org/x/crypto)
- UDP port accessible on server
Create .env in project root:
SERVER_ADDR=<VPS_IP>:<PORT> # Client: where to connect
SERVER_PORT=8000 # Server: port to listen
CODEC=binary
CRYPTO=chacha20
KEY=<64-hex-chars> # 32 bytes = 256-bit key
CLIENT_ADDR=127.0.0.1:1080 # Client: SOCKS5 listen address
Generate a key:
openssl rand -hex 32
# Server (on VPS)
go build -o vpn-server ./cmd/server
./vpn-server
# Client (locally)
go build -o vpn-client ./cmd/client
./vpn-client
Configure the browser to use the SOCKS5 proxy at 127.0.0.1:1080.
proxy-vpn/
├── cmd/
│ ├── client/main.go # Client entrypoint
│ └── server/main.go # Server entrypoint
├── internal/
│ ├── client/
│ │ ├── demultiplexer.go # UDP → Session routing
│ │ ├── handler.go # Per-browser session handler
│ │ ├── multiplexer.go # Session → UDP aggregation
│ │ ├── socks5.go # SOCKS5 protocol implementation
│ │ └── utils.go # Session ID generation
│ ├── pool/
│ │ └── pool.go # sync.Pool for byte buffers
│ ├── protocol/
│ │ ├── builder.go # Packet → wire format
│ │ ├── parser.go # Wire format → Packet
│ │ ├── packet.go # Packet struct definitions
│ │ ├── codec/ # Serialization implementations
│ │ ├── crypto/ # Encryption implementations
│ │ └── header/ # Header constants and types
│ ├── server/
│ │ ├── congestion.go # Token bucket rate limiter
│ │ ├── demultiplexer.go # UDP → TCP relay per session
│ │ └── multiplexer.go # TCP → UDP response aggregation
│ └── session/
│ ├── registry.go # Thread-safe session lookup
│ └── session.go # Reordering window implementation
├── .env.example
├── go.mod
└── go.sum