AI eXchange Intermediate Optimization Medium
A programming language designed as the canonical transfer format between AI agents, optimized for machine understanding and iterative optimization, that compiles to native code via LLVM.
This is NOT a language for humans to program in. This is a language for AI agents to communicate optimized computation through.
AXIOM beats C (-O3 -march=native -ffast-math) by 3% overall across 20 real-world benchmarks. 21 real-world C project ports (~60K+ combined GitHub stars) all at parity or faster. ~42,000 LOC. 579 tests. 115/115 benchmarks pass (1.01x avg vs C). ALL 47 milestones COMPLETE.
Every existing language was designed for humans. AXIOM is designed for the gap between AI agents: when one AI generates code and another needs to optimize it, they need a format that preserves semantic intent, exposes optimization surfaces, and compiles to the fastest possible native code.
@module matmul;
@intent("Dense matrix multiplication for compute benchmarking");
@constraint { correctness: "IEEE 754 compliant" };
@pure
@complexity O(n^3)
@vectorizable(i, j, k)
fn matmul(
a: tensor[f32, M, K] @layout(row_major) @align(64),
b: tensor[f32, K, N] @layout(col_major) @align(64),
) -> tensor[f32, M, N] @layout(row_major) {
@strategy {
tiling: { M: ?tile_m, N: ?tile_n, K: ?tile_k }
order: ?loop_order
parallel: ?parallel_dims
unroll: ?unroll_factor
}
// ... implementation ...
}
The ?params are optimization holes that AI agents fill in, benchmark, and iterate on. The @annotations carry semantic intent through every compilation stage. No other language does this.
115/115 benchmarks pass, 1.01x average ratio vs C (parity). Raytracer: AXIOM scalar 42ms (+7% faster than C), AXIOM AOS vec3 44ms (+2% faster), C -O2 47ms.
197 benchmarks comparing AXIOM against C turbo (clang -O3 -march=native -ffast-math). Same LLVM backend, but AXIOM generates better-optimized IR.
| Version | Median (ms) | vs C -O2 |
|---|---|---|
| AXIOM scalar | 42 | +7% faster |
| AXIOM AOS vec3 | 44 | +2% faster |
| C -O2 | 47 | baseline |
| C turbo (-O3 -ffast-math) | 51 | -9% slower |
| Benchmark | AXIOM | C Turbo | Winner |
|---|---|---|---|
| JPEG DCT | -- | -- | AXIOM 56% faster |
| RLE compression | -- | -- | AXIOM 16% faster |
| ... | ... | ... | ... |
| Total (20 programs) | 0.97x | 1.00x | AXIOM 3% faster (2 wins, 9 ties, 9 C wins) |
AXIOM ports of popular open-source C libraries, benchmarked against the original C compiled with clang -O3 -march=native -ffast-math. Each port uses only general AXIOM optimizations -- no benchmark-specific cheating. All 21 ports are annotated with @strict (every function carries @pure/@intent/@complexity).
| Project | GitHub Stars | Category | Result |
|---|---|---|---|
| QOI (image codec) | 7,439 | Compression | AXIOM 16% faster |
| TurboPFor (integer compression) | 800+ | Compression | AXIOM 35% faster |
| Huffman/miniz (deflate codec) | 2,300+ | Compression | AXIOM 14% faster |
| SipHash (keyed hash) | 400+ | Crypto | Parity |
| xxHash32 (non-crypto hash) | 10,954 | Hashing | Parity |
| AES-128 (encryption) | 4,902 | Crypto | Parity |
| heatshrink (embedded LZSS) | 1,300+ | Compression | Parity |
| LZ4 (fast compression) | 10,600 | Compression | Parity |
| cJSON (JSON parser) | 11,000 | Parsing | Parity |
| FastLZ (LZ77 compression) | 500+ | Compression | Parity |
| LZAV (improved LZ77) | 400+ | Compression | Parity (1.04x) |
| Base64 (Turbo-Base64 codec) | -- | Encoding | Parity |
| BLAKE3 (crypto hash) | -- | Crypto | Parity |
| minimp3 (MP3 IMDCT-36) | -- | Audio | Parity |
| stb_jpeg (JPEG IDCT) | -- | Image | Parity |
| SMHasher (4 hash functions) | -- | Hashing | Parity |
| lodepng (PNG decode core) | 2,200+ | Image | Parity |
| fpng (fast PNG encode) | 850+ | Image | Parity |
| libdeflate (fast DEFLATE) | 900+ | Compression | Parity |
| utf8proc (UTF-8 processing) | 450+ | Text | Parity |
| Roaring Bitmaps (compressed bitmaps) | 1,500+ | Data Structures | Parity |
Key optimizations applied across all ports:
- `@pure` -> fast-math
- `noalias` on all pointers
- `@inline(always)` on hot helpers
- `array_const_u8`/`array_const_i32` for lookup tables (direct .rodata GEP)
- `inbounds` GEP on all ptr access
- wrapping arithmetic (`+%`, `*%`) for hash/crypto
- `zext` for array indices
- interprocedural const pointer propagation
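To see why wrapping arithmetic matters for hash/crypto ports, here is FNV-1a in plain C (not an AXIOM port): in C only unsigned types have defined modular wraparound, while signed overflow is undefined behavior. AXIOM's `+%`/`*%` operators make the same wrapping contract explicit on any integer type.

```c
#include <stdint.h>
#include <stddef.h>

/* FNV-1a: a classic hash whose multiply deliberately wraps mod 2^32.
 * C guarantees this only for unsigned types; AXIOM's *% operator
 * states the wrapping intent directly. */
uint32_t fnv1a(const uint8_t *data, size_t len) {
    uint32_t h = 2166136261u;           /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;                 /* FNV prime; wraps mod 2^32 */
    }
    return h;
}
```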
| Benchmark | AXIOM | C -O2 | Winner |
|---|---|---|---|
| Binary trees (arena) | 0.18s | 0.92s | AXIOM 80% faster |
| Dijkstra shortest path | 0.06s | 0.11s | AXIOM 45% faster |
| Random alloc/free | 0.09s | 0.12s | AXIOM 28% faster |
| Sparse matrix (arena) | 0.06s | 0.08s | AXIOM 23% faster |
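The arena wins above come from bump allocation: one up-front allocation, pointer-bump per object, and a reset that frees everything at once. A minimal C sketch of the pattern (this is an illustration of the semantics, not AXIOM's actual runtime):

```c
#include <stdlib.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal bump arena: allocation is a pointer increment, and reset
 * frees every allocation in O(1). This is the pattern behind AXIOM's
 * arena_create/arena_alloc/arena_reset builtins. */
typedef struct { uint8_t *base; size_t used, cap; } Arena;

Arena arena_create(size_t cap) {
    Arena a = { malloc(cap), 0, cap };
    return a;
}

void *arena_alloc(Arena *a, size_t n) {
    n = (n + 7) & ~(size_t)7;                /* 8-byte alignment */
    if (a->base == NULL || a->used + n > a->cap) return NULL;
    void *p = a->base + a->used;
    a->used += n;
    return p;
}

void arena_reset(Arena *a)   { a->used = 0; }          /* frees all at once */
void arena_destroy(Arena *a) { free(a->base); a->base = NULL; }
```

Because there is no per-object bookkeeping or free-list traversal, tree-shaped workloads like the binary-trees benchmark avoid almost all allocator overhead.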
AXIOM has more information than C and uses it:
| Optimization | What AXIOM knows | What C doesn't | LLVM effect |
|---|---|---|---|
| `@pure` | Function has no side effects | Must assume side effects | `memory(none)`, fast-math flags |
| `noalias` | No pointer aliasing (by design) | Must assume aliasing | Enables vectorization, reordering |
| `nsw` | No signed integer overflow | Must assume possible overflow | Strength reduction, loop opts |
| Arena allocator | Batch allocation lifetime | Per-object malloc/free | 50-200x allocation throughput |
| `@lifetime(scope)` | Heap can be stack | Must use heap | Zero-cost promotion |
| `fastcc` | Internal calling convention | C calling convention | Fewer register saves |
| `fence` | Release/acquire semantics | No memory model | Correct concurrency |
| `readonly`/`writeonly` | Pointer access direction | Must assume read+write | Alias analysis, dead store elim |
| `calloc` for zeroed alloc | Zero-init via OS page trick | malloc + memset | Kernel-level zero pages, skips user-space memset |
| `@inline(always)` | Force-inline hot paths | Heuristic-only inlining | `alwaysinline` attribute, eliminates call overhead |
| `array_const_*` | Compile-time constant arrays | Runtime initialization | Direct GEP into .rodata, no pointer load |
| `inbounds` GEP | All pointer accesses are in-bounds | No guarantee | Enables LLVM alias analysis optimizations |
| Const ptr propagation | Interprocedural const pointer flow | Per-function only | Eliminates redundant loads from constant tables |
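The closest C analogue to AXIOM's automatic `noalias` is the `restrict` qualifier, which a C programmer must remember to write by hand. A sketch of the guarantee it conveys:

```c
/* With restrict, the compiler may assume y and x never overlap, so it
 * can vectorize the loop freely. AXIOM attaches the equivalent noalias
 * attribute to every pointer parameter by construction. */
void saxpy(float *restrict y, const float *restrict x, float a, int n) {
    for (int i = 0; i < n; i++)
        y[i] += a * x[i];
}
```

Without `restrict`, C must emit code that stays correct even when `y` and `x` alias, which blocks reordering and vectorization.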
AXIOM maintains an Optimization Knowledge Base (docs/OPTIMIZATION_KNOWLEDGE.md) that grows with each LLM optimization session: 14 rules + 6 anti-patterns discovered so far. Rules capture what works (e.g., "arena allocators beat malloc by 50-200x for tree structures", "use array_const_u8 for lookup tables -- direct GEP into .rodata", "wrapping arithmetic is mandatory for hash/crypto"), anti-patterns capture what doesn't (e.g., "marking I/O functions as @pure breaks correctness", "trusting agent reports without verification"). The knowledge base is read before every optimization pass and updated after discoveries, creating a feedback loop where the compiler gets smarter over time.
# Build the compiler
cargo build --release
# Compile an AXIOM program
axiom compile examples/numerical/pi.axm -o pi
./pi
# See intermediate representations
axiom compile --emit=tokens examples/sort/bubble_sort.axm
axiom compile --emit=ast examples/sort/bubble_sort.axm
axiom compile --emit=hir examples/sort/bubble_sort.axm
axiom compile --emit=llvm-ir examples/sort/bubble_sort.axm
# Target a specific CPU architecture
axiom compile --target=x86-64-v4 program.axm -o program
# Enable runtime pre/postcondition checks
axiom compile --debug program.axm -o program
# JSON diagnostic output (for IDE/tooling integration)
axiom compile --error-format=json program.axm
# Run optimization protocol
axiom optimize examples/matmul/matmul_simple.axm --iterations 5
# Benchmark a program
axiom bench examples/numerical/pi.axm --runs 10
# Profile a program (compile + time + surface extraction + suggestions)
axiom profile program.axm --iterations 10
# Format an AXIOM source file (parse -> HIR -> pretty-print)
axiom fmt program.axm
# Generate documentation from @intent and doc comments
axiom doc program.axm
# Profile-guided optimization bootstrap
axiom pgo program.axm --iterations 3
# Watch mode -- recompile on file changes
axiom watch program.axm
# Build project with dependency resolution
axiom build
# Source-to-source AI rewriter
axiom rewrite program.axm --strategy performance
# Verified development
axiom verify program.axm # Check annotation completeness (@strict)
axiom test program.axm # Run @test blocks
axiom test program.axm --fuzz # Auto-fuzz from @precondition
# Time-travel debugging
axiom compile --record program.axm -o program # Record execution trace
axiom replay program.trace.jsonl # Replay trace events
# Start LSP server for editor integration
axiom lsp
# Start MCP server for AI agent integration
axiom mcp

Requires: Rust (latest stable), clang (for native binary compilation)
AXIOM Source (.axm)
|
v
LEXER (63 tests) Tokens with spans
|
v
PARSER (52 tests) Typed AST with annotations
|
v
HIR LOWERING (32 tests) Validated annotations, type checking,
| @strict enforcement, pre/postcondition lowering
v
LLVM IR GEN (169 tests) Optimized IR text with:
| - noalias, nsw, fast-math
| - fastcc, branch hints
| - allocator attributes
| - fence release/acquire
| - readonly/writeonly pointer attrs
| - DWARF debug metadata
v
CLANG -O2 Native binary
i8 i16 i32 i64 i128 // Signed integers
u8 u16 u32 u64 u128 // Unsigned integers (u32 has proper unsigned semantics: udiv, urem, icmp ult, add nuw)
f16 bf16 f32 f64 // Floating point
bool // Boolean
array[T, N] // Fixed-size stack array
ptr[T] // Heap pointer
readonly_ptr[T] // Read-only pointer
writeonly_ptr[T] // Write-only pointer
slice[T] // Fat pointer (ptr + length)
vec2 vec3 vec4 // SIMD f64 vectors (2/3/4 lanes, hardware-mapped)
ivec2 ivec3 ivec4 // SIMD i32 vectors
fvec2 fvec3 fvec4 // SIMD f32 vectors
mat3 mat4 // 3x3 and 4x4 f64 matrices
tensor[T, dims...] // Tensor type (planned)
(T1, T2, T3) // Tuple
fn(T1, T2) -> R // Function type
struct Name { field: Type } // With literal constructors: Name { x: 1, y: 2 }
@pure // No side effects -> fast-math, noalias
@const // Compile-time evaluable
@inline(always | never | hint) // Inlining control
@complexity O(n^3) // Algorithmic complexity
@intent("description") // Semantic intent
@strategy { ... } // Optimization surface with ?holes
@constraint { key: value } // Hard performance constraints
@vectorizable(dims) // Auto-vectorization hint
@parallel(dims) // Parallelization hints
@parallel_for(shared_read: [...], shared_write: [...], reduction(+: var), private: [...])
// Data-parallel for loop with sharing clauses
@lifetime(scope | static | manual) // Memory lifetime control
@layout(row_major | col_major) // Memory layout
@align(bytes) // Alignment
@target(device_class) // Target hardware
@export // C-compatible symbol
@strict // Module: enforce annotations on all functions
@precondition(expr) // Function: runtime check at entry (--debug)
@postcondition(expr) // Function: runtime check at exit (--debug)
@test { input: (...), expect } // Function: inline test case
@requires(expr) // Function: formal precondition (alias for @precondition)
@ensures(expr) // Function: formal postcondition (alias for @postcondition)
@invariant(expr) // Block: loop invariant (checked in --debug)
@trace // Function: emit ENTER/EXIT calls for tracing
@link("lib", "kind") // Function: link against a native library
@transfer { ... } // Inter-agent handoff metadata
@optimization_log { ... } // Optimization history
// Stack arrays (zero-cost)
let arr: array[i32, 1000] = array_zeros[i32, 1000];
// Heap allocation
let data: ptr[i32] = heap_alloc(n, 4);
let data_z: ptr[i32] = heap_alloc_zeroed(n, 4);
let data2: ptr[i32] = heap_realloc(data, new_n, 4);
ptr_write_i32(data, i, value);
let val: i32 = ptr_read_i32(data, i);
heap_free(data);
// Arena allocation (50-200x faster than malloc)
let arena: ptr[i32] = arena_create(1048576); // 1MB arena
let nodes: ptr[i32] = arena_alloc(arena, 10000, 4);
// ... use nodes ...
arena_reset(arena); // Free ALL allocations instantly
arena_destroy(arena);
// Dynamic arrays (vec)
let v: ptr[i32] = vec_new(4); // elem_size = 4
vec_push_i32(v, 42);
let x: i32 = vec_get_i32(v, 0);
vec_set_i32(v, 0, 99);
let n: i32 = vec_len(v);
vec_free(v);
// Option (tagged union packed into i64)
let none_val: i64 = option_none();
let some_val: i64 = option_some(42);
let is_some: i32 = option_is_some(some_val);
let inner: i32 = option_unwrap(some_val);
// Result (error handling, tagged union packed into i64)
let ok_val: i64 = result_ok(42);
let err_val: i64 = result_err(1);
let is_ok: i32 = result_is_ok(ok_val);
let value: i32 = result_unwrap(ok_val);
let code: i32 = result_err_code(err_val);
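One way to pack a tagged union into a single 64-bit word, sketched in C: tag in the high 32 bits, payload in the low 32. This illustrates the idea behind the `i64`-packed Option/Result builtins; AXIOM's exact internal layout may differ.

```c
#include <stdint.h>

/* Option packed into one i64: high 32 bits = tag (0 = None, 1 = Some),
 * low 32 bits = payload. Hypothetical layout for illustration only. */
static int64_t opt_none(void)         { return 0; }
static int64_t opt_some(int32_t v)    { return ((int64_t)1 << 32) | (uint32_t)v; }
static int32_t opt_is_some(int64_t o) { return (int32_t)(o >> 32) != 0; }
static int32_t opt_unwrap(int64_t o)  { return (int32_t)(o & 0xFFFFFFFF); }
```

The payoff of this representation is that an Option fits in one register: no heap allocation, no pointer chasing.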
// Strings (fat pointer: ptr + len)
let s: ptr[i32] = string_from_literal("hello");
let len: i32 = string_len(s);
let eq: i32 = string_eq(s1, s2);
string_print(s);
// Compile-time constant arrays stored in .rodata (zero runtime initialization cost)
let sbox: ptr[i32] = array_const_u8(0x63, 0x7c, 0x77, ...); // u8 lookup table
let table: ptr[i32] = array_const_i32(1, 2, 3, 4, 5); // i32 constant array
let coeffs: ptr[f64] = array_const_f64(1.0, 0.5, 0.25); // f64 constant array
// Functions returning array_const_*() are detected and their callers
// get direct GEP into .rodata (interprocedural const pointer propagation)
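The C equivalent of `array_const_u8` is a `static const` table, which the compiler places in .rodata so each lookup is a single indexed load with zero runtime initialization. A sketch (first four AES S-box entries shown, mirroring the example above):

```c
#include <stdint.h>

/* static const puts the table in .rodata; a lookup compiles to a direct
 * indexed load, the same effect AXIOM gets from array_const_u8. */
static const uint8_t SBOX[4] = { 0x63, 0x7c, 0x77, 0x7b };

uint8_t sbox_lookup(uint8_t i) { return SBOX[i & 3]; }
```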
// Global mutable arrays (zero-initialized, writable)
let table: ptr[i32] = global_array_i32(256); // 256-element writable global i32 array
let buf: ptr[i64] = global_array_u8(1024); // 1024-byte writable global u8 array
let data: ptr[f64] = global_array_f64(64); // 64-element writable global f64 array
// Low-level memory operations (map to LLVM intrinsics)
memcpy(dst, src, 8); // Copy 8 bytes (non-overlapping) -> llvm.memcpy
memset(buf, 0, 64); // Zero-fill 64 bytes -> llvm.memset
memmove(dst, src, 32); // Copy 32 bytes (overlapping safe) -> llvm.memmove
memcmp(a, b, 16); // Compare 16 bytes -> 0 if equal
ptr_to_i64(ptr) // Cast pointer to i64
i64_to_ptr(n) // Cast i64 to pointer
simd_min(a, b) // Elementwise min on vec2/vec3/vec4/fvec* types
simd_max(a, b) // Elementwise max
simd_abs(v) // Elementwise absolute value
simd_sqrt(v) // Elementwise square root
simd_floor(v) // Elementwise floor
simd_ceil(v) // Elementwise ceil
band(a, b) // AND bor(a, b) // OR
bxor(a, b) // XOR bnot(a) // NOT
shl(a, n) // Shift left shr(a, n) // Arithmetic shift right
lshr(a, n) // Logical shift right
rotl(a, n) // Rotate left (i32) rotr(a, n) // Rotate right (i32)
rotl64(a, n) // Rotate left (i64) rotr64(a, n) // Rotate right (i64)
// Note: >> operator deliberately not supported (AI-first design).
// Use shr() for arithmetic right shift, lshr() for logical right shift.
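The ambiguity AXIOM is avoiding, shown in C: `>>` on a signed negative value is implementation-defined (arithmetic on mainstream compilers), while `>>` on unsigned is always logical. A sketch of what the named builtins map to, including a rotate:

```c
#include <stdint.h>

/* C's >> changes meaning with signedness; AXIOM names each variant. */
int32_t  shr_arith(int32_t a, int n)  { return a >> n; }  /* sign-extends on gcc/clang */
uint32_t shr_logic(uint32_t a, int n) { return a >> n; }  /* always zero-fills */

/* rotl: the (32 - n) & 31 masking avoids UB when n == 0. */
uint32_t rotl32(uint32_t a, int n) {
    n &= 31;
    return (a << n) | (a >> ((32 - n) & 31));
}
```

Compilers typically recognize the `rotl32` idiom and emit a single rotate instruction, which is what AXIOM's `rotl`/`rotr` builtins guarantee.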
abs(x) // Integer absolute value
abs_f64(x) // Float absolute value fabs(x) // Float absolute value (alias)
min(a, b) // Integer min max(a, b) // Integer max
min_f64(a, b) // Float min max_f64(a, b) // Float max
sqrt(x) // Square root pow(x, y) // Power
sin(x) cos(x) tan(x) // Trigonometric
asin(x) acos(x) atan(x) atan2(y,x) // Inverse trig
floor(x) ceil(x) round(x) // Rounding
log(x) log2(x) exp(x) exp2(x) // Logarithmic / exponential
to_f64(x) // i32 -> f64 to_f64_i64(x) // i64 -> f64
truncate(x) // Float -> integer truncation
let v: vec3 = vec3(1.0, 2.0, 3.0); // SIMD vector construction
let d: f64 = dot(a, b); // Dot product
let c: vec3 = cross(a, b); // Cross product
let l: f64 = length(v); // Vector length
let n: vec3 = normalize(v); // Unit vector
let r: vec3 = reflect(i, n); // Reflection
let m: vec3 = lerp(a, b, t); // Linear interpolation
// GLSL-style swizzles:
let xy: vec2 = v.xy; // Extract components
let rev: vec3 = v.zyx; // Reorder components
widen(x) // Widen integer (e.g. i32 -> i64)
narrow(x) // Narrow integer (e.g. i64 -> i32)
truncate(x) // Float -> integer truncation
f32_to_f64(x) // f32 -> f64
f64_to_f32(x) // f64 -> f32
const PI: f64 = 3.14159265358979; // Local constant (inlined)
for i in range(0, 10, 2) { } // range with optional step
break; // Break out of loop
continue; // Skip to next iteration
if x > 0 { } else if x == 0 { } else { } // else if chains
match x { 1 => ..., 2 => ..., _ => ... } // Pattern match (integers, booleans)
struct Point { x: f64, y: f64 }
let p: Point = Point { x: 1.0, y: 2.0 }; // Struct literal constructor
fn make_point(x: f64, y: f64) -> Point { } // Struct return from functions
// Threads
let tid: i32 = thread_create(func, arg);
thread_join(tid);
// Atomics
let val: i32 = atomic_load(ptr);
atomic_store(ptr, val);
let old: i32 = atomic_add(ptr, delta);
let old: i32 = atomic_cas(ptr, expected, desired);
// Mutex
let mtx: ptr[i32] = mutex_create();
mutex_lock(mtx);
mutex_unlock(mtx);
mutex_destroy(mtx);
// Job system (thread pool)
jobs_init(num_cores());
job_dispatch(func, data, total_items);
job_wait();
let handle: i32 = job_dispatch_handle(func, data, total_items);
let handle2: i32 = job_dispatch_after(func, data, total_items, handle);
job_wait_handle(handle2);
jobs_shutdown();
// Coroutines (stackful, via OS fibers/ucontext)
let coro: i32 = coro_create(func, arg);
let val: i32 = coro_resume(coro);
coro_yield(value);
let done: i32 = coro_is_done(coro);
coro_destroy(coro);
print("hello"); // Print string
print_i32(42); // Print i32
print_i64(100); // Print i64
print_f64(3.14); // Print f64
print_hex_i32(n) // Print i32 as hex
print_hex_i64(n) // Print i64 as hex
flush() // Flush stdout
format_i32(n) // Format i32 to string (ptr[i32])
format_i64(n) // Format i64 to string
format_f64(x) // Format f64 to string
format_hex(n) // Format integer as hex string
file_read(path) // Read entire file
file_write(path, data, len) // Write bytes to file
file_size(path) // Get file size
clock_ns() // Nanosecond wall clock
get_argc() // Argument count
get_argv(i) // Argument string
cpu_features() // CPUID feature bitmask
let fp: ptr[i32] = fn_ptr(my_function);
let result: i32 = call_fn_ptr_i32(fp, arg);
let result: f64 = call_fn_ptr_f64(fp, arg);
let result: i32 = call_ptr(fp, arg1, arg2); // Generic call through function pointer
extern fn clock() -> i64;
@export
fn compute(data: ptr[f64], n: i32) -> f64 { ... }
// Link against native libraries
@link("mylib", "static")
extern fn my_native_func(x: i32) -> i32;
1. EXTRACT -> Discover ?holes and @strategy blocks
2. PROPOSE -> Fill holes with concrete values
3. VALIDATE -> Check types, ranges, constraints
4. BENCHMARK -> Compile, run, measure performance
5. RECORD -> Store results in @optimization_log
The axiom optimize command feeds source + LLVM IR + assembly + benchmark data to an LLM, which analyzes the generated code and suggests improvements. The LLM prompt includes @constraint annotations (e.g., optimize_for: "performance" vs "memory" vs "latency") to steer the optimization direction.
# Dry run -- see the prompt the LLM would receive
axiom optimize program.axm --dry-run
# Full optimization loop with Claude API
ANTHROPIC_API_KEY=sk-... axiom optimize program.axm --iterations 5
# Profile a program (compile + time + surface extraction)
axiom profile program.axm --iterations 10

Demonstrated result: The LLM analyzed the assembly output of a prime-counting program, identified a `divl` bottleneck (~25 cycles per integer division), and suggested wheel factorization (6k±1). Result: 37% speedup, identical output, verified against C.
v1 (naive): 18.7ms -> v2 (LLM-optimized): 13.6ms = 1.37x faster
Both: AXIOM matches C exactly (1.00x on both algorithms)
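The 6k±1 wheel suggested by the LLM exploits the fact that after 2 and 3, every prime has the form 6k-1 or 6k+1, so trial division only needs to test those candidates, cutting the number of expensive divisions by roughly two thirds. A C sketch of the optimization (not the actual benchmark program):

```c
#include <stdint.h>

/* Trial division with a 6k±1 wheel: handle 2 and 3 up front, then
 * test only divisors d and d+2 for d = 5, 11, 17, ... */
static int is_prime(int64_t n) {
    if (n < 2) return 0;
    if (n < 4) return 1;                       /* 2 and 3 */
    if (n % 2 == 0 || n % 3 == 0) return 0;
    for (int64_t d = 5; d * d <= n; d += 6)
        if (n % d == 0 || n % (d + 2) == 0) return 0;
    return 1;
}

int64_t count_primes(int64_t limit) {
    int64_t count = 0;
    for (int64_t n = 2; n <= limit; n++)
        count += is_prime(n);
    return count;
}
```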
The optimization loop:
- Compile -> LLVM IR + assembly
- Benchmark -> timing data
- Build prompt (source + IR + asm + timing + ?params + history + constraints)
- LLM analyzes, suggests ?param values and code changes
- Apply, recompile, re-benchmark, record in @optimization_log
- Repeat -- LLM sees history of what worked and what didn't
let session = AgentSession::from_file("matmul.axm")?;
let surfaces = session.surfaces(); // Discover optimization holes
session.apply_proposal(proposal, metrics, "agent-name")?;
let exported = session.export_with_transfer(transfer_info);

axiom mcp # Starts JSON-RPC server on stdio

Tools: axiom_load, axiom_surfaces, axiom_propose, axiom_compile, axiom_history
axiom/
├── .github/
│ └── workflows/
│ └── ci.yml # GitHub Actions CI pipeline
├── crates/
│ ├── axiom-lexer/ # Tokenizer (63 tests)
│ ├── axiom-parser/ # Recursive descent + Pratt (52 tests)
│ ├── axiom-hir/ # High-level IR + validation (32 tests)
│ ├── axiom-codegen/ # LLVM IR generation (169 tests)
│ ├── axiom-optimize/ # Optimization protocol + agent API (132 tests)
│ ├── axiom-mir/ # Mid-level IR (stub)
│ └── axiom-driver/ # CLI + MCP server + compilation (97 tests + 12 E2E/doc-tests)
│ └── runtime/
│ └── axiom_rt.c # C runtime (I/O, coroutines, threads, jobs)
├── spec/ # Formal language specification
│ ├── grammar.ebnf # EBNF grammar
│ ├── types.md # Type system
│ ├── annotations.md # Annotation schema
│ ├── optimization.md # Optimization protocol
│ └── transfer.md # Inter-agent transfer protocol
├── benchmarks/
│ ├── suite/ # 115 simple benchmarks
│ ├── complex/ # 30 complex benchmarks
│ ├── real_world/ # 20 real-world benchmarks
│ ├── memory/ # 30 memory benchmarks
│ ├── fib/ # Recursive fibonacci (from drujensen/fib)
│ └── leibniz/ # Leibniz Pi (from niklas-heer/speed-comparison)
├── examples/ # 38 example programs (including 21 C project ports)
│ ├── sort/ # Bubble, insertion, selection sort
│ ├── nbody/ # N-body gravitational simulation
│ ├── numerical/ # Pi, root finding, integration
│ ├── crypto/ # Caesar cipher
│ ├── matmul/ # Matrix multiplication demos
│ ├── ecs/ # Entity-Component-System game demo
│ ├── raytracer/ # Full raytracer (scalar + vec3 versions)
│ ├── image_filter/ # Image processing
│ ├── json_parser/ # JSON parser
│ ├── pathfinder/ # Pathfinding algorithms
│ ├── physics_sim/ # Physics simulation
│ ├── compiler_demo/ # Compiler demo
│ ├── game_loop/ # Frame allocator, zero per-frame allocs
│ ├── self_opt/ # LLM optimization demos (primes, matmul)
│ ├── multi_agent/ # Multi-agent handoff demo
│ ├── self_host/ # AXIOM lexer written in AXIOM
│ ├── siphash/ # SipHash-2-4 port (400+ stars)
│ ├── qoi/ # QOI image codec port (7,439 stars)
│ ├── xxhash/ # xxHash32 port (10,954 stars)
│ ├── aes/ # AES-128 ECB port (4,902 stars)
│ ├── heatshrink/ # Heatshrink LZSS port (1,300+ stars)
│ ├── lz4/ # LZ4 compression port (10,600 stars)
│ ├── cjson/ # cJSON parser port (11,000 stars)
│ ├── fastlz/ # FastLZ compression port (500+ stars)
│ ├── lzav/ # LZAV compression port (400+ stars)
│ ├── turbopfor/ # TurboPFor integer compression (800+ stars)
│ ├── miniz/ # Huffman codec port (miniz/2,300+ stars)
│ ├── base64/ # Base64 codec (Turbo-Base64 algorithm)
│ ├── blake3/ # BLAKE3 crypto hash port
│ ├── minimp3/ # minimp3 IMDCT-36 port
│ ├── stb_jpeg/ # stb_image JPEG IDCT port
│ ├── smhasher/ # SMHasher hash functions port
│ ├── lodepng/ # lodepng PNG decode port (2,200+ stars)
│ ├── fpng/ # fpng fast PNG encode port (850+ stars)
│ ├── libdeflate/ # libdeflate fast DEFLATE port (900+ stars)
│ ├── utf8proc/ # utf8proc UTF-8 processing port (450+ stars)
│ └── roaring/ # Roaring Bitmaps port (1,500+ stars)
├── lib/ # AXIOM standard libraries
│ └── ecs.axm # ECS library (archetype storage)
├── scripts/ # Development scripts
│ └── self_optimize.sh # Self-optimization bootstrap script
├── tests/samples/ # 24 test programs
├── docs/ # Research documents
│ ├── MASTER_TASK_LIST.md # 47-milestone task tracker (ALL COMPLETE)
│ ├── OPTIMIZATION_RESEARCH.md
│ ├── MEMORY_ALLOCATION_RESEARCH.md
│ ├── GAME_ENGINE_RESEARCH.md
│ ├── MULTITHREADING_ANALYSIS.md
│ ├── LUX_INTEGRATION_RESEARCH.md
│ └── AXIOM_Language_Plan.md
├── CLAUDE.md # Project context for AI agents
├── DESIGN.md # Living design document
├── BENCHMARKS.md # Performance results
└── Cargo.toml # Workspace root
- ~40,100 lines of Rust across 7 crates
- 579 tests (all passing)
- 115/115 benchmarks pass (1.01x avg ratio vs C)
- 21 real-world C project ports (~60K+ combined GitHub stars) -- all at parity or faster (3 wins)
- ~185 builtin functions (I/O, math, vector math, matrix ops, memory, memcpy/memset/memmove/memcmp, SIMD intrinsics, format/print_hex, concurrency, collections, debug, slices, global constant/mutable arrays, ptr_to_i64/i64_to_ptr, call_ptr)
- 18 CLI commands: compile, lex, bench, mcp, optimize, profile, fmt, doc, pgo, watch, build, rewrite, lsp, verify, test, replay
- 38 example programs (including 21 C project ports), 24 sample programs
- 5 formal specification documents
- 7 research documents (optimization, memory, game engine, multithreading, Lux integration, language plan, optimization knowledge base)
- 14 optimization rules + 6 anti-patterns in the LLM knowledge base
- 47/47 milestones COMPLETE across 8 tracks (plus Phase L verified development)
- Phase A: MT-1 -- Fixed UB/soundness: removed incorrect `@pure`/`noalias`/`nosync` on shared pointers, added fences, fixed `@pure` semantics for write-through-ptr
- Phase B: MT-2, MT-3 -- `@parallel_for` with data clauses (private, shared_read, shared_write, reduction), HIR validation, correct LLVM IR with atomics/fences, thread-local accumulation + final combine
- Phase C: L1, L3, P1, P4 -- Constraint-driven LLM prompts (`@constraint { optimize_for: X }` threaded into LLM prompt), recursive `@const` evaluation, `@target { cpu: "native" }` with `-march=native`, constraint-to-clang-flag mapping
- Phase D: MT-4, MT-5, MT-6 -- `readonly_ptr[T]`/`writeonly_ptr[T]` ownership types, job dependency graph (`job_dispatch_handle`, `job_dispatch_after`, `job_wait_handle`), LLVM parallel metadata
- Phase E: F1, F2, F3, F5 -- Option/Result sum type builtins, string builtins (fat pointer), vec (dynamic array) builtins, function pointer builtins (`fn_ptr`, `call_fn_ptr_i32`, `call_fn_ptr_f64`)
- Phase F: L2, P2, P3 -- Hardware counter integration (perf data fed to LLM), `cpu_features()` CPUID detection, SIMD width metadata on vectorizable loops
- Phase G: F4, F6, F7, F8 -- Generics with monomorphization, module system with separate compilation, Result type builtins, while-let/if-let codegen
- Phase H: E1, E2, E3 -- GitHub Actions CI (`ci.yml`), DWARF debug info in LLVM IR, `axiom fmt` formatter, `axiom profile` profiler
- Phase K: S1-S3 -- Self-improvement (self-hosted parser, compiler self-optimization via PGO bootstrap, source-to-source AI optimizer with `axiom rewrite`)
- Phase L: V1-V4 -- Verified development pipeline (`@strict` annotation enforcement, `@precondition`/`@postcondition` runtime checks, `@test` inline test cases, `axiom verify`, `axiom test --fuzz`)
AXIOM was built using a multi-agent development pipeline with 7 independent agents:
| Agent | Role |
|---|---|
| Architect | Designs specifications and acceptance criteria |
| Optimistic Design Reviewer | Reviews spec for completeness and ambition |
| Pessimistic Design Reviewer | Reviews spec for risks and missing edge cases |
| Coder | Implements from spec |
| QA | Runs tests, verifies acceptance criteria |
| Optimistic Code Reviewer | Reviews code for quality and patterns |
| Pessimistic Code Reviewer | Adversarial review for bugs and UB |
Each milestone goes through all 7 agents with git branch isolation and retry loops.
AXIOM includes a built-in verification system for AI-generated code quality:
- `@strict` module annotation enforces that all functions carry `@pure`/`@intent`/`@complexity` annotations. Missing annotations are compile errors.
- `@precondition(expr)` and `@postcondition(expr)` on functions emit runtime checks in `--debug` builds (zero overhead in release).
- `@test { input: (...), expect: value }` attaches inline test cases directly to functions. Run with `axiom test`.
- `axiom verify` checks annotation completeness across a module without compiling.
- `axiom test --fuzz` auto-generates test inputs from `@precondition` constraints.
- `assert(cond, msg)` and `debug_print(expr)` builtins for runtime assertions and debug-only output.
@strict; // All functions must have @pure/@intent/@complexity
@pure
@intent("Compute absolute value")
@complexity O(1)
@precondition(x > -2147483648)
@postcondition(result >= 0)
@test { input: (5), expect: 5 }
@test { input: (-3), expect: 3 }
fn my_abs(x: i32) -> i32 {
if x < 0 { return 0 - x; }
return x;
}
MIT