rudybear/axiom

AXIOM

AI eXchange Intermediate Optimization Medium

A programming language designed as the canonical transfer format between AI agents: optimized for machine understanding and iterative optimization, and compiled to native code via LLVM.

This is NOT a language for humans to program in. This is a language for AI agents to communicate optimized computation through.

AXIOM beats C (-O3 -march=native -ffast-math) by 3% overall across 20 real-world benchmarks. 21 real-world C project ports (~60K+ combined GitHub stars) all run at parity or faster. ~42,000 LOC. 579 tests. 115/115 benchmarks pass (1.01x avg vs C). ALL 47 milestones COMPLETE.

Why AXIOM Exists

Every existing language was designed for humans. AXIOM is designed for the gap between AI agents: when one AI generates code and another needs to optimize it, they need a format that preserves semantic intent, exposes optimization surfaces, and compiles to the fastest possible native code.

@module matmul;
@intent("Dense matrix multiplication for compute benchmarking");
@constraint { correctness: "IEEE 754 compliant" };

@pure
@complexity O(n^3)
@vectorizable(i, j, k)
fn matmul(
    a: tensor[f32, M, K] @layout(row_major) @align(64),
    b: tensor[f32, K, N] @layout(col_major) @align(64),
) -> tensor[f32, M, N] @layout(row_major) {

    @strategy {
        tiling:   { M: ?tile_m, N: ?tile_n, K: ?tile_k }
        order:    ?loop_order
        parallel: ?parallel_dims
        unroll:   ?unroll_factor
    }

    // ... implementation ...
}

The ?params are optimization holes that AI agents fill in, benchmark, and iterate on. The @annotations carry semantic intent through every compilation stage. No other language does this.
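For intuition, here is a rough C analogue of one such hole: a tile size expressed as a compile-time parameter that an external tuner sweeps (-DTILE=16, 32, ...) and benchmarks. This is only an illustration of the concept; in AXIOM the ?holes are first-class syntax rather than preprocessor macros.

```c
#include <stddef.h>

// TILE plays the role of AXIOM's ?tile_* hole: a value an agent
// fills in, benchmarks, and iterates on.
#ifndef TILE
#define TILE 32
#endif

// C = A * B, all row-major, n x n, with cache tiling.
void matmul_tiled(const float *A, const float *B, float *C, size_t n) {
    for (size_t i = 0; i < n * n; i++) C[i] = 0.0f;
    for (size_t ii = 0; ii < n; ii += TILE)
        for (size_t kk = 0; kk < n; kk += TILE)
            for (size_t jj = 0; jj < n; jj += TILE)
                for (size_t i = ii; i < ii + TILE && i < n; i++)
                    for (size_t k = kk; k < kk + TILE && k < n; k++) {
                        float a = A[i * n + k];
                        for (size_t j = jj; j < jj + TILE && j < n; j++)
                            C[i * n + j] += a * B[k * n + j];
                    }
}
```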

Benchmark Results

115/115 benchmarks pass, with a 1.01x average ratio vs C (parity). Raytracer: AXIOM scalar 42ms (+7% faster than C -O2), AXIOM AOS vec3 44ms (+2% faster), C -O2 47ms.

197 benchmarks comparing AXIOM against C turbo (clang -O3 -march=native -ffast-math). Same LLVM backend, but AXIOM generates better-optimized IR.

Raytracer Benchmark (latest)

Version                     Median (ms)   vs C -O2
AXIOM scalar                42            +7% faster
AXIOM AOS vec3              44            +2% faster
C -O2                       47            baseline
C turbo (-O3 -ffast-math)   51            -9% slower

Real-World Benchmarks (20 programs) -- vs C Turbo

Benchmark             AXIOM   C Turbo   Winner
JPEG DCT              --      --        AXIOM 56% faster
RLE compression       --      --        AXIOM 16% faster
...                   ...     ...       ...
Total (20 programs)   0.97x   1.00x     AXIOM 3% faster (2 wins, 9 ties, 9 C wins)

Real-World C Project Ports (21 projects, ~60K+ combined GitHub stars)

AXIOM ports of popular open-source C libraries, benchmarked against the original C compiled with clang -O3 -march=native -ffast-math. Each port uses only general AXIOM optimizations -- no benchmark-specific cheating. All 21 ports are annotated with @strict (every function carries @pure/@intent/@complexity).

Project                                GitHub Stars   Category          Result
QOI (image codec)                      7,439          Compression       AXIOM 16% faster
TurboPFor (integer compression)        800+           Compression       AXIOM 35% faster
Huffman/miniz (deflate codec)          2,300+         Compression       AXIOM 14% faster
SipHash (keyed hash)                   400+           Crypto            Parity
xxHash32 (non-crypto hash)             10,954         Hashing           Parity
AES-128 (encryption)                   4,902          Crypto            Parity
heatshrink (embedded LZSS)             1,300+         Compression       Parity
LZ4 (fast compression)                 10,600         Compression       Parity
cJSON (JSON parser)                    11,000         Parsing           Parity
FastLZ (LZ77 compression)              500+           Compression       Parity
LZAV (improved LZ77)                   400+           Compression       Parity (1.04x)
Base64 (Turbo-Base64 codec)            --             Encoding          Parity
BLAKE3 (crypto hash)                   --             Crypto            Parity
minimp3 (MP3 IMDCT-36)                 --             Audio             Parity
stb_jpeg (JPEG IDCT)                   --             Image             Parity
SMHasher (4 hash functions)            --             Hashing           Parity
lodepng (PNG decode core)              2,200+         Image             Parity
fpng (fast PNG encode)                 850+           Image             Parity
libdeflate (fast DEFLATE)              900+           Compression       Parity
utf8proc (UTF-8 processing)            450+           Text              Parity
Roaring Bitmaps (compressed bitmaps)   1,500+         Data Structures   Parity

Key optimizations applied across all ports: @pure -> fast-math | noalias on all pointers | @inline(always) on hot helpers | array_const_u8/array_const_i32 for lookup tables (direct .rodata GEP) | inbounds GEP on all ptr access | wrapping arithmetic (+%, *%) for hash/crypto | zext for array indices | interprocedural const pointer propagation
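As a concrete illustration of the wrapping-arithmetic rule: hash functions such as FNV-1a rely on deliberate modular overflow, which C expresses with unsigned types and AXIOM with the +% / *% operators. A plain C sketch of the pattern (not one of the ports above):

```c
#include <stddef.h>
#include <stdint.h>

// FNV-1a 32-bit: the multiply is *supposed* to wrap mod 2^32.
// Unsigned C wraps by definition; AXIOM's *% emits the same LLVM
// 'mul' without the nsw flag, so the optimizer cannot assume no overflow.
uint32_t fnv1a(const uint8_t *data, size_t len) {
    uint32_t h = 2166136261u;          // FNV offset basis
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;                // FNV prime, wraps by design
    }
    return h;
}
```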

Memory Benchmarks (30 programs)

Benchmark                  AXIOM   C -O2   Winner
Binary trees (arena)       0.18s   0.92s   AXIOM 80% faster
Dijkstra shortest path     0.06s   0.11s   AXIOM 45% faster
Random alloc/free          0.09s   0.12s   AXIOM 28% faster
Sparse matrix (arena)      0.06s   0.08s   AXIOM 23% faster
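The arena numbers come from replacing per-object malloc/free bookkeeping with pointer-bump allocation. A minimal C sketch of the idea (an illustration, not AXIOM's actual runtime implementation):

```c
#include <stdlib.h>
#include <stdint.h>

// Bump allocator: each allocation is one pointer increment,
// and "freeing" everything is resetting a single counter.
typedef struct { uint8_t *base; size_t used, cap; } Arena;

Arena arena_make(size_t cap) { return (Arena){ malloc(cap), 0, cap }; }

void *arena_bump(Arena *a, size_t n) {
    n = (n + 7) & ~(size_t)7;              // round up to 8-byte alignment
    if (a->used + n > a->cap) return NULL; // out of space
    void *p = a->base + a->used;
    a->used += n;
    return p;
}

void arena_clear(Arena *a) { a->used = 0; } // frees ALL allocations at once
```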

How AXIOM Beats C

AXIOM has more information than C and uses it:

Optimization            | What AXIOM knows                    | What C doesn't                  | LLVM effect
@pure                   | Function has no side effects        | Must assume side effects        | memory(none), fast math flags
noalias                 | No pointer aliasing (by design)     | Must assume aliasing            | Enables vectorization, reordering
nsw                     | No signed integer overflow          | Must assume possible overflow   | Strength reduction, loop opts
Arena allocator         | Batch allocation lifetime           | Per-object malloc/free          | 50-200x allocation throughput
@lifetime(scope)        | Heap can be stack                   | Must use heap                   | Zero-cost promotion
fastcc                  | Internal calling convention         | C calling convention            | Fewer register saves
fence                   | Release/acquire semantics           | No memory model                 | Correct concurrency
readonly/writeonly      | Pointer access direction            | Must assume read+write          | Alias analysis, dead store elim
calloc for zeroed alloc | Zero-init via OS page trick         | malloc + memset                 | Kernel-level zero pages, skips user-space memset
@inline(always)         | Force-inline hot paths              | Heuristic-only inlining         | alwaysinline attribute, eliminates call overhead
array_const_*           | Compile-time constant arrays        | Runtime initialization          | Direct GEP into .rodata, no pointer load
inbounds GEP            | All pointer accesses are in-bounds  | No guarantee                    | Enables LLVM alias analysis optimizations
Const ptr propagation   | Interprocedural const pointer flow  | Per-function only               | Eliminates redundant loads from constant tables
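The noalias row is the easiest to see in plain C: the only way to promise a C compiler that pointers never alias is to write restrict by hand on each one, whereas AXIOM's pointer model implies it everywhere. A C sketch of what that manual annotation looks like:

```c
#include <stddef.h>

// Without the restrict qualifiers, the compiler must assume 'out' may
// alias 'a' or 'b', forcing reloads after every store and blocking
// vectorization. AXIOM emits the equivalent noalias attributes on
// every pointer by construction.
void scale_sum(const float *restrict a, const float *restrict b,
               float *restrict out, size_t n, float k) {
    for (size_t i = 0; i < n; i++)
        out[i] = k * (a[i] + b[i]);   // no aliasing hazard: vectorizable
}
```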

Optimization Knowledge Base

AXIOM maintains an Optimization Knowledge Base (docs/OPTIMIZATION_KNOWLEDGE.md) that grows with each LLM optimization session: 14 rules + 6 anti-patterns discovered so far. Rules capture what works (e.g., "arena allocators beat malloc by 50-200x for tree structures", "use array_const_u8 for lookup tables -- direct GEP into .rodata", "wrapping arithmetic is mandatory for hash/crypto"), anti-patterns capture what doesn't (e.g., "marking I/O functions as @pure breaks correctness", "trusting agent reports without verification"). The knowledge base is read before every optimization pass and updated after discoveries, creating a feedback loop where the compiler gets smarter over time.

Quick Start

# Build the compiler
cargo build --release

# Compile an AXIOM program
axiom compile examples/numerical/pi.axm -o pi
./pi

# See intermediate representations
axiom compile --emit=tokens examples/sort/bubble_sort.axm
axiom compile --emit=ast examples/sort/bubble_sort.axm
axiom compile --emit=hir examples/sort/bubble_sort.axm
axiom compile --emit=llvm-ir examples/sort/bubble_sort.axm

# Target a specific CPU architecture
axiom compile --target=x86-64-v4 program.axm -o program

# Enable runtime pre/postcondition checks
axiom compile --debug program.axm -o program

# JSON diagnostic output (for IDE/tooling integration)
axiom compile --error-format=json program.axm

# Run optimization protocol
axiom optimize examples/matmul/matmul_simple.axm --iterations 5

# Benchmark a program
axiom bench examples/numerical/pi.axm --runs 10

# Profile a program (compile + time + surface extraction + suggestions)
axiom profile program.axm --iterations 10

# Format an AXIOM source file (parse -> HIR -> pretty-print)
axiom fmt program.axm

# Generate documentation from @intent and doc comments
axiom doc program.axm

# Profile-guided optimization bootstrap
axiom pgo program.axm --iterations 3

# Watch mode -- recompile on file changes
axiom watch program.axm

# Build project with dependency resolution
axiom build

# Source-to-source AI rewriter
axiom rewrite program.axm --strategy performance

# Verified development
axiom verify program.axm                # Check annotation completeness (@strict)
axiom test program.axm                  # Run @test blocks
axiom test program.axm --fuzz           # Auto-fuzz from @precondition

# Time-travel debugging
axiom compile --record program.axm -o program  # Record execution trace
axiom replay program.trace.jsonl               # Replay trace events

# Start LSP server for editor integration
axiom lsp

# Start MCP server for AI agent integration
axiom mcp

Requires: Rust (latest stable), clang (for native binary compilation)

Compilation Pipeline

AXIOM Source (.axm)
       |
       v
   LEXER (63 tests)         Tokens with spans
       |
       v
   PARSER (52 tests)        Typed AST with annotations
       |
       v
   HIR LOWERING (32 tests)  Validated annotations, type checking,
       |                     @strict enforcement, pre/postcondition lowering
       v
   LLVM IR GEN (169 tests)  Optimized IR text with:
       |                     - noalias, nsw, fast-math
       |                     - fastcc, branch hints
       |                     - allocator attributes
       |                     - fence release/acquire
       |                     - readonly/writeonly pointer attrs
       |                     - DWARF debug metadata
       v
   CLANG -O2                 Native binary

Language Features

Types

i8 i16 i32 i64 i128           // Signed integers
u8 u16 u32 u64 u128           // Unsigned integers (u32 has proper unsigned semantics: udiv, urem, icmp ult, add nuw)
f16 bf16 f32 f64              // Floating point
bool                           // Boolean
array[T, N]                    // Fixed-size stack array
ptr[T]                         // Heap pointer
readonly_ptr[T]                // Read-only pointer
writeonly_ptr[T]               // Write-only pointer
slice[T]                       // Fat pointer (ptr + length)
vec2 vec3 vec4                 // SIMD f64 vectors (2/3/4 lanes, hardware-mapped)
ivec2 ivec3 ivec4              // SIMD i32 vectors
fvec2 fvec3 fvec4              // SIMD f32 vectors
mat3 mat4                      // 3x3 and 4x4 f64 matrices
tensor[T, dims...]             // Tensor type (planned)
(T1, T2, T3)                  // Tuple
fn(T1, T2) -> R               // Function type
struct Name { field: Type }    // With literal constructors: Name { x: 1, y: 2 }

Annotations

@pure                          // No side effects -> fast-math, noalias
@const                         // Compile-time evaluable
@inline(always | never | hint) // Inlining control
@complexity O(n^3)             // Algorithmic complexity
@intent("description")         // Semantic intent
@strategy { ... }              // Optimization surface with ?holes
@constraint { key: value }     // Hard performance constraints
@vectorizable(dims)            // Auto-vectorization hint
@parallel(dims)                // Parallelization hints
@parallel_for(shared_read: [...], shared_write: [...], reduction(+: var), private: [...])
                               // Data-parallel for loop with sharing clauses
@lifetime(scope | static | manual)  // Memory lifetime control
@layout(row_major | col_major) // Memory layout
@align(bytes)                  // Alignment
@target(device_class)          // Target hardware
@export                        // C-compatible symbol
@strict                        // Module: enforce annotations on all functions
@precondition(expr)            // Function: runtime check at entry (--debug)
@postcondition(expr)           // Function: runtime check at exit (--debug)
@test { input: (...), expect } // Function: inline test case
@requires(expr)                // Function: formal precondition (alias for @precondition)
@ensures(expr)                 // Function: formal postcondition (alias for @postcondition)
@invariant(expr)               // Block: loop invariant (checked in --debug)
@trace                         // Function: emit ENTER/EXIT calls for tracing
@link("lib", "kind")           // Function: link against a native library
@transfer { ... }              // Inter-agent handoff metadata
@optimization_log { ... }      // Optimization history

Memory Management

// Stack arrays (zero-cost)
let arr: array[i32, 1000] = array_zeros[i32, 1000];

// Heap allocation
let data: ptr[i32] = heap_alloc(n, 4);
let data_z: ptr[i32] = heap_alloc_zeroed(n, 4);
let data2: ptr[i32] = heap_realloc(data, new_n, 4);
ptr_write_i32(data, i, value);
let val: i32 = ptr_read_i32(data, i);
heap_free(data);

// Arena allocation (50-200x faster than malloc)
let arena: ptr[i32] = arena_create(1048576);  // 1MB arena
let nodes: ptr[i32] = arena_alloc(arena, 10000, 4);
// ... use nodes ...
arena_reset(arena);   // Free ALL allocations instantly
arena_destroy(arena);

// Dynamic arrays (vec)
let v: ptr[i32] = vec_new(4);   // elem_size = 4
vec_push_i32(v, 42);
let x: i32 = vec_get_i32(v, 0);
vec_set_i32(v, 0, 99);
let n: i32 = vec_len(v);
vec_free(v);

// Option (tagged union packed into i64)
let none_val: i64 = option_none();
let some_val: i64 = option_some(42);
let is_some: i32 = option_is_some(some_val);
let inner: i32 = option_unwrap(some_val);

// Result (error handling, tagged union packed into i64)
let ok_val: i64 = result_ok(42);
let err_val: i64 = result_err(1);
let is_ok: i32 = result_is_ok(ok_val);
let value: i32 = result_unwrap(ok_val);
let code: i32 = result_err_code(err_val);

// Strings (fat pointer: ptr + len)
let s: ptr[i32] = string_from_literal("hello");
let len: i32 = string_len(s);
let eq: i32 = string_eq(s1, s2);
string_print(s);
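The Option/Result builtins pack a tag and a 32-bit payload into a single i64. One plausible encoding, sketched in C for intuition (the actual runtime layout is an internal detail and may differ):

```c
#include <stdint.h>

// Hypothetical layout: tag in bit 32, 32-bit payload in the low bits.
int64_t opt_some(int32_t v)   { return ((int64_t)1 << 32) | (uint32_t)v; }
int64_t opt_none(void)        { return 0; }
int     opt_is_some(int64_t o){ return (int)((o >> 32) & 1); }
int32_t opt_unwrap(int64_t o) { return (int32_t)(uint32_t)o; } // low 32 bits
```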

Global Constant Arrays

// Compile-time constant arrays stored in .rodata (zero runtime initialization cost)
let sbox: ptr[i32] = array_const_u8(0x63, 0x7c, 0x77, ...);  // u8 lookup table
let table: ptr[i32] = array_const_i32(1, 2, 3, 4, 5);        // i32 constant array
let coeffs: ptr[f64] = array_const_f64(1.0, 0.5, 0.25);      // f64 constant array
// Functions returning array_const_*() are detected and their callers
// get direct GEP into .rodata (interprocedural const pointer propagation)

// Global mutable arrays (zero-initialized, writable)
let table: ptr[i32] = global_array_i32(256);   // 256-element writable global i32 array
let buf: ptr[i64] = global_array_u8(1024);     // 1024-byte writable global u8 array
let data: ptr[f64] = global_array_f64(64);     // 64-element writable global f64 array

// Low-level memory operations (map to LLVM intrinsics)
memcpy(dst, src, 8);              // Copy 8 bytes (non-overlapping) -> llvm.memcpy
memset(buf, 0, 64);              // Zero-fill 64 bytes -> llvm.memset
memmove(dst, src, 32);           // Copy 32 bytes (overlapping safe) -> llvm.memmove
memcmp(a, b, 16);                // Compare 16 bytes -> 0 if equal
ptr_to_i64(ptr)                  // Cast pointer to i64
i64_to_ptr(n)                    // Cast i64 to pointer

SIMD Intrinsics

simd_min(a, b)    // Elementwise min on vec2/vec3/vec4/fvec* types
simd_max(a, b)    // Elementwise max
simd_abs(v)       // Elementwise absolute value
simd_sqrt(v)      // Elementwise square root
simd_floor(v)     // Elementwise floor
simd_ceil(v)      // Elementwise ceil

Bitwise Operations

band(a, b)     // AND        bor(a, b)      // OR
bxor(a, b)     // XOR        bnot(a)        // NOT
shl(a, n)      // Shift left  shr(a, n)      // Arithmetic shift right
lshr(a, n)     // Logical shift right
rotl(a, n)     // Rotate left (i32)    rotr(a, n)     // Rotate right (i32)
rotl64(a, n)   // Rotate left (i64)    rotr64(a, n)   // Rotate right (i64)
// Note: >> operator deliberately not supported (AI-first design).
// Use shr() for arithmetic right shift, lshr() for logical right shift.
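The distinction matters on negative values, which is presumably why the ambiguous >> was dropped. In C terms (a sketch, assuming the usual two's-complement behavior of mainstream compilers):

```c
#include <stdint.h>

// Arithmetic shift sign-extends; logical shift zero-fills.
// C's >> on a negative signed value is implementation-defined
// (arithmetic on all mainstream compilers), which is exactly the
// ambiguity AXIOM's explicit shr()/lshr() avoid.
int32_t shr_arith(int32_t a, int n)   { return a >> n; }
int32_t shr_logical(int32_t a, int n) { return (int32_t)((uint32_t)a >> n); }
```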

Math (25 builtins)

abs(x)         // Integer absolute value
abs_f64(x)     // Float absolute value      fabs(x)        // Float absolute value (alias)
min(a, b)      // Integer min       max(a, b)      // Integer max
min_f64(a, b)  // Float min         max_f64(a, b)  // Float max
sqrt(x)        // Square root       pow(x, y)      // Power
sin(x) cos(x) tan(x)              // Trigonometric
asin(x) acos(x) atan(x) atan2(y,x) // Inverse trig
floor(x) ceil(x) round(x)         // Rounding
log(x) log2(x) exp(x) exp2(x)    // Logarithmic / exponential
to_f64(x)      // i32 -> f64        to_f64_i64(x)  // i64 -> f64
truncate(x)    // Float -> integer truncation

Vector Math (9 builtins)

let v: vec3 = vec3(1.0, 2.0, 3.0);  // SIMD vector construction
let d: f64 = dot(a, b);              // Dot product
let c: vec3 = cross(a, b);           // Cross product
let l: f64 = length(v);              // Vector length
let n: vec3 = normalize(v);          // Unit vector
let r: vec3 = reflect(i, n);        // Reflection
let m: vec3 = lerp(a, b, t);        // Linear interpolation
// GLSL-style swizzles:
let xy: vec2 = v.xy;                // Extract components
let rev: vec3 = v.zyx;              // Reorder components

Type Conversions

widen(x)          // Widen integer (e.g. i32 -> i64)
narrow(x)         // Narrow integer (e.g. i64 -> i32)
truncate(x)       // Float -> integer truncation
f32_to_f64(x)     // f32 -> f64
f64_to_f32(x)     // f64 -> f32

Constants & Control Flow

const PI: f64 = 3.14159265358979;    // Local constant (inlined)
for i in range(0, 10, 2) { }        // range with optional step
break;                                // Break out of loop
continue;                             // Skip to next iteration
if x > 0 { } else if x == 0 { } else { }  // else if chains
match x { 1 => ..., 2 => ..., _ => ... }  // Pattern match (integers, booleans)

Struct Constructors

struct Point { x: f64, y: f64 }
let p: Point = Point { x: 1.0, y: 2.0 };  // Struct literal constructor
fn make_point(x: f64, y: f64) -> Point { } // Struct return from functions

Concurrency

// Threads
let tid: i32 = thread_create(func, arg);
thread_join(tid);

// Atomics
let val: i32 = atomic_load(ptr);
atomic_store(ptr, val);
let old: i32 = atomic_add(ptr, delta);
let old: i32 = atomic_cas(ptr, expected, desired);

// Mutex
let mtx: ptr[i32] = mutex_create();
mutex_lock(mtx);
mutex_unlock(mtx);
mutex_destroy(mtx);

// Job system (thread pool)
jobs_init(num_cores());
job_dispatch(func, data, total_items);
job_wait();
let handle: i32 = job_dispatch_handle(func, data, total_items);
let handle2: i32 = job_dispatch_after(func, data, total_items, handle);
job_wait_handle(handle2);
jobs_shutdown();

// Coroutines (stackful, via OS fibers/ucontext)
let coro: i32 = coro_create(func, arg);
let val: i32 = coro_resume(coro);
coro_yield(value);
let done: i32 = coro_is_done(coro);
coro_destroy(coro);
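The atomic_cas builtin is the primitive behind lock-free updates. A C11 sketch of the canonical CAS retry loop it enables (illustrative, not AXIOM runtime code):

```c
#include <stdatomic.h>

// Lock-free fetch-and-add built from compare-and-swap: retry until
// no other thread changed *p between our load and our store.
int cas_add(_Atomic int *p, int delta) {
    int old = atomic_load(p);
    while (!atomic_compare_exchange_weak(p, &old, old + delta))
        ;  // on failure, 'old' is reloaded with the current value
    return old;  // value before the add, matching atomic_add's return
}
```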

I/O and System

print("hello");               // Print string
print_i32(42);                // Print i32
print_i64(100);               // Print i64
print_f64(3.14);              // Print f64
print_hex_i32(n)              // Print i32 as hex
print_hex_i64(n)              // Print i64 as hex
flush()                       // Flush stdout
format_i32(n)                 // Format i32 to string (ptr[i32])
format_i64(n)                 // Format i64 to string
format_f64(x)                 // Format f64 to string
format_hex(n)                 // Format integer as hex string
file_read(path)               // Read entire file
file_write(path, data, len)   // Write bytes to file
file_size(path)               // Get file size
clock_ns()                    // Nanosecond wall clock
get_argc()                    // Argument count
get_argv(i)                   // Argument string
cpu_features()                // CPUID feature bitmask

Function Pointers

let fp: ptr[i32] = fn_ptr(my_function);
let result: i32 = call_fn_ptr_i32(fp, arg);
let result: f64 = call_fn_ptr_f64(fp, arg);
let result: i32 = call_ptr(fp, arg1, arg2);  // Generic call through function pointer

C Interop / FFI

extern fn clock() -> i64;

@export
fn compute(data: ptr[f64], n: i32) -> f64 { ... }

// Link against native libraries
@link("mylib", "static")
extern fn my_native_func(x: i32) -> i32;

AI Agent Integration

Optimization Protocol

1. EXTRACT   ->  Discover ?holes and @strategy blocks
2. PROPOSE   ->  Fill holes with concrete values
3. VALIDATE  ->  Check types, ranges, constraints
4. BENCHMARK ->  Compile, run, measure performance
5. RECORD    ->  Store results in @optimization_log

LLM Self-Optimization Pipeline (The Core Differentiator)

The axiom optimize command feeds source + LLVM IR + assembly + benchmark data to an LLM, which analyzes the generated code and suggests improvements. The LLM prompt includes @constraint annotations (e.g., optimize_for: "performance" vs "memory" vs "latency") to steer the optimization direction.

# Dry run -- see the prompt the LLM would receive
axiom optimize program.axm --dry-run

# Full optimization loop with Claude API
ANTHROPIC_API_KEY=sk-... axiom optimize program.axm --iterations 5

# Profile a program (compile + time + surface extraction)
axiom profile program.axm --iterations 10

Demonstrated result: The LLM analyzed the assembly output of a prime-counting program, identified a divl bottleneck (~25 cycles per integer division), and suggested wheel factorization (6k+-1). Result: 37% speedup, identical output, verified against C.

v1 (naive):  18.7ms  ->  v2 (LLM-optimized):  13.6ms  =  1.37x faster
Both: AXIOM matches C exactly (1.00x on both algorithms)
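The suggested transformation, sketched in C: after 2 and 3, every prime has the form 6k±1, so trial division only needs to test roughly a third of the candidate divisors. This is a reconstruction of the technique for illustration, not the generated AXIOM code.

```c
// Wheel factorization (6k±1): skip multiples of 2 and 3 when trial
// dividing, cutting the number of expensive div instructions by ~3x.
int is_prime_wheel(int n) {
    if (n < 2) return 0;
    if (n < 4) return 1;                       // 2 and 3
    if (n % 2 == 0 || n % 3 == 0) return 0;
    for (int d = 5; d * d <= n; d += 6)        // d and d+2 are 6k-1, 6k+1
        if (n % d == 0 || n % (d + 2) == 0) return 0;
    return 1;
}

int count_primes(int limit) {
    int count = 0;
    for (int n = 2; n <= limit; n++)
        count += is_prime_wheel(n);
    return count;
}
```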

The optimization loop:

  1. Compile -> LLVM IR + assembly
  2. Benchmark -> timing data
  3. Build prompt (source + IR + asm + timing + ?params + history + constraints)
  4. LLM analyzes, suggests ?param values and code changes
  5. Apply, recompile, re-benchmark, record in @optimization_log
  6. Repeat -- LLM sees history of what worked and what didn't

Agent Session API (Rust)

let session = AgentSession::from_file("matmul.axm")?;
let surfaces = session.surfaces();      // Discover optimization holes
session.apply_proposal(proposal, metrics, "agent-name")?;
let exported = session.export_with_transfer(transfer_info);

MCP Server (for Claude, etc.)

axiom mcp  # Starts JSON-RPC server on stdio

Tools: axiom_load, axiom_surfaces, axiom_propose, axiom_compile, axiom_history

Project Structure

axiom/
├── .github/
│   └── workflows/
│       └── ci.yml              # GitHub Actions CI pipeline
├── crates/
│   ├── axiom-lexer/            # Tokenizer (63 tests)
│   ├── axiom-parser/           # Recursive descent + Pratt (52 tests)
│   ├── axiom-hir/              # High-level IR + validation (32 tests)
│   ├── axiom-codegen/          # LLVM IR generation (169 tests)
│   ├── axiom-optimize/         # Optimization protocol + agent API (132 tests)
│   ├── axiom-mir/              # Mid-level IR (stub)
│   └── axiom-driver/           # CLI + MCP server + compilation (97 tests + 12 E2E/doc-tests)
│       └── runtime/
│           └── axiom_rt.c      # C runtime (I/O, coroutines, threads, jobs)
├── spec/                       # Formal language specification
│   ├── grammar.ebnf            # EBNF grammar
│   ├── types.md                # Type system
│   ├── annotations.md          # Annotation schema
│   ├── optimization.md         # Optimization protocol
│   └── transfer.md             # Inter-agent transfer protocol
├── benchmarks/
│   ├── suite/                  # 115 simple benchmarks
│   ├── complex/                # 30 complex benchmarks
│   ├── real_world/             # 20 real-world benchmarks
│   ├── memory/                 # 30 memory benchmarks
│   ├── fib/                    # Recursive fibonacci (from drujensen/fib)
│   └── leibniz/                # Leibniz Pi (from niklas-heer/speed-comparison)
├── examples/                   # 38 example programs (including 21 C project ports)
│   ├── sort/                   # Bubble, insertion, selection sort
│   ├── nbody/                  # N-body gravitational simulation
│   ├── numerical/              # Pi, root finding, integration
│   ├── crypto/                 # Caesar cipher
│   ├── matmul/                 # Matrix multiplication demos
│   ├── ecs/                    # Entity-Component-System game demo
│   ├── raytracer/              # Full raytracer (scalar + vec3 versions)
│   ├── image_filter/           # Image processing
│   ├── json_parser/            # JSON parser
│   ├── pathfinder/             # Pathfinding algorithms
│   ├── physics_sim/            # Physics simulation
│   ├── compiler_demo/          # Compiler demo
│   ├── game_loop/              # Frame allocator, zero per-frame allocs
│   ├── self_opt/               # LLM optimization demos (primes, matmul)
│   ├── multi_agent/            # Multi-agent handoff demo
│   ├── self_host/              # AXIOM lexer written in AXIOM
│   ├── siphash/                # SipHash-2-4 port (400+ stars)
│   ├── qoi/                    # QOI image codec port (7,439 stars)
│   ├── xxhash/                 # xxHash32 port (10,954 stars)
│   ├── aes/                    # AES-128 ECB port (4,902 stars)
│   ├── heatshrink/             # Heatshrink LZSS port (1,300+ stars)
│   ├── lz4/                    # LZ4 compression port (10,600 stars)
│   ├── cjson/                  # cJSON parser port (11,000 stars)
│   ├── fastlz/                 # FastLZ compression port (500+ stars)
│   ├── lzav/                   # LZAV compression port (400+ stars)
│   ├── turbopfor/              # TurboPFor integer compression (800+ stars)
│   ├── miniz/                  # Huffman codec port (2,300+ stars)
│   ├── base64/                 # Base64 codec (Turbo-Base64 algorithm)
│   ├── blake3/                 # BLAKE3 crypto hash port
│   ├── minimp3/                # minimp3 IMDCT-36 port
│   ├── stb_jpeg/               # stb_image JPEG IDCT port
│   ├── smhasher/               # SMHasher hash functions port
│   ├── lodepng/                # lodepng PNG decode port (2,200+ stars)
│   ├── fpng/                   # fpng fast PNG encode port (850+ stars)
│   ├── libdeflate/             # libdeflate fast DEFLATE port (900+ stars)
│   ├── utf8proc/               # utf8proc UTF-8 processing port (450+ stars)
│   └── roaring/                # Roaring Bitmaps port (1,500+ stars)
├── lib/                        # AXIOM standard libraries
│   └── ecs.axm                 # ECS library (archetype storage)
├── scripts/                    # Development scripts
│   └── self_optimize.sh        # Self-optimization bootstrap script
├── tests/samples/              # 24 test programs
├── docs/                       # Research documents
│   ├── MASTER_TASK_LIST.md     # 47-milestone task tracker (ALL COMPLETE)
│   ├── OPTIMIZATION_RESEARCH.md
│   ├── MEMORY_ALLOCATION_RESEARCH.md
│   ├── GAME_ENGINE_RESEARCH.md
│   ├── MULTITHREADING_ANALYSIS.md
│   ├── LUX_INTEGRATION_RESEARCH.md
│   └── AXIOM_Language_Plan.md
├── CLAUDE.md                   # Project context for AI agents
├── DESIGN.md                   # Living design document
├── BENCHMARKS.md               # Performance results
└── Cargo.toml                  # Workspace root

Stats

  • ~40,100 lines of Rust across 7 crates
  • 579 tests (all passing)
  • 115/115 benchmarks pass (1.01x avg ratio vs C)
  • 21 real-world C project ports (~60K+ combined GitHub stars) -- all at parity or faster (3 wins)
  • ~185 builtin functions (I/O, math, vector math, matrix ops, memory, memcpy/memset/memmove/memcmp, SIMD intrinsics, format/print_hex, concurrency, collections, debug, slices, global constant/mutable arrays, ptr_to_i64/i64_to_ptr, call_ptr)
  • 18 CLI commands, including: compile, lex, bench, mcp, optimize, profile, fmt, doc, pgo, watch, build, rewrite, lsp, verify, test, replay
  • 38 example programs (including 21 C project ports), 24 sample programs
  • 5 formal specification documents
  • 7 research documents (optimization, memory, game engine, multithreading, Lux integration, language plan, optimization knowledge base)
  • 14 optimization rules + 6 anti-patterns in the LLM knowledge base
  • 47/47 milestones COMPLETE across 8 tracks (plus Phase L verified development)

Roadmap

ALL PHASES COMPLETE

  • Phase A: MT-1 -- Fixed UB/soundness: removed incorrect @pure/noalias/nosync on shared pointers, added fences, fixed @pure semantics for write-through-ptr
  • Phase B: MT-2, MT-3 -- @parallel_for with data clauses (private, shared_read, shared_write, reduction), HIR validation, correct LLVM IR with atomics/fences, thread-local accumulation + final combine
  • Phase C: L1, L3, P1, P4 -- Constraint-driven LLM prompts (@constraint { optimize_for: X } threaded into LLM prompt), recursive @const evaluation, @target { cpu: "native" } with -march=native, constraint-to-clang-flag mapping
  • Phase D: MT-4, MT-5, MT-6 -- readonly_ptr[T]/writeonly_ptr[T] ownership types, job dependency graph (job_dispatch_handle, job_dispatch_after, job_wait_handle), LLVM parallel metadata
  • Phase E: F1, F2, F3, F5 -- Option/Result sum type builtins, string builtins (fat pointer), vec (dynamic array) builtins, function pointer builtins (fn_ptr, call_fn_ptr_i32, call_fn_ptr_f64)
  • Phase F: L2, P2, P3 -- Hardware counter integration (perf data fed to LLM), cpu_features() CPUID detection, SIMD width metadata on vectorizable loops
  • Phase G: F4, F6, F7, F8 -- Generics with monomorphization, module system with separate compilation, Result type builtins, while-let/if-let codegen
  • Phase H: E1, E2, E3 -- GitHub Actions CI (ci.yml), DWARF debug info in LLVM IR, axiom fmt formatter, axiom profile profiler
  • Phase K: S1-S3 -- Self-improvement (self-hosted parser, compiler self-optimization via PGO bootstrap, source-to-source AI optimizer with axiom rewrite)
  • Phase L: V1-V4 -- Verified development pipeline (@strict annotation enforcement, @precondition/@postcondition runtime checks, @test inline test cases, axiom verify, axiom test --fuzz)

Development Pipeline

AXIOM was built using a multi-agent development pipeline with 7 independent agents:

Agent                         Role
Architect                     Designs specifications and acceptance criteria
Optimistic Design Reviewer    Reviews spec for completeness and ambition
Pessimistic Design Reviewer   Reviews spec for risks and missing edge cases
Coder                         Implements from spec
QA                            Runs tests, verifies acceptance criteria
Optimistic Code Reviewer      Reviews code for quality and patterns
Pessimistic Code Reviewer     Adversarial review for bugs and UB

Each milestone goes through all 7 agents with git branch isolation and retry loops.

Verified Development Pipeline

AXIOM includes a built-in verification system for AI-generated code quality:

  • @strict module annotation enforces that all functions carry @pure/@intent/@complexity annotations. Missing annotations are compile errors.
  • @precondition(expr) and @postcondition(expr) on functions emit runtime checks in --debug builds (zero overhead in release).
  • @test { input: (...), expect: value } attaches inline test cases directly to functions. Run with axiom test.
  • axiom verify checks annotation completeness across a module without compiling.
  • axiom test --fuzz auto-generates test inputs from @precondition constraints.
  • assert(cond, msg) and debug_print(expr) builtins for runtime assertions and debug-only output.
@strict;  // All functions must have @pure/@intent/@complexity

@pure
@intent("Compute absolute value")
@complexity O(1)
@precondition(x > -2147483648)
@postcondition(result >= 0)
@test { input: (5), expect: 5 }
@test { input: (-3), expect: 3 }
fn my_abs(x: i32) -> i32 {
    if x < 0 { return 0 - x; }
    return x;
}

License

MIT

About

AXIOM — AI eXchange Intermediate Optimization Medium. A programming language for AI-to-AI code transfer that compiles to native binaries via LLVM, with built-in optimization protocol, arena allocator, and 197 benchmarks showing it beats C.
