rudybear/axiom

AXIOM

AI eXchange Intermediate Optimization Medium

A programming language designed as the canonical transfer format between AI agents: optimized for machine understanding and iterative optimization, and compiled to native code via LLVM.

This is NOT a language for humans to program in. This is a language for AI agents to communicate optimized computation through.

AXIOM beats C (-O3 -march=native -ffast-math) by 3% overall across 20 real-world benchmarks. 21 real-world C project ports (~60K+ combined GitHub stars) all run at parity or faster. ~42,000 LOC. 579 tests. 115/115 benchmarks pass (1.01x avg vs C). ALL 47 milestones COMPLETE.

Why AXIOM Exists

Every existing language was designed for humans. AXIOM is designed for the gap between AI agents: when one AI generates code and another needs to optimize it, they need a format that preserves semantic intent, exposes optimization surfaces, and compiles to the fastest possible native code.

@module matmul;
@intent("Dense matrix multiplication for compute benchmarking");
@constraint { correctness: "IEEE 754 compliant" };

@pure
@complexity O(n^3)
@vectorizable(i, j, k)
fn matmul(
    a: tensor[f32, M, K] @layout(row_major) @align(64),
    b: tensor[f32, K, N] @layout(col_major) @align(64),
) -> tensor[f32, M, N] @layout(row_major) {

    @strategy {
        tiling:   { M: ?tile_m, N: ?tile_n, K: ?tile_k }
        order:    ?loop_order
        parallel: ?parallel_dims
        unroll:   ?unroll_factor
    }

    // ... implementation ...
}

The ?params are optimization holes that AI agents fill in, benchmark, and iterate on. The @annotations carry semantic intent through every compilation stage. No other language does this.
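For intuition, here is a rough C analogue of one such hole: a tile size expressed as a compile-time parameter that an external tuner sweeps (-DTILE=16, 32, ...) and benchmarks. This is only an illustration of the concept; in AXIOM the ?holes are first-class syntax rather than preprocessor macros.

```c
#include <stddef.h>

// TILE plays the role of AXIOM's ?tile_* hole: a value an agent
// fills in, benchmarks, and iterates on.
#ifndef TILE
#define TILE 32
#endif

// C = A * B, all row-major, n x n, with cache tiling.
void matmul_tiled(const float *A, const float *B, float *C, size_t n) {
    for (size_t i = 0; i < n * n; i++) C[i] = 0.0f;
    for (size_t ii = 0; ii < n; ii += TILE)
        for (size_t kk = 0; kk < n; kk += TILE)
            for (size_t jj = 0; jj < n; jj += TILE)
                for (size_t i = ii; i < ii + TILE && i < n; i++)
                    for (size_t k = kk; k < kk + TILE && k < n; k++) {
                        float a = A[i * n + k];
                        for (size_t j = jj; j < jj + TILE && j < n; j++)
                            C[i * n + j] += a * B[k * n + j];
                    }
}
```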

Benchmark Results

115/115 benchmarks pass, with a 1.01x average ratio vs C (parity). Raytracer: AXIOM scalar 42ms (+7% faster than C -O2), AXIOM AOS vec3 44ms (+2% faster), C -O2 47ms.

197 benchmarks comparing AXIOM against C turbo (clang -O3 -march=native -ffast-math). Same LLVM backend, but AXIOM generates better-optimized IR.

Raytracer Benchmark (latest)

Version                     Median (ms)   vs C -O2
AXIOM scalar                42            +7% faster
AXIOM AOS vec3              44            +2% faster
C -O2                       47            baseline
C turbo (-O3 -ffast-math)   51            -9% slower

Real-World Benchmarks (20 programs) -- vs C Turbo

Benchmark             AXIOM   C Turbo   Winner
JPEG DCT              --      --        AXIOM 56% faster
RLE compression       --      --        AXIOM 16% faster
...                   ...     ...       ...
Total (20 programs)   0.97x   1.00x     AXIOM 3% faster (2 wins, 9 ties, 9 C wins)

Real-World C Project Ports (21 projects, ~60K+ combined GitHub stars)

AXIOM ports of popular open-source C libraries, benchmarked against the original C compiled with clang -O3 -march=native -ffast-math. Each port uses only general AXIOM optimizations -- no benchmark-specific cheating. All 21 ports are annotated with @strict (every function carries @pure/@intent/@complexity).

Project                                GitHub Stars   Category          Result
QOI (image codec)                      7,439          Compression       AXIOM 16% faster
TurboPFor (integer compression)        800+           Compression       AXIOM 35% faster
Huffman/miniz (deflate codec)          2,300+         Compression       AXIOM 14% faster
SipHash (keyed hash)                   400+           Crypto            Parity
xxHash32 (non-crypto hash)             10,954         Hashing           Parity
AES-128 (encryption)                   4,902          Crypto            Parity
heatshrink (embedded LZSS)             1,300+         Compression       Parity
LZ4 (fast compression)                 10,600         Compression       Parity
cJSON (JSON parser)                    11,000         Parsing           Parity
FastLZ (LZ77 compression)              500+           Compression       Parity
LZAV (improved LZ77)                   400+           Compression       Parity (1.04x)
Base64 (Turbo-Base64 codec)            --             Encoding          Parity
BLAKE3 (crypto hash)                   --             Crypto            Parity
minimp3 (MP3 IMDCT-36)                 --             Audio             Parity
stb_jpeg (JPEG IDCT)                   --             Image             Parity
SMHasher (4 hash functions)            --             Hashing           Parity
lodepng (PNG decode core)              2,200+         Image             Parity
fpng (fast PNG encode)                 850+           Image             Parity
libdeflate (fast DEFLATE)              900+           Compression       Parity
utf8proc (UTF-8 processing)            450+           Text              Parity
Roaring Bitmaps (compressed bitmaps)   1,500+         Data Structures   Parity

Key optimizations applied across all ports: @pure -> fast-math | noalias on all pointers | @inline(always) on hot helpers | array_const_u8/array_const_i32 for lookup tables (direct .rodata GEP) | inbounds GEP on all ptr access | wrapping arithmetic (+%, *%) for hash/crypto | zext for array indices | interprocedural const pointer propagation
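As a concrete illustration of the wrapping-arithmetic rule: hash functions such as FNV-1a rely on deliberate modular overflow, which C expresses with unsigned types and AXIOM with the +% / *% operators. A plain C sketch of the pattern (not one of the ports above):

```c
#include <stddef.h>
#include <stdint.h>

// FNV-1a 32-bit: the multiply is *supposed* to wrap mod 2^32.
// Unsigned C wraps by definition; AXIOM's *% emits the same LLVM
// 'mul' without the nsw flag, so the optimizer cannot assume no overflow.
uint32_t fnv1a(const uint8_t *data, size_t len) {
    uint32_t h = 2166136261u;          // FNV offset basis
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;                // FNV prime, wraps by design
    }
    return h;
}
```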

Memory Benchmarks (30 programs)

Benchmark                  AXIOM   C -O2   Winner
Binary trees (arena)       0.18s   0.92s   AXIOM 80% faster
Dijkstra shortest path     0.06s   0.11s   AXIOM 45% faster
Random alloc/free          0.09s   0.12s   AXIOM 28% faster
Sparse matrix (arena)      0.06s   0.08s   AXIOM 23% faster
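The arena numbers come from replacing per-object malloc/free bookkeeping with pointer-bump allocation. A minimal C sketch of the idea (an illustration, not AXIOM's actual runtime implementation):

```c
#include <stdlib.h>
#include <stdint.h>

// Bump allocator: each allocation is one pointer increment,
// and "freeing" everything is resetting a single counter.
typedef struct { uint8_t *base; size_t used, cap; } Arena;

Arena arena_make(size_t cap) { return (Arena){ malloc(cap), 0, cap }; }

void *arena_bump(Arena *a, size_t n) {
    n = (n + 7) & ~(size_t)7;              // round up to 8-byte alignment
    if (a->used + n > a->cap) return NULL; // out of space
    void *p = a->base + a->used;
    a->used += n;
    return p;
}

void arena_clear(Arena *a) { a->used = 0; } // frees ALL allocations at once
```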

How AXIOM Beats C

AXIOM has more information than C and uses it:

Optimization            | What AXIOM knows                    | What C doesn't                  | LLVM effect
@pure                   | Function has no side effects        | Must assume side effects        | memory(none), fast math flags
noalias                 | No pointer aliasing (by design)     | Must assume aliasing            | Enables vectorization, reordering
nsw                     | No signed integer overflow          | Must assume possible overflow   | Strength reduction, loop opts
Arena allocator         | Batch allocation lifetime           | Per-object malloc/free          | 50-200x allocation throughput
@lifetime(scope)        | Heap can be stack                   | Must use heap                   | Zero-cost promotion
fastcc                  | Internal calling convention         | C calling convention            | Fewer register saves
fence                   | Release/acquire semantics           | No memory model                 | Correct concurrency
readonly/writeonly      | Pointer access direction            | Must assume read+write          | Alias analysis, dead store elim
calloc for zeroed alloc | Zero-init via OS page trick         | malloc + memset                 | Kernel-level zero pages, skips user-space memset
@inline(always)         | Force-inline hot paths              | Heuristic-only inlining         | alwaysinline attribute, eliminates call overhead
array_const_*           | Compile-time constant arrays        | Runtime initialization          | Direct GEP into .rodata, no pointer load
inbounds GEP            | All pointer accesses are in-bounds  | No guarantee                    | Enables LLVM alias analysis optimizations
Const ptr propagation   | Interprocedural const pointer flow  | Per-function only               | Eliminates redundant loads from constant tables
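The noalias row is the easiest to see in plain C: the only way to promise a C compiler that pointers never alias is to write restrict by hand on each one, whereas AXIOM's pointer model implies it everywhere. A C sketch of what that manual annotation looks like:

```c
#include <stddef.h>

// Without the restrict qualifiers, the compiler must assume 'out' may
// alias 'a' or 'b', forcing reloads after every store and blocking
// vectorization. AXIOM emits the equivalent noalias attributes on
// every pointer by construction.
void scale_sum(const float *restrict a, const float *restrict b,
               float *restrict out, size_t n, float k) {
    for (size_t i = 0; i < n; i++)
        out[i] = k * (a[i] + b[i]);   // no aliasing hazard: vectorizable
}
```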

Optimization Knowledge Base

AXIOM maintains an Optimization Knowledge Base (docs/OPTIMIZATION_KNOWLEDGE.md) that grows with each LLM optimization session: 14 rules + 6 anti-patterns discovered so far. Rules capture what works (e.g., "arena allocators beat malloc by 50-200x for tree structures", "use array_const_u8 for lookup tables -- direct GEP into .rodata", "wrapping arithmetic is mandatory for hash/crypto"), anti-patterns capture what doesn't (e.g., "marking I/O functions as @pure breaks correctness", "trusting agent reports without verification"). The knowledge base is read before every optimization pass and updated after discoveries, creating a feedback loop where the compiler gets smarter over time.

Quick Start

# Build the compiler
cargo build --release

# Compile an AXIOM program
axiom compile examples/numerical/pi.axm -o pi
./pi

# See intermediate representations
axiom compile --emit=tokens examples/sort/bubble_sort.axm
axiom compile --emit=ast examples/sort/bubble_sort.axm
axiom compile --emit=hir examples/sort/bubble_sort.axm
axiom compile --emit=llvm-ir examples/sort/bubble_sort.axm

# Target a specific CPU architecture
axiom compile --target=x86-64-v4 program.axm -o program

# Enable runtime pre/postcondition checks
axiom compile --debug program.axm -o program

# JSON diagnostic output (for IDE/tooling integration)
axiom compile --error-format=json program.axm

# Run optimization protocol
axiom optimize examples/matmul/matmul_simple.axm --iterations 5

# Benchmark a program
axiom bench examples/numerical/pi.axm --runs 10

# Profile a program (compile + time + surface extraction + suggestions)
axiom profile program.axm --iterations 10

# Format an AXIOM source file (parse -> HIR -> pretty-print)
axiom fmt program.axm

# Generate documentation from @intent and doc comments
axiom doc program.axm

# Profile-guided optimization bootstrap
axiom pgo program.axm --iterations 3

# Watch mode -- recompile on file changes
axiom watch program.axm

# Build project with dependency resolution
axiom build

# Source-to-source AI rewriter
axiom rewrite program.axm --strategy performance

# Verified development
axiom verify program.axm                # Check annotation completeness (@strict)
axiom test program.axm                  # Run @test blocks
axiom test program.axm --fuzz           # Auto-fuzz from @precondition

# Time-travel debugging
axiom compile --record program.axm -o program  # Record execution trace
axiom replay program.trace.jsonl               # Replay trace events

# Start LSP server for editor integration
axiom lsp

# Start MCP server for AI agent integration
axiom mcp

Requires: Rust (latest stable), clang (for native binary compilation)

Compilation Pipeline

AXIOM Source (.axm)
       |
       v
   LEXER (63 tests)         Tokens with spans
       |
       v
   PARSER (52 tests)        Typed AST with annotations
       |
       v
   HIR LOWERING (32 tests)  Validated annotations, type checking,
       |                     @strict enforcement, pre/postcondition lowering
       v
   LLVM IR GEN (169 tests)  Optimized IR text with:
       |                     - noalias, nsw, fast-math
       |                     - fastcc, branch hints
       |                     - allocator attributes
       |                     - fence release/acquire
       |                     - readonly/writeonly pointer attrs
       |                     - DWARF debug metadata
       v
   CLANG -O2                 Native binary

Language Features

Types

i8 i16 i32 i64 i128           // Signed integers
u8 u16 u32 u64 u128           // Unsigned integers (u32 has proper unsigned semantics: udiv, urem, icmp ult, add nuw)
f16 bf16 f32 f64              // Floating point
bool                           // Boolean
array[T, N]                    // Fixed-size stack array
ptr[T]                         // Heap pointer
readonly_ptr[T]                // Read-only pointer
writeonly_ptr[T]               // Write-only pointer
slice[T]                       // Fat pointer (ptr + length)
vec2 vec3 vec4                 // SIMD f64 vectors (2/3/4 lanes, hardware-mapped)
ivec2 ivec3 ivec4              // SIMD i32 vectors
fvec2 fvec3 fvec4              // SIMD f32 vectors
mat3 mat4                      // 3x3 and 4x4 f64 matrices
tensor[T, dims...]             // Tensor type (planned)
(T1, T2, T3)                  // Tuple
fn(T1, T2) -> R               // Function type
struct Name { field: Type }    // With literal constructors: Name { x: 1, y: 2 }

Annotations

@pure                          // No side effects -> fast-math, noalias
@const                         // Compile-time evaluable
@inline(always | never | hint) // Inlining control
@complexity O(n^3)             // Algorithmic complexity
@intent("description")         // Semantic intent
@strategy { ... }              // Optimization surface with ?holes
@constraint { key: value }     // Hard performance constraints
@vectorizable(dims)            // Auto-vectorization hint
@parallel(dims)                // Parallelization hints
@parallel_for(shared_read: [...], shared_write: [...], reduction(+: var), private: [...])
                               // Data-parallel for loop with sharing clauses
@lifetime(scope | static | manual)  // Memory lifetime control
@layout(row_major | col_major) // Memory layout
@align(bytes)                  // Alignment
@target(device_class)          // Target hardware
@export                        // C-compatible symbol
@strict                        // Module: enforce annotations on all functions
@precondition(expr)            // Function: runtime check at entry (--debug)
@postcondition(expr)           // Function: runtime check at exit (--debug)
@test { input: (...), expect } // Function: inline test case
@requires(expr)                // Function: formal precondition (alias for @precondition)
@ensures(expr)                 // Function: formal postcondition (alias for @postcondition)
@invariant(expr)               // Block: loop invariant (checked in --debug)
@trace                         // Function: emit ENTER/EXIT calls for tracing
@link("lib", "kind")           // Function: link against a native library
@transfer { ... }              // Inter-agent handoff metadata
@optimization_log { ... }      // Optimization history

Memory Management

// Stack arrays (zero-cost)
let arr: array[i32, 1000] = array_zeros[i32, 1000];

// Heap allocation
let data: ptr[i32] = heap_alloc(n, 4);
let data_z: ptr[i32] = heap_alloc_zeroed(n, 4);
let data2: ptr[i32] = heap_realloc(data, new_n, 4);
ptr_write_i32(data, i, value);
let val: i32 = ptr_read_i32(data, i);
heap_free(data);

// Arena allocation (50-200x faster than malloc)
let arena: ptr[i32] = arena_create(1048576);  // 1MB arena
let nodes: ptr[i32] = arena_alloc(arena, 10000, 4);
// ... use nodes ...
arena_reset(arena);   // Free ALL allocations instantly
arena_destroy(arena);

// Dynamic arrays (vec)
let v: ptr[i32] = vec_new(4);   // elem_size = 4
vec_push_i32(v, 42);
let x: i32 = vec_get_i32(v, 0);
vec_set_i32(v, 0, 99);
let n: i32 = vec_len(v);
vec_free(v);

// Option (tagged union packed into i64)
let none_val: i64 = option_none();
let some_val: i64 = option_some(42);
let is_some: i32 = option_is_some(some_val);
let inner: i32 = option_unwrap(some_val);

// Result (error handling, tagged union packed into i64)
let ok_val: i64 = result_ok(42);
let err_val: i64 = result_err(1);
let is_ok: i32 = result_is_ok(ok_val);
let value: i32 = result_unwrap(ok_val);
let code: i32 = result_err_code(err_val);

// Strings (fat pointer: ptr + len)
let s: ptr[i32] = string_from_literal("hello");
let len: i32 = string_len(s);
let eq: i32 = string_eq(s1, s2);
string_print(s);
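The Option/Result builtins pack a tag and a 32-bit payload into a single i64. One plausible encoding, sketched in C for intuition (the actual runtime layout is an internal detail and may differ):

```c
#include <stdint.h>

// Hypothetical layout: tag in bit 32, 32-bit payload in the low bits.
int64_t opt_some(int32_t v)   { return ((int64_t)1 << 32) | (uint32_t)v; }
int64_t opt_none(void)        { return 0; }
int     opt_is_some(int64_t o){ return (int)((o >> 32) & 1); }
int32_t opt_unwrap(int64_t o) { return (int32_t)(uint32_t)o; } // low 32 bits
```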

Global Constant Arrays

// Compile-time constant arrays stored in .rodata (zero runtime initialization cost)
let sbox: ptr[i32] = array_const_u8(0x63, 0x7c, 0x77, ...);  // u8 lookup table
let table: ptr[i32] = array_const_i32(1, 2, 3, 4, 5);        // i32 constant array
let coeffs: ptr[f64] = array_const_f64(1.0, 0.5, 0.25);      // f64 constant array
// Functions returning array_const_*() are detected and their callers
// get direct GEP into .rodata (interprocedural const pointer propagation)

// Global mutable arrays (zero-initialized, writable)
let table: ptr[i32] = global_array_i32(256);   // 256-element writable global i32 array
let buf: ptr[i64] = global_array_u8(1024);     // 1024-byte writable global u8 array
let data: ptr[f64] = global_array_f64(64);     // 64-element writable global f64 array

// Low-level memory operations (map to LLVM intrinsics)
memcpy(dst, src, 8);              // Copy 8 bytes (non-overlapping) -> llvm.memcpy
memset(buf, 0, 64);              // Zero-fill 64 bytes -> llvm.memset
memmove(dst, src, 32);           // Copy 32 bytes (overlapping safe) -> llvm.memmove
memcmp(a, b, 16);                // Compare 16 bytes -> 0 if equal
ptr_to_i64(ptr)                  // Cast pointer to i64
i64_to_ptr(n)                    // Cast i64 to pointer

SIMD Intrinsics

simd_min(a, b)    // Elementwise min on vec2/vec3/vec4/fvec* types
simd_max(a, b)    // Elementwise max
simd_abs(v)       // Elementwise absolute value
simd_sqrt(v)      // Elementwise square root
simd_floor(v)     // Elementwise floor
simd_ceil(v)      // Elementwise ceil

Bitwise Operations

band(a, b)     // AND        bor(a, b)      // OR
bxor(a, b)     // XOR        bnot(a)        // NOT
shl(a, n)      // Shift left  shr(a, n)      // Arithmetic shift right
lshr(a, n)     // Logical shift right
rotl(a, n)     // Rotate left (i32)    rotr(a, n)     // Rotate right (i32)
rotl64(a, n)   // Rotate left (i64)    rotr64(a, n)   // Rotate right (i64)
// Note: >> operator deliberately not supported (AI-first design).
// Use shr() for arithmetic right shift, lshr() for logical right shift.
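The distinction matters on negative values, which is presumably why the ambiguous >> was dropped. In C terms (a sketch, assuming the usual two's-complement behavior of mainstream compilers):

```c
#include <stdint.h>

// Arithmetic shift sign-extends; logical shift zero-fills.
// C's >> on a negative signed value is implementation-defined
// (arithmetic on all mainstream compilers), which is exactly the
// ambiguity AXIOM's explicit shr()/lshr() avoid.
int32_t shr_arith(int32_t a, int n)   { return a >> n; }
int32_t shr_logical(int32_t a, int n) { return (int32_t)((uint32_t)a >> n); }
```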

Math (25 builtins)

abs(x)         // Integer absolute value
abs_f64(x)     // Float absolute value      fabs(x)        // Float absolute value (alias)
min(a, b)      // Integer min       max(a, b)      // Integer max
min_f64(a, b)  // Float min         max_f64(a, b)  // Float max
sqrt(x)        // Square root       pow(x, y)      // Power
sin(x) cos(x) tan(x)              // Trigonometric
asin(x) acos(x) atan(x) atan2(y,x) // Inverse trig
floor(x) ceil(x) round(x)         // Rounding
log(x) log2(x) exp(x) exp2(x)    // Logarithmic / exponential
to_f64(x)      // i32 -> f64        to_f64_i64(x)  // i64 -> f64
truncate(x)    // Float -> integer truncation

Vector Math (9 builtins)

let v: vec3 = vec3(1.0, 2.0, 3.0);  // SIMD vector construction
let d: f64 = dot(a, b);              // Dot product
let c: vec3 = cross(a, b);           // Cross product
let l: f64 = length(v);              // Vector length
let n: vec3 = normalize(v);          // Unit vector
let r: vec3 = reflect(i, n);        // Reflection
let m: vec3 = lerp(a, b, t);        // Linear interpolation
// GLSL-style swizzles:
let xy: vec2 = v.xy;                // Extract components
let rev: vec3 = v.zyx;              // Reorder components

Type Conversions

widen(x)          // Widen integer (e.g. i32 -> i64)
narrow(x)         // Narrow integer (e.g. i64 -> i32)
truncate(x)       // Float -> integer truncation
f32_to_f64(x)     // f32 -> f64
f64_to_f32(x)     // f64 -> f32

Constants & Control Flow

const PI: f64 = 3.14159265358979;    // Local constant (inlined)
for i in range(0, 10, 2) { }        // range with optional step
break;                                // Break out of loop
continue;                             // Skip to next iteration
if x > 0 { } else if x == 0 { } else { }  // else if chains
match x { 1 => ..., 2 => ..., _ => ... }  // Pattern match (integers, booleans)

Struct Constructors

struct Point { x: f64, y: f64 }
let p: Point = Point { x: 1.0, y: 2.0 };  // Struct literal constructor
fn make_point(x: f64, y: f64) -> Point { } // Struct return from functions

Concurrency

// Threads
let tid: i32 = thread_create(func, arg);
thread_join(tid);

// Atomics
let val: i32 = atomic_load(ptr);
atomic_store(ptr, val);
let old: i32 = atomic_add(ptr, delta);
let old: i32 = atomic_cas(ptr, expected, desired);

// Mutex
let mtx: ptr[i32] = mutex_create();
mutex_lock(mtx);
mutex_unlock(mtx);
mutex_destroy(mtx);

// Job system (thread pool)
jobs_init(num_cores());
job_dispatch(func, data, total_items);
job_wait();
let handle: i32 = job_dispatch_handle(func, data, total_items);
let handle2: i32 = job_dispatch_after(func, data, total_items, handle);
job_wait_handle(handle2);
jobs_shutdown();

// Coroutines (stackful, via OS fibers/ucontext)
let coro: i32 = coro_create(func, arg);
let val: i32 = coro_resume(coro);
coro_yield(value);
let done: i32 = coro_is_done(coro);
coro_destroy(coro);
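The atomic_cas builtin is the primitive behind lock-free updates. A C11 sketch of the canonical CAS retry loop it enables (illustrative, not AXIOM runtime code):

```c
#include <stdatomic.h>

// Lock-free fetch-and-add built from compare-and-swap: retry until
// no other thread changed *p between our load and our store.
int cas_add(_Atomic int *p, int delta) {
    int old = atomic_load(p);
    while (!atomic_compare_exchange_weak(p, &old, old + delta))
        ;  // on failure, 'old' is reloaded with the current value
    return old;  // value before the add, matching atomic_add's return
}
```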

I/O and System

print("hello");               // Print string
print_i32(42);                // Print i32
print_i64(100);               // Print i64
print_f64(3.14);              // Print f64
print_hex_i32(n)              // Print i32 as hex
print_hex_i64(n)              // Print i64 as hex
flush()                       // Flush stdout
format_i32(n)                 // Format i32 to string (ptr[i32])
format_i64(n)                 // Format i64 to string
format_f64(x)                 // Format f64 to string
format_hex(n)                 // Format integer as hex string
file_read(path)               // Read entire file
file_write(path, data, len)   // Write bytes to file
file_size(path)               // Get file size
clock_ns()                    // Nanosecond wall clock
get_argc()                    // Argument count
get_argv(i)                   // Argument string
cpu_features()                // CPUID feature bitmask

Function Pointers

let fp: ptr[i32] = fn_ptr(my_function);
let result: i32 = call_fn_ptr_i32(fp, arg);
let result: f64 = call_fn_ptr_f64(fp, arg);
let result: i32 = call_ptr(fp, arg1, arg2);  // Generic call through function pointer

C Interop / FFI

extern fn clock() -> i64;

@export
fn compute(data: ptr[f64], n: i32) -> f64 { ... }

// Link against native libraries
@link("mylib", "static")
extern fn my_native_func(x: i32) -> i32;

AI Agent Integration

Optimization Protocol

1. EXTRACT   ->  Discover ?holes and @strategy blocks
2. PROPOSE   ->  Fill holes with concrete values
3. VALIDATE  ->  Check types, ranges, constraints
4. BENCHMARK ->  Compile, run, measure performance
5. RECORD    ->  Store results in @optimization_log

LLM Self-Optimization Pipeline (The Core Differentiator)

The axiom optimize command feeds source + LLVM IR + assembly + benchmark data to an LLM, which analyzes the generated code and suggests improvements. The LLM prompt includes @constraint annotations (e.g., optimize_for: "performance" vs "memory" vs "latency") to steer the optimization direction.

# Dry run -- see the prompt the LLM would receive
axiom optimize program.axm --dry-run

# Full optimization loop with Claude API
ANTHROPIC_API_KEY=sk-... axiom optimize program.axm --iterations 5

# Profile a program (compile + time + surface extraction)
axiom profile program.axm --iterations 10

Demonstrated result: The LLM analyzed the assembly output of a prime-counting program, identified a divl bottleneck (~25 cycles per integer division), and suggested wheel factorization (6k+-1). Result: 37% speedup, identical output, verified against C.

v1 (naive):  18.7ms  ->  v2 (LLM-optimized):  13.6ms  =  1.37x faster
Both: AXIOM matches C exactly (1.00x on both algorithms)
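The suggested transformation, sketched in C: after 2 and 3, every prime has the form 6k±1, so trial division only needs to test roughly a third of the candidate divisors. This is a reconstruction of the technique for illustration, not the generated AXIOM code.

```c
// Wheel factorization (6k±1): skip multiples of 2 and 3 when trial
// dividing, cutting the number of expensive div instructions by ~3x.
int is_prime_wheel(int n) {
    if (n < 2) return 0;
    if (n < 4) return 1;                       // 2 and 3
    if (n % 2 == 0 || n % 3 == 0) return 0;
    for (int d = 5; d * d <= n; d += 6)        // d and d+2 are 6k-1, 6k+1
        if (n % d == 0 || n % (d + 2) == 0) return 0;
    return 1;
}

int count_primes(int limit) {
    int count = 0;
    for (int n = 2; n <= limit; n++)
        count += is_prime_wheel(n);
    return count;
}
```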

The optimization loop:

  1. Compile -> LLVM IR + assembly
  2. Benchmark -> timing data
  3. Build prompt (source + IR + asm + timing + ?params + history + constraints)
  4. LLM analyzes, suggests ?param values and code changes
  5. Apply, recompile, re-benchmark, record in @optimization_log
  6. Repeat -- LLM sees history of what worked and what didn't

Agent Session API (Rust)

let session = AgentSession::from_file("matmul.axm")?;
let surfaces = session.surfaces();      // Discover optimization holes
session.apply_proposal(proposal, metrics, "agent-name")?;
let exported = session.export_with_transfer(transfer_info);

MCP Server (for Claude, etc.)

axiom mcp  # Starts JSON-RPC server on stdio

Tools: axiom_load, axiom_surfaces, axiom_propose, axiom_compile, axiom_history

Project Structure

axiom/
├── .github/
│   └── workflows/
│       └── ci.yml              # GitHub Actions CI pipeline
├── crates/
│   ├── axiom-lexer/            # Tokenizer (63 tests)
│   ├── axiom-parser/           # Recursive descent + Pratt (52 tests)
│   ├── axiom-hir/              # High-level IR + validation (32 tests)
│   ├── axiom-codegen/          # LLVM IR generation (169 tests)
│   ├── axiom-optimize/         # Optimization protocol + agent API (132 tests)
│   ├── axiom-mir/              # Mid-level IR (stub)
│   └── axiom-driver/           # CLI + MCP server + compilation (97 tests + 12 E2E/doc-tests)
│       └── runtime/
│           └── axiom_rt.c      # C runtime (I/O, coroutines, threads, jobs)
├── spec/                       # Formal language specification
│   ├── grammar.ebnf            # EBNF grammar
│   ├── types.md                # Type system
│   ├── annotations.md          # Annotation schema
│   ├── optimization.md         # Optimization protocol
│   └── transfer.md             # Inter-agent transfer protocol
├── benchmarks/
│   ├── suite/                  # 115 simple benchmarks
│   ├── complex/                # 30 complex benchmarks
│   ├── real_world/             # 20 real-world benchmarks
│   ├── memory/                 # 30 memory benchmarks
│   ├── fib/                    # Recursive fibonacci (from drujensen/fib)
│   └── leibniz/                # Leibniz Pi (from niklas-heer/speed-comparison)
├── examples/                   # 38 example programs (including 21 C project ports)
│   ├── sort/                   # Bubble, insertion, selection sort
│   ├── nbody/                  # N-body gravitational simulation
│   ├── numerical/              # Pi, root finding, integration
│   ├── crypto/                 # Caesar cipher
│   ├── matmul/                 # Matrix multiplication demos
│   ├── ecs/                    # Entity-Component-System game demo
│   ├── raytracer/              # Full raytracer (scalar + vec3 versions)
│   ├── image_filter/           # Image processing
│   ├── json_parser/            # JSON parser
│   ├── pathfinder/             # Pathfinding algorithms
│   ├── physics_sim/            # Physics simulation
│   ├── compiler_demo/          # Compiler demo
│   ├── game_loop/              # Frame allocator, zero per-frame allocs
│   ├── self_opt/               # LLM optimization demos (primes, matmul)
│   ├── multi_agent/            # Multi-agent handoff demo
│   ├── self_host/              # AXIOM lexer written in AXIOM
│   ├── siphash/                # SipHash-2-4 port (400+ stars)
│   ├── qoi/                    # QOI image codec port (7,439 stars)
│   ├── xxhash/                 # xxHash32 port (10,954 stars)
│   ├── aes/                    # AES-128 ECB port (4,902 stars)
│   ├── heatshrink/             # Heatshrink LZSS port (1,300+ stars)
│   ├── lz4/                    # LZ4 compression port (10,600 stars)
│   ├── cjson/                  # cJSON parser port (11,000 stars)
│   ├── fastlz/                 # FastLZ compression port (500+ stars)
│   ├── lzav/                   # LZAV compression port (400+ stars)
│   ├── turbopfor/              # TurboPFor integer compression (800+ stars)
│   ├── miniz/                  # Huffman codec port (2,300+ stars)
│   ├── base64/                 # Base64 codec (Turbo-Base64 algorithm)
│   ├── blake3/                 # BLAKE3 crypto hash port
│   ├── minimp3/                # minimp3 IMDCT-36 port
│   ├── stb_jpeg/               # stb_image JPEG IDCT port
│   ├── smhasher/               # SMHasher hash functions port
│   ├── lodepng/                # lodepng PNG decode port (2,200+ stars)
│   ├── fpng/                   # fpng fast PNG encode port (850+ stars)
│   ├── libdeflate/             # libdeflate fast DEFLATE port (900+ stars)
│   ├── utf8proc/               # utf8proc UTF-8 processing port (450+ stars)
│   └── roaring/                # Roaring Bitmaps port (1,500+ stars)
├── lib/                        # AXIOM standard libraries
│   └── ecs.axm                 # ECS library (archetype storage)
├── scripts/                    # Development scripts
│   └── self_optimize.sh        # Self-optimization bootstrap script
├── tests/samples/              # 24 test programs
├── docs/                       # Research documents
│   ├── MASTER_TASK_LIST.md     # 47-milestone task tracker (ALL COMPLETE)
│   ├── OPTIMIZATION_RESEARCH.md
│   ├── MEMORY_ALLOCATION_RESEARCH.md
│   ├── GAME_ENGINE_RESEARCH.md
│   ├── MULTITHREADING_ANALYSIS.md
│   ├── LUX_INTEGRATION_RESEARCH.md
│   └── AXIOM_Language_Plan.md
├── CLAUDE.md                   # Project context for AI agents
├── DESIGN.md                   # Living design document
├── BENCHMARKS.md               # Performance results
└── Cargo.toml                  # Workspace root

Stats

  • ~40,100 lines of Rust across 7 crates
  • 579 tests (all passing)
  • 115/115 benchmarks pass (1.01x avg ratio vs C)
  • 21 real-world C project ports (~60K+ combined GitHub stars) -- all at parity or faster (3 wins)
  • ~185 builtin functions (I/O, math, vector math, matrix ops, memory, memcpy/memset/memmove/memcmp, SIMD intrinsics, format/print_hex, concurrency, collections, debug, slices, global constant/mutable arrays, ptr_to_i64/i64_to_ptr, call_ptr)
  • 18 CLI commands, including: compile, lex, bench, mcp, optimize, profile, fmt, doc, pgo, watch, build, rewrite, lsp, verify, test, replay
  • 38 example programs (including 21 C project ports), 24 sample programs
  • 5 formal specification documents
  • 7 research documents (optimization, memory, game engine, multithreading, Lux integration, language plan, optimization knowledge base)
  • 14 optimization rules + 6 anti-patterns in the LLM knowledge base
  • 47/47 milestones COMPLETE across 8 tracks (plus Phase L verified development)

Roadmap

ALL PHASES COMPLETE

  • Phase A: MT-1 -- Fixed UB/soundness: removed incorrect @pure/noalias/nosync on shared pointers, added fences, fixed @pure semantics for write-through-ptr
  • Phase B: MT-2, MT-3 -- @parallel_for with data clauses (private, shared_read, shared_write, reduction), HIR validation, correct LLVM IR with atomics/fences, thread-local accumulation + final combine
  • Phase C: L1, L3, P1, P4 -- Constraint-driven LLM prompts (@constraint { optimize_for: X } threaded into LLM prompt), recursive @const evaluation, @target { cpu: "native" } with -march=native, constraint-to-clang-flag mapping
  • Phase D: MT-4, MT-5, MT-6 -- readonly_ptr[T]/writeonly_ptr[T] ownership types, job dependency graph (job_dispatch_handle, job_dispatch_after, job_wait_handle), LLVM parallel metadata
  • Phase E: F1, F2, F3, F5 -- Option/Result sum type builtins, string builtins (fat pointer), vec (dynamic array) builtins, function pointer builtins (fn_ptr, call_fn_ptr_i32, call_fn_ptr_f64)
  • Phase F: L2, P2, P3 -- Hardware counter integration (perf data fed to LLM), cpu_features() CPUID detection, SIMD width metadata on vectorizable loops
  • Phase G: F4, F6, F7, F8 -- Generics with monomorphization, module system with separate compilation, Result type builtins, while-let/if-let codegen
  • Phase H: E1, E2, E3 -- GitHub Actions CI (ci.yml), DWARF debug info in LLVM IR, axiom fmt formatter, axiom profile profiler
  • Phase K: S1-S3 -- Self-improvement (self-hosted parser, compiler self-optimization via PGO bootstrap, source-to-source AI optimizer with axiom rewrite)
  • Phase L: V1-V4 -- Verified development pipeline (@strict annotation enforcement, @precondition/@postcondition runtime checks, @test inline test cases, axiom verify, axiom test --fuzz)

Development Pipeline

AXIOM was built using a multi-agent development pipeline with 7 independent agents:

Agent                         Role
Architect                     Designs specifications and acceptance criteria
Optimistic Design Reviewer    Reviews spec for completeness and ambition
Pessimistic Design Reviewer   Reviews spec for risks and missing edge cases
Coder                         Implements from spec
QA                            Runs tests, verifies acceptance criteria
Optimistic Code Reviewer      Reviews code for quality and patterns
Pessimistic Code Reviewer     Adversarial review for bugs and UB

Each milestone goes through all 7 agents with git branch isolation and retry loops.

Verified Development Pipeline

AXIOM includes a built-in verification system for AI-generated code quality:

  • @strict module annotation enforces that all functions carry @pure/@intent/@complexity annotations. Missing annotations are compile errors.
  • @precondition(expr) and @postcondition(expr) on functions emit runtime checks in --debug builds (zero overhead in release).
  • @test { input: (...), expect: value } attaches inline test cases directly to functions. Run with axiom test.
  • axiom verify checks annotation completeness across a module without compiling.
  • axiom test --fuzz auto-generates test inputs from @precondition constraints.
  • assert(cond, msg) and debug_print(expr) builtins for runtime assertions and debug-only output.
@strict;  // All functions must have @pure/@intent/@complexity

@pure
@intent("Compute absolute value")
@complexity O(1)
@precondition(x > -2147483648)
@postcondition(result >= 0)
@test { input: (5), expect: 5 }
@test { input: (-3), expect: 3 }
fn my_abs(x: i32) -> i32 {
    if x < 0 { return 0 - x; }
    return x;
}

License

MIT

About

AXIOM — AI eXchange Intermediate Optimization Medium. A programming language for AI-to-AI code transfer that compiles to native binaries via LLVM, with built-in optimization protocol, arena allocator, and 197 benchmarks showing it beats C.
