GrillCheese

GrillCheese is a neuromorphic, multimodal AI system built on Grilly with Vulkan-first execution.

The current language core includes a new Grilly-native SSM path inspired by Mamba2, integrated with hippocampal memory, amygdala/endocrine modulation, VSA (vector symbolic architecture) operations, and specialist routing.

Current core architecture

  • Orchestrator and brain routing:
    • thalamic routing
    • specialist selection
    • amygdala + endocrine modulation
    • dream/consolidation phases
  • Memory:
    • hippocampal/capsule memory flow
    • semantic memory backend
    • long-term persistence and replay
  • Language backends:
    • Grilly native transformer
    • Grilly native SSM (native_ssm, Mamba2-inspired)
  • Learning:
    • phase-based training (affect, snn, conversations, instructions, factual, multilanguage, tools_usage, tools_crafting)
    • pretraining pipeline with checkpoints/resume
    • VSA Reasoning Head training with cached hidden states and fused Vulkan shaders

Mamba2-inspired native SSM

Primary implementation:

  • grillcheese/language/grilly_ssm.py
  • grillcheese/language/grilly_native.py
  • grillcheese/pipelines/pretrain_pipeline.py

Model shape:

  1. token embedding
  2. N x selective scan block
  3. final norm
  4. LM head
  5. optional NLMS residual adaptation head

Selective scan block flow:

  1. norm
  2. in_proj to split gate and value
  3. selective scan recurrence with learned decay
  4. out_proj
  5. residual add
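The block flow above can be sketched in plain numpy. Everything here (names, shapes, the sigmoid-derived decay, the RMSNorm) is illustrative only, not the actual code in grillcheese/language/grilly_ssm.py:

```python
import numpy as np

def rms_norm(x, eps=1e-6):
    # RMSNorm: scale features by their root-mean-square.
    return x / np.sqrt((x ** 2).mean(axis=-1, keepdims=True) + eps)

def selective_scan_block(x, w_in, w_out):
    """One block: norm -> in_proj (split gate/value) -> scan -> out_proj -> residual.

    x: (seq_len, d_model); w_in: (d_model, 2*d_model); w_out: (d_model, d_model).
    Names and shapes are hypothetical, not the repo's API.
    """
    h = rms_norm(x)
    gate, value = np.split(h @ w_in, 2, axis=-1)
    decay = 1.0 / (1.0 + np.exp(-gate))   # learned, input-dependent decay in (0, 1)
    state = np.zeros(value.shape[-1])
    scanned = np.empty_like(value)
    for t in range(value.shape[0]):       # h_t = a_t * h_{t-1} + (1 - a_t) * v_t
        state = decay[t] * state + (1.0 - decay[t]) * value[t]
        scanned[t] = state
    return x + scanned @ w_out            # residual add
```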

Supported scan implementations:

  • vectorized (default)
  • loop
  • vectorized + fused Vulkan scan math (auto when available)

Set with:

  • GRILLCHEESE_SSM_SCAN_IMPL=vectorized|loop
  • GRILLCHEESE_SSM_USE_FUSED_SCAN=1|0 (default 1)
  • GRILLCHEESE_SSM_FUSED_SCAN_MAX_SEQ_LEN (default 1024)
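The loop and vectorized implementations compute the same first-order linear recurrence. A minimal sketch of that equivalence, using the standard cumulative-product rewrite (this is an assumption about the math, not a copy of native_ssm's code):

```python
import numpy as np

def scan_loop(a, b):
    # Sequential recurrence: h_t = a_t * h_{t-1} + b_t, with h_{-1} = 0.
    h, out = 0.0, np.empty_like(b)
    for t in range(len(b)):
        h = a[t] * h + b[t]
        out[t] = h
    return out

def scan_vectorized(a, b):
    # Same recurrence without a Python loop:
    # h_t = P_t * cumsum(b_i / P_i), where P_t = prod_{i<=t} a_i.
    p = np.cumprod(a)
    return p * np.cumsum(b / p)

rng = np.random.default_rng(1)
a = rng.uniform(0.5, 0.99, 64)   # decays in (0, 1), as a learned decay would be
b = rng.normal(size=64)
assert np.allclose(scan_loop(a, b), scan_vectorized(a, b))
```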

VulkanTensor integration

native_ssm now integrates Grilly VulkanTensor in SSM projection hotspots, with a safe fallback:

  • enabled by default for native SSM path
  • if GPU tensor conversion fails, it falls back to numpy automatically

Toggle:

  • GRILLCHEESE_NATIVE_USE_VULKAN_TENSOR=1|0
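The try-GPU-then-numpy pattern can be sketched as below; gpu_matmul is a hypothetical stand-in for the Grilly VulkanTensor path, which is not reproduced here:

```python
import os
import numpy as np

def project(x, w, gpu_matmul=None):
    """Matmul with an optional GPU path and automatic numpy fallback.

    `gpu_matmul` is a placeholder for the Grilly VulkanTensor projection;
    the real integration lives inside grillcheese.
    """
    use_gpu = os.environ.get("GRILLCHEESE_NATIVE_USE_VULKAN_TENSOR", "1") == "1"
    if use_gpu and gpu_matmul is not None:
        try:
            return gpu_matmul(x, w)
        except Exception:
            pass  # GPU tensor conversion failed; fall back to numpy
    return x @ w
```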

Other relevant native SSM env vars:

  • GRILLCHEESE_NATIVE_SSM_VOCAB_SIZE (default 32768)
  • GRILLCHEESE_NATIVE_SSM_D_MODEL (default 768)
  • GRILLCHEESE_NATIVE_SSM_N_LAYERS (default 12)
  • GRILLCHEESE_NATIVE_USE_SNN_RMSNORM=1|0
  • GRILLCHEESE_NATIVE_USE_NLMS_HEAD=1|0
  • GRILLCHEESE_NATIVE_NLMS_TOPK
  • GRILLCHEESE_NATIVE_NLMS_SCALE
  • GRILLCHEESE_NATIVE_NLMS_MAX_ENTRIES
  • GRILLCHEESE_NATIVE_NLMS_LR
  • GRILLCHEESE_NATIVE_NLMS_MU_DECAY
  • GRILLCHEESE_NATIVE_NLMS_MU_MIN
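The NLMS knobs above correspond to the standard normalized least-mean-squares update. A minimal sketch of that update rule (the repo's residual adaptation head may differ in detail; the lr and eps names are illustrative):

```python
import numpy as np

def nlms_update(w, x, target, lr=0.5, eps=1e-8):
    """One normalized-LMS step: w += lr * e * x / (eps + ||x||^2).

    Textbook NLMS, shown only to explain the LR/decay env vars above.
    """
    e = target - w @ x
    w = w + lr * e * x / (eps + x @ x)
    return w, e

# Repeated updates on a fixed input drive the prediction toward the target.
w = np.zeros(4)
x = np.array([1.0, 2.0, 0.5, -1.0])
for _ in range(50):
    w, e = nlms_update(w, x, target=3.0)
```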

During pretrain --mode native_ssm, the tqdm progress bar now shows scan=... in its postfix so you can verify whether the SSM scan is running on vk_fused_math or the numpy fallback.

Quick start

uv sync
uv run grillcheese doctor
uv run grillcheese chat "hello" --latency-mode instant

Native SSM chat (Vulkan via Grilly):

uv run grillcheese --native --native-mode native_ssm chat "hello"

Training pipeline

Run all standard train phases:

uv run grillcheese train --phase full --batch-size 32

Run one phase:

uv run grillcheese train --phase tools_usage --dataset training_data/jsonl/tool_usage_training_data.jsonl

Live status:

uv run grillcheese train-status

Pretraining (Mamba2-inspired path)

Start native SSM pretraining:

uv run grillcheese pretrain --mode native_ssm --data-dir training_data/unified --batch-size 4 --epochs 1

Resume:

uv run grillcheese pretrain --mode native_ssm --data-dir training_data/unified --resume-latest

Checkpoints are written under:

  • ~/.grillcheese/runtime/checkpoints/pretrain/
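A plausible sketch of what --resume-latest does: pick the newest file in the checkpoint directory by modification time. The .npz extension and function name are assumptions; the real logic lives in grillcheese/pipelines/pretrain_pipeline.py:

```python
from pathlib import Path

def latest_checkpoint(ckpt_dir):
    """Return the most recently modified checkpoint file, or None if empty.

    The *.npz pattern is an assumption about the checkpoint format.
    """
    files = sorted(Path(ckpt_dir).glob("*.npz"), key=lambda p: p.stat().st_mtime)
    return files[-1] if files else None
```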

VSA Reasoning Head Training

Train the VSA head (projects SSM hidden states to 10,000-dim SVC space):

python scripts/run_vsa_training.py --head-only --epochs 1 --batch-size 64 --lr 5e-4 --lr-schedule cosine

This runs a two-phase pipeline:

  1. Pre-compute hidden states: Forward-only backbone pass, cached as memmap (~1.44 GB)
  2. Cached head training: Fused Linear+tanh Vulkan shader + AdamW GPU optimizer on cached vectors
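The two-phase idea can be sketched in plain numpy: cache backbone hidden states to a disk memmap once, then feed the head from the cache without re-running the backbone. Shapes, file names, and the plain-numpy Linear+tanh head are illustrative; the real pipeline uses fused Vulkan shaders and a GPU AdamW optimizer:

```python
import numpy as np
import os
import tempfile

n, d_model, d_vsa = 256, 768, 10_000  # d_vsa mirrors the 10,000-dim head output
cache_path = os.path.join(tempfile.mkdtemp(), "hidden_cache.dat")

# Phase 1: a forward-only backbone pass writes hidden states to a memmap.
rng = np.random.default_rng(0)
cache = np.memmap(cache_path, dtype=np.float32, mode="w+", shape=(n, d_model))
cache[:] = rng.normal(size=(n, d_model)).astype(np.float32)  # stand-in states
cache.flush()

# Phase 2: the Linear+tanh head reads cached vectors straight from disk.
w = (rng.normal(size=(d_model, d_vsa)) * 0.01).astype(np.float32)
hidden = np.memmap(cache_path, dtype=np.float32, mode="r", shape=(n, d_model))
projected = np.tanh(hidden[:64] @ w)  # one batch of high-dim head outputs
```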

Skip re-computation if cache already exists:

python scripts/run_vsa_training.py --head-only --skip-precompute --skip-hidden-cache

VSA head checkpoints are written under:

  • ~/.grillcheese/runtime/checkpoints/pretrain_vsa/

Detailed pretraining guide:

  • PRETRAINING.md

Unified dataset merge

Merge pretraining JSONL sources:

uv run python scripts/merge_pretraining_dataset.py

Default output/report:

  • tmp/pretrain_ready_merged.v1.jsonl
  • tmp/pretrain_ready_merged.v1.report.json
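The merge step can be sketched as reading each JSONL source, de-duplicating, and emitting a small report. The "text" field and the exact-match dedup policy are assumptions; see scripts/merge_pretraining_dataset.py for the real behaviour:

```python
import json

def merge_jsonl(sources, out_path):
    """Merge JSONL files, dropping records whose 'text' was already seen.

    Returns a report dict; field names are illustrative.
    """
    seen, kept, total = set(), 0, 0
    with open(out_path, "w", encoding="utf-8") as out:
        for src in sources:
            with open(src, encoding="utf-8") as f:
                for line in f:
                    total += 1
                    rec = json.loads(line)
                    key = rec.get("text", "")
                    if key in seen:
                        continue  # duplicate record, skip it
                    seen.add(key)
                    out.write(json.dumps(rec, ensure_ascii=False) + "\n")
                    kept += 1
    return {"total": total, "kept": kept, "dropped": total - kept}
```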

Runtime modes

  • local: all local GPU compute
  • hybrid: local-first, optional cloud offload
  • cloud: thin client mode

Set with:

  • GRILLCHEESE_RUNTIME_MODE=local|hybrid|cloud
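Reading that variable might look like the sketch below; the local default and the error on unknown modes are assumptions about grillcheese's behaviour:

```python
import os

VALID_MODES = ("local", "hybrid", "cloud")

def runtime_mode(default="local"):
    """Read GRILLCHEESE_RUNTIME_MODE with a fallback default.

    The default and the strict validation are illustrative choices.
    """
    mode = os.environ.get("GRILLCHEESE_RUNTIME_MODE", default)
    if mode not in VALID_MODES:
        raise ValueError(f"unknown runtime mode: {mode!r}")
    return mode
```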

Documentation map

  • ARCHITECTURE.md - full system architecture, design decisions, and process diagrams
  • OVERVIEW.md - architecture and system overview
  • TRAINING_PIPELINE_DIAGRAM.md - visual training/pretraining flow
  • TRAINING_PIPELINE_TECHNICAL.md - method-level technical pipeline
  • PRETRAINING.md - detailed pretraining guide
  • BRAIN_ARCHITECTURE_VSA.md - VSA architecture direction
  • BRAIN_ARCHITECTURE_RECOMMENDATIONS.md - architecture recommendations

About

GrillCheese is a local-first AI assistant with a bio-inspired memory architecture. This repository contains the public code to run it.
