Alpha Research is an extensible platform for grounded research over large text corpora.
This repository currently keeps one active reference product:
- AlphaBook, the book-centric reference app backed by a Project Gutenberg-derived corpus
It also keeps one small non-book proof point:
- packages/source-fixture, a minimal adapter and repository example used to validate that the platform is not Gutenberg-only
The important positioning is simple:
- this repo is extensible
- it is not turnkey
- AlphaBook stays book-specific
- the reusable surface lives in the platform, adapter, ingest, and repository seams
Install dependencies:
npm install
Run the repo checks for the supported extensible surface:
npm run validate:extensible
Run the fixture corpus demo without provisioning Postgres or object storage:
npx tsx apps/ingest/src/index.ts ingest-fixture
That command falls back to local preview mode when the persistence env vars are not set.
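The fallback described above can be pictured with a small sketch. This is not the ingest service's actual code: the function name `resolvePersistenceMode`, the mode labels, and the choice of which variables count as "persistence env vars" (`DATABASE_URL` plus a bucket name) are all assumptions made for illustration.

```typescript
// Hypothetical sketch: the real ingest CLI may decide this differently.
type PersistenceMode = "persistent" | "local-preview";

type EnvSource = Record<string, string | undefined>;

// Assumption: a database URL and a bucket name are the variables that
// gate persistence. Empty strings are treated the same as unset.
function resolvePersistenceMode(env: EnvSource): PersistenceMode {
  const hasDatabase = Boolean(env.DATABASE_URL);
  const hasBucket = Boolean(env.S3_BUCKET_NAME || env.SPACES_BUCKET_NAME);
  return hasDatabase && hasBucket ? "persistent" : "local-preview";
}

// With nothing configured, the sketch selects local preview mode.
console.log(resolvePersistenceMode({}));
```

In a Node process the same function could be called with `process.env`. Taking the environment as a parameter keeps the decision easy to test.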
If you want the full setup path for your own corpus, start with:
- docs/oss-supported-surface.md
- docs/oss-quickstart.md
- docs/deep-research-setup.md
- docs/bring-your-own-corpus.md
- docs/adapter-architecture.md
This repository is published as two things at once:
- Alpha Research, the shared corpus-research platform layer
- AlphaBook, the active book-centric reference implementation
The boundary is intentional:
- the live AlphaBook product, routes, and copy stay book-centric
- the AlphaBook HTTP API stays work- and book-shaped for compatibility
- the generic extension points for OSS adopters live in neutral packages and the additive /api/v1/documents/* API
- adding a new corpus still requires adapter, ingest, implementation, and deployment work
- apps/frontend: AlphaBook frontend
- apps/orchestrator-worker: Linux API and worker service
- apps/runtime: Linux runtime service for filesystem-backed analysis
- apps/ingest: adapter-aware ingest service
- apps/book-content-worker: static rendered-book content worker for AlphaBook
- packages/corpus-core: generic runtime limits and artifact key helpers
- packages/corpus-text: generic text embedding helpers
- packages/platform: neutral contracts, repository interfaces, and adapter registry
- packages/implementations: implementation-level branding, origins, and prompt configuration
- packages/source-gutenberg: Project Gutenberg adapter for ingest and storage conventions
- packages/source-fixture: minimal non-book adapter and repository example
- packages/db: database client and migration utilities
- packages/shared: AlphaBook-facing contracts, prompts, and compatibility exports
- packages/tooling: local scripts such as migrations and implementation scaffolding
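To make the "adapter registry" idea in packages/platform more concrete, here is a minimal sketch of how a registry of corpus adapters might look. Every name in it (`SourceDocument`, `CorpusAdapter`, `AdapterRegistry`, the `fixture` source id) is hypothetical and chosen for illustration. The real contracts live in packages/platform and may differ substantially.

```typescript
// Hypothetical shapes: none of these names are taken from packages/platform.
interface SourceDocument {
  id: string;
  title: string;
  text: string;
}

interface CorpusAdapter {
  readonly sourceId: string;
  listDocuments(): Promise<SourceDocument[]>;
}

class AdapterRegistry {
  private readonly adapters = new Map<string, CorpusAdapter>();

  // Reject duplicate ids so two adapters cannot silently shadow each other.
  register(adapter: CorpusAdapter): void {
    if (this.adapters.has(adapter.sourceId)) {
      throw new Error(`Adapter already registered: ${adapter.sourceId}`);
    }
    this.adapters.set(adapter.sourceId, adapter);
  }

  // Fail loudly on an unknown id instead of returning undefined.
  get(sourceId: string): CorpusAdapter {
    const adapter = this.adapters.get(sourceId);
    if (!adapter) {
      throw new Error(`Unknown adapter: ${sourceId}`);
    }
    return adapter;
  }

  // Sorted ids of all registered adapters.
  list(): string[] {
    return Array.from(this.adapters.keys()).sort();
  }
}

// A tiny in-memory, non-book adapter in the spirit of source-fixture.
const fixtureAdapter: CorpusAdapter = {
  sourceId: "fixture",
  async listDocuments() {
    return [{ id: "memo-1", title: "Sample memo", text: "Hello corpus." }];
  },
};

const registry = new AdapterRegistry();
registry.register(fixtureAdapter);
```

The point of the sketch is the seam: ingest code asks the registry for an adapter by id and never imports a Gutenberg-specific module directly.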
- Implemented:
  - Linux API chat and health endpoints
  - retrieval, workspace hydration, and cited synthesis flow
  - Linux runtime service integration
  - neutral document API plus AlphaBook compatibility API
  - adapter-aware ingest helpers
  - implementation scaffolding for future corpus-specific deployments
- Still incomplete:
  - daily Project Gutenberg feed diffing
  - full production-grade Gutenberg embedding backfill automation
  - turnkey setup for arbitrary new datasets
The main repo check for the reusable surface is:
npm run validate:extensible
That command covers:
- implementation config typechecks and tests
- platform typechecks and tests
- DB typechecks and tests
- Gutenberg adapter typechecks and tests
- fixture adapter typechecks and tests
- shared compatibility typechecks
- ingest typechecks and tests
- focused orchestrator repository/store tests
If you change AlphaBook product code outside that surface, run the app-specific checks too.
The full environment list is in docs/environment.md.
Core variables include:
- DATABASE_URL
- OPENAI_API_KEY
- OPENAI_MODEL
- OPENAI_SYNTH_MODEL
- OPENAI_EMBEDDING_MODEL
- TOOL_STREAM_CLEANUP_MODEL
- S3_BUCKET_NAME or SPACES_BUCKET_NAME
- S3_ENDPOINT or SPACES_ENDPOINT
- S3_ACCESS_KEY_ID or SPACES_ACCESS_KEY_ID
- S3_SECRET_ACCESS_KEY or SPACES_SECRET_ACCESS_KEY
- S3_REGION or SPACES_REGION
- GUTENBERG_MIRROR_ROOT
- RUNTIME_SERVICE_URL
- RUNTIME_SERVICE_TOKEN
- QUEUE_INGEST_NAME
- QUEUE_JOBS_NAME
- VITE_API_BASE_URL
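Several storage variables accept either an S3_ or a SPACES_ name. One way to read such pairs is sketched below. The helper names (`readAliased`, `readObjectStorageConfig`) are hypothetical. The sketch also assumes that the S3_ name wins when both are set and that all five values are required. The variable list only says "or", so docs/environment.md remains the reference for the real behaviour.

```typescript
type EnvLike = Record<string, string | undefined>;

interface ObjectStorageConfig {
  bucket: string;
  endpoint: string;
  accessKeyId: string;
  secretAccessKey: string;
  region: string;
}

// Assumption: prefer the S3_ name, fall back to the SPACES_ name,
// and treat an empty string as unset.
function readAliased(env: EnvLike, s3Name: string, spacesName: string): string {
  const value = env[s3Name] || env[spacesName];
  if (!value) {
    throw new Error(`Missing ${s3Name} or ${spacesName}`);
  }
  return value;
}

function readObjectStorageConfig(env: EnvLike): ObjectStorageConfig {
  return {
    bucket: readAliased(env, "S3_BUCKET_NAME", "SPACES_BUCKET_NAME"),
    endpoint: readAliased(env, "S3_ENDPOINT", "SPACES_ENDPOINT"),
    accessKeyId: readAliased(env, "S3_ACCESS_KEY_ID", "SPACES_ACCESS_KEY_ID"),
    secretAccessKey: readAliased(env, "S3_SECRET_ACCESS_KEY", "SPACES_SECRET_ACCESS_KEY"),
    region: readAliased(env, "S3_REGION", "SPACES_REGION"),
  };
}
```

Throwing on the first missing pair means a misconfigured deployment fails at startup and names the variable, rather than failing later on the first storage call.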
Run database migrations:
DATABASE_URL=postgres://postgres:postgres@127.0.0.1:5432/alphabook npm run migrate
Run the API:
npm run dev:orchestrator
Run the local Node-backed API harness:
PORT=8788 npm run dev:node -w @alphabook/orchestrator-worker
Run the frontend:
npm run dev:frontend
Run the runtime service:
npm run dev:runtime
Run the ingest service:
npm run dev:ingest