Sequoia Capture is a macOS 15+ Rust project that records:
- system audio from ScreenCaptureKit
- microphone audio from ScreenCaptureKit `captureMicrophone`
- a stereo WAV with channel mapping `L=mic`, `R=system`
- macOS 15+
- Xcode command line tools
- Rust toolchain
- Capture source is ScreenCaptureKit only.
- Microphone capture is enabled through `SCStreamConfiguration.captureMicrophone`.
- System audio and microphone arrive as separate ScreenCaptureKit output streams and are aligned by PTS in the recorder pipeline.
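The `L=mic`, `R=system` channel mapping can be pictured as a plain interleave of two mono buffers. The sketch below is illustrative only: `interleave_stereo` is a hypothetical helper, and zero-filling the shorter stream is an assumption. The actual recorder aligns the two streams by PTS, which this sketch does not model.

```rust
/// Interleave two mono buffers into stereo frames: left = mic, right = system.
/// Assumption (not from the project): a shorter stream is padded with silence.
fn interleave_stereo(mic: &[f32], system: &[f32]) -> Vec<f32> {
    let frames = mic.len().max(system.len());
    let mut out = Vec::with_capacity(frames * 2);
    for i in 0..frames {
        // Left channel carries the microphone sample.
        out.push(mic.get(i).copied().unwrap_or(0.0));
        // Right channel carries the system-audio sample.
        out.push(system.get(i).copied().unwrap_or(0.0));
    }
    out
}

fn main() {
    let stereo = interleave_stereo(&[0.1, 0.2], &[0.9]);
    println!("{:?}", stereo);
}
```

Each output frame is a `(mic, system)` pair, so a player that mutes the right channel leaves only the microphone audible.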
- `src/main.rs`: canonical `recordit` operator CLI shell
- `src/bin/sck_probe.rs`: engineering probe binary (stream/output/timestamp inspection)
- `src/bin/sequoia_capture.rs`: WAV recorder binary
- `src/bin/transcribe_live.rs`: transcription CLI contract and config validation entrypoint
- `Recordit.xcodeproj`: macOS app project containing the `RecorditApp` target/scheme
- `app/RecorditApp/`: SwiftUI `@main` app entrypoint and initial window scene
- `app/RecorditApp/Info.plist`: Recordit.app bundle metadata and privacy usage descriptions
- `packaging/entitlements.plist`: Recordit.app signing entitlements for the current v1 release posture (empty/unsandboxed for v1)
- `docs/research.md`: API/TCC/platform research
- `docs/architecture.md`: real-time pipeline and interleave spec
- `docs/state-machine.md`: executable state-machine single source of truth (CLI/runtime/capture/queue/lifecycle)
- `docs/operator-quickstart.md`: canonical first-run operator path for `recordit`
- `docs/adr-001-backend-decision.md`: backend decision and explicit fallback triggers
- `docs/adr-002-lock-free-transport.md`: callback transport architecture decision
- `docs/adr-003-cleanup-boundary-policy.md`: cleanup isolation and policy decision
- `docs/adr-004-packaged-entrypoint.md`: historical packaged entrypoint decision (superseded for user-facing default policy)
- `docs/adr-005-recordit-default-entrypoint.md`: canonical user-facing default entrypoint policy (`Recordit.app`) and fallback boundary
- `docs/bd-dk69-product-contract-matrix.md`: spec-clause product contract matrix with implementation/app-validation/release-evidence obligations
- `docs/bd-1nqb-build-system-strategy.md`: accepted build-system strategy decision for `Recordit.app` (xcodebuild/Xcode app-target first)
- `docs/bd-3vwh-recordit-app-target.md`: evidence doc for `Recordit.xcodeproj` + `@main` app-target creation/build/launch
- `docs/bd-1gx5-recordit-makefile-packaging.md`: Makefile packaging cutover evidence for Recordit-first build/bundle/sign/verify targets
- `docs/bd-1msp-packaged-gate-retarget.md`: packaged gate retarget evidence for Recordit-default launch semantics + runtime compatibility checks
- `docs/bd-yu7n-recordit-signing-notary-paths.md`: signing/notarization/gatekeeper path retarget evidence for `Recordit.app`
- `docs/bd-1vo3-gatekeeper-notarization-expectations.md`: canonical local ad-hoc vs notarized release validation expectations and exact Gatekeeper/notary checks
- `docs/bd-14y4-sequoiatranscribe-fallback-policy.md`: strict non-default fallback policy for legacy `SequoiaTranscribe` usage
- `docs/bd-k993-coverage-claim-policy.md`: canonical terminology policy for truthful coverage/readiness claims
- `docs/bd-2gw4-release-posture-and-build-context-parity.md`: canonical guide to which dev, packaged, and release contexts are authoritative for which claims
- `docs/bd-1ff5-xctest-xcuitest-retained-artifact-contract.md`: concrete retained-artifact contract for XCTest/XCUITest/app-launched evidence lanes
- `docs/bd-2j49-cross-lane-e2e-evidence-standard.md`: project-wide standard tying shell, packaged, XCTest, XCUITest, and app-launched retained evidence together
- `docs/bd-1ngy-cross-lane-evidence-index-and-triage-map.md`: practical triage map showing where to start and which retained artifacts to inspect for common failure classes
- `docs/beads-governance.md`: issue decomposition, traceability governance, and evidence-linking workflow
Canonical default is the GUI-first Recordit.app user journey. Follow docs/operator-quickstart.md for the full install/launch/first-run validation flow.
Minimal GUI-first sequence:
make create-recordit-dmg RECORDIT_DMG_NAME=Recordit-local.dmg RECORDIT_DMG_VOLNAME='Recordit'
open dist/Recordit-local.dmg

Then drag Recordit.app to Applications, launch it, complete onboarding (permissions + model setup), and validate first live start/stop in-window.
Fallback diagnostics are non-default and should be labeled as compatibility/support workflows:
- `make run-transcribe-app ...` (`SequoiaTranscribe.app` compatibility lane)
- direct CLI commands (`cargo run --bin recordit -- ...`)
make build

Builds debug binaries.
Recordit.app builds use a two-stage runtime handoff:
- `scripts/prepare_recordit_runtime_inputs.sh` performs the Rust build/staging step outside Xcode and writes prebuilt runtime inputs under `.build/recordit-runtime-inputs/<Configuration>/...`.
- Xcode's `Embed Runtime Binaries` phase (`scripts/embed_recordit_runtime_binaries.sh`) is copy-only and consumes those prebuilt inputs via `RECORDIT_RUNTIME_INPUT_DIR` (defaulting to the same `.build` path).
make build-recordit-app runs both stages in order, with explicit stage-prefixed logs:
- `[recordit-app][rust-build] ...`
- `[recordit-app][xcodebuild] ...`
Embedded runtime paths inside Recordit.app:
- `Recordit.app/Contents/Resources/runtime/bin/recordit`
- `Recordit.app/Contents/Resources/runtime/bin/sequoia_capture`
Runtime resolution contract:
- app runtime resolution expects bundled executables in `Contents/Resources/runtime/bin` (plus explicit absolute-path overrides via `RECORDIT_RUNTIME_BINARY` / `SEQUOIA_CAPTURE_BINARY` when intentionally set)
- implicit PATH fallback is disabled by default for startup/readiness so missing bundled payloads fail explicitly
This keeps the GUI-first flow terminal-free for end users (no external PATH setup required).
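The resolution contract above (explicit absolute override first, then the bundled path, never `PATH`) could be sketched roughly as follows. This is not the app's implementation: `resolve_runtime_binary` and its injected `exists` check are hypothetical, and rejecting a relative override is an assumption drawn from the "absolute-path overrides" wording.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical resolver: override (if set) wins, else the bundled binary.
/// There is deliberately no PATH lookup, so a missing payload is an error.
fn resolve_runtime_binary(
    bundle_root: &Path,
    name: &str,
    override_path: Option<&str>,
    exists: impl Fn(&Path) -> bool,
) -> Result<PathBuf, String> {
    if let Some(raw) = override_path {
        let candidate = PathBuf::from(raw);
        // Assumption: relative overrides are rejected rather than resolved.
        if !candidate.is_absolute() {
            return Err(format!("override must be an absolute path: {}", raw));
        }
        if !exists(&candidate) {
            return Err(format!("override does not exist: {}", raw));
        }
        return Ok(candidate);
    }
    let bundled = bundle_root
        .join("Contents/Resources/runtime/bin")
        .join(name);
    if exists(&bundled) {
        Ok(bundled)
    } else {
        Err(format!("missing bundled runtime binary: {}", bundled.display()))
    }
}

fn main() {
    let override_value = std::env::var("RECORDIT_RUNTIME_BINARY").ok();
    let result = resolve_runtime_binary(
        Path::new("/Applications/Recordit.app"),
        "recordit",
        override_value.as_deref(),
        |p| p.exists(),
    );
    println!("{:?}", result);
}
```

Injecting `exists` keeps the sketch testable without touching the filesystem; a real resolver would also need to confirm the file is executable.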
make contracts-ci

Runs the machine-readable contract/schema enforcement suite used by CI (scripts/ci_contracts.sh).

scripts/ci_recordit_xctest_evidence.sh

Runs app-level xcodebuild test lanes, captures per-step logs + .xcresult bundles, and writes deterministic status summaries under:

- `artifacts/ci/xctest_evidence/<stamp>/status.csv`
- `artifacts/ci/xctest_evidence/<stamp>/summary.csv`
- `artifacts/ci/xctest_evidence/<stamp>/responsiveness_budget_summary.csv` (app-level responsiveness gate evidence)
- `artifacts/ci/xctest_evidence/<stamp>/contracts/xctest/evidence_contract.json`
- `artifacts/ci/xctest_evidence/<stamp>/contracts/xcuitest/evidence_contract.json`
- `artifacts/ci/xctest_evidence/<stamp>/contracts/lane_matrix.json`
See docs/bd-1ff5-xctest-xcuitest-retained-artifact-contract.md for the truthful retained-artifact contract, including the current rule that app-launched verification is represented through the xcuitest-evidence lane. See docs/bd-2j49-cross-lane-e2e-evidence-standard.md for the cross-lane summary-surface and traceability standard shared with shell and packaged evidence lanes.
Relevant controls:
- `XCTEST_EVIDENCE_STAMP` (artifact folder name)
- `XCTEST_DESTINATION` (default: `platform=macOS`)
- `CI_STRICT_UI_TESTS` (`1` makes UI-test execution failures required-fail; lane script default is `0`, CI workflow pins this to `1`)
- `XCTEST_RESPONSIVENESS_SUMMARY_PATH` (override path for responsiveness gate key/value artifact)
summary.csv includes responsiveness threshold rows emitted from app-level XCTest gating:
- `threshold_first_stable_transcript_budget_ok`
- `threshold_stop_to_summary_budget_ok`
- `responsiveness_gate_pass`
GitHub Actions workflow: .github/workflows/recordit-xctest-evidence.yml.
make probe CAPTURE_SECS=8

Runs src/bin/sck_probe.rs and prints output-type/timestamp metadata.

make capture CAPTURE_SECS=10 OUT=artifacts/hello-world.wav SAMPLE_RATE=48000

Runs the debug recorder binary directly.

make transcribe-live ASR_MODEL=models/ggml-base.en.bin

Compatibility note: prefer recordit run --mode offline for normal operator usage.
Validates CLI flags, runs representative ASR transcription against --input-wav (auto-generated locally if missing), emits partial/final events to terminal + JSONL, computes VAD boundaries, and writes runtime manifest + mode-specific latency benchmark artifacts.
make transcribe-live-stream ASR_MODEL=models/ggml-base.en.bin

Compatibility note: prefer recordit run --mode live for normal operator usage.
Runs the --live-stream runtime selector and prints absolute paths for captured input + emitted artifacts before execution. In this mode, --input-wav is the progressive scratch capture artifact that grows during runtime, while --out-wav is the canonical session WAV materialized on successful closeout.
Common overrides:
- `TRANSCRIBE_LIVE_STREAM_SECS`
- `TRANSCRIBE_LIVE_STREAM_INPUT_WAV`
- `TRANSCRIBE_LIVE_STREAM_OUT_WAV`, `TRANSCRIBE_LIVE_STREAM_OUT_JSONL`, `TRANSCRIBE_LIVE_STREAM_OUT_MANIFEST`
- `TRANSCRIBE_LIVE_STREAM_ARGS` (pass-through extra `transcribe-live` flags)
| Runtime taxonomy mode | Selector | Use this when | When stable transcript lines appear |
|---|---|---|---|
| `representative-offline` | `<default>` | deterministic artifact validation against an input WAV | mostly at end-of-run summary/replay surfaces |
| `representative-chunked` | `--live-chunked` | near-live scheduler validation on captured WAV (no true concurrent capture) | during runtime as boundaries close, with end summary for complete closeout |
| `live-stream` | `--live-stream` | true live capture + transcription during recording | during active runtime after warmup, then deterministic close summary |
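The selector-to-taxonomy mapping in the table can be expressed as a small lookup. The sketch below is a hypothetical reading of that mapping, not the `transcribe-live` argument parser: `RuntimeMode` and `select_mode` are illustrative names, and the only behavior taken from this README is the default, the two selectors, and their mutual exclusivity.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum RuntimeMode {
    RepresentativeOffline,
    RepresentativeChunked,
    LiveStream,
}

impl RuntimeMode {
    /// Taxonomy name as it appears in the runtime taxonomy table.
    fn taxonomy(self) -> &'static str {
        match self {
            RuntimeMode::RepresentativeOffline => "representative-offline",
            RuntimeMode::RepresentativeChunked => "representative-chunked",
            RuntimeMode::LiveStream => "live-stream",
        }
    }
}

/// Hypothetical selector resolution over raw CLI arguments.
fn select_mode(args: &[&str]) -> Result<RuntimeMode, String> {
    let chunked = args.contains(&"--live-chunked");
    let stream = args.contains(&"--live-stream");
    match (chunked, stream) {
        (true, true) => {
            Err("--live-stream and --live-chunked are mutually exclusive".to_string())
        }
        (true, false) => Ok(RuntimeMode::RepresentativeChunked),
        (false, true) => Ok(RuntimeMode::LiveStream),
        // No selector means the default representative-offline mode.
        (false, false) => Ok(RuntimeMode::RepresentativeOffline),
    }
}

fn main() {
    for args in [vec![], vec!["--live-chunked"], vec!["--live-stream"]] {
        let mode = select_mode(&args).expect("single selector is valid");
        println!("{:?} -> {}", args, mode.taxonomy());
    }
}
```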
Migration note for selector naming compatibility: see docs/live-chunked-migration.md.
Quick first-run path for true live mode:
New machine permission bootstrap (one-time):
make probe CAPTURE_SECS=3

Grant Screen Recording + Microphone access to your terminal when prompted, then continue.
make setup-whispercpp-model
cargo run --bin recordit -- preflight --mode live --json
cargo run --bin recordit -- run --mode live --model artifacts/bench/models/whispercpp/ggml-tiny.en.bin --json
cargo run --bin recordit -- replay --jsonl <session-root>/session.jsonl --format json

What to expect:
- startup banner is deterministic and compact: `runtime_mode`, `runtime_mode_taxonomy`, `runtime_mode_selector`, `runtime_mode_status`, `channel_mode_requested`, `duration_sec`, `input_wav`, and canonical artifact paths.
- when launched through `recordit`, the legacy verbose `Transcribe-live configuration` dump is suppressed by default so startup remains concise.
- `warmup` lifecycle starts first; transcript lines are not expected until runtime reaches `active`.
- once `active`, interactive terminals show low-noise partial updates and stable final lines as segments close.
- close summary is deterministic; if stable lines were already shown live, duplicate replay is suppressed.
- health interpretation is deterministic: `ok` (no trust notices), `degraded` (trust notices present), `failed` (non-zero exit before successful close-summary emission).
- runtime result includes `remediation_hints` as a concise, deterministic top-hints line for common degradation/failure follow-ups.
- if summary reports degraded trust/degradation counters, use `reconciled_final` plus manifest trust/reconciliation fields for canonical review.
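The three health states listed above reduce to a short decision function. The sketch below is an interpretation, not the runtime's code: `Health` and `classify` are hypothetical names, and treating a non-zero exit that occurs after the close summary as `ok`/`degraded` rather than `failed` is an assumption, since the README only defines `failed` for exits before close-summary emission.

```rust
#[derive(Debug, PartialEq)]
enum Health {
    Ok,
    Degraded,
    Failed,
}

/// Hypothetical classifier over three observations from a finished run.
fn classify(exit_code: i32, close_summary_emitted: bool, trust_notice_count: usize) -> Health {
    if exit_code != 0 && !close_summary_emitted {
        // Non-zero exit before the close summary was emitted.
        Health::Failed
    } else if trust_notice_count > 0 {
        // The run closed out, but trust notices are present.
        Health::Degraded
    } else {
        Health::Ok
    }
}

fn main() {
    println!("{:?}", classify(0, true, 0));
    println!("{:?}", classify(0, true, 2));
    println!("{:?}", classify(1, false, 0));
}
```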
make capture-transcribe

Runs sequoia_capture first and stops immediately on capture failure, then invokes transcribe-live on the captured WAV (--asr-backend whispercpp by default). The target prints absolute paths for input/output artifacts before execution.
Common overrides:
- `PIPELINE_SECS` (capture/transcribe duration)
- `PIPELINE_CAPTURE_WAV` (intermediate captured WAV used as `--input-wav`)
- `PIPELINE_OUT_WAV`, `PIPELINE_OUT_JSONL`, `PIPELINE_OUT_MANIFEST`
- `PIPELINE_CHANNEL_MODE` (`separate`, `mixed`, or `mixed-fallback`)
- `PIPELINE_ASR_MODEL` (optional explicit model path; if unset, transcribe-live uses its backend default resolution)
- `PIPELINE_ARGS` (pass-through extra `transcribe-live` flags)
make transcribe-preflight ASR_MODEL=models/ggml-base.en.bin

Runs structured PASS/WARN/FAIL prerequisite checks before capture/transcription startup and writes a preflight manifest to --out-manifest.

make smoke

Runs the CI-safe smoke bundle:

- `make smoke-offline` (deterministic offline journey)
- `make smoke-near-live-deterministic` (deterministic near-live fallback using a stereo fixture)
Host near-live smoke (machine-dependent, requires Screen Recording + Microphone permissions):
make smoke-near-live

Smoke artifact roots:

- offline: `artifacts/smoke/offline/`
- near-live host capture: `artifacts/smoke/near-live/`
- near-live deterministic fallback: `artifacts/smoke/near-live-deterministic/`
make gate-d-soak

Runs the deterministic near-live soak harness (scripts/gate_d_soak.sh) and writes per-run artifacts plus summary.csv under artifacts/bench/gate_d/<timestamp>/.

make gate-backlog-pressure

Runs the deterministic backlog-pressure gate harness and writes artifacts under artifacts/bench/gate_backlog_pressure/<timestamp>/.

make gate-transcript-completeness

Runs the reconciliation completeness gate under induced backlog and writes artifacts under artifacts/bench/gate_transcript_completeness/<timestamp>/.

make gate-v1-acceptance

Runs deterministic cold/warm near-live checks plus backlog/trust checks and writes artifacts under artifacts/bench/gate_v1_acceptance/<timestamp>/.

make gate-packaged-live-smoke

Runs the packaged smoke gate with two layers of validation:

- `Recordit.app` remains the GUI-default packaged launch path (`run-recordit-app` plan semantics)
- signed compatibility runtime (`SequoiaTranscribe.app`) still satisfies live-stream artifact/trust/timing contracts
Machine-readable evidence is written under:
- `~/Library/Containers/com.recordit.sequoiatranscribe/Data/artifacts/packaged-beta/gates/gate_packaged_live_smoke/<timestamp>/summary.csv`
- `~/Library/Containers/com.recordit.sequoiatranscribe/Data/artifacts/packaged-beta/gates/gate_packaged_live_smoke/<timestamp>/status.txt`
- `~/Library/Containers/com.recordit.sequoiatranscribe/Data/artifacts/packaged-beta/gates/gate_packaged_live_smoke/<timestamp>/recordit_run_plan.log`
Key packaged live checks:
- `recordit_launch_semantics_ok=true`: default packaged launch plan resolves to `dist/Recordit.app` via `run-recordit-app`
- `runtime_first_stable_emit_ok=true`: first stable transcript evidence is present during active runtime
- `runtime_transcript_surface_ok=true`: manifest/JSONL transcript surfaces are populated
- `runtime_manifest_out_wav_match_ok=true`: manifest `session_summary.artifacts.out_wav` matches the canonical runtime `out_wav` path
- `runtime_manifest_out_jsonl_match_ok=true`: manifest `session_summary.artifacts.out_jsonl` matches the canonical runtime JSONL path
- `runtime_terminal_live_mode_ok=true`: terminal contract stayed in live mode without replay fallback
- `gate_pass=true`: packaged live-stream operator path satisfies the current acceptance bar
Reference: docs/gate-packaged-live-smoke.md.
Post-implementation verification checklist and evidence index: docs/post-implementation-verification-checklist.md.
make create-recordit-dmg

Builds dist/Recordit.dmg from dist/Recordit.app and stages an Applications alias/symlink in the DMG root so install UX is explicit.

make inspect-recordit-release-artifacts

Builds or reuses the current packaged artifacts and writes a retained evidence bundle under:

- `artifacts/ops/release-artifact-inspection/<timestamp>/summary.csv`
- `artifacts/ops/release-artifact-inspection/<timestamp>/dist_release_context/summary.csv`
- `artifacts/ops/release-artifact-inspection/<timestamp>/artifacts/xcode_bundle_inventory.json`
- `artifacts/ops/release-artifact-inspection/<timestamp>/artifacts/dmg_root_inventory.json`
This is the canonical automated inspection path for the current v1 release posture: it captures Xcode-built app inventory, nested dist/Recordit.app release-context verification, DMG metadata/checksum/mounted contents, and runtime-payload parity across those artifact layers.
Optional overrides:
- `RECORDIT_DMG_NAME` (default: `Recordit.dmg`)
- `RECORDIT_DMG_VOLNAME` (default: `Recordit`)
make gate-dmg-install-open

Runs retained install-surface verification for Recordit.dmg with standardized e2e evidence output:

- optional app/DMG build steps (`make sign-recordit-app`, DMG creation)
- DMG attach + layout checks (`Recordit.app` presence, `Applications` link target)
- copy/install to an explicit destination root
- launch attempt of the installed app (`open -n`) and deterministic launch diagnostics
- explicit detach cleanup
Evidence root default:
artifacts/ops/gate_dmg_install_open/<timestamp>/
Key retained outputs:
- `evidence_contract.json`
- `summary.csv`
- `summary.json`
- `status.txt`
- `logs/<phase>.log|stdout|stderr`
- `artifacts/dmg_attach.plist`
- `artifacts/dmg_layout_report.txt`
- `artifacts/install_copy_report.txt`
- `artifacts/open_launch_report.txt`
Useful overrides:
- `OUT_DIR`
- `RECORDIT_DMG_NAME`
- `RECORDIT_DMG_VOLNAME`
- `SKIP_BUILD=1` (reuse existing `dist/Recordit.app`)
- `SKIP_DMG_BUILD=1` (reuse an existing DMG path)
- `INSTALL_DESTINATION=<path>`
- `OPEN_WAIT_SEC=<seconds>`
Reference: docs/gate-dmg-install-open.md.
make run-transcribe-app ASR_MODEL=models/ggml-base.en.bin

Superseded-default context:

- `docs/adr-005-recordit-default-entrypoint.md` makes `Recordit.app` the canonical user-facing default.
- this `run-transcribe-app` / `SequoiaTranscribe.app` path remains a legacy compatibility and fallback lane for internal runtime continuity while cutover work completes.
- fallback policy guardrails (scope/escalation/timeline): `docs/bd-14y4-sequoiatranscribe-fallback-policy.md`
Builds/signs dist/SequoiaTranscribe.app (signed app mode for transcribe-live).
Default packaged runs launch via open -W; live selectors such as --live-stream and --live-chunked run the signed executable directly so terminal transcript output remains attached to the invoking shell.
For those attached live runs, the explicit --asr-model asset is staged into the app container before launch so the signed runtime can read it under sandbox rules.
This is a compatibility/fallback packaged launch path, not the primary user-facing default.
The target prints absolute container-scoped artifact destinations before launch and prints a concise post-run session summary after the signed app exits.
For packaged diagnostics on the same path, use make run-transcribe-preflight-app.
For engineering-only development flows, keep using debug targets such as make transcribe-live, make capture-transcribe, and direct cargo run.
Decision records: docs/adr-005-recordit-default-entrypoint.md (current default policy), docs/adr-004-packaged-entrypoint.md (historical/superseded for default policy).
Packaged artifact destination defaults:
- root: `~/Library/Containers/com.recordit.sequoiatranscribe/Data/artifacts/packaged-beta/`
- session files:
  - `session.wav`
  - `session.jsonl`
  - `session.manifest.json`
Optional overrides:
- `TRANSCRIBE_APP_ARTIFACT_ROOT`
- `TRANSCRIBE_APP_SESSION_STEM`
Explicit packaged live-stream wrapper:
make run-transcribe-live-stream-app ASR_MODEL=models/ggml-base.en.bin

This keeps the same signed app entrypoint, prints the live input/output artifact paths before launch, and uses:

- `<root>/<session-stem>.input.wav`
- `<root>/<session-stem>.wav`
- `<root>/<session-stem>.jsonl`
- `<root>/<session-stem>.manifest.json`
Artifact semantics for this wrapper:

- `<session-stem>.input.wav`: progressive live scratch artifact written during capture
- `<session-stem>.wav`: canonical session artifact materialized after successful runtime shutdown/drain
Packaged live follow-on evidence path:
- `make gate-packaged-live-smoke` writes packaged live smoke evidence under `<root>/gates/gate_packaged_live_smoke/<timestamp>/...`
- reference: `docs/adr-004-packaged-entrypoint.md` (follow-on design section)
make run-transcribe-preflight-app ASR_MODEL=models/ggml-base.en.bin

Runs the same preflight diagnostics in signed app context and writes results into the configured manifest path. Default signed preflight manifest path:
~/Library/Containers/com.recordit.sequoiatranscribe/Data/artifacts/packaged-beta/session.manifest.json
make run-transcribe-model-doctor-app ASR_MODEL=models/ggml-base.en.bin

Runs model/backend diagnostics in the same signed app context used by packaged beta runs so model resolution and backend readiness can be verified without using debug-only entrypoints.
For live-stream prerequisite diagnostics in packaged context:
make run-transcribe-model-doctor-app \
ASR_MODEL=models/ggml-base.en.bin \
TRANSCRIBE_ARGS=--live-stream

Use this path for live-mode readiness checks when you want backend/model diagnostics only.
Canonical policy: --preflight is compatible with --live-stream and --live-chunked; use --replay-jsonl for post-run replay and keep it separate.
make sign SIGN_IDENTITY=-
make verify

Builds dist/SequoiaCapture.app, sets Swift runtime rpath, signs, and verifies entitlements/signature.

make run-app CAPTURE_SECS=10 OUT=artifacts/hello-world.wav SAMPLE_RATE=48000

Launches the app bundle via open -W and passes recorder arguments.

make reset-perms

Resets ScreenCapture and Microphone grants for both:

- `com.recordit.sequoiatranscribe`
- `com.recordit.sequoiacapture`

make clean

Removes build/output artifacts and runs cargo clean.
cargo run --bin recordit -- run --mode live
cargo run --bin recordit -- run --mode offline --input-wav <path>
cargo run --bin recordit -- doctor
cargo run --bin recordit -- preflight --mode live
cargo run --bin recordit -- replay --jsonl <path>
cargo run --bin recordit -- inspect-contract cli --format json

- `recordit` is the recommended human-facing path for normal operator workflows.
- The legacy `transcribe-live` contract remains stable for scripts, gates, and expert-only controls.
- Contract/schema evolution policy: `docs/schema-versioning-policy.md`
cargo run --bin sck_probe -- [duration_seconds]

- `duration_seconds` optional, default `8`
cargo run --bin sequoia_capture -- [duration_seconds] [output_path] [sample_rate_hz] [sample_rate_mismatch_policy] [callback_contract_mode]

- `duration_seconds` optional, default `10`
- `output_path` optional, default `artifacts/hello-world.wav`
- `sample_rate_hz` optional, default `48000`
- `sample_rate_mismatch_policy` optional, default `adapt-stream-rate` (`adapt-stream-rate` or `strict`)
  - `adapt-stream-rate` keeps callback non-blocking and performs worker-side resampling to the requested output rate when mic/system native rates differ
  - `strict` fails fast when either stream rate differs from the requested target
- `callback_contract_mode` optional, default `warn` (`warn` or `strict`)
- telemetry artifact `<output_stem>.telemetry.json` includes `sample_rate_policy` with input rates and resampled chunk/frame counters
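The difference between the two `sample_rate_mismatch_policy` values can be sketched as a planning step that runs before any audio is processed. The code below is illustrative rather than the recorder's implementation: `MismatchPolicy`, `RatePlan`, `parse_policy`, and `plan_rates` are hypothetical names, and no actual resampling is performed.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum MismatchPolicy {
    AdaptStreamRate,
    Strict,
}

/// Which streams would need worker-side resampling to reach the target rate.
#[derive(Debug, PartialEq)]
struct RatePlan {
    resample_mic: bool,
    resample_system: bool,
}

fn parse_policy(raw: &str) -> Result<MismatchPolicy, String> {
    match raw {
        "adapt-stream-rate" => Ok(MismatchPolicy::AdaptStreamRate),
        "strict" => Ok(MismatchPolicy::Strict),
        other => Err(format!("unknown sample_rate_mismatch_policy: {}", other)),
    }
}

fn plan_rates(
    policy: MismatchPolicy,
    target_hz: u32,
    mic_hz: u32,
    system_hz: u32,
) -> Result<RatePlan, String> {
    let mic_differs = mic_hz != target_hz;
    let system_differs = system_hz != target_hz;
    if policy == MismatchPolicy::Strict && (mic_differs || system_differs) {
        // strict: fail fast when either stream differs from the target.
        return Err(format!(
            "strict policy: mic={} Hz, system={} Hz, target={} Hz",
            mic_hz, system_hz, target_hz
        ));
    }
    // adapt-stream-rate: mark the mismatched streams for resampling.
    Ok(RatePlan {
        resample_mic: mic_differs,
        resample_system: system_differs,
    })
}

fn main() {
    let policy = parse_policy("adapt-stream-rate").expect("known policy");
    println!("{:?}", plan_rates(policy, 48_000, 44_100, 48_000));
    println!("{:?}", plan_rates(MismatchPolicy::Strict, 48_000, 44_100, 48_000));
}
```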
cargo run --bin transcribe-live -- [--asr-model <local-model-path>] [flags...]

- Migration note:
  - prefer `recordit run --mode live` (or `--mode offline`) for normal operator usage
  - keep using `transcribe-live` for legacy automation/gates and deep engineering controls
  - `transcribe-live --help` now prints this migration guidance directly for operators
- key flags currently validated:
  - `--duration-sec`
  - `--input-wav`
  - `--out-wav`
  - `--out-jsonl`
  - `--out-manifest`
  - `--sample-rate`
  - `--asr-backend`
  - `--asr-model`
  - `--asr-language`
  - `--asr-threads`
  - `--asr-profile`
  - `--vad-backend`
  - `--vad-threshold`
  - `--vad-min-speech-ms`
  - `--vad-min-silence-ms`
  - `--llm-cleanup`
  - `--llm-endpoint`
  - `--llm-model`
  - `--llm-timeout-ms`
  - `--llm-max-queue`
  - `--llm-retries`
  - `--live-chunked`
  - `--live-stream`
  - `--chunk-window-ms`
  - `--chunk-stride-ms`
  - `--chunk-queue-cap`
  - `--live-asr-workers`
  - `--keep-temp-audio`
  - `--transcribe-channels`
  - `--speaker-labels`
  - `--benchmark-runs`
  - `--model-doctor`
  - `--replay-jsonl`
  - `--preflight`
- `--out-wav` contract:
  - canonical session WAV artifact path for the run
  - always materialized on successful runtime execution
  - for `--live-stream`, materialized from the progressive `--input-wav` scratch artifact during successful runtime closeout
  - for representative modes, materialized according to the mode-specific input/output semantics described in the manifest
  - runtime manifest records `out_wav_materialized` and `out_wav_bytes` so artifact truth does not depend on reading the filesystem out-of-band
- backend values:
  - `whispercpp` (primary and the only standard v1 setup path for `Recordit.app`)
  - `whisperkit` (advanced/manual compatibility path until packaged parity exists)
  - `moonshine` (placeholder; adapter not wired yet)
- model resolution precedence:
  - `--asr-model <path>` (explicit override, highest priority)
  - explicit `--asr-model` is fail-fast: missing/invalid explicit paths do not fall through to defaults
  - `RECORDIT_ASR_MODEL` environment variable
  - backend defaults (sandbox container model path, then repo-local model defaults)
  - whispercpp expects a file path; whisperkit expects a directory path
  - preflight/runtime manifests expose both resolved path and source (`asr_model_resolved`, `asr_model_source`)
- model doctor:
  - run `cargo run --bin transcribe-live -- --model-doctor [--asr-backend ...] [--asr-model ...]`
  - PASS/WARN/FAIL report includes backend helper availability, model path resolution/kind, and model readability
  - use this as first-stop diagnostics before runtime execution when model/backend setup is uncertain
- channel mode values:
  - `separate`
  - `mixed`
  - `mixed-fallback` (prefers separate but falls back to mixed when dual-channel inputs are unavailable)
- near-live runtime contract:
  - default runtime mode is `representative-offline`
  - enable near-live contract with `--live-chunked`
  - runtime taxonomy is authoritative and currently split into:

    | Taxonomy mode | Current selector | `runtime_mode` artifact value | Primary operator intent | Transcript timing expectation | `--replay-jsonl` compatibility | `--preflight` compatibility |
    |---|---|---|---|---|---|---|
    | `representative-offline` | `<default>` | `representative-offline` | deterministic offline transcript contract validation | stable transcript lines are primarily end-of-run surfaces | compatible | compatible |
    | `representative-chunked` | `--live-chunked` | `live-chunked` | near-live queue/scheduler behavior validation on captured WAV | runtime stable lines emit as boundaries close; summary closes out full session | incompatible | compatible |
    | `live-stream` | `--live-stream` | `live-stream` | true concurrent capture + transcription while recording | transcript emission starts after warmup enters `active` and continues during capture | incompatible | compatible |

  - `--live-chunked` prepares runtime input via the shared in-process live capture runtime (`recordit::live_capture`) and then runs a rolling near-live scheduler over the captured WAV
  - rolling scheduler semantics: `2s` default window, `0.5s` default stride, deterministic chunk segment IDs, and tail-aligned final window coverage
  - boundary-scoped final segment IDs are normalized from deterministic boundary ordering (`start_ms`, `end_ms`, `source`, `id`) so IDs stay stable even if upstream boundary insertion order changes
  - near-live ASR work is routed through a bounded queue; when saturated, oldest queued chunk work is dropped to preserve non-blocking producer behavior
  - if chunk backlog caused drops, a post-session reconciliation pass emits `reconciled_final` events from canonical session audio to improve final completeness without hiding live-path degradation
  - `--chunk-window-ms` default `2000`
  - `--chunk-stride-ms` default `500`
  - `--chunk-queue-cap` default `4`
  - `--live-asr-workers` default `2`
  - `--chunk-stride-ms` must be `<= --chunk-window-ms`
  - live ASR channel work runs through a dedicated worker pool with explicit backend prewarm before the first live run
  - channel-slice temp WAVs default to `retain-on-failure` cleanup; add `--keep-temp-audio` to retain them on success for debugging
  - chunk tuning flags require `--live-chunked` or `--live-stream`
  - `--live-stream` and `--live-chunked` are mutually exclusive selectors
  - `--live-chunked` and `--live-stream` are incompatible with `--replay-jsonl`
  - `--preflight` is compatible with both live selectors and should be used as a readiness diagnostic lane before live runtime execution
  - selector naming/deprecation guidance lives in `docs/live-chunked-migration.md`
- mode/degradation artifact policy:
  - runtime manifest records both `channel_mode_requested` and active `channel_mode`
  - runtime manifest records mode contracts as additive fields: `runtime_mode`, `runtime_mode_taxonomy`, `runtime_mode_selector`, `runtime_mode_status`
  - runtime JSONL emits `event_type=mode_degradation` when fallback/degradation occurs
  - runtime JSONL emits `event_type=trust_notice` with cause/impact/guidance for user-facing trust calibration
  - runtime JSONL emits `event_type=asr_worker_pool` with prewarm, queue, and temp-audio cleanup counters
  - runtime JSONL emits `event_type=chunk_queue` with near-live queue pressure + lag counters
  - runtime JSONL is append-only and emitted incrementally during lifecycle progression (not only at shutdown)
  - in `live-stream`, lifecycle transitions and transcript events are emitted during active runtime so JSONL growth itself is evidence of true live behavior
  - runtime JSONL durability checkpoints call `sync_data()` every 24 lines and at stage boundaries
  - runtime manifest records `out_wav`, `out_wav_materialized`, and `out_wav_bytes` for canonical session artifact truth
  - runtime manifest includes a `degradation_events` array with stable `code` + `detail`
  - runtime manifest includes `asr_worker_pool` telemetry (`prewarm_ok`, `submitted`, `enqueued`, `dropped_queue_full`, `processed`, `succeeded`, `failed`, `retry_attempts`, `temp_audio_deleted`, `temp_audio_retained`)
  - runtime manifest includes `chunk_queue` telemetry (`submitted`, `enqueued`, `dropped_oldest`, `processed`, `pending`, `high_water`, `lag_sample_count`, `lag_p50_ms`, `lag_p95_ms`, `lag_max_ms`)
  - runtime manifest includes a structured `trust` object (`degraded_mode_active`, `notice_count`, `notices`)
  - runtime manifest includes `session_summary`, a deterministic machine-consumable mirror of terminal close-summary fields (session_status, modes, transcript event counts, queue/lag, trust/degradation top codes, cleanup queue, artifacts)
  - runtime manifest `event_counts` includes transcript family counts (`partial`, `final`, `llm_final`, `reconciled_final`) for deterministic diagnostics
  - runtime manifest `first_emit_timing_ms` includes `first_any`, `first_partial`, `first_final`, and `first_stable` so gates can validate active-runtime emission without relying on raw JSONL row ordering
  - replay output prints trust notices so audit reads preserve degraded-mode context
- readability default contract:
  - terminal rendering is capability-aware:
    - interactive `TTY` shows low-noise partial overwrite updates using `[MM:SS.mmm-MM:SS.mmm] <channel> ~ <text>` and appends stable `final` lines as segments close
    - non-`TTY` logs suppress partial overwrite updates and emit deterministic stable `final` lines during active runtime
    - end-of-session summary avoids replaying those already-emitted live stable lines to reduce duplicate noise
  - terminal close-summary fields are emitted in deterministic order (`session_status`, `duration_sec`, mode fields, transcript event counts, queue/lag, trust/degradation, cleanup queue, artifacts)
  - merged transcript line format: `[MM:SS.mmm-MM:SS.mmm] <channel>: <text>`
  - per-channel transcript line format: `[MM:SS.mmm-MM:SS.mmm] <text>`
  - near-simultaneous cross-channel finals are deterministic: keep canonical sort order and annotate the later line with `(overlap<=120ms with <channel>)`
  - runtime manifest includes `terminal_summary` (`live_mode`, `stable_line_count`, `stable_lines_replayed`, `stable_lines`) aligned with end-of-session terminal behavior
  - runtime manifest persists ordered transcript events (`partial`, `final`, `llm_final`, `reconciled_final`) under `events`
  - runtime manifest includes `readability_defaults` + `transcript_per_channel` entries
- use `cargo run --bin transcribe-live -- --help` to print the full contract
- cleanup isolation policy:
  - finalized-segment cleanup is queued via non-blocking enqueue (`try_send`)
  - queue-full cleanup requests are dropped, never blocking ASR/final event emission
  - processed cleanup requests use `--llm-timeout-ms` and `--llm-retries` policy
  - prompt policy is constrained to readability cleanup only (no semantic expansion)
  - successful cleanup emits `llm_final` events with `source_final_segment_id` lineage to the original `final` segment
  - queue telemetry is emitted as `cleanup_queue` in both JSONL and runtime manifest outputs
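The cleanup isolation policy (non-blocking `try_send`, with queue-full requests dropped) can be illustrated with the standard library's bounded channel. This is a minimal sketch, not the project's queue: `CleanupStats`, `offer_cleanup`, and the counter names are hypothetical and do not reflect the actual `cleanup_queue` telemetry fields.

```rust
use std::sync::mpsc::{sync_channel, SyncSender, TrySendError};

/// Hypothetical counters; the real `cleanup_queue` field names may differ.
#[derive(Debug, Default)]
struct CleanupStats {
    enqueued: u64,
    dropped_queue_full: u64,
    dropped_disconnected: u64,
}

/// Offer one finalized segment for cleanup without ever blocking the caller.
fn offer_cleanup(tx: &SyncSender<String>, segment_id: &str, stats: &mut CleanupStats) {
    match tx.try_send(segment_id.to_string()) {
        Ok(()) => stats.enqueued += 1,
        // Queue is full: drop the request instead of waiting for a slot.
        Err(TrySendError::Full(_)) => stats.dropped_queue_full += 1,
        // Worker is gone: also drop, and count it separately.
        Err(TrySendError::Disconnected(_)) => stats.dropped_disconnected += 1,
    }
}

fn main() {
    // Capacity 2 with no consumer running, so the third offer is dropped.
    let (tx, rx) = sync_channel::<String>(2);
    let mut stats = CleanupStats::default();
    for id in ["seg-1", "seg-2", "seg-3"] {
        offer_cleanup(&tx, id, &mut stats);
    }
    println!("{:?}", stats);
    println!("queued: {:?}", rx.try_iter().collect::<Vec<_>>());
}
```

Because the producer never waits on the queue, a slow or stalled cleanup worker costs dropped cleanup requests rather than delayed `final` events.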
Sample readable transcript output:
[00:00.000-00:00.420] mic: hello from mic
[00:00.050-00:00.410] system: hello from system (overlap<=120ms with mic)
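The two sample lines above can be reproduced with a small formatter. The sketch below is an assumption-laden illustration: `fmt_ts`, `merged_line`, and `overlap_note` are hypothetical helpers, and the overlap test (start times within 120 ms of each other) is a guess at the rule behind the `(overlap<=120ms with <channel>)` annotation, not the documented criterion.

```rust
/// Format a millisecond offset as MM:SS.mmm.
fn fmt_ts(ms: u64) -> String {
    format!("{:02}:{:02}.{:03}", ms / 60_000, (ms / 1_000) % 60, ms % 1_000)
}

/// Merged transcript line: [MM:SS.mmm-MM:SS.mmm] <channel>: <text>
fn merged_line(start_ms: u64, end_ms: u64, channel: &str, text: &str) -> String {
    format!("[{}-{}] {}: {}", fmt_ts(start_ms), fmt_ts(end_ms), channel, text)
}

/// Assumption: finals "overlap" when their start times differ by <= 120 ms.
/// Returns the suffix for the later line, naming the earlier channel.
fn overlap_note(earlier_start_ms: u64, later_start_ms: u64, earlier_channel: &str) -> Option<String> {
    if later_start_ms.abs_diff(earlier_start_ms) <= 120 {
        Some(format!(" (overlap<=120ms with {})", earlier_channel))
    } else {
        None
    }
}

fn main() {
    let first = merged_line(0, 420, "mic", "hello from mic");
    let mut second = merged_line(50, 410, "system", "hello from system");
    if let Some(note) = overlap_note(0, 50, "mic") {
        second.push_str(&note);
    }
    println!("{}", first);
    println!("{}", second);
}
```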
Replay example:
cargo run --bin transcribe-live -- --replay-jsonl artifacts/transcribe-live.runtime.jsonl

Benchmark artifacts are written under:

- `artifacts/bench/transcribe-live-single-channel/<timestamp>/summary.csv`
- `artifacts/bench/transcribe-live-single-channel/<timestamp>/runs.csv`
- `artifacts/bench/transcribe-live-dual-channel/<timestamp>/summary.csv`
- `artifacts/bench/transcribe-live-dual-channel/<timestamp>/runs.csv`
- `artifacts/bench/gate_backlog_pressure/<timestamp>/summary.csv`
- `artifacts/bench/gate_transcript_completeness/<timestamp>/summary.csv`
- `make capture` / direct `cargo run --bin sequoia_capture`: output path is resolved from the current shell working directory.
- `make run-app`: app is sandboxed, so relative paths resolve inside container storage.
- `make transcribe-live`, `make transcribe-live-stream`, and `make run-transcribe-app`: all pass absolute artifact paths and print them before execution.
- `make run-transcribe-app` keeps live selector runs (`--live-stream`, `--live-chunked`) attached to the current terminal so incremental transcript output can render during execution.
- `make run-transcribe-app` stages explicit live-run model assets under the packaged container root before attached execution so the signed runtime can read them.
- `make run-transcribe-app` also prints a post-run session summary (manifest presence + trust/degradation counters when `jq` is available).
- `make transcribe-preflight` and `make run-transcribe-preflight-app`: run deterministic preflight checks and persist checklist outcomes in the manifest output.
- `make transcribe-model-doctor` and `make run-transcribe-model-doctor-app`: run model/backend diagnostics in debug and packaged contexts with the same operator-facing contract.
- Signed transcribe targets default to container-scoped absolute destinations under:
~/Library/Containers/com.recordit.sequoiatranscribe/Data/artifacts/packaged-beta/
Default run-app output for OUT=artifacts/hello-world.wav:
~/Library/Containers/com.recordit.sequoiacapture/Data/artifacts/hello-world.wav
Open the generated file:
open ~/Library/Containers/com.recordit.sequoiacapture/Data/artifacts/hello-world.wav
Copy into repo-local artifacts/:
cp ~/Library/Containers/com.recordit.sequoiacapture/Data/artifacts/hello-world.wav ./artifacts/hello-world.wav
Running the signed app requests macOS privacy permissions for:
- Screen/Screen & System Audio Recording
- Microphone
On Sequoia, direct screen/audio access may also show the private-window-picker bypass prompt with an Allow for one month option.
- The signed app exits automatically after `CAPTURE_SECS`; this is expected CLI behavior.
- At least one display must be available for the ScreenCaptureKit content filter.
recordit is built to solve a specific reliability problem in real-time transcription workflows on macOS:
- capture system + microphone audio in one deterministic session
- generate readable transcript output while the session is running
- always emit machine-consumable artifacts that automation can trust
- preserve degradation signals explicitly instead of silently hiding them
The project is useful when humans and automation both need the same session output:
- humans need concise terminal feedback and readable lines
- automation needs stable schemas, stable event types, and stable exit semantics
At a system level, recordit is four cooperating layers:
- Operator shell (`recordit` CLI)
  - canonical command grammar for `run`, `preflight`, `doctor`, `replay`, and `inspect-contract`
  - mode mapping and guardrails before runtime starts
- Capture substrate (`live_capture`)
  - ScreenCaptureKit callback ingestion for system audio + microphone
  - non-blocking callback path with bounded lock-free transport
  - deterministic stereo output contract (`L=mic`, `R=system`)
- Runtime coordinator (`live_stream_runtime` + `live_asr_pool`)
  - lifecycle control (`warmup`, `active`, `draining`, `shutdown`)
  - bounded queueing and priority-aware scheduling for ASR work
  - transcript/event assembly with deterministic ordering
- Contract/artifact boundary
  - append-only runtime JSONL event stream
  - deterministic runtime/preflight manifests
  - machine-readable compatibility contracts in `contracts/*.json`
- Parse and validate command intent
  - `recordit` enforces operator-facing mode rules before dispatching runtime work.
- Resolve runtime identity
  - runtime mode tuple is explicit in output (`runtime_mode`, `runtime_mode_taxonomy`, `runtime_mode_selector`, `runtime_mode_status`).
- Start capture and scheduler
  - callback thread ingests and queues audio chunks without blocking.
  - worker/runtime threads perform VAD-driven chunking and ASR submission.
- Emit progressive evidence
  - JSONL is written incrementally with transcript events and control events (`lifecycle_phase`, `chunk_queue`, `trust_notice`, etc.).
- Reconcile and close
  - runtime drains outstanding work, writes final artifacts, emits session summary, and classifies session as nominal/degraded/failed via manifest fields and exit semantics.
- mic and system streams are anchored to a shared timeline using presentation timestamps (PTS).
- output frame placement is deterministic relative to timeline origin.
- channel mapping is fixed (`L=mic`, `R=system`) to keep downstream replay and analysis stable.
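The PTS anchoring and fixed channel mapping described above can be sketched as follows. This is an illustrative model only: it assumes PTS values expressed in seconds, mono `f32` chunks per source, and an interleaved stereo buffer. `Channel`, `frame_offset`, and `place_chunk` are hypothetical names, not the recorder's API.

```rust
/// Source of a mono chunk; maps to a fixed stereo lane (L=mic, R=system).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Channel {
    Mic,
    System,
}

/// Convert a presentation timestamp into a frame index relative to the timeline origin.
/// Timestamps earlier than the origin are clamped to frame 0 in this sketch.
pub fn frame_offset(pts_secs: f64, origin_secs: f64, sample_rate: u32) -> usize {
    let delta = (pts_secs - origin_secs).max(0.0);
    (delta * sample_rate as f64).round() as usize
}

/// Write a mono chunk into an interleaved stereo buffer at a frame offset.
/// The buffer grows with silence (0.0) so gaps between chunks stay explicit.
pub fn place_chunk(stereo: &mut Vec<f32>, channel: Channel, offset: usize, samples: &[f32]) {
    let needed = (offset + samples.len()) * 2;
    if stereo.len() < needed {
        stereo.resize(needed, 0.0);
    }
    let lane = match channel {
        Channel::Mic => 0,    // left
        Channel::System => 1, // right
    };
    for (i, sample) in samples.iter().enumerate() {
        stereo[(offset + i) * 2 + lane] = *sample;
    }
}
```

Since placement depends only on the PTS, the origin, and the sample rate, replaying the same chunks in any arrival order produces the same buffer.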
- live modes use chunk window/stride controls to generate rolling ASR work while recording continues.
- VAD boundaries drive segment lifecycle and finalization timing.
- deterministic replay surfaces are preserved by stable ordering metadata and event emission rules.
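As a rough illustration of window/stride chunking, the sketch below enumerates the frame ranges that rolling ASR work would cover over the audio captured so far. The function name and the choice to emit only full windows are assumptions; the project's actual chunking is VAD-driven and may treat trailing audio differently.

```rust
/// Enumerate `[start, end)` frame ranges of length `window`, advancing by `stride`.
/// Only full windows are produced; a trailing partial window is left for later.
pub fn chunk_windows(total_frames: usize, window: usize, stride: usize) -> Vec<(usize, usize)> {
    assert!(window > 0 && stride > 0, "window and stride must be non-zero");
    let mut out = Vec::new();
    let mut start = 0;
    while start + window <= total_frames {
        out.push((start, start + window));
        start += stride;
    }
    out
}
```

With a stride smaller than the window, consecutive ranges overlap, which is what lets later windows revise text near the boundary of earlier ones.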
- ASR queue classes are explicit: `final`, `reconcile`, `partial`.
- scheduling preference is high-signal work first (`final` before background classes).
- under pressure, eviction/drop behavior is intentional and deterministic rather than unbounded growth.
- pressure/degradation is surfaced in queue + trust telemetry instead of hidden.
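One way to picture a bounded, class-aware queue is the sketch below. It is not the policy in `src/live_asr_pool.rs`: the relative order of `reconcile` and `partial`, the eviction rule (evict the oldest job of the lowest class only when the incoming job outranks it), and all type names are assumptions chosen to show deterministic behavior under pressure.

```rust
use std::collections::VecDeque;

/// Queue classes from the README; the numeric ranking is an assumption.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum JobClass {
    Partial = 0,
    Reconcile = 1,
    Final = 2,
}

/// Hypothetical ASR work item.
#[derive(Debug, Clone, PartialEq)]
pub struct AsrJob {
    pub class: JobClass,
    pub segment_id: u64,
}

/// Fixed-capacity queue that never grows past `capacity`.
pub struct BoundedAsrQueue {
    capacity: usize,
    jobs: VecDeque<AsrJob>,
    /// Count of jobs lost to pressure (evicted or rejected), for telemetry.
    pub dropped: u64,
}

impl BoundedAsrQueue {
    pub fn new(capacity: usize) -> Self {
        Self { capacity, jobs: VecDeque::with_capacity(capacity), dropped: 0 }
    }

    /// Returns true if `job` was accepted. When full, exactly one job is lost:
    /// either the oldest job of the lowest queued class, or the incoming job.
    pub fn push(&mut self, job: AsrJob) -> bool {
        if self.jobs.len() < self.capacity {
            self.jobs.push_back(job);
            return true;
        }
        self.dropped += 1;
        // `min_by_key` returns the first minimum, i.e. the oldest job of the lowest class.
        let victim = self
            .jobs
            .iter()
            .enumerate()
            .min_by_key(|(_, queued)| queued.class)
            .map(|(idx, queued)| (idx, queued.class));
        match victim {
            Some((idx, class)) if class < job.class => {
                self.jobs.remove(idx);
                self.jobs.push_back(job);
                true
            }
            _ => false,
        }
    }

    /// Pop the highest class first, FIFO within a class.
    pub fn pop(&mut self) -> Option<AsrJob> {
        for class in [JobClass::Final, JobClass::Reconcile, JobClass::Partial] {
            if let Some(idx) = self.jobs.iter().position(|queued| queued.class == class) {
                return self.jobs.remove(idx);
            }
        }
        None
    }
}
```

The `dropped` counter stands in for the queue telemetry the README mentions: every loss under pressure is counted rather than hidden.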
- reconciliation can emit `reconciled_final` when backlog/ordering recovery is required.
- readability cleanup (`llm_final`) is policy-bounded and lineage-linked to original `final` segments.
- cleanup is isolated from core runtime correctness; core transcript completion does not depend on cleanup success.
Public surfaces are treated as compatibility boundaries:
- runtime mode matrix: `contracts/runtime-mode-matrix.v1.json`
- exit-code classes: `contracts/recordit-exit-code-contract.v1.json`
- JSONL event schema: `contracts/runtime-jsonl.schema.v1.json`
- session/preflight manifest schema: `contracts/session-manifest.schema.v1.json`
- stable field vocabularies and deterministic summary ordering
- explicit mode labels and selectors in artifacts
- replayable JSONL + manifest pair as source-of-truth evidence
- non-blocking callbacks
- bounded queues with explicit eviction/drop semantics
- explicit lifecycle phases and readiness transitions
- degraded success is intentionally represented as `exit_code=0` plus trust/degradation signals
- failure remains explicit (`exit_code=2`) for usage/config/runtime/preflight/replay failures
recordit intentionally separates "process exit" from "session quality":
- `exit_code=0` can mean either:
  - nominal success, or
  - degraded success (artifacts produced, but trust/degradation review required)
- `exit_code=2` means execution failure or invalid invocation path
For automation, use both layers:
- exit code class from `contracts/recordit-exit-code-contract.v1.json`
- manifest trust/degradation fields (`trust.*`, `degradation_events`, `session_summary.session_status`)
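A gate script could combine the two layers along these lines. This is a hedged sketch: it assumes the caller has already extracted a trust-notice count and a degradation-event count from the manifest, and it treats every non-zero exit code as a failure. `SessionOutcome` and `classify` are hypothetical names, and the authoritative rules live in the contract files listed above.

```rust
/// Three-way session classification used by this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum SessionOutcome {
    Nominal,
    Degraded,
    Failed,
}

/// Combine the process exit code with manifest-derived counters.
/// `trust_notice_count` and `degradation_event_count` are assumed to be
/// read from the manifest by the caller; manifest parsing is out of scope here.
pub fn classify(
    exit_code: i32,
    trust_notice_count: usize,
    degradation_event_count: usize,
) -> SessionOutcome {
    match exit_code {
        // Exit 0 with no trust or degradation signals: nominal success.
        0 if trust_notice_count == 0 && degradation_event_count == 0 => SessionOutcome::Nominal,
        // Exit 0 with signals present: artifacts exist but need review.
        0 => SessionOutcome::Degraded,
        // Exit 2 (and, by assumption, any other non-zero code): failure.
        _ => SessionOutcome::Failed,
    }
}
```

Checking the exit code alone would report a degraded session as a clean pass, which is why the manifest counters are part of the decision.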
- Reliable CI/gate inputs: machine-readable outputs stay stable across runs.
- Better operator ergonomics: concise terminal path with deterministic closeout summaries.
- Better postmortems: runtime JSONL + manifest preserve enough context to debug pressure/recovery behavior.
- Safer rollout evolution: compatibility contracts make change impact explicit.
- Operator shell and command mapping: `src/recordit_cli.rs`
- Runtime compatibility shell: `src/bin/transcribe_live/app.rs`
- Shared capture runtime: `src/live_capture.rs`
- Live-stream coordinator and scheduler: `src/live_stream_runtime.rs`
- Bounded ASR pool and queue policy: `src/live_asr_pool.rs`
- Executable behavior model: `docs/state-machine.md`
- Pipeline architecture narrative: `docs/architecture.md`
- Real-time meeting/session transcription with explicit confidence/degradation telemetry
- Regression/gate validation using deterministic artifacts and replay
- Packaged app smoke validation with the same runtime semantics as debug mode
- Automation pipelines that need strict schemas and stable interpretation rules
- Not a general-purpose DAW/audio editor.
- Not an unconstrained low-latency stream processor with unbounded buffering.
- Not a "best effort but opaque" transcription tool; this project favors explicit contracts and telemetry.
- Not tied to one ASR backend implementation strategy; backend selection is modular and policy-driven.