Problem
All knowledge sync operations (GitHub, Discourse, papers, docstrings) write to the same SQLite database per community. SQLite's default journal mode (`delete`) allows only one writer at a time, causing `sqlite3.OperationalError: database is locked` when multiple syncs run concurrently.
This forces all sync operations to run serially per community, significantly increasing total sync time.
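The failure mode is easy to reproduce with two plain connections in the default journal mode (the `docs` table and sync payloads below are made up for the demo):

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "knowledge.db")

writer1 = sqlite3.connect(path, timeout=0)  # timeout=0: fail fast, don't wait
writer1.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")
writer1.execute("BEGIN IMMEDIATE")  # take the write lock and hold it
writer1.execute("INSERT INTO docs (body) VALUES ('github sync')")

# A second writer cannot acquire the lock and errors out immediately.
writer2 = sqlite3.connect(path, timeout=0)
err = None
try:
    writer2.execute("INSERT INTO docs (body) VALUES ('discourse sync')")
except sqlite3.OperationalError as exc:
    err = exc

print(err)  # database is locked
writer1.commit()
```

With the default 5-second `timeout` the second writer would retry briefly before raising the same error, which is exactly what long-running concurrent syncs hit.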
Proposed Solution
Enable WAL (Write-Ahead Logging) mode on knowledge databases:
```python
conn.execute("PRAGMA journal_mode=WAL")
```
WAL mode allows concurrent readers and a single writer without blocking, and additional writers will wait (with a configurable timeout) rather than failing immediately.
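A minimal sketch of the difference WAL makes: while one connection holds an open write transaction, a reader on another connection proceeds without blocking, seeing the last committed snapshot. (The `docs` table and payloads are illustrative, not the real schema.)

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "knowledge.db")

writer = sqlite3.connect(path)
mode = writer.execute("PRAGMA journal_mode=WAL").fetchone()[0]
print(mode)  # wal

writer.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")
writer.execute("BEGIN IMMEDIATE")  # open a write transaction...
writer.execute("INSERT INTO docs (body) VALUES ('github sync')")

# ...yet a reader is served immediately from the last committed state.
reader = sqlite3.connect(path, timeout=0)
count = reader.execute("SELECT COUNT(*) FROM docs").fetchone()[0]
print(count)  # 0: the uncommitted insert is invisible to the reader

writer.commit()
```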
Implementation
- Set WAL mode in `get_db_connection()` or equivalent in `src/knowledge/db.py`
- Set a reasonable `busy_timeout` (e.g., 30 seconds): `conn.execute("PRAGMA busy_timeout=30000")`
- WAL mode persists per database file (only needs to be set once)
- Add a migration or startup check to enable WAL on existing databases
Benefits
- Parallel sync operations per community (GitHub + Discourse + papers simultaneously)
- Faster total sync time
- No more `database is locked` errors during normal operation
Considerations
- WAL mode creates `-wal` and `-shm` sidecar files (these must be included in backups)
- Slightly more disk usage
- Requires SQLite 3.7+ (shipped with all modern Python versions)
Context
Discovered during MNE community onboarding where GitHub, Discourse, and papers syncs all need to write to the same database. Currently we must run them serially to avoid lock errors.