An explainable risk signals engine and UK trend alert subsystem for a reputation & exposure manager.
It includes:
- A small synthetic Items dataset (CSV + JSONL) with edge cases (sarcasm, quoting, ambiguity, non‑political use of political terms, multi‑meaning trend terms).
- A minimal FastAPI service with transparent scoring and reason lists.
- A minimal UK trends ingestor (NewsAPI optional, RSS fallback) + trend term extraction.
- A simple cross‑match job that creates Alerts when historical items overlap with current trends.
- A tiny demo single‑page dashboard (vanilla HTML/JS) that calls the API.
- Starter SQLite schema + seed script + unit tests.
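As a rough illustration of how trend term extraction and the cross-match job could fit together, here is a minimal offline sketch. It is not the project's implementation: the stopword list, the `min_count` threshold, the `Alert` fields, and the sample headlines and items are all hypothetical, chosen only to show the idea of flagging historical items whose words overlap with current trend terms.

```python
import re
from collections import Counter
from dataclasses import dataclass

# Hypothetical stopword list; a real extractor would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "to", "in", "on", "for", "and", "as", "is", "with"}


def tokenize(text: str) -> list[str]:
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z']+", text.lower()) if t not in STOPWORDS]


def extract_trend_terms(headlines: list[str], min_count: int = 2) -> set[str]:
    """Treat a term as trending if it appears in at least `min_count` headlines."""
    counts = Counter()
    for headline in headlines:
        counts.update(set(tokenize(headline)))  # count each term once per headline
    return {term for term, n in counts.items() if n >= min_count}


@dataclass
class Alert:
    item_id: str
    matched_terms: list[str]  # returned so every alert carries its explanation


def cross_match(items: dict[str, str], trend_terms: set[str]) -> list[Alert]:
    """Create one alert per item whose tokens overlap with the trend terms."""
    alerts = []
    for item_id, text in items.items():
        overlap = sorted(set(tokenize(text)) & trend_terms)
        if overlap:
            alerts.append(Alert(item_id, overlap))
    return alerts


# Made-up sample data, for illustration only.
headlines = [
    "Rail strike disrupts commuters across London",
    "Unions announce second rail strike in March",
    "Budget statement due next week",
]
items = {
    "item-1": "Honestly the strike was the best day off I ever had",
    "item-2": "Lovely walk in the park today",
}

trend_terms = extract_trend_terms(headlines)
alerts = cross_match(items, trend_terms)
print(trend_terms, alerts)
```

Plain token overlap like this cannot tell a sarcastic or non-political use of a term from a risky one, which is why the dataset includes those edge cases.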
Ethical framing: scores are risk indicators, not judgments. Operate only on user‑provided/consented data. Explanations are returned for every flag.
- Python 3.10+
- Node.js 22+
- (Optional) A NewsAPI key (for /trends/ingest using live headlines)
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
# Set up frontend
cd web
npm i
npm run build
cd ..
# Optional: only if using live NewsAPI headlines
cp .env.example .env
# Seed the database
python -m scripts.seed_db
# Start the API
uvicorn app.main:app --reload
Open:
- API docs: http://127.0.0.1:8000/docs
- Demo UI: http://127.0.0.1:8000/
To simulate a 'newly risky today' pipeline end to end (seed → trends → cross-match → alerts) without any network dependency:
python -m scripts.run_daily_alerts --source snapshot
- POST /risk/score: score a single item (returns score + reasons + decomposition)
- GET /items: list sample items + latest stored scores
- GET /trends/current: list current trends stored in DB
- POST /trends/ingest: ingest headlines and update TrendTopic
- POST /alerts/run: cross-match trends to items and create Alert
- GET /alerts: list alerts
A static OpenAPI contract is included at openapi.yaml (FastAPI also generates one at /openapi.json).
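For a quick check of the scoring endpoint from Python, a sketch along these lines may help. It uses only the standard library. The `{"text": ...}` payload shape is an assumption, not taken from the project; the real request schema is whatever /docs or openapi.yaml describes. The helper names `build_score_request` and `send` are likewise hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # local dev address used by the run steps


def build_score_request(text: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST /risk/score request.

    The {"text": ...} body is an assumed payload shape; check /docs for the real one.
    """
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/risk/score",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


request = build_score_request("Example post text")
print(request.get_method(), request.full_url)
# With the server running, the call would be: result = send(request)
```

Building the request separately from sending it keeps the example usable without a running server.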
Signals (all weights configurable in config/scoring.yaml):
- Sentiment (VADER, rule‑based) — negative tone nudges risk upward
- Topic tags (keyword rules from config/topics.yaml)
- Toxicity (rule-based keyword list + simple style cues)
- Age (older items get a small exposure bump)
- Trend overlap delta (if item overlaps with current trends)
Every signal returns:
- numeric contribution
- structured reasons (matched terms, categories, overlaps)
- optional “edge case” annotations (possible sarcasm, quoting, ambiguity)
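The way per-signal contributions could roll up into a score, a decomposition, and a reason list can be sketched as below. This is an assumed design, not the project's code: the weight values, the signal names used as keys, the `SignalResult` fields, and the cap at 1.0 are all hypothetical. The real weights live in config/scoring.yaml.

```python
from dataclasses import dataclass, field


@dataclass
class SignalResult:
    name: str
    value: float  # raw signal strength, assumed to lie in [0, 1]
    reasons: list[str] = field(default_factory=list)


# Hypothetical weights; the project reads its real ones from config/scoring.yaml.
WEIGHTS = {
    "sentiment": 0.25,
    "topic": 0.25,
    "toxicity": 0.25,
    "age": 0.125,
    "trend_overlap": 0.125,
}


def score_item(signals: list[SignalResult], weights: dict[str, float] = WEIGHTS) -> dict:
    """Combine signals into a capped weighted sum and keep the per-signal breakdown."""
    decomposition = {}
    reasons = []
    for signal in signals:
        # Signals without a configured weight contribute nothing.
        decomposition[signal.name] = weights.get(signal.name, 0.0) * signal.value
        reasons.extend(f"{signal.name}: {reason}" for reason in signal.reasons)
    total = min(1.0, sum(decomposition.values()))
    return {"score": total, "decomposition": decomposition, "reasons": reasons}


result = score_item([
    SignalResult("sentiment", 0.5, ["negative tone"]),
    SignalResult("trend_overlap", 1.0, ["overlaps trend term 'strike'"]),
])
print(result)
```

Keeping the decomposition next to the total means each flag can be traced back to the signals that produced it.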
- data/items_synthetic.csv
- data/items_synthetic.jsonl
- data/trend_history_uk_demo.csv
Many Twitter/X research datasets are shared as Tweet IDs only, and require “rehydration” via the API to fetch text (to comply with platform redistribution/compliance rules). For this student project (no keys / no OAuth), this sponsor pack ships a synthetic dataset and open‑licensed trend sources instead.
See data/EXTERNAL_DATASETS.md for recommended open‑licensed alternatives and caveats.
Run unit + integration tests:
pytest -q
Run integration tests alone:
pytest -q -k integration
app/ FastAPI app
config/ YAML configs (topics, scoring weights, ambiguity rules)
data/ sample datasets
scripts/ seed + demo scripts
tests/ unit tests
web/ simple dashboard (served as static)
- Create a feature branch per issue
- Open a Pull Request
- Ensure tests pass (CI)
- Require at least 1 reviewer before merging