A Shiny-for-Python web application to explore and compare distributions of ClinVar variants across different CADD PHRED-score thresholds, filter by gene lists or panels, and export per-gene/per-panel or filtered annotation summaries. The app is primarily intended for investigating the score distribution of known pathogenic and benign variants for different CADD PHRED-score thresholds.
This README explains the repository layout and how to run the app locally (pip/conda).
Highlights
- Interactive visualizations of CADD PHRED-score distributions
- Compare distributions across CADD/ClinVar versions and genome builds
- Per-gene filtering (paste a list or upload a file) and exportable summaries
- Per-panel filtering using panels from PanelApp and exportable summaries
Requirements
- Python 3.10+ (3.12 recommended)
- See requirements.txt or environment.yml for full dependencies
- Docker (optional) - a Dockerfile is included for containerized runs
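To make the idea of comparing variants across PHRED-score thresholds concrete, here is a minimal sketch in plain Python. It assumes a variant is called "predicted pathogenic" when its CADD PHRED score is at or above the threshold; the function name, the dataclass and the scores are made up for illustration and are not taken from the app's code.

```python
from dataclasses import dataclass


@dataclass
class ThresholdMetrics:
    threshold: float
    sensitivity: float  # fraction of pathogenic variants scoring >= threshold
    specificity: float  # fraction of benign variants scoring < threshold


def metrics_at_threshold(pathogenic, benign, threshold):
    """Assumption: PHRED >= threshold means 'predicted pathogenic'."""
    if not pathogenic or not benign:
        raise ValueError("need at least one pathogenic and one benign score")
    true_pos = sum(score >= threshold for score in pathogenic)
    true_neg = sum(score < threshold for score in benign)
    return ThresholdMetrics(
        threshold, true_pos / len(pathogenic), true_neg / len(benign)
    )


# Made-up PHRED scores, for illustration only (not real ClinVar data).
pathogenic_scores = [32.0, 27.5, 24.1, 18.3, 35.0]
benign_scores = [3.2, 8.9, 12.4, 21.0, 0.7]

for cutoff in (10, 20, 30):
    m = metrics_at_threshold(pathogenic_scores, benign_scores, cutoff)
    print(
        f"PHRED >= {cutoff}: "
        f"sensitivity={m.sensitivity:.2f}, specificity={m.specificity:.2f}"
    )
```

Raising the threshold trades sensitivity for specificity, which is the kind of trade-off the distribution plots are meant to help explore.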
The underlying data for the CADD-ThresholdApp must be downloaded separately if the app is installed as a package from bioconda or pip. The data can be downloaded here. The data is also versioned separately from the packages. You can also preprocess your own data for the app using this Snakemake workflow: https://github.com/kircherlab/CADD_threshold_analysis.
- data/ - contains preprocessed tables, panel summaries and metrics used by the app.
  - paneldata/ - CSVs summarizing panels and versions used by the UI
  - panel_metrics/ - generated metrics stored by date/version
Notes:
- Large raw annotation files are typically not tracked in the repository. The app expects prepared/normalized CSV inputs - use https://github.com/kircherlab/CADD_threshold_analysis to regenerate CSV inputs, or use the modules/panelapp/ utilities if you need to regenerate panel CSVs from PanelApp.
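As a rough illustration of per-gene filtering on a prepared CSV, the sketch below parses a pasted gene list and keeps only matching rows. The column names (GeneName, PHRED, ClinicalSignificance), the function names and the case-insensitive matching are assumptions made for this example; the app's real tables and helpers may differ.

```python
import csv
import io
import re


def parse_gene_list(text):
    """Split pasted text on commas, semicolons and whitespace; dedupe, keep order."""
    seen = {}
    for token in re.split(r"[\s,;]+", text):
        symbol = token.strip().upper()  # assumption: match symbols case-insensitively
        if symbol:
            seen.setdefault(symbol, None)
    return list(seen)


def filter_rows_by_gene(csv_text, genes, gene_column="GeneName"):
    """Return the rows whose gene column matches one of the requested symbols."""
    wanted = set(genes)
    reader = csv.DictReader(io.StringIO(csv_text))
    if gene_column not in (reader.fieldnames or []):
        raise KeyError(f"column {gene_column!r} not found in CSV header")
    return [row for row in reader if row[gene_column].upper() in wanted]


# Hypothetical table layout, for illustration only.
example_csv = """GeneName,PHRED,ClinicalSignificance
BRCA1,28.4,pathogenic
TP53,31.0,pathogenic
TTN,4.2,benign
brca2,25.9,pathogenic
"""

genes = parse_gene_list("brca1, BRCA2\nTP53 brca1")
rows = filter_rows_by_gene(example_csv, genes)
print(genes)
print([row["GeneName"] for row in rows])
```

Deduplicating while preserving order keeps any exported summary in the order the user pasted the genes.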
Using conda
conda create -n cadd_threshold_app -c bioconda -c conda-forge cadd-threshold-app
conda activate cadd_threshold_app
cadd-threshold-app --data </path/to/data>
Using pip
pip install cadd-threshold-app
cadd-threshold-app --data </path/to/data>

git clone https://github.com/kircherlab/CADD_threshold_app.git
cd CADD_threshold_app
pip install .
cadd-threshold-app --data data
Install as package (editable, recommended for development)
pip install -e .
Option A: run via the package entry point
This requires installing the project as a package (e.g. pip install -e .).
cadd-threshold-app --data </path/to/data>
As an alternative to the CLI option --data, you can set the CADD_THRESHOLD_APP_DATA_DIR environment variable.
export CADD_THRESHOLD_APP_DATA_DIR=data
cadd-threshold-app
Further CLI options are available to configure host and port - run cadd-threshold-app --help for details.
Option B: run from the repository root. Please set the CADD_THRESHOLD_APP_DATA_DIR environment variable to point to your data directory (e.g. data/ in the repository) before running.
export CADD_THRESHOLD_APP_DATA_DIR=data
python -m shiny run --port 8080 cadd_threshold_app.app:app
Then open http://localhost:8080 in your browser.
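The two ways of pointing the app at its data (the --data option and the CADD_THRESHOLD_APP_DATA_DIR variable) can be pictured with the sketch below. It assumes the command-line value wins over the environment variable; that precedence, and the function name resolve_data_dir, are assumptions for illustration and not a description of the actual entry point.

```python
import argparse
import os
from pathlib import Path

ENV_VAR = "CADD_THRESHOLD_APP_DATA_DIR"  # name taken from this README


def resolve_data_dir(argv=None, environ=None):
    """Return the data directory from --data, falling back to the environment.

    Assumption: --data takes precedence over the environment variable.
    """
    environ = os.environ if environ is None else environ
    parser = argparse.ArgumentParser(prog="cadd-threshold-app")
    parser.add_argument("--data", default=None, help="path to the data directory")
    # Ignore options this sketch does not model (e.g. host and port).
    args, _unknown = parser.parse_known_args(argv)
    chosen = args.data or environ.get(ENV_VAR)
    if not chosen:
        parser.error(f"pass --data or set {ENV_VAR}")
    return Path(chosen)


print(resolve_data_dir(["--data", "data"], environ={}))
print(resolve_data_dir([], environ={ENV_VAR: "/srv/cadd"}))
```

Passing argv and environ explicitly keeps the function easy to test without touching the real process environment.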
- app.py - Shiny app entrypoint and UI wiring
- server_logic.py - main server-side reactive logic and handlers
- data_loader.py - helpers to load and preprocess annotation tables
- ui_components.py - UI
- modules/ - plotting helpers, utilities and gene-list/panel parsing helpers
  - basic_plot.py, basic_bar_plot.py, compare_basic_plot.py - plotting factories
  - functions_server_helpers.py, read_genes_from_list_or_file_functions.py - utilities
  - panelapp/ - scripts to interact with PanelApp (CSV generation, comparison)
- To extend plots: add a factory under modules/ and register it in server logic
- To add data sources: update data_loader.py and ensure column names match the plotting/metric code paths
- Linting/tests: None included by default. Add unit tests for critical data parsing when making larger refactors.
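When adding a data source, one cheap safeguard is to check the CSV header before the table reaches the plotting code. The sketch below shows the idea with the standard library only; REQUIRED_COLUMNS and missing_columns are hypothetical names, and the column list is a placeholder for whatever the plotting/metric code actually reads.

```python
import csv
import io

# Hypothetical column names; replace with the ones the plotting code really uses.
REQUIRED_COLUMNS = {"GeneName", "PHRED", "ClinicalSignificance"}


def missing_columns(csv_text, required=REQUIRED_COLUMNS):
    """Return the required columns absent from the CSV header, sorted."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # empty input yields an empty header
    return sorted(set(required) - {name.strip() for name in header})


good = "GeneName,PHRED,ClinicalSignificance\nBRCA1,28.4,pathogenic\n"
bad = "gene,PHRED\nBRCA1,28.4\n"

print(missing_columns(good))
print(missing_columns(bad))
```

A check like this is also a natural first unit test for data parsing, since it needs no fixtures beyond a short header string.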
- The included Dockerfile builds a minimal image running the app on port 8080.
- See LICENSE for licensing terms.
- For questions about data sources, interpretation, or contributions, contact the repository maintainers or open an issue.