A workflow designed to clean FASTQ files for the SEACONNECT project
Updated Aug 21, 2019 - Python
Sievio turns GitHub, local repos, and web PDFs into clean JSONL for LLM pretraining, fine-tuning, and RAG. It offers structure-aware chunking, reliable Unicode decoding, pluggable QC and safety checks, plus optional dataset cards and deduplication.
Automated quality filtering for diabetic retinopathy images using adaptive, medically informed thresholds.
Machine learning quality flags for Gaia DR3 effective temperatures using XGBoost, CatBoost, and LightGBM (MNRAS 2024)