Hello,
While exploring the malariagen-data-python API and preparing a proposal for the GSoC
Machine-learning taxon classifier project, I noticed that it might be useful to
have a helper utility that quickly summarizes mosquito samples by species and
country.
I propose adding a small helper function that generates a table summarizing
sample counts by species and country using the existing API datasets.
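As a rough sketch of what I have in mind (not a final implementation): the function name below is hypothetical, and I am assuming the sample metadata is available as a pandas DataFrame with `taxon` and `country` columns. The toy DataFrame only stands in for real data so the example runs offline.

```python
import pandas as pd


def summarize_samples_by_species_and_country(
    df_samples, species_col="taxon", country_col="country"
):
    """Return a species x country table of sample counts.

    Hypothetical helper: the name and the default column names are
    assumptions made for illustration, not part of the existing API.
    """
    missing = {species_col, country_col} - set(df_samples.columns)
    if missing:
        raise KeyError(f"missing columns: {sorted(missing)}")

    # Keep samples with missing labels visible instead of silently dropping them.
    species = df_samples[species_col].fillna("unassigned")
    country = df_samples[country_col].fillna("unknown")

    # Rows are species, columns are countries, cells are sample counts.
    return pd.crosstab(species, country)


# Toy stand-in for a sample metadata DataFrame (made-up values).
toy = pd.DataFrame(
    {
        "sample_id": ["S1", "S2", "S3", "S4", "S5", "S6"],
        "taxon": ["gambiae", "gambiae", "coluzzii", "arabiensis", "coluzzii", None],
        "country": ["Burkina Faso", "Mali", "Burkina Faso", "Kenya", "Burkina Faso", "Mali"],
    }
)

summary = summarize_samples_by_species_and_country(toy)
print(summary)
```

With the real package, the input would presumably be the DataFrame returned by the sample metadata accessor (for example `sample_metadata()` on an `Ag3` instance), but I would confirm the exact column names before settling on defaults.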
Additionally, I would like to include an exploratory notebook demonstrating:
- dataset exploration
- species distribution visualization
- potential machine-learning features for taxonomic classification
This could help users better understand the dataset and explore potential
classification approaches.
I’d be happy to work on implementing this if it seems useful.
Please let me know if you would prefer any changes to the scope or approach.
Best regards,
Prakhar