Hello,
I'm very interested in your excellent work on MedResearcher-R1. Your paper describes a medical trajectory dataset generated through KISA from 30M+ PubMed abstracts, which appears to be a key contribution to achieving the impressive results on MedBrowseComp.
I noticed you've also open-sourced a high-quality QA dataset within your GitHub repository (specifically TrajectoryGenerationPipeline/qa_data/open_data.jsonl). However, upon examination, this dataset appears to focus on general-domain questions rather than medical-specific content, and may not be the actual dataset used for training MedResearcher-R1.
If possible, could you please release the dataset used to train MedResearcher-R1? Please let me know if I have misunderstood anything.
Thank you very much!
Sincerely,
IcecreamArtist