JYP0824/EHR-TempSQL

EHR-TempSQL: Text-to-SQL Dataset for Accumulative Electronic Health Records

Overview

We introduce EHR-TempSQL, a benchmark for temporal database state reasoning in medical text-to-SQL. It comprises 2K dual-turn interactions constructed from three publicly available EHR databases (MIMIC-III, MIMIC-IV, and eICU), with database states separated by an average interval of 1.42 hours. We construct the dataset through a multi-stage pipeline that transforms static queries into temporal progression scenarios, in which queries track accumulated information and detect updates between states.
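To make the idea of two database states concrete, here is a minimal Python sketch using the standard-library `sqlite3` module. It is not the dataset's construction pipeline. The table `lab_results`, its columns, the timestamps, and the helper `values_as_of` are all hypothetical and chosen only for illustration. The sketch simulates a snapshot by filtering rows on a cutoff time, runs the same query at an initial and a current state, and reports what accumulated in between.

```python
import sqlite3


def build_toy_db():
    """Create an in-memory database with a hypothetical lab_results table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE lab_results ("
        "patient_id INTEGER, test TEXT, value REAL, charttime TEXT)"
    )
    # Invented rows; ISO-formatted timestamps compare correctly as strings.
    rows = [
        (1, "glucose", 110.0, "2100-01-01 08:00:00"),
        (1, "glucose", 145.0, "2100-01-01 09:10:00"),
        (1, "glucose", 132.0, "2100-01-01 09:40:00"),
    ]
    conn.executemany("INSERT INTO lab_results VALUES (?, ?, ?, ?)", rows)
    return conn


def values_as_of(conn, patient_id, test, snapshot_time):
    """Return the values visible at snapshot_time, oldest first."""
    cur = conn.execute(
        "SELECT value FROM lab_results "
        "WHERE patient_id = ? AND test = ? AND charttime <= ? "
        "ORDER BY charttime",
        (patient_id, test, snapshot_time),
    )
    return [row[0] for row in cur.fetchall()]


conn = build_toy_db()
t_init = "2100-01-01 08:30:00"  # assumed initial snapshot time
t_curr = "2100-01-01 09:55:00"  # assumed current snapshot time

first_turn = values_as_of(conn, 1, "glucose", t_init)
second_turn = values_as_of(conn, 1, "glucose", t_curr)

# Rows are only appended over time here, so the update is the new suffix.
new_values = second_turn[len(first_turn):]
print("at T_init:", first_turn)
print("at T_curr:", second_turn)
print("new since T_init:", new_values)
```

The cutoff filter is one simple way to emulate an accumulating record; the actual benchmark may materialize its snapshots differently.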

Dataset

The dataset contains the following fields:

  • scenario_id: Unique identifier for each sample.
  • turn: Indicates the turn number in the dual-turn interaction.
  • T_init, T_curr: Timestamps of the database snapshots.
  • question: Natural language question.
  • query: Corresponding SQL query.
  • sql_result: Result of executing the SQL query.
  • answer: Final answer to the question.
  • q_tag: The question template.
  • value: Sampled values from the database.
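As a rough illustration of how these fields might be consumed, the Python sketch below groups records by `scenario_id` and orders them by `turn`. The on-disk format of the release is not stated here, so the sketch works on in-memory dictionaries. The field types, the sample values, and the helper `group_by_scenario` are assumptions made for illustration, not the dataset's actual contents.

```python
from collections import defaultdict

# Hypothetical records using the field names listed above; values are invented.
records = [
    {
        "scenario_id": "s1",
        "turn": 2,
        "T_init": "2100-01-01 08:30:00",
        "T_curr": "2100-01-01 09:55:00",
        "question": "Have any new glucose results arrived since the last check?",
        "query": "SELECT ...",  # placeholder, not a real dataset query
        "sql_result": [[145.0], [132.0]],
        "answer": "Yes",
        "q_tag": "placeholder template",
        "value": {"test": "glucose"},
    },
    {
        "scenario_id": "s1",
        "turn": 1,
        "T_init": "2100-01-01 08:30:00",
        "T_curr": "2100-01-01 09:55:00",
        "question": "What glucose results does the patient have?",
        "query": "SELECT ...",  # placeholder, not a real dataset query
        "sql_result": [[110.0]],
        "answer": "110.0",
        "q_tag": "placeholder template",
        "value": {"test": "glucose"},
    },
]


def group_by_scenario(rows):
    """Map each scenario_id to its records, sorted by turn number."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["scenario_id"]].append(row)
    for turns in grouped.values():
        turns.sort(key=lambda r: r["turn"])
    return dict(grouped)


scenarios = group_by_scenario(records)
for scenario_id, turns in scenarios.items():
    for rec in turns:
        print(scenario_id, rec["turn"], rec["question"])
```

Grouping first keeps both turns of an interaction together, which is convenient when a model has to answer the second turn with the first turn as context.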
