About
PMC-Patients defines two tasks for benchmarking retrieval-based clinical decision support (ReCDS) systems: Patient-to-Article Retrieval (PAR) and Patient-to-Patient Retrieval (PPR). Given a query patient, PAR aims to retrieve relevant articles from PubMed, and PPR aims to retrieve similar patients from PMC-Patients.
For more details about PMC-Patients, please refer to our paper, listed under Citation below.
Dataset & Submission
PMC-Patients contains 167k patient summaries collected from PubMed Central, annotated with 3.1M relevant articles and 293k similar patients defined by PubMed citation relationships.
Please visit our GitHub repository to download the dataset and submit your model:
Patient-to-Article Retrieval (PAR) Leaderboard
Rank | Date | Model | MRR (%) | P@10 (%) | nDCG@10 (%) | R@1k (%)
---|---|---|---|---|---|---
1 | June 25, 2023 | DPR (SciMult-MHAExpert), UIUC/Microsoft (Zhang et al. 2023) | 29.89 | 9.35 | 13.79 | 53.71
2 | Apr 5, 2023 | RRF, Tsinghua University (Zhao et al. 2023) | 29.86 | 8.86 | 13.36 | 49.45
3 | Apr 5, 2023 | DPR (PubMedBERT), Tsinghua University (Zhao et al. 2023) | 19.83 | 6.51 | 8.87 | 46.23
4 | Apr 5, 2023 | DPR (BioLinkBERT), Tsinghua University (Zhao et al. 2023) | 19.06 | 6.11 | 8.26 | 45.79
5 | Apr 5, 2023 | DPR (SPECTER), Tsinghua University (Zhao et al. 2023) | 17.92 | 5.49 | 7.66 | 42.46
6 | Apr 5, 2023 | BM25, Tsinghua University (Zhao et al. 2023) | 18.71 | 3.84 | 7.38 | 21.89
7 | Sep 14, 2023 | bge-base-en-v1.5, BAAI (Xiao et al. 2023) | 15.88 | 4.27 | 6.44 | 30.43
8 | Oct 4, 2023 | MedCPT-d, NCBI (Jin et al. 2023) | 13.06 | 2.67 | 4.95 | 19.94
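The leaderboard columns are standard ranked-retrieval metrics. As a rough illustration of what each column measures, the sketch below computes them for a single query using the textbook definitions. It is not the official evaluation script: the function names, the `gains` dictionary of relevance grades, and the toy document IDs are all assumptions made for this example, and the reported numbers are presumably averages over all query patients, expressed as percentages.

```python
import math


def reciprocal_rank(ranked, relevant):
    """Return 1/rank of the first relevant item, or 0.0 if none is retrieved."""
    for position, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            return 1.0 / position
    return 0.0


def precision_at_k(ranked, relevant, k=10):
    """Fraction of the top-k results that are relevant."""
    hits = sum(1 for doc_id in ranked[:k] if doc_id in relevant)
    return hits / k


def recall_at_k(ranked, relevant, k=1000):
    """Fraction of all relevant items that appear in the top-k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc_id in ranked[:k] if doc_id in relevant)
    return hits / len(relevant)


def ndcg_at_k(ranked, gains, k=10):
    """nDCG@k with linear gains; `gains` maps doc ID to a relevance grade."""
    dcg = sum(
        gains.get(doc_id, 0) / math.log2(position + 1)
        for position, doc_id in enumerate(ranked[:k], start=1)
    )
    ideal_gains = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(
        gain / math.log2(position + 1)
        for position, gain in enumerate(ideal_gains, start=1)
    )
    return dcg / idcg if idcg > 0 else 0.0


# Toy query (hypothetical IDs): relevant items "b" and "d" are ranked 2nd and 4th.
ranked = ["a", "b", "c", "d"]
gains = {"b": 1, "d": 1}
relevant = set(gains)

print("RR     :", reciprocal_rank(ranked, relevant))
print("P@2    :", precision_at_k(ranked, relevant, k=2))
print("R@4    :", recall_at_k(ranked, relevant, k=4))
print("nDCG@4 :", round(ndcg_at_k(ranked, gains, k=4), 4))
```

MRR is the mean of `reciprocal_rank` across queries. When comparing against the leaderboard, use the evaluation code from the project's repository, since details such as relevance grading may differ from this sketch.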
Patient-to-Patient Retrieval (PPR) Leaderboard
Rank | Date | Model | MRR (%) | P@10 (%) | nDCG@10 (%) | R@1k (%)
---|---|---|---|---|---|---
1 | Apr 5, 2023 | RRF, Tsinghua University (Zhao et al. 2023) | 27.76 | 6.96 | 24.12 | 85.14
2 | June 25, 2023 | DPR (SciMult-MHAExpert), UIUC/Microsoft (Zhang et al. 2023) | 25.34 | 6.66 | 22.40 | 83.87
3 | Apr 5, 2023 | BM25, Tsinghua University (Zhao et al. 2023) | 22.86 | 4.67 | 18.29 | 69.66
4 | Apr 5, 2023 | DPR (BioLinkBERT), Tsinghua University (Zhao et al. 2023) | 21.20 | 5.59 | 18.06 | 80.49
5 | Apr 5, 2023 | DPR (PubMedBERT), Tsinghua University (Zhao et al. 2023) | 19.37 | 5.05 | 16.30 | 79.35
6 | Sep 14, 2023 | bge-base-en-v1.5, BAAI (Xiao et al. 2023) | 16.20 | 3.78 | 13.02 | 68.85
7 | Apr 5, 2023 | DPR (SPECTER), Tsinghua University (Zhao et al. 2023) | 15.08 | 3.79 | 12.27 | 73.01
8 | Oct 4, 2023 | MedCPT-d, NCBI (Jin et al. 2023) | 13.68 | 3.18 | 11.01 | 60.17
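The "RRF" entries in both leaderboards presumably refer to reciprocal rank fusion, a common way to merge the ranked lists produced by several retrievers (for example a sparse and a dense one). The sketch below shows the standard formulation, where each item scores the sum of 1 / (k + rank) over the input lists. The function name, the constant `k=60` (a conventional default), and the toy rankings are assumptions for illustration; which retrievers and settings the leaderboard entry actually fuses is described in the paper, not here.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of IDs into one list, best first.

    Each item receives sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. Ties are broken by ID for determinism.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Toy example: two hypothetical retrievers disagree on the ordering.
sparse_ranking = ["a", "b", "c"]
dense_ranking = ["b", "c", "a"]
fused = reciprocal_rank_fusion([sparse_ranking, dense_ranking])
print(fused)
```

Because the fusion uses only ranks, it needs no score normalization across retrievers, which is one reason it is a popular baseline for combining heterogeneous systems.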
Citation
If you use PMC-Patients in your research, please cite our paper:
@article{Zhao2023ALD,
  title={A large-scale dataset of patient summaries for retrieval-based clinical decision support systems},
  author={Zhengyun Zhao and Qiao Jin and Fangyuan Chen and Tuorui Peng and Sheng Yu},
  journal={Scientific Data},
  year={2023},
  volume={10},
  number={1},
  pages={909},
  url={https://api.semanticscholar.org/CorpusID:266360591}
}