Clinician-rated reasoning quality predicts diagnostic correctness in a psychiatric evaluation of large language models

NOTE: Code and data will be made public upon publication, in accordance with the data availability statement reported in the manuscript: Clinician-authored fictitious vignettes will be publicly available. We will not publicly redistribute text derived from published case reports or verbatim model reasoning traces; citations to original sources will be provided, and access to restricted materials may be provided under controlled conditions (e.g., to qualified researchers under a data-use agreement and/or institutional approval).

Description: Code and data repository for the paper "Clinician-rated reasoning quality predicts diagnostic correctness in a psychiatric evaluation of large language models".

Preprint: https://www.medrxiv.org/content/10.64898/2026.02.03.26345402v2

Corresponding author: Kevin W. Jin

