Deidentified Psychiatric Intake Notes

A data repository of 1,000 deidentified and annotated psychiatric intake notes, developed and used for the RDoC Natural Language Processing Challenge and Workshop, will be available from the N-GRID and DBMI websites in November 2017, after Workshop participants have had one year of exclusive use following the Workshop. The notes describe patients' medical and psychiatric histories, drug and alcohol use, family history, current living situations, and other details potentially relevant to their psychiatric problems. The corpus contains 1,862,452 tokens.