Automated Clinical Trial Screening Eligibility Software Algorithm
Structured and natural language processing techniques used to match patients to clinical trials from EHR data.
This technology uses state-of-the-art natural language processing (NLP), information extraction (IE), and machine learning (ML) techniques for automated clinical trial eligibility screening (ES). The approach applied logical constraint filters (LCFs) to the electronic health record (EHR) to exclude ineligible encounters based on structured data fields derived from the trial criteria. The unstructured data fields of the prefiltered encounters were then processed, and the medical terms extracted from them were stored in encounter pattern vectors. The same process was applied to the trial criteria to construct the trial pattern vector; this vector was also extended with informative patterns extracted from the EHRs of previously eligible patients to capture hyponyms relevant to the trial criteria. Finally, IE algorithms matched the trial vector against the encounter vectors and returned a ranked list of potentially eligible encounters. In a physician-generated, gold-standard-based evaluation on real-world clinical data and trials, the approach achieved more than 90% workload reduction potential in patient cohort identification and showed potential for a 450% increase in trial screening efficiency.
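The pipeline described above (structured prefilter, term extraction into pattern vectors, trial vector extension, matching, and ranking) can be illustrated with a minimal Python sketch. It is not the actual implementation: the `Encounter` fields, the toy `VOCABULARY`, the age-only filter, and the use of cosine similarity as the matching step are all assumptions made for illustration, and a real system would rely on a clinical NLP pipeline and the trial's full structured criteria.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass

# Hypothetical toy vocabulary standing in for a clinical term extractor.
VOCABULARY = {"asthma", "wheezing", "fever", "fracture", "albuterol"}


@dataclass
class Encounter:
    encounter_id: str
    age: int   # structured field used by the filter
    note: str  # unstructured clinical text


def passes_lcf(enc, min_age, max_age):
    """Logical constraint filter on a structured field (age only, as an assumption)."""
    return min_age <= enc.age <= max_age


def pattern_vector(text):
    """Count vocabulary terms found in free text (a stand-in for NLP extraction)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t in VOCABULARY)


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_encounters(encounters, criteria_text, min_age, max_age, eligible_notes=()):
    """Prefilter, vectorize, and rank encounters against the trial criteria."""
    trial_vec = pattern_vector(criteria_text)
    # Extend the trial vector with terms from previously eligible patients.
    for note in eligible_notes:
        trial_vec.update(pattern_vector(note))
    candidates = [e for e in encounters if passes_lcf(e, min_age, max_age)]
    scored = [(cosine(trial_vec, pattern_vector(e.note)), e.encounter_id)
              for e in candidates]
    return sorted(scored, key=lambda s: s[0], reverse=True)


encounters = [
    Encounter("E1", 8, "Child with asthma and wheezing, given albuterol."),
    Encounter("E2", 10, "Wrist fracture after fall."),
    Encounter("E3", 45, "Adult with asthma and wheezing."),
]
ranked = rank_encounters(
    encounters,
    "Patients aged 2 to 17 with asthma",
    min_age=2,
    max_age=17,
    eligible_notes=["wheezing treated with albuterol"],
)
for score, encounter_id in ranked:
    print(f"{encounter_id}: {score:.2f}")
```

In this toy run, the adult encounter is removed by the structured filter before any text is processed, and the remaining encounters are ordered by how closely their extracted terms overlap with the extended trial vector. Screening staff would then review the list from the top, which is where the workload reduction comes from.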
- Clinical trial eligibility screening for CROs
- Identifying clinical trial matches for hospitals (physician-patient awareness)
- Automated screening with more than 90% potential workload reduction in patient cohort identification
- Validated on a real-world data set
Identifying eligible patients is one of the most challenging and important components of the clinical trial process. For example, only 3% of US adults with cancer participate in clinical trials, but an estimated 20% are eligible. One study reported that the cost of eligibility screening ranged by study phase from approximately $129 (observational) to $336 (phase I) per enrolled patient, and that the time spent to enroll a patient varied from 3.4 to 8.8 hours.
Yizhao Ni, PhD, Division of Biomedical Informatics