Introduction: Diagnosing patients who present with chronic respiratory symptoms is difficult because the symptoms of COPD and asthma can be similar and their diagnostic criteria overlap. However, treatment recommendations for COPD and asthma differ, and inappropriate treatment resulting from misdiagnosis can increase the risk of exacerbations, morbidity and mortality, and reduce quality of life. Machine learning (ML) offers an innovative approach to mining large electronic health records data sets to develop diagnostic algorithms for disease differentiation. Methods: From a US electronic health records database covering primary care, specialist care and hospital medical records, cohorts of patients aged ≥35 years who had a specialist diagnosis of asthma, COPD or both (asthma-COPD overlap, ACO) on ≥2 occasions were created. The specialist diagnosis was used as the case label. Over 60 clinical features, including spirometry results, blood test results, comorbidities and symptoms, were extracted from patients' electronic health records in the 12 months before and the 12 months after each patient's incident diagnosis. Eleven supervised ML methods were investigated for disease classification on 85% of the labeled cases; the remaining 15% were used as a hold-out data set for model validation. Results: A total of 240,378 COPD, 143,748 asthma and 27,437 ACO cases were identified. Extreme Gradient Boosting (XGB) with Bayesian hyper-parameter optimization had the best performance. The XGB model with 12 clinical features, including spirometry results, pack-years, body mass index, symptoms, allergic rhinitis and chronic rhinitis, achieved a sensitivity of 0.98, 0.98 and 0.78 and an F1-score (the harmonic mean of precision and sensitivity) of 0.98, 0.98 and 0.84 in diagnosing COPD, asthma and ACO, respectively. Conclusions: Machine learning is a powerful tool to aid physicians in the differential diagnosis of asthma, COPD and ACO.
Additional studies are needed to evaluate the model in other settings and countries, and to assess its safety for guiding treatment decisions.
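
The per-class sensitivity and F1-score reported above can be illustrated with a minimal sketch in plain Python. It is not the study's code: the function name `per_class_metrics` and the toy labels are hypothetical and chosen only to show how one-vs-rest counts lead to each figure for a three-class problem (COPD, asthma, ACO).

```python
def per_class_metrics(y_true, y_pred, labels):
    """Return {label: (sensitivity, precision, f1)} from one-vs-rest counts.

    Hypothetical helper for illustration; not taken from the study.
    """
    metrics = {}
    pairs = list(zip(y_true, y_pred))
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        # Sensitivity (recall): share of true cases of this class that were found.
        sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
        # Precision: share of predictions of this class that were correct.
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        # F1: harmonic mean of precision and sensitivity.
        denom = precision + sensitivity
        f1 = 2 * precision * sensitivity / denom if denom else 0.0
        metrics[label] = (sensitivity, precision, f1)
    return metrics


# Toy labels invented for illustration only (not study data).
y_true = ["COPD"] * 4 + ["asthma"] * 4 + ["ACO"] * 2
y_pred = ["COPD", "COPD", "COPD", "ACO",
          "asthma", "asthma", "asthma", "asthma",
          "ACO", "COPD"]

results = per_class_metrics(y_true, y_pred, ["COPD", "asthma", "ACO"])
for label, (sens, prec, f1) in results.items():
    print(f"{label}: sensitivity={sens:.2f} precision={prec:.2f} F1={f1:.2f}")
```

In this toy data the smallest class is both missed and over-predicted, so its sensitivity and F1-score fall below those of the larger classes. This shows only how the metrics respond to such errors and makes no claim about why the study's ACO figures are lower.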