Deep learning to estimate lung-related mortality from chest radiographs

Prevention and management of chronic lung diseases (COPD, lung cancer, etc.) are of great importance. While dedicated tests are available for reliable diagnosis and monitoring, accurate identification of those who will eventually develop severe morbidity and mortality is currently limited. Therefore, new possibilities to improve risk stratification are desirable. Chest radiographs (CXR) are common in patients at risk for chronic lung disease and may provide a window into long-term risk. Here, we developed a deep learning model (CXR Lung-Risk) to predict the risk of lung disease mortality from a single routine CXR image.

The CXR Lung-Risk model was developed using 147,497 CXRs of 40,643 asymptomatic participants from the Prostate, Lung, Colorectal, and Ovarian (PLCO) cancer screening trial to predict lung-related mortality (COPD, interstitial lung disease, emphysema, lung cancer) from a single CXR image as input. In three independent testing datasets comprising 15,976 individuals, CXR Lung-Risk showed a graded association with lung disease mortality, which remained robust after adjustment for known risk factors, including chronological age, smoking, and radiologic findings (adjusted hazard ratio for highest risk group vs. reference up to 12.15 [8.85-16.68]; p<0.001). Adding CXR Lung-Risk to a multivariable model improved estimates of lung disease mortality (p≤0.004 for all).
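As a rough reading aid for the reported hazard ratio, the sketch below recovers the standard error, z statistic, and two-sided p-value implied by the interval 12.15 [8.85-16.68]. It assumes the bracket is a 95% Wald-type confidence interval that is symmetric on the log scale; the text above does not state this, and the function name `wald_summary` is purely illustrative and not part of the published statistical code.

```python
import math


def wald_summary(hr: float, lower: float, upper: float, z_crit: float = 1.959964) -> dict:
    """Back out log-scale quantities from a hazard ratio and its interval.

    Assumption (not stated in the text): the interval is a 95% Wald interval,
    i.e. exp(log(HR) +/- z_crit * SE).
    """
    log_lo, log_hi = math.log(lower), math.log(upper)
    se = (log_hi - log_lo) / (2 * z_crit)        # standard error of log(HR)
    center = math.exp((log_lo + log_hi) / 2)     # HR implied by the interval midpoint
    z = math.log(hr) / se                        # Wald z statistic against HR = 1
    p = math.erfc(abs(z) / math.sqrt(2))         # two-sided normal p-value
    return {"se": se, "center": center, "z": z, "p": p}


# Values taken from the reported result above.
summary = wald_summary(12.15, 8.85, 16.68)
print(summary)
```

Under that assumption, the midpoint of the interval on the log scale lands very close to the reported point estimate, and the implied p-value is far below the reported p<0.001 threshold, so the three numbers are mutually consistent.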

These findings motivate the use of deep learning algorithms to translate a patient’s visual appearance on easily obtainable and low-cost chest radiograph images into objective, quantitative, and clinically useful measures. Our results demonstrate that such approaches can improve prognostication in individuals with lung diseases, which in turn allows better identification of those who would benefit most from personalized prevention and treatment strategies.

Publication

Weiss J, Raghu VK, Bontempi D, Christiani DC, Mak RH, Lu MT, Aerts HJWL. Deep learning to estimate lung-related mortality from chest radiographs.

Nature Communications 2023

Code availability

All code for the deep learning system, including the trained model, is publicly available under the open-source MIT license and can be found here. Furthermore, to showcase how our model works, promote transparency, and encourage further validation of our deep learning solution, we provide the community with a free-to-use working implementation of the end-to-end pipeline through a Google Colab notebook. This cloud-based instance allows users with minimal coding proficiency to process a large amount of CXR data without having to install anything on their local machine. In the notebook, we describe all the preprocessing steps needed to convert a standard-of-care CXR from the DICOM format to the format the ensemble model accepts as input. We also describe all the processing steps, discuss the different models composing the ensemble and their details, and provide examples.
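To give a feel for what DICOM-to-model-input preprocessing of this kind typically involves, here is a minimal sketch, not the notebook's actual code. It assumes three generic steps: min-max scaling of raw pixel values to 8-bit, inversion of images stored with the MONOCHROME1 photometric interpretation, and nearest-neighbour resizing to a square grid. The function names and the default size of 224 are hypothetical; the real steps and parameters are those documented in the notebook.

```python
from typing import List

Grid = List[List[float]]


def to_uint8(pixels: Grid, photometric: str = "MONOCHROME2") -> List[List[int]]:
    """Min-max scale raw pixel values to 0..255 and invert MONOCHROME1 images."""
    flat = [v for row in pixels for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero on constant images
    out = [[round(255 * (v - lo) / span) for v in row] for row in pixels]
    if photometric == "MONOCHROME1":
        # In MONOCHROME1, low values are displayed bright, so flip the scale.
        out = [[255 - v for v in row] for row in out]
    return out


def resize_nearest(pixels: List[List[int]], size: int) -> List[List[int]]:
    """Nearest-neighbour resize to a size x size grid."""
    h, w = len(pixels), len(pixels[0])
    return [
        [pixels[(r * h) // size][(c * w) // size] for c in range(size)]
        for r in range(size)
    ]


def preprocess(pixels: Grid, photometric: str = "MONOCHROME2", size: int = 224) -> List[List[int]]:
    """Hypothetical end-to-end helper: scale, invert if needed, then resize."""
    return resize_nearest(to_uint8(pixels, photometric), size)


# Toy 2x2 "radiograph"; in practice the pixel grid and photometric
# interpretation would be read from the DICOM file (e.g. with pydicom).
raw = [[0, 1000], [2000, 4000]]
demo = preprocess(raw, "MONOCHROME2", size=4)
demo_inverted = preprocess(raw, "MONOCHROME1", size=4)
```

A pure-Python grid keeps the sketch dependency-free; a practical pipeline would more likely operate on arrays from an imaging library and use a smoother interpolation method.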

Statistical code

This link contains the code to reproduce the statistical analysis of our paper, "Deep learning to estimate lung-related mortality from chest radiographs". More information about the statistical analysis can be found in the Methods section. The code uses the automatic predictions of our deep learning algorithm, which is described in the Data availability section below.

Data availability

Due to our data use agreements, the original PLCO and NLST data cannot be redistributed, but they can be downloaded upon request from the National Cancer Institute and ECOG-ACRIN (PLCO: https://biometry.nci.nih.gov/cdas/plco/; NLST: https://biometry.nci.nih.gov/cdas/nlst/). The BLCS data are available through the BLCS Trial Center for academic, non-commercial research purposes only, subject to review of a project proposal by a BLCS data access committee.



Acknowledgements

The authors thank the study participants, the investigators, and the NCI for the data collected in the PLCO and NLST trials. Original data collection for the ACRIN 6654 trial (NLST) was supported by NCI Cancer Imaging Program grants. The statements contained herein are solely those of the authors and do not represent or imply concurrence or endorsement by the above organizations. The authors further acknowledge financial support from NIH (DC: NIH (NCI) 5U01CA209414; HA: NIH-USA U24CA194354, NIH-USA U01CA190234, NIH-USA U01CA209414, and NIH-USA R35CA22052), the European Union - European Research Council (HA: 866504), the American Heart Association (ML: 810966; VR: 935176), the Massachusetts General Hospital Thrall Innovation award (ML), the Johnson & Johnson Innovation/National Academy of Medicine Healthy Longevity Quickfire Challenge (ML, VR), and the National Academy of Medicine Healthy Longevity Catalyst Award (VR, JW, ML: 2000011734).