Acta Informatica Pragensia 2023, 12(2), 260-274 | DOI: 10.18267/j.aip.2143230
Diagnostic Performance Evaluation of Deep Learning-Based Medical Text Modelling to Predict Pulmonary Diseases from Unstructured Radiology Free-Text Reports
- 1 Department of Information Technology, National Institute of Technology Karnataka, Mangalore-575025, Karnataka, India
- 2 Department of Computer Science and Engineering, Nitte Mahalinga Adyanthaya Memorial Institute of Technology (NMAMIT), NITTE (Deemed to be University), Udupi-574110, India
- 3 Department of Radiology, Kasturba Medical College, Mangalore, Manipal Academy of Higher Education, Manipal-575001, India
Pulmonary diseases are the third most common cause of death worldwide, which makes prompt diagnosis imperative. Radiology is a medical discipline that uses medical imaging to guide treatment. Radiologists prepare reports that interpret the details and findings observed in medical images. These free-text radiology reports are a rich source of textual information that can be exploited to improve medical prognosis, treatment and research. However, radiology reports exist in an unstructured format and are not suitable by themselves for mathematical computation or machine learning operations. Therefore, natural language processing (NLP) strategies are employed to convert unstructured natural language text into a structured format that can be fed into machine learning (ML) or deep learning (DL) models for information extraction. We propose a DL-based medical text modelling framework that incorporates a knowledge base to predict pulmonary diseases from unstructured radiology free-text reports. We evaluate the diagnostic performance of the proposed technique in detail by comparing it with state-of-the-art NLP techniques on radiology free-text reports collected from two medical institutions. The comprehensive analysis shows that the proposed model achieves superior results compared to existing state-of-the-art text modelling techniques.
Keywords: Radiology reports; Unstructured data; Natural language processing; Deep learning.
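To make the step of converting unstructured report text into a structured format concrete, the following is a minimal sketch using plain TF-IDF weighting. It is not the framework proposed in the article, which is DL-based and incorporates a knowledge base; TF-IDF is used here only as a simple, well-known stand-in. The function names `tokenize` and `tfidf_vectors` and the two toy reports are hypothetical and invented for illustration.

```python
import math
import re
from collections import Counter


def tokenize(report: str) -> list[str]:
    """Lowercase a report and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", report.lower())


def tfidf_vectors(reports: list[str]) -> tuple[list[str], list[list[float]]]:
    """Return (vocabulary, one TF-IDF vector per report).

    Uses term frequency (count / report length) multiplied by the
    unsmoothed inverse document frequency log(N / df).
    """
    tokenized = [tokenize(r) for r in reports]
    vocabulary = sorted({tok for doc in tokenized for tok in doc})
    n_docs = len(tokenized)
    # Document frequency: number of reports that contain each term.
    doc_freq = Counter(tok for doc in tokenized for tok in set(doc))
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        total = len(doc) or 1  # guard against empty reports
        vectors.append([
            (counts[term] / total) * math.log(n_docs / doc_freq[term])
            for term in vocabulary
        ])
    return vocabulary, vectors


# Hypothetical toy reports, invented for illustration only.
toy_reports = [
    "Patchy consolidation in the right lower lobe.",
    "Lungs are clear. No consolidation or effusion.",
]
vocab, vectors = tfidf_vectors(toy_reports)
```

Each report becomes a fixed-length numeric vector indexed by the shared vocabulary, which is the kind of structured input an ML or DL classifier can consume. Terms that occur in every report, such as "consolidation" in the toy data, receive zero weight under this unsmoothed formula, which is one reason dense embeddings or knowledge-based features are often preferred for clinical text.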
Received: February 27, 2023; Revised: April 13, 2023; Accepted: April 13, 2023; Prepublished online: April 21, 2023; Published: October 10, 2023
This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits use, distribution, and reproduction in any medium, provided the original publication is properly cited. No use, distribution or reproduction is permitted which does not comply with these terms.