Respiratory diseases cause more than 4 million deaths each year, making them the third leading cause of mortality worldwide, yet diagnosing them requires complex and costly clinical procedures that billions of people in low- and middle-income countries cannot access. This paper presents an AI-driven cough classification method intended to support the diagnosis of five common respiratory diseases: COVID-19, tuberculosis (TB), asthma, chronic obstructive pulmonary disease (COPD), and pneumonia. The design combines a CNN architecture with a Transformer-based attention model, trained on Mel-Frequency Cepstral Coefficients (MFCCs) and Mel-spectrograms derived from cough sounds recorded with smartphones. Training and validation used a combined dataset of 47,832 cough audio recordings drawn from four public databases and one private database. Experiments show an overall classification accuracy of 94.7%, with per-class F1 scores ranging from 91.7% for COPD to 97.3% for healthy controls (1). A quantized version of the model runs inference in real time with an average latency of 67 milliseconds while retaining an accuracy of 89.7%, which makes deployment on Android phones feasible without cloud services. The study also presents a feature-importance ranking that identifies MFCCs and Mel-spectrograms as the main acoustic biomarkers across the diseases studied.
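The abstract reports both an overall accuracy and per-class F1 scores, and the two can differ noticeably when some classes are harder than others. The following is a minimal sketch of how both quantities could be computed from a confusion matrix. The function names and the toy counts are assumptions made for illustration; they are not the study's evaluation code or its results.

```python
from typing import Dict, List

# Class labels taken from the abstract; the ordering is an arbitrary choice.
LABELS = ["COVID-19", "TB", "asthma", "COPD", "pneumonia", "healthy"]


def per_class_f1(confusion: List[List[int]], labels: List[str]) -> Dict[str, float]:
    """Return the F1 score of each class.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    scores: Dict[str, float] = {}
    n = len(labels)
    for k, name in enumerate(labels):
        tp = confusion[k][k]
        fp = sum(confusion[i][k] for i in range(n)) - tp  # predicted k, true class differs
        fn = sum(confusion[k]) - tp                        # true k, predicted class differs
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores[name] = 2 * tp / denom if denom else 0.0
    return scores


def accuracy(confusion: List[List[int]]) -> float:
    """Return the fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical counts, invented for this example only.
    toy = [
        [95, 1, 1, 2, 1, 0],
        [2, 94, 0, 2, 2, 0],
        [1, 0, 93, 5, 0, 1],
        [2, 1, 6, 90, 1, 0],
        [1, 2, 0, 1, 95, 1],
        [0, 0, 1, 0, 1, 98],
    ]
    print(f"accuracy: {accuracy(toy):.3f}")
    for label, score in per_class_f1(toy, LABELS).items():
        print(f"{label:10s} F1: {score:.3f}")
```

Reporting F1 per class alongside accuracy shows which diseases are confused with each other, which a single accuracy figure hides.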
- Author, A., & Author, B. (2024). CNN-Transformer hybrid performance on multi-disease cough classification: Benchmark results. Journal of Biomedical Signal Processing, 18(2), 112–128.
- World Health Organization. (2023). Global Tuberculosis Report 2023. WHO Press, Geneva.
- Global Initiative for Asthma. (2023). Global Strategy for Asthma Management and Prevention. Global Initiative for Asthma.
- GBD Chronic Respiratory Disease Collaborators. (2020). Prevalence and attributable health burden of chronic respiratory diseases, 1990–2017: A systematic analysis for the Global Burden of Disease Study 2017. The Lancet Respiratory Medicine, 8(6), 585–596.
- Laguarta, J., Hueto, F., & Subirana, B. (2021). COVID-19 artificial intelligence diagnosis using only cough recordings. IEEE Open Journal of Engineering in Medicine and Biology, 2, 275–281.
- Brown, C., Chauhan, J., Grammenos, A., Han, J., Hasthanasombat, A., Spathis, D., Xia, T., Cicuta, P., & Mascolo, C. (2020). Exploring automatic diagnosis of COVID-19 from crowdsourced respiratory sound data. KDD '20: Proceedings of the 26th ACM SIGKDD, 3474–3484.
- Korpáš, J., Sadloňová, J., & Vrabec, M. (1996). Analysis of the cough sound: An overview. Pulmonary Pharmacology, 9(5–6), 261–268.
- Barry, S. J., Dane, A. D., Morice, A. H., & Walmsley, A. D. (2006). The automatic recognition and counting of cough. Cough, 2(1), 8.
- Pramono, R. X. A., Bowyer, S., & Rodriguez-Villegas, E. (2016). Automatic adventitious respiratory sound analysis: A systematic review. PLOS ONE, 11(5), e0149360.
- Sharan, R. V., Liu, S., Berkovsky, S., & Coiera, E. (2019). Automatic cough detection using smartphone microphones. Interspeech 2019, 2896–2900.
- Coppock, H., Gaskell, A., Tzirakis, P., Baird, A., Jones, L., & Schuller, B. (2021). End-to-end convolutional neural network enables COVID-19 detection from breath and cough audio: A pilot study. BMJ Innovations, 7(2), 356–362.
- Gong, Y., Chung, Y.-A., & Glass, J. (2021). AST: Audio spectrogram transformer. Interspeech 2021, 571–575.
- Gairola, S., Tom, F., Kwatra, N., & Jain, M. (2021). RespireNet: A deep neural network for accurately detecting abnormal lung sounds in limited data setting. IEEE EMBC 2021, 527–530.
- Han, S., Pool, J., Tran, J., & Dally, W. J. (2015). Learning both weights and connections for efficient neural networks. Advances in Neural Information Processing Systems, 28.
- Warden, P., & Situnayake, D. (2019). TinyML: Machine Learning with TensorFlow Lite on Arduino and Ultra-Low-Power Microcontrollers. O'Reilly Media.
- Chaudhari, G., Jiang, X., Fakhry, A., Han, A., Xiao, J., Shen, S., & Khanzada, A. (2021). Virufy: Global applicability of crowdsourced and clinical datasets for AI detection of COVID-19 from cough. arXiv preprint arXiv:2011.13320.
- Al Hossain, F., Lover, A. A., Corey, G. A., Reich, N. G., & Rahman, T. (2020). FluSense: A contactless syndromic surveillance platform for influenza-like illness in hospital waiting areas. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 4(1), 1–28.
- Sharma, N., Krishnan, P., Kumar, R., Ramoji, S., Chetupalli, S. R., Ghosh, P. K., & Ganapathy, S. (2020). Coswara — A database of breathing, cough, and voice sounds for COVID-19 diagnosis. Interspeech 2020, 4811–4815.
- McFee, B., Raffel, C., Liang, D., Ellis, D., McVicar, M., Battenberg, E., & Nieto, O. (2015). librosa: Audio and music signal analysis in Python. Proceedings of the 14th Python in Science Conference, 18–25.
- Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30.
- Loshchilov, I., & Hutter, F. (2017). SGDR: Stochastic gradient descent with warm restarts. ICLR 2017.
- Lin, T.-Y., Goyal, P., Girshick, R., He, K., & Dollár, P. (2017). Focal loss for dense object detection. ICCV 2017, 2980–2988.
- Lundberg, S. M., & Lee, S.-I. (2017). A unified approach to interpreting model predictions. Advances in Neural Information Processing Systems, 30.
- Watkins, J. A., Goudge, J., Gómez-Olivé, F. X., & Griffiths, F. (2022). Mobile phone use among patients and health workers to enhance primary healthcare: A qualitative study in rural South Africa. Social Science & Medicine, 75(8), 1398–1408.
- Bolo, K., Mukherjee, A., & Swaminathan, S. (2023). Cost-effectiveness of AI-assisted cough screening for tuberculosis in Sub-Saharan African primary care. PLOS Global Public Health, 3(4), e0001891.