Publications

2026

BUET multi-disease heart sound dataset: A comprehensive auscultation dataset for developing computer-aided diagnostic systems

Shams Nafisa Ali, Afia Zahin, Samiul Based Shuvo, Nusrat Binta Nizam, Shoyad Ibn Sabur Khan Nuhash, Sayeed Sajjad Razin, S.M. Sakeef Sani, Farihin Rahman, Nawshad Binta Nizam, Farhat Binte Azam, Rakib Hossen, Sumaiya Ohab, Nawsabah Noor, Taufiq Hasan

Computer Methods and Programs in Biomedicine Update

Cardiac auscultation, an integral tool in diagnosing cardiovascular diseases (CVDs), often relies on the subjective interpretation of clinicians, presenting a limitation in consistency and accuracy. Addressing this, we introduce the BUET Multi-disease Heart Sound (BMD-HS) dataset, a comprehensive and meticulously curated collection of heart sound recordings. This dataset, encompassing 864 recordings across five distinct classes of common heart sounds, is representative of a broad spectrum of valvular heart diseases, with a focus on diagnostically challenging cases. The standout feature of the BMD-HS Dataset is its innovative multi-label annotation system, which captures a diverse range of diseases and unique disease states. This system significantly enhances the dataset’s utility for developing advanced machine learning models in automated heart sound classification and diagnosis. By bridging the gap between traditional auscultation practices and contemporary data-driven diagnostic methods, the BMD-HS Dataset is poised to revolutionize CVD diagnosis and management, providing an invaluable resource for the advancement of cardiac health research. The dataset is publicly available at: https://github.com/sani002/BMD-HS-Dataset

OxyJet CPAP: an electricity-free low-cost emergency respiratory support device

Md Kawsar Ahmed, Kaisar Ahmed Alman, Meemnur Rashid, Farhan Muhib, Saeedur Rahman, Nawsabah Noor, Md. Khairul Islam, Forhad Uddin Hasan Chowdhury, Md. Mohiuddin Sharif, Rifat Hossain Ratul, Sohana Jahan, Mushfiq Newaz Ahmed, Naveed Rahman, Kazi Nazmul Islam, Mohammad Shahjahan Siddike Shakil, Md. Safiul Islam, Salahuddin Ahmed, Md. Khairul Anam, Md. Titu Miah, Robed Amin, Alain Bernard Labrique, Yasser Khan, Taufiq Hasan

BioMedical Engineering OnLine

Respiratory support devices in resource-limited settings should be inexpensive, portable, effective, and easy to use. Here, the authors describe OxyJet, a low-cost, portable, electricity-free, and 3D-printed continuous positive airway pressure (CPAP) device designed to provide non-invasive respiratory support outside the ICU. Built using off-the-shelf components and 3D printing technology, OxyJet is intended to be inexpensive and easy to produce, costing less than 10% of the price of a comparable CPAP system. Using the potential energy of a high-pressure oxygen jet, it can deliver oxygenated airflow up to about 65 L/min, positive end-expiratory pressure (PEEP) within 5 to 15 cmH2O, and a fraction of inspired oxygen (FiO2) of up to 100%. The device was bench-tested following UK-MHRA RMCPAP guidelines and then evaluated in healthy volunteers and hypoxemic patients. In a comparative pilot study of 23 hypoxemic adult patients in Dhaka, Bangladesh, OxyJet CPAP significantly improved peripheral oxygen saturation (SpO2), with a mean increase of 12.0% (95% CI 10.8 to 13.2), compared with 11.5% (95% CI 9.3 to 13.8) for standard CPAP. The findings suggest that OxyJet CPAP is feasible and has short-term physiological effects comparable to standard CPAP systems, with potential value as an emergency respiratory support device outside the ICU in resource-limited settings.

2025

RadTextAid: A CNN-Guided Framework Utilizing Lightweight Vision-Language Models for Assistive Radiology Reporting

Mahmud Wasif Nafee, Tasmia Rahman Aanika, Taufiq Hasan

Workshop on Large Language Models and Generative AI for Health at AAAI 2025

Deciphering chest X-rays is crucial for diagnosing thoracic diseases such as pneumonia, lung cancer, and cardiomegaly. Radiologists often work under significant workloads and handle large volumes of data, which can lead to exhaustion and burnout. Advanced deep learning models can effectively generate draft radiology reports, potentially alleviating the radiologist's workload. However, many current systems produce reports that include clinically irrelevant or redundant information. To address these limitations, we propose RadTextAid, a novel multi-modal framework for generating high-quality, clinically relevant radiology reports. Our approach integrates vision-language models (VLMs) for natural language generation, augmented by disease-specific tags derived from a CNN that analyzes chest X-ray images to identify key pathological features. A key component of our framework is the pre-processing of the radiology report training dataset, which removes routine, repetitive, or non-informative phrases commonly found in chest X-ray reports and ensures that the model focuses its learning on clinically meaningful content, as qualitatively validated by expert radiologists. Experimental results show that our system yields absolute improvements of 4.8% in BERTScore and 3.16% in the F1-CheXbert metric compared to a state-of-the-art model. These results demonstrate that the proposed RadTextAid framework not only improves the detection of abnormalities in chest X-ray images but also enhances the overall quality and coherence of the generated reports, paving the way toward more efficient and effective radiology reporting.

Efficient Antihallucinogenic AI for Tropical Medicine: A Probabilistic Framework for Differential Diagnosis

SM Sakeef Sani, Md Shaown Miah, Taufiq Hasan

Workshop on Large Language Models and Generative AI for Health at AAAI 2025

Clinical decision-making, particularly differential diagnosis in low-resource healthcare settings, poses significant challenges due to the complexity and variety of symptoms that patients present and the shortage of skilled doctors. This study introduces mLabLLM, a fine-tuned adaptation of the LLaMA 3.2 3B model designed to enhance clinical decision-making in differential diagnosis. It leverages domain-specific datasets, including a curated tropical disease dataset covering dengue, malaria, and chikungunya (prevalent health challenges in South Asian countries), and employs optimization techniques such as Low-Rank Adaptation (LoRA) and pruning to reduce computational overhead, achieving greater efficiency without compromising performance. A probabilistic framework integrates symptom-disease frequencies with Bayesian reasoning, enabling dynamic ranking of diagnoses during patient interactions. Experimental results show that mLabLLM significantly outperforms baseline models, achieving 82.8% Top-3 accuracy in differential diagnosis, compared to 75.1% for Phi-3-128k and 72.4% for LLaMA 3.2 3B, positioning it as a scalable and practical solution for real-world clinical applications.
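The Bayesian ranking step described above can be illustrated with a small sketch. This is not the authors' implementation: the disease priors, symptom likelihoods, and function names below are made-up illustrative values, assuming the standard naive-Bayes way of combining symptom-disease frequencies.

```python
# Illustrative sketch of Bayesian differential-diagnosis ranking.
# Priors and likelihoods are made-up numbers, not taken from the paper.
priors = {"dengue": 0.40, "malaria": 0.35, "chikungunya": 0.25}
# P(symptom | disease), as might be estimated from symptom-disease frequency tables
likelihood = {
    "dengue":      {"fever": 0.95, "rash": 0.50, "joint_pain": 0.30},
    "malaria":     {"fever": 0.90, "rash": 0.05, "joint_pain": 0.10},
    "chikungunya": {"fever": 0.85, "rash": 0.40, "joint_pain": 0.90},
}

def rank_diagnoses(symptoms, priors, likelihood):
    """Return diseases ranked by posterior probability given observed symptoms."""
    scores = {}
    for disease, prior in priors.items():
        p = prior
        for s in symptoms:
            p *= likelihood[disease].get(s, 0.01)  # small floor for unseen symptoms
        scores[disease] = p
    total = sum(scores.values())                   # normalize into posteriors
    return sorted(((d, p / total) for d, p in scores.items()),
                  key=lambda x: x[1], reverse=True)

ranking = rank_diagnoses(["fever", "joint_pain"], priors, likelihood)
print(ranking)  # chikungunya rises to the top once joint pain is reported
```

Re-ranking after each newly reported symptom is what makes the ranking "dynamic" during a patient interaction.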

A self-attention-driven deep denoiser model for real time lung sound denoising in noisy environments

Samiul Based Shuvo, Syed Samiul Alam, Taufiq Hasan

Biomedical Signal Processing and Control

Objective: Lung auscultation is a valuable tool in diagnosing and monitoring various respiratory diseases. However, lung sounds (LS) are significantly affected by numerous sources of contamination, especially when recorded in real-world clinical settings. Conventional denoising models prove impractical for LS denoising, primarily owing to spectral overlap complexities arising from diverse noise sources. To address this issue, we propose a specialized deep-denoiser model (Uformer) for lung sound denoising. Methods: The proposed Uformer model consists of three modules: a Convolutional Neural Network (CNN) encoder module, dedicated to extracting latent features; a Transformer encoder module, employed to further enhance the encoding of unique LS features and effectively capture intricate long-range dependencies; and a CNN decoder module, employed to generate the denoised signals. An ablation study was performed to find the optimal architecture. Results: The performance of the proposed Uformer model was evaluated on lung sounds contaminated with different types of synthetic and real-world noise. Lung sound signals with signal-to-noise ratios (SNRs) from −12 dB to 15 dB were considered in the testing experiments. The proposed model showed an average SNR improvement of 16.51 dB when evaluated with −12 dB LS signals. Our end-to-end model, with an average SNR improvement of 19.31 dB, outperforms the existing model when evaluated with ambient noise, while using fewer parameters. Conclusion: Based on the qualitative and quantitative findings of this study, Uformer is robust and generalizes well for use in assisting the monitoring of respiratory conditions.
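The SNR improvement metric reported above can be reproduced in form (not in value) with a short sketch, assuming the standard definition of SNR in dB against a clean reference; the sinusoidal "lung sound" and the noise levels below are stand-ins, not data from the paper.

```python
import numpy as np

def snr_db(clean, estimate):
    """Signal-to-noise ratio in dB between a clean reference and an estimate."""
    noise = clean - estimate
    return 10.0 * np.log10(np.sum(clean**2) / np.sum(noise**2))

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 8000)
clean = np.sin(2 * np.pi * 150 * t)                     # stand-in for a lung sound
noisy = clean + 0.8 * rng.standard_normal(t.size)       # contaminated input
denoised = clean + 0.05 * rng.standard_normal(t.size)   # stand-in for model output

# SNR improvement = output SNR minus input SNR
improvement = snr_db(clean, denoised) - snr_db(clean, noisy)
print(f"SNR improvement: {improvement:.1f} dB")
```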

2024

BUET Multi-disease Heart Sound Dataset: A Comprehensive Auscultation Dataset for Developing Computer-Aided Diagnostic Systems

Shams Nafisa Ali, Afia Zahin, Samiul Based Shuvo, Nusrat Binta Nizam, Shoyad Ibn Sabur Khan Nuhash, Sayeed Sajjad Razin, SM Sani, Farihin Rahman, Nawshad Binta Nizam, Farhat Binte Azam, Rakib Hossen, Sumaiya Ohab, Nawsabah Noor, Taufiq Hasan

arXiv preprint arXiv:2409.00724

Cardiac auscultation, an integral tool in diagnosing cardiovascular diseases (CVDs), often relies on the subjective interpretation of clinicians, presenting a limitation in consistency and accuracy. Addressing this, we introduce the BUET Multi-disease Heart Sound (BMD-HS) dataset - a comprehensive and meticulously curated collection of heart sound recordings. This dataset, encompassing 864 recordings across five distinct classes of common heart sounds, represents a broad spectrum of valvular heart diseases, with a focus on diagnostically challenging cases. The standout feature of the BMD-HS dataset is its innovative multi-label annotation system, which captures a diverse range of diseases and unique disease states. This system significantly enhances the dataset's utility for developing advanced machine learning models in automated heart sound classification and diagnosis. By bridging the gap between traditional auscultation practices and contemporary data-driven diagnostic methods, the BMD-HS dataset is poised to revolutionize CVD diagnosis and management, providing an invaluable resource for the advancement of cardiac health research. The dataset is publicly available at this link: https://github.com/mHealthBuet/BMD-HS-Dataset.

BT-Net: An end-to-end multi-task architecture for brain tumor classification, segmentation, and localization from MRI images

Salman Fazle Rabby, Muhammad Abdullah Arafat, Taufiq Hasan

Array, Vol. 22

Brain tumors are severe medical conditions that can prove fatal if not detected and treated early. Radiologists often use MRI and CT scan imaging to diagnose brain tumors early. However, a shortage of skilled radiologists to analyze medical images can be problematic in low-resource healthcare settings. To overcome this issue, deep learning-based automatic analysis of medical images can be an effective tool for assistive diagnosis. Conventional methods generally focus on developing specialized algorithms to address a single aspect, such as segmentation, classification, or localization of brain tumors. In this work, we propose a novel multi-task network, modified from the conventional VGG16 and concatenated with a U-Net variant, that can simultaneously achieve segmentation, classification, and localization within the same architecture. We trained the classification branch on the Brain Tumor MRI Dataset and the segmentation branch on the Brain Tumor Segmentation dataset. The integrated outputs of our method can aid in the simultaneous classification, segmentation, and localization of four types of brain tumors in MRI scans. The proposed multi-task framework achieved 97% accuracy in classification and a Dice similarity score of 0.86 for segmentation. In addition, the method shows higher computational efficiency compared to existing methods. Our method can be a promising tool for assistive diagnosis in low-resource healthcare settings where skilled radiologists are scarce.

Improving Pediatric Pneumonia Diagnosis with Adult Chest X-ray Images Utilizing Contrastive Learning and Embedding Similarity

Mohammad Zunaed, Anwarul Hasan, Taufiq Hasan

2024 46th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)

Despite the advancement of deep learning-based computer-aided diagnosis (CAD) methods for pneumonia from adult chest x-ray (CXR) images, the performance of CAD methods applied to pediatric images remains suboptimal, mainly due to the lack of large-scale annotated pediatric imaging datasets. Establishing a proper framework to leverage existing large-scale adult CXR datasets can thus enhance pediatric pneumonia detection performance. In this paper, we propose a three-branch parallel path learning-based framework that utilizes both adult and pediatric datasets to improve the performance of deep learning models on pediatric test datasets. The paths are trained with pediatric-only, adult-only, and both types of CXRs, respectively. Our proposed framework utilizes a multi-positive contrastive loss to cluster the class-wise embeddings, together with an embedding similarity loss among the three parallel paths that brings the class-wise embeddings as close as possible, reducing the effect of domain shift. Experimental evaluations on open-access adult and pediatric CXR datasets show that the proposed method achieves a superior AUROC score of 0.8464, compared to 0.8348 obtained using the conventional approach of joint training on both datasets. The proposed approach thus paves the way for generalized CAD models that are effective for both adult and pediatric age groups.
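As a rough illustration of the multi-positive contrastive objective mentioned above (the generic supervised-contrastive formulation, not necessarily the authors' exact loss), the sketch below treats every pair of same-label embeddings in a batch as positives; the batch size, embedding dimension, and temperature are assumptions.

```python
import numpy as np

def multi_positive_contrastive_loss(z, labels, tau=0.1):
    """Supervised (multi-positive) contrastive loss: every pair of
    same-label samples in the batch acts as a positive pair."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)     # L2-normalize embeddings
    sim = z @ z.T / tau                                  # pairwise similarities
    np.fill_diagonal(sim, -np.inf)                       # exclude self-pairs
    m = sim.max(axis=1, keepdims=True)                   # stable row-wise log-softmax
    log_prob = sim - (m + np.log(np.exp(sim - m).sum(axis=1, keepdims=True)))
    pos = (labels[:, None] == labels[None, :]) & ~np.eye(len(z), dtype=bool)
    # average negative log-probability over each anchor's positives
    per_anchor = -np.where(pos, log_prob, 0.0).sum(axis=1) / pos.sum(axis=1)
    return per_anchor.mean()

rng = np.random.default_rng(0)
emb = rng.standard_normal((8, 16))                       # assumed batch of embeddings
labels = np.array([0, 0, 1, 1, 0, 1, 0, 1])
loss = multi_positive_contrastive_loss(emb, labels)
print(loss)
```

Minimizing this loss pulls same-class embeddings together across the parallel paths, which is the clustering effect the framework relies on.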

Uformer: A UNet-Transformer fused robust end-to-end deep learning framework for real-time denoising of lung sounds

Samiul Based Shuvo, Syed Samiul Alam, Taufiq Hasan

arXiv preprint arXiv:2404.04365

Objective: Lung auscultation is a valuable tool in diagnosing and monitoring various respiratory diseases. However, lung sounds (LS) are significantly affected by numerous sources of contamination, especially when recorded in real-world clinical settings. Conventional denoising models prove impractical for LS denoising, primarily owing to spectral overlap complexities arising from diverse noise sources. To address this issue, we propose a specialized deep-learning model (Uformer) for lung sound denoising. Methods: The proposed Uformer model consists of three modules: a Convolutional Neural Network (CNN) encoder module, dedicated to extracting latent features; a Transformer encoder module, employed to further enhance the encoding of unique LS features and effectively capture intricate long-range dependencies; and a CNN decoder module, employed to generate the denoised signals. An ablation study was performed to find the optimal architecture. Results: The performance of the proposed Uformer model was evaluated on lung sounds contaminated with different types of synthetic and real-world noise. Lung sound signals with signal-to-noise ratios (SNRs) from -12 dB to 15 dB were considered in the testing experiments. The proposed model showed an average SNR improvement of 16.51 dB when evaluated with -12 dB LS signals. Our end-to-end model, with an average SNR improvement of 19.31 dB, outperforms the existing model when evaluated with ambient noise, while using fewer parameters. Conclusion: Based on the qualitative and quantitative findings of this study, Uformer is robust and generalizes well for use in assisting the monitoring of respiratory conditions.

Learning to generalize towards unseen domains via a content-aware style invariant model for disease detection from chest X-rays

Mohammad Zunaed, Md Aynal Haque, Taufiq Hasan

IEEE Journal of Biomedical and Health Informatics

Performance degradation due to distribution discrepancy is a longstanding challenge in intelligent imaging, particularly for chest X-rays (CXRs). Recent studies have demonstrated that CNNs are biased toward styles (e.g., uninformative textures) rather than content (e.g., shape), in stark contrast to the human vision system. Radiologists tend to learn visual cues from CXRs and thus perform well across multiple domains. Motivated by this, we employ novel on-the-fly style randomization modules at both the image (SRM-IL) and feature (SRM-FL) levels to create rich style-perturbed features while keeping the content intact for robust cross-domain performance. Previous methods simulate unseen domains by constructing new styles via interpolation or by swapping styles from existing data, limiting them to the source domains available during training. In contrast, SRM-IL samples the style statistics from the possible value range of a CXR image instead of from the training data to achieve more diversified augmentations. Moreover, we utilize pixel-wise learnable parameters in the SRM-FL, rather than pre-defined channel-wise means and standard deviations, as style embeddings for capturing more representative style features. Additionally, we leverage consistency regularizations on global semantic features and predictive distributions from style-perturbed and unperturbed versions of the same CXR to tune the model's sensitivity toward content markers for accurate predictions. Our proposed method, trained on the CheXpert and MIMIC-CXR datasets, achieves AUCs (%) of 77.32±0.35, 88.38±0.19, and 82.63±0.13 on the unseen-domain test datasets BRAX, VinDr-CXR, and NIH Chest X-ray14, respectively, compared to 75.56±0.80, 87.57±0.46, and 82.07±0.19 from state-of-the-art models on five-fold cross-validation, with statistically significant results in thoracic disease classification.
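The image-level style randomization idea (SRM-IL) can be sketched as an AdaIN-style operation: strip a channel's own mean and standard deviation, then re-inject statistics sampled from the possible value range of the image rather than from training data. The function below is an illustrative assumption, not the paper's code; the uniform sampling ranges are placeholders.

```python
import numpy as np

def style_randomize(feat, value_range=(0.0, 1.0), eps=1e-5, rng=None):
    """Image-level style randomization sketch: remove each channel's own
    style statistics (mean/std) and re-inject statistics sampled uniformly
    from the possible value range, keeping the content (the normalized
    spatial pattern) intact."""
    rng = rng or np.random.default_rng()
    c = feat.shape[0]
    mu = feat.mean(axis=(1, 2), keepdims=True)
    sigma = feat.std(axis=(1, 2), keepdims=True)
    normalized = (feat - mu) / (sigma + eps)              # content, style removed
    lo, hi = value_range
    new_mu = rng.uniform(lo, hi, size=(c, 1, 1))          # sampled style mean
    new_sigma = rng.uniform(eps, (hi - lo) / 2, size=(c, 1, 1))  # sampled style std
    return normalized * new_sigma + new_mu                # re-styled, same content

x = np.random.default_rng(1).uniform(0, 1, size=(1, 32, 32))  # stand-in CXR
y = style_randomize(x, rng=np.random.default_rng(2))
```

Because the transform is a positive per-channel affine map, the spatial pattern (content) is preserved exactly while the first- and second-order statistics (style) change on every call.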

A web-based mpox skin lesion detection system using state-of-the-art deep learning models considering racial diversity

Shams Nafisa Ali, Md Tazuddin Ahmed, Tasnim Jahan, Joydip Paul, SM Sakeef Sani, Nawsabah Noor, Anzirun Nahar Asma, Taufiq Hasan

Biomedical Signal Processing and Control

The recent ‘Mpox’ outbreak, formerly known as ‘Monkeypox’, has become a significant public health concern and has spread to over 110 countries globally. The challenge of clinically diagnosing mpox early on is due, in part, to its similarity to other types of rashes. Computer-aided screening tools have proven valuable in cases where Polymerase Chain Reaction (PCR) based diagnosis is not immediately available. Deep learning methods are powerful in learning complex data representations, but their efficacy largely depends on adequate training data. To address this challenge, we present the “Mpox Skin Lesion Dataset Version 2.0 (MSLD v2.0)” as a follow-up to the previously released, openly accessible dataset, one of the first datasets containing mpox lesion images. This dataset contains images of patients with mpox and five other non-mpox classes (chickenpox, measles, hand-foot-mouth disease, cowpox, and healthy). We benchmark the performance of several state-of-the-art deep learning models, including VGG16, ResNet50, DenseNet121, MobileNetV2, EfficientNetB3, InceptionV3, and Xception, to classify mpox and other infectious skin diseases. In order to reduce the impact of racial bias, we utilize a color space data augmentation method to increase skin color variability during training. Additionally, by leveraging transfer learning with pre-trained weights generated from the HAM10000 dataset, an extensive collection of pigmented skin lesion images, we achieved the best overall accuracy of 83.59 ± 2.11%. Finally, the developed models are incorporated within a prototype web application that analyzes skin images uploaded by a user and determines whether the subject is a suspected mpox patient.

2023

Evaluation of feasibility phase of adaptive version of locally made bubble continuous positive airway pressure oxygen therapy for the treatment of COVID-19 positive and negative adults with severe pneumonia and hypoxaemia

Mohammod Jobayer Chisti, Ahmed Ehsanur Rahman, Taufiq Hasan, Tahmeed Ahmed, Shams El Arifeen, John David Clemens, Abu Sayem Mirza Md Hasibur Rahman, Md Fakhar Uddin, Md Robed Amin, Md Titu Miah, Md Khairul Islam, Mohiuddin Sharif, Abu Sadat Mohammad Sayeem Bin Shahid, Anisuddin Ahmed, Goutom Banik, Meemnur Rashid, Md Kawsar Ahmed, Lubaba Shahrin, Farzana Afroze, Monira Sarmin, Sharika Nuzhat, Supriya Sarkar, Jahurul Islam, Muhammad Shariful Islam, John Norrie, Harry Campbell, Harish Nair, Steve Cunningham

Journal of Global Health

Background: Bubble continuous positive airway pressure (bCPAP) oxygen therapy has been shown to be safe and effective in treating children with severe pneumonia and hypoxaemia in Bangladesh. Due to the lack of adequate non-invasive ventilatory support during the coronavirus disease 2019 (COVID-19) crisis, we aimed to evaluate whether bCPAP was safe and feasible when adapted for use in adults with similar indications. Methods: Adults (18-64 years) with severe pneumonia and moderate hypoxaemia (80 to <90% oxygen saturation (SpO2) in room air) were provided bCPAP via nasal cannula at a flow rate of 10 litres per minute (L/min) of oxygen at 10 centimetres (cm) H2O pressure, in two tertiary hospitals in Dhaka, Bangladesh. Qualitative interviews and focus group discussions, using a descriptive phenomenological approach, were performed with patients and staff (n = 39) prior to and after the introduction (n = 12 and n = 27, respectively) to understand the operational challenges to introducing bCPAP. Results: We enrolled 30 adults (median age 52, interquartile range (IQR) 40-60 years) with severe pneumonia and hypoxaemia and/or acute respiratory distress syndrome (ARDS), irrespective of COVID-19 test results, to receive bCPAP. At baseline, mean SpO2 on room air was 87% (±2), which increased to 98% (±2) after initiation of bCPAP. The mean duration of bCPAP oxygen therapy was 14.4 ± 24.8 hours. There were no adverse events of note, and no treatment failures or deaths. Operational challenges to the clinical introduction of bCPAP were a lack of functioning pulse oximeters, difficult nasal interface fixation among those wearing a nose pin, occasional auto-bubbling or lack of bubbling in the water-filled plastic bottle, lack of a holder for the water-filled plastic bottle, rapid turnover of trained clinicians at the hospitals, and limited routine care of patients by hospital clinicians, particularly after official hours.
Discussion: If tertiary hospitals in Bangladesh are supplied with well-functioning, good-quality pulse oximeters, and doctors and nurses receive enhanced training on the proper use of the adapted version of bCPAP for treating adults with severe pneumonia and hypoxaemia with or without ARDS, bCPAP was found to be safe, well tolerated, and not associated with treatment failure across all study participants. These observations increase the investigators' confidence in considering a future efficacy trial of adaptive bCPAP oxygen therapy compared with WHO-standard low-flow oxygen therapy in such patients. Conclusions: Although bCPAP oxygen therapy was found to be safe and feasible in this pilot study, several challenges were identified that need to be taken into account when planning a definitive clinical trial.

COVID-19 Severity Prediction from Chest X-ray Images Using an Anatomy-Aware Deep Learning Model

Nusrat Binta Nizam, Sadi Mohammad Siddiquee, Mahbuba Shirin, Mohammed Imamul Hassan Bhuiyan, Taufiq Hasan

Journal of Digital Imaging

The COVID-19 pandemic has been adversely affecting patient management systems in hospitals around the world. Radiological imaging, especially chest X-ray and lung Computed Tomography (CT) scans, plays a vital role in the severity analysis of hospitalized COVID-19 patients. However, with an increasing number of patients and a lack of skilled radiologists, automated assessment of COVID-19 severity using medical image analysis has become increasingly important. Chest X-ray (CXR) imaging plays a significant role in assessing the severity of pneumonia, especially in low-resource hospitals, and is the most frequently used diagnostic imaging modality in the world. Previous methods that automatically predict the severity of COVID-19 pneumonia mainly focus on feature pooling from pre-trained CXR models without explicitly considering the underlying human anatomical attributes. This paper proposes an anatomy-aware (AA) deep learning model that learns generic features from X-ray images while considering the underlying anatomical information. Utilizing a pre-trained model and lung segmentation masks, the model generates a feature vector comprising disease-level features and lung involvement scores. We used four different open-source datasets, along with an in-house annotated test set, for training and evaluation of the proposed method. The proposed method improves the geographical extent score by 11% in terms of mean squared error (MSE) while preserving the benchmark result in the lung opacity score. The results demonstrate the effectiveness of the proposed AA model in COVID-19 severity prediction from chest X-ray images. The algorithm can be used in hospitals in low-resource settings for COVID-19 severity prediction, especially where skilled radiologists are scarce.

Comparison of Low-cost, Electricity-free, CPAP (OXYJET) with High-Flow Nasal Cannula Treatment outside Critical Care: A Randomized Clinical Trial

Taufiq Hasan, Md Kawsar Ahmed, Kaisar Ahmed Alman, Meemnur Rashid, Farhan Muhib, Saeedur Rahman, Nawsabah Noor, Md Khairul Islam, Forhad Uddin Hasan Chowdhury, Md Mohiuddin Sharif, Rifat Hossain Ratul, Sohana Jahan, Md Titu Miah, Robed Amin, Alain Bernard Labrique, Yasser Khan

Background: In Low- and Middle-Income Countries (LMICs), general wards are typically limited to providing low-flow oxygen therapy (up to 15 L/min). However, high-flow treatment, such as noninvasive ventilation (NIV) administered in pre-ICU settings, has effectively reduced intensive care unit (ICU) admissions. Here, we describe a low-cost, electricity-free, pressure-driven, and 3D-printed continuous positive airway pressure (CPAP) device (‘OxyJet’) that can provide noninvasive ventilation support in general wards. This study assesses whether the developed CPAP device can be an alternative to a High-Flow Nasal Cannula (HFNC) device for supporting hypoxemic patients outside critical care. Methods: We performed an open-label, parallel-assignment, randomized controlled trial in 45 severely hypoxemic patients between April 17 and July 9, 2021 (NCT04681859). The primary outcome was ventilator-free days at day 10 (VFD10). We compared changes from baseline in peripheral oxygen saturation, heart rate, respiratory rate, mortality hazard ratio, death/intubation within 10 days, patient recovery, adverse events, and oxygen consumption between the two treatment groups. Results: Patients in the CPAP group had a mean difference in ventilator-free days at day 10 (VFD10) of 2.75 (95% CI -0.17 to 5.68; p=0.003). Mortality in the CPAP group showed a low hazard ratio (HR) of 0.65 (95% CI 0.31 to 1.37; p=0.041). Death/intubation within 10 days in the CPAP group showed a low relative risk (RR) of 0.52 (95% CI 0.25 to 1.05; p=0.033). Finally, the device consumed significantly less oxygen than HFNC, with a median difference of -16.11 L/min (95% CI -24.63 to -6.67; p=0.001).

An end-to-end deep learning framework for real-time denoising of heart sounds for cardiac disease detection in unseen noise

Shams Nafisa Ali, Samiul Based Shuvo, Muhammad Ishtiaque Sayeed Al-Manzo, Anwarul Hasan, Taufiq Hasan

IEEE Access, vol. 11

The heart sound signals captured via a digital stethoscope are often distorted by environmental and physiological noise, altering their salient and critical properties. The problem is exacerbated in crowded low-resource hospital settings with high noise levels, which degrade diagnostic performance. In this study, we present a novel deep encoder-decoder-based denoising architecture (LU-Net) to suppress ambient and internal lung-sound noise. Training is done using a large benchmark PCG dataset mixed with physiological noise, i.e., breathing sounds. Two different noisy datasets were prepared for experimental evaluation by mixing unseen lung sounds and hospital ambient noises with the clean heart sound recordings. We also used the inherently noisy portion of the PASCAL heart sound dataset for evaluation. The proposed framework effectively suppressed background noise in both unseen real-world data and synthetically generated noisy heart sound recordings, improving the signal-to-noise ratio (SNR) by 5.575 dB on average while using only 1.32 M parameters. The proposed model outperforms the current state-of-the-art U-Net model with average SNR improvements of 5.613 dB and 5.537 dB in the presence of lung sound and unseen hospital noise, respectively. LU-Net also outperformed the state-of-the-art Fully Convolutional Network (FCN) by 1.750 dB and 1.748 dB under the same lung-sound and unseen hospital-noise conditions. In addition, the proposed denoising model improves classification accuracy by 38.93% on the noisy portion of the PASCAL heart sound dataset. These results indicate that the proposed architecture delivers robust denoising performance across datasets with diverse noise levels and characteristics.
The proposed deep learning-based PCG denoising approach is a pioneering effort that can significantly improve the accuracy of computer-aided auscultation systems for detecting cardiac diseases in noisy, low-resource hospitals and underserved communities.
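
The SNR gains quoted above follow the usual decibel definition. As a minimal, illustrative sketch (not code from the paper), an SNR improvement can be computed from clean, noisy, and denoised signals as follows:

```python
import math

def snr_db(clean, estimate):
    """SNR in dB: clean-signal power over residual-error power."""
    signal_power = sum(c * c for c in clean)
    error_power = sum((c - e) ** 2 for c, e in zip(clean, estimate))
    return 10.0 * math.log10(signal_power / error_power)

# Toy example: a clean tone, a noisy copy, and a partially denoised copy.
clean = [math.sin(0.1 * n) for n in range(1000)]
noisy = [c + 0.5 * math.sin(2.9 * n) for n, c in enumerate(clean)]
denoised = [c + 0.1 * math.sin(2.9 * n) for n, c in enumerate(clean)]

improvement = snr_db(clean, denoised) - snr_db(clean, noisy)
```

Here the denoiser shrinks the residual noise amplitude from 0.5 to 0.1, a 25-fold power ratio, i.e. an improvement of about 14 dB.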

Activity Classification from First-Person Office Videos with Visual Privacy Protection

Partho Ghosh, Md Abrar Istiak, Nayeeb Rashid, Ahsan Habib Akash, Ridwan Abrar, Ankan Ghosh Dastider, Asif Shahriyar Sushmit, Taufiq Hasan

Proceedings of International Conference on Fourth Industrial Revolution and Beyond 2021

With the advent of wearable body cameras, human activity classification from First-Person Videos (FPV) has become a topic of increasing importance for various applications, including life-logging, law enforcement, sports, the workplace, and health care. One of the challenging aspects of FPV is its exposure to potentially sensitive objects within the user’s field of view. In this work, we developed a visual privacy-aware activity classification system focusing on office videos. We utilized a Mask R-CNN with an Inception-ResNet hybrid as a feature extractor for detecting, and later blurring out, sensitive objects (e.g., digital screens, human faces, paper) in the videos. We incorporate an ensemble of Recurrent Neural Networks (RNNs) with ResNet-, ResNeXt-, and DenseNet-based feature extractors for activity classification. The proposed system was trained and evaluated on the FPV office video dataset, a subset of the BON [1] vision dataset for office activity recognition (2021) that includes 18 classes and was made available through the IEEE Video and Image Processing (VIP) Cup 2019 competition. On the original unprotected FPVs, the proposed activity classifier ensemble reached an accuracy of 85.078% with precision, recall, and F1 scores of 0.88, 0.85, and 0.86, respectively. Performance degraded slightly on privacy-protected videos, with accuracy, precision, recall, and F1 scores of 73.68%, 0.79, 0.75, and 0.74, respectively.

Learning to generalize towards unseen domains via a content-aware style invariant framework for disease detection from chest x-rays

Mohammad Zunaed, M Haque, Taufiq Hasan

IEEE Journal of Biomedical and Health Informatics

Performance degradation due to distribution discrepancy is a longstanding challenge in intelligent imaging, particularly for chest X-rays (CXRs). Recent studies have demonstrated that CNNs are biased toward styles (e.g., uninformative textures) rather than content (e.g., shape), in stark contrast to the human visual system. Radiologists tend to learn visual cues from CXRs and thus perform well across multiple domains. Motivated by this, we employ novel on-the-fly style randomization modules at both the image (SRM-IL) and feature (SRM-FL) levels to create richly style-perturbed features while keeping the content intact for robust cross-domain performance. Previous methods simulate unseen domains by constructing new styles via interpolation or by swapping styles from existing data, limiting them to the source domains available during training. In contrast, SRM-IL samples the style statistics from the possible value range of a CXR image instead of from the training data to achieve more diversified augmentations. Moreover, we utilize pixel-wise learnable parameters in the SRM-FL, rather than pre-defined channel-wise means and standard deviations, as style embeddings for capturing more representative style features. Additionally, we leverage consistency regularization on global semantic features and predictive distributions from style-perturbed and unperturbed versions of the same CXR to tune the model's sensitivity toward content markers for accurate predictions. Our proposed method, trained on the CheXpert and MIMIC-CXR datasets, achieves AUCs (%) of 77.32±0.35, 88.38±0.19, and 82.63±0.13 on the unseen-domain test datasets BRAX, VinDr-CXR, and NIH Chest X-ray14, respectively, compared to 75.56±0.80, 87.57±0.46, and 82.07±0.19 from state-of-the-art models on five-fold cross-validation, with statistically significant results in thoracic disease classification.
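
The image-level style randomization described above can be pictured in a few lines: sample new global mean/std "style" statistics from the admissible pixel value range and re-standardize the image, leaving the normalized content untouched. The function below is an illustrative stand-in, not the authors' implementation:

```python
import random

def image_style_randomize(pixels, value_range=(0.0, 1.0), rng=None):
    """Image-level style randomization (sketch): replace an image's global
    mean/std 'style' with statistics sampled from the pixel value range,
    keeping the normalized content z = (x - mu) / sigma intact."""
    rng = rng or random.Random()
    mu = sum(pixels) / len(pixels)
    sigma = (sum((p - mu) ** 2 for p in pixels) / len(pixels)) ** 0.5 or 1.0
    lo, hi = value_range
    new_mu = rng.uniform(lo, hi)              # style stats drawn from the
    new_sigma = rng.uniform(1e-3, hi - lo)    # admissible value range
    return [new_sigma * (p - mu) / sigma + new_mu for p in pixels]

perturbed = image_style_randomize([0.1, 0.4, 0.7, 0.9, 0.2],
                                  rng=random.Random(0))
```

Because only the global statistics change, any content cue expressed in z-scores (relative intensity structure) is preserved exactly.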

ThoraX-PriorNet: A novel attention-based architecture using anatomical prior probability maps for thoracic disease classification

Md Iqbal Hossain, Mohammad Zunaed, Md Kawsar Ahmed, SM Jawwad Hossain, Anwarul Hasan, Taufiq Hasan

IEEE Access, vol. 12

Computer-aided disease diagnosis and prognosis based on medical images is a rapidly emerging field. Many Convolutional Neural Network (CNN) architectures have been developed by researchers for disease classification and localization from chest X-ray images. It is known that different thoracic disease lesions are more likely to occur in specific anatomical regions compared to others. This article aims to incorporate this disease- and region-dependent prior probability distribution within a deep learning framework. We present ThoraX-PriorNet, a novel attention-based CNN model for thoracic disease classification. We first estimate a disease-dependent spatial probability, i.e., an anatomical prior, that indicates the probability of occurrence of a disease in a specific region of a chest X-ray image. Next, we develop a novel attention-based classification model that combines information from the estimated anatomical prior and automatically extracted chest region of interest (ROI) masks to provide attention to the feature maps generated from a deep convolutional network. Unlike previous works that utilize various self-attention mechanisms, the proposed method leverages the extracted chest ROI masks along with the probabilistic anatomical prior information, which selects the region of interest for different diseases to provide attention. The proposed method shows superior performance in disease classification on the NIH ChestX-ray14 dataset compared to existing state-of-the-art methods, reaching an area under the ROC curve (AUC) of 84.67%. Regarding disease localization, the anatomy prior attention method shows competitive performance compared to state-of-the-art methods, achieving accuracies of 0.80, 0.63, 0.49, 0.33, 0.28, 0.21, and 0.04 at Intersection over Union (IoU) thresholds of 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, and 0.7, respectively.
The proposed ThoraX-PriorNet can be generalized to different medical image classification and localization tasks where the probability of occurrence of the lesion is dependent on specific anatomical sites.
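
The core attention idea, scaling spatial features by a disease-specific anatomical prior restricted to the chest ROI, can be sketched as a simple element-wise weighting. This is an illustrative toy, not the exact ThoraX-PriorNet modules:

```python
def prior_attention(feature_map, prior_map, roi_mask):
    """Sketch of prior-guided attention: scale each spatial feature by the
    disease-specific anatomical prior probability, restricted to the chest
    ROI mask (1 inside the chest, 0 outside). Names and values here are
    illustrative only."""
    h, w = len(feature_map), len(feature_map[0])
    return [[feature_map[i][j] * prior_map[i][j] * roi_mask[i][j]
             for j in range(w)] for i in range(h)]

features = [[1.0, 2.0], [3.0, 4.0]]
prior    = [[0.9, 0.1], [0.5, 0.5]]   # e.g., a disease favouring one region
mask     = [[1, 1], [1, 0]]           # bottom-right pixel is outside the chest
attended = prior_attention(features, prior, mask)  # [[0.9, 0.2], [1.5, 0.0]]
```

Features outside the chest are zeroed by the mask, while in-chest features are re-weighted toward anatomically likely lesion sites.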

2022

SpectNet: End-to-end audio signal classification using learnable spectrograms

Md Istiaq Ansari, Taufiq Hasan

arXiv preprint arXiv:2211.09352

Pattern recognition from audio signals is an active research topic encompassing audio tagging, acoustic scene classification, music classification, and other areas. Spectrograms and mel-frequency cepstral coefficients (MFCC) are among the most commonly used features for audio signal analysis and classification. Recently, deep convolutional neural networks (CNN) have been successfully used for audio classification problems using spectrogram-based 2D features. In this paper, we present SpectNet, an integrated front-end layer that extracts spectrogram features within a CNN architecture for audio pattern recognition tasks. The front-end layer utilizes learnable gammatone filters that are initialized using mel-scale filters. The proposed layer outputs a 2D spectrogram image that can be fed into a 2D CNN for classification. The parameters of the entire network, including the front-end filterbank, can be updated via back-propagation. This training scheme allows fine-tuning of the spectrogram-image features according to the target audio dataset. The proposed method is evaluated on two different audio signal classification tasks: heart sound anomaly detection and acoustic scene classification. It shows a significant 1.02% improvement in Macc for the heart sound classification task and a 2.11% improvement in accuracy for the acoustic scene classification task compared to classical spectrogram image features. The source code of our experiments can be found at https://github.com/mHealthBuet/SpectNet
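
For intuition, a gammatone FIR kernel of the kind such a learnable front-end could be initialized with can be generated directly from the gammatone impulse-response formula. The sketch below is an illustration under generic assumptions (filter order, tap count, and the ERB bandwidth rule are common textbook choices, not necessarily the paper's):

```python
import math

def gammatone_fir(fc, fs, order=4, num_taps=81):
    """FIR approximation of a gammatone filter centred at fc (Hz); such
    coefficients could seed a learnable front-end conv layer. Sketch of the
    idea, not the paper's exact initialization."""
    # Equivalent rectangular bandwidth (Glasberg-Moore rule) sets the decay.
    erb = 24.7 * (4.37 * fc / 1000.0 + 1.0)
    b = 1.019 * erb
    taps = []
    for n in range(num_taps):
        t = n / fs
        taps.append((t ** (order - 1)) * math.exp(-2 * math.pi * b * t)
                    * math.cos(2 * math.pi * fc * t))
    peak = max(abs(x) for x in taps) or 1.0
    return [x / peak for x in taps]  # peak-normalize the kernel

kernel = gammatone_fir(fc=1000.0, fs=8000.0)

def fir_filter(signal, taps):
    """Plain causal FIR convolution, i.e. what a 1-D conv layer computes."""
    return [sum(taps[k] * signal[n - k] for k in range(len(taps)) if n - k >= 0)
            for n in range(len(signal))]
```

In a learnable front-end, `taps` would become trainable parameters updated by back-propagation along with the rest of the network.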

Cardiac anomaly detection considering an additive noise and convolutional distortion model of heart sound recordings

Farhat Binte Azam, Md Istiaq Ansari, Shoyad Ibn Sabur Khan Nuhash, Ian McLane, Taufiq Hasan

Artificial Intelligence in Medicine

Cardiac auscultation is an essential point-of-care method used for the early diagnosis of heart diseases. Automatic analysis of heart sounds for abnormality detection is faced with the challenges of additive noise and sensor-dependent degradation. This paper aims to develop methods to address the cardiac abnormality detection problem when both of these components are present in the cardiac auscultation sound. We first mathematically analyze the effect of additive noise and convolutional distortion on short-term mel-filterbank energy-based features and a Convolutional Neural Network (CNN) layer. Based on the analysis, we propose a combination of linear and logarithmic spectrogram-image features. These 2D features are provided as input to a residual CNN network (ResNet) for heart sound abnormality detection. Experimental validation is performed first on an open-access, multiclass heart sound dataset where we analyze the effect of additive noise by mixing lung sound noise with the recordings. In noisy conditions, the proposed method outperforms one of the best-performing methods in the literature, achieving a Macc (mean of sensitivity and specificity) of 89.55% and an average F-1 score of 82.96% when averaged over all noise levels. Next, we perform heart sound abnormality detection (binary classification) experiments on the 2016 PhysioNet/CinC Challenge dataset, which involves noisy recordings obtained from multiple stethoscope sensors. The proposed method achieves significantly improved results compared to conventional approaches on this dataset in the presence of both additive noise and channel distortion, with an area under the ROC (receiver operating characteristic) curve (AUC) of 91.36%, an F-1 score of 84.09%, and a Macc of 85.08%. We also show that the proposed method achieves the best mean accuracy across different source domains, including stethoscope and noise variability, demonstrating its effectiveness in different recording conditions.
The proposed combination of linear and logarithmic features along with the ResNet classifier effectively minimizes the impact of background noise and sensor variability for classifying phonocardiogram (PCG) signals. The method thus paves the way toward developing computer-aided cardiac auscultation systems in noisy environments using low-cost stethoscopes.
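
The motivation for combining linear and logarithmic spectral features can be seen in a toy calculation: a convolutional (sensor/channel) distortion multiplies filterbank energies, so it collapses to a constant additive offset in the log domain, whereas additive noise remains a simple additive term only in the linear domain. The numbers below are illustrative only:

```python
import math

# Toy mel-filterbank energies for one frame of a clean recording
# (illustrative numbers, not real data).
clean = [4.0, 9.0, 1.0]
gain = [0.5, 0.5, 0.5]    # convolutional (sensor/channel) distortion per band
noise = [0.2, 0.2, 0.2]   # additive background noise per band

# Channel distortion is multiplicative in the linear domain...
log_clean = [math.log(e) for e in clean]
log_channel = [math.log(g * e) for g, e in zip(gain, clean)]
# ...so in the log domain it reduces to a constant, removable offset:
offsets = [a - b for a, b in zip(log_channel, log_clean)]  # all == log(0.5)

# Additive noise, by contrast, stays a plain additive term only in the
# *linear* domain, which is why pairing linear with log features helps.
linear_noisy = [e + n for e, n in zip(clean, noise)]
```

Neither representation handles both degradations alone, which is the intuition behind feeding the classifier the two feature types together.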

Bon: An extended public domain dataset for human activity recognition

Girmaw Abebe Tadesse, Oliver Bent, Komminist Weldemariam, Md Abrar Istiak, Taufiq Hasan, Andrea Cavallaro

arXiv preprint arXiv:2209.05077

A body-worn first-person vision (FPV) camera enables the extraction of a rich source of information on the environment from the subject's viewpoint. However, research progress in wearable-camera-based egocentric office activity understanding is slow compared to other activity environments (e.g., kitchen and outdoor ambulatory), mainly due to the lack of adequate datasets for training more sophisticated (e.g., deep learning) models for human activity recognition in office environments. This paper provides details of a large and publicly available office activity dataset (BON) collected in different office settings across three geographical locations: Barcelona (Spain), Oxford (UK), and Nairobi (Kenya), using a chest-mounted GoPro Hero camera. The BON dataset contains eighteen common office activities that can be categorised into person-to-person interactions (e.g., chatting with colleagues), person-to-object interactions (e.g., writing on a whiteboard), and proprioceptive activities (e.g., walking). Annotation is provided for each 5-second video segment. In total, BON contains 25 subjects and 2639 segments. To facilitate further research in this sub-domain, we also provide results that can be used as baselines for future studies.

Anatomy-xnet: An anatomy aware convolutional neural network for thoracic disease classification in chest x-rays

Uday Kamal, Mohammad Zunaed, Nusrat Binta Nizam, Taufiq Hasan

IEEE Journal of Biomedical and Health Informatics

Thoracic disease detection from chest radiographs using deep learning methods has been an active area of research in the last decade. Most previous methods attempt to focus on the diseased organs of the image by identifying spatial regions responsible for significant contributions to the model's prediction. In contrast, expert radiologists first locate the prominent anatomical structures before determining whether those regions are anomalous. Therefore, integrating anatomical knowledge within deep learning models could bring substantial improvements in automatic disease classification. Motivated by this, we propose Anatomy-XNet, an anatomy-aware attention-based thoracic disease classification network that prioritizes spatial features guided by pre-identified anatomy regions. We adopt a semi-supervised learning method that utilizes available small-scale organ-level annotations to locate the anatomy regions in large-scale datasets where organ-level annotations are absent. The proposed Anatomy-XNet uses a pre-trained DenseNet-121 as the backbone network with two corresponding structured modules, Anatomy-Aware Attention (A^3) and Probabilistic Weighted Average Pooling (PWAP), in a cohesive framework for anatomical attention learning. We experimentally show that our proposed method sets a new state-of-the-art benchmark by achieving AUC scores of 85.78%, 92.07%, and 84.04% on three publicly available large-scale CXR datasets: NIH, Stanford CheXpert, and MIMIC-CXR, respectively. This not only proves the efficacy of utilizing anatomy segmentation knowledge to improve thoracic disease classification but also demonstrates the generalizability of the proposed framework.

Monkeypox skin lesion detection using deep learning models: A feasibility study

Shams Nafisa Ali, Md Tazuddin Ahmed, Joydip Paul, Tasnim Jahan, SM Sani, Nawsabah Noor, Taufiq Hasan

arXiv preprint arXiv:2207.03342

The recent monkeypox outbreak has become a public health concern due to its rapid spread in more than 40 countries outside Africa. Clinical diagnosis of monkeypox in an early stage is challenging due to its similarity with chickenpox and measles. In cases where the confirmatory Polymerase Chain Reaction (PCR) tests are not readily available, computer-assisted detection of monkeypox lesions could be beneficial for surveillance and rapid identification of suspected cases. Deep learning methods have been found effective in the automated detection of skin lesions, provided that sufficient training examples are available. However, as of now, such datasets are not available for the monkeypox disease. In the current study, we first develop the "Monkeypox Skin Lesion Dataset (MSLD)", consisting of skin lesion images of monkeypox, chickenpox, and measles. The images are mainly collected from websites, news portals, and publicly accessible case reports. Data augmentation is used to increase the sample size, and a 3-fold cross-validation experiment is set up. In the next step, several pre-trained deep learning models, namely VGG-16, ResNet50, and InceptionV3, are employed to classify monkeypox and other diseases. An ensemble of the three models is also developed. ResNet50 achieves the best overall accuracy of 82.96 (±4.57)%, while VGG16 and the ensemble system achieve accuracies of 81.48 (±6.87)% and 79.26 (±1.05)%, respectively. A prototype web application is also developed as an online monkeypox screening tool. While the initial results on this limited dataset are promising, a larger demographically diverse dataset is required to further enhance the generalizability of these models.

Nutritional Status and Severity Correlation of COPD Patients Admitted in Tertiary Care Hospital

Nawsabah Noor, Taufiq Hasan, Meemnur Rashid, Kaisar Ahmed Alman, Homayra Tahseen Hossain, AKM Humayon Kabir

Bangladesh Journal of Medicine

Background: Malnourishment is highly prevalent among COPD patients. This study was carried out to assess the nutritional status of hospital-admitted COPD patients and to evaluate the relationships between nutritional indices and pulmonary function parameters, with severity correlation. Methods: A cross-sectional observational study was conducted on 50 spirometry-proven COPD patients admitted to Dhaka Medical College Hospital. Lung function was measured by routine spirometry. Anthropometric measures, biochemical parameters, and the Mini Nutritional Assessment (MNA) score were used for nutritional assessment. Results: The mean age of the study population was 64.31 years. 22% (n = 11), 42% (n = 21), 32% (n = 16), and 4% (n = 2) of the patients were in stages I, II, III, and IV of COPD, respectively. According to the MNA scale, 46% (n = 23) of the study population were malnourished, 40% (n = 20) were at risk of malnutrition, and 14% (n = 7) had normal nutritional status. Thirteen patients were found malnourished according to the BMI scale: 15.38% (n = 2) in stage I COPD, 38.46% (n = 5) in stage II, 38.46% (n = 5) in stage III, and 7.69% (n = 1) in stage IV. Mid-arm circumference (MAC), mid-calf circumference (MCC), MNA scale score, and BMI score showed a significant decline in mean value with increasing severity of COPD. Correlations were observed between BMI and FEV1 (R² = 0.087, p = 0.038), body weight and FEV1 (R² = 0.173, p = 0.003), MUAC and FEV1 (R² = 0.202, p = 0.001), and the MNA scale and FEV1 (R² = 0.144, p = 0.007). All correlations were statistically significant. Conclusion: The high prevalence of malnutrition among hospitalized COPD patients is related to their lung function. Weight, mean MNA, and BMI scores decrease with increasing severity of COPD.

A fabric-based inexpensive wearable neckband for accurate and reliable dietary activity monitoring

Md Tauhiduzzaman Khan, Shabnam Ghaffarzadegan, Zhe Feng, Taufiq Hasan

2022 25th International Conference on Computer and Information Technology (ICCIT)

Dietary habits play a significant role in public health and well-being. Monitoring dietary activities is thus essential for maintaining a healthy lifestyle and preventing many widespread diseases, such as diabetes, obesity, and hypertension. In this work, we present a low-cost wearable neckband for automatic dietary activity monitoring. The $5 fabric-based device, comprising an electret microphone, a Bluetooth radio module, and a rechargeable lithium-ion battery, can wirelessly transmit audio to a smart device in real time. The classification algorithm processes the audio stream in 3-second segments and extracts short-time spectral, waveform, and energy-based acoustic features. We compute various statistical functionals of the acoustic features to obtain segmental feature vectors, which are subsequently used for machine learning. We perform an experimental evaluation using an in-house dataset collected with the neckband and compare the performance of different classifiers in distinguishing between drinking, chewing solid foods, and other non-dietary activities. An averaged class-wise F-measure of 81.25% is achieved using the proposed wearable device and a Random Forest (RF) classifier.
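
The segmental-feature step, collapsing frame-level feature tracks into one fixed-length vector via statistical functionals, can be sketched as follows (the functional set here is a generic example, not the paper's exact list):

```python
import math

def segment_functionals(frame_features):
    """Collapse a segment's frame-level feature tracks into one fixed-length
    vector via statistical functionals (mean, std, min, max). A sketch of
    the segmental-feature idea, not the paper's exact feature set."""
    vector = []
    for track in frame_features:            # one track per acoustic feature
        n = len(track)
        mean = sum(track) / n
        std = math.sqrt(sum((x - mean) ** 2 for x in track) / n)
        vector.extend([mean, std, min(track), max(track)])
    return vector

# Two toy frame-level tracks (e.g., short-time energy and zero-crossing rate).
tracks = [[0.1, 0.3, 0.2], [5.0, 7.0, 6.0]]
features = segment_functionals(tracks)  # 2 tracks x 4 functionals = 8 values
```

The resulting fixed-length vector is what a classifier such as a Random Forest would consume, regardless of how many frames the segment contained.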

2021

OxyJet: Design and Evaluation of A Low-Cost Precision Venturi Based Continuous Positive Airway Pressure (CPAP) System

Md Kawsar Ahmed, Meemnur Rashid, Kaisar Ahmed Alman, Farhan Muhib, Saeedur Rahman, Taufiq Hasan

arXiv preprint arXiv:2106.00981

The COVID-19 pandemic has strained hospital systems in many countries around the world, especially in developing countries. In many low-resource hospitals, severely ill hypoxemic COVID-19 patients are treated with various forms of low-flow oxygen therapy (0-15 L/min), using interfaces such as a nasal cannula, Hudson mask, venturi mask, and non-rebreather mask. When 15 L/min of pure oxygen flow is not sufficient for the patient, treatment guidelines suggest non-invasive positive pressure ventilation (NIPPV) or high-flow nasal oxygenation (HFNO) as the next stage of treatment. However, administering HFNO in the general wards of a low-resource hospital is difficult due to several factors, including difficulty of operation, unavailability of electrical power outlets, and frequent maintenance. Therefore, in many cases, the highest level of care a patient receives in the general ward is 15 L/min of oxygen through a non-rebreather mask. With a shortage of Intensive Care Unit (ICU) beds, this is a major problem, since intermediate forms of treatment are simply not available at an affordable cost. To address this gap, we have developed a low-cost CPAP system specifically designed for low-resource hospitals. The device is a precision venturi-based flow generator capable of providing up to 60 L/min of flow. It utilizes the mechanics of a jet pump driven by high-pressure oxygen, increasing the volumetric flow rate by entraining atmospheric air. A fraction of inspired oxygen (FiO2) between 40% and 100% can be attained using a dual flowmeter. Consisting of a traditional 22 mm breathing circuit, a non-vented CPAP mask, and a Positive End-Expiratory Pressure (PEEP) valve, the CPAP system can provide positive pressures between 5 and 20 cm H2O. The device is manufactured using local 3D printing and workshop facilities.
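
The quoted FiO2 range follows from the standard gas-blending relation between pure oxygen and entrained room air (21% O2). The formula below is the generic textbook relation, shown only to illustrate the dual-flowmeter idea; the delivered FiO2 of the actual device depends on its venturi entrainment behaviour:

```python
def fio2(oxygen_lpm, air_lpm):
    """Fraction of inspired oxygen for a blend of pure O2 (FiO2 = 1.0) with
    room air (FiO2 = 0.21), both flows in L/min. Standard gas-blending
    relation, illustrative of the dual-flowmeter FiO2 range above."""
    total = oxygen_lpm + air_lpm
    return (1.0 * oxygen_lpm + 0.21 * air_lpm) / total

# Pure oxygen gives 100%; heavy air entrainment drives FiO2 toward 21%.
assert fio2(15, 0) == 1.0
mix = fio2(10, 30)  # about 0.41 FiO2 for a 1:3 oxygen-to-air blend
```
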

2020

A Wavelet-CNN feature fusion approach for detecting COVID-19 from chest radiographs

Md Latifur Rahman, Nusrat Binta Nizam, Prasun Datta, Md Moynul Hasan, Taufiq Hasan, Mohammed Imamul Hassan Bhuiyan

2020 11th International Conference on Electrical and Computer Engineering (ICECE)

Despite combined efforts, the COVID-19 pandemic continues to have a devastating effect on the healthcare system and the well-being of the world population. With a lack of RT-PCR testing facilities, one of the screening approaches has been the use of chest radiography. In this paper, we propose an automatic chest X-ray image classification model that utilizes pre-trained CNN architectures (DenseNet121, MobileNetV2) as feature extractors, together with wavelet transformation of images pre-processed using the CLAHE algorithm and Sobel edge detection. Our model can detect COVID-19 from X-ray images with high accuracy, sensitivity, specificity, and precision. A result analysis of the different architectures and a comparison of pre-processing techniques (histogram equalization and edge detection) are thoroughly examined. In this experiment, the Support Vector Machine (SVM) classifier fitted most accurately (accuracy 97.73%, sensitivity 97.84%, F1-score 97.73%, specificity 97.73%, and precision 98.79%) with the wavelet and MobileNetV2 feature sets to identify COVID-19. Memory consumption is also examined to make the model more feasible for telemedicine and mobile healthcare applications.

Respiratory distress detection from telephone speech using acoustic and prosodic features

Meemnur Rashid, Kaisar Ahmed Alman, Khaled Hasan, John HL Hansen, Taufiq Hasan

arXiv preprint arXiv:2011.09270

With the widespread use of telemedicine services, automatic assessment of health conditions via telephone speech can significantly impact public health. This work summarizes our preliminary findings on the automatic detection of respiratory distress using well-known acoustic and prosodic features. Speech samples were collected from de-identified telemedicine phone calls to a healthcare provider in Bangladesh. The recordings include conversational speech of patients with mild or severe respiratory distress or asthma symptoms talking to doctors. We hypothesize that respiratory distress may alter speech characteristics such as voice quality, speaking pattern, loudness, and speech-pause duration. To capture these variations, we utilize a set of well-known acoustic and prosodic features with a Support Vector Machine (SVM) classifier for detecting the presence of respiratory distress. Experimental evaluations are performed using a 3-fold cross-validation scheme with patient-independent data splits. We obtained an overall accuracy of 86.4% in detecting respiratory distress from the speech recordings using the acoustic feature set. Correlation analysis reveals that the top-performing features include loudness, voice rate, voice duration, and pause duration.
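
Of the top-performing cues, pause duration is the easiest to picture. Below is a deliberately crude sketch of a pause-ratio feature computed from frame energies (illustrative only; the study uses standard acoustic/prosodic feature extractors):

```python
def pause_ratio(frame_energies, threshold=0.01):
    """Fraction of frames whose energy falls below a silence threshold: a
    crude stand-in for the speech-pause-duration cue (illustrative only,
    not the paper's feature extractor)."""
    silent = sum(1 for e in frame_energies if e < threshold)
    return silent / len(frame_energies)

# Toy frame energies: speech bursts separated by pauses.
energies = [0.5, 0.4, 0.0, 0.0, 0.6, 0.0, 0.0, 0.0, 0.3, 0.2]
ratio = pause_ratio(energies)  # 5 of 10 frames silent -> 0.5
```

A higher pause ratio on a call would, under the paper's hypothesis, correlate with breathing difficulty during speech.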

Segcodenet: Color-coded segmentation masks for activity detection from wearable cameras

Asif Shahriyar Sushmit, Partho Ghosh, Md Abrar Istiak, Nayeeb Rashid, Ahsan Habib Akash, Taufiq Hasan

arXiv preprint arXiv:2008.08452

Activity detection from first-person videos (FPV) captured using a wearable camera is an active research field with potential applications in many sectors, including healthcare, law enforcement, and rehabilitation. State-of-the-art methods use optical flow-based hybrid techniques that rely on features derived from the motion of objects across consecutive frames. In this work, we developed a two-stream network, SegCodeNet, that uses a network branch containing video streams with color-coded semantic segmentation masks of relevant objects in addition to the original RGB video stream. We also include stream-wise attention gating that prioritizes between the two streams and a frame-wise attention module that prioritizes the video frames containing relevant features. Experiments are conducted on an FPV dataset containing activity classes in office environments. Compared to a single-stream network, the proposed two-stream method achieves absolute improvements in averaged F1 score and accuracy when results are averaged over three different frame sizes. The performance gains are most significant for lower-resolution inputs. The best-performing frame size yields F1 score and accuracy results that outperform the state-of-the-art Inflated 3D ConvNet (I3D) method [Carreira and Zisserman, 2017] by an absolute margin in both metrics.

A Low-cost, Low-energy Wearable ECG System with Cloud-Based Arrhythmia Detection

Nurul Huda, Sadia Khan, Ragib Abid, Samiul Based Shuvo, Mir Maheen Labib, Taufiq Hasan

2020 IEEE Region 10 Symposium (TENSYMP)

Continuously monitoring the Electrocardiogram (ECG) is an essential tool for Cardiovascular Disease (CVD) patients. In low-resource countries, hospitals and health centers do not have adequate ECG systems, and this unavailability exacerbates patients’ health conditions. A lack of skilled physicians, the limited availability of continuous ECG monitoring devices, and their high prices all contribute to a higher CVD burden in developing countries. To address these challenges, we present a low-cost, low-power, wireless ECG monitoring system with deep learning-based automatic arrhythmia detection. The flexible fabric-based design and wearable nature of the device enhance the patient’s comfort while facilitating continuous monitoring. An AD8232 chip is used for the ECG Analog Front-End (AFE), with two 450 mAh Li-ion batteries powering the device. The acquired ECG signal can be transmitted to a smart device over Bluetooth and subsequently sent to a cloud server for analysis. A 1-D Convolutional Neural Network (CNN) based deep learning model is developed that achieves an accuracy of 94.03% in classifying abnormal cardiac rhythms on the MIT-BIH Arrhythmia Database.

Towards domain invariant heart sound abnormality detection using learnable filterbanks

Ahmed Imtiaz Humayun, Shabnam Ghaffarzadegan, Md Istiaq Ansari, Zhe Feng, Taufiq Hasan

IEEE Journal of Biomedical and Health Informatics

Objective: Cardiac auscultation is the most practiced non-invasive and cost-effective procedure for the early diagnosis of heart diseases. While machine learning based systems can aid in automatically screening patients, the robustness of these systems is affected by numerous factors, including the stethoscope/sensor, environment, and data collection protocol. This paper studies the adverse effect of domain variability on heart sound abnormality detection and develops strategies to address this problem. Methods: We propose a novel Convolutional Neural Network (CNN) layer, consisting of time-convolutional (tConv) units that emulate Finite Impulse Response (FIR) filters. The filter coefficients can be updated via backpropagation and stacked in the front-end of the network as a learnable filterbank. Results: On publicly available multi-domain datasets, the proposed method surpasses the top-scoring systems found in the literature for heart sound abnormality detection (a binary classification task). We utilize sensitivity, specificity, F-1 score, and Macc (average of sensitivity and specificity) as performance metrics. Our systems achieved relative improvements of up to 11.84% in terms of Macc compared to state-of-the-art methods. Conclusion: The results demonstrate the effectiveness of the proposed learnable filterbank CNN architecture in achieving robustness toward sensor/domain variability in PCG signals. Significance: The proposed methods pave the way for deploying automated cardiac screening systems in diversified and underserved communities.
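
A tConv unit is, in essence, a 1-D convolution whose kernel is treated as a learnable FIR filter. The sketch below illustrates the idea with a hand-rolled gradient step on the filter coefficients under an MSE loss (an illustration of the concept; the paper trains the filterbank end-to-end inside a CNN via backpropagation):

```python
def tconv(signal, taps):
    """Time-convolution (tConv) unit: a 1-D conv layer whose kernel acts as
    an FIR filter over the input signal (illustrative sketch)."""
    return [sum(taps[k] * signal[n - k] for k in range(len(taps)) if n - k >= 0)
            for n in range(len(signal))]

def tconv_grad_step(signal, taps, target, lr=0.01):
    """One gradient-descent step on the FIR coefficients for an MSE loss,
    mimicking how backpropagation would update a learnable filterbank."""
    out = tconv(signal, taps)
    grads = []
    for k in range(len(taps)):
        g = sum(2.0 * (out[n] - target[n]) * signal[n - k]
                for n in range(len(signal)) if n - k >= 0)
        grads.append(g)
    return [w - lr * g for w, g in zip(taps, grads)]

# Fit a 2-tap filter toward the identity response on a toy signal.
x = [1.0, 0.0, 2.0, -1.0, 0.5]
w = [0.0, 0.0]
for _ in range(200):
    w = tconv_grad_step(x, w, target=x)
# w converges toward [1.0, 0.0], the FIR filter that passes x through.
```

In the actual architecture the "target" is implicit: the filter taps receive gradients from the downstream classification loss rather than from a signal-matching objective.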

A lightweight CNN model for detecting respiratory diseases from lung auscultation sounds using EMD-CWT-based hybrid scalogram

Samiul Based Shuvo, Shams Nafisa Ali, Soham Irtiza Swapnil, Taufiq Hasan, Mohammed Imamul Hassan Bhuiyan

IEEE Journal of Biomedical and Health Informatics

Listening to lung sounds through auscultation is vital in examining the respiratory system for abnormalities. Automated analysis of lung auscultation sounds can be beneficial to health systems in low-resource settings where there is a lack of skilled physicians. In this work, we propose a lightweight convolutional neural network (CNN) architecture to classify respiratory diseases from individual breath cycles using hybrid scalogram-based features of lung sounds. The proposed feature set utilizes empirical mode decomposition (EMD) and the continuous wavelet transform (CWT). The performance of the proposed scheme is studied using a patient-independent train-validation-test split of the publicly available ICBHI 2017 lung sound dataset. Employing the proposed framework, weighted accuracy scores of 98.92% for three-class chronic classification and 98.70% for six-class pathological classification are achieved, outperforming the well-known and much larger VGG16 in terms of accuracy by absolute margins of 1.10% and 1.11%, respectively. The proposed CNN model also outperforms other contemporary lightweight models while being computationally comparable.

2019

X-Ray Image Compression Using Convolutional Recurrent Neural Networks

Asif Shahriyar Sushmit, Shakib Uz Zaman, Ahmed Imtiaz Humayun, Taufiq Hasan, Mohammed Imamul Hassan Bhuiyan

2019 IEEE EMBS International Conference on Biomedical & Health Informatics (BHI)

With the advent of the digital health revolution, vast amounts of clinical data are being generated, stored, and processed on a daily basis. This has made the storage and retrieval of large volumes of healthcare data, especially high-resolution medical images, particularly challenging. Effective compression of medical images thus plays a vital role in today's healthcare information systems, particularly in teleradiology. In this work, an X-ray image compression method based on Convolutional Recurrent Neural Networks (RNN-Conv) is presented. The proposed architecture can provide variable compression rates during deployment while requiring each network to be trained only once for a specific dimension of X-ray images. The model uses a multi-level pooling scheme that learns contextualized features for effective compression. We perform our image compression experiments on the National Institutes of Health (NIH) ChestX-ray8 dataset and compare the performance of the proposed architecture with a state-of-the-art RNN-based technique and JPEG 2000. The experimental results show improved compression performance achieved by the proposed method in terms of the Structural Similarity Index (SSIM) and Peak Signal-to-Noise Ratio (PSNR) metrics. To the best of our knowledge, this is the first reported evaluation of a deep convolutional RNN for medical image compression.
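
Of the two reported metrics, PSNR has a one-line definition worth showing (SSIM is more involved and omitted here). An illustrative implementation for 8-bit images, with pixels flattened into lists:

```python
import math

def psnr(original, reconstructed, max_value=255.0):
    """Peak Signal-to-Noise Ratio between an image and its decompressed
    reconstruction (pixels as flat lists; 8-bit range by default)."""
    mse = sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return float("inf")   # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

original = [10, 200, 30, 128]
lossy    = [12, 198, 31, 128]
# Each pixel is off by at most 2, so PSNR is high (about 44.6 dB here).
quality = psnr(original, lossy)
```

Higher PSNR means the reconstruction is closer to the original at the pixel level, which is why it is paired with SSIM, a metric that better tracks perceived structural fidelity.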

End-to-end sleep staging with raw single channel EEG using deep residual convnets

Ahmed Imtiaz Humayun, Asif Shahriyar Sushmit, Taufiq Hasan, Mohammed Imamul Hassan Bhuiyan

2019 IEEE EMBS International Conference on Biomedical & Health Informatics (BHI)

Humans spend approximately a third of their lives sleeping, which makes sleep monitoring an integral part of well-being. In this paper, a 34-layer deep residual ConvNet architecture for end-to-end sleep staging is proposed. The network takes a raw single-channel electroencephalogram (Fpz-Cz) signal as input and yields hypnogram annotations for each 30 s segment as output. Experiments are carried out for two different scoring standards (5- and 6-stage classification) on the expanded PhysioNet Sleep-EDF dataset, which contains multi-source data from hospital and household polysomnography setups. The performance of the proposed network is compared with that of state-of-the-art algorithms in patient-independent validation tasks. The experimental results demonstrate the superiority of the proposed network over the best existing method, providing relative improvements in epoch-wise average accuracy of 6.8% and 6.3% on the household data and multi-source data, respectively. Code is publicly available on GitHub.
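A minimal sketch of the epoch segmentation this pipeline implies: slicing a raw single-channel recording into the non-overlapping 30 s windows that the network scores one at a time. The 100 Hz sampling rate matches the Sleep-EDF Fpz-Cz channel; the synthetic recording is illustrative.

```python
import numpy as np

def segment_epochs(eeg, fs=100, epoch_sec=30):
    """Split a 1-D EEG recording into non-overlapping 30 s epochs.

    Trailing samples that do not fill a whole epoch are dropped. Returns an
    array of shape (n_epochs, fs * epoch_sec), one row per scorable epoch.
    """
    step = fs * epoch_sec
    n_epochs = len(eeg) // step
    return eeg[: n_epochs * step].reshape(n_epochs, step)

# Ten minutes of synthetic single-channel "EEG" at 100 Hz.
recording = np.random.default_rng(1).standard_normal(600 * 100)
epochs = segment_epochs(recording)
print(epochs.shape)   # (20, 3000)
```

End-to-end staging then maps each 3000-sample row directly to a sleep-stage label, with no hand-crafted spectral features in between.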

2018

An ensemble of transfer, semi-supervised and supervised learning methods for pathological heart sound classification

Ahmed Imtiaz Humayun, Md Tauhiduzzaman Khan, Shabnam Ghaffarzadegan, Zhe Feng, Taufiq Hasan

Interspeech 2018

In this work, we propose an ensemble of classifiers to distinguish between various degrees of heart abnormality using phonocardiogram (PCG) signals acquired with digital stethoscopes in a clinical setting, for the INTERSPEECH 2018 Computational Paralinguistics (ComParE) Heart Beats Sub-Challenge. Our primary classification framework is a convolutional neural network with 1-D time-convolution (tConv) layers, which uses features transferred from a model trained on the 2016 PhysioNet Heart Sound Database. We also employ a representation learning (RL) approach that generates features in an unsupervised manner using deep recurrent autoencoders, paired with Support Vector Machine (SVM) and Linear Discriminant Analysis (LDA) classifiers. Finally, we utilize an SVM classifier on high-dimensional segment-level features extracted by applying various functionals to short-term acoustic features, i.e., Low-Level Descriptors (LLDs). An ensemble of the three approaches provides a relative improvement of 11.13% over our best single sub-system in terms of the Unweighted Average Recall (UAR) performance metric on the evaluation dataset.
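A short sketch of the UAR metric the challenge uses: the mean of per-class recalls, so minority classes count as much as majority ones. The toy labels below are illustrative, not challenge data.

```python
import numpy as np

def uar(y_true, y_pred):
    """Unweighted Average Recall: mean of per-class recalls.

    Unlike plain accuracy, every class contributes equally regardless of
    its sample count, which matters for imbalanced challenge corpora.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    recalls = [np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true)]
    return float(np.mean(recalls))

# Toy 3-class example with a heavily imbalanced ground truth.
y_true = [0] * 8 + [1] * 1 + [2] * 1
y_pred = [0] * 8 + [0] + [2]          # class 1 is always missed
print(round(uar(y_true, y_pred), 3))  # 0.667: (1.0 + 0.0 + 1.0) / 3
```

Note that plain accuracy here would be 0.9, which is why UAR is the preferred headline number for skewed label distributions.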

Learning front-end filter-bank parameters using convolutional neural networks for abnormal heart sound detection

Ahmed Imtiaz Humayun, Shabnam Ghaffarzadegan, Zhe Feng, Taufiq Hasan

2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)

Automatic detection of heart sound abnormalities can play a vital role in the early diagnosis of heart disease, particularly in low-resource settings. State-of-the-art algorithms for this task use a set of Finite Impulse Response (FIR) band-pass filters as a front-end followed by a Convolutional Neural Network (CNN) model. In this work, we propose a novel CNN architecture that integrates the front-end band-pass filters into the network using time-convolution (tConv) layers, making the FIR filter-bank parameters learnable. Different initialization strategies for the learnable filters, including random parameters and a set of predefined FIR filter-bank coefficients, are examined. Using the proposed tConv layers, we add constraints to the learnable FIR filters that ensure linear- and zero-phase responses. Experimental evaluations are performed on a balanced 4-fold cross-validation task prepared from the PhysioNet/CinC 2016 dataset. Results demonstrate that the proposed models yield superior performance compared to the state-of-the-art system, with the linear-phase FIR filter-bank method providing an absolute improvement of 9.54% over the baseline in terms of overall accuracy.
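A NumPy-only sketch of the phase constraint this abstract describes, not the authors' implementation: forcing a learnable 1-D kernel to be symmetric yields a linear-phase FIR filter, whose frequency response is a real amplitude times a pure delay term.

```python
import numpy as np

def symmetrize(kernel):
    """Project a learnable FIR kernel onto the symmetric (linear-phase) set."""
    return 0.5 * (kernel + kernel[::-1])

rng = np.random.default_rng(2)
k = rng.standard_normal(65)            # odd-length stand-in "learned" tConv kernel
k_sym = symmetrize(k)

# A symmetric odd-length FIR has linear phase: H(w) equals a purely real
# amplitude times the group-delay term exp(-j*w*(N-1)/2).
N = len(k_sym)
w = np.linspace(0, np.pi, 256)
H = np.array([np.sum(k_sym * np.exp(-1j * wi * np.arange(N))) for wi in w])
amplitude = H * np.exp(1j * w * (N - 1) / 2)    # remove the linear delay

print(np.allclose(k_sym, k_sym[::-1]))            # symmetric kernel
print(np.allclose(amplitude.imag, 0, atol=1e-9))  # phase is linear
```

In training, such a projection (or a parameterization that shares weights between the two kernel halves) keeps every gradient update inside the linear-phase family, so the learned front-end cannot introduce phase distortion into the heart sound.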