Feature selection and classification for differential diagnosis of asthma and COPD
Date
2024
Publisher
İstanbul Bilgi Üniversitesi
Access Rights
info:eu-repo/semantics/openAccess
Abstract
Asthma and chronic obstructive pulmonary disease (COPD) are two chronic respiratory diseases that are difficult to distinguish clinically because their symptoms overlap and existing diagnostic methods are inadequate. The main objective of this study is therefore to develop a quantitative method for differential diagnosis that uses machine learning to classify acoustic features computed from pulmonary sounds, together with combinations of those features in vector space. A total of 58 features belonging to 22 feature types were computed from lung sounds collected from 50 volunteers (30 with asthma and 20 with COPD) who had been clinically diagnosed by physicians. These features were given as input to four different classifiers in order to find, based on individual performance, the most successful classifier for the dataset. Forward sequential feature selection, linear discriminant analysis, and principal component analysis were applied separately and in combination to find the optimal feature set for the best-performing classifier. Based on individual performance and computation time, the feature selection process was continued with the k-nearest neighbor classifier and support vector machines with a radial basis function kernel. With the k-nearest neighbor classifier, an F1 score of 97% was achieved by applying linear discriminant analysis to the mel-frequency cepstral coefficients feature family. With the radial basis function kernel support vector machines, an F1 score of 98% was achieved by applying linear discriminant analysis to the mel-frequency cepstral coefficients and autoregressive model coefficients feature families.
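The evaluation described in the abstract (linear discriminant analysis applied to one feature family, followed by a k-nearest neighbor or radial basis function kernel support vector machine classifier, scored with F1) can be sketched as follows. This is a minimal illustration using scikit-learn and is not the thesis code. The data are synthetic random numbers shaped like the study's dataset (50 subjects, 58 features). The 13-column slice standing in for one feature family, the neighbor count, and the 5-fold cross-validation are all assumptions, since the record does not state them.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for the dataset: 50 subjects x 58 features,
# labels 0 = asthma (30 subjects), 1 = COPD (20 subjects).
y = np.array([0] * 30 + [1] * 20)
X = rng.normal(size=(50, 58))

# Hypothetical feature family: assume columns 0..12 belong to one family.
family = slice(0, 13)
X[y == 1, :5] += 1.5  # inject a class difference so the toy problem is learnable
X_family = X[:, family]

def evaluate(classifier):
    """Mean cross-validated F1 for scaler -> LDA (1 component) -> classifier."""
    pipeline = make_pipeline(
        StandardScaler(),
        LinearDiscriminantAnalysis(n_components=1),  # two classes allow at most 1 component
        classifier,
    )
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    return cross_val_score(pipeline, X_family, y, cv=cv, scoring="f1").mean()

scores = {
    "knn": evaluate(KNeighborsClassifier(n_neighbors=5)),
    "svm_rbf": evaluate(SVC(kernel="rbf")),
}

for name, score in scores.items():
    print(f"{name}: mean F1 = {score:.2f}")
```

Fitting the scaler and the discriminant projection inside the pipeline means both are re-estimated on each training fold, so no test-fold information enters the projection. The scores printed here reflect only the synthetic data and say nothing about the results reported in the abstract.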
Description
Institute of Graduate Programs, Department of Electrical and Electronics Engineering
Keywords
Electrical and Electronics Engineering