Building Domain-Specific Lexicons: An Application to Financial News
Date
2019
Publisher
Institute of Electrical and Electronics Engineers Inc.
Access Rights
info:eu-repo/semantics/closedAccess
Abstract
Natural Language Processing (NLP) has gained attention in recent years. Previous research (such as WordNet and Cyc) has focused on developing all-purpose (generalised) polarised lexicons. However, these lexicons provide little information in specialised domains such as finance and medical sciences, and using them for text classification could reduce prediction accuracy. Therefore, there is a need to build domain- and context-specific lexicons. To this end, this work proposes a label-propagation-based word embedding algorithm for obtaining positive and negative lexicons. The algorithm combines principles from graph theory and word embedding. It is tested on the Dow Jones newswire text feed to classify financial news as hot or non-hot. Three classifiers, namely Logistic Regression, Random Forest and XGBoost, were applied using polarised lexicons, seed words and random words. The performance of the classifiers in all cases was evaluated using accuracy. Lexicons generated using the proposed approach were more effective in classifying financial news articles as hot or non-hot than seed words and random words. The proposed label propagation with word embedding algorithm generates context-specific lexicons, which help to represent text better in natural language processing tasks and avoid the problem of dimensionality. © 2019 IEEE.
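The abstract does not give the details of the algorithm, so the following is only a minimal sketch of the general idea of label propagation over a word-embedding similarity graph, not the authors' implementation. The toy two-dimensional vectors, the seed words, the neighbour count `k`, the iteration count and the threshold are all hypothetical values chosen for illustration; in practice the vectors would come from an embedding model trained on the news corpus.

```python
import math

# Toy 2-D "embeddings"; purely illustrative, not taken from the paper.
EMBEDDINGS = {
    "surge":  (0.90, 0.10),
    "rally":  (0.80, 0.20),
    "gain":   (0.85, 0.15),
    "plunge": (0.10, 0.90),
    "slump":  (0.20, 0.80),
    "loss":   (0.15, 0.85),
}


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_graph(embeddings, k=2):
    """Link each word to its k most similar words (a directed kNN graph)."""
    graph = {}
    for word, vec in embeddings.items():
        sims = [(cosine(vec, other_vec), other)
                for other, other_vec in embeddings.items() if other != word]
        sims.sort(reverse=True)
        graph[word] = [(other, sim) for sim, other in sims[:k]]
    return graph


def propagate(graph, seeds, iterations=20):
    """Spread seed polarities (+1 / -1) along similarity-weighted edges."""
    scores = {word: seeds.get(word, 0.0) for word in graph}
    for _ in range(iterations):
        updated = {}
        for word, neighbours in graph.items():
            if word in seeds:
                # Seed labels stay clamped to their initial polarity.
                updated[word] = seeds[word]
                continue
            total = sum(sim for _, sim in neighbours)
            weighted = sum(scores[other] * sim for other, sim in neighbours)
            updated[word] = weighted / total if total else 0.0
        scores = updated
    return scores


def split_lexicons(scores, threshold=0.5):
    """Words with a strongly positive or negative score form the lexicons."""
    positive = sorted(w for w, s in scores.items() if s >= threshold)
    negative = sorted(w for w, s in scores.items() if s <= -threshold)
    return positive, negative


graph = build_graph(EMBEDDINGS, k=2)
scores = propagate(graph, seeds={"surge": 1.0, "plunge": -1.0})
positive, negative = split_lexicons(scores)
print("positive:", positive)
print("negative:", negative)
```

In this toy setting the unlabelled words inherit the polarity of the seed they sit closest to in the embedding space. The resulting word lists could then be turned into features (for example, counts of lexicon words per article) for a downstream classifier, which is one plausible reading of how the lexicons are used in the evaluation.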
Description
2019 International Conference on Deep Learning and Machine Learning in Emerging Applications, Deep-ML 2019 -- 26 August 2019 through 28 August 2019 -- 153122
Keywords
Classification, Dow Jones Dataset, Financial Lexicons, Financial News, Label Propagation, Machine Learning, Classification (of Information), Decision Trees, Embeddings, Finance, Graph Theory, Learning Algorithms, Learning Systems, Natural Language Processing Systems, Text Processing, Dow Jones, Embedding Algorithms, Logistic Regressions, Natural Language Processing, Performance of Classifier, Deep Learning
Source
Proceedings - 2019 International Conference on Deep Learning and Machine Learning in Emerging Applications, Deep-ML 2019
WoS Q Value
Scopus Q Value
N/A