Building Domain-Specific Lexicons: An Application to Financial News

Date

2019

Journal Title

Journal ISSN

Volume Title

Publisher

Institute of Electrical and Electronics Engineers Inc.

Access Rights

info:eu-repo/semantics/closedAccess

Abstract

Natural Language Processing (NLP) has gained attention in recent years. Previous research (such as WordNet and Cyc) has focused on developing all-purpose (generalised) polarised lexicons. However, these lexicons do not provide much information in specific domains such as finance and medical sciences, and using them for text classification could affect prediction accuracy. Therefore, there is a need for building domain- and context-specific lexicons. To achieve this, this work proposes a label-propagation-based word embedding algorithm to obtain positive and negative lexicons. The proposed algorithm works on the principles of graph theory and word embedding. It is tested on the Dow Jones news wires text feed to classify financial news as hot or non-hot. Three classifiers, namely Logistic Regression, Random Forest and XGBoost, were used with polarised lexicons, seed words and random words. The performance of the classifiers in all cases was evaluated using accuracy. Classifiers using lexicons generated by the proposed approach were more effective in classifying financial news articles as hot or non-hot than classifiers using seed words or random words. The proposed label propagation with word embedding algorithm generates context-specific lexicons, which helps in better representation of text in natural language processing tasks and avoids the problem of dimensionality. © 2019 IEEE.
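
The abstract does not spell out the algorithm, so the following is only a minimal sketch of one common way to combine label propagation with word embeddings, not the authors' implementation. It assumes a cosine-similarity k-nearest-neighbour graph and a random walk with restart from the seed words. The function name `propagate_labels`, the parameters `k`, `beta` and `iters`, and the toy two-dimensional embeddings are all hypothetical and chosen for illustration.

```python
import numpy as np

def propagate_labels(embeddings, pos_seeds, neg_seeds, k=2, beta=0.85, iters=50):
    """Score words in [0, 1]; values near 1 lean positive, near 0 negative.

    Assumption: this is a generic random-walk-with-restart scheme over a
    k-NN similarity graph, not the method described in the paper.
    """
    words = list(embeddings)
    index = {w: i for i, w in enumerate(words)}

    # Unit-normalise vectors so the dot product is cosine similarity.
    X = np.array([embeddings[w] for w in words], dtype=float)
    X /= np.linalg.norm(X, axis=1, keepdims=True)
    sim = np.clip(X @ X.T, 0.0, None)
    np.fill_diagonal(sim, 0.0)

    # Keep only the k strongest neighbours of each word, then symmetrise.
    adj = np.zeros_like(sim)
    for i in range(len(words)):
        nbrs = np.argsort(sim[i])[-k:]
        adj[i, nbrs] = sim[i, nbrs]
    adj = np.maximum(adj, adj.T)

    # Row-normalise into a transition matrix.
    row_sums = adj.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0
    T = adj / row_sums

    def walk(seeds):
        # Restart distribution concentrated on the seed words.
        s = np.zeros(len(words))
        s[[index[w] for w in seeds]] = 1.0 / len(seeds)
        p = s.copy()
        for _ in range(iters):
            p = beta * (T.T @ p) + (1 - beta) * s
        return p

    p_pos, p_neg = walk(pos_seeds), walk(neg_seeds)
    score = p_pos / (p_pos + p_neg + 1e-12)
    return dict(zip(words, score))

# Toy embeddings (made up for illustration, not from the Dow Jones data).
toy = {
    "gain": [1.0, 0.1], "surge": [0.9, 0.2], "rally": [0.95, 0.05],
    "loss": [0.1, 1.0], "plunge": [0.2, 0.9], "default": [0.05, 0.95],
}
scores = propagate_labels(toy, pos_seeds=["gain"], neg_seeds=["loss"])
```

Words whose score is close to 1 would go into the positive lexicon and words close to 0 into the negative one; where to place the cut-offs is a design choice that this sketch leaves open.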

Description

2019 International Conference on Deep Learning and Machine Learning in Emerging Applications, Deep-ML 2019 -- 26 August 2019 through 28 August 2019 -- 153122

Keywords

Classification, Dow Jones Dataset, Financial Lexicons, Financial News, Label Propagation, Machine Learning, Classification (Of Information), Decision Trees, Embeddings, Finance, Graph Theory, Learning Algorithms, Learning Systems, Natural Language Processing Systems, Text Processing, Dow Jones, Embedding Algorithms, Logistic Regressions, Natural Language Processing, Performance Of Classifier, Deep Learning

Source

Proceedings - 2019 International Conference on Deep Learning and Machine Learning in Emerging Applications, Deep-ML 2019

WoS Q Value

Scopus Q Value

N/A

Volume

Issue

Citation