Classification of "Hot News" for Financial Forecast Using NLP Techniques
Date
2018
Publisher
Institute of Electrical and Electronics Engineers Inc.
Access Rights
info:eu-repo/semantics/closedAccess
Abstract
The complex dynamics of the stock market can be attributed to various factors, ranging from a company's financial ratios to investors' sentiment and reaction to financial news. The paper aims to classify financial news articles as "hot" (significant) or "non-hot" (non-significant). The study uses the Dow Jones newswires text feed over a four-year period from 2013 to 2017. A bag-of-ngrams approach and Term Frequency-Inverse Document Frequency (TF-IDF) were used for text representation and text weighting, respectively. Four linear classifiers, namely Logistic Regression (LR), Support Vector Machine (SVM), k Nearest Neighbours (kNN) and multinomial Naïve Bayes (mNB), were used. Grid search was used for hyperparameter optimisation. Performance of the classifiers was evaluated using five measures, namely success rate, precision, recall, F1 measure and area under the receiver operating characteristic curve. LR and SVM outperformed the other models on all five performance measures, both for the bag-of-ngrams model and for the bag-of-ngrams model with TF-IDF. Use of TF-IDF improved the performance of the classifiers, especially in the case of mNB. This study serves as a stepping stone towards identifying important or relevant news, which could be used as predictors for stock price forecasting. © 2018 IEEE.
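The text representation and weighting steps named in the abstract, together with three of the five evaluation measures, can be sketched in plain Python. This is a minimal illustration and not the authors' implementation: the record does not state the n-gram range, the tokenisation, or the TF-IDF variant, so the unigram-plus-bigram range, whitespace tokenisation, and smoothed IDF formula below are assumptions, and the three headlines are invented toy data. Classifier training and grid search are omitted; in practice they would normally be delegated to a machine learning library.

```python
import math
from collections import Counter


def ngrams(tokens, n_max=2):
    """Return all word n-grams of length 1..n_max as space-joined strings."""
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]


def bag_of_ngrams(docs, n_max=2):
    """Map each document to a Counter of n-gram counts (assumed: lowercase, whitespace split)."""
    return [Counter(ngrams(doc.lower().split(), n_max)) for doc in docs]


def tfidf(bags):
    """Weight raw counts by a smoothed IDF, ln((1 + N) / (1 + df)) + 1 (an assumed variant)."""
    n_docs = len(bags)
    # Iterating over a Counter yields each distinct term once, so this counts documents.
    df = Counter(term for bag in bags for term in bag)
    idf = {t: math.log((1 + n_docs) / (1 + d)) + 1 for t, d in df.items()}
    return [{t: c * idf[t] for t, c in bag.items()} for bag in bags]


def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall and F1 for the positive ("hot") class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical toy headlines, not taken from the Dow Jones feed.
docs = ["Fed raises rates", "Fed holds rates", "Company opens new office"]
bags = bag_of_ngrams(docs)
weights = tfidf(bags)
```

In this toy corpus, "raises" occurs in one document and "fed" in two, so the TF-IDF step gives "raises" the larger weight in the first headline even though both terms have a raw count of one.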
Description
Baidu;et al.;Expedia Group;IEEE;IEEE Computer Society;Squirrel AI Learning
2018 IEEE International Conference on Big Data, Big Data 2018 -- 10 December 2018 through 13 December 2018 -- -- 144531
Keywords
Classification, Financial Forecasts, Financial News, Hot News, Natural Language Processing, Barium Compounds, Electronic Trading, Financial Markets, Forecasting, Investments, Natural Language Processing Systems, Nearest Neighbor Search, Text Processing, Language Processing, Logistics Regressions, Multinomial Naive Bayes, Natural Languages, Performance, Term Frequency-Inverse Document Frequency (TF-IDF), Support Vector Machines
Source
Proceedings - 2018 IEEE International Conference on Big Data, Big Data 2018
WoS Q Value
Scopus Q Value
N/A