Combination of Lexicon Based and Machine Learning Techniques in the Development of Political Tweet Sentiment Analysis Model
Abstract
Twitter is a popular microblogging platform and the largest contributor of data for the analysis of political sentiment in the United States, especially during presidential elections. The lack of labeled data and the requirements for test data are major problems in the political domain, because the data change constantly with current events. The contribution of this study is a comparison of two dictionary-based lexicon approaches, the Bing Liu Opinion Lexicon and TextBlob, for labeling tweets. Several comparative models were developed. The model whose training tweets were labeled with the Bing Liu Opinion Lexicon, and which uses TF-IDF for feature extraction and Naïve Bayes for classification, achieves the highest F1-score at 93%, outperforming our baseline model, which scores only 68%. The test results show the effectiveness of combining lexicon approaches with machine learning algorithms in the development of a sentiment analysis model.
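The pipeline summarized above (lexicon-based labeling, TF-IDF features, Naïve Bayes classification) can be illustrated with a minimal sketch in plain Python. This is not the implementation used in the study: the word sets `POSITIVE` and `NEGATIVE` are a hypothetical stand-in for the Bing Liu Opinion Lexicon, the example tweets are invented, the smoothed IDF formula and the choice to drop neutral tweets are assumptions, and all function names are illustrative.

```python
import math
from collections import Counter, defaultdict

# Hypothetical mini-lexicon; the real Bing Liu Opinion Lexicon is far larger.
POSITIVE = {"great", "win", "strong", "support", "good"}
NEGATIVE = {"bad", "corrupt", "weak", "fail", "lie"}
PUNCT = ".,!?#@\"'"

def tokenize(text):
    """Lowercase, split on whitespace, strip surrounding punctuation."""
    tokens = (w.strip(PUNCT).lower() for w in text.split())
    return [t for t in tokens if t]

def lexicon_label(text):
    """Label a tweet by counting positive and negative lexicon hits."""
    tokens = tokenize(text)
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

def fit_idf(docs):
    """Smoothed inverse document frequency (an assumed variant)."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    return {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}

def tfidf(doc, idf):
    """Sparse TF-IDF vector; terms unseen during fitting are ignored."""
    tf = Counter(t for t in doc if t in idf)
    return {t: c * idf[t] for t, c in tf.items()}

def train_nb(vectors, labels, vocab, alpha=1.0):
    """Multinomial Naive Bayes on TF-IDF weights with Laplace smoothing."""
    totals = defaultdict(lambda: defaultdict(float))
    for vec, y in zip(vectors, labels):
        for t, w in vec.items():
            totals[y][t] += w
    model = {}
    for y, n_y in Counter(labels).items():
        denom = sum(totals[y].values()) + alpha * len(vocab)
        log_theta = {t: math.log((totals[y][t] + alpha) / denom) for t in vocab}
        model[y] = (math.log(n_y / len(labels)), log_theta)
    return model

def predict(model, vec):
    """Return the class with the highest log-posterior score."""
    def score(y):
        log_prior, log_theta = model[y]
        return log_prior + sum(w * log_theta[t] for t, w in vec.items())
    return max(model, key=score)

# Invented, unlabeled example tweets.
tweets = [
    "Great debate, strong win for the senator",
    "Good plan, I support this policy",
    "Corrupt and weak leadership, such a bad policy",
    "Another lie, this plan will fail",
    "The vote is on Tuesday",
]

# Step 1: label with the lexicon; neutral tweets are dropped (an assumption).
labelled = [(t, lexicon_label(t)) for t in tweets]
labelled = [(t, y) for t, y in labelled if y != "neutral"]

# Step 2: TF-IDF features. Step 3: train the classifier on lexicon labels.
docs = [tokenize(t) for t, _ in labelled]
idf = fit_idf(docs)
model = train_nb([tfidf(d, idf) for d in docs], [y for _, y in labelled], idf)

print(predict(model, tfidf(tokenize("weak and corrupt senator"), idf)))
```

The point of the sketch is that the classifier never sees hand-made labels: the lexicon supplies them, and the trained model can then score tweets by their full vocabulary rather than by lexicon words alone.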