This was an NLP data science project that analyzed the sentiment of 200,000 regular and political tweets in order to estimate public sentiment toward each political party. The majority of the project was coded in Python. All stages of a classic data science project were performed: cleaning, exploration, feature engineering, modelling, and application. Several NLP techniques, such as Bag of Words and TF-IDF, were explored to build the final models. The best-performing model was built on a Random Forest and relied on TF-IDF for feature engineering.
The full code and analysis can be found in this Google Colab link.
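
To make the difference between the two feature engineering techniques concrete, here is a minimal sketch in plain Python. It is not taken from the project's notebook: the toy tweets, the whitespace tokenizer, and the helper names (`tokenize`, `bag_of_words`, `tfidf`) are all assumptions made for illustration, and the weighting shown is the textbook form (relative term frequency times `ln(N / df)`), which may differ from the variant the project actually used.

```python
import math
from collections import Counter

# Hypothetical toy tweets; not taken from the project's data set.
tweets = [
    "great policy great leadership",
    "terrible policy",
    "great game tonight",
]

def tokenize(text):
    """Very naive tokenizer (assumption): lowercase and split on whitespace."""
    return text.lower().split()

docs = [tokenize(t) for t in tweets]

# Shared vocabulary, sorted so that every vector uses the same column order.
vocab = sorted({word for doc in docs for word in doc})

def bag_of_words(doc):
    """Bag of Words: raw count of each vocabulary word in the document."""
    counts = Counter(doc)
    return [counts[word] for word in vocab]

# Document frequency: in how many documents each word appears.
n_docs = len(docs)
doc_freq = {word: sum(1 for doc in docs if word in doc) for word in vocab}

def tfidf(doc):
    """Textbook TF-IDF: (count / document length) * ln(N / document frequency)."""
    counts = Counter(doc)
    return [
        (counts[word] / len(doc)) * math.log(n_docs / doc_freq[word])
        for word in vocab
    ]

bow_matrix = [bag_of_words(doc) for doc in docs]
tfidf_matrix = [tfidf(doc) for doc in docs]

print(vocab)
print(bow_matrix[0])
print([round(weight, 3) for weight in tfidf_matrix[0]])
```

In the first tweet, "great" has the highest Bag of Words count, but TF-IDF gives "leadership" a larger weight because it appears in only one tweet. Either matrix could then be passed to a classifier such as a Random Forest. Library implementations often apply smoothing and normalization on top of this basic formula, so their values will generally not match this sketch exactly.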