Debkalpa-das Portfolio

Jupyter Notebook
Capstone Project + Research

Enhancing Sentiment Analysis

Overview

This project takes a comprehensive approach to sentiment analysis of movie reviews using deep learning. The methodology combines Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) networks to automatically classify each review into one of two categories: positive sentiment or negative sentiment.

Preprocessing: I used the "IMDB Dataset.csv" dataset. The data was preprocessed with Python and GloVe word embeddings. Errors, null values, and inconsistencies were removed or corrected so as to reduce the error margin.
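As a rough illustration of the cleaning step described above, here is a minimal sketch using only the Python standard library. It is not the project's actual notebook code: the column names `review` and `sentiment`, the in-memory sample, and the helper `clean_rows` are all assumptions made for the example.

```python
import csv
import io
import re

# Tiny in-memory stand-in for "IMDB Dataset.csv". The column names
# "review" and "sentiment" are assumed, not confirmed by the project.
SAMPLE_CSV = '''review,sentiment
"A wonderful film.<br /><br />Loved it!",positive
"",negative
"Dull and far too long.",negative
"Dull and far too long.",negative
"Great cast, weak plot.",
'''

TAG_RE = re.compile(r"<[^>]+>")  # matches HTML tags such as <br />


def clean_rows(handle):
    """Yield (review, label) pairs, skipping empty, unlabeled, and duplicate rows."""
    seen = set()
    for row in csv.DictReader(handle):
        # Replace HTML tags with spaces, collapse whitespace, lowercase.
        review = TAG_RE.sub(" ", row.get("review") or "")
        review = " ".join(review.split()).lower()
        sentiment = (row.get("sentiment") or "").strip().lower()
        # Drop rows with a missing review or a missing/unknown label.
        if not review or sentiment not in {"positive", "negative"}:
            continue
        # Drop exact duplicates of an already seen review.
        if review in seen:
            continue
        seen.add(review)
        yield review, 1 if sentiment == "positive" else 0


rows = list(clean_rows(io.StringIO(SAMPLE_CSV)))
```

On the sample above, only the first and third rows survive: the empty review, the duplicate, and the unlabeled row are dropped. To run this on the real file, the `io.StringIO` stand-in would be replaced by an opened file handle.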

Design: Using deep learning models, namely Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) networks, we achieved an accuracy of 90.28%, the highest among the results we compared against. We also built pipelines for input and output prediction, set up a web repository, and developed web application routes for the project.
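The input and output prediction pipelines mentioned above could look roughly like the sketch below. It assumes that the model consumes fixed-length sequences of integer token ids and emits a single probability for the positive class. The helper names (`build_vocab`, `encode`, `decode`), the padding and unknown-token ids, and the 0.5 threshold are hypothetical choices for illustration, not the project's actual code.

```python
PAD, UNK = 0, 1  # assumed ids reserved for padding and unknown tokens


def build_vocab(texts):
    """Map each distinct token to an integer id, starting after PAD and UNK."""
    vocab = {}
    for text in texts:
        for token in text.lower().split():
            if token not in vocab:
                vocab[token] = len(vocab) + 2
    return vocab


def encode(text, vocab, max_len):
    """Input pipeline: turn raw text into a fixed-length list of token ids."""
    ids = [vocab.get(tok, UNK) for tok in text.lower().split()][:max_len]
    return ids + [PAD] * (max_len - len(ids))


def decode(probability, threshold=0.5):
    """Output pipeline: turn the model's positive-class probability into a label."""
    return "positive" if probability >= threshold else "negative"


vocab = build_vocab(["great movie", "boring movie"])
encoded = encode("Great boring sequel", vocab, max_len=5)
label = decode(0.91)
```

Here `encoded` is a list of five ids in which the unseen word "sequel" maps to `UNK` and the tail is filled with `PAD`, which is the shape an embedding layer in front of a CNN or LSTM would typically expect.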

Research: In our project we used CNNs and LSTM networks to train the model more effectively, and this gave better results than previous research. Our model reached an average accuracy of 90.28%, whereas the best previous works yielded 88.94% and 88.21% respectively, an improvement of 1.34 and 2.07 percentage points.

Challenges

Every project comes with its challenges, and this capstone project and research was no different. Combining several deep learning techniques with GloVe embeddings helped us get a better outcome.

Challenge: Accuracy breakthrough.
Solution: A lot of research and development has gone into enhancing sentiment analysis, but none of the previous work we reviewed reached a model accuracy above 90%. My team and I finally crossed the 90% mark, with our model reaching an average accuracy of 90.28%.

Challenge: Working with a cross-functional team.
Solution: Bringing together a cross-functional team of students from various specializations and getting them to work toward a common goal, building a model that surpasses the previous accuracy on sentiment analysis, was quite a hassle, yet we came through.

Challenge: Removing special characters, emojis, and punctuation.
Solution: Because we applied our model to the IMDB dataset, many reviews contain special characters, emojis, concatenated words, and punctuation. These are relevant to the review and to modern language, but they hinder the actual analysis. To handle this, we used GloVe to compare each token against a real vocabulary and remove any unnecessary words.
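A minimal sketch of this vocabulary filter is shown below, under the assumption that the GloVe vectors are stored in the usual plain-text layout (one token per line, followed by its vector components). The two-dimensional sample, the regular expression, and the helper names `load_glove_vocab` and `filter_tokens` are illustrative assumptions rather than the project's actual code.

```python
import io
import re

# Two-dimensional stand-in for a GloVe text file: one token per line,
# followed by its vector components. Real files use many more dimensions.
SAMPLE_GLOVE = """good 0.1 0.2
movie 0.3 0.4
not 0.5 0.6
"""

NON_LETTER_RE = re.compile(r"[^a-z\s]")  # anything that is not a-z or whitespace


def load_glove_vocab(handle):
    """Collect the set of tokens that have a GloVe vector."""
    vocab = set()
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) > 1:
            vocab.add(parts[0])
    return vocab


def filter_tokens(review, vocab):
    """Strip special characters, emojis, and punctuation, then keep known tokens."""
    text = NON_LETTER_RE.sub(" ", review.lower())
    return [tok for tok in text.split() if tok in vocab]


glove_vocab = load_glove_vocab(io.StringIO(SAMPLE_GLOVE))
tokens = filter_tokens("Not a goodmovie... but a GOOD movie!!! \U0001F600", glove_vocab)
```

In this example the emoji and punctuation are stripped, and the concatenated word "goodmovie" is dropped because it has no entry in the vocabulary. One simplification to be aware of: the regular expression also splits contractions such as "don't", which a more careful tokenizer would handle separately.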

Results/Conclusion:

After about 3 months of teamwork and extensive research, we finished building our deep learning model for sentiment analysis and achieved an accuracy of 90.28% on the IMDB dataset. Our model outperformed those of our peers in the same category, as well as the previous research and models on sentiment analysis that we compared against.

Eat Work Sleep Repeat

Debkalpa Das👋
