
NLP - Tweet Binary Classification (Disaster tweet or Not)
In this project, I ran multiple experiments to build a binary classification model that classifies tweets as "Disaster" or "Not Disaster" with high accuracy. I started from a traditional machine learning baseline (Naive Bayes), moved up to a Bidirectional LSTM, and also used Google's pretrained Universal Sentence Encoder (USE) model.
With this I was able to achieve 82% accuracy. All of the models built were fairly simple, with one to two hidden layers and units in multiples of 8. This was done because the amount of data was small (deep learning requires a very large amount of data to reach high accuracy and avoid overfitting).
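To illustrate the range of models used, here is a minimal sketch of the two ends of that spectrum: a Naive Bayes baseline on TF-IDF features and a small Bidirectional LSTM with units in multiples of 8. Variable names, placeholder data, and hyperparameters are hypothetical, not the exact code from the experiments.

```python
import tensorflow as tf
from tensorflow.keras import layers
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Placeholder data standing in for the real train split of disaster tweets.
train_tweets = ["example tweet reporting a flood in the city", "example tweet about lunch plans"]
train_labels = [1, 0]

# 1) Naive Bayes baseline on TF-IDF features
baseline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", MultinomialNB()),
])
baseline.fit(train_tweets, train_labels)

# 2) Small Bidirectional LSTM: one recurrent layer, sigmoid output for the binary label
max_tokens, seq_len, embed_dim = 10000, 32, 64  # assumed hyperparameters
vectorizer = layers.TextVectorization(max_tokens=max_tokens, output_sequence_length=seq_len)
vectorizer.adapt(train_tweets)

bi_lstm = tf.keras.Sequential([
    vectorizer,
    layers.Embedding(max_tokens, embed_dim),
    layers.Bidirectional(layers.LSTM(32)),
    layers.Dense(1, activation="sigmoid"),
])
bi_lstm.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
```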
Experiment Details
The following section covers details of the different experiments, including EDA, model architectures, and their performance.
Predictions & Analysis
I also ran custom predictions on unseen, real-world tweets. For this I used Model 6, which was based on transfer learning with the Universal Sentence Encoder (USE). It predicted the correct class for all of the tweets.
| Tweet | Prediction | Probability |
|---|---|---|
| #Beirut declared a “devastated city”, two-week state of emergency officially declared. #Lebanon | Real Disaster | 0.9760 |
| This mobile App may help these people to predict this powerfull M7.9 earthquake | Real Disaster | 0.694 |
| Love the explosion effects in the new spiderman movie | Not Real Disaster | 0.2220 |
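A sketch of how such custom predictions could be produced with the USE-based model is shown below. The model name, layer sizes, and threshold are assumptions for illustration; it relies on the public USE module from TensorFlow Hub.

```python
import tensorflow as tf
import tensorflow_hub as hub
from tensorflow.keras import layers

# Universal Sentence Encoder as a frozen Keras layer (transfer learning)
use_layer = hub.KerasLayer("https://tfhub.dev/google/universal-sentence-encoder/4",
                           input_shape=[], dtype=tf.string, trainable=False)

use_model = tf.keras.Sequential([
    use_layer,
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
use_model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
# use_model.fit(train_tweets, train_labels, epochs=5)  # trained as in the experiments

custom_tweets = [
    '#Beirut declared a "devastated city", two-week state of emergency officially declared. #Lebanon',
    "Love the explosion effects in the new spiderman movie",
]
probs = use_model.predict(tf.constant(custom_tweets)).flatten()
for tweet, p in zip(custom_tweets, probs):
    label = "Real Disaster" if p >= 0.5 else "Not Real Disaster"
    print(f"{label} (prob: {p:.4f}) :: {tweet}")
```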
More experiments can be done to improve model accuracy, especially for the RNN-based models, which overfit even with very simple architectures; more data and some model regularization are needed to avoid this overfitting.
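One possible regularization direction, sketched below, is to add dropout in and after the recurrent layer and to stop training early on validation loss. This is an illustrative example, not the configuration used in the experiments; it reuses `vectorizer`, `max_tokens`, and `embed_dim` from the earlier Bi-LSTM sketch.

```python
import tensorflow as tf
from tensorflow.keras import layers, callbacks

# Regularized variant of the small Bi-LSTM: dropout plus early stopping.
regularized = tf.keras.Sequential([
    vectorizer,
    layers.Embedding(max_tokens, embed_dim),
    layers.Bidirectional(layers.LSTM(32, dropout=0.3, recurrent_dropout=0.3)),
    layers.Dropout(0.3),
    layers.Dense(1, activation="sigmoid"),
])
regularized.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])

early_stop = callbacks.EarlyStopping(monitor="val_loss", patience=3, restore_best_weights=True)
# regularized.fit(train_tweets, train_labels, validation_split=0.1,
#                 epochs=20, callbacks=[early_stop])
```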
To check the most-wrong predictions and other classification metrics such as precision and recall, please visit my GitHub page and check the code files.