chauhan17nitin / haterase

Realtime twitter hate speech detection

License: MIT License

Jupyter Notebook 99.88% Python 0.12%

haterase's Introduction

hatErase

Realtime twitter hate speech detection

This is the master branch of the repo. It contains the codebase for the machine learning approach we followed to train our model.

The other branch is webapp, which mainly contains the codebase for deployment.

Dataset Collection

We combined several open-source datasets from different hackathons and competitions into one big dataset containing a variety of tweets. The dataset focuses mainly on the English language. Dataset Exploration has the code for all the dataset exploration and for how we concatenated the datasets.

Dataset Preprocessing

Dataset Preprocessing contains the code for how we cleaned the dataset, as it cannot be fed directly to the machine learning models. It also covers the different techniques we used to extract useful features from the text, such as hashtags and user mentions.
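As a rough illustration of this kind of cleaning, here is a minimal sketch using only Python's `re` module. The function name `preprocess_tweet`, the regular expressions, and the exact cleaning order are assumptions made for this example; they are not taken from the repo's notebooks.

```python
import re

# Hypothetical patterns for this sketch; the repo's actual rules may differ.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@(\w+)")
HASHTAG_RE = re.compile(r"#(\w+)")


def preprocess_tweet(text):
    """Return (cleaned_text, hashtags, mentions) for a single tweet."""
    # Drop URLs first so fragments like '#section' in links are not counted.
    no_urls = URL_RE.sub(" ", text)

    # Collect hashtags and user mentions as separate features.
    hashtags = HASHTAG_RE.findall(no_urls)
    mentions = MENTION_RE.findall(no_urls)

    # Remove mentions entirely, keep the hashtag word without the '#'.
    cleaned = MENTION_RE.sub(" ", no_urls)
    cleaned = HASHTAG_RE.sub(r"\1", cleaned)

    # Keep letters only, collapse whitespace, and lowercase.
    cleaned = re.sub(r"[^a-zA-Z\s]", " ", cleaned)
    cleaned = re.sub(r"\s+", " ", cleaned).strip().lower()
    return cleaned, hashtags, mentions


cleaned, hashtags, mentions = preprocess_tweet(
    "@user1 Check this out! #Example https://t.co/abc"
)
```

Returning the hashtags and mentions separately keeps them available as features even after they are stripped or normalized in the cleaned text.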

Machine Learning Models

We trained our model using two prominent ML algorithms for binary classification: Multinomial Naive Bayes and Logistic Regression.

The final saved model is LR trained with n-grams of range (1,3) as lexical features.
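A setup along these lines could be sketched with scikit-learn as follows. This is only an assumption-laden illustration: the use of `TfidfVectorizer` (suggested by the TF-IDF issue on this page, but not confirmed for the final model), the `max_iter` value, and the toy texts and labels are all made up for the example and are not the repo's data or settings.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy placeholder data (1 = hateful, 0 = not hateful); not from the repo.
texts = [
    "you are awful and stupid",
    "those people are awful and worthless",
    "have lovely day friend",
    "what lovely sunny day today",
]
labels = [1, 1, 0, 0]

# Word n-grams of range (1, 3): unigrams, bigrams, and trigrams.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 3)),
    LogisticRegression(max_iter=1000),
)
model.fit(texts, labels)

# Probability of the positive class for new, already-cleaned text.
scores = model.predict_proba(["have lovely day", "you are awful"])[:, 1]
```

Wrapping the vectorizer and classifier in one pipeline means a single object can be saved and loaded for prediction, so the same n-gram vocabulary is always applied at inference time.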

The training set classification report was:

The Test set classification report was:

The AUC-ROC curve for test set was:

Hate Score prediction

documentation goes here

Contributed By

Nitin Chauhan and Srijan Singh

haterase's Issues

Cannot import model

Please see my preprocess.py file. I am not able to import models. Can you help me fix it?

Using real-time Streaming API

How will we use the streaming API code?
Where will we place that particular code?
Do we need to use threading?

Size of saved TF-IDF vector

The saved TF-IDF vector is around 550 MB. We surely need to reduce this.

Search: an empty search returns the previous search result. How can we stop it?

After debugging the search function, it seems that the problem lies in the API.

What is 'controls_set' in detail.html, line 3?

Please tell me where it is referenced from.
