clustering-and-classification-of-meps-tweets's Introduction

European Parliament Twitter activities MEP 1.0

We will investigate a dataset of tweets made by Members of the European Parliament (MEPs). We will use data collected by Darko Cherepnalkoski, Andreas Karpf, Igor Mozetič, and Miha Grčar for their paper "Cohesion and Coalition Formation in the European Parliament: Roll-Call Votes and Twitter Activities".

This assignment is based on an original assignment by Ioannis Pavlopoulos (postdoc researcher) and Vasiliki Kougia (PhD candidate at AUEB).


Ioannis (Ion) Petropoulos, 8160107
Department of Management Science and Technology
Athens University of Economics and Business

Read the notebook here

WordCloud

We drew word clouds for the EFDD and S&D groups.

EFDD:

Is Europe of Freedom and Direct Democracy (EFDD) strongly connected with UKIP? Of course it is: the EFDD has been chaired by Nigel Farage, who also happens to be the leader of the UK Independence Party (UKIP).

S&D:

  • These are the Socialists & Democrats.
  • They are fighting for social justice, equality and sustainability.
  • The words we see in the cloud may refer to the S&D's outlook on Brexit: work with Labour for a closer EU-UK relationship, or put the question back to the people.
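The word clouds discussed above could be produced along the following lines. This is only a sketch: the example tweets, the stopword set, the tokenization rule and the helper names `term_frequencies` and `draw_cloud` are assumptions made for illustration, not the notebook's actual code. The frequencies are computed with the standard library, and the `wordcloud` package is imported only when an image is actually drawn.

```python
import re
from collections import Counter


def term_frequencies(tweets, stopwords=frozenset()):
    """Count word-like tokens across tweets, ignoring URLs and stopwords."""
    counts = Counter()
    for tweet in tweets:
        # Drop URLs first so fragments such as "https" or "t" are not counted.
        text = re.sub(r"https?://\S+", " ", tweet.lower())
        # Keep tokens of at least two characters (letters, digits, underscore).
        for token in re.findall(r"[a-z_][a-z0-9_]+", text):
            if token not in stopwords:
                counts[token] += 1
    return counts


def draw_cloud(freqs, path):
    """Render a word cloud image from a {term: count} mapping."""
    # Imported lazily so the counting step works without the package installed.
    from wordcloud import WordCloud

    cloud = WordCloud(width=800, height=400, background_color="white")
    cloud.generate_from_frequencies(freqs)
    cloud.to_file(path)


# Made-up tweets standing in for one political group's timeline.
example_tweets = [
    "UKIP wants to leave the EU https://t.co/abc",
    "Vote UKIP, vote leave",
]
freqs = term_frequencies(example_tweets, stopwords={"the", "to"})
```

Calling `draw_cloud(freqs, "efdd.png")` would then write the image; the file name is a placeholder. Building one frequency table per group keeps the clouds comparable, since each is drawn from the same tokenization.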

Results - Clustering

Top terms per cluster:

Cluster 0: ukip labour nhs party people just immigration policy great debate
Cluster 1: trade ttip free union need deal deals great jude_kd intergroup
Cluster 2: ttip isds vote debate labour public eurolabour good malmstromeu eppgroup
Cluster 3: migration policy ukip net crisis asylum need eppgroup cameron mass
Cluster 4: greece tsipras greek euro eppgroup eurozone people yes imf new
Cluster 5: eppgroup great people good new meeting need support debate just
Cluster 6: vote ukip labour yes people report majority debate just leave
Cluster 7: uk ukip labour brexit migrants govt good steel leave people

  • Cluster 7 is referring to Brexit (or the political situation in the UK)
  • Cluster 2 is probably talking about international issues (TTIP, ISDS)
  • Cluster 3 is talking about a migration crisis in the UK
  • Cluster 6 is referring to the UK Independence Party
  • Cluster 4 is addressing the political situation in Greece

Results - Classification

Using SGDClassifier, we predicted the political group from a tweet's text, achieving 60% accuracy under cross-validation.

We vectorized the text with CountVectorizer and reweighted the counts with TfidfTransformer.
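The three components named above fit naturally into a scikit-learn `Pipeline`, which also ensures that the vectorizer is refitted inside every cross-validation fold. The following is a minimal sketch under assumptions: default hyperparameters, two folds, and a handful of made-up tweets and labels. It does not reproduce the reported 60% figure, and the notebook's real settings may differ.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline


def build_pipeline(seed=0):
    """Token counts -> TF-IDF weights -> linear classifier trained with SGD."""
    return Pipeline([
        ("counts", CountVectorizer()),
        ("tfidf", TfidfTransformer()),
        ("clf", SGDClassifier(random_state=seed)),
    ])


# Made-up tweets and group labels, for illustration only.
toy_tweets = [
    "ukip says leave the eu now",
    "vote ukip to control immigration",
    "brexit means leave says ukip",
    "ukip wants a referendum on the eu",
    "social justice and equality for workers",
    "we fight for sustainability and equality",
    "workers deserve social justice in europe",
    "equality and fair jobs for all workers",
]
toy_groups = ["EFDD"] * 4 + ["S&D"] * 4

# For classifiers, an integer cv uses stratified folds, so each fold
# contains tweets from both groups.
scores = cross_val_score(
    build_pipeline(), toy_tweets, toy_groups, cv=2, scoring="accuracy"
)
mean_accuracy = scores.mean()
print(f"Mean cross-validated accuracy: {mean_accuracy:.2f}")
```

Wrapping the steps in one pipeline avoids leaking vocabulary or document-frequency statistics from the held-out fold into training.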

Libraries

Tweet Extraction:

  • tweepy

Data Manipulation:

  • pandas
  • numpy
  • re

Visualization:

  • yellowbrick
  • matplotlib
  • wordcloud

Machine Learning:

  • sklearn
  • xgboost
