We will investigate a dataset of tweets made by Members of the European Parliament. We will use data collected by Darko Cherepnalkoski, Andreas Karpf, Igor Mozetič, and Miha Grčar for their paper "Cohesion and Coalition Formation in the European Parliament: Roll-Call Votes and Twitter Activities".
This assignment is based on an original assignment by Ioannis Pavlopoulos (postdoc researcher) and Vasiliki Kougia (PhD candidate at AUEB).
Ioannis (Ion) Petropoulos, 8160107
Department of Management Science and Technology
Athens University of Economics and Business
[email protected]
Read the notebook here
We drew word clouds for EFDD and S&D.
Is Europe of Freedom and Direct Democracy (EFDD) strongly connected with UKIP? Of course it is, since EFDD has been chaired by Nigel Farage, who also happens to be the leader of the UK Independence Party (UKIP).
- These are the Socialists & Democrats (S&D).
- They are fighting for social justice, equality and sustainability.
- The words we see in the cloud may refer to the S&D's outlook on Brexit: work with Labour for a closer EU-UK relationship, or put the question back to the people.
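The per-group term counts behind a word cloud can be sketched as follows. This is a minimal illustration and not the notebook's code: the sample tweets, the tiny stop-word list, and the helper name `term_frequencies` are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical sample rows of (political group, tweet text); not the real dataset.
SAMPLE_TWEETS = [
    ("EFDD", "UKIP will campaign to leave the EU #Brexit"),
    ("EFDD", "Great debate tonight, UKIP made the case to leave"),
    ("S&D", "We fight for social justice and equality in Europe"),
    ("S&D", "Social Europe needs equality and sustainability"),
]

# A tiny illustrative stop-word list; a real run would need a fuller one.
STOP_WORDS = {"the", "to", "for", "and", "in", "we", "will", "a", "of"}


def term_frequencies(tweets, group):
    """Count lower-cased tokens in the tweets of one political group."""
    counts = Counter()
    for tweet_group, text in tweets:
        if tweet_group != group:
            continue
        # Drop URLs, then keep alphabetic tokens (underscores kept for handles).
        text = re.sub(r"https?://\S+", " ", text.lower())
        tokens = re.findall(r"[a-z_]+", text)
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts


efdd_freqs = term_frequencies(SAMPLE_TWEETS, "EFDD")
sd_freqs = term_frequencies(SAMPLE_TWEETS, "S&D")
print(efdd_freqs.most_common(3))
print(sd_freqs.most_common(3))
```

A frequency mapping like `efdd_freqs` could then be handed to the `wordcloud` package, for example through `WordCloud().generate_from_frequencies(efdd_freqs)`, assuming that package is installed.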
Top terms per cluster:
Cluster 0: ukip labour nhs party people just immigration policy great debate
Cluster 1: trade ttip free union need deal deals great jude_kd intergroup
Cluster 2: ttip isds vote debate labour public eurolabour good malmstromeu eppgroup
Cluster 3: migration policy ukip net crisis asylum need eppgroup cameron mass
Cluster 4: greece tsipras greek euro eppgroup eurozone people yes imf new
Cluster 5: eppgroup great people good new meeting need support debate just
Cluster 6: vote ukip labour yes people report majority debate just leave
Cluster 7: uk ukip labour brexit migrants govt good steel leave people
- Cluster 7 refers to Brexit (or the political situation in the UK).
- Cluster 2 is probably about international issues (TTIP, ISDS).
- Cluster 3 is about a migration crisis in the UK.
- Cluster 6 refers to the UK Independence Party.
- Cluster 4 addresses the political situation in Greece.
Using SGDClassifier, we predicted each tweet's political group from its text, achieving about 60% accuracy under cross-validation.
We used CountVectorizer as a vectorizer and TfidfTransformer as a transformer.
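The classification setup described here can be sketched as a scikit-learn pipeline. The toy texts, labels, fold count, and default hyperparameters below are assumptions for illustration only, so the scores will not match the 60% figure reported for the real data.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline

# Hypothetical toy tweets and group labels; not the real dataset.
texts = [
    "ukip wants to leave the eu",
    "vote leave says ukip",
    "ukip debate on immigration policy",
    "leave campaign backed by ukip",
    "social justice and equality for workers",
    "labour supports social europe",
    "equality and sustainability matter",
    "workers rights and social justice",
]
groups = ["EFDD"] * 4 + ["S&D"] * 4

# Raw counts -> TF-IDF weights -> linear classifier trained with SGD.
pipeline = Pipeline([
    ("vect", CountVectorizer()),
    ("tfidf", TfidfTransformer()),
    ("clf", SGDClassifier(random_state=0)),
])

# Stratified 2-fold cross-validation; the fold count is an assumption.
scores = cross_val_score(pipeline, texts, groups, cv=2)
print("Mean accuracy:", scores.mean())

# Fit on all toy data and predict the group of a new, made-up tweet.
pipeline.fit(texts, groups)
prediction = pipeline.predict(["ukip will campaign to leave"])[0]
print("Predicted group:", prediction)
```

Wrapping the vectorizer and transformer inside the pipeline means they are refitted on the training part of every fold, so no vocabulary or IDF statistics leak from the held-out tweets into training.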
- tweepy
- pandas
- numpy
- re
- yellowbrick
- matplotlib
- wordcloud
- sklearn
- xgboost