Tweets-Scrapper

I have used this script to scrape more than 30K tweets from more than 40 authors. You only have to give it a list of Twitter handles and an output CSV file path; it downloads all the tweets, processes them, and saves them to the CSV file. You can check out the dataset here on GitHub and here on Kaggle. I have also done a comprehensive data analysis, which you can find here. The Jupyter notebook I used to scrape the 30K+ tweets is available here.

How the script works

The script downloads tweets from every author whose Twitter handle is listed in the authors.txt file, one handle per line. It downloads direct tweets, retweets, and retweets with a comment. For a retweet, it stores the information of the original author (name, handle, tweet content, and creation time). Furthermore, every retweet with a comment contains <Q> and </Q> tags: the author's comment comes first, followed by the <Q> tag, then the content of the retweeted tweet, and finally the </Q> tag.
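To make the <Q> convention concrete, here is a minimal sketch of how a stored tweet could be split back into the author's comment and the retweeted content. The function name split_quote is hypothetical and not part of the repository, and the exact whitespace around the tags is an assumption.

```python
import re

# Assumption: a retweet with a comment is stored as
# "<comment> <Q> <retweeted content> </Q>", with the tags spelled exactly <Q> and </Q>.
QUOTE_PATTERN = re.compile(r"^(?P<comment>.*?)<Q>(?P<quoted>.*)</Q>\s*$", re.DOTALL)


def split_quote(text):
    """Return (comment, quoted) for a retweet with a comment, else (text, None)."""
    match = QUOTE_PATTERN.match(text)
    if match is None:
        # Direct tweet or plain retweet: there is no quoted part to extract.
        return text.strip(), None
    return match.group("comment").strip(), match.group("quoted").strip()


if __name__ == "__main__":
    print(split_quote("Worth a read <Q> New paper is out </Q>"))
    print(split_quote("Just a direct tweet"))
```

Tweets without the tags are returned unchanged with None as the second element, so the same helper can be applied to every row of the output.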

How to run it

  1. First, clone the repository.
git clone https://github.com/Hsankesara/Tweets-Scrapper.git
  2. Then install the Python dependencies.
cd Tweets-Scrapper
pip3 install -r requirements.txt
  3. Create a cred.json file as a copy of cred.json.sample.
cp cred.json.sample cred.json
  4. Get Twitter credentials. You can follow this to get your access tokens. Then update the cred.json file with the tokens you've received from Twitter.

  5. Write the Twitter handles of the accounts you want to scrape in authors.txt, one per line.

  6. Run the script.

python3 scrap.py authors.txt tweets.csv
  7. Wait for it! You'll soon get all the tweets in CSV format.
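Two small checks can be useful around the steps above: one on authors.txt before the run and one on the output file afterwards. The sketch below is not part of the repository; load_handles and summarize_csv are hypothetical helper names, and dropping a leading "@" is an assumption about how handles might be written. The CSV check deliberately avoids assuming any column names.

```python
import csv


def load_handles(path):
    """Read newline-separated Twitter handles, skipping blanks and duplicates."""
    handles = []
    with open(path, encoding="utf-8") as handle_file:
        for line in handle_file:
            # Assumption: a leading "@" is optional and can be dropped.
            handle = line.strip().lstrip("@")
            if handle and handle not in handles:
                handles.append(handle)
    return handles


def summarize_csv(path):
    """Return (header, row_count) for a CSV file without assuming column names."""
    with open(path, newline="", encoding="utf-8") as csv_file:
        reader = csv.reader(csv_file)
        header = next(reader, [])
        row_count = sum(1 for _ in reader)
    return header, row_count
```

For example, load_handles("authors.txt") lists the handles the script will be given, and summarize_csv("tweets.csv") reports the header row and the number of data rows once the run has finished.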
