facebook_scrape_live_gaming_page's Introduction

Task

Scrape the Facebook live gaming page, extract the users who are currently live, and push each one to a Faktory worker. The worker parses the user's details (name, uid, username, number of followers, number of likes, and contact details such as email and social links) and their posts (post id, text, datetime, hashtags, links, images), then saves everything in MongoDB.
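As a rough illustration of the two record types described above, the sketch below models a user and a post as dataclasses and pulls hashtags and links out of the post text. The class names, field names, the `build_post` helper, and the regular expressions are all assumptions made for this example; the project's actual parsing code may be organized differently.

```python
import re
from dataclasses import asdict, dataclass, field
from datetime import datetime
from typing import List, Optional

# Hypothetical patterns: a simple approximation, not the project's real parser.
HASHTAG_RE = re.compile(r"#(\w+)")
LINK_RE = re.compile(r"https?://\S+")


@dataclass
class UserDetail:
    """One document for the user_details collection (field names assumed)."""
    uid: str
    name: str
    username: str
    followers: int
    likes: int
    email: Optional[str] = None
    social_links: List[str] = field(default_factory=list)


@dataclass
class Post:
    """One document for the posts collection (field names assumed)."""
    post_id: str
    uid: str  # assumption: each post stores its author's uid
    text: str
    posted_at: datetime
    hashtags: List[str]
    links: List[str]
    images: List[str]


def build_post(post_id, uid, text, posted_at, images=()):
    """Derive hashtags and links from the raw post text."""
    return Post(
        post_id=post_id,
        uid=uid,
        text=text,
        posted_at=posted_at,
        hashtags=HASHTAG_RE.findall(text),
        links=LINK_RE.findall(text),
        images=list(images),
    )
```

Calling `asdict(post)` on either dataclass gives a plain dict that a MongoDB driver can store as a document.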

Setup

  • Install the Faktory server.
  • Install MongoDB and create a database named aggero_fb with two collections, user_details and posts.
  • Run pip install -r requirements.txt.
  • Change the password in URL_FACTORY in the utils.py file.
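The database step can also be scripted. The following is a minimal sketch, assuming the pymongo driver is installed and MongoDB is listening on the default local address; the `ensure_collections` and `connect_and_setup` helpers and the connection URI are hypothetical and not part of this repository.

```python
DB_NAME = "aggero_fb"
COLLECTIONS = ("user_details", "posts")


def ensure_collections(db):
    """Create any missing collection and return the names that were created.

    `db` is expected to behave like a pymongo Database object.
    """
    existing = set(db.list_collection_names())
    created = []
    for name in COLLECTIONS:
        if name not in existing:
            db.create_collection(name)
            created.append(name)
    return created


def connect_and_setup(uri="mongodb://localhost:27017"):
    """Connect to MongoDB (the default URI is an assumption) and set up the database."""
    from pymongo import MongoClient  # imported lazily so the helper above has no dependency

    client = MongoClient(uri)
    return ensure_collections(client[DB_NAME])
```

Running `connect_and_setup()` once after installing MongoDB should leave both collections in place; running it again creates nothing.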

Run

  • Navigate to the main package.

  • First, run python3 consumer.py.

  • Then run python3 producer.py -nup 10 -nps 3. Both -nup and -nps are required.

    • -nup - Number of live users to parse
      • Choices: an int value greater than 0, or the str value all
    • -nps - Number of scrolls to perform while parsing a user's posts. Each scroll yields about 18 posts.
      • Choices: an int value greater than 0, or the str value all
    • -nup 10 -nps 3 works well for testing; all may be used in production.
  • Scraping should now be running.
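To make the two required flags concrete, here is a small argparse sketch that accepts either a positive integer or the string all for -nup and -nps, and estimates how many posts a run would collect using the figure of roughly 18 posts per scroll. It is an illustration only: the function names are hypothetical, and producer.py may validate its arguments differently.

```python
import argparse

POSTS_PER_SCROLL = 18  # approximate figure quoted in this README


def positive_int_or_all(value):
    """Accept the literal string 'all' or an integer greater than 0."""
    if value == "all":
        return "all"
    try:
        number = int(value)
    except ValueError:
        raise argparse.ArgumentTypeError(
            f"expected a positive integer or 'all', got {value!r}"
        ) from None
    if number <= 0:
        raise argparse.ArgumentTypeError("value must be greater than 0")
    return number


def build_parser():
    parser = argparse.ArgumentParser(description="Queue live users for scraping")
    parser.add_argument("-nup", required=True, type=positive_int_or_all,
                        help="number of live users to parse, or 'all'")
    parser.add_argument("-nps", required=True, type=positive_int_or_all,
                        help="number of scrolls per user, or 'all'")
    return parser


def estimated_posts(nup, nps):
    """Rough upper estimate of posts collected; None when either value is 'all'."""
    if nup == "all" or nps == "all":
        return None
    return nup * nps * POSTS_PER_SCROLL


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.nup, args.nps, estimated_posts(args.nup, args.nps))
```

With -nup 10 -nps 3, the estimate is 10 × 3 × 18 = 540 posts, which is why those values suit a quick test run.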

Todo

  • Configure Tor as a proxy.
  • Build an error database and send a daily error report email to the admin.
  • By default, one worker process runs; you can change this in consumer.py, for example to match the number of cores on your machine.
  • Create indexes for the MongoDB collections.
  • Write tests.
