
pyscrappy's Introduction



PyScrappy: powerful Python data scraping toolkit


What is it?

PyScrappy is a Python package that provides a fast, flexible, and exhaustive way to scrape data from a variety of sources. Easy and intuitive to use, it aims to be the fundamental high-level building block for scraping data in Python. It also has the broader goal of becoming the most powerful and flexible open source data scraping tool available.

Main Features

Here are just a few of the things that PyScrappy does well:

  • Easy scraping of data available on the internet.
  • Returns a DataFrame for further analysis and research purposes.
  • Automatic data scraping: other than a few user input parameters, the whole process of scraping the data is automatic.
  • Powerful and flexible.
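
To make the "scrape, then return tabular data" idea concrete, here is a minimal sketch of the general technique. It does not call any PyScrappy function, and the names `TableRowParser`, `html_table_to_records`, and `SAMPLE_HTML` are hypothetical, introduced only for illustration. It uses the standard library's `html.parser` so that it runs without extra installs; PyScrappy itself lists beautifulsoup4 and pandas as dependencies.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # completed rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside an open cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None  # discard any cell left open by malformed HTML


def html_table_to_records(html):
    """Return the table as a list of dicts keyed by the header row."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Hypothetical sample markup standing in for a downloaded page.
SAMPLE_HTML = """
<table>
  <tr><th>title</th><th>price</th></tr>
  <tr><td>Widget</td><td>9.99</td></tr>
  <tr><td>Gadget</td><td>24.50</td></tr>
</table>
"""

records = html_table_to_records(SAMPLE_HTML)
```

If pandas is installed, passing `records` to `pandas.DataFrame(records)` would produce a DataFrame with one column per header cell.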

Where to get it

The source code is currently hosted on GitHub at: https://github.com/mldsveda/PyScrappy

Binary installers for the latest released version are available at the Python Package Index (PyPI).

pip install PyScrappy

Dependencies

  • selenium - Selenium is a free (open-source) automated testing framework used to validate web applications across different browsers and platforms.
  • webdriver-manager - WebDriverManager automates the handling of the driver executables (such as chromedriver.exe and geckodriver.exe) required by the Selenium WebDriver API.
  • beautifulsoup4 - Beautiful Soup is a Python library for getting data out of HTML, XML, and other markup languages.
  • pandas - Pandas is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool, built on top of the Python programming language.
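
One way to check whether the package and its dependencies are present in the current environment is sketched below. It assumes Python 3.8 or later (for `importlib.metadata`) and assumes the PyPI distribution names shown in `REQUIRED`; the helper `installed_versions` is a hypothetical name, not part of PyScrappy.

```python
from importlib import metadata

# Assumed PyPI distribution names; "PyScrappy" is taken from the pip command.
REQUIRED = ["PyScrappy", "selenium", "webdriver-manager", "beautifulsoup4", "pandas"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name}: {version or 'not installed'}")
```

Any entry reported as "not installed" can then be added with pip.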

License

MIT

Getting Help

For usage questions, the best place to go is Stack Overflow. General questions and discussions can also take place on GitHub in this repository.

Discussion and Development

Most development discussions take place on GitHub in this repository.

Also visit the official documentation of PyScrappy for more information.

Contributing to PyScrappy

All contributions, bug reports, bug fixes, documentation improvements, enhancements, and ideas are welcome.

If you are simply looking to start working with the PyScrappy codebase, navigate to the GitHub "issues" tab and start looking through interesting issues.

End Notes

Learn more about this package on Medium.

This package is solely made for educational and research purposes.

pyscrappy's People

Contributors

mldsveda, vedant950, vedaant2000
