fangspider's Introduction

ScrapydWeb: A full-featured web UI for Scrapyd cluster management, with Scrapy log analysis & visualization supported.

Recommended Reading

How to efficiently manage your distributed web scraping projects

Features

  • Scrapyd Cluster Management

    • Group, filter and select any number of nodes
    • Execute commands on multiple nodes with just a few clicks
  • Scrapy Log Analysis

    • Stats collection
    • Progress visualization
    • Log categorization
  • All Scrapyd JSON API Supported

    • Deploy project, Run Spider, Stop job
    • List projects/versions/spiders/running_jobs
    • Delete version/project
  • Enhancements

    • Basic auth for web UI
    • HTML caching for the Log and Stats page
    • Auto eggify your projects
    • Email notice
    • Mobile UI
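
The Scrapyd JSON API operations listed above each correspond to an HTTP endpoint on a Scrapyd node. As a rough sketch of that mapping (the helper `scrapyd_request` and the example host and project names are illustrative and not part of ScrapydWeb), the requests could be assembled like this:

```python
from urllib.parse import urlencode

# Scrapyd JSON API endpoints for the operations listed above.
# GET endpoints take query-string parameters; POST endpoints take form data.
ENDPOINTS = {
    "deploy": ("POST", "addversion.json"),
    "run": ("POST", "schedule.json"),
    "stop": ("POST", "cancel.json"),
    "list_projects": ("GET", "listprojects.json"),
    "list_versions": ("GET", "listversions.json"),
    "list_spiders": ("GET", "listspiders.json"),
    "list_jobs": ("GET", "listjobs.json"),
    "delete_version": ("POST", "delversion.json"),
    "delete_project": ("POST", "delproject.json"),
}


def scrapyd_request(server, operation, **params):
    """Return (method, url, form_data) for one Scrapyd API call.

    `server` is a "host:port" string. Hypothetical helper for illustration.
    """
    method, endpoint = ENDPOINTS[operation]
    url = f"http://{server}/{endpoint}"
    if method == "GET":
        if params:
            url += "?" + urlencode(params)
        return method, url, None
    return method, url, params


if __name__ == "__main__":
    print(scrapyd_request("127.0.0.1:6800", "list_spiders", project="demo"))
    print(scrapyd_request("127.0.0.1:6800", "run", project="demo", spider="example"))
```

The function only builds the request description; sending it with an HTTP client of your choice is left out so the sketch stays free of network calls.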

Getting Started

Prerequisites

Make sure that Scrapyd has been installed and started on all of your hosts.

Note that if you want to access your Scrapyd server remotely, you have to set bind_address = 0.0.0.0 in the Scrapyd configuration file and restart Scrapyd to make it reachable from other hosts.
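
For illustration, the setting belongs in the `[scrapyd]` section of Scrapyd's configuration file. The sketch below renders that fragment with Python's `configparser`. The function name `render_scrapyd_conf` is made up for this example, 6800 is Scrapyd's default port, and where the configuration file lives depends on your installation.

```python
import configparser
import io


def render_scrapyd_conf(bind_address="0.0.0.0", http_port=6800):
    """Return the text of a minimal [scrapyd] config section.

    Hypothetical helper: it only produces text and does not touch any file.
    """
    config = configparser.ConfigParser()
    config["scrapyd"] = {
        "bind_address": bind_address,
        "http_port": str(http_port),
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    # Merge the printed section into your Scrapyd config, then restart Scrapyd.
    print(render_scrapyd_conf())
```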

Installing ScrapydWeb

  • use pip:
pip install scrapydweb
  • use git:
git clone https://github.com/my8100/scrapydweb.git
cd scrapydweb
python setup.py install

Starting ScrapydWeb

  1. Start ScrapydWeb via the scrapydweb command. (You will be asked to add your SCRAPYD_SERVERS to the generated config file on first startup.)
  2. Visit http://127.0.0.1:5000. (It's recommended to use Google Chrome to get the best experience.)
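
As an illustration of step 1, a `SCRAPYD_SERVERS` entry in the generated config file might look like the sketch below. The string and tuple forms shown are an assumption based on the project's default settings, and the hosts, credentials, and group names are placeholders, so check the comments in your generated file for the formats your version accepts. The `parse_server` helper is hypothetical and only shows how the string form breaks down.

```python
# Placeholder entries; replace them with your own Scrapyd hosts.
SCRAPYD_SERVERS = [
    "127.0.0.1:6800",                          # host:port
    "username:password@localhost:6801#group",  # basic auth plus a group label
    ("username", "password", "localhost", "6801", "group"),  # tuple form
]


def parse_server(entry):
    """Split one entry into (username, password, host, port, group).

    Hypothetical helper for illustration; missing parts come back as "".
    """
    if isinstance(entry, tuple):
        return entry
    rest, _, group = entry.partition("#")
    auth, _, address = rest.rpartition("@")
    username, _, password = auth.partition(":")
    host, _, port = address.partition(":")
    return (username, password, host, port, group)


if __name__ == "__main__":
    for server in SCRAPYD_SERVERS:
        print(parse_server(server))
```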

Browser Support

The latest versions of Google Chrome, Firefox, and Safari.

Running the tests

$ git clone https://github.com/my8100/scrapydweb.git
$ cd scrapydweb

# To create isolated Python environments
$ pip install virtualenv
$ virtualenv venv/scrapydweb
# Or specify your Python interpreter: $ virtualenv -p /usr/local/bin/python3.7 venv/scrapydweb
$ source venv/scrapydweb/bin/activate

# Install dependent libraries
(scrapydweb) $ python setup.py install
(scrapydweb) $ pip install pytest
(scrapydweb) $ pip install coverage

# Make sure Scrapyd has been installed and started, then update the custom_settings item in tests/conftest.py
(scrapydweb) $ vi tests/conftest.py
(scrapydweb) $ curl http://127.0.0.1:6800

(scrapydweb) $ coverage run --source=scrapydweb -m pytest tests/test_a_factory.py -s -vv
(scrapydweb) $ coverage run --source=scrapydweb -m pytest tests -s -vv
(scrapydweb) $ coverage report
# To create an HTML report, check out htmlcov/index.html
(scrapydweb) $ coverage html

Changelog

Detailed changes for each release are documented in HISTORY.md.

License

This project is licensed under the GNU General Public License v3.0 - see the LICENSE file for details.
