Scrape and Ntfy

An extremely customizable web scraper with a modular notification system and persistent storage via SQLite.

Features

  • Modular notification system
    • Currently supports Webhooks (e.g. Discord, Slack, etc.) and ntfy.sh
  • Web scraping via Selenium
  • Simple configuration of multiple scrapers with conditional notifications
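
As a rough illustration of a conditional notification sent to ntfy.sh, the sketch below builds a publish request only when a scraped value has changed. It is not the project's actual code: the function names, the "value changed" condition, and the topic name are hypothetical. The only thing it relies on is ntfy's documented HTTP interface (a POST to the topic URL with the message as the body and an optional Title header).

```python
import urllib.request
from typing import Optional


def should_notify(previous: Optional[str], current: str) -> bool:
    """Hypothetical condition: notify only when a stored value exists and differs."""
    return previous is not None and previous != current


def build_ntfy_request(topic: str, message: str, title: Optional[str] = None,
                       server: str = "https://ntfy.sh") -> urllib.request.Request:
    """Build (but do not send) a POST request that publishes a message to an ntfy topic."""
    headers = {}
    if title is not None:
        headers["Title"] = title  # ntfy takes the notification title from this header
    return urllib.request.Request(
        url=f"{server.rstrip('/')}/{topic}",
        data=message.encode("utf-8"),
        headers=headers,
        method="POST",
    )


if __name__ == "__main__":
    old_value, new_value = "$10", "$8"  # illustrative scraped values
    if should_notify(old_value, new_value):
        request = build_ntfy_request(
            "my-example-topic",  # hypothetical topic name
            f"Value changed: {old_value} -> {new_value}",
            title="Scraper",
        )
        # urllib.request.urlopen(request)  # uncomment to actually send the notification
        print(request.get_method(), request.full_url)
```

Building the request separately from sending it keeps the condition and the payload easy to inspect without touching the network.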

Usage

Prerequisites

  • A browser
    • Most Chromium-based browsers and Firefox-based browsers should work
    • Edge is not recommended
    • Selenium should also be able to download and cache the appropriate browser if necessary

Basic Configuration

  • Configuration for the web scraper is handled through a TOML file
    • For an example configuration, see config.example.toml
    • This can be copied to config.toml and edited to suit your needs
    • To get the CSS selector for an element, you can use your browser's developer tools (F12, Ctrl+Shift+I, right-click -> Inspect Element, etc.)
      1. If the element picker is not already active, press Ctrl+Shift+C to enter inspect element mode (or click the inspect button in the developer tools)
      2. Click on the element you want to select
      3. Right-click on the element in the HTML pane
      4. Click "Copy" -> "Copy selector"
  • Some other configuration is handled through environment variables and/or command-line arguments (run with --help for more information)
    • For example, to set the path to the configuration file, you can set the PATH_TO_TOML environment variable or use the --path-to-toml command-line argument

Docker (Recommended)

Specific prerequisites

  • Docker
    • Docker is a platform for developing, shipping, and running applications in containers
  • Docker Compose

Installation and usage

  1. Clone the repository
    • git clone https://github.com/slashtechno/scrape-and-ntfy
  2. Change directory into the repository
    • cd scrape-and-ntfy
  3. Configure via config.toml
    • Optionally, you can configure some other options via environment variables or command-line arguments in the docker-compose.yml file
  4. Run docker compose up -d
    • The -d flag runs the containers in the background
    • Optionally, uncomment the sqlite-web lines in docker-compose.yml to browse the database at localhost:5050
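
If you would rather inspect the database from a script than through sqlite-web, a minimal sketch with Python's built-in sqlite3 module could look like the following. The database filename is a placeholder (check your configuration for the real path), and no particular table layout is assumed: the function simply lists whatever tables exist along with their row counts.

```python
import sqlite3
from contextlib import closing


def list_tables(db_path: str) -> dict:
    """Return a mapping of table name to row count for a SQLite database file."""
    with closing(sqlite3.connect(db_path)) as conn:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        # Table names come from sqlite_master, so quoting them is enough here
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }


if __name__ == "__main__":
    # "database.db" is a hypothetical filename used only for illustration
    for table, rows in list_tables("database.db").items():
        print(f"{table}: {rows} rows")
```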

pip

Specific prerequisites

  • Python (3.11+)

Installation and usage

  1. Install with pip
    • pip install scrape-and-ntfy
    • Depending on your system, you may need to use pip3 instead of pip, or invoke pip as a module with python3 -m pip or python -m pip
  2. Configure
  3. Run scrape-and-ntfy
    • This assumes pip-installed scripts are in your PATH
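
If the command is not found after installation, a quick check along these lines may help narrow down the cause. It is only a sketch: it verifies the Python version stated in the prerequisites and whether a script named scrape-and-ntfy is visible on your PATH, and the function name and messages are made up for this example.

```python
import shutil
import sys


def check_environment(script_name: str = "scrape-and-ntfy", minimum=(3, 11)) -> list:
    """Return a list of human-readable problems; an empty list means both checks passed."""
    problems = []
    if sys.version_info < minimum:
        required = ".".join(str(part) for part in minimum)
        problems.append(f"Python {required}+ is required, found {sys.version.split()[0]}")
    if shutil.which(script_name) is None:
        problems.append(f"'{script_name}' was not found on PATH")
    return problems


if __name__ == "__main__":
    issues = check_environment()
    print("\n".join(issues) if issues else "Environment looks OK")
```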

PDM

Specific prerequisites

  • Python (3.11+)
  • PDM

Installation and usage

  1. Clone the repository
    • git clone https://github.com/slashtechno/scrape-and-ntfy
  2. Change directory into the repository
    • cd scrape-and-ntfy
  3. Run pdm install
    • This will install the dependencies in a virtual environment
    • You may need to specify an interpreter with pdm use
  4. Configure
  5. Run pdm run python -m scrape_and_ntfy
    • This will run the scraper with the configuration in config.toml
