web_check's Introduction

web check

A script that warns you, by opening a new browser tab, when there is new content on your favourite websites.

What it does

When run, the script checks each stored website for changes. If any changes are found, it opens a new browser tab.

Not every website can be scraped.

How does it work?

After adding a url, the script saves a copy of the website's content to your hard drive. When run again, it compares the live website against the cached copy line by line, and if there are any differences, a new tab opens. Note: the script doesn't need to open a browser while running; you'll only see the terminal.
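The cache-and-compare step can be sketched roughly as follows. This is a minimal illustration, not the code in checker.py: the function name `has_changed` and the choice to treat the first download as a baseline are assumptions.

```python
from pathlib import Path


def has_changed(cached_file: Path, fresh_text: str) -> bool:
    """Return True when the fresh text differs from the cached copy.

    Hypothetical helper: the name and behaviour are illustrative only.
    """
    # Assumption: with no cached copy yet, the first download becomes
    # the baseline and is not reported as a change.
    if not cached_file.exists():
        cached_file.write_text(fresh_text, encoding="utf-8")
        return False

    cached_lines = cached_file.read_text(encoding="utf-8").splitlines()
    fresh_lines = fresh_text.splitlines()

    # A different number of lines is already a change.
    if len(cached_lines) != len(fresh_lines):
        return True

    # Otherwise compare the two versions line by line.
    return any(old != new for old, new in zip(cached_lines, fresh_lines))
```

A caller would pass the freshly downloaded text and, when the function returns True, open the url in a new tab and refresh the cached copy.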

A lot of websites have some kind of calendar, which means those websites change every day. To avoid false alarms, you can add a unique css selector to each url. With this selector, the script targets only a specific part of the website and avoids unnecessary calls to the browser.

If there is a change, a new backup file is created at storage/url_data/backup.

All urls are stored in a JSON file with all the needed information, including encoding.
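As an illustration only, one entry per url could be stored along these lines. The field names `css_selector` and `encoding` and the helper `add_url` are assumptions made for this sketch; the actual layout of the JSON file is not documented here.

```python
import json


def add_url(store: dict, url: str, css_selector: str = "", encoding: str = "utf-8") -> dict:
    """Add one url entry to an in-memory store (hypothetical layout)."""
    store[url] = {
        # None means "no selector": the whole page is compared.
        "css_selector": css_selector or None,
        # Assumed default; the real script looks the encoding up per website.
        "encoding": encoding,
    }
    return store


store = add_url({}, "https://github.com/")
store = add_url(store, "https://www.reddit.com/", "#SHORTCUT_FOCUSABLE_DIV")

# Round-trip through JSON text, as would happen when saving and reloading.
serialized = json.dumps(store, indent=2)
restored = json.loads(serialized)
```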

How to get the unique css selector

Go to the website and right-click the area you want the script to check. Open inspect mode. Hover your mouse over the elements until the highlight (usually blue) covers everything you want. Right-click and copy the selector. Paste it into the css field in Add url or Modify url.

Set up

Running the script

Once everything is installed, launch the script with web_check/main.pyw.

There are four tabs.

  • Home: the main tab. From here you can launch checker.py with the Run! button. checker.py is in charge of all the logic: it accesses each stored url and compares it with the live website.

  • Add url: From this tab, you can add a new url for checking, and its unique css selector.

    Important: urls have to start with http:// or https://. Hit Submit new url and the script will make all necessary checks.

There is a second option, Import file. Import file will let you select a .txt file with several urls, and all of them will be saved.

The txt file needs the following structure: url(white space)css selector.

A line with only a url means the script will download the whole website. Only one url per line.

https://github.com/

https://www.reddit.com/ #SHORTCUT_FOCUSABLE_DIV

https://postal.fsc.ccoo.es/Inicio #divMainContent
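A file in this format could be parsed with something like the sketch below. The function name `parse_import_lines` is hypothetical, and silently skipping lines that do not start with http:// or https:// is an assumption; the real script may report them instead.

```python
def parse_import_lines(lines):
    """Parse 'url<whitespace>css selector' lines into (url, selector) pairs."""
    entries = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines

        # Split on the first run of whitespace only, because a css
        # selector may itself contain spaces.
        parts = line.split(maxsplit=1)
        url = parts[0]
        selector = parts[1] if len(parts) == 2 else None

        # Assumption: invalid urls are skipped rather than raising an error.
        if not url.startswith(("http://", "https://")):
            continue

        entries.append((url, selector))
    return entries
```

A selector of None stands for "compare the whole website", matching the url-only lines above.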

  • Modify url: If you need to change a url's css selector, you can do it from here. Enter a new css selector, or leave it empty to capture the whole site, and hit submit.

  • Delete url: Two options for deleting. Check one, or several, urls and hit delete. Delete all will delete all urls stored.

From the Options menu, it's possible to reset url_list.txt if, for some reason, the file can't be read.

Automate the script

There is no need to run web_check/main.pyw every time you want to check your websites; only checker.py is required for that.

You can run checker.py manually whenever you want, but that's tedious and easy to forget: first you have to activate a virtual environment, and then run checker.py. With 'Create batch file' you only have to point to where python.exe is (the virtual environment one) and to a directory where the file will be created.

After that, it's easier to run web_check.bat directly, and even easier if you add that batch file to the Windows Task Scheduler.
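What such a batch file might contain can be sketched as below. This is not the script's own 'Create batch file' code: the function name `create_batch_file` and the two-line file body are assumptions. It relies on the fact that calling a virtual environment's python.exe directly runs the script inside that environment without activating it first.

```python
from pathlib import Path


def create_batch_file(python_exe: Path, checker: Path, target_dir: Path) -> Path:
    """Write web_check.bat into target_dir and return its path (hypothetical helper)."""
    batch_path = target_dir / "web_check.bat"
    lines = [
        "@echo off",
        # Quote both paths so directories containing spaces still work.
        f'"{python_exe}" "{checker}"',
    ]
    # newline="\r\n" makes every "\n" written come out as a Windows line ending.
    with batch_path.open("w", encoding="utf-8", newline="\r\n") as handle:
        handle.write("\n".join(lines) + "\n")
    return batch_path
```

The resulting file can then be selected as the program to start in a Task Scheduler task.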

Create shortcut

Create shortcut in the Options menu creates a batch file with all the information about the script itself and the virtual environment. It lets you run main.pyw with a double click.

Now you don't need to activate a venv each time; web_check.bat will take care of it.

What's new in your favourite websites

Inside the logs folder there are two files. whats_new.txt lists all the changes in your favourite websites. Each url starts with a hyphen for readability.

If the script is run from main.pyw, there is no need to check this file every time. The script will show those changes in a new window.

Log file

Every time the script is run, it writes a log file. The file's previous content is cleared automatically for easier reading. Any error or info message is written here.

The log is located at storage/logs/log.txt.
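A log that starts empty on every run can be obtained with Python's standard logging module by opening the file in write mode. The sketch below is one way to do it, not necessarily how the script does; the logger name `web_check_sketch` and the message format are assumptions.

```python
import logging
from pathlib import Path


def make_logger(log_file: Path) -> logging.Logger:
    """Return a logger whose file is truncated each time this is called."""
    log_file.parent.mkdir(parents=True, exist_ok=True)

    logger = logging.getLogger("web_check_sketch")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False

    # Drop handlers left over from an earlier call so lines are not duplicated.
    for old_handler in list(logger.handlers):
        logger.removeHandler(old_handler)
        old_handler.close()

    # mode="w" truncates the file, which clears the previous run's content.
    handler = logging.FileHandler(log_file, mode="w", encoding="utf-8")
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger
```

Calling `make_logger(Path("storage/logs/log.txt"))` at the start of a run and then using `logger.info(...)` or `logger.error(...)` would leave only the latest run in the file.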

Copyright (C) 2021 Jaime Álvarez Fernández

web_check's People

Contributors

jaime-alv

Forkers

bangil0

web_check's Issues

Reformat add_url.py

  • Add all code from setup.py.
  • Integrate functions in cascade menu.
  • Delete old code from older CLI versions.
  • Clean sentinel.

Cascade menu

Cascade menu with:

  • create folder
  • reset url_list.txt
  • info/about
  • batch file

Clean main.py

main.py needs to be cleaned up with a class; that should make it easier to pass arguments, and whatever else is needed, into the different functions. It will also improve readability.

Clear entries

Find a way for clearing entries once you submit anything

Refresh tabs

Tabs are completely frozen and don't refresh unless you restart the script.

Delete url

Delete url from JSON file.
Maybe set the dict and use numbers.

Add several urls at once

Read a text file with all the urls and css and add all to url_list.txt instead of adding one by one.

Back up folder

Back up folder, easier to compare saved versions of websites

GUI

Develop a Graphical User Interface for add_url.py

modify_url

Right now, you can't modify the css selector for a url; the only way to do it is deleting the url and adding it again with the new css.

Combine files

Combine add_url, delete_url, modify_url into one file.
Use class for organising file.

Pass dir argument

main.py uses the global directory when calling setup.py. The created directory is one level higher than it should be.

Encoding format

Look up encoding in each website and store it as a new variable in the JSON file

Hashed file url_list.txt

It could be interesting to encrypt the files and protect them from unwanted viewing.
With an encrypted file, it's possible to use a user and password.
