
content.log Tool

A reference implementation for processing the content.log files found at opendata.dwd.de/weather.

Example usage:

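CONTENT_LOG_URL="https://opendata.dwd.de/weather/nwp/content.log.bz2"
PATTERN="/icon-d2/grib/03/t_2m/.*_icosahedral_"
# Midnight UTC of the current day in ISO 8601 with hour precision (requires GNU date)
LAST_RUN_AT=$(date -ud 00:00 -Ihours)

# Fetch the compressed index and keep only the entries matching PATTERN
wget "$CONTENT_LOG_URL" -O content.log.bz2
bzgrep "$PATTERN" content.log.bz2 > my_content.log
# List the URLs of files modified since LAST_RUN_AT, then download them
./get_updated_files.py -b "$CONTENT_LOG_URL" -u "$LAST_RUN_AT" my_content.log > updated_files.txt
wget -i updated_files.txt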

Running the commands above downloads all updated files into the current working directory. The generated file updated_files.txt contains the URLs of all files that have been updated since the given date-time, judged by each file's modification date as recorded in content.log.
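The filtering step can be sketched in a few lines of Python. This is a minimal illustration, not the code of get_updated_files.py: it assumes each content.log line has the form `path|size|YYYY-MM-DD HH:MM:SS` with timestamps in UTC, and the helper names `parse_line` and `updated_paths` as well as the sample entries are made up for this example.

```python
from datetime import datetime, timezone


def parse_line(line):
    """Split one content.log line into (path, size, modified_at).

    Assumed line format: path|size|YYYY-MM-DD HH:MM:SS (UTC).
    """
    path, size, modified = line.rstrip("\n").split("|")
    modified_at = datetime.strptime(modified, "%Y-%m-%d %H:%M:%S")
    return path, int(size), modified_at.replace(tzinfo=timezone.utc)


def updated_paths(lines, updated_since):
    """Yield the paths whose modification time is later than updated_since."""
    for line in lines:
        if not line.strip():
            continue  # skip empty lines
        path, _size, modified_at = parse_line(line)
        if modified_at > updated_since:
            yield path


# Hypothetical sample entries, for illustration only.
sample = [
    "./icon-d2/grib/03/t_2m/a.grib2.bz2|1024|2024-01-01 02:30:00\n",
    "./icon-d2/grib/03/t_2m/b.grib2.bz2|2048|2024-01-01 04:15:00\n",
]
since = datetime(2024, 1, 1, 3, 0, tzinfo=timezone.utc)
print(list(updated_paths(sample, since)))
# ['./icon-d2/grib/03/t_2m/b.grib2.bz2']
```

Only the second entry is newer than the cut-off, so only its path is reported.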

Note that there are multiple servers behind https://opendata.dwd.de, and they might not be exactly in sync with each other regarding file modification timestamps. See the code of get_updated_files.py for a suggestion on how to deal with that.
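One way to tolerate small timestamp differences between servers is to require a minimum gap before treating a file as updated. The sketch below is one reading of the `--min-delta` help text, not necessarily what get_updated_files.py does; the function name `is_updated` and the sample timestamps are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def is_updated(modified_at, updated_since, min_delta_seconds=60):
    """Return True only if the file is at least min_delta_seconds
    newer than updated_since, so that small clock or sync differences
    between servers do not count as an update."""
    return modified_at - updated_since >= timedelta(seconds=min_delta_seconds)


last_run = datetime(2024, 1, 1, 3, 0, 0, tzinfo=timezone.utc)
# Same file seen on another server with a slightly later timestamp (hypothetical).
slightly_off = datetime(2024, 1, 1, 3, 0, 20, tzinfo=timezone.utc)
# A file that was really rewritten an hour later (hypothetical).
really_new = datetime(2024, 1, 1, 4, 0, 0, tzinfo=timezone.utc)

print(is_updated(slightly_off, last_run))  # False
print(is_updated(really_new, last_run))    # True
```

The trade-off is that a file rewritten less than `min_delta_seconds` after the last run would be missed by that check.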

While this program relies on the file modification timestamps recorded in content.log.bz2, it may be more practical to evaluate the data reference time contained in the filenames instead.
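A minimal sketch of that alternative follows. It assumes the reference time appears in the filename as a ten-digit `YYYYMMDDHH` field delimited by underscores; both this convention and the sample filename are assumptions that should be checked against the actual entries in content.log.

```python
import re
from datetime import datetime, timezone

# Assumed convention: reference time encoded as _YYYYMMDDHH_ in the filename.
REFERENCE_TIME = re.compile(r"_(\d{10})_")


def reference_time(path):
    """Return the reference time found in path as a UTC datetime, or None."""
    match = REFERENCE_TIME.search(path)
    if match is None:
        return None
    parsed = datetime.strptime(match.group(1), "%Y%m%d%H")
    return parsed.replace(tzinfo=timezone.utc)


# Hypothetical filename, for illustration only.
name = (
    "./icon-d2/grib/03/t_2m/"
    "icon-d2_germany_icosahedral_single-level_2024010103_000_2d_t_2m.grib2.bz2"
)
print(reference_time(name))
# 2024-01-01 03:00:00+00:00
```

Comparing reference times instead of modification timestamps avoids depending on how closely the servers are synchronized.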

$ ./get_updated_files.py --help
usage: get_updated_files.py [-h] --updated-since UPDATED_SINCE [--url-base URL_BASE]
                            [--min-delta MIN_DELTA] [--version]
                            [CONTENT_LOG_FILE [CONTENT_LOG_FILE ...]]

Filters paths of a DWD Open Data content.log file for entries that have been updated.

positional arguments:
  CONTENT_LOG_FILE      The decompressed content.log file (default: STDIN)

optional arguments:
  -h, --help            show this help message and exit
  --updated-since UPDATED_SINCE, -u UPDATED_SINCE
                        last time files were checked for updates
  --url-base URL_BASE, -b URL_BASE
                        resolve the paths taken from content.log relative to the given
                        base URL; put the URL of the content.log.bz2 here to end up with
                        correct hyperlinks to DWD's Open Data
  --min-delta MIN_DELTA, -d MIN_DELTA
                        minimum number of seconds a file needs to be younger than
                        UPDATED_SINCE (default: 60)
  --version             show program's version number and exit
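The effect of `--url-base` can be illustrated with Python's standard `urllib.parse.urljoin`: a relative path resolved against the URL of content.log.bz2 ends up next to that file on the server. Whether get_updated_files.py uses this exact call is an assumption, and the sample path is hypothetical.

```python
from urllib.parse import urljoin

base = "https://opendata.dwd.de/weather/nwp/content.log.bz2"
# Hypothetical relative path as it might appear in content.log.
path = "./icon-d2/grib/03/t_2m/example.grib2.bz2"

# The last segment of the base URL (content.log.bz2) is replaced by the path.
print(urljoin(base, path))
# https://opendata.dwd.de/weather/nwp/icon-d2/grib/03/t_2m/example.grib2.bz2
```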

Contributors

bjoern-reetz
