
gribscan's Introduction


gribscan

Tools to scan GRIB files and create zarr-compatible indices.

warning

This repository is still experimental. The code is not yet tested for many kinds of files. It will likely not destroy your files, as it only accesses GRIB files in read-mode, but it may skip some information or may crash. Please file an issue if you discover something is missing.

installing

gribscan is on PyPI. You can install the latest released version using

python -m pip install gribscan

If you are interested in the latest development version, please clone the repository and install the package in development mode:

python -m pip install -e <path to your clone>

docs

The latest documentation can be found online or may be built using sphinx in your local clone:

pip install -e .[docs]
cd docs
make html

Afterwards, the documentation is available at docs/build/html/index.html.

gribscan's Issues

archive on Zenodo?

Do we want to archive gribscan on Zenodo? - We'd be able to get a DOI for this project then, but do we want this?

If we do, we could do this similar to eurec4a/eurec4a-intake#147, where the details for filling out the required forms are stored in .zenodo.json.

Currently, the contributors are @lkluft, @trackow and @d70-t. In case we do want to publish this on Zenodo, everyone would have to fill out their information in the creators object. See docs here.
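
As a rough sketch of what the creators object could look like, the snippet below prints a minimal .zenodo.json skeleton. Every value is a placeholder to be replaced by the contributors themselves, and a real file would likely carry further metadata fields (title, license and so on) that are left out here.

```python
import json

# Placeholder entries only: each contributor would replace these values
# with their own details. Nothing here is real metadata.
creators = [
    {
        "name": "<Family name, Given name>",
        "affiliation": "<Institution>",
        "orcid": "<0000-0000-0000-0000>",
    }
    for _ in range(3)  # three contributors are listed in this issue
]

# Minimal skeleton; other Zenodo metadata fields are intentionally omitted.
zenodo_json = json.dumps({"creators": creators}, indent=2)
print(zenodo_json)
```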

Test gribscan on latest FDB/eccodes grib output (including Healpix)

I leave this as an issue here since I could not find time yet to test this extensively. Wondering whether #26 also just works with our final IFS-FESOM output for the March hackathon. This includes Healpix for IFS and also for FESOM (new o2d and o3d .data files).

Example data for 30 November 2026. For several reasons this had to use ecCodes 2.32.5, by the way, and I am wondering whether this will lead to issues when trying to load in python with older versions.

Healpix grid:

/work/bm1235/b382776/cycle4/fdb/healpix/root/d1:climate-dt:ScenarioMIP:SSP3-7.0:1:IFS-FESOM:1:hz9o:clte:20261130/

Regular 0.25 grid:

/work/bm1235/b382776/cycle4/fdb/latlon/root/d1:climate-dt:ScenarioMIP:SSP3-7.0:1:IFS-FESOM:1:hz9o:clte:20261130/

Native resolution grid:

/work/bm1235/b382776/cycle4/fdb/native/root/d1:climate-dt:ScenarioMIP:SSP3-7.0:1:IFS-FESOM:1:hz9o:clte:20261130/

Add contribution guideline

Add a contribution guideline that among other things should cover:

  • general ways to contribute to gribscan
  • a request for first-time contributors to add themselves to the CITATION.cff

Failing on ECMWF ensemble

I was just giving this a try.

I have 3 files coming from the ECMWF ensemble with all the members inside.

Screen Shot 2022-09-02 at 09 16 32

I just tried to run

gribscan-index *.grib2 -n 8
gribscan-build *.index -o dataset.json --prefix /home/ekman/ssd/guido/ecmwf-ens/

But received

Traceback (most recent call last):
  File "/home/ekman/miniconda3/envs/models/bin/gribscan-build", line 8, in <module>
    sys.exit(build_dataset())
  File "/home/ekman/miniconda3/envs/models/lib/python3.10/site-packages/gribscan/tools.py", line 29, in build_dataset
    refs = gribscan.grib_magic(args.indices, global_prefix=args.prefix)
  File "/home/ekman/miniconda3/envs/models/lib/python3.10/site-packages/gribscan/gribscan.py", line 424, in grib_magic
    global_attrs, coords, varinfo = inspect_grib_indices(messages, magician)
  File "/home/ekman/miniconda3/envs/models/lib/python3.10/site-packages/gribscan/gribscan.py", line 323, in inspect_grib_indices
    dims, dim_id, shape = map(tuple, zip(*((dim, i, len(coords))
ValueError: not enough values to unpack (expected 3, got 0)

I don't know whether it's because of the files dimensions or whether I'm doing something wrong.
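
The final ValueError is what Python raises when the right-hand side of a three-way unpacking turns out to be empty. The sketch below reproduces the same message in plain Python, without gribscan. The helper name unpack_dims and the sample data are hypothetical. It only illustrates the failure mode (no dimension triples reaching the zip(*...) call) and does not explain why gribscan found no dimensions in these files.

```python
def unpack_dims(dim_info):
    """Split (name, index, size) triples into three parallel tuples.

    Mirrors the unpacking pattern visible in the traceback; the function
    name and the sample data are made up for illustration.
    """
    dims, dim_id, shape = map(tuple, zip(*dim_info))
    return dims, dim_id, shape


# With at least one triple, the unpacking works as intended.
print(unpack_dims([("time", 0, 4), ("level", 1, 10)]))

# With no triples, zip(*[]) yields nothing and the unpacking fails.
try:
    unpack_dims([])
except ValueError as err:
    print(f"ValueError: {err}")
```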

Merge fixes for IFS output

During the preparation of the NextGEMS hackathon, several fixes for IFS output were tested that have not yet found their way back into gribscan. Some of these fixes seem to be useful for other use cases as well.

  • Support for regular_ll (implemented consistently with #22)
  • Do not fail for unknown GRIB definitions

Clean-up registration of GRIB codec

transferred issue from gitlab

Currently, the RawGribCodec is registered both through numcodecs.register_codec() and through the recently implemented entry points.

After the next numcodecs release, we can remove the functional approach and use only the entry points.
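
To check which codecs are currently advertised through entry points in a given environment, something like the following sketch may help. It uses only the standard library. The group name numcodecs.codecs is an assumption about where numcodecs looks for third-party codecs, and the helper name codec_entry_points is made up for this example.

```python
import sys
from importlib.metadata import entry_points

# Assumption: numcodecs discovers third-party codecs in this group.
CODEC_GROUP = "numcodecs.codecs"


def codec_entry_points(group=CODEC_GROUP):
    """Return the sorted names of all installed entry points in a group."""
    if sys.version_info >= (3, 10):
        eps = entry_points(group=group)
    else:
        # Older Python versions return a dict keyed by group name.
        eps = entry_points().get(group, [])
    return sorted(ep.name for ep in eps)


# Lists whatever codecs the installed packages advertise (possibly none).
print(codec_entry_points())
```

If the codec shows up in this list, the explicit register_codec() call should in principle be redundant once numcodecs picks up entry points on its own.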

Automate PyPI releases

transferred issue from gitlab

This is a proposal to add a Gitlab CI that automates PyPI releases. Of course, the CI could/should also cover basic installation (and eventually tests) for normal commits.

multiple `stepRange` lead to wrong decoding time

Hi there,

While analyzing the most recent data from IFS/FESOM, we stumbled upon one variable that is not decoded as expected (or at least not as I expect) by gribscan.

We have an experiment with 6-hourly output from IFS, currently available on Levante (I can provide the path). One variable, litota1 (averaged total lightning flash density in the last hour), is decoded incorrectly by gribscan: in the JSON (and, I imagine, in the original index file) its time steps fall at hours 5, 11, 17 and 23 instead of 6, 12, 18 and 24 as for all other variables.

This does NOT happen if we open the GRIB file directly using cfgrib. Further digging into the file shows that the inconsistency arises because the original GRIB message has a stepRange that is "multiple", i.e. it covers two time steps instead of one (no clue why this happens for this specific variable). For some reason, gribscan takes the first one, while the real time is of course the second. See the attached screenshot; litota1 is the last one.

Schermata 2023-04-03 alle 14 55 17

Do you think this is expected behaviour of gribscan or something that should be fixed? It leads to an irregular time axis in the final xarray dataset, with times 0, 5, 6, 11, 12, 17, 18, 23, etc., where every other step is NaN. We can certainly filter this a posteriori, but perhaps it can be tackled at the source.

Many thanks, please let me know if you need further details
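
To make the expected behaviour concrete, here is a small sketch of how a valid time could be derived from the end of a step range rather than its start. This is not gribscan's implementation. It assumes steps are given in hours and that a range is written as "start-end". The function names and the reference date are made up.

```python
from datetime import datetime, timedelta


def end_step_hours(step_range):
    """Return the last step of a stepRange string such as "5-6" or "6"."""
    return int(str(step_range).split("-")[-1])


def valid_time(reference_time, step_range):
    """Reference time plus the end of the step range (assumed in hours)."""
    return reference_time + timedelta(hours=end_step_hours(step_range))


ref = datetime(2020, 1, 20)  # made-up reference time
for step_range in ["6", "5-6", "11-12"]:
    print(step_range, valid_time(ref, step_range))
```

With this convention, a message with stepRange "5-6" lands on the same time step as a message with step "6", which would keep the time axis regular.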

Write_index fails b/c of `grib_handle` issue if cfgrib version below `cfgrib-0.9.10.1`

Issue

Any attempt to read a GRIB1 or GRIB2 file with write_index returns the following error:

gs.write_index("rtma_2011022404.grib2")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.8/dist-packages/gribscan/gribscan.py", line 275, in write_index
    for record in gen:
  File "/usr/local/lib/python3.8/dist-packages/gribscan/gribscan.py", line 232, in scan_gribfile
    global_attrs = {k: m[k] for k in cfgrib.dataset.GLOBAL_ATTRIBUTES_KEYS}
  File "/usr/local/lib/python3.8/dist-packages/gribscan/gribscan.py", line 232, in <dictcomp>
    global_attrs = {k: m[k] for k in cfgrib.dataset.GLOBAL_ATTRIBUTES_KEYS}
  File "/usr/local/lib/python3.8/dist-packages/cfgrib/messages.py", line 209, in __getitem__
    return super(ComputedKeysMessage, self).__getitem__(item)
  File "/usr/local/lib/python3.8/dist-packages/cfgrib/messages.py", line 161, in __getitem__
    return self.message_get(item)
  File "/usr/local/lib/python3.8/dist-packages/cfgrib/messages.py", line 123, in message_get
    values = eccodes.codes_get_array(self.codes_id, item, key_type)
  File "/usr/local/lib/python3.8/dist-packages/cfgrib/bindings.py", line 366, in codes_get_array
    key_type = codes_get_native_type(handle, key)
  File "/usr/local/lib/python3.8/dist-packages/cfgrib/bindings.py", line 359, in codes_get_native_type
    _codes_get_native_type(handle, key.encode(ENC), grib_type)
  File "/usr/local/lib/python3.8/dist-packages/cfgrib/bindings.py", line 163, in wrapper
    code = func(*args)
TypeError: initializer for ctype 'grib_handle *' must be a cdata pointer, not int

Inputs

GRIB1 files from ERA5 Land (temp and precipitation) and GRIB2 files from RTMA (https://www.nco.ncep.noaa.gov/pmb/products/rtma/)

These are saved to here for reference

Setup

Ubuntu 20.04.3 LTS
Python 3.8.3
Eccodes 2.26
cfgrib '0.9.8.5'
Latest released version of gribscan (installed via python -m pip install gribscan)
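
A quick way to see whether an environment is affected is to compare the installed cfgrib version against the one named in the title. The sketch below uses only the standard library. The helper names are made up, and the comparison looks only at the leading numeric parts of the version string.

```python
from importlib.metadata import PackageNotFoundError, version

# Version named in the issue title as the threshold.
MIN_CFGRIB = "0.9.10.1"


def as_tuple(version_string):
    """Turn the leading numeric parts of a dotted version into ints."""
    parts = []
    for part in version_string.split("."):
        if not part.isdigit():
            break  # stop at suffixes such as "dev0" or "rc1"
        parts.append(int(part))
    return tuple(parts)


def cfgrib_is_recent_enough(installed=None, minimum=MIN_CFGRIB):
    """Return True or False, or None if cfgrib is not installed."""
    if installed is None:
        try:
            installed = version("cfgrib")
        except PackageNotFoundError:
            return None
    return as_tuple(installed) >= as_tuple(minimum)


# The version from the setup above is older than the threshold.
print(cfgrib_is_recent_enough("0.9.8.5"))
# Check whatever is installed in the current environment.
print(cfgrib_is_recent_enough())
```

Comparing integer tuples rather than raw strings matters here: as plain strings, "0.9.8.5" would sort after "0.9.10.1".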

Enable Github actions for CI/CD

We should enable some sort of testing:

  • Enable CI/CD to check if gribscan builds at all
  • Run actual tests on artificial GRIB data

new version in kerchunk

I created fsspec/kerchunk#198, with some of the simpler ideas from here. I am not suggesting that my version should be a replacement for this repo, but it's good, I think, to have something simple in the main kerchunk repo too. Specifically, it creates one output reference set per GRIB message and does not do any combine, so no magician is required. The user can pass such a list or set of lists from multiple files to MultiZarrToZarr and do a combine like that. The example shows this idea in action across several HRRR forecast files. Note that the coordinates are not "simple" but curved, so we don't attempt to generate them (which is a shame!), but rely on eccodes to fetch them from the first valid message.
