jgrss / geowombat Goto Github PK

View Code? Open in Web Editor NEW

174.0 5.0 10.0 245.75 MB

GeoWombat: Utilities for geospatial data

Home Page: https://geowombat.readthedocs.io

License: MIT License

Python 16.32% Jupyter Notebook 82.81% Cython 0.84% Dockerfile 0.03%

remote-sensing raster satellite geography python dask rasterio xarray

geowombat's People

Contributors

Stargazers

Watchers

Forkers

geospatial-data-science rdenham jpv-costa dongyi1996 trellixvulnteam felix-chloris varunshah111 spatialdays mappboy prakrutkansara

geowombat's Issues

x y coord rounding issue with polygon_to_array

I just ran into an issue where polygon_to_array isn't replicating the x y coords of the xarray data example. Here's how I am resolving it, but maybe worth thinking about a better solution.

labels = gw.polygon_to_array(labels, col=col, data=data,band_name= [variable_name],  fill=0)

np.all(data.x.values == category.x.values) & np.all(data.y.values == category.y.values)
False

labels = labels.assign_coords({"x": data.x.values, 'y':data.y.values})
np.all(data.x.values == labels.x.values) & np.all(data.y.values == labels.y.values)
True

Test GPU apply

@rdenham if you have some spare time, test the new series class from the jax branch. I removed xarray and dask dependencies, so opening large images should be faster. Additionally, computation is on the GPU.

To test it:

Install geowombat from the jax branch.

Install jax with GPU support (command below is for CUDA 11.1)

pip install --upgrade "jax[cuda111]" -f https://storage.googleapis.com/jax-releases/jax_releases.html

Simple example to compute the mean over time:

import geowombat as gw

Create a custom class

class TemporalMean(gw.TimeModule):

    def __init__(self):        
        super(TemporalMean, self).__init__()

    def calculate(self, array):
       """Returns the window and the mean along the time axis"""
        return array.mean(axis=0).squeeze()

with gw.series(file_list, nodata=0, num_threads=4, window_size=(1024, 1024)) as src:
    src.apply(TemporalMean(), outfile='mean.tif', bands=2, gain=0.0001, num_workers=4)

gw broken?

I was just working on stuff for the book and ran into an odd issue on the most recent build of gw

import geowombat as gw


import geowombat as gw...
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~/Documents/github/pyGIS/pygis/docs/test.py in 
----> 645 import geowombat as gw
      646 from geowombat.data import l8_224078_20200518
      647 
      648 # with gw.config.update(sensor="bgr"):
      649 with gw.open(l8_224078_20200518) as src:

~/anaconda3/envs/pygisbookgw/lib/python3.7/site-packages/geowombat/__init__.py in <module>
      1 from . import config
----> 2 from .core.api import open
      3 from .core import extract
      4 from .core import sample
      5 from .core import calc_area

~/anaconda3/envs/pygisbookgw/lib/python3.7/site-packages/geowombat/core/__init__.py in <module>
----> 1 from ..backends import transform_crs
      2 from .io import apply
      3 from .io import to_raster
      4 from .io import to_netcdf
      5 from .io import to_vrt

~/anaconda3/envs/pygisbookgw/lib/python3.7/site-packages/geowombat/backends/__init__.py in <module>
      1 from .dask_ import Cluster
----> 2 from .xarray_ import concat, mosaic
      3 from .xarray_ import warp_open
      4 from .xarray_ import transform_crs
      5 

~/anaconda3/envs/pygisbookgw/lib/python3.7/site-packages/geowombat/backends/xarray_.py in <module>
      5 
      6 from ..handler import add_handler
----> 7 from ..core.windows import get_window_offsets
      8 from ..core.util import parse_filename_dates
      9 from ..config import config

~/anaconda3/envs/pygisbookgw/lib/python3.7/site-packages/geowombat/core/windows.py in <module>
----> 1 from .util import n_rows_cols
      2 
      3 from rasterio.windows import Window
      4 
      5 

~/anaconda3/envs/pygisbookgw/lib/python3.7/site-packages/geowombat/core/util.py in <module>
      7 
      8 from ..handler import add_handler
----> 9 from ..moving import moving_window
     10 
     11 import numpy as np

~/anaconda3/envs/pygisbookgw/lib/python3.7/site-packages/geowombat/moving/__init__.py in <module>
----> 1 from ._moving import moving_window

geowombat/moving/_moving.pyx in init geowombat.moving._moving()

ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject

Nodata value set to 0 if using ref_bounds

When you open an image using gw.config.update(refbounds=bound), any nodata values in the source data are converted to 0's.

Here is a reproducible example:

# create a dummy raster with nodata set to 255
from osgeo import gdal
from osgeo import osr
from rasterio.coords import BoundingBox

import numpy as np 

ncol = 100
nrow = 100

egdata = np.random.randint(0,10, (nrow,ncol))
# add some nodata
egdata[0:10, 0:10] = 255

rasterOrigin = (1238095,-2305756)
pixelWidth = 10
pixelHeight = -10
originX = rasterOrigin[0]
originY = rasterOrigin[1]
driver = gdal.GetDriverByName('GTiff')
outRaster = driver.Create("test.tif", ncol, nrow, 1, gdal.GDT_Byte)
outRaster.SetGeoTransform((originX, pixelWidth, 0, originY, 0, pixelHeight))
outband = outRaster.GetRasterBand(1)
outband.WriteArray(egdata)
outband.SetNoDataValue(255)
outband.FlushCache()
outRasterSRS = osr.SpatialReference()
outRasterSRS.ImportFromEPSG(3577)
outRaster.SetProjection(outRasterSRS.ExportToWkt())
outband = None
outRaster = None

Now, if you access this using gw:

with gw.config.update():
    with gw.open("test.tif") as src:
        data = src.data.compute()
        print(data[0,0:10,0:10])

you get the expected output, since these are all nodata values in the src

[[255 255 255 255 255 255 255 255 255 255]
 [255 255 255 255 255 255 255 255 255 255]
 [255 255 255 255 255 255 255 255 255 255]
 [255 255 255 255 255 255 255 255 255 255]
 [255 255 255 255 255 255 255 255 255 255]
 [255 255 255 255 255 255 255 255 255 255]
 [255 255 255 255 255 255 255 255 255 255]
 [255 255 255 255 255 255 255 255 255 255]
 [255 255 255 255 255 255 255 255 255 255]
 [255 255 255 255 255 255 255 255 255 255]]

But if you use a bounds argument:

bounds = BoundingBox(left=rasterOrigin[0], bottom=rasterOrigin[1] - 60, right=rasterOrigin[0] + 60, top=rasterOrigin[1])
with gw.config.update(ref_bounds=bounds):
   with gw.open("test.tif") as src:
       data = src.data.compute()
       print(data[0,0:10, 0:10])

You get all 0's:

[[0 0 0 0 0 0]
 [0 0 0 0 0 0]
 [0 0 0 0 0 0]
 [0 0 0 0 0 0]
 [0 0 0 0 0 0]
 [0 0 0 0 0 0]]

As a comparison, rasterio behaves as expected:

with rasterio.open("test.tif") as src:
    rst = src.read(1, window=from_bounds(left=rasterOrigin[0], bottom=rasterOrigin[1] - 60, right=rasterOrigin[0] + 60, top=rasterOrigin[1], transform=src.transform))
    print(rst[0:10, 0:10])

[[255 255 255 255 255 255]
 [255 255 255 255 255 255]
 [255 255 255 255 255 255]
 [255 255 255 255 255 255]
 [255 255 255 255 255 255]
 [255 255 255 255 255 255]]

I experimented with nodata in the config and open but didn't seem to make any difference.

In [68]: print(gw.__version__)
1.7.3

ML DataArray

@mmann1123 the ML tests are failing for the same reason the docs are failing.

Adding sklearn compatability

I opened an issue with sklearn-xarray folks about integration with geowombat data structures. They answered with a working example and I simultaneously figured most of it out and created a pull request to make it easier for us to use.

They want to create an example using gw data but worried most folks will have trouble generating the environment. I hate to ask you for something, but they are hoping for a dummy data set with some randomly created data to use as an example. I honestly have tried a few times and for many hours (and failed) to generate the same structure as:

``
with gw.open([rgbn_20160101, rgbn_20160401, rgbn_20160517],
band_names=['blue', 'green', 'red','nir'],
time_names=['t1', 't2','t3']) as src:
src

Any chance you know how to recreate that easily using xarray only?  Might be useful for testing later anyways.

Semantic-release not working on CI

FYI @mmann1123, Dependabot broke the automatic release again. I'll try to work it out. In the meantime, if you have a PR that needs to bump any of major.minor.patch, it won't happen automatically.

config issue?

I was just rebuilding docs for the book and noticed this. Just wondering if this needs to be handled.

with gw.config.update(scale_factor=0.0001):
  with gw.open(l8_224078_20200518) as src:
    print(src.attrs['scales'])
(1.0,1.0,1.0)

The DataArray does not have a `crs` attribute.

Morning!

First of all, thanks for the work, the library looks wicked!

I created a dedicated conda env by following your recommended steps:

pip install pip-tools
conda create -n geowombat python=3.7  cython scipy numpy zarr requests -c conda-forge
conda activate geowombat
sudo apt install libspatialindex-dev libgdal-dev
conda install -c conda-forge libspatialindex zarr requests
pip install git+https://github.com/jgrss/geowombat
python -c "import geowombat as gw;print(gw.__version__)"

Then I installed jupyterlab on top of the conda env to run the following:

As you can see I am unable to save the raster despite I am using the example data available through the library. Same error message when trying to use some of my tif local files.

Any idea how to solve this bug? Thanks in advance!

geowombat v1.11.4 breaks conda

Just got this report back from the conda feedstock or for details go here

Assuming pinning the xarray version pulls the older rasterio - breaking things downstream? What do you think @jgrss

The reported errors are:
- Encountered problems while solving:
-   - package rasterio-1.0.23-py27h5b3f9e8_0 requires libgdal >=2.4.1,<2.5.0a0, but none of the providers can be installed
-   - package pyproj-1.9.4-py27_0 requires python 2.7*, but none of the providers can be installed

'open' object has no attribute 'crs'

Hi, I'm very excited about the prospects of geowombat making raster processing much easier! Thanks for doing this!

I'm still pretty new to Python, so maybe I am missing something with stacks. I'm trying to clip a raster stack but I get the 'open' object has no attribute 'crs' error. I tried using 'with' method as well as the code below and both return the same issue.

from geowombat.data import l8_224077_20200518_B2, l8_224077_20200518_B3,l8_224078_20200518_polygons
src= gw.open([l8_224077_20200518_B2, l8_224077_20200518_B3],
stack_dim="band",
band_names=["B2", "B3"])
srccl = gw.clip(src, l8_224078_20200518_polygons)

returns:

AttributeError Traceback (most recent call last)
File ~\Miniconda3\envs\gwenv\lib\site-packages\geowombat\core\sops.py:798, in SpatialOperations.clip(self, data, df, query, mask_data, expand_by)
796 try:
--> 798 if data.crs.strip() != CRS.from_dict(df_crs_).to_proj4().strip():
799 df = df.to_crs(data.crs)

AttributeError: 'open' object has no attribute 'crs'

During handling of the above exception, another exception occurred:

AttributeError Traceback (most recent call last)
Input In [62], in <cell line: 5>()
1 from geowombat.data import l8_224077_20200518_B2, l8_224077_20200518_B3,l8_224078_20200518_polygons
2 src= gw.open([l8_224077_20200518_B2, l8_224077_20200518_B3],
3 stack_dim="band",
4 band_names=["B2", "B3"])
----> 5 srccl = gw.clip(src, l8_224078_20200518_polygons)

File ~\Miniconda3\envs\gwenv\lib\site-packages\geowombat\core\sops.py:803, in SpatialOperations.clip(self, data, df, query, mask_data, expand_by)
799 df = df.to_crs(data.crs)
801 except:
--> 803 if data.crs.strip() != CRS.from_proj4(df_crs_).to_proj4().strip():
804 df = df.to_crs(data.crs)
806 row_chunks = data.gw.row_chunks

AttributeError: 'open' object has no attribute 'crs'

If I try the self method:

I get the error:

AttributeError Traceback (most recent call last)
Input In [66], in <cell line: 5>()
1 from geowombat.data import l8_224077_20200518_B2, l8_224077_20200518_B3,l8_224078_20200518_polygons
2 src= gw.open([l8_224077_20200518_B2, l8_224077_20200518_B3],
3 stack_dim="band",
4 band_names=["B2", "B3"])
----> 5 srccl = src.gw.clip( l8_224078_20200518_polygons)

AttributeError: 'open' object has no attribute 'gw'

Maybe I'm doing something incorrect with stacks, as I am not able to use to_raster with them either (added a note to #184 as well).

Thanks!
Paul

Usage of xarray ufuncs is deprecated

Xarray has removed support for the deprecated xarray.ufuncs functions since version v2022.06.rc0: https://docs.xarray.dev/en/stable/whats-new.html#breaking-changes.

Now I receive the following error:

import geowombat as gw
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
Input In [2], in <cell line: 1>()
----> 1 import geowombat as gw

File /opt/tljh/user/envs/stable/lib/python3.8/site-packages/geowombat/__init__.py:5, in <module>
      2 __version__ = '1.11.3'
      4 from . import config
----> 5 from .core.api import open
      6 from .core.api import load
      7 from .core.api import series

File /opt/tljh/user/envs/stable/lib/python3.8/site-packages/geowombat/core/__init__.py:1, in <module>
----> 1 from ..backends import transform_crs
      2 from .io import apply
      3 from .io import to_raster

File /opt/tljh/user/envs/stable/lib/python3.8/site-packages/geowombat/backends/__init__.py:2, in <module>
      1 from .dask_ import Cluster
----> 2 from .xarray_ import concat, mosaic
      3 from .xarray_ import warp_open
      4 from .xarray_ import transform_crs

File /opt/tljh/user/envs/stable/lib/python3.8/site-packages/geowombat/backends/xarray_.py:26, in <module>
     24 import dask.array as da
     25 import xarray as xr
---> 26 from xarray.ufuncs import maximum as xr_maximum
     27 from xarray.ufuncs import minimum as xr_mininum
     30 logger = logging.getLogger(__name__)

ModuleNotFoundError: No module named 'xarray.ufuncs'

Searching the repo: https://github.com/jgrss/geowombat/search?q=xarray.ufuncs&type=code
Reveals at least one file needs updating:

geowombat/src/geowombat/backends/xarray_.py

Lines 26 to 27 in 342bb2b

 from xarray.ufuncs import maximum as xr_maximum 

 from xarray.ufuncs import minimum as xr_mininum

Linked issue: #42

[BUG] retry package not listed as a requirement

Seems like the 'retry' package is now needed for gw 1.7.3, but it isn't listed in requirements.txt.

❯ python
Python 3.8.10 (default, Jun  2 2021, 10:49:15)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import geowombat as gw
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".venv38gdal33/lib/python3.8/site-packages/geowombat/__init__.py", line 2, in <module>
    from .core.api import open
  File ".venv38gdal33/lib/python3.8/site-packages/geowombat/core/api.py", line 18, in <module>
    from . import geoxarray
  File ".venv38gdal33/lib/python3.8/site-packages/geowombat/core/geoxarray.py", line 19, in <module>
    from ..util import imshow as gw_imshow
  File ".venv38gdal33/lib/python3.8/site-packages/geowombat/util/__init__.py", line 2, in <module>
    from .web import GeoDownloads
  File ".venv38gdal33/lib/python3.8/site-packages/geowombat/util/web.py", line 14, in <module>
    from ..radiometry import BRDF, LinearAdjustments, RadTransforms, landsat_pixel_angles, sentinel_pixel_angles, QAMasker, DOS, SixS
  File ".venv38gdal33/lib/python3.8/site-packages/geowombat/radiometry/__init__.py", line 4, in <module>
    from .sr import LinearAdjustments, RadTransforms, DOS
  File ".venv38gdal33/lib/python3.8/site-packages/geowombat/radiometry/sr.py", line 11, in <module>
    from .sixs import AOT, SixS
  File ".venv38gdal33/lib/python3.8/site-packages/geowombat/radiometry/sixs.py", line 20, in <module>
    from ..data import LUTDownloader, NASAEarthdataDownloader
  File ".venv38gdal33/lib/python3.8/site-packages/geowombat/data/__init__.py", line 7, in <module>
    from retry import retry
ModuleNotFoundError: No module named 'retry'
>>> exit()

Not a big issue, but is nice for installing if it is added as a requirement.

crs error

return the following error, even when using geowombat rgbn test image:

rasterio.errors.CRSError: Unable to open EPSG support file gcs.csv. Try setting the GDAL_DATA environment variable to point to the directory containing EPSG csv files.

BRDF normalization

GeoWombat now has a geowombat.norm_brdf function, but it hasn't been thoroughly tested.

mpool issue

Just wondering if you have any thoughts about this error with mpool. I tried making the chunks smaller, to no avail. After the stackerizer each chunk end up being (2152143, 42), which really isn't very large. Any tricks?

Also, would you prefer me ask these types of questions on git issue?

pt = ParallelTask(ds,
scheduler='mpool',
n_workers=2)
print('n chunks %s' % pt.n_chunks)
print('n windows %s' % pt.n_windows)
res = pt.map(user_func, 6 )

error Traceback (most recent call last)
~/Documents/github/ET_drought_2019/scripts/crop_loss.py in
146 print('n chunks %s' % pt.n_chunks)
147 print('n windows %s' % pt.n_windows)
---> 148 res = pt.map(user_func, 6 )
149

~/Documents/github/geowombat/geowombat/core/parallel.py in map(self, func, *args)
144 if self.scheduler == 'mpool':
145
--> 146 for result in tqdm(executor.imap_unordered(func, data_gen), total=n_windows_slice):
147 results.append(result)
148

~/anaconda3/envs/geowombatdebug/lib/python3.7/site-packages/tqdm/std.py in iter(self)
1163
1164 try:
-> 1165 for obj in iterable:
1166 yield obj
1167 # Update and possibly print the progressbar.

~/anaconda3/envs/geowombatdebug/lib/python3.7/multiprocessing/pool.py in next(self, timeout)
746 if success:
747 return value
--> 748 raise value
749
750 next = next # XXX

~/anaconda3/envs/geowombatdebug/lib/python3.7/multiprocessing/pool.py in _handle_tasks(taskqueue, put, outqueue, pool, cache)
429 break
430 try:
--> 431 put(task)
432 except Exception as e:
433 job, idx = task[:2]

~/anaconda3/envs/geowombatdebug/lib/python3.7/multiprocessing/connection.py in send(self, obj)
204 self._check_closed()
205 self._check_writable()
--> 206 self._send_bytes(_ForkingPickler.dumps(obj))
207
208 def recv_bytes(self, maxlength=None):

~/anaconda3/envs/geowombatdebug/lib/python3.7/multiprocessing/connection.py in _send_bytes(self, buf)
391 n = len(buf)
392 # For wire compatibility with 3.2 and lower
--> 393 header = struct.pack("!i", n)
394 if n > 16384:
395 # The payload is large so Nagle's algorithm won't be triggered

error: 'i' format requires -2147483648 <= number <= 2147483647

ml install

Trying to help resolve #216 but running into a likely easily fixed issue. As you suggested a while back I can do pip install --user -e ."[ml]" in my dockerfile. However I am getting an error now with the your sklearn fork. Maybe there's an easy work around?

RUN git clone https://github.com/jgrss/geowombat.git --branch jgrss/scales_216

RUN cd geowombat && pip install --user -e ."[ml]"

ERROR: Invalid requirement: 'sklearn-xarray@ git+https://github.com/jgrss/sklearn-xarray.git; extra == "ml"'
The command '/bin/sh -c cd geowombat && pip install --user -e ."[ml]"' returned a non-zero code: 1

GDAL_DATA

I started a new conda env and once again have GDAL path issues. I noticed that fiona seems to have resolved this internally, and thinking maybe we could implement their solution, see link below. Wish I could be of help on this, but this is way above my pay grade.

CPLE_OpenFailedError: Unable to open EPSG support file gcs.csv. Try setting the GDAL_DATA environment variable to point to the directory containing EPSG csv files.

Link to related fiona issue

Writing DataArrays to files keep throwing TypeError

Hello, I tried writing data arrays using the to_raster function. But, I am getting an error 'TypeError: tqdm.std.tqdm() argument after ** must be a mapping, not NoneType'. I don't know what I am doing wrong so I appreciate if you can look into it.

ParallelTask window_id actually chunk_id?

Ok after a lot of fumbling around, I think I figured out an issue I have been having. It seems like in data, window_id, num_workers = list(itertools.chain(*args)) , the window_id is actually the chunk_id. I came across this because of the unusual size of my data, but managed to finally figure out a way to replicate it.

If n_windows is ever larger than n_chunks, the window_id iterates through values of n_chunks, but then repeats itself until you reach window_id. So in my case I keep overwriting my file.

Here's an example

import numpy as np
import dask.array as da
import xarray as xr
from geowombat.core.parallel import ParallelTask
import geowombat as gw
import itertools
import os 
sys.path.append('/home/mmann1123/Documents/github/xr_fresh/')
from xr_fresh.utils import save_pickle,open_pickle, unique,add_categorical,add_time_targets,compressed_pickle,decompress_pickle
from xr_fresh.transformers import Stackerizer
from geowombat.data import rgbn_20160101, rgbn_20160401, rgbn_20160517


out_folder =   '~/Desktop/'
 
def user_func(*args):

    data, window_id, num_workers = list(itertools.chain(*args))    # num_workers=1
    X = Stackerizer(stack_dims = ('y','x','time'), direction='stack').fit_transform(data)   # NOTE stack y before x!!! 
    X = X.to_pandas() 
    X.columns = X.columns.astype(str)

    X.to_parquet(out_folder+'/testing/test_'+  '_%05d.parquet' %  window_id)

    return None


multiplier = 50

import geowombat as gw
with gw.open([rgbn_20160101, rgbn_20160401, rgbn_20160517]*multiplier,
                 band_names=['blue', 'green', 'red','nir'],
                 time_names=[str(x) for x in range(multiplier*3)],chunks=25) as src:
        print(src)

        print('stack:')

        pt = ParallelTask(src,
                    scheduler='threads',
                    n_workers=3)
        print('n chunks %s' % pt.n_chunks)
        print('n windows %s' % pt.n_windows)
        res = pt.map(user_func, 4)

VRT AttributeError

Ran into an issue when mosaicing a single band image and writing to vrt. Sorry not quite sure how to address it. If you point me in the right direction I can try to figure it out. For a working example see a nearly identical dataset here [here](https://drive.google.com/file/d/1xHPASY0ZT40rTvS8Uc866FJvpKwbWn1S/view?usp=sharing
https://drive.google.com/file/d/1uoJI7Hl1HFdW6MD6paBR_afWwh6jaZsG/view?usp=sharing
https://drive.google.com/file/d/1HYPauHCLyj5kVQMDbQtZa-BvEgHmIIDl/view?usp=sharing
https://drive.google.com/file/d/1lb_-yUoYpu3CCbl3nfAmPW_sT6jj3Zr-/view?usp=sharing)

`with gw.open(glob('./data/NDWI/2020tif'), band_names=['NDWI'],
time_names=['01-01-2020'],
bounds_by='union',mosaic=True,
num_workers=5) as ds:
print(ds)
#ds.sel(band=['NDWI']).gw.imshow(robust=True)
# Write the data to a VRT
gw.to_vrt(data=ds, filename='./data/NDWI/NDWI_01-01-2020.vrt', nodata=0.0)

File "", line 12, in
gw.to_vrt(data=ds, filename='./data/NDWI/NDWI_01-01-2020.vrt', nodata=0.0)

File "/home/mmann1123/anaconda3/envs/geowombat/lib/python3.7/site-packages/geowombat/core/io.py", line 366, in to_vrt
with rio.open(data.filename) as src:

File "/home/mmann1123/anaconda3/envs/geowombat/lib/python3.7/site-packages/xarray/core/common.py", line 233, in getattr
"{!r} object has no attribute {!r}".format(type(self).name, name)

AttributeError: 'DataArray' object has no attribute 'filename'

Data extraction using bounds raises TypeError

When trying to extract data using a tuple of bounded coordinates, I get an error. Here is the example taken from the documentation:

bounds = (793000.0, 2049000.0, 794000.0, 2050000.0)

with gw.open(rgbn,
             band_names=['green', 'red', 'nir'],
             num_workers=8,
             indexes=[2, 3, 4],
             bounds=bounds,
             out_dtype='float32') as ds:
    print(ds)

And the error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
~/anaconda3/envs/gwenv/lib/python3.7/site-packages/numpy/core/function_base.py in linspace(start, stop, num, endpoint, retstep, dtype, axis)
    116     try:
--> 117         num = operator.index(num)
    118     except TypeError:

TypeError: 'float' object cannot be interpreted as an integer

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
<ipython-input-18-26e292b10c6b> in <module>
      6              indexes=[2, 3, 4],
      7              bounds=bounds,
----> 8              out_dtype='float32') as ds:
      9     print(ds)

~/anaconda3/envs/gwenv/lib/python3.7/contextlib.py in __enter__(self)
    110         del self.args, self.kwds, self.func
    111         try:
--> 112             return next(self.gen)
    113         except StopIteration:
    114             raise RuntimeError("generator didn't yield") from None

~/anaconda3/envs/gwenv/lib/python3.7/site-packages/geowombat/core/api.py in open(filename, return_as, band_names, time_names, stack_dim, bounds, bounds_by, resampling, mosaic, overlap, dtype, num_workers, **kwargs)
    348                     bounds=bounds,
    349                     num_workers=num_workers,
--> 350                     **kwargs)
    351 
    352     else:

~/anaconda3/envs/gwenv/lib/python3.7/site-packages/geowombat/core/api.py in read(filename, band_names, time_names, bounds, num_workers, **kwargs)
    162                 kwargs['window'] = from_bounds(*bounds, transform=src.transform)
    163 
--> 164             ycoords, xcoords, attrs = get_attrs(src, **kwargs)
    165 
    166         data = dask.compute(read_delayed(filename, 0, **kwargs),

~/anaconda3/envs/gwenv/lib/python3.7/site-packages/geowombat/core/api.py in get_attrs(src, **kwargs)
     35     ycoords = np.linspace(src.bounds.top - (kwargs['window'].row_off * src.res[0]),
     36                           src.bounds.top - (kwargs['window'].row_off * src.res[0]) -
---> 37                           (kwargs['window'].height * src.res[0]), kwargs['window'].height)
     38 
     39     xcoords = np.linspace(src.bounds.left + (kwargs['window'].col_off * src.res[0]),

<__array_function__ internals> in linspace(*args, **kwargs)

~/anaconda3/envs/gwenv/lib/python3.7/site-packages/numpy/core/function_base.py in linspace(start, stop, num, endpoint, retstep, dtype, axis)
    119         raise TypeError(
    120             "object of type {} cannot be safely interpreted as an integer."
--> 121                 .format(type(num)))
    122 
    123     if num < 0:

TypeError: object of type <class 'float'> cannot be safely interpreted as an integer.

This is caused in the call to numpy.linspace using the calculated window width as the argument num. But the window width as calculated by rio.window.from_bounds returns a float not an integer.

Perhaps this should be cast to an integer in the linspace call?

My pip freeze output:

affine==2.3.0
alabaster==0.7.12
asciitree==0.3.3
attrs==19.3.0
Babel==2.8.0
backcall==0.1.0
bleach==3.1.0
certifi==2019.11.28
cftime==1.0.4.2
chardet==3.0.4
Click==7.0
click-plugins==1.1.1
cligj==0.5.0
cloudpickle==1.2.2
cycler==0.10.0
Cython==0.29.14
dask==2.9.2
dask-glm==0.2.0
dask-ml==1.2.0
dateparser==0.7.2
decorator==4.4.1
defusedxml==0.6.0
Deprecated==1.2.7
distributed==2.9.3
docutils==0.16
entrypoints==0.3
fasteners==0.15
Fiona==1.8.13
fsspec==0.6.2
GDAL==2.4.0
geopandas==0.6.2
geowombat==1.0.7
h5netcdf==0.7.4
h5py==2.10.0
HeapDict==1.0.1
idna==2.8
imagesize==1.2.0
importlib-metadata==1.4.0
ipykernel==5.1.3
ipython==7.11.1
ipython-genutils==0.2.0
jedi==0.16.0
Jinja2==2.10.3
joblib==0.14.1
json5==0.8.5
jsonschema==3.2.0
jupyter-client==5.3.4
jupyter-core==4.6.1
jupyterlab==1.2.6
jupyterlab-server==1.0.6
kiwisolver==1.1.0
latexcodec==2.0.0
llvmlite==0.31.0
locket==0.2.0
MarkupSafe==1.1.1
matplotlib==3.1.2
mistune==0.8.4
mkl-fft==1.0.15
mkl-random==1.1.0
mkl-service==2.3.0
monotonic==1.5
msgpack==0.6.2
multipledispatch==0.6.0
munch==2.5.0
nbconvert==5.6.1
nbformat==5.0.4
netCDF4==1.5.3
notebook==6.0.3
numba==0.47.0
numcodecs==0.6.4
numexpr==2.7.1
numpy==1.18.1
numpydoc==0.9.2
opencv-python==4.1.2.30
oset==0.1.3
packaging==20.1
pandas==0.25.3
pandocfilters==1.4.2
parso==0.5.2
partd==1.1.0
pexpect==4.8.0
pickleshare==0.7.5
prometheus-client==0.7.1
prompt-toolkit==3.0.2
psutil==5.6.7
ptyprocess==0.6.0
pybtex==0.22.2
pybtex-docutils==0.2.2
Pygments==2.5.2
pyparsing==2.4.6
pyproj==2.4.2.post1
pyrsistent==0.15.7
python-dateutil==2.8.1
pytz==2019.3
PyYAML==5.3
pyzmq==18.1.1
rasterio==1.1.2
regex==2020.1.8
requests==2.22.0
scikit-learn==0.22.1
scipy==1.4.1
Send2Trash==1.5.0
Shapely==1.6.4.post2
six==1.14.0
snowballstemmer==2.0.0
snuggs==1.4.7
sortedcontainers==2.1.0
Sphinx==2.3.1
sphinx-automodapi==0.12
sphinxcontrib-applehelp==1.0.1
sphinxcontrib-bibtex==1.0.0
sphinxcontrib-devhelp==1.0.1
sphinxcontrib-htmlhelp==1.0.2
sphinxcontrib-jsmath==1.0.1
sphinxcontrib-qthelp==1.0.2
sphinxcontrib-serializinghtml==1.1.3
tblib==1.6.0
terminado==0.8.3
testpath==0.4.4
toolz==0.10.0
tornado==6.0.3
tqdm==4.42.0
traitlets==4.3.3
tzlocal==2.0.0
urllib3==1.25.8
wcwidth==0.1.8
webencodings==0.5.1
wget==3.2
wrapt==1.11.2
xarray==0.14.1
zarr==2.4.0
zict==1.0.0
zipp==2.1.0affine==2.3.0
alabaster==0.7.12
asciitree==0.3.3
attrs==19.3.0
Babel==2.8.0
backcall==0.1.0
bleach==3.1.0
certifi==2019.11.28
cftime==1.0.4.2
chardet==3.0.4
Click==7.0
click-plugins==1.1.1
cligj==0.5.0
cloudpickle==1.2.2
cycler==0.10.0
Cython==0.29.14
dask==2.9.2
dask-glm==0.2.0
dask-ml==1.2.0
dateparser==0.7.2
decorator==4.4.1
defusedxml==0.6.0
Deprecated==1.2.7
distributed==2.9.3
docutils==0.16
entrypoints==0.3
fasteners==0.15
Fiona==1.8.13
fsspec==0.6.2
GDAL==2.4.0
geopandas==0.6.2
geowombat==1.0.7
h5netcdf==0.7.4
h5py==2.10.0
HeapDict==1.0.1
idna==2.8
imagesize==1.2.0
importlib-metadata==1.4.0
ipykernel==5.1.3
ipython==7.11.1
ipython-genutils==0.2.0
jedi==0.16.0
Jinja2==2.10.3
joblib==0.14.1
json5==0.8.5
jsonschema==3.2.0
jupyter-client==5.3.4
jupyter-core==4.6.1
jupyterlab==1.2.6
jupyterlab-server==1.0.6
kiwisolver==1.1.0
latexcodec==2.0.0
llvmlite==0.31.0
locket==0.2.0
MarkupSafe==1.1.1
matplotlib==3.1.2
mistune==0.8.4
mkl-fft==1.0.15
mkl-random==1.1.0
mkl-service==2.3.0
monotonic==1.5
msgpack==0.6.2
multipledispatch==0.6.0
munch==2.5.0
nbconvert==5.6.1
nbformat==5.0.4
netCDF4==1.5.3
notebook==6.0.3
numba==0.47.0
numcodecs==0.6.4
numexpr==2.7.1
numpy==1.18.1
numpydoc==0.9.2
opencv-python==4.1.2.30
oset==0.1.3
packaging==20.1
pandas==0.25.3
pandocfilters==1.4.2
parso==0.5.2
partd==1.1.0
pexpect==4.8.0
pickleshare==0.7.5
prometheus-client==0.7.1
prompt-toolkit==3.0.2
psutil==5.6.7
ptyprocess==0.6.0
pybtex==0.22.2
pybtex-docutils==0.2.2
Pygments==2.5.2
pyparsing==2.4.6
pyproj==2.4.2.post1
pyrsistent==0.15.7
python-dateutil==2.8.1
pytz==2019.3
PyYAML==5.3
pyzmq==18.1.1
rasterio==1.1.2
regex==2020.1.8
requests==2.22.0
scikit-learn==0.22.1
scipy==1.4.1
Send2Trash==1.5.0
Shapely==1.6.4.post2
six==1.14.0
snowballstemmer==2.0.0
snuggs==1.4.7
sortedcontainers==2.1.0
Sphinx==2.3.1
sphinx-automodapi==0.12
sphinxcontrib-applehelp==1.0.1
sphinxcontrib-bibtex==1.0.0
sphinxcontrib-devhelp==1.0.1
sphinxcontrib-htmlhelp==1.0.2
sphinxcontrib-jsmath==1.0.1
sphinxcontrib-qthelp==1.0.2
sphinxcontrib-serializinghtml==1.1.3
tblib==1.6.0
terminado==0.8.3
testpath==0.4.4
toolz==0.10.0
tornado==6.0.3
tqdm==4.42.0
traitlets==4.3.3
tzlocal==2.0.0
urllib3==1.25.8
wcwidth==0.1.8
webencodings==0.5.1
wget==3.2
wrapt==1.11.2
xarray==0.14.1
zarr==2.4.0
zict==1.0.0
zipp==2.1.0

Fix iteration over extract outputs

Currently, geowombat.extract iterates over the output of dask.vindex using Python loops. This behavior should be changed to use numpy reshaping.

Reading Tiff : ValueError: negative dimensions are not allowed

Hi dear
When trying to open my tiff file, I get this error: ValueError: negative dimensions are not allowed
code:
with gw.open(mytif) as src:
print(src)

File "/home/---/.local/lib/python3.8/site-packages/geowombat/core/api.py", line 493, in init
self.data = warp_open(
File "/home/---/.local/lib/python3.8/site-packages/geowombat/backends/xarray_.py", line 271, in warp_open
with open_rasterio(
File "/home/---/.local/lib/python3.8/site-packages/geowombat/backends/xarray_rasterio_.py", line 316, in open_rasterio
x, _ = riods.transform * (np.arange(nx) + 0.5, np.zeros(nx) + 0.5)
ValueError: negative dimensions are not allowed

The same tiff file can be read easily using rasterio, gdal... by with geowombat it's not possible
Thanks

Rare rounding issue in polygon_to_array

I just ran into a case where generating the x and y coords doesn't match the original xarray dimensions. I can share the file if you want to see it.

When running polygon_to_array:

ds.shape
6681 by 5172

after generating coordinates:

xcoords = np.arange(left + cellxh, left + cellxh + dst_width * abs(cellx), cellx)
ycoords = np.arange(top - cellyh, top - cellyh - dst_height * abs(celly), -celly)
len(xcoords)
6682

Extra install examples fail; coreg extra builds unwanted GDAL version

Installed GDAL 3.3.2

gdalinfo --version
GDAL 3.3.2, released 2021/09/01

Installed virtualenv

pyenv virtualenv 3.8.12 venv.gw
pyenv activate venv.gw

Installed Python GDAL 3.3.2

(venv.gw) pip install GDAL==3.3.2

(venv.gw) python -c "from osgeo import gdal;print(gdal.__version__)"
3.3.2

Installing GeoWombat with no extras works fine

(venv.gw) pip install git+https://github.com/jgrss/geowombat.git

Installing GeoWombat with any extra using the suggested method from the online documentation fails. E.g., the following line does not work:

(venv.gw) pip install git+https://github.com/jgrss/geowombat.git#egg=project[stac]

Installing GeoWombat with an extra using the following method works

(venv.gw) pip install "geowombat[stac]@git+https://github.com/jgrss/geowombat.git"

Installing GeoWombat with multiple extras that include 'coreg' fails

(venv.gw) pip install "geowombat[ml,coreg,stac,perf,tests]@git+https://github.com/jgrss/geowombat.git"

For some reason, the command above initiates a Python GDAL build even when Python GDAL is already installed. For example, an attempted build for GDAL==3.5.* occurs even though I have GDAL=3.3.2 installed.

Removing the 'coreg' extra works

(venv.gw) pip install "geowombat[ml,stac,perf,tests]@git+https://github.com/jgrss/geowombat.git"

This issue stems from arosics, which has a gdal requirement. However, there is no GDAL version requirement specified, so I am not sure why pip is attempting to install a higher version.

Solution to install the 'coreg' extra and avoid an unwanted GDAL upgrade.

First, arosics will be removed from the 'coreg' extra list in geowombat. Then, using the following command, all dependencies can be installed without pip installing GDAL.

(venv.gw) pip install arosics>=1.7.8 --no-deps && pip install "geowombat[coreg]@git+https://github.com/jgrss/geowombat.git"

TODO

Change online documentation to use the pip install "geowombat[extra]@git+https://github.com/jgrss/geowombat.git" extra install format.
Fix setup.cfg to remove arosics.
Provide install example specific to the 'coreg' extra as pip install arosics>=1.7.8 --no-deps && pip install "geowombat[coreg]@git+https://github.com/jgrss/geowombat.git".

poylgon to array issue

I am trying to debug this, but having some trouble. Although working previously polygon_to_array is having trouble. I tried a few polygons and I am getting the same issue. Just wondering if you have the same issue.

with gw.open(tiffs, time_names = time_names) as ds:
        print(ds)

<Client: 'inproc://192.168.86.77/236471/8' processes=1 threads=12, memory=67.08 GB>
go to http://localhost:8787/status for dask dashboard
<xarray.DataArray (time: 7, band: 38, y: 2586, x: 3339)>
dask.array<concatenate, shape=(7, 38, 2586, 3339), dtype=float64, chunksize=(1, 1, 512, 512), chunktype=numpy.ndarray>
Coordinates:
  * band     (band) int64 1 2 3 4 5 6 7 8 9 10 ... 29 30 31 32 33 34 35 36 37 38
  * y        (y) float64 9.209 9.207 9.204 9.202 9.2 ... 3.41 3.408 3.406 3.403
  * x        (x) float64 40.5 40.5 40.51 40.51 40.51 ... 47.99 47.99 48.0 48.0
  * time     (time) <U4 '2010' '2011' '2012' '2013' '2014' '2015' '2016'
Attributes:
    transform:      (0.002245788210298804, 0.0, 40.500544584528626, 0.0, -0.0...
    crs:            +init=epsg:4326
    res:            (0.002245788210298804, 0.002245788210298804)
    is_tiled:       1
    nodatavals:     (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,...
    scales:         (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1...
    offsets:        (0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0...
    filename:       ['y000002586_x000003341_h000002586_w000003340.tif', 'y000...
    resampling:     nearest
    AREA_OR_POINT:  Area

    labels = gw.polygon_to_array(labels,  data=ds )



---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
~/Documents/github/ET_drought_2019/scripts/crop_loss.py in 
     90             labels['band'] = [variable_name]
     91         else:
---> 92             labels = gw.polygon_to_array(labels, col=col, data=data )
     93             labels['band'] = [variable_name]
     94 

~/Documents/github/geowombat/geowombat/core/conversion.py in polygon_to_array(self, polygon, col, data, cellx, celly, band_name, row_chunks, col_chunks, src_res, fill, default_value, all_touched, dtype, sindex)
    576 
    577             # Get intersecting features
--> 578             int_idx = sorted(list(sindex.intersection(tuple(data.gw.geodataframe.total_bounds.flatten()))))
    579 
    580             if not int_idx:

~/Documents/github/geowombat/geowombat/core/properties.py in geodataframe(self)
    811         return gpd.GeoDataFrame(data=[1],
    812                                 columns=['grid'],
--> 813                                 geometry=[self.geometry],
    814                                 crs=self._obj.crs)
    815 

~/Documents/github/geowombat/geowombat/core/properties.py in geometry(self)
    766         """
    767 
--> 768         return Polygon([(self.left, self.bottom),
    769                         (self.left, self.top),
    770                         (self.right, self.top),

~/Documents/github/geowombat/geowombat/core/properties.py in left(self)
    659         """
    660 
--> 661         return float(self._obj.x.min().values) - self.cellxh
    662 
    663     @property

AttributeError: 'NoneType' object has no attribute 'x'

Google Cloud and Amazon AWS direct reads

Google Cloud public reference (Landsat)

https://cloud.google.com/storage/docs/public-datasets/landsat

Google Cloud public reference (Sentinel-2)

https://cloud.google.com/storage/docs/public-datasets/sentinel-2

Landsat 8 Collection 1

https://console.cloud.google.com/storage/browser/gcp-public-data-landsat/LC08/01

Read file from Google example

fn = 'http://storage.googleapis.com/gcp-public-data-landsat/LC08/01/042/034/LC08_L1TP_042034_20170616_20170629_01_T1/LC08_L1TP_042034_20170616_20170629_01_T1_B4.TIF'

ML.fit

@jgrss For the machine learning example in the tutorial I was getting the following:

import geowombat as gw
from geowombat.data import l8_224078_20200518, l8_224078_20200518_polygons
from geowombat.ml import fit

import geopandas as gpd
from sklearn_xarray.preprocessing import Featurizer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.decomposition import PCA
from sklearn.naive_bayes import GaussianNB

le = LabelEncoder()

# The labels are string names, so here we convert them to integers
labels = gpd.read_file(l8_224078_20200518_polygons)
labels['lc'] = le.fit(labels.name).transform(labels.name)

# Use a data pipeline
pl = Pipeline([('featurizer', Featurizer()),
                ('scaler', StandardScaler()),
                ('pca', PCA()),
                ('clf', GaussianNB())])

# Fit the classifier
with gw.open(l8_224078_20200518, chunks=128) as src:
        X, clf = fit(data= src, labels=labels, clf=pl, col='lc')

print(clf)
print(X)


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
~/Documents/github/pyGIS/pygis/docs/test.py in <module>
     1874 # Fit the classifier
     1875 with gw.open(l8_224078_20200518, chunks=128) as src:
---> 1876         X, clf = fit(data= src, labels=labels, clf=pl, col='lc')
     1877 
     1878 print(clf)

~/anaconda3/envs/pygisbookgw/lib/python3.7/site-packages/geowombat/ml/classifiers.py in fit(self, data, labels, clf, grid_search, targ_name, targ_dim_name, col)
    199         """
    200 
--> 201         data = self._prepare_labels(data, labels, col, targ_name)
    202         X, Xna = self._prepare_predictors(data, targ_name)
    203         clf = self._prepare_classifiers(clf)

~/anaconda3/envs/pygisbookgw/lib/python3.7/site-packages/geowombat/ml/classifiers.py in _prepare_labels(data, labels, col, targ_name)
     67         labels = labels.where(labels != 0)
     68 
---> 69         data.coords[targ_name] = (['time', 'y', 'x'], labels)
     70 
     71         return data

~/anaconda3/envs/pygisbookgw/lib/python3.7/site-packages/xarray/core/coordinates.py in __setitem__(self, key, value)
     39 
     40     def __setitem__(self, key: Hashable, value: Any) -> None:
---> 41         self.update({key: value})
     42 
     43     @property

~/anaconda3/envs/pygisbookgw/lib/python3.7/site-packages/xarray/core/coordinates.py in update(self, other)
    162         other_vars = getattr(other, "variables", other)
    163         coords, indexes = merge_coords(
--> 164             [self.variables, other_vars], priority_arg=1, indexes=self.xindexes
    165         )
    166         self._update_coords(coords, indexes)

~/anaconda3/envs/pygisbookgw/lib/python3.7/site-packages/xarray/core/merge.py in merge_coords(objects, compat, join, priority_arg, indexes, fill_value)
    470         coerced, join=join, copy=False, indexes=indexes, fill_value=fill_value
    471     )
--> 472     collected = collect_variables_and_indexes(aligned)
    473     prioritized = _get_priority_vars_and_indexes(aligned, priority_arg, compat=compat)
    474     variables, out_indexes = merge_collected(collected, prioritized, compat=compat)

~/anaconda3/envs/pygisbookgw/lib/python3.7/site-packages/xarray/core/merge.py in collect_variables_and_indexes(list_of_mappings)
    292                 append_all(coords, indexes)
    293 
--> 294             variable = as_variable(variable, name=name)
    295             if variable.dims == (name,):
    296                 variable = variable.to_index_variable()

~/anaconda3/envs/pygisbookgw/lib/python3.7/site-packages/xarray/core/variable.py in as_variable(obj, name)
    120         if isinstance(obj[1], DataArray):
    121             raise TypeError(
--> 122                 "Using a DataArray object to construct a variable is"
    123                 " ambiguous, please extract the data using the .data property."
    124             )

TypeError: Using a DataArray object to construct a variable is ambiguous, please extract the data using the .data property.

Odd cubing behavoir with extract

I am again seeing some oddities with extract. Wondering if you have any ideas for problem solving it, just not sure where to start. I have a large vector dataset of 70k administrative units in Ethiopia, trying to get monthly precip means for each unit. Precip is from CHIRPs and is 0.05 dd res.

Example image, resampled to 0.01dd.

Open and extract code:

with gw.config.update(ref_bounds=bounds, ref_res=(0.01, 0.01)):
    with gw.open(
        f_list,
        band_names=["ppt"],
        time_names=dates,
        nodata=-9999,
        resampling="bilinear",
    ) as ds:
        print(ds)
        df = ds.gw.extract(
            aoi=eas,
            all_touched=True,
            band_names=ds.band.values.tolist(),
            time_names=ds.time.values.tolist(),
            n_threads=4,  # n_jobs creates memory error that is uncaught
            verbose=2,
        )
        print(df.head)

        ag = df.groupby(by=["id"])   # ive also tried aggregating by the administrative code, no change
            .agg("mean")
        )

1st issue, without resampling to a higher resolution say 0.01 dd, 30k+ administrative units return no data, even with 'all touched' as true.

2nd issue, if I do resample, my outputted admin units take on a really weird block pattern. I have also tried the groupby based on the unique code for the admin unit, although id seems to correspond to the index on the features. This is the same for resample = bilinear and nearest.

@jgrss

extract error

Hey Jordan, I am trying to extract values to a large set of polygons. Just not sure how to interpret. Any ideas? I am using the lastest git version.

 with gw.config.update(ref_image=example, ref_bounds=bounds):
            with gw.open(
                files,
                band_names=["ppt_mean", "ppt_sum"],
                stack_dim="band",
                nodata=np.nan,
            ) as ds:

                df = ds.gw.extract(
                    EAs,
                    all_touched=True,
                    band_names=["ppt_mean", "ppt_sum"],
                    time_names=[date],
                    n_jobs=1,
                )

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~/Documents/github/ET_Random_James_Tasks/ET_precip_2012_2019.py in <module>
      4                    # band_names=["ppt_mean", "ppt_sum"],
      5                    # time_names=[date],
----> 6                    n_jobs=1,
      7                )

~/anaconda3/envs/geowombat2/lib/python3.7/site-packages/geowombat/core/geoxarray.py in extract(self, aoi, bands, time_names, band_names, frac, all_touched, mask, n_jobs, verbose, **kwargs)
   1217                        n_jobs=n_jobs,
   1218                        verbose=verbose,
-> 1219                        **kwargs)
   1220 
   1221     def band_mask(self, valid_bands, src_nodata=None, dst_clear_val=0, dst_mask_val=1):

~/anaconda3/envs/geowombat2/lib/python3.7/site-packages/geowombat/core/sops.py in extract(self, data, aoi, bands, time_names, band_names, frac, min_frac_area, all_touched, id_column, time_format, mask, n_jobs, verbose, n_workers, n_threads, use_client, address, total_memory, processes, pool_kwargs, **kwargs)
    655                                        n_jobs=n_jobs,
    656                                        verbose=verbose,
--> 657                                        **pool_kwargs)
    658 
    659         if df.empty:

~/anaconda3/envs/geowombat2/lib/python3.7/site-packages/geowombat/core/conversion.py in prepare_points(self, data, aoi, frac, min_frac_area, all_touched, id_column, mask, n_jobs, verbose, **kwargs)
    418                                              id_column=id_column,
    419                                              n_jobs=n_jobs,
--> 420                                              **kwargs)
    421 
    422             if not df.empty:

~/anaconda3/envs/geowombat2/lib/python3.7/site-packages/geowombat/core/conversion.py in polygons_to_points(data, df, frac, min_frac_area, all_touched, id_column, n_jobs, **kwargs)
    476                                           meta,
    477                                           frac,
--> 478                                           min_frac_area)
    479 
    480                 if not point_df.empty:

~/anaconda3/envs/geowombat2/lib/python3.7/site-packages/geowombat/core/util.py in sample_feature(df_row, id_column, df_columns, crs, res, all_touched, meta, frac, min_frac_area, feature_array)
    553 
    554     for col in other_cols:
--> 555         fea_df.loc[:, col] = df_row[col]
    556 
    557     return fea_df

~/anaconda3/envs/geowombat2/lib/python3.7/site-packages/pandas/core/indexing.py in __setitem__(self, key, value)
    690 
    691         iloc = self if self.name == "iloc" else self.obj.iloc
--> 692         iloc._setitem_with_indexer(indexer, value, self.name)
    693 
    694     def _validate_key(self, key, axis: int):

~/anaconda3/envs/geowombat2/lib/python3.7/site-packages/pandas/core/indexing.py in _setitem_with_indexer(self, indexer, value, name)
   1586                             if not is_list_like_indexer(value):
   1587                                 raise ValueError(
-> 1588                                     "cannot set a frame with no "
   1589                                     "defined index and a scalar"
   1590                                 )

ValueError: cannot set a frame with no defined index and a scalar

Cython & _moving Issue?

I am trying to build out a docker container to do development in (keep breaking my env by accident).

Here is the compose file:


FROM ubuntu:20.04
USER root

RUN apt update -y && apt upgrade -y && \
    apt install -y software-properties-common && \
    add-apt-repository ppa:ubuntugis/ppa && \
    apt update -y && apt install -y \
    gdal-bin \
    geotiff-bin \
    git \
    libgdal-dev \
    libgl1 \
    libspatialindex-dev \ 
    wget \
    python-is-python3 \
    pip \
    g++ 

RUN pip install Cython numpy  

# RUN pip-compile requirements.txt
RUN git clone https://github.com/jgrss/geowombat.git
WORKDIR "/geowombat"
RUN pip install -r requirements.txt

RUN  python setup.py install  
RUN  pip install sklearn-xarray  pip-tools rtree ipykernel

# works but doesn't bind
RUN python -c "import geowombat as gw"
RUN python -c "import rasterio;from rasterio.crs import CRS; cc = CRS.from_epsg(4326)"

In the development environment I have always gotten the following error. I had been just commenting out the import, but that is stupid.

Any idea what is going on? I think this is an cython issue, but not sure how to resolve it.

Step 10/11 : RUN python -c "import geowombat as gw"
 ---> Running in 97cdbb6f6a2d
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/geowombat/geowombat/__init__.py", line 2, in <module>
    from .core.api import open
  File "/geowombat/geowombat/core/__init__.py", line 1, in <module>
    from ..backends import transform_crs
  File "/geowombat/geowombat/backends/__init__.py", line 2, in <module>
    from .xarray_ import concat, mosaic
  File "/geowombat/geowombat/backends/xarray_.py", line 7, in <module>
    from ..core.windows import get_window_offsets
  File "/geowombat/geowombat/core/windows.py", line 1, in <module>
    from .util import n_rows_cols
  File "/geowombat/geowombat/core/util.py", line 9, in <module>
    from ..moving import moving_window
  File "/geowombat/geowombat/moving/__init__.py", line 1, in <module>
    from ._moving import moving_window
ModuleNotFoundError: No module named 'geowombat.moving._moving'
The command '/bin/sh -c python -c "import geowombat as gw"' returned a non-zero code: 1
The terminal process "/usr/bin/bash '-c', 'docker build --pull --rm -f "docker_debug" -t dbug2 "."'" terminated with exit code: 1.

Remove Xarray ufuncs

Xarray is dropping support for xarray.ufuncs. Instead, NumPy can be used directly on a DataArray. The geowombat.radiometry.sr and geowombat.radiometry.angles modules contain xarray.ufuncs that can be replaced with numpy.

Image co-registration

Look into adding a co-registration module that uses the AROSICS algorithm.

https://pypi.org/project/arosics/

Unable to install on Windows with conda

Installing on Windows 10, I followed the Easiest Install instructions using conda. This led to repeated 'Solving environment: failed...'.

Then I tried the Detailed Install further down on the same page, which is slightly different and seems redundant.

As this doesn't actually include a command to install geowombat at the end, I tried doing this with conda (per the 'Easiest') with the same result.

In the same environment, I also tried with pip: pip install git+https://github.com/jgrss/geowombat.
This results in the fiona install failing with 'A GDAL API version must be specified.'

Following the third set of alternate installation instructions here, (again, slightly different) I get the same fiona error. I tried setting GDAL_VERSION as an environment variable as instructed, but this led to 'ERROR: Failed building wheel for fiona'. I also tried installing fiona with conda, which worked, but then the pip install failed with another wheel error.

Updating attributes with band math

Maybe a minor tweek to improve band math.

import geowombat as gw
from geowombat.data import rgbn_suba, rgbn_subb

with gw.config.update(sensor='rgbn', scale_factor=0.0001):
    with gw.open([rgbn_suba, rgbn_subb],
          band_names=['blue', 'green', 'red', 'nir'],
          mosaic=True,
          bounds_by='union') as ds_mos:

              # calculate NDVI
              ndvi = ds_mos.gw.ndvi()
              print(ndvi)

returns

<xarray.DataArray (band: 1, y: 283, x: 449)>
dask.array<broadcast_to, shape=(1, 283, 449), dtype=float64, chunksize=(1, 64, 64), chunktype=numpy.ndarray>
Coordinates:
  * y        (y) float64 2.05e+06 2.05e+06 2.05e+06 ... 2.049e+06 2.049e+06
  * x        (x) float64 7.929e+05 7.929e+05 7.929e+05 ... 7.952e+05 7.952e+05
  * band     (band) <U4 'ndvi'
Attributes:
    transform:    (5.0, 0.0, 792925.0, 0.0, -5.0, 2050115.0)
    crs:          +init=epsg:32618
    res:          (5.0, 5.0)
    is_tiled:     0
    nodatavals:   (0.0, 0.0, 0.0, 0.0)
    scales:       (1.0, 1.0, 1.0, 1.0)
    offsets:      (0.0, 0.0, 0.0, 0.0)
    resampling:   nearest
    pre-scaling:  0.0001
    sensor:       rgbn
    drange:       (-1, 1)

Shouldn't attributes be updated to reflect a single band after running band math? For instance nodatavals: (0.0), scales: (1.0) etc? This can be problematic I think when going to write out NDVI and needing to reset attributes.

STAC API

@mmann1123 What do you think about adding an open method using STAC tools? Something that simplifies search and read.

I started experimenting here.

I am thinking something like this, which just uses pystac_client and stackstac.

import geowombat as gw
from rasterio.enums import Resampling

data_l = gw.open_stac(
    stac_catalog='microsoft',
    bounds='map.geojson',
    collections=['landsat_c2_l2'],
    bands=['red', 'green', 'blue', 'qa_pixel'],
    mask_data=True,
    resolution=100,
    resampling=Resampling.cubic,
)

data_s2 = gw.open_stac(
    stac_catalog='microsoft',
    bounds='map.geojson',
    collections=['sentinel_s2_l2a'],
    bands=['B04', 'B03', 'B02'],
    resolution=100,
    resampling=Resampling.cubic,
    epsg=32621
)

print(data_l)

<xarray.DataArray 'stackstac-b46c4b9f62edc1d4b9cffdb0a334996a' (time: 4,
                                                                band: 3,
                                                                y: 269, x: 333)>
dask.array<fetch_raster_window, shape=(4, 3, 269, 333), dtype=float64, chunksize=(1, 1, 256, 256), chunktype=numpy.ndarray>
Coordinates: (12/27)
  * time                         (time) datetime64[ns] 2020-07-04T13:12:45.30...
    id                           (time) <U31 'LE07_L2SP_225078_20200704_02_T1...
  * band                         (band) <U5 'red' 'green' 'blue'
  * x                            (x) float64 6.626e+05 6.626e+05 ... 6.725e+05
  * y                            (y) float64 -2.832e+06 -2.832e+06 ... -2.84e+06
    landsat:wrs_path             (time) <U3 '225' '225' '224' '224'
    ...                           ...
    platform                     (time) <U9 'landsat-7' ... 'landsat-7'
    eo:cloud_cover               (time) float64 28.0 17.0 20.49 16.0
    gsd                          int64 30
    title                        (band) <U10 'Red Band' 'Green Band' 'Blue Band'
    raster:bands                 object {'scale': 2.75e-05, 'nodata': 0, 'off...
    epsg                         int64 32621
Attributes:
    spec:        RasterSpec(epsg=32621, bounds=(662580.0, -2840370.0, 672570....
    crs:         epsg:32621
    transform:   | 30.00, 0.00, 662580.00|\n| 0.00,-30.00,-2832300.00|\n| 0.0...
    resolution:  30.0

print(data.gw)

<geowombat.core.geoxarray.GeoWombatAccessor at 0x7f6c94740490>

gw.merge_stac(data_l, data_s2)

<xarray.DataArray 'landsat_c2_l2+sentinel_s2_l2a' (time: 12, band: 3, y: 81,
                                                   x: 101)>
dask.array<getitem, shape=(12, 3, 81, 101), dtype=float64, chunksize=(1, 1, 81, 101), chunktype=numpy.ndarray>
Coordinates:
  * time     (time) datetime64[ns] 2020-07-04T13:12:45.307603 ... 2020-08-01T...
  * band     (band) <U8 'red' 'green' 'blue'
  * y        (y) float64 -2.832e+06 -2.832e+06 ... -2.84e+06 -2.84e+06
  * x        (x) float64 6.625e+05 6.626e+05 6.627e+05 ... 6.724e+05 6.725e+05
Attributes:
    spec:        RasterSpec(epsg=32621, bounds=(662500, -2840400, 672600, -28...
    crs:         epsg:32621
    transform:   | 100.00, 0.00, 662500.00|\n| 0.00,-100.00,-2832300.00|\n| 0...
    resolution:  100
    res:         (100, 100)
    collection:  None

n_chunks?

This might be my lack of understanding, but n_chunks seems to create problems when writing with n_worker and n_threads set.

with gw.open(l8_224078_20200518) as src:
    # Write the data to a GeoTiff
    src.sel(band=[3, 2, 1]).gw.to_raster('./output/output.tif',
                                          n_workers=4, n_threads=2, n_chunks=16,
                                        overwrite=True)

fig, ax = plt.subplots(dpi=200)

with gw.open('./output/output.tif') as src:
    src.where(src != 0).gw.imshow(robust=True, ax=ax)
plt.tight_layout(pad=1)

Predicted .tif file assigns a class to zero

I tried plotting the resulting .tif file in ArcMap and first, the predicted classes were not up to the number of classes in my training data. Second, I wasn't able to build unique values for the raster and the black bands around the raster are not disappearing when I set to display nodata value with no colour.

Pip & Conda and release assets

I think the major hurdle for most people (nubes like me at least) have with geowombat is the complexity of the install (outside linux). Someone already got a semi-working conda feedstock for v 1.2 up. But it points to .tar.gz release assets like this example from rasterio https://github.com/rasterio/rasterio/releases.

Maybe its because I am not the owner of the repository, but I'm not seeing the "create a release" button for geowombat on github for example. I am assuming this is something we can automate through circleci, but not sure.

Just wondering if that's in the works. Or if you can do it manually. I would love to get the conda feedstock running across all the systems.

Allow both subset using window and stack_dim

Just wondering if it would be possible to allow both opening multiple images using stack_dim and at the same time subsetting using window.

Here's what I'd like to do:

import geowombat as gw
from rasterio.windows import Window

w = Window(row_off=0, col_off=0, height=100, width=100)

with gw.open([rgbn, rgbn],
             band_names=['b1', 'g1', 'r1', 'n1', 'b2', 'g2', 'r2', 'n2'],
             stack_dim='band',
             window = w) as ds:
   print(ds)

gives an error "The band names do not match the output dimensions." Checking the error, it seems that the data shape is (2, 4, 100, 100). So,

with gw.open([rgbn, rgbn],
             band_names=['b', 'g', 'r', 'n'],
             time_names = ['old', 'new'],
             window = w) as ds:
   print(ds)

works fine.

I'm using geowombat version 1.2.9

Review data editing page

@mmann1123 I added a page with some editing examples.

Is that what you had in mind?

Data fusion

Add spatial-temporal fusion methods, such as StarFM.

to_raster seperate

Its late and this is a placeholder, but I think I found a problem with to_raster with separate = True (unless Im missing something).

            ds.gw.to_raster('./landuse_stack_2019.tif',
                n_workers=1, n_threads=12,bigtiff=True, compress='lzw',
                separate=True, readxsize=3341, readysize= 2586, out_block_type='gtiff' )
        
2010
100%|███ | 4/4 [00:32<00:00,  8.17s/it]
Traceback (most recent call last):
  File "<stdin>", line 16, in <module>
  File "/home/mmann1123/Documents/github/geowombat/geowombat/core/geoxarray.py", line 719, in to_raster
    **kwargs)
  File "/home/mmann1123/Documents/github/geowombat/geowombat/core/io.py", line 884, in to_raster
    group_keys = list(root.group_keys())
AttributeError: 'NoneType' object has no attribute 'group_keys'
>>>

Tassel cap issue?

Just looking to plot a tcap band and got an unexpected error.

fig, ax = plt.subplots(dpi=150)

with gw.config.update(sensor='qb', scale_factor=0.0001):
    with gw.open(rgbn) as ds:
        tcap = ds.gw.tasseled_cap()
        tcap.sel(band='wetness').plot(robust=True, ax=ax)
plt.tight_layout(pad=1)

throws

TypeError: Plotting requires coordinates to be numeric, boolean, or dates of type numpy.datetime64, datetime.datetime, cftime.datetime or pandas.Interval. Received data of type object instead.

Dtypes in ML

Resolved in PR #87

Code
Current in conversion.py

def polygon_to_array(
        self,
        polygon,
        col=None,
        data=None,
        ...
        dtype="uint16",
        ):
 ....

 if col:
            shapes = (
                (geom, value) for geom, value in zip(dataframe.geometry, dataframe[col])
                )
            
 else:
            shapes = dataframe.geometry.values

 varray = rasterize(
                    shapes,
                    out_shape=(dst_height, dst_width),
                    transform=dst_transform,
                    fill=fill,
                    default_value=default_value,
                    all_touched=all_touched,
                    dtype=dtype,
                )

Problem

ml.fit_predict attempted to recast model predictions to unit16.
this throws type error in rasterizer when prediction can't be recast

Solution

If 'col' is included, use rasterio.dtypes.get_minimum_dtype to update dtype.
The proposed solution is tesed and works for continuous and int dtypes

if col:
            shapes = (
                (geom, value) for geom, value in zip(dataframe.geometry, dataframe[col])
            )
            dtype=get_minimum_dtype(dataframe[col])
            
 else:
            shapes = dataframe.geometry.values

 varray = rasterize(
                shapes,
                out_shape=(dst_height, dst_width),
                transform=dst_transform,
                fill=fill,
                default_value=default_value,
                all_touched=all_touched,
                dtype=dtype,
            )

Ubuntu 22.4 & conda

I was just mucking around with docker and testing the latest ubuntu version 22.4, and conda install.

I got the following errors on testing. Sops is installed, so not sure what's going on here.

I am wondering if it has to do more with conda install rather than the new ubuntu version

======================================================================
ERROR: test_coreg_data (test_coreg.TestCOREG)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/geowombat/tests/test_coreg.py", line 29, in test_coreg_data
    data = gw.coregister(
  File "/root/miniconda3/lib/python3.9/site-packages/geowombat/core/sops.py", line 1242, in coregister
    raise NameError
NameError

======================================================================
ERROR: test_coreg_shift (test_coreg.TestCOREG)
Tests a 1-pixel shift.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/geowombat/tests/test_coreg.py", line 79, in test_coreg_shift
    shifted = gw.coregister(
  File "/root/miniconda3/lib/python3.9/site-packages/geowombat/core/sops.py", line 1242, in coregister
    raise NameError
NameError

======================================================================
ERROR: test_coreg_transform_data (test_coreg.TestCOREG)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/geowombat/tests/test_coreg.py", line 50, in test_coreg_transform_data
    data = gw.coregister(
  File "/root/miniconda3/lib/python3.9/site-packages/geowombat/core/sops.py", line 1242, in coregister
    raise NameError
NameError

======================================================================
ERROR: test_array_type (test_properties.TestProperties)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/geowombat/tests/test_properties.py", line 36, in test_array_type
    self.assertTrue(src.gw.array_is_dask)
AttributeError: 'GeoWombatAccessor' object has no attribute 'array_is_dask'

======================================================================
ERROR: test_chunksize_check (test_properties.TestProperties)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/geowombat/tests/test_properties.py", line 42, in test_chunksize_check
    self.assertEqual(src.gw.check_chunksize(64, 128), 64)
AttributeError: 'GeoWombatAccessor' object has no attribute 'check_chunksize'

----------------------------------------------------------------------

Read value error with rasterio.WarpedVRT()

When geowombat.open() is used within a configuration context with reference bounds set, the underlying rasterio.vrt.WarpedVRT() modifies the data. The issue appears to only occur with floating point data. Tests on thematic data did not result in incorrect data reads.

Example

bounds = (left, bottom, right, top)

with gw.config.update(ref_bounds=bounds):
    with gw.open(image) as src:
        # do something here

results in the correct positional read (i.e., the correct bounding box is read), but the data values are modified in rasterio.vrt.WarpedVRT().

'DatasetReader' object has no attribute 'gw'

I have run into this a few times now since the latest update. I haven't had time to look into it. Sorry with Fiona pulling at me all the time, I have about 1 hr of productive work time a day :(

You probably already have it resolved but thought I would check

import geowombat as gw
from geowombat.data import rgbn

from rasterio.windows import Window
w = Window(row_off=0, col_off=0, height=100, width=100)

with gw.open(rgbn,
                band_names=['blue', 'green', 'red'],
                num_workers=8,
                indexes=[1, 2, 3],
                window=w,
                ) as src:
    print(src)

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
~/Documents/github/pyGIS/pygis/docs/test.py in 
      1185                 num_workers=8,
     1186                 indexes=[1, 2, 3],
---> 1187                 window=w,
     1188                 ) as src:
     1189     print(src)

~/anaconda3/envs/pygisbookgw/lib/python3.7/site-packages/geowombat/core/api.py in __init__(self, filename, band_names, time_names, stack_dim, bounds, bounds_by, resampling, persist_filenames, netcdf_vars, mosaic, overlap, nodata, dtype, num_workers, **kwargs)
    412                              chunks=chunks,
    413                              num_workers=num_workers,
--> 414                              **kwargs)
    415 
    416             self.__filenames = [filename]

~/anaconda3/envs/pygisbookgw/lib/python3.7/site-packages/geowombat/core/api.py in read(filename, band_names, time_names, bounds, chunks, num_workers, **kwargs)
    174                 kwargs['window'] = from_bounds(*bounds, transform=src.gw.transform)
    175 
--> 176             ycoords, xcoords, attrs = get_attrs(src, **kwargs)
    177 
    178         data = dask.compute(read_delayed(filename,

~/anaconda3/envs/pygisbookgw/lib/python3.7/site-packages/geowombat/core/api.py in get_attrs(src, **kwargs)
     71     attrs = dict()
     72 
---> 73     attrs['transform'] = src.gw.transform
     74 
     75     if hasattr(src, 'crs'):

AttributeError: 'DatasetReader' object has no attribute 'gw'

Extract key error

For a simple gpkg file, getting a key error. Wondering if the index should be used for the id?

sheds
Out[21]: 
                                              geometry
0    POLYGON ((37.28166 11.48168, 37.30379 11.48181...
1    POLYGON ((36.75196 11.33893, 36.78143 11.33915...
2    POLYGON ((37.35700 12.48734, 37.39401 12.48757...
3    POLYGON ((37.34259 12.49546, 37.37960 12.49569...
4    POLYGON ((36.89493 11.42643, 36.91459 11.42657...
..                                                 ...

Extracting data using the following:

with gw.open('./data/NDVI_MODIS/NDVI_2013_01_01_250m.tif') as modis:
    print(modis)
    df = gw.extract(modis,'./data/CommunitySurvey/watershed_bboxes.gpkg',
                            #bands=3,
                            #band_names=['red'],
                            #frac=0.1,
                            #n_jobs=8,
                            #num_workers=8,
                            verbose=1)

Returns

Coordinates:
  * band     (band) int64 1
  * y        (y) float64 14.89 14.88 14.88 14.88 ... 3.412 3.41 3.408 3.406
  * x        (x) float64 33.01 33.01 33.01 33.01 ... 47.99 47.99 47.99 47.99
Attributes:
    transform:     (0.002245788210298804, 0.0, 33.00410353855122, 0.0, -0.002...
    crs:           +init=epsg:4326
    res:           (0.002245788210298804, 0.002245788210298804)
    is_tiled:      1
    nodatavals:    (nan,)
    scales:        (1.0,)
    offsets:       (0.0,)
    descriptions:  ('NDVI',)
    filename:      ./data/NDVI_MODIS/NDVI_2013_01_01_250m.tif
    resampling:    nearest
  0%|          | 0/136 [00:00<?, ?it/s]
Traceback (most recent call last):

  File "/home/mmann1123/Documents/github/IFPRI_ET_watersheds/scripts/extract_timeseries_to_poly.py", line 33, in <module>
    verbose=1)

  File "/home/mmann1123/anaconda3/envs/xr_fresh/lib/python3.7/site-packages/geowombat/core/sops.py", line 423, in extract
    verbose=verbose)

  File "/home/mmann1123/anaconda3/envs/xr_fresh/lib/python3.7/site-packages/geowombat/core/conversion.py", line 423, in prepare_points
    n_jobs=n_jobs)

  File "/home/mmann1123/anaconda3/envs/xr_fresh/lib/python3.7/site-packages/geowombat/core/conversion.py", line 464, in polygons_to_points
    point_df = sample_feature(dfrow[id_column],

  File "/home/mmann1123/anaconda3/envs/xr_fresh/lib/python3.7/site-packages/pandas/core/series.py", line 871, in __getitem__
    result = self.index.get_value(self, key)

  File "/home/mmann1123/anaconda3/envs/xr_fresh/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 4404, in get_value
    return self._engine.get_value(s, k, tz=getattr(series.dtype, "tz", None))

  File "pandas/_libs/index.pyx", line 80, in pandas._libs.index.IndexEngine.get_value

  File "pandas/_libs/index.pyx", line 90, in pandas._libs.index.IndexEngine.get_value

  File "pandas/_libs/index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc

  File "pandas/_libs/hashtable_class_helper.pxi", line 1619, in pandas._libs.hashtable.PyObjectHashTable.get_item

  File "pandas/_libs/hashtable_class_helper.pxi", line 1627, in pandas._libs.hashtable.PyObjectHashTable.get_item

KeyError: 'id'

	from xarray.ufuncs import maximum as xr_maximum
	from xarray.ufuncs import minimum as xr_mininum