
eodal's Introduction


E🌍dal (EOdal) - Earth Observation Data Analysis Library

pip install eodal

A truly open-source package for unified analysis of Earth Observation (EO) data

✔️ Cloud-native by design thanks to STAC

✔️ Access to petabytes of global EO data, including satellite imagery, with the native Mapper module

✔️ EO data querying, I/O, processing, analysis and visualization in a single package

✔️ Modular and lightweight architecture

✔️ Almost unlimited expandability with interfaces to xarray, numpy, geopandas, and many more

About

EOdal is a Python library enabling the acquisition, organization, and analysis of EO data in a completely open-source manner within a unified framework.

EOdal enables open-source, reproducible geospatial data science. At the same time, EOdal lowers the burden of data handling and provides access to global satellite data archives through downloaders and the fantastic SpatioTemporal Asset Catalogs (STAC).

EOdal supports working in cloud environments using STAC catalogs ("online" mode) and on local premises using a spatial PostgreSQL/PostGIS database to organize metadata ("offline" mode).
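Switching between the two modes is done through the EOdal settings object; the same flag is used in the Mapper examples further down:

from eodal.config import get_settings

Settings = get_settings()
# True: query STAC catalogs ("online"); False: query the local PostgreSQL/PostGIS metadata database ("offline")
Settings.USE_STAC = True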

Read more about EOdal in our peer-reviewed article.

Citing EOdal

We put a lot of effort into developing EOdal. To give us proper credit, please respect our license agreement. When you use EOdal for your research, please also cite our paper to give us proper scientific credit.

	@article{GRAF2022107487,
	title = {EOdal: An open-source Python package for large-scale agroecological research using Earth Observation and gridded environmental data},
	journal = {Computers and Electronics in Agriculture},
	volume = {203},
	pages = {107487},
	year = {2022},
	issn = {0168-1699},
	doi = {https://doi.org/10.1016/j.compag.2022.107487},
	url = {https://www.sciencedirect.com/science/article/pii/S0168169922007955},
	author = {Lukas Valentin Graf and Gregor Perich and Helge Aasen},
	keywords = {Satellite data, Python, Open-source, Earth Observation, Ecophysiology}
	}

Data Model

EOdal data model

EOdal has a sophisticated data model that maps the complexity of Earth Observation data onto Python classes. The object-based design of EOdal builds on four base classes (a minimal usage sketch follows the list):

  • EOdal Band is the class for handling single bands. A band is a two-dimensional raster layer (i.e., a two-dimensional array). Each raster cell takes a value. These values could represent color intensity, elevation above mean sea level, or temperature readings, to name just a few examples. A band has a name and an optional alias. Its raster grid cells are geo-referenced, meaning each cell can be localized in a spatial reference system.
  • EOdal RasterCollection is a class that contains 0 to n Band objects. The bands are identified by their name or alias (if available).
  • EOdal Scene is essentially a RasterCollection with SceneMetadata that assigns the RasterCollection a timestamp and an optional scene identifier.
  • EOdal SceneCollection is a collection of 0 to n Scenes. The scenes are identified by their timestamp or scene identifier (if available).
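The snippet below is a minimal usage sketch of this hierarchy. It only uses calls that appear elsewhere in this document (Sentinel2.from_safe, band access by name or alias, iteration over a SceneCollection); the .SAFE path is a placeholder.

from eodal.core.sensors.sentinel2 import Sentinel2

# a Sentinel2 object is a Scene, i.e., a RasterCollection plus SceneMetadata
scene = Sentinel2.from_safe(
    in_dir='S2A_MSIL2A_<...>.SAFE',   # placeholder path to a .SAFE archive
    band_selection=['B02', 'B03', 'B04']
)

band = scene['blue']   # a single Band, addressed here by its alias
band.geo_info          # geo-referencing of the raster grid (EPSG, origin, pixel size)

# a SceneCollection (e.g., mapper.data further below) is iterable and yields
# (identifier, Scene) pairs:
# for _, scene in scene_collection:
#     scene['red']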

Mapper

The EOdal Mapper is one of the key components of EOdal. If you are familiar with GEE, you can expect similarly easy access to vast amounts of EO data - except that it is truly open-source. If you are completely new to EO, you will quickly learn how to query, read, and process large data volumes.

In the example below, Sentinel-2 data is loaded for an area of interest in central Switzerland from Microsoft Planetary Computer (no authentication required).

import geopandas as gpd

from datetime import datetime
from pathlib import Path
from eodal.core.sensors.sentinel2 import Sentinel2
from eodal.mapper.feature import Feature
from eodal.mapper.filter import Filter
from eodal.mapper.mapper import Mapper, MapperConfigs
from typing import List


#%% user-inputs
# -------------------------- Collection -------------------------------
collection: str = 'sentinel2-msi'
	
# ------------------------- Time Range ---------------------------------
time_start: datetime = datetime(2022,3,1)  		# year, month, day (incl.)
time_end: datetime = datetime(2022,6,30)   		# year, month, day (incl.)
	
# ---------------------- Spatial Feature  ------------------------------
geom: Path = Path('data/sample_polygons/lake_lucerne.gpkg')
	
# ------------------------- Metadata Filters ---------------------------
metadata_filters: List[Filter] = [
	Filter('cloudy_pixel_percentage','<', 80),
	Filter('processing_level', '==', 'Level-2A')
]
	
#%% query the scenes available (no I/O of scenes, this only fetches metadata)
feature = Feature.from_geoseries(gpd.read_file(geom).geometry)
mapper_configs = MapperConfigs(
	collection=collection,
	time_start=time_start,
	time_end=time_end,
	feature=feature,
	metadata_filters=metadata_filters
)

# now, a new Mapper instance is created
mapper = Mapper(mapper_configs)
mapper.query_scenes()
	
#%% load the scenes available from STAC (reading bands B02 "blue", B03 "green", B04 "red")
scene_kwargs = {
	'scene_constructor': Sentinel2.from_safe,
	'scene_constructor_kwargs': {'band_selection': ['B02', 'B03', 'B04'], 'read_scl': False}
}

mapper.load_scenes(scene_kwargs=scene_kwargs)

# the data is loaded into `mapper.data` as an EOdal SceneCollection
mapper.data

Examples

We have compiled a set of Jupyter notebooks showing you the capabilities of EOdal and how to unlock them.

Contributing

Contributions to EOdal are welcome. Please make sure to read the contribution guidelines first.

Code Coverage

The inner-most circle represents the entire project; moving away from the center are folders and, finally, individual files. The size and color of each slice represent the number of statements and the coverage, respectively.

[codecov coverage graph]

eodal's People

Contributors

atoparseks, dependabot[bot], gperich, lukasvalentin, rubenbaer


eodal's Issues

Improving the Mapper class: Proposal of SceneCollections

Currently, the Mapper class in EOdal is overloaded and lacks generality (its underlying data structures are poorly structured). This hampers the use of the Mapper class when writing application code, since users still have to write a lot of code on their side. In particular, the Mapper class currently lacks a convenient API for working with retrieved data, and iterating over the retrieved scenes is not implemented in an elegant way.

This issue is to initiate development to improve the current Mapper class, which will be renamed to SceneCollection. A mock-up of a future API call could look like this:

import cv2
from shapely.geometry import Point
from eodal.scenes import Filter, Feature, SceneCollection

# define area of interest (in this case it's a single point)
feature = Feature(name='test',  geometry=Point([49,11]), epsg=4326)

# define filters (might be further refined)
date_filter = Filter('date', 'between 2022-10-01 and 2022-10-15')
cloud_filter = Filter('cloud_cover', '<30')

# define options for the actual I/O of data into RasterCollections
io_kwargs = {'band_selection': ['B02', 'B03', 'B04'], 'spatial_resolution': 10, 'resampling_method': cv2.INTER_NEAREST_EXACT}

collection = SceneCollection(
   catalog='Sentinel-2',
   processing_level='Level-2A',
   feature=feature,
   filters=[date_filter, cloud_filter],
   io_kwargs=io_kwargs
)

# load metadata + data
collection.load()

# to access the actual data, SceneCollections will be iterable
for scene in collection:
   pass

A further idea is to pass RasterCollection operators directly to SceneCollection instances, e.g., to perform operations on the contained RasterCollections on the fly, as discussed with @m-i-g-u-e-l.

Improve README.rst

The current version of the README.rst file is partly outdated and could use some polish.

BUG: No proper heuristic for selection of reference in EOdal mapper

In version 0.2.1 we announced that the grid alignment issue of scenes from different projections had been fixed in the mapper call. However, this is still not completely true, as shown when trying to fetch data from two Sentinel-2 tiles in two different UTM zones.

This is because of a flaw in the mapper: while we select the largest common spatial bounding box of all scenes and then force the mapper to align all scenes into that bounding box, we do not check whether the grids are aligned (reprojection errors and inaccuracies!) and whether the pixel sizes match, as we can see from the code snippet below:

minx, maxy = total_bounds[0], total_bounds[-1]
dst_transform = Affine(
    a=geo_info.pixres_x,
    b=0,
    c=minx,
    d=0,
    e=geo_info.pixres_y,
    f=maxy)

While total_bounds is the same for all scenes (desired), geo_info is potentially different for the individual scenes (see here):

for _, scene in scoll:
    if scene.is_bandstack():
        # get a band of the SceneCollection
        reference_band = scene[scene.band_names[0]]
        geo_info = reference_band.geo_info

Thus, while the total bounds are enforced correctly, the pixel size could still differ, and with it the number of rows and columns per scene. As a solution, we need to select a single reference scene for all scenes in the mapper.

However, we need to define such a heuristic. I'd propose the following (a sketch is given after the list):

  1. Are there scenes that have not been reprojected (least error)?
  2. If NO: use the first scene in the collection as reference
  3. If YES: use this scene as reference
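A minimal sketch of this heuristic (illustrative only, not the actual mapper code), assuming the target EPSG code of the mosaic is known; a scene whose native projection equals the target EPSG did not need to be reprojected:

def select_reference_scene(scoll, target_epsg: int):
    # prefer a scene that was not reprojected (least reprojection error)
    for _, scene in scoll:
        reference_band = scene[scene.band_names[0]]
        if reference_band.geo_info.epsg == target_epsg:
            return scene
    # otherwise fall back to the first scene in the collection
    return next(iter(scoll))[1]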

Retry when STAC query fails

Due to occasional problems at Microsoft Planetary Computer (see here), the REST API calls to STAC made through pystac-client should become more robust to avoid HTTP errors when querying the STAC catalog.

As proposed here, we will add a Retry policy to the HTTP adapter used by pystac-client.
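A sketch of what this could look like (assuming pystac-client's StacApiIO exposes a requests session, as used in the proxy-server issue further below):

from pystac_client.stac_api_io import StacApiIO
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# retry failed STAC requests up to 5 times with exponential back-off
retries = Retry(total=5, backoff_factor=1, status_forcelist=[500, 502, 503, 504])
stac_api_io = StacApiIO()
stac_api_io.session.mount('https://', HTTPAdapter(max_retries=retries))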

CREODIAS downloader broken

Description

The CREODIAS downloader implemented in EOdal is broken due to the migration of the CREODIAS platform to the new Copernicus Data Space Ecosystem earlier this year.

In detail, the generation of Keycloak tokens required to sign download requests raises an error that the credentials are invalid. This is due to the two-factor authentication (2FA) now enforced by the platform. Moreover, the API endpoint for obtaining zipped Sentinel scenes has changed.

Desired behavior

Reactivate the download functionality so that zipped Sentinel scenes can be downloaded again from CREODIAS.

bug in get_feature_timeseries

Something is wrong with get_feature_timeseries.

The median should equal the 50th percentile -- but it doesn't in my case. Moreover, both mean and median are outside the valid NDVI range and identical for all dates.

[screenshot of the extracted statistics]

I wonder whether it is relevant that the two polygons I passed overlap: one is the original, the other is buffered inwards (to eliminate tree shadows).

eodal mapper import breaks with pydantic version 2.0.2

The Mapper class of EOdal doesn't work with the latest pydantic version, 2.0.2.

To Reproduce
Steps to reproduce the behavior:

  1. Install eodal using pip install eodal
  2. Activate the python environment
  3. In a Python script, import the Mapper class: from eodal.mapper.mapper import Mapper, MapperConfigs
  4. See error

Error Message

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/gperich/Projects/CropClassesTerensis/.venv/lib64/python3.11/site-packages/eodal/mapper/mapper.py", line 43, in <module>
    from eodal.config import get_settings
  File "/home/gperich/Projects/CropClassesTerensis/.venv/lib64/python3.11/site-packages/eodal/config/__init__.py", line 1, in <module>
    from .settings import get_settings
  File "/home/gperich/Projects/CropClassesTerensis/.venv/lib64/python3.11/site-packages/eodal/config/settings.py", line 35, in <module>
    from pydantic import BaseSettings
  File "/home/gperich/Projects/CropClassesTerensis/.venv/lib64/python3.11/site-packages/pydantic/__init__.py", line 206, in __getattr__
    return _getattr_migration(attr_name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gperich/Projects/CropClassesTerensis/.venv/lib64/python3.11/site-packages/pydantic/_migration.py", line 279, in wrapper
    raise PydanticImportError(
pydantic.errors.PydanticImportError: `BaseSettings` has been moved to the `pydantic-settings` package. See https://docs.pydantic.dev/2.0.2/migration/#basesettings-has-moved-to-pydantic-settings for more details.

Fix
Downgrade the pydantic package to version 1.10.9:

pip uninstall pydantic
pip install pydantic==1.10.9

numpy masked array arithmetic; ValueError: output array is read-only for numpy>=1.22

File "/home/miguel/code/agroscope/eodal/eodal-venv39_v2/lib/python3.9/site-packages/numpy/lib/function_base.py", line 4489, in _lerp
subtract(b, diff_b_a * (1 - t), out=lerp_interpolation, where=t >= 0.5)

ValueError: output array is read-only

lerp_interpolation
Out [3]: masked

type(lerp_interpolation)
Out [4]: <class 'numpy.ma.core.MaskedConstant'>

Example code:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Tue Aug 9 11:53:46 2022

@author: miguel
"""
# import cv2
import numpy as np
import os

from pathlib import Path
from eodal.core.spectral_indices import SpectralIndices
from eodal.core.sensors import Sentinel2

# make plots larger by default
import matplotlib.pyplot as plt
plt.rcParams['figure.figsize'] = [10, 10]

# download test data (if not done yet)
import requests
from eodal.downloader.sentinel2.utils import unzip_datasets

# URL to the public dataset
url = 'https://data.mendeley.com/public-files/datasets/ckcxh6jskz/files/e97b9543-b8d8-436e-b967-7e64fe7be62c/file_downloaded'

testdata_dir = Path('../../data')
if not os.path.exists(testdata_dir):
    os.makedirs(testdata_dir)
testdata_fname = testdata_dir.joinpath('S2A_MSIL2A_20190524T101031_N0212_R022_T32UPU_20190524T130304.zip')
testdata_fname_unzipped = Path(testdata_fname.as_posix().replace('.zip', '.SAFE'))

# check first if the dataset has already been downloaded; only start the
# download if the dataset is not yet available locally
if not testdata_fname_unzipped.exists():
    # download dataset
    r = requests.get(url, stream=True)
    r.raise_for_status()
    with open(testdata_fname, 'wb') as fd:
        for chunk in r.iter_content(chunk_size=5096):
            fd.write(chunk)
    # unzip dataset
    unzip_datasets(download_dir=testdata_dir)

# we can either read all bands or define a subset to read
band_selection = ['B02', 'B03', 'B04', 'B05', 'B07', 'B08']

# define file-path to ESRI shapefile (all formats understood by fiona work)
base_dir = Path(os.path.dirname(os.path.realpath("__file__"))).parent.parent
in_file_aoi = base_dir.joinpath('data/ref_feats/test_by.shp')

# read data from .SAFE dataset for the selected AOI and spectral bands
handler = Sentinel2().from_safe(
    in_dir=testdata_fname_unzipped,
    vector_features=in_file_aoi,
    band_selection=band_selection,
    apply_scaling=False  # if True, scales the reflectance values between 0 and 1
)
handler

fig_blue = handler.plot_band('blue', colorbar_label='Surface Reflectance')
vals = handler.get_values(['B02'])

Band.from_rasterio(): check for Path or URL objects and raise an error depending on the status of USE_STAC

Currently, EOdal does not check whether the user passes a "real" file-system path or a URL when reading raster data using Band.from_rasterio() (the same applies to the Band.read_pixels method). If, e.g., USE_STAC = True and a file-system path is passed, this raises an error that is not properly handled at the moment.

The idea is to check the status of the USE_STAC flag and verify that the type of the passed input dataset matches it (a sketch is given below).
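A sketch of such a check (illustrative only; the helper name is hypothetical):

from eodal.config import get_settings

Settings = get_settings()

def _check_input_matches_use_stac(fpath_raster) -> None:
    # URLs are expected when USE_STAC is True, file-system paths otherwise
    is_url = str(fpath_raster).startswith(('http://', 'https://'))
    if Settings.USE_STAC and not is_url:
        raise ValueError('USE_STAC is True but a local file-system path was passed')
    if not Settings.USE_STAC and is_url:
        raise ValueError('USE_STAC is False but a URL was passed')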

Sentinel2Mapper unique_id_attribute error

If unique_id_attribute = 'id' points to a field of type 'string', everything works as expected.

If unique_id_attribute = 'id' points to a field of type 'int', the following happens:

Example:

# get a new mapper instance
mapper = Sentinel2Mapper(
    date_start=date_start,
    date_end=date_end,
    processing_level=processing_level,
    cloud_cover_threshold=scene_cloud_cover_threshold,
    mapper_configs=mapper_configs,
    feature_collection=aoi,
    unique_id_attribute='id'
)

File "/home/miguel/code/eodal_fork/eodal/eodal/operational/mapping/sentinel2.py", line 428, in get_complete_timeseries
if isinstance(res, gpd.GeoDataFrame):

UnboundLocalError: local variable 'res' referenced before assignment

Bug: `RasterCollection.to_dataframe()` fails when `RasterCollection.is_bandstack()` is False

For a RasterCollection where is_bandstack() returns False (i.e., the bands in the RasterCollection have different pixel sizes, geographic extents, or reference systems), the RasterCollection.to_dataframe() method fails with a key error, e.g.:

KeyError: ('B03', 'geometry')

This is because of an incorrect call of pandas.join that was never tested properly.

The solution is to use the work-around proposed by @atoparseks in case the is_bandstack() method returns False.

from functools import reduce

import pandas as pd

px_list = [handler[b].to_dataframe() for b in handler.band_names]

gdf_rapeseed_pixels = reduce(
    lambda left, right: pd.merge(left, right, on=["geometry"]),
    px_list
)

Use EOdal and pystac_client behind Proxy Server

The way the pystac_client library is called to query STAC items currently fails when sitting behind a proxy server. See also this thread.

The issue has been resolved by specifying the path to a custom SSL certificate bundle. In EOdal this will be addressed by adding a new environment variable STAC_API_IO_CA_BUNDLE in the eodal.config.settings module. The default value of the variable will be True and should be overwritten with a path to a certificate bundle (e.g., REQUESTS_CA_BUNDLE or CURL_CA_BUNDLE) when a custom SSL certificate is required behind a proxy server.

The STAC client is then called as follows:

from pystac_client.stac_api_io import StacApiIO
from pystac_client.client import Client
from eodal.config import get_settings

Settings = get_settings()
stac_api_io = StacApiIO()
stac_api_io.session.verify = Settings.STAC_API_IO_CA_BUNDLE  # the default of this variable is True
client = Client.from_file("https://planetarycomputer.microsoft.com/api/stac/v1", stac_io=stac_api_io)
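To use a custom certificate behind a proxy server, the variable is then overwritten with the path to the bundle (the path shown is a placeholder):

Settings.STAC_API_IO_CA_BUNDLE = '/path/to/ca-bundle.pem'  # placeholder path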

Sentinel-2.from_safe issue

In eodal/examples/sentinel2_l2a_safe.py:

# read data from .SAFE (all 10 and 20 m bands + scene classification layer)
s2_ds = Sentinel2().from_safe(
    in_dir=dot_safe_dir,
    vector_features=bbox_gdf
)

gives me the error:

File "/eodal/eodal/utils/sentinel2.py", line 268, in get_S2_bandfiles_with_res
raise BandNotFoundError(

BandNotFoundError: Could not determine file-path of B02 from S2A_MSIL2A_20190524T101031_N0212_R022_T32UPU_20190524T130304.SAFE:

Landsat Support

This issue is a tracker for the progress on the full integration of Landsat into EOdal. The goal is to provide users with a similarly smooth experience as with Sentinel-2, including the usage of the Mapper class. We focus on Collection-2 products (1982 to present), but an integration of Collection-1 is feasible as well.

Thanks to the modular and sensor-agnostic design of the core and mapper submodules in EOdal, only a few coding efforts are necessary (a class skeleton is sketched after the task list):

  • define a class Landsat in eodal.core.sensors that inherits from the RasterCollection class
  • add a class method from_usgs that allows reading Landsat data from GeoTiff files issued by USGS
  • add a method to conveniently mask out clouds and other artifacts
  • ensure Landsat data can be fetched via STAC from Microsoft Planetary Computer
  • test the integration with the Mapper class
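A skeleton of the planned class (a sketch only; the method names follow the existing Sentinel2 sensor class, the signatures are assumptions):

from eodal.core.raster import RasterCollection

class Landsat(RasterCollection):
    """Landsat Collection-2 scenes read from USGS GeoTiffs or via STAC."""

    @classmethod
    def from_usgs(cls, in_dir, band_selection=None, **kwargs):
        # read the single-band GeoTiffs issued by USGS into one collection
        raise NotImplementedError

    def mask_clouds_and_shadows(self, **kwargs):
        # mask clouds, cloud shadows and other artifacts using the QA band
        raise NotImplementedError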

SceneCollection.get_feature_timeseries doesn't accept custom function

I'm not even sure whether this is a bug, a missing feature, or whether I should pass the functions in some other way.

I would like to compute a summary statistic, such as a percentile, of some band over a polygon.

scoll.get_feature_timeseries(sample_fields, method=['median', "percentile_10"])

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
File /usr/local/lib/python3.10/dist-packages/rasterstats/main.py:294, in gen_zonal_stats(vectors, raster, layer, band, nodata, affine, stats, all_touched, categorical, category_map, add_stats, zone_func, raster_out, prefix, geojson_out, boundless, **kwargs)
    293 try:
--> 294     feature_stats[stat_name] = stat_func(masked, feat["properties"])
    295 except TypeError:
    296     # backwards compatible with single-argument function

TypeError: Band.reduce.<locals>._fun_prototype() takes 1 positional argument but 2 were given

During handling of the above exception, another exception occurred:

AttributeError                            Traceback (most recent call last)
Cell In[82], line 1
----> 1 scoll.get_feature_timeseries(sample_fields, method=['median', "percentile_10"])

File /usr/local/lib/python3.10/dist-packages/eodal/core/scene.py:499, in SceneCollection.get_feature_timeseries(self, vector_features, reindex_dataframe, **kwargs)
    497     _gdf = scene.get_pixels(vector_features=gdf, **kwargs)
    498 else:
--> 499     _gdf = scene.band_summaries(by=gdf, **kwargs)
    500 _gdf['acquisition_time'] = timestamp
    501 gdf_list.append(_gdf)

File /usr/local/lib/python3.10/dist-packages/eodal/utils/decorators.py:160, in check_band_names.<locals>.wrapper(self, *args, **kwargs)
    157         if not set(band_names).issubset(self.band_names):
    158             raise BandNotFoundError(f"{band_names} not found in collection")
--> 160 return f(self, *args, **kwargs)

File /usr/local/lib/python3.10/dist-packages/eodal/core/raster.py:1259, in RasterCollection.band_summaries(self, band_selection, **kwargs)
   1257     band_selection = self.band_names
   1258 for band_name in band_selection:
-> 1259     band_stats = self[band_name].reduce(**kwargs)
   1260     # band_stats is a list of 1:N entries (one per feature on which reduce
   1261     # was called); we add the band name as attribute
   1262     for idx in range(len(band_stats)):

File /usr/local/lib/python3.10/dist-packages/eodal/core/band.py:2069, in Band.reduce(self, method, by)
   2064         vals = vals.filled(np.nan)
   2065     # unfortunately, rasterstats always calculates some default statistcs. We therefore
   2066     # trick it by calling only the count method and delete the result afterwards
   2067     # Also, there seems to be a bug in rasterstats preventing more than a single
   2068     # operator to be passed in add_stats (otherwise the values are returned are wrong)
-> 2069     stats = zonal_stats(
   2070         features, vals, affine=affine, stats='median', add_stats=add_stats
   2071     )
   2072     stats_operator_list.append(stats)
   2073 # combine the list of stats into a format consistent with the standard zonal_stats call

File /usr/local/lib/python3.10/dist-packages/rasterstats/main.py:40, in zonal_stats(*args, **kwargs)
     32 def zonal_stats(*args, **kwargs):
     33     """The primary zonal statistics entry point.
     34 
     35     All arguments are passed directly to ``gen_zonal_stats``.
   (...)
     38     The only difference is that ``zonal_stats`` will
     39     return a list rather than a generator."""
---> 40     return list(gen_zonal_stats(*args, **kwargs))

File /usr/local/lib/python3.10/dist-packages/rasterstats/main.py:297, in gen_zonal_stats(vectors, raster, layer, band, nodata, affine, stats, all_touched, categorical, category_map, add_stats, zone_func, raster_out, prefix, geojson_out, boundless, **kwargs)
    294             feature_stats[stat_name] = stat_func(masked, feat["properties"])
    295         except TypeError:
    296             # backwards compatible with single-argument function
--> 297             feature_stats[stat_name] = stat_func(masked)
    299 if raster_out:
    300     feature_stats["mini_raster_array"] = masked

File /usr/local/lib/python3.10/dist-packages/eodal/core/band.py:2055, in Band.reduce.<locals>._fun_prototype(x)
   2050 def _fun_prototype(x: np.ndarray | np.ma.MaskedArray):
   2051     """
   2052     a function prototype to by-pass custom numpy functions
   2053     to rasterstats
   2054     """
-> 2055     return eval(f'{expression}(x)')

File <string>:1

AttributeError: module 'numpy.ma' has no attribute 'percentile_10'

Then I tried to define percentile_10 as a custom function:

def percentile_10 (x):
    return np.percentile(x, 10)

scoll.get_feature_timeseries(sample_fields, method=['median', percentile_10])

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[81], line 1
----> 1 scoll.get_feature_timeseries(sample_fields, method=['median', percentile_10])

File /usr/local/lib/python3.10/dist-packages/eodal/core/scene.py:499, in SceneCollection.get_feature_timeseries(self, vector_features, reindex_dataframe, **kwargs)
    497     _gdf = scene.get_pixels(vector_features=gdf, **kwargs)
    498 else:
--> 499     _gdf = scene.band_summaries(by=gdf, **kwargs)
    500 _gdf['acquisition_time'] = timestamp
    501 gdf_list.append(_gdf)

File /usr/local/lib/python3.10/dist-packages/eodal/utils/decorators.py:160, in check_band_names.<locals>.wrapper(self, *args, **kwargs)
    157         if not set(band_names).issubset(self.band_names):
    158             raise BandNotFoundError(f"{band_names} not found in collection")
--> 160 return f(self, *args, **kwargs)

File /usr/local/lib/python3.10/dist-packages/eodal/core/raster.py:1259, in RasterCollection.band_summaries(self, band_selection, **kwargs)
   1257     band_selection = self.band_names
   1258 for band_name in band_selection:
-> 1259     band_stats = self[band_name].reduce(**kwargs)
   1260     # band_stats is a list of 1:N entries (one per feature on which reduce
   1261     # was called); we add the band name as attribute
   1262     for idx in range(len(band_stats)):

File /usr/local/lib/python3.10/dist-packages/eodal/core/band.py:2023, in Band.reduce(self, method, by)
   2021 default_stats = True
   2022 try:
-> 2023     check_stats(stats=method, categorical=False)
   2024 except ValueError:
   2025     default_stats = False

File /usr/local/lib/python3.10/dist-packages/rasterstats/utils.py:101, in check_stats(stats, categorical)
     99             stats = stats.split()
    100 for x in stats:
--> 101     if x.startswith("percentile_"):
    102         get_percentile(x)
    103     elif x not in VALID_STATS:

AttributeError: 'function' object has no attribute 'startswith'

Is there a way to compute custom aggregation functions using get_feature_timeseries?

suggestion -- ndvi colormap

I created a custom colormap for NDVI

from matplotlib.colors import LinearSegmentedColormap

c = ["darkred", "red", "yellow", "yellowgreen", "green", "darkgreen"]
v = [0, 0.15, 0.3, 0.5, 0.75, 1]
l = list(zip(v, c))
cmap_ndvi = LinearSegmentedColormap.from_list('rg', l, N=256)

[screenshot: rendered NDVI colormap]

Would it be a nice addition to the eodal library? What do you think, @lukasValentin?

Band.clip() does not allow masking pixel values outside the actual feature geometry

Currently, the Band.clip method only allows cropping the Band to the spatial extent of the bounding box of the feature. Pixel values within the bounding box but not overlapping the actual feature geometry are not masked. However, such masking is desired in many applications, e.g., when one wants to calculate field-parcel statistics or extract only the pixel values of that parcel.

Configure conda packaging

Apart from bringing EOdal to PyPI, publishing EOdal on Anaconda is desired as well.

Following this guide, we therefore need to configure a meta.yaml file:

package:
  name: "my-py-lib"
  version: "0.1.0"
about:
  summary: "This is an awesome Python library"
source:
  path: ..
build:
  script: python setup.py install

Then we execute conda build --output-folder ./conda-out/ ./conda/ followed by anaconda upload ./conda-out/osx-64/my-py-lib-0.1.0-0.tar.bz2

Sentinel2.mask_clouds_and_shadows add optional argument fill value

It would be great if the method Sentinel2.mask_clouds_and_shadows had an optional argument to specify the fill value, which is currently 0 by default. When raw pixel values are extracted and processed programmatically, one may want to label missing values differently (e.g., as NaNs).
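A sketch of the requested call (the fill_value keyword is hypothetical and does not exist yet; other arguments are omitted):

import numpy as np

# hypothetical keyword: label masked pixels as NaN instead of the current default of 0
s2_masked = s2.mask_clouds_and_shadows(fill_value=np.nan)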

Best,
Fabio

Implement Spatial Join of RasterCollections

It would be beneficial if RasterCollection instances could be joined spatially (RasterCollection.join()).

This functionality would allow us to

  • perform spatial joins, i.e., merge raster datasets (a.k.a. mosaic generation) from adjacent tiles or different data sources
  • check for spatial overlap of rasters (implementing classic GIS functionality and topological operators)

Currently, we still have to rely on rasterio. This is not really a problem per se, but it requires extra I/O because we have to write files to disk, merge them using rasterio, and read the result back into EOdal (see the sketch below).
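A sketch of this rasterio detour (the file paths are placeholders):

import rasterio
from rasterio.merge import merge

# mosaic two adjacent tiles that were previously written to disk
with rasterio.open('tile_a.tif') as a, rasterio.open('tile_b.tif') as b:
    mosaic, out_transform = merge([a, b])
# the mosaic array and its transform are then read back into EOdal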

The questions are:

  • is it feasible to implement it in "pure" EOdal style?
  • is there any advantage over using rasterio?

If neither question is answered positively, I'd suggest continuing with the rasterio-based approach and implementing it at the RasterCollection class level in join.

unexpected behaviour when clipping small AOIs

I wanted to clip small AOIs, which are approximately one to two Sentinel-2 pixels in each direction (the highlighted field "1" in the screenshot below). I noticed that EOdal added "extra" pixels to my small AOI.

[screenshot: EOdal_extraction_screenshot]

Masking using Scene classification layer

Describe the bug
Some Sentinel-2 scenes do not get masked properly using mask_clouds_and_shadows() in preprocess_sentinel2_scenes().
See the attached example code, where class 9 is specified for masking but remains unmasked in the image. So far I have observed this in images with a specific partial tile coverage.

To Reproduce
Run the example code in masking_problem.zip.

Enhancement: ignore nodata values by default when scaling data (or any other transformation)

Currently, users have to specify explicitly that the no-data value should be excluded from scaling. In some cases this is necessary to prevent the no-data value from being changed, which then causes subsequent problems with a "diverging" no-data value. I guess most users (including myself) will forget about this, so EOdal should take care of it.

Thus, the no-data value should be excluded from scaling operations by default, using a new flag called exclude_nodata that will be True by default. We also have to check whether this flag makes sense or is required for other operations. This could be the case for the band algebra operators and other transformers.

For the time being, users are strongly recommended to specify that the nodata value should be excluded from scaling as shown in the code snippet below:

nodata = srf[srf.band_names[0]].nodata
# scale data to correct physical units
srf.scale(
   pixel_values_to_ignore=[nodata],  # important to ignore nodata!
   inplace=True)
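Once the proposed exclude_nodata flag exists (it does not yet), the same call should be safe by default:

srf.scale(inplace=True)  # exclude_nodata=True would then be the default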

cli_s2_creodias_update returns/downloads twice per dataset

2022-11-23 17:31:36,409 eodal INFO S2A_MSIL2A_20190420T103031_N0211_R108_T32TLS_20190420T132227.SAFE already in database
2022-11-23 17:31:36,409 eodal INFO S2A_MSIL2A_20190420T103031_N0211_R108_T32TLS_20190420T132227.SAFE already in database
2022-11-23 17:31:44,495 eodal INFO Starting downloading S2B_MSIL2A_20210404T102559_N0300_R108_T32TLS_20210404T133042.zip (1/1)
2022-11-23 17:31:44,495 eodal INFO Starting downloading S2B_MSIL2A_20210404T102559_N0300_R108_T32TLS_20210404T133042.zip (1/1)

Error in RasterCollection.from_multi_band_raster(Path(gdd_path)) if no band indices or band names are provided

"gdd_path" points to 365 bands geotiff

gdd_scenes = RasterCollection.from_multi_band_raster(Path(gdd_path))

Traceback (most recent call last):

File "/home//code/eodal_fork/eodal/eodal/core/raster.py", line 521, in from_multi_band_raster
handler.add_band(

File "/home//code/eodal_fork/eodal/eodal/core/raster.py", line 707, in add_band
raise KeyError(f"Cannot add raster band: {e}")

KeyError: "Cannot add raster band: 'Duplicate band names not permitted'"

During handling of the above exception, another exception occurred:

Traceback (most recent call last):

File "/home//code/eodal_fork/venv/lib/python3.10/site-packages/spyder_kernels/py3compat.py", line 356, in compat_exec
exec(code, globals, locals)

File "/home//code/nutzflaechenanalyse/v2_eodal/extract_data.py", line 184, in
gdd_scenes = RasterCollection.from_multi_band_raster(

File "/home//code/eodal_fork/eodal/eodal/core/raster.py", line 531, in from_multi_band_raster
f"Could not add band {band_names_src[band_idx]} "

TypeError: 'NoneType' object is not subscriptable

Error when importing Sentinel2 class on Windows

Upon importing the Sentinel2 class (from eodal.core.sensors import Sentinel2) on my Windows machine, I get the following error message:

File "C:\Users\gperich\python_venvs\eodal\lib\site-packages\eodal\core\utils\geometry.py", line 129, in <module>
    def multi_to_single_points(point_features: gpd.GeoDataFrame | Path) -> gpd.GeoDataFrame:
TypeError: unsupported operand type(s) for |: 'type' and 'type'

The error traces back to the multi_to_single_points function in the geometry.py file in the core\utils folder. I suspect Windows doesn't like the | operator? Or is it a Python issue? I am on Python 3.9.7.

Bug: RasterCollection.to_xarray() results in data being flipped up-side down

Description

When converting a RasterCollection (or SceneCollection or Band; they all call the same method, i.e., Band.to_xarray), the data ends up flipped upside down. This is because the y coordinates in the resulting xarray.DataArray are sorted in reverse order, i.e., starting from the smallest y coordinate and moving towards the largest.

I still have to figure out whether this is a problem in the way we construct the coordinates internally or whether it is something strange in xarray, i.e., not obeying the GDAL conventions.
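A hedged workaround sketch until the root cause is fixed (assuming the vertical coordinate of the resulting DataArray is named 'y'): re-sort the y coordinate in descending order so the array follows the usual north-up GDAL convention.

xds = raster_collection.to_xarray()
xds = xds.sortby('y', ascending=False)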

Feature request: Move the preprocessing functions to a specific folder within EOdal

I extracted S2 scenes and did not use a preprocessing function, as I thought I was only going to use the 10 m bands anyway. Turns out you cannot do that, as I got the following error message:
[error message screenshot]

I guess that is due to the SCL band having a native resolution of 20 m; therefore, the resampling is needed.

My feature request / idea is to move the preprocessing functions (in the current examples always preprocess_sentinel2_scenes) into a module within the EOdal package. Maybe call it "userfuns" or "preprocessing_funs" or so. That way, the community could quickly share functions and you wouldn't need to define your own functions all the time.

Good idea or not?

EOdal Mapper: Grids do not align when reprojection is necessary from one UTM Zone into another

Description:

When using the Mapper to fetch data from Sentinel-2 tiles in different UTM zones, the data needs to be reprojected from one UTM zone into another. This works fine. However, the resulting grids do not necessarily fully match because the reprojection routine does not "snap" to a reference grid.

E.g., when data from UTM zones 32 and 33 is read and the majority of scenes is in zone 32, the scenes in zone 33 are re-projected into zone 32N. However, the mapper currently does not use a reference grid (i.e., the grid of the zone-32 scenes), so there might be small offsets in the coordinates.

This does not matter as long as the data is not stacked, e.g., for time series analysis.

How to recreate the problem

The code snippet below fetches S2 data from zones 31 and 32N (western Switzerland).

from datetime import datetime
from pathlib import Path
from shapely.geometry import box
from typing import List

from eodal.config import get_settings
from eodal.core.scene import SceneCollection
from eodal.core.sensors.sentinel2 import Sentinel2
from eodal.mapper.feature import Feature
from eodal.mapper.filter import Filter
from eodal.mapper.mapper import Mapper, MapperConfigs

Settings = get_settings()
# set to False to use a local data archive
Settings.USE_STAC = True

# user-inputs
# -------------------------- Collection -------------------------------
collection: str = 'sentinel2-msi'

# ------------------------- Time Range ---------------------------------
time_start: datetime = datetime(2022, 6, 10)  		# year, month, day (incl.)
time_end: datetime = datetime(2022, 6, 30)   		# year, month, day (incl.)

# ---------------------- Spatial Feature  ------------------------------
bbox = box(*[6.5738, 46.4565, 7.2628, 47.2190])
feature = Feature(
    name='lake_neuchatel',
    geometry=bbox,
    epsg=4326,
    attributes={}
)

# ------------------------- Metadata Filters ---------------------------
metadata_filters: List[Filter] = [
    Filter('cloudy_pixel_percentage', '<', 25),
    Filter('processing_level', '==', 'Level-2A')
]

# query the scenes available (no I/O of scenes, this only fetches metadata)
mapper_configs = MapperConfigs(
    collection=collection,
    time_start=time_start,
    time_end=time_end,
    feature=feature,
    metadata_filters=metadata_filters)

# to enhance reproducibility and provide proper documentation, the MapperConfigs
# can be saved as yaml (and also then be loaded again from yaml)
mapper_configs.to_yaml('data/sample_mapper_call.yaml')

# now, a new Mapper instance is created
mapper = Mapper(mapper_configs)
mapper.query_scenes()

# load the least cloudy scene available from STAC
scene_kwargs = {
    'scene_constructor': Sentinel2.from_safe,
    'scene_constructor_kwargs': {'band_selection': ["B02"], 'read_scl': False}}

mapper.load_scenes(scene_kwargs=scene_kwargs)

ulx_list = []
uly_list = []
epsg_list = []
for _, scene in mapper.data:
    ulx = scene['blue'].geo_info.ulx
    uly = scene['blue'].geo_info.uly
    epsg = scene['blue'].geo_info.epsg
    ulx_list.append(ulx)
    uly_list.append(uly)
    epsg_list.append(epsg)

# resulting ulx_list -> there is an offset of ~2 m to 9.6 m
>>> [770564.8761871913, 770564.8761871913, 770566.8953721962, 770564.8761871913, 770574.5245121085, 770574.5245121085]
# resulting uly_list -> larger offsets here are due to no-data in single tiles
>>> [5238323.835322518, 5238323.835322518, 5238322.759971117, 5238323.835322518, 5207434.29725682, 5207434.29725682]

As also mentioned by @orianif, it would be nicer if all scenes in a SceneCollection shared the same reference grid to allow stacking through time.

Proposed solution

We should think about a post-processing step in the Mapper to ensure all scenes are projected into the same reference grid. This will then also ensure that all scenes share exactly the same extent in terms of rows and columns.
