Problem Description: I filtered a set of stac items for ...

improve projection metadata for sen12floods and other stacs so they can be used more easily with stackstac about radiant-mlhub HOT 4 OPEN

radiantearth commented on May 28, 2024 1

improve projection metadata for sen12floods and other stacs so they can be used more easily with stackstac

from radiant-mlhub.

Comments (4)

lillythomas commented on May 28, 2024 1

Thanks for updating here @rbavery. To use that function, make sure to import re too 😉

from radiant-mlhub.

kbgg commented on May 28, 2024 1

Thanks @rbavery,

To follow up on the timeline: part of our dataset updates also requires some additional changes to our pipeline, so we can't begin publishing the updated catalogs until those changes are made as well. We're also working on some other projects at the moment, but we're targeting having the updated catalogs published before the holidays at the end of this year.

We'll follow up in the call that we will schedule soon and also feel free to reach out for updates on our MLHub Slack!

from radiant-mlhub.

rbavery commented on May 28, 2024

This is also a problem for the recently announced AgrifieldNet competition dataset:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In [15], line 1
----> 1 stack = stackstac.stack(source_items, epsg = 4326)

File ~/Library/Application Support/hatch/env/virtual/ml-pipeline-sYYlrDaP/nb/lib/python3.10/site-packages/stackstac/stack.py:279, in stack(items, assets, epsg, resolution, bounds, bounds_latlon, snap_bounds, resampling, chunksize, dtype, fill_value, rescale, sortby_date, xy_coords, properties, band_coords, gdal_env, errors_as_nodata, reader)
    272 if sortby_date is not False:
    273     plain_items = sorted(
    274         plain_items,
    275         key=lambda item: item["properties"].get("datetime", "") or "",
    276         reverse=sortby_date == "desc",
    277     )
--> 279 asset_table, spec, asset_ids, plain_items = prepare_items(
    280     plain_items,
    281     assets=assets,
    282     epsg=epsg,
    283     resolution=resolution,
    284     bounds=bounds,
    285     bounds_latlon=bounds_latlon,
    286     snap_bounds=snap_bounds,
    287 )
    288 arr = items_to_dask(
    289     asset_table,
    290     spec,
   (...)
    298     errors_as_nodata=errors_as_nodata,
    299 )
    301 return xr.DataArray(
    302     arr,
    303     *to_coords(
   (...)
    312     name="stackstac-" + dask.base.tokenize(arr)
    313 )

File ~/Library/Application Support/hatch/env/virtual/ml-pipeline-sYYlrDaP/nb/lib/python3.10/site-packages/stackstac/prepare.py:294, in prepare_items(items, assets, epsg, resolution, bounds, bounds_latlon, snap_bounds)
    291 # If there's no geotrans, compute resolutions from `proj:shape`
    292 else:
    293     if asset_bbox_proj is None or asset_shape is None:
--> 294         raise ValueError(
    295             f"Cannot automatically compute the resolution, "
    296             f"since asset {id!r} on item {item_i} {item['id']!r} "
    297             f"doesn't provide enough metadata to determine its native resolution.\n"
    298             f"We'd need at least one of (in order of preference):\n"
    299             f"- The `proj:transform` and `proj:epsg` fields set on the asset, or on the item\n"
    300             f"- The `proj:shape` and one of `proj:bbox` or `bbox` fields set on the asset, "
    301             "or on the item\n\n"
    302             "Please specify the `resolution=` argument to set the output resolution manually. "
    303             f"(Remember that resolution must be in the units of your CRS (http://epsg.io/{out_epsg})"
    304             "---not necessarily meters."
    305         )
    307     # NOTE: this would be inaccurate if `proj:bbox` was provided,
    308     # but the geotrans was non-rectilinear
    309     # TODO check for that if there's a geotrans??
    310     res_y = (asset_bbox_proj[3] - asset_bbox_proj[1]) / asset_shape[0]

ValueError: Cannot automatically compute the resolution, since asset 'B01' on item 0 'ref_agrifieldnet_competition_v1_source_ffe8c' doesn't provide enough metadata to determine its native resolution.
We'd need at least one of (in order of preference):
- The `proj:transform` and `proj:epsg` fields set on the asset, or on the item
- The `proj:shape` and one of `proj:bbox` or `bbox` fields set on the asset, or on the item

Please specify the `resolution=` argument to set the output resolution manually. (Remember that resolution must be in the units of your CRS (http://epsg.io/4326)---not necessarily meters.
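
The last sentence of the error matters for this particular call: with `epsg=4326`, a manually supplied `resolution=` is in degrees rather than meters. Below is a rough sketch of that conversion, assuming a spherical approximation based on the WGS84 equatorial radius. The helper name `meters_to_degrees`, the 10 m pixel size, and the latitude of 20° are illustrative assumptions, not values taken from stackstac or from the dataset.

```python
import math

# WGS84 semi-major axis in meters, used here for a spherical approximation
WGS84_RADIUS_M = 6378137.0
# Length of one degree of arc along a great circle, in meters
METERS_PER_DEGREE = math.pi * WGS84_RADIUS_M / 180.0


def meters_to_degrees(meters, latitude_deg=0.0):
    """Approximate (x_res, y_res) in degrees for a ground distance in meters."""
    y_res = meters / METERS_PER_DEGREE
    # A degree of longitude shrinks with the cosine of the latitude
    x_res = meters / (METERS_PER_DEGREE * math.cos(math.radians(latitude_deg)))
    return x_res, y_res


# Hypothetical example: a 10 m pixel at 20 degrees latitude
x_res, y_res = meters_to_degrees(10.0, latitude_deg=20.0)
print(x_res, y_res)
```

The resulting pair should be usable as `resolution=(x_res, y_res)` in the `stackstac.stack` call, though reprojecting to a projected CRS in meters may be the simpler route.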

A script to reproduce the error:

import os
from configparser import ConfigParser
from urllib.parse import urljoin

import requests
import stackstac
from pystac_client import Client
from radiant_mlhub import Dataset, get_session

# Read the MLHub API key from a local config file
config = ConfigParser()
configFilePath = '../.mlhub_api_key.cfg'
with open(configFilePath) as f:
    config.read_file(f)
MLHUB_API_KEY = config.get('credentials', 'api_key')
os.environ['MLHUB_API_KEY'] = MLHUB_API_KEY

# Open the MLHub STAC API, passing the key as a query parameter
MLHUB_ROOT_URL = "https://api.radiant.earth/mlhub/v1/"
client = Client.open(
    MLHUB_ROOT_URL, parameters={"key": MLHUB_API_KEY}, ignore_conformance=True
)


class MLHubSession(requests.Session):
    """Session that adds the API key and resolves URLs against the MLHub root."""

    def __init__(self, *args, api_key=None, **kwargs):
        super().__init__(*args, **kwargs)
        self.params.update({"key": api_key})

    def request(self, method, url, *args, **kwargs):
        url_prefix = MLHUB_ROOT_URL.rstrip("/") + "/"
        url = urljoin(url_prefix, url)
        return super().request(method, url, *args, **kwargs)


# Not used by the stackstac call below; kept from the original notebook
session = MLHubSession(api_key=MLHUB_API_KEY)

# Fetch the source imagery and the train/test label items
search = client.search(collections=["ref_agrifieldnet_competition_v1_source"])
source_items = search.get_all_items()
search = client.search(collections=["ref_agrifieldnet_competition_v1_labels_train"])
train_label_items = search.get_all_items()
search = client.search(collections=["ref_agrifieldnet_competition_v1_labels_test"])
test_label_items = search.get_all_items()

# Raises the ValueError shown above because the items lack projection metadata
stack = stackstac.stack(source_items, epsg=4326)
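
To find out which items would trigger the error before calling `stackstac.stack`, a small check along these lines might help. It is only a sketch that mirrors the two bullet points in the error message, not stackstac's actual logic. It works on plain item dictionaries (for example from `item.to_dict()`), and the function name `has_resolution_metadata` and the sample items are made up for illustration.

```python
def has_resolution_metadata(item, asset_key):
    """Return True if a STAC item dict has either metadata combination
    listed in the stackstac error message for the given asset."""
    props = item.get("properties", {})
    asset = item.get("assets", {}).get(asset_key, {})

    def field(name):
        # Prefer the asset-level value, then fall back to the item properties
        value = asset.get(name)
        return value if value is not None else props.get(name)

    has_transform = (
        field("proj:transform") is not None and field("proj:epsg") is not None
    )
    has_bbox = field("proj:bbox") is not None or item.get("bbox") is not None
    has_shape_and_bbox = field("proj:shape") is not None and has_bbox
    return has_transform or has_shape_and_bbox


# Hypothetical items: one without projection fields, one with them
bare = {"id": "a", "bbox": [0, 0, 1, 1], "properties": {}, "assets": {"B01": {}}}
fixed = {
    "id": "b",
    "properties": {"proj:epsg": 32643, "proj:transform": [10, 0, 0, 0, -10, 0]},
    "assets": {"B01": {}},
}
for item in (bare, fixed):
    print(item["id"], has_resolution_metadata(item, "B01"))
```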

from radiant-mlhub.

rbavery commented on May 28, 2024

Here is a quick function that @lillythomas put together to update the STAC metadata for a given item. CC @kbgg. We made this for the sen12floods collection, so it would need to be adapted depending on the assets in the STAC item.

On my connection, this takes about 5 seconds per Item. We're curious to hear what the timeline is for updating the metadata collections since 5 seconds per item can get long when it's run for large Item collections and whenever we use stackstac.

import logging
import re
import time

import rasterio

# Raise the root logger level so only errors are shown
log = logging.getLogger()
log.setLevel(logging.ERROR)


def set_transform_epsg(source_item, verbose=False):
    """
    This modifies the source item in place to update projection metadata.
    Assumes this metadata is missing or it will be overwritten.
    """
    start = time.time()
    # Read the raster profile (CRS, transform, ...) from the VV asset
    with rasterio.open(source_item.assets['VV'].href) as src:
        x = src.profile
    source_item.properties['proj:transform'] = list(x['transform'])
    # Pull the numeric code out of the CRS string; proj:epsg is an integer field
    source_item.properties['proj:epsg'] = int(re.findall(r'\d+', str(x['crs']))[0])
    if verbose:
        print(f"Time to update metadata for item {source_item.id}: ", time.time() - start)
    return source_item
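
If most of those 5 seconds per item are spent waiting on the network, which is an assumption we have not measured, running the updates in a thread pool might cut the wall-clock time until the catalogs themselves are fixed. The sketch below uses a stand-in `fake_update` with a short sleep instead of the real function so that it runs without credentials. `update_items_concurrently`, `fake_update`, and the worker count are all hypothetical names and values.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def update_items_concurrently(items, update_fn, max_workers=8):
    """Apply update_fn to every item in a thread pool, keeping input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(update_fn, items))


def fake_update(item):
    # Stand-in for set_transform_epsg: simulate a slow remote read
    time.sleep(0.05)
    item["properties"]["proj:epsg"] = 4326
    return item


items = [{"id": f"item-{i}", "properties": {}} for i in range(16)]
start = time.time()
updated = update_items_concurrently(items, fake_update)
print(f"Updated {len(updated)} items in {time.time() - start:.2f}s")
```

With the real function, something like `update_items_concurrently(list(source_items), set_transform_epsg)` would be the equivalent call, assuming each thread opens its own rasterio dataset as the function above does.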

from radiant-mlhub.
