Comments (4)
Thanks for updating here @rbavery. To use that function, make sure to import re too 😉
from radiant-mlhub.
Thanks @rbavery,
To follow up on the timeline: part of our dataset updates also requires some additional changes to our pipeline, so we can't begin publishing the updated catalogs until those changes are made as well. We're also working on some other projects at the moment, but we're aiming to have the updated catalogs published before the holidays at the end of this year.
We'll follow up in the call that we'll schedule soon. Also, feel free to reach out for updates on our MLHub Slack!
This is also a problem for the recently announced AgrifieldNet competition dataset:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In [15], line 1
----> 1 stack = stackstac.stack(source_items, epsg = 4326)
File ~/Library/Application Support/hatch/env/virtual/ml-pipeline-sYYlrDaP/nb/lib/python3.10/site-packages/stackstac/stack.py:279, in stack(items, assets, epsg, resolution, bounds, bounds_latlon, snap_bounds, resampling, chunksize, dtype, fill_value, rescale, sortby_date, xy_coords, properties, band_coords, gdal_env, errors_as_nodata, reader)
272 if sortby_date is not False:
273 plain_items = sorted(
274 plain_items,
275 key=lambda item: item["properties"].get("datetime", "") or "",
276 reverse=sortby_date == "desc",
277 )
--> 279 asset_table, spec, asset_ids, plain_items = prepare_items(
280 plain_items,
281 assets=assets,
282 epsg=epsg,
283 resolution=resolution,
284 bounds=bounds,
285 bounds_latlon=bounds_latlon,
286 snap_bounds=snap_bounds,
287 )
288 arr = items_to_dask(
289 asset_table,
290 spec,
(...)
298 errors_as_nodata=errors_as_nodata,
299 )
301 return xr.DataArray(
302 arr,
303 *to_coords(
(...)
312 name="stackstac-" + dask.base.tokenize(arr)
313 )
File ~/Library/Application Support/hatch/env/virtual/ml-pipeline-sYYlrDaP/nb/lib/python3.10/site-packages/stackstac/prepare.py:294, in prepare_items(items, assets, epsg, resolution, bounds, bounds_latlon, snap_bounds)
291 # If there's no geotrans, compute resolutions from `proj:shape`
292 else:
293 if asset_bbox_proj is None or asset_shape is None:
--> 294 raise ValueError(
295 f"Cannot automatically compute the resolution, "
296 f"since asset {id!r} on item {item_i} {item['id']!r} "
297 f"doesn't provide enough metadata to determine its native resolution.\n"
298 f"We'd need at least one of (in order of preference):\n"
299 f"- The `proj:transform` and `proj:epsg` fields set on the asset, or on the item\n"
300 f"- The `proj:shape` and one of `proj:bbox` or `bbox` fields set on the asset, "
301 "or on the item\n\n"
302 "Please specify the `resolution=` argument to set the output resolution manually. "
303 f"(Remember that resolution must be in the units of your CRS (http://epsg.io/{out_epsg})"
304 "---not necessarily meters."
305 )
307 # NOTE: this would be inaccurate if `proj:bbox` was provided,
308 # but the geotrans was non-rectilinear
309 # TODO check for that if there's a geotrans??
310 res_y = (asset_bbox_proj[3] - asset_bbox_proj[1]) / asset_shape[0]
ValueError: Cannot automatically compute the resolution, since asset 'B01' on item 0 'ref_agrifieldnet_competition_v1_source_ffe8c' doesn't provide enough metadata to determine its native resolution.
We'd need at least one of (in order of preference):
- The `proj:transform` and `proj:epsg` fields set on the asset, or on the item
- The `proj:shape` and one of `proj:bbox` or `bbox` fields set on the asset, or on the item
Please specify the `resolution=` argument to set the output resolution manually. (Remember that resolution must be in the units of your CRS (http://epsg.io/4326)---not necessarily meters.
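For reference, the second fallback listed in the message (`proj:shape` plus a bounding box) amounts to dividing the extent by the pixel counts. The sketch below is only an illustration modelled on the `res_y` line visible in the traceback, not stackstac's actual code; the function name `resolution_from_bbox_shape` and the sample numbers are hypothetical.

```python
def resolution_from_bbox_shape(bbox, shape):
    """Return (res_x, res_y) from a bbox (minx, miny, maxx, maxy)
    and a shape given as (rows, cols), as in `proj:shape`."""
    minx, miny, maxx, maxy = bbox
    rows, cols = shape
    if rows <= 0 or cols <= 0:
        raise ValueError("shape must contain positive row and column counts")
    res_x = (maxx - minx) / cols  # width of one pixel in CRS units
    res_y = (maxy - miny) / rows  # height of one pixel in CRS units
    return res_x, res_y


# Hypothetical 256 x 256 chip covering 2560 CRS units on each side
res_x, res_y = resolution_from_bbox_shape(
    (500000.0, 4000000.0, 502560.0, 4002560.0), (256, 256)
)
```

When neither set of fields is present, the value passed as `resolution=` has to be in the units of the output CRS, so degrees rather than meters for EPSG:4326.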
A script to reproduce is here:
import os
from configparser import ConfigParser
from radiant_mlhub import Dataset, get_session
import requests
from pystac_client import Client
from urllib.parse import urljoin
import stackstac
config = ConfigParser()
configFilePath = '../.mlhub_api_key.cfg'
with open(configFilePath) as f:
    config.read_file(f)
MLHUB_API_KEY = config.get('credentials', 'api_key')
os.environ['MLHUB_API_KEY'] = MLHUB_API_KEY
MLHUB_ROOT_URL = "https://api.radiant.earth/mlhub/v1/"
client = Client.open(
    MLHUB_ROOT_URL, parameters={"key": MLHUB_API_KEY}, ignore_conformance=True
)
class MLHubSession(requests.Session):
    def __init__(self, *args, api_key=None, **kwargs):
        super().__init__(*args, **kwargs)
        self.params.update({"key": api_key})
    def request(self, method, url, *args, **kwargs):
        url_prefix = MLHUB_ROOT_URL.rstrip("/") + "/"
        url = urljoin(url_prefix, url)
        return super().request(method, url, *args, **kwargs)
session = MLHubSession(api_key=MLHUB_API_KEY)
search = client.search(collections=["ref_agrifieldnet_competition_v1_source"])
source_items = search.get_all_items()
search = client.search(collections=["ref_agrifieldnet_competition_v1_labels_train"])
train_label_items = search.get_all_items()
search = client.search(collections=["ref_agrifieldnet_competition_v1_labels_test"])
test_label_items = search.get_all_items()
stack = stackstac.stack(source_items, epsg = 4326)
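To see which items are affected before calling `stackstac.stack`, something like the following could be run over the item dictionaries (for example `item.to_dict()` from pystac). It is a rough sketch based only on the two options listed in the error message; the helper name `has_resolution_metadata`, the asset-before-item lookup order, and the sample item are assumptions, not stackstac's own logic.

```python
def has_resolution_metadata(item, asset_key):
    """Check a STAC item dict for the fields named in the stackstac error message."""
    props = item.get("properties", {})
    asset = item.get("assets", {}).get(asset_key, {})

    def field(name):
        # Assumption: look on the asset first, then fall back to the item properties
        value = asset.get(name)
        return value if value is not None else props.get(name)

    has_transform = field("proj:transform") is not None and field("proj:epsg") is not None
    has_bbox = field("proj:bbox") is not None or item.get("bbox") is not None
    has_shape = field("proj:shape") is not None
    return has_transform or (has_shape and has_bbox)


# Hypothetical item with only a bbox, similar to the failing case above
sample_item = {
    "id": "example_item",
    "bbox": [76.0, 18.0, 76.1, 18.1],
    "properties": {},
    "assets": {"B01": {"href": "B01.tif"}},
}
missing = not has_resolution_metadata(sample_item, "B01")
```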
Here is a quick function that @lillythomas put together to update the STAC metadata for a given Item. CC @kbgg. We made this for the sen12floods collection, so it would need to be adapted depending on the assets in the STAC Item.
On my connection, this takes about 5 seconds per Item. We're curious to hear what the timeline is for updating the metadata collections, since 5 seconds per Item adds up when it's run over large Item collections and every time we use stackstac.
import logging
import re
import time
import rasterio

# Only show log messages at ERROR level or above
log = logging.getLogger()
log.setLevel(logging.ERROR)

def set_transform_epsg(source_item, verbose=False):
    """
    This modifies the source item in place to update projection metadata.
    Assumes this metadata is missing or it will be overwritten.
    """
    start = time.time()
    # Read the georeferencing from the 'VV' asset (specific to sen12floods)
    with rasterio.open(source_item.assets['VV'].href) as src:
        x = src.profile
    source_item.properties['proj:transform'] = list(x['transform'])
    # The STAC projection extension expects proj:epsg as an integer code
    source_item.properties['proj:epsg'] = int(re.findall(r'\d+', str(x['crs']))[0])
    if verbose:
        print(f"Time to update metadata for item {source_item.id}: ", time.time() - start)
    return source_item
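One way to soften the per-Item cost, assuming most of those 5 seconds are spent waiting on remote reads (we have not measured this), is to run the updates in a thread pool. The helper below is a generic sketch; `update_items_concurrently` and its `max_workers` default are hypothetical, and the demo uses a stand-in function rather than real Items.

```python
from concurrent.futures import ThreadPoolExecutor


def update_items_concurrently(items, update_fn, max_workers=8):
    """Apply update_fn to every item in a thread pool; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(update_fn, items))


# Stand-in demo; with real data this could be called as
# update_items_concurrently(source_items, set_transform_epsg)
doubled = update_items_concurrently([1, 2, 3], lambda value: value * 2)
```

Any exception raised inside `update_fn` is re-raised when the results are collected, so one unreadable asset would still stop the whole run unless it is caught inside the function.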