
pystac-client's People

Contributors

bmcandr, cholden-ag, chuckwondo, dchandan, dependabot[bot], drtodd13, duckontheweb, fredliporace, gadomski, greyskyy, ircwaves, jpolchlo, jsignell, keenan-nicholson, kurtmckee, lossyrob, matthewhanson, mmcfarland, mneagul, ocefpaf, philvarner, rubenbaer, sachsbl, sebastic, stark525, thomas-maschler, tomaugspurger, tomiiwa, weiji14, wietzesuijker

pystac-client's Issues

catalog.search method returns NotImplementedError

Using the python library after instantiating a client connection to a STAC server, calling the catalog.search method returns a NotImplementedError. Maybe this has to do with how the subclass inherits method definitions from the parent class?

Example method that returns NotImplementedError on line 3:

def my_read_method(stacserver_url):

  catalog = Client.open(stacserver_url)
  mysearch = catalog.search(bbox=[-72.5,40.5,-72,41], max_items=10)
  return f"{mysearch.matched()} items found"

Federated search

A big advantage of STAC is being able to use data from multiple sources.
It would be a nice feature to be able to search multiple STAC endpoints and combine the results into a single FeatureCollection
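
As a rough illustration of the "combine the results" half of this request, the sketch below merges several GeoJSON FeatureCollections that have already been fetched. The function name `merge_feature_collections` and the de-duplication key are assumptions for illustration, not part of pystac-client.

```python
from typing import Any, Dict, Iterable, List

FeatureCollection = Dict[str, Any]


def merge_feature_collections(
    collections: Iterable[FeatureCollection],
) -> FeatureCollection:
    """Combine several FeatureCollections into one, skipping repeated items."""
    seen = set()
    features: List[Dict[str, Any]] = []
    for fc in collections:
        for feature in fc.get("features", []):
            # Key on (collection, id) so equal ids in different collections both survive.
            key = (feature.get("collection"), feature.get("id"))
            if key in seen:
                continue
            seen.add(key)
            features.append(feature)
    return {"type": "FeatureCollection", "features": features}


# Two pretend result pages from different endpoints, sharing one item.
page_a = {"type": "FeatureCollection", "features": [{"id": "a", "collection": "c1"}]}
page_b = {
    "type": "FeatureCollection",
    "features": [{"id": "a", "collection": "c1"}, {"id": "b", "collection": "c2"}],
}
merged = merge_feature_collections([page_a, page_b])
```

Whether items from different endpoints should be de-duplicated at all is a design question this sketch does not settle.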

validate_all throws KeyError when validating Microsoft Planetary Computer STAC API

The code below throws a KeyError instead of a STACValidationError. I think it needs to be more defensive: if the "id" field is missing, it should convert the KeyError into a STACValidationError. It's also possible that it's trying to process an entity that's not a STAC object and so has no "id" field.

import pystac_client

pystac_client.Client.open(
    'https://planetarycomputer.microsoft.com/api/stac/v1/').validate_all()

The exception:

Traceback (most recent call last):
  File "/Users/philvarner/code/stac-api-validation-suite/stac_api_validation_suite/validate_all_error.py", line 3, in <module>
    pystac_client.Client.open(
  File "/Users/philvarner/.local/share/virtualenvs/stac-api-validation-suite-tzI1nfla/lib/python3.9/site-packages/pystac/catalog.py", line 679, in validate_all
    child.validate_all()
  File "/Users/philvarner/.local/share/virtualenvs/stac-api-validation-suite-tzI1nfla/lib/python3.9/site-packages/pystac/catalog.py", line 680, in validate_all
    for item in self.get_items():
  File "/Users/philvarner/.local/share/virtualenvs/stac-api-validation-suite-tzI1nfla/lib/python3.9/site-packages/pystac/stac_object.py", line 343, in get_stac_objects
    link.resolve_stac_object(root=self.get_root())
  File "/Users/philvarner/.local/share/virtualenvs/stac-api-validation-suite-tzI1nfla/lib/python3.9/site-packages/pystac/link.py", line 146, in resolve_stac_object
    obj = STAC_IO.read_stac_object(target_href, root=root)
  File "/Users/philvarner/.local/share/virtualenvs/stac-api-validation-suite-tzI1nfla/lib/python3.9/site-packages/pystac/stac_io.py", line 131, in read_stac_object
    return cls.stac_object_from_dict(d, href=uri, root=root)
  File "/Users/philvarner/.local/share/virtualenvs/stac-api-validation-suite-tzI1nfla/lib/python3.9/site-packages/pystac/serialization/__init__.py", line 37, in stac_object_from_dict
    return Catalog.from_dict(d, href=href, root=root)
  File "/Users/philvarner/.local/share/virtualenvs/stac-api-validation-suite-tzI1nfla/lib/python3.9/site-packages/pystac/catalog.py", line 790, in from_dict
    id = d.pop('id')
KeyError: 'id'
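
A minimal sketch of the defensive conversion suggested above. `ValidationError` is a stand-in for the real STACValidationError and `read_required_id` is a hypothetical helper; neither is taken from the pystac source.

```python
from typing import Any, Dict


class ValidationError(Exception):
    """Stand-in for STACValidationError (assumption, not the real class)."""


def read_required_id(d: Dict[str, Any], href: str = "<unknown>") -> str:
    """Return d['id'], turning a missing key into a validation error."""
    try:
        return d["id"]
    except KeyError as e:
        # Chain the original KeyError so the root cause stays visible.
        raise ValidationError(
            f"Object at {href} is missing the required field 'id'"
        ) from e
```

Including the href in the message would also help tell a malformed STAC object apart from a link that points at something that is not STAC at all.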

API or Client

Since we're keeping inheritance (#19), the API class is interchangeable with a static catalog or an API. The main difference is really that this repo is almost exclusively used for read-only (transaction extension excepted).

Users can use API.open() on any valid endpoint, even a local file.

Should the API class be renamed to Client? STACClient? Something else?

Conformance with the USGS STAC API is failing

USGS STAC API: https://ibhoyw8md9.execute-api.us-west-2.amazonaws.com/prod

It seems that PySTAC Client is testing against an older version of the conformances than USGS has implemented!

Edit: this was with 0.1.1 of pystac-client

ConformanceError                          Traceback (most recent call last)
<ipython-input-2-667e3e4574de> in <module>
----> 1 client = pystac_client.Client.open("https://ibhoyw8md9.execute-api.us-west-2.amazonaws.com/prod")

/usr/local/lib/python3.9/site-packages/pystac_client/client.py in open(cls, url, headers)
     96         old_read_text_method = STAC_IO.read_text_method
     97         STAC_IO.read_text_method = read_text_method
---> 98         catalog = cls.from_file(url)
     99         STAC_IO.read_text_method = old_read_text_method
    100         catalog.headers = headers

/usr/local/lib/python3.9/site-packages/pystac/stac_object.py in from_file(cls, href)
    494             o = STAC_IO.stac_object_from_dict(d, href=href)
    495         else:
--> 496             o = cls.from_dict(d, href=href)
    497 
    498         # Set the self HREF, if it's not already set to something else.

/usr/local/lib/python3.9/site-packages/pystac_client/client.py in from_dict(cls, d, href, root)
    132         d.pop('stac_version')
    133 
--> 134         catalog = cls(
    135             id=id,
    136             description=description,

/usr/local/lib/python3.9/site-packages/pystac_client/client.py in __init__(self, id, description, title, stac_extensions, extra_fields, href, catalog_type, conformance, headers)
     66         if conformance is not None and not self.conforms_to(ConformanceClasses.STAC_API_CORE):
     67             allowed_uris = "\n\t".join(ConformanceClasses.STAC_API_CORE.all_uris)
---> 68             raise ConformanceError(
     69                 'API does not conform to {ConformanceClasses.STAC_API_CORE}. Must contain one of the following '
     70                 f'URIs to conform (preferably the first):\n\t{allowed_uris}.')

ConformanceError: API does not conform to {ConformanceClasses.STAC_API_CORE}. Must contain one of the following URIs to conform (preferably the first):
	https://api.stacspec.org/v1.0.0-beta.1/core
	http://stacspec.org/spec/api/1.0.0-beta.1/core.

Running stac-client with no arguments should give defined error and usage rather than exception

I ran stac-client with no arguments, and got the following:

$ stac-client
Traceback (most recent call last):
  File "/Users/philvarner/.local/share/virtualenvs/stac-api-validation-suite-tzI1nfla/bin/stac-client", line 8, in <module>
    sys.exit(cli())
  File "/Users/philvarner/.local/share/virtualenvs/stac-api-validation-suite-tzI1nfla/lib/python3.9/site-packages/pystac_client/cli.py", line 127, in cli
    loglevel = args.pop('logging')
KeyError: 'logging'
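
One possible shape for the requested behavior, sketched with argparse. The parser layout here (a `--logging` option and a single `search` subcommand) is a simplified assumption and not a copy of pystac_client/cli.py.

```python
import argparse
import logging
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="stac-client")
    # A default means 'logging' is always present in the parsed arguments.
    parser.add_argument("--logging", default="INFO", help="log level")
    subparsers = parser.add_subparsers(dest="command")
    subparsers.add_parser("search", help="search a STAC API")
    return parser


def cli(argv: Optional[List[str]] = None) -> int:
    parser = build_parser()
    args = vars(parser.parse_args(argv))
    if args.get("command") is None:
        # No subcommand given: show usage and exit non-zero instead of raising.
        parser.print_usage()
        return 2
    logging.basicConfig(level=args.pop("logging"))
    return 0
```

Calling `cli([])` prints the usage line and returns 2, while `cli(["search"])` returns 0.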

error with max_items and limit set against earth-search

The following code throws the exception below with pystac_client==0.1.1

The issue seems to be related to limit or pagination. There are 880 results. When I change limit to 500, it runs successfully.

I see now that the server is returning a non-200 status code, but it's not clear if it's a problem with the server or the request, or how pystac-client is handling the error. @matthewhanson

from pystac_client import Client

catalog = Client.open("https://earth-search.aws.element84.com/v0")

mysearch = catalog.search(
    collections=['sentinel-s2-l2a-cogs'], 
    datetime = "2021-04-25T00:00:00Z/2021-04-25T02:00:00Z",
    max_items=1000,
    limit=1000
)
print(f"{mysearch.matched()} items found")

items = mysearch.items_as_collection()

Exception

---------------------------------------------------------------------------
APIError                                  Traceback (most recent call last)
<ipython-input-14-d789a1ae9ce9> in <module>
     13 print(f"{mysearch.matched()} items found")
     14 
---> 15 items = mysearch.items_as_collection()
     16 items.save('items.json')

/usr/local/Caskroom/miniconda/base/lib/python3.8/site-packages/pystac_client/item_search.py in items_as_collection(self)
    406         item_collection : ItemCollection
    407         """
--> 408         return ItemCollection(self.items())

/usr/local/Caskroom/miniconda/base/lib/python3.8/site-packages/pystac_client/item_collection.py in __init__(self, features)
     19 
     20         features = features or []
---> 21         self.features = [f.clone() for f in features]
     22         self.links = []
     23         for f in self.features:

/usr/local/Caskroom/miniconda/base/lib/python3.8/site-packages/pystac_client/item_collection.py in <listcomp>(.0)
     19 
     20         features = features or []
---> 21         self.features = [f.clone() for f in features]
     22         self.links = []
     23         for f in self.features:

/usr/local/Caskroom/miniconda/base/lib/python3.8/site-packages/pystac_client/item_search.py in items(self)
    391 
    392         try:
--> 393             yield from it.islice(_paginate(), self._max_items)
    394         except HTTPError as e:
    395             if e.code == 405:

/usr/local/Caskroom/miniconda/base/lib/python3.8/site-packages/pystac_client/item_search.py in _paginate()
    387         """
    388         def _paginate():
--> 389             for item_collection in self.item_collections():
    390                 yield from item_collection.features
    391 

/usr/local/Caskroom/miniconda/base/lib/python3.8/site-packages/pystac_client/item_search.py in item_collections(self)
    372         request = deepcopy(self.request)
    373 
--> 374         for page in get_pages(session=self.session,
    375                               request=request,
    376                               next_resolver=self._next_resolver):

/usr/local/Caskroom/miniconda/base/lib/python3.8/site-packages/pystac_client/stac_io.py in get_pages(session, request, next_resolver)
    168     while True:
    169         # Yield all items
--> 170         page = make_request(session, request)
    171         yield page
    172 

/usr/local/Caskroom/miniconda/base/lib/python3.8/site-packages/pystac_client/stac_io.py in make_request(session, request, additional_parameters)
     46     resp = session.send(prepped)
     47     if resp.status_code != 200:
---> 48         raise APIError(resp.text)
     49     return resp.json()
     50 

APIError: {"message": "Internal server error"}

move to pystac

I'm wondering if it makes sense to move this to pystac. I think it makes sense because there aren't any additional dependencies besides pystac itself.

cc @lossyrob

Enhancement: Support searching static catalogs

Based on input from @matthewhanson and @TomAugspurger in intake/intake-stac#95:

Currently pystac-client will raise an APIError if there is no rel=search link (but it is intended to be a client for both static catalogs and APIs).

It would indeed be nice if pystac-client had the ability to search static catalogs in addition to API endpoints. There are a couple of use-cases for this:

  1. Filtering pystac-client search results further (i.e. a local results.json file with an ItemCollection) without having to issue new requests to an API endpoint.

  2. Searching static STAC catalogs that do not have an API endpoint set up.

It would be great if the "local search" function used the same syntax as an API search. Perhaps adding an optional dependency on geopandas would make this easy to implement?
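
To make the idea concrete, here is a dependency-free sketch of a local search over item dictionaries that reuses the `bbox` and `collections` keyword names from the API search. The function `search_items` is hypothetical, and the bbox test ignores antimeridian crossing.

```python
from typing import Any, Dict, Iterable, Iterator, Optional, Sequence

Item = Dict[str, Any]


def bboxes_intersect(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if two 2D [west, south, east, north] boxes overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search_items(
    items: Iterable[Item],
    bbox: Optional[Sequence[float]] = None,
    collections: Optional[Sequence[str]] = None,
) -> Iterator[Item]:
    """Yield the items that pass every filter that was supplied."""
    for item in items:
        if collections is not None and item.get("collection") not in collections:
            continue
        if bbox is not None and not bboxes_intersect(item["bbox"], bbox):
            continue
        yield item


local_items = [
    {"id": "near", "collection": "c1", "bbox": [-72.4, 40.6, -72.1, 40.9]},
    {"id": "far", "collection": "c1", "bbox": [10.0, 10.0, 11.0, 11.0]},
    {"id": "other", "collection": "c2", "bbox": [-72.4, 40.6, -72.1, 40.9]},
]
hits = list(search_items(local_items, bbox=[-72.5, 40.5, -72, 41], collections=["c1"]))
```

A geopandas-based version could replace the bbox test with real geometry intersection, at the cost of the extra dependency mentioned above.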

Client.get_collections_list throws type error

Code:

catalog: Client = Client.open('https://franklin.nasa-hsi.azavea.com/')
catalog.get_collections_list()

Error:

Traceback (most recent call last):
...
  File "/Users/philvarner/code/stac-api-validation-suite/stac_api_validation_suite/__init__.py", line 65, in validate_api
    for collection in catalog.get_collections_list():
  File "/Users/philvarner/.local/share/virtualenvs/stac-api-validation-suite-tzI1nfla/lib/python3.9/site-packages/pystac_client/client.py", line 160, in get_collections_list
    return self.get_child_links()
TypeError: get_child_links() missing 1 required positional argument: 'self'

Relevant code:

in client.py:

    @classmethod
    def get_collections_list(self):
        """Gets list of available collections from this Catalog. Alias for get_child_links since children
            of an API are always and only ever collections
        """
        return self.get_child_links()

calling this in pystac/catalog.py:

    def get_child_links(self):
        """Return all child links of this catalog.

        Return:
            List[Link]: List of links of this catalog with ``rel == 'child'``
        """
        return self.get_links('child')
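
The failure can be reproduced without pystac. In the sketch below `Catalog` is a minimal stand-in class, and the only difference between the two clients is the `@classmethod` decorator.

```python
class Catalog:
    """Minimal stand-in for pystac.Catalog (assumption for illustration)."""

    def __init__(self, links):
        self.links = links

    def get_child_links(self):
        return [link for link in self.links if link["rel"] == "child"]


class BrokenClient(Catalog):
    @classmethod
    def get_collections_list(self):
        # With @classmethod, `self` is the class, so this calls the
        # instance method unbound and raises TypeError.
        return self.get_child_links()


class FixedClient(Catalog):
    def get_collections_list(self):
        # A plain instance method: `self` is the catalog instance.
        return self.get_child_links()


links = [{"rel": "child", "href": "a"}, {"rel": "self", "href": "b"}]
fixed_result = FixedClient(links).get_collections_list()
```

Dropping the decorator appears to be enough to fix the call, since nothing in the method needs class-level access.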

Should API inherit from Catalog?

The API class currently inherits from pystac.Catalog. This was originally done because an API landing page has to be a valid STAC Catalog according to the spec, and because it made all of the typical methods of a pystac.Catalog immediately available, which seemed convenient.

However, pystac.Catalog has a bunch of methods used for writing that really don't apply to the API class. One option would be to override these so that they are either no-ops or raise a NotImplementedError. Another option would be to use composition rather than inheritance and have the landing page be an attribute within a completely separate API class.

Opening this issue to start a discussion on what the best approach would be for this.

CLI behavior

Looking for feedback on the behavior of the CLI.

Right now the default is to hit the API with limit=0 and print the total number matched. To actually fetch the items you specify either --stdout or --save. This allows piping of the output to something else such as jq or stacterm.

The requirement of --stdout to print out the results doesn't seem consistent with how most CLIs work.

Having the default be total matches only is nice to prevent people from making requests on a lot of items. But maybe it's better to have --matched be the keyword and prevent mistakes in another way (e.g., confirmation, default max_items).

Also --save is redundant since you can just pipe to a file. Is it worth keeping, or just update docs to show piping to an output file?

So the question is: do we fetch items by default (and have a --matched switch), or report matches by default (and have a --fetch switch)?

Support for STAC API 1.0.0-beta.2

https://tamn.snapplanet.io/ uses 1.0.0-beta.2 in its conformance classes, and fails with:

Error https://tamn.snapplanet.io/: <class 'pystac_client.exceptions.ConformanceError'> API does not conform to {ConformanceClasses.STAC_API_CORE}. Must contain one of the following URIs to conform (preferably the first):
        https://api.stacspec.org/v1.0.0-beta.1/core
        http://stacspec.org/spec/api/1.0.0-beta.1/core.

It would be nice to validate against either beta.1 or beta.2. I recognize this is complex to implement, but it would be nice to support.
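
One way this could be approached is to match the core conformance URI with a version-tolerant pattern rather than a fixed list. The sketch below assumes the beta.2 URI follows the same shape as the two beta.1 URIs in the error message; `CORE_PATTERN` and `conforms_to_core` are hypothetical names.

```python
import re
from typing import Iterable

# Accept 1.0.0 and its pre-releases (beta.1, beta.2, rc.1, ...) under either URI style.
CORE_PATTERN = re.compile(
    r"^https?://("
    r"api\.stacspec\.org/v1\.0\.0(-[a-z]+\.\d+)?"
    r"|stacspec\.org/spec/api/1\.0\.0(-[a-z]+\.\d+)?"
    r")/core$"
)


def conforms_to_core(conforms_to: Iterable[str]) -> bool:
    """True if any declared conformance URI looks like a STAC API core class."""
    return any(CORE_PATTERN.match(uri) for uri in conforms_to)
```

A pattern like this trades strictness for tolerance, so it would still need a decision on which versions the client can actually talk to.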

Performance to construct a large ItemCollection

This issue documents some slowness on moderately large queries. In the snippet below we fetch 8,234 items. It takes about a minute to construct the results.

import rasterio.features
import pystac_client

area_of_interest = {
    "type": "Polygon",
    "coordinates": [
        [
            [-123.46435546875, 46.4605655457854],
            [-119.608154296875, 46.4605655457854],
            [-119.608154296875, 48.26125565204099],
            [-123.46435546875, 48.26125565204099],
            [-123.46435546875, 46.4605655457854],
        ]
    ],
}
bbox = rasterio.features.bounds(area_of_interest)
stac = pystac_client.Client.open("https://planetarycomputer.microsoft.com/api/stac/v1")

search = stac.search(
    bbox=bbox,
    datetime="2016-01-01/2020-12-31",
    collections=["sentinel-2-l2a"],
    limit=2500,  # fetch items in batches of 2500
)

print(search.matched())  # 8234
items = list(search.items())

I ran list(search.items()) under snakeviz and came up with this result: https://gistcdn.rawgit.org/TomAugspurger/fb5b3bde8cee09d2d9aa2f7215edf2b2/94e4ec2ae97bec2169f9263e8f41183418e885d9/mosaic-static.html

A few notes:

  1. We spend roughly 2/3 of our time in stac_io.get_pages, which includes IO, waiting for the endpoint (and maybe parsing the JSON into Python objects?)
  2. We spend the other 1/3 of our time in item_collection.from_dict

Some ideas for optimization:

  1. Most of the time in item_collections.from_dict is spent on a deepcopy in pystac.Item.from_dict. It might be safe to skip that copy (since these should be coming off the network with no other references) and provide a copy=False flag to pystac.Item.from_dict, to allow it to mutate the incoming dict.
  2. Maybe pystac_client.Client or .search could provide a raw=True/False flag to allow skipping constructing pystac Items?
  3. Maybe some kind of async magic would speed up the reads? Hard to say, since I don't know how much time is spent waiting for results vs. parsing JSON. I don't know if it's a good idea to parse JSON on the asyncio event loop.

Error message should probably be an f-string

'API does not conform to {ConformanceClasses.STAC_API_CORE}. Must contain one of the following '

The error message looks like:

Error https://api.radiant.earth/mlhub/v1/: <class 'pystac_client.exceptions.ConformanceError'> API does not conform to {ConformanceClasses.STAC_API_CORE}. Must contain one of the following URIs to conform (preferably the first):
        https://api.stacspec.org/v1.0.0-beta.1/core
        http://stacspec.org/spec/api/1.0.0-beta.1/core.

I'm guessing this is supposed to be an f-string, but isn't.
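
The behavior is easy to reproduce in isolation: with implicit string concatenation, each literal needs its own `f` prefix. In the sketch below the `ConformanceClasses` class and its value are placeholders.

```python
class ConformanceClasses:
    STAC_API_CORE = "core"  # placeholder value for illustration


allowed_uris = "https://api.stacspec.org/v1.0.0-beta.1/core"

# Only the second literal is an f-string, so the braces in the first stay literal.
broken = (
    'API does not conform to {ConformanceClasses.STAC_API_CORE}. Must contain one of the following '
    f'URIs to conform (preferably the first):\n\t{allowed_uris}.')

# Prefixing both literals interpolates both placeholders.
fixed = (
    f'API does not conform to {ConformanceClasses.STAC_API_CORE}. Must contain one of the following '
    f'URIs to conform (preferably the first):\n\t{allowed_uris}.')
```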

Number of matched items with limit 0 invalid

Hi all,

according to the STAC API specification, the limit parameter needs to be between 1 and 10,000:
https://github.com/radiantearth/stac-api-spec/blob/dev/item-search/openapi.yaml#L200-L204

Within the matched function, the limit is set to 0 to just get the number of found features within a query request. This request results in an exception (e.g., when using the stac-fastapi (pgstac) API backend) because the API returns an HTTP status code not equal to 200, which is fine according to the STAC API specification.

def matched(self) -> int:
    params = {**self._parameters, "limit": 0}
    resp = self._stac_io.read_json(self.url, method=self.method, parameters=params)

Maybe the limit parameter needs to be changed from 0 to 1?

Best
Jonas

Use STAC_URL environment variable in Client.open

Currently, only the command line can use the STAC_URL environment variable for automatically discovering the STAC endpoint. It'd be nice to use it with the Python Client class as well.

This should do the trick, however I don't have a good idea how to test it. Anyone have thoughts?

diff --git a/pystac_client/client.py b/pystac_client/client.py
index 0fc03c6..c8e684b 100644
--- a/pystac_client/client.py
+++ b/pystac_client/client.py
@@ -75,13 +75,13 @@ class Client(pystac.Catalog, STACAPIObjectMixin):
         return '<Catalog id={}>'.format(self.id)
 
     @classmethod
-    def open(cls, url, headers=None):
+    def open(cls, url=None, headers=None):
         """Alias for PySTAC's STAC Object `from_file` method
 
         Parameters
         ----------
-        url : str
-            The URL of a STAC Catalog
+        url : str, optional
+            The URL of a STAC Catalog. If not specified, this will use the `STAC_URL` environment variable.
 
         Returns
         -------
@@ -89,6 +89,12 @@ class Client(pystac.Catalog, STACAPIObjectMixin):
         """
         import pystac_client.stac_io
 
+        if url is None:
+            url = os.environ.get("STAC_URL")
+
+        if url is None:
+            raise TypeError("'url' must be specified or the 'STAC_URL' environment variable must be set.")
+
         def read_text_method(url):
             request = Request(url, headers=headers or {})
             return pystac_client.stac_io.read_text_method(request)
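
On the testing question: one option is to patch `os.environ` for the duration of a test with `unittest.mock.patch.dict`. The sketch below pulls the fallback logic from the diff into a hypothetical helper, `resolve_url`, so it can be exercised without any network access.

```python
import os
from typing import Optional
from unittest import mock


def resolve_url(url: Optional[str] = None) -> str:
    """Hypothetical helper mirroring the fallback logic in the diff above."""
    if url is None:
        url = os.environ.get("STAC_URL")
    if url is None:
        raise TypeError(
            "'url' must be specified or the 'STAC_URL' environment variable must be set."
        )
    return url


def test_env_fallback() -> None:
    # patch.dict restores the original environment when the block exits.
    with mock.patch.dict(os.environ, {"STAC_URL": "https://example.com/stac"}):
        assert resolve_url() == "https://example.com/stac"
        # An explicit argument still wins over the environment.
        assert resolve_url("https://other.example/stac") == "https://other.example/stac"


def test_missing_url() -> None:
    # clear=True empties the environment inside the block.
    with mock.patch.dict(os.environ, {}, clear=True):
        try:
            resolve_url()
        except TypeError:
            pass
        else:
            raise AssertionError("expected TypeError")
```

With pytest, the `monkeypatch` fixture (`setenv` / `delenv`) would serve the same purpose.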

Virtual environment + build tools

Using poetry is nice for a lot of reasons, but might end up being a barrier to some contributors if it's not part of their workflow already. Bringing the build and environment tools for this library in line with what PySTAC uses might encourage more contribution.

Update to use PySTAC 1.0.X

PySTAC 1.0.X has several large changes:

  • stac io is now a class
  • extensions have been refactored

pystac-client needs to use the new StacIO class.
pystac-client extensions, which were based on the old PySTAC extension mechanic, need to be reworked

Update docs

Several of the docs are no longer applicable (e.g., extensions) and/or need to be updated to account for changes in the way conformance is handled, extensions, and naming of the Client class.

`Client.get_collections_list` shouldn't be a classmethod?

catalog = Client.open('https://earth-search.aws.element84.com/v0')
catalog.get_collections_list()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-7-ade3012ce385> in <module>
----> 1 catalog.get_collections_list()

~/Library/Caches/pypoetry/virtualenvs/stackstac-talk-YcC8wOkC-py3.9/lib/python3.9/site-packages/pystac_client/client.py in get_collections_list(self)
    158             of an API are always and only ever collections
    159         """
--> 160         return self.get_child_links()
    161 
    162     def search(self,

TypeError: get_child_links() missing 1 required positional argument: 'self'

get_collections_list is a classmethod, but it seems like it's not meant to be?

@classmethod
def get_collections_list(self):
    """Gets list of available collections from this Catalog. Alias for get_child_links since children
    of an API are always and only ever collections
    """
    return self.get_child_links()

Version: 0.1.1

Search datetime parameter allows invalid RFC3339 datetimes

The datetime regex used for the search parameter datetime has a few problems wrt RFC 3339 datetimes.

DATETIME_REGEX = re.compile(
    r"(?P<year>\d{4})(-(?P<month>\d{2})(-(?P<day>\d{2})(?P<remainder>T\d{2}:\d{2}:\d{2}\w*)?)?)?")

Copying this from my personal notes on RFC 3339 datetimes:

RFC 3339 is a profile of ISO 8601, adding these constraints:

  • a complete representation of date and time (fractional seconds optional).
  • requires 4-digit years
  • only allows a period character to be used as the decimal point for fractional seconds
  • requires the zone offset to be Z or like +00:00, while ISO8601 allows like +0000

These are a few examples of what would be allowed for ISO8601 but not RFC 3339:

1985-04-12
1937-01-01T12:00:27.87+0100
37-01-01T12:00:27.87+0100

Below are all valid RFC 3339 datetimes. Note the fractional seconds, Z or z as a timezone, positive and negative arbitrary offset timezones, T or any other character as a separator between date and time.

1985-04-12T23:20:50.52Z
1996-12-19T16:39:57-08:00
1990-12-31T23:59:60Z
1990-12-31T15:59:60-08:00
1937-01-01T12:00:27.87+01:00
1985-04-12T23:20:50.52Z
1937-01-01T12:00:27.8710+01:00
1937-01-01T12:00:27.8+01:00
1937-01-01T12:00:27.8Z
1985-04-12t23:20:50.5202020z
2020-07-23T00:00:00Z
2020-07-23T00:00:00.0Z
2020-07-23T00:00:00.01Z
2020-07-23T00:00:00.012Z
2020-07-23T00:00:00.0123Z
2020-07-23T00:00:00.01234Z
2020-07-23T00:00:00.012345Z
2020-07-23T00:00:00.000Z
2020-07-23T00:00:00.000+03:00
2020-07-23T00:00:00+03:00
2020-07-23T00:00:00.000+03:00
2020-07-23T00:00:00.000z

I think this is the correct regex for an RFC 3339 datetime:

r"^(\d\d\d\d)\-(\d\d)\-(\d\d)(T|t)(\d\d):(\d\d):(\d\d)(\.\d+)?(Z|([-+])(\d\d):(\d\d))$"

This is slightly different from the one in python-strict-rfc3339, as it allows T or t for the sep, and reverses +- and -+ so that the - doesn't need to be escaped with \-

Matching the datetime string to this will ensure it is a valid RFC 3339 (not just an ISO 8601 datetime), and then an ISO8601 parser can be used to parse it further if need be.

The built-in Python datetime library is not sufficient to parse all valid datetimes here -- notably, it doesn't parse Z as a timezone.

There are two options for this:

Additionally, hypothesis-jsonschema has support for generating dt's for testing: https://github.com/Zac-HD/hypothesis-jsonschema/blob/1c5f107230ccbd48c66d7c6693833745a598e294/src/hypothesis_jsonschema/_from_schema.py
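
A quick check of the proposed pattern against some of the examples listed above. Note that the pattern as written only accepts an uppercase Z, so the lowercase-z examples do not match it; `WITH_LOWER_Z` is an assumed variant, not part of the original proposal.

```python
import re

# The pattern proposed above, copied verbatim.
PROPOSED = re.compile(
    r"^(\d\d\d\d)\-(\d\d)\-(\d\d)(T|t)(\d\d):(\d\d):(\d\d)(\.\d+)?(Z|([-+])(\d\d):(\d\d))$"
)
# Assumed variant that also accepts a lowercase "z" zone designator.
WITH_LOWER_Z = re.compile(
    r"^(\d\d\d\d)\-(\d\d)\-(\d\d)(T|t)(\d\d):(\d\d):(\d\d)(\.\d+)?(Z|z|([-+])(\d\d):(\d\d))$"
)

valid = [
    "1985-04-12T23:20:50.52Z",
    "1996-12-19T16:39:57-08:00",
    "1990-12-31T23:59:60Z",
    "2020-07-23T00:00:00.000+03:00",
]
iso_8601_only = [
    "1985-04-12",
    "1937-01-01T12:00:27.87+0100",
    "37-01-01T12:00:27.87+0100",
]
lower_z = "1985-04-12t23:20:50.5202020z"

accepted = [s for s in valid if PROPOSED.match(s)]
rejected = [s for s in iso_8601_only if not PROPOSED.match(s)]
```

The pattern only checks shape; a value such as month 13 would still pass and would need a parser to reject.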

Invalid Search.matched requests should raise an exception?

Currently, if I provide an invalid parameter to .search and call .matched(), a warning is printed.

>>> api = pystac_client.API.open("https://planetarycomputer-staging.microsoft.com/api/stac/v1/")

>>> search = api.search(
...     collections=["sentinel-2-l2a"],
...     bbox=[-93.112301, 29.649001, -92.075965, 30.719868],
...     datetime="2019-07-01/2020-06-30",
... )

>>> search.matched()
numberMatched or context.matched not in response  # this is the warning.

If I make that request explicitly, we'll see that the server returned a 422

import requests

payload = dict(
    collection=["sentinel-2-l2a"],
    bbox=[-93.112301, 29.649001, -92.075965, 30.719868],
    datetime="2019-07-01/2020-06-30",
)
>>> r = requests.post("https://planetarycomputer-staging.microsoft.com/api/stac/v1/search", json=payload)
>>> r.status_code
422

With this body

{'detail': [{'loc': ['body', 'datetime'],
   'msg': 'Invalid datetime, must match format (%Y-%m-%dT%H:%M:%SZ).',
   'type': 'value_error'}]}

That information should be surfaced to the user, ideally in an exception.

(As a tangent, I can't tell from the spec if 2019-07-01/2020-06-30 is a valid datetime value or if it has to be 2019-07-01T00:00:00Z/2020-06-30T00:00:00Z.)
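
A sketch of what surfacing that information could look like. The `APIError` here is a stand-in that additionally carries the status code and body, and `check_response` is a hypothetical helper that takes plain values so no HTTP library is needed.

```python
class APIError(Exception):
    """Stand-in exception carrying the HTTP status and response body."""

    def __init__(self, status_code: int, body: str) -> None:
        super().__init__(f"HTTP {status_code}: {body}")
        self.status_code = status_code
        self.body = body


def check_response(status_code: int, body: str) -> None:
    """Raise instead of warning when the server did not return 200."""
    if status_code != 200:
        raise APIError(status_code, body)


try:
    check_response(422, "Invalid datetime, must match format (%Y-%m-%dT%H:%M:%SZ).")
except APIError as err:
    message = str(err)
```

Keeping the status code as an attribute lets callers distinguish a bad request from a server fault without parsing the message.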

Can not import 'API'

Hi,

Apologies if this is a dull question. But I can't find any solution or workaround elsewhere.

I was trying the example in the docs: https://pystac-client.readthedocs.io/en/latest/usage/stac_api.html

But encountered errors in the very beginning. Please see below:

>>> from pystac_client.Client import API
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'pystac_client.Client'

And:

>>> from pystac_client import API
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: cannot import name 'API' from 'pystac_client' (/Applications/anaconda3/envs/deep/lib/python3.8/site-packages/pystac_client/__init__.py)

I tried it on both the base and virtual envs but they have the same error message.

The pystac-client package was installed by the command in doc: pip install git+https://github.com/stac-utils/pystac-client.git#egg=pystac_client

My python version is 3.7

Can someone help me with this issue? Appreciate it.

Update Conformance logic

Combine Classes STACAPIObjectMixin and Conformance

The STACAPIObjectMixin is a class that takes in a Conformance class to determine conformance. Combine the logic into a single Conformance class that can be added to other classes, currently:

  • Client: Used to determine basic required conformance initially
  • Search: Used to determine conformance of search endpoints/parameters

Conformance is not currently used, and the code for handling non-conformance needs to be added/updated.

I see this working as follows:

  • you can always open any STAC catalog-like object with Client
  • Most (all?) functions added in the Client class (which inherits from PySTAC Catalog) should require a conformance check. If non-conformant a ConformanceError is thrown with details.
  • If there is no conformsTo array, then assume this is a static catalog. Client will functionally work as a PySTAC Catalog, but most (all?) functions should return a ConformanceError
  • There should be an override parameter, ignore_conformance to the Client.open() function
  • Conformance is also used to determine if certain parameters to the ItemSearch class are allowed. Thus, the ItemSearch class should be passed a ConformanceClass that can be used to check if specific query parameters are allowed (e.g., if ItemSearch gets a field kwarg and the field conformance is not present, a ConformanceError should be thrown).

`simple_stac_resolver` does not handle requests properly

simple_stac_resolver does not handle the original_request: requests.Request field correctly, accessing it like a dictionary with original_request.get() instead of using attribute access:

  1. original_request.get('headers', {})
  2. original_request.get('json', {})

Both cause the function to fail (taken from the docstrings, tested on python 3.8):

import json
import requests
from pystac_api.stac_io import simple_stac_resolver

original_request = requests.Request(
    method='POST',
    url='https://stac-api/search',
    data=json.dumps({'collections': ['my-collection']}).encode('utf-8'),
    headers={'x-custom-header': 'hi-there'}
)

next_link = {
     'href': 'https://stac-api/search?next=sometoken',
     'rel': 'next'
}

next_request = simple_stac_resolver(next_link, original_request)

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-2-3727c92cfc17> in <module>
     14      'rel': 'next'
     15 }
---> 16 next_request = simple_stac_resolver(next_link, original_request)

/anaconda/envs/geo/lib/python3.8/site-packages/pystac_api/stac_io.py in simple_stac_resolver(link, original_request)
    117     # If the link object includes a "headers" property, use that and respect the "merge" property.
    118     link_headers = link.get('headers')
--> 119     headers = original_request.get('headers', {})
    120     if link_headers is not None:
    121         headers = {**headers, **link_headers} if merge else link_headers

AttributeError: 'Request' object has no attribute 'get'
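
A sketch of the header handling using attribute access instead of `.get()`. To stay self-contained it uses a `SimpleNamespace` as a stand-in for `requests.Request` (which exposes `headers` as an attribute); `merged_headers` is a hypothetical helper, and reading `merge` from the link is an assumption about the surrounding code.

```python
from types import SimpleNamespace
from typing import Any, Dict


def merged_headers(link: Dict[str, Any], original_request: Any) -> Dict[str, str]:
    """Combine link headers with the original request's headers."""
    merge = bool(link.get("merge", False))
    link_headers = link.get("headers")
    # Attribute access; fall back to an empty dict if headers is missing or None.
    headers = dict(getattr(original_request, "headers", None) or {})
    if link_headers is not None:
        headers = {**headers, **link_headers} if merge else link_headers
    return headers


original_request = SimpleNamespace(headers={"x-custom-header": "hi-there"})
next_link = {"href": "https://stac-api/search?next=sometoken", "rel": "next"}
result = merged_headers(next_link, original_request)
```

The same pattern would apply to the body, reading `original_request.json` rather than calling `original_request.get('json', {})`.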

Documentation on how to use with NASA CMR STAC endpoint

I'm trying to migrate over to using pystac-client from https://github.com/sat-utils/sat-search. Specifically, I'd like to use NASA's CMR stac endpoint but I'm unable to get any searches to work using a bbox parameter, which is pretty critical! Is this a bug? Or am I missing something important with formatting these keyword arguments?

For example, the following search does work with sat-search

from satsearch import Search
results = Search(url='https://cmr.earthdata.nasa.gov/stac/LPCLOUD',
                 collections=['HLSS30.v1.5'], 
                 #bbox = [-122.4,41.3,-122.1,41.5], #SatSearchError: "Request failed with status code 400"
                 bbox = '-122.4,41.3,-122.1,41.5', #works! 
                 datetime='2021-01-01/2021-02-01',  
                 #sortby='-properties.datetime' # results in 0 found but no error raised
                )
results.found()

But trying to replicate this does not work with pystac-client.

# Works if bbox commented out
catalog = Client.open("https://cmr.earthdata.nasa.gov/stac/LPCLOUD")
results = catalog.search(collections=['HLSS30.v1.5'],
                         #bbox = [-122.4,41.3,-122.1,41.5], 
                         bbox = '-122.4,41.3,-122.1,41.5',
                         datetime='2021-01-01/2021-02-01',
                         )
print(f"{results.matched()} items found")
---------------------------------------------------------------------------
APIError                                  Traceback (most recent call last)
<ipython-input-124-b65c76d02221> in <module>
      6                          datetime='2021-01-01/2021-02-01',
      7                          )
----> 8 print(f"{results.matched()} items found")

~/miniconda3/envs/hyp3/lib/python3.9/site-packages/pystac_client/item_search.py in matched(self)
    351 
    352     def matched(self) -> int:
--> 353         resp = make_request(self.session, self.request, {"limit": 0})
    354         found = None
    355         if 'context' in resp:

~/miniconda3/envs/hyp3/lib/python3.9/site-packages/pystac_client/stac_io.py in make_request(session, request, additional_parameters)
     46     resp = session.send(prepped)
     47     if resp.status_code != 200:
---> 48         raise APIError(resp.text)
     49     return resp.json()
     50 

APIError: "Request failed with status code 400"

I suspect because of formatting, but the docs do state, bbox can be passed as a string:

bbox: list or tuple or Iterator or str, optional

related: nasa/cmr-stac#153, sat-utils/sat-search#106
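
If the client accepts both strings and sequences for bbox, normalizing them to one form before building the request removes one source of ambiguity. The helper below, `normalize_bbox`, is a hypothetical sketch; whether it would resolve the 400 from CMR is not established here.

```python
from typing import Iterable, List, Union


def normalize_bbox(bbox: Union[str, Iterable[float]]) -> List[float]:
    """Turn '-122.4,41.3,-122.1,41.5' or a sequence into a list of floats."""
    parts = bbox.split(",") if isinstance(bbox, str) else list(bbox)
    values = [float(p) for p in parts]
    # A bbox has 4 values in 2D or 6 values in 3D.
    if len(values) not in (4, 6):
        raise ValueError(f"bbox must have 4 or 6 values, got {len(values)}")
    return values
```

For example, `normalize_bbox('-122.4,41.3,-122.1,41.5')` and `normalize_bbox([-122.4, 41.3, -122.1, 41.5])` both give the same list of four floats.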

Expand partial datetimes when querying

STAC APIs should accept just dates and shouldn't require full datetimes (e.g., 2020-01-01 vs 2020-01-01T12:00:00Z), since querying by time component isn't a common use case.

However, it appears that not all existing STAC API implementations support this and may require full datetimes.

The client can handle this logic and be explicit about the datetimes it requests.
Users should be able to specify:

  • A single year which should translate to a datetime range encompassing the whole year
  • A range of years which should translate to beginning of year1 to end of year2
  • A single month or range of months (2020-01/2020-02) which should act as above
  • A single fully complete datetime, which should be unaltered
  • A single date which should translate to a range encompassing the whole day
  • A date range, which should translate to beginning of date1 to end of date2

Update documentation with examples showing all the possible inputs
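
The expansion rules listed above can be sketched as follows. The names `expand` and `expand_range` are hypothetical, and treating "end of day" as 23:59:59Z is an assumption; anything that is not a bare year, year-month, or date is passed through unchanged.

```python
import calendar
import re
from typing import Tuple

PARTIAL = re.compile(r"^(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?$")


def expand(value: str) -> Tuple[str, str]:
    """Return (start, end) datetimes covering a year, month, or day."""
    m = PARTIAL.match(value)
    if m is None:
        return value, value  # assume it is already a complete datetime
    year, month, day = m.groups()
    start_month = month or "01"
    end_month = month or "12"
    start_day = day or "01"
    # monthrange gives the last day of the month, accounting for leap years.
    end_day = day or f"{calendar.monthrange(int(year), int(end_month))[1]:02d}"
    return (
        f"{year}-{start_month}-{start_day}T00:00:00Z",
        f"{year}-{end_month}-{end_day}T23:59:59Z",
    )


def expand_range(value: str) -> str:
    """Expand 'a' or 'a/b' into an explicit datetime or datetime range."""
    parts = value.split("/")
    if len(parts) == 1:
        start, end = expand(parts[0])
        return start if start == end else f"{start}/{end}"
    # Range: beginning of the first part to the end of the second.
    return f"{expand(parts[0])[0]}/{expand(parts[1])[1]}"
```

For instance, `expand_range("2020-01/2020-02")` covers the start of January through the end of February 2020.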
