stac-server's People

Contributors

amystamile-usgs, atdaniel, dependabot[bot], drewbo, fredliporace, gadomski, geomatician, j08lue, jlaura, jwalgran, kbgg, kurtmckee, mattbialas, matthewhanson, metasim, peteshepley, philvarner, quetcodesfire, samsipe, sharkinsspatial, sjwoodr, stark525, vincentsarago

stac-server's Issues

Topics should be a list

According to the AWS CloudFormation docs, AWS::SNS::TopicPolicy.Topics should be an array of strings. However, in the current serverless.yml, Topics is a single string.
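A minimal sketch of the fix in serverless.yml (the resource and topic names are placeholders, not the actual names used in the repo):

```yaml
Resources:
  snsTopicPolicy:
    Type: AWS::SNS::TopicPolicy
    Properties:
      Topics:
        # a YAML list of topic ARNs, not a single string
        - !Ref ingestTopic
```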

Full compliance with OAF

Check and implement any missing functionality to be fully compliant with the OGC API - Features 1.0 specification.

Update primary key to be collectionId + itemId

Currently the primary key in the Elasticsearch backend is the Item ID.

However, if the server has items with the same ID across multiple collections, this causes a problem.

For instance, a sentinel-s2-l2 collection and a sentinel-s2-l2-cogs collection that mirrors the first collection but with COGs will have the same Item IDs. This is not possible within the same stac-server.

Instead, the primary key should be something like collectionId_itemId.

This has repercussions for the transaction extension (see #37): if either ID is edited during an update, a new item will need to be added and the old one removed.
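A minimal sketch of the composite key, assuming the collectionId_itemId scheme suggested above:

```javascript
// Sketch: derive the Elasticsearch document id from the collection and item ids.
// Because the document id depends on both, an update that changes either one
// must be handled as an add of the new document plus a delete of the old one.
function esDocId (item) {
  return `${item.collection}_${item.id}`
}
```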

Logging batch ingest errors

When one or more Items fail ingestion into Elasticsearch during a bulk write, the returned error message is long and unhelpful: it is one big string containing info from all the 'docs' in the batch. Here's a sample:

2019-01-22T20:51:11.320Z - error: _index=items, _type=doc, _id=LC80230012015141LGN00, _version=1, result=created, total=2, successful=2, failed=0, _seq_no=3705, _primary_term=1, status=201, _index=items, _type=doc, _id=LC80230012015125LGN00, _version=1, result=created, total=2, successful=2, failed=0, _seq_no=3690, _primary_term=1, status=201, _index=items, _type=doc, _id=LC80230012015109LGN00, _version=1, result=created, total=2, successful=2, failed=0, _seq_no=3789, _primary_term=1, status=201, _index=items, _type=doc, _id=LC80230012015093LGN00, _version=1, result=created, total=2, successful=2, failed=0, _seq_no=3706, _primary_term=1, status=201, _index=items, _type=doc, _id=LC80230012015077LGN00, _version=1, result=created, total=2, successful=2, failed=0, _seq_no=3691, _primary_term=1, status=201, _index=items, _type=doc, _id=LC80230102016096LGN01, _version=1, result=created, total=2, successful=2, failed=0, _seq_no=3600, _primary_term=1, status=201, _index=items, _type=doc, _id=LC80230102015253LGN02, _version=1, result=created, total=2, successful=2, failed=0, _seq_no=3601, _primary_term=1, status=201, _index=items, _type=doc, _id=LC80230102014106LGN01, _version=1, result=created, total=2, successful=2, failed=0, _seq_no=3604, _primary_term=1, status=201, _index=items, _type=doc, _id=LC80230102014234LGN01, _version=1, result=created, total=2, successful=2, failed=0, _seq_no=3602, _primary_term=1, status=201,
Each record starts with "_index" and ends with "status". Successful records have "failed=0".

However, a failed record actually looks different:

_index=items, _type=doc, _id=LC80231212013343LGN00, status=400, type=mapper_parsing_exception, reason=failed to parse [geometry], type=parse_exception, reason=invalid number of points in LinearRing (found [1] - must be >= [4]),
It doesn't contain "successful", "failed", or similar fields; instead it has status=400 and a "reason" field containing the actual error.

We don't want to log the entire string; instead we want to log each item in the batch that failed, if any.

We just need to log the "_id" field and the "reason" field for each error.
(via Matt H.)
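A sketch of the parsing, assuming the field layout shown in the samples above (records start at "_index=", failures carry a 4xx status, and the last "reason=" segment holds the actual error):

```javascript
// Extract the "_id" and final "reason" of each failed record from the
// concatenated bulk-error string.
function failedRecords (errorString) {
  return errorString
    .split(/(?=_index=)/)                       // one chunk per record
    .filter((rec) => /status=4\d\d/.test(rec))  // keep only failures
    .map((rec) => {
      const id = (rec.match(/_id=([^,]+)/) || [])[1]
      const reasons = [...rec.matchAll(/reason=([^,]+)/g)].map((m) => m[1])
      return { id, reason: reasons[reasons.length - 1] }
    })
}
```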

Keyword search enhancement

Provide an ability to specify a list of keywords on an Item to use in item searching. This allows the item to remain generalized but include indexable/searchable information for specific use cases. For example, an item could be used to represent a parcel, county, contract, easement or land use agreement.

This is important for applications using STAC to index remote sensing data (e.g., imagery) across a number of vertical markets, i.e., using the same properties schema while allowing arbitrary searchable keywords.

Usage examples:

3111822110030 (to find an item listing this ID as a keyword)
HEN (to find keywords starting with "Hen")
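For illustration, Elasticsearch query bodies for the two usage examples, assuming the keywords were indexed under a properties.keywords keyword field (the field path is an assumption, not an existing stac-server mapping):

```javascript
// Exact keyword match, e.g. finding an item that lists a parcel ID.
const exactId = { query: { term: { 'properties.keywords': '3111822110030' } } }

// Prefix match, e.g. finding keywords starting with "hen".
const byPrefix = { query: { prefix: { 'properties.keywords': { value: 'hen', case_insensitive: true } } } }
```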

Update paging to follow API spec

stac-server currently converts POST requests to GET requests and uses the resulting URL as a next link.

Instead, it needs to follow the spec, returning a next-page token along with the headers and body the client must resend to fetch the next page.
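A sketch of a spec-style "next" link for a POST search, following the STAC API paging fragment (the URL and token value are illustrative):

```javascript
// A "next" link telling the client to repeat its POST /search request,
// merging the paging token into its original body.
const nextLink = {
  rel: 'next',
  href: 'https://stac-server.example.com/search',
  method: 'POST',
  merge: true, // client merges `body` into its original request body
  body: { next: 'next-page-token' }
}
```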

AWS Architecture

Can someone help me understand the AWS architecture that gets deployed by this repo? Thanks.

publish libs package

Hi,
it would be really nice to have a package that exposes the library index (https://github.com/stac-utils/stac-server/blob/master/libs/index.js) so anyone can use the API implementation and deploy it as needed.

For example, we easily plugged the library into fastify (because we don't use the AWS cloud or have access to a FaaS), but currently we have to fork the repo to achieve this. Publishing the libs/index.js package would cleanly split the web server implementation/deployment from the main library. And maybe in the near future we could imagine plugin packages like stac-server-(express|fastify|hapi|serverless) that leverage the base library independently of the web server implementation used.

Anyway, thanks for this work; it gave us a great idea of what we can achieve with the STAC spec!

add publish SNS for newly ingested Items

There should be an SNS topic deployed with stac-server.

And then there should be an option (controlled via an envvar) that will publish newly ingested items to the topic.

This allows users to subscribe and monitor new Items added to the server.

Furthermore, messages should be published with attributes for:

  • bounding box
  • datetime
  • collection

which allow a subscriber to filter on these attributes and only receive messages that meet the criteria, e.g., "send me all new Items within this bounding box for 2019 for collection sentinel-s2-l2a".
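A sketch of how the filterable message attributes might be built from an Item (the attribute names like bbox.west are assumptions for illustration, not an existing stac-server contract):

```javascript
// Build the SNS MessageAttributes map for a newly ingested Item so that
// subscribers can attach filter policies on collection, datetime, and bbox.
function messageAttributes (item) {
  const [west, south, east, north] = item.bbox
  return {
    collection: { DataType: 'String', StringValue: item.collection },
    datetime: { DataType: 'String', StringValue: item.properties.datetime },
    'bbox.west': { DataType: 'Number', StringValue: String(west) },
    'bbox.south': { DataType: 'Number', StringValue: String(south) },
    'bbox.east': { DataType: 'Number', StringValue: String(east) },
    'bbox.north': { DataType: 'Number', StringValue: String(north) }
  }
}
```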

Add STAC API Filter Extension (CQL2) support

The conformance classes Basic CQL2 and Basic Spatial Operators should be implemented. Basic CQL2 allows the logical operators (AND, OR, NOT), comparison operators (=, <>, <, <=, >, >=), and IS NULL against string, numeric, boolean, date, and datetime types. Basic Spatial Operators allows S_INTERSECTS (spatial intersects) on geometry fields.

Stretch: Advanced Comparison Operators defines the LIKE, BETWEEN, and IN operators, though BETWEEN and IN can be written less concisely as comparisons AND'ed or OR'ed together. LIKE cannot (practically) be worked around this way.

Other implementations:

  • pgstac - the most up-to-date one
  • pygeoapi - might be out of date with respect to the latest spec

Open Questions:

  • CQL2-Text and/or CQL2-JSON?
  • which conformance classes?
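For illustration, the same Basic CQL2 filter in both encodings (the property names and values are examples, not part of this issue):

```javascript
// CQL2-Text form:
//   collection = 'sentinel-s2-l2a' AND eo:cloud_cover <= 10
//   AND S_INTERSECTS(geometry, POINT(-105.1 40.2))
//
// The equivalent CQL2-JSON form:
const cql2Json = {
  op: 'and',
  args: [
    { op: '=', args: [{ property: 'collection' }, 'sentinel-s2-l2a'] },
    { op: '<=', args: [{ property: 'eo:cloud_cover' }, 10] },
    {
      op: 's_intersects',
      args: [{ property: 'geometry' }, { type: 'Point', coordinates: [-105.1, 40.2] }]
    }
  ]
}
```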

Next link generated in search results causes JSON parse error

When the result for the collection/items endpoint is returned, it includes a link to fetch the next page of results which looks like this:
https://api.server.com/collections/some_collection/items?collections=some_collection&page=2&limit=10

When navigating to this link, this error is generated:
Unexpected token r in JSON at position 0

It looks like the collections parameter is expected to be a JSON array of strings rather than a single string, causing a JSON parsing error.
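One possible fix, sketched here as a hypothetical helper (not existing stac-server code), is to accept either form of the parameter so a bare string no longer breaks JSON.parse:

```javascript
// Normalize the `collections` query parameter to an array of strings,
// accepting a JSON array, a comma-separated list, or a single value.
function parseCollections (value) {
  if (value === undefined) return []
  if (Array.isArray(value)) return value
  try {
    const parsed = JSON.parse(value)
    return Array.isArray(parsed) ? parsed : [String(parsed)]
  } catch (_) {
    return value.split(',')
  }
}
```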

Fix Field Extension

(matthewhanson)
I had to disable the fields code because the default behavior returned only a subset of the Item rather than the complete Item.

Targeting this for the 0.4.0 final; if it doesn't make it, we can keep the fields extension out and put it back in for 0.5.0.

add new paging support

Pagination is now handled by providing a link in the returned ItemCollection with a rel type of next.

This doesn't seem to be clearly explained in the spec; it is only referred to in the OpenAPI YAML file.

Missing items return with status code 200 instead of 404

I've been using Earth Search for Sentinel-2 data (it's awesome!) and noticed some weird behavior around missing items. For example,

https://earth-search.aws.element84.com/v0/collections/sentinel-s2-l1c/items/foo

Will return an HTTP 200 status code, but with a body that looks like:

{
  "code": 404,
  "message": "Item not found"
}

When searching the catalog, I need to do this (extracted from a search function):

import requests
from urllib.error import HTTPError

S2_SEARCH_URL = 'https://earth-search.aws.element84.com/v0/collections/sentinel-s2-l1c/items'

search_url = f'{S2_SEARCH_URL}/foo'
response = requests.get(search_url)
response.raise_for_status()  # never raises, because the server responds 200

# The 404 only appears inside the response body, so it has to be checked
# by hand and re-raised as an HTTPError.
scene_metadata = response.json()
if scene_metadata.get('code') == 404:
    raise HTTPError(search_url, 404, scene_metadata['message'], response.headers, None)

which is a little cumbersome.

I'd expect this:

>>> import requests
>>> S2_SEARCH_URL = 'https://earth-search.aws.element84.com/v0/collections/sentinel-s2-l1c/items'
>>> response = requests.get(f'{S2_SEARCH_URL}/foo')
>>> response.raise_for_status()
Traceback (most recent call last):
...
requests.exceptions.HTTPError: 404 Client Error: Not Found for url:...

Which, for comparison, is how USGS's STAC catalog works:

>>> import requests
>>> LC2_SEARCH_URL = 'https://landsatlook.usgs.gov/sat-api/collections/landsat-c2l1/items'
>>> response = requests.get(f'{LC2_SEARCH_URL}/foo')
>>> response.raise_for_status()
Traceback (most recent call last):
...
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://landsatlook.usgs.gov/sat-api/collections/landsat-c2l1/items/foo

PUT and PATCH transactions that change the collection should move the item to another index

Updating the backend to use separate elasticsearch indexes (#38) broke transactions (#37) when the collection is one of the fields to be edited.

If collection is one of the fields to be changed, then:

1. the PATCH operation needs to retrieve the original Item and merge the updated fields with it
2. add the new Item to the index named for the item's new collection
3. delete the old Item from the old collection's index

cc @seanmurph
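The three steps can be sketched as follows (the client methods and per-collection index naming are assumptions about the Elasticsearch wrapper, not existing stac-server code):

```javascript
// Handle a PATCH that changes `collection`: merge, re-index, then delete.
async function patchAcrossCollections (client, oldItem, patch) {
  // 1. merge the updated fields into the original Item
  const updated = { ...oldItem, ...patch }
  // 2. add the merged Item to the new collection's index
  await client.index({ index: updated.collection, id: updated.id, body: updated })
  // 3. delete the old Item from the old collection's index
  await client.delete({ index: oldItem.collection, id: oldItem.id })
  return updated
}
```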

Review mappings

Review the mappings in the fixtures directory against the latest stac-spec version.

update name from stac-api to stac-server

This codebase was forked from sat-api and renamed stac-api, since it has become a server for STAC generally and is not restricted to satellite data.

However, this is causing confusion, since the STAC API spec is also called "STAC API".

Instead, this should be called stac-server. The repo has been renamed (stac-api will redirect here), but there are references in the code that should be updated to stac-server.

Update documentation

Whether in README.md or in a separate file, a few things should be expanded or changed:

Subscribing to SNS Topics - the docs imply this can be done through the 'Create subscription' interface, but that doesn't seem to work. Instead, adding a trigger on the Lambda edit page is the simple way to do it through the console.

Don't index all fields

Not all fields need to be indexed. Instead, the full document can be stored in Elasticsearch while only certain fields are indexed.

We should use exclusion logic to avoid indexing new fields that may be added by extensions.

For example:

  • collection need not be indexed, since each item is in a collection specific index
  • item assets need not be indexed
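A mapping sketch for the two examples above, using standard Elasticsearch mapping parameters (the field choices follow the bullets; the rest of the mapping is omitted):

```javascript
// Store the full document in _source, but keep these fields out of the index:
// `index: false` skips indexing a single field, and `enabled: false` skips
// indexing an entire object while still storing it.
const mapping = {
  mappings: {
    properties: {
      collection: { type: 'keyword', index: false }, // implied by the per-collection index
      assets: { type: 'object', enabled: false }     // stored, never searched
    }
  }
}
```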

links is not defined

I am seeing a ReferenceError when results need to be paged. It may be coming from here:

links.href = `${endpoint}?${nextQueryParameters}`

Should this be link (defined on line 317) rather than links?
