
eidaws's People

Contributors

dependabot[bot] sheimers

Forkers

damb leaneb kaestli

eidaws's Issues

Enhance `fdsnws-station` crawling

Introduce the following configuration parameters for fdsnws-station-text|xml federating instances:

  • crawl-omit-frontend-cache: Omits a frontend-cache lookup for requests issued by eida-crawl-fdsnws-station
  • crawl-omit-routing: Omits routing and forwards the request directly to the endpoint. Note that this is a reasonable further optimization since crawler requests are based on eidaws-stationlite routing information, anyway.

RuntimeError in middleware

The following exception was logged:

Traceback (most recent call last):
  File "/usr/local/src/eidaws/eidaws.federator/eidaws/federator/utils/middleware.py", line 46, in exception_handling_middleware
    return await handler(request)
  File "/usr/local/src/eidaws/eidaws.federator/eidaws/federator/utils/remote.py", line 16, in middleware
    return await super().middleware(request, handler)
  File "/var/www/eidaws-federator/venv/lib/python3.7/site-packages/aiohttp_remotes/x_forwarded.py", line 94, in middleware
    return await handler(request)
  File "/var/www/eidaws-federator/venv/lib/python3.7/site-packages/aiohttp/web_urldispatcher.py", line 948, in _iter
    resp = await method()
  File "/usr/local/src/eidaws/eidaws.federator/eidaws/federator/utils/view.py", line 52, in get
    return await processor.federate(timeout=self.client_timeout)
  File "/usr/local/src/eidaws/eidaws.federator/eidaws/federator/utils/process.py", line 104, in wrapper
    return await coro(self, *args, **kwargs)
RuntimeError: coroutine ignored GeneratorExit

which originally referred to the request (aiohttp access log):

2021-04-13T06:06:13.684492 "GET /fedws/wfcatalog/json/1/query?starttime=2004-01-01T00:00:00&endtime=2004-01-02T00:00:00&net=CH HTTP/1.1' '-' 'ApacheBench/2.3'

Logs:

Apr 13 06:06:13 a6e70dd05f2a <EIDA> 2021-04-13T06:06:13+0000 INFO eidaws.federator.middleware 75 misc.py:229 - [3ae057cd-8baa-45d7-b449-b7d991ab200f]  2021-04-13T06:06:13.684492 "GET /fedws/wfcatalog/json/1/query?starttime=2004-01-01T00:00:00&endtime=2004-01-02T00:00:00&net=CH&starttime=2004-01-01T00:00:00&endtime=2004-01-02T00:00:00&net=CH HTTP/1.1' '-' 'ApacheBench/2.3'
[...]
Apr 13 07:34:56 a6e70dd05f2a <EIDA> 2021-04-13T07:34:56+0000 CRITICAL eidaws.federator.middleware 75 middleware.py:66 - [3ae057cd-8baa-45d7-b449-b7d991ab200f] Local Exception: <class 'RuntimeError'>
Apr 13 07:34:56 a6e70dd05f2a <EIDA> 2021-04-13T07:34:56+0000 CRITICAL eidaws.federator.middleware 75 middleware.py:70 - [3ae057cd-8baa-45d7-b449-b7d991ab200f] Traceback information: ['Traceback (most recent call last):\n', '  File "/usr/local/src/eidaws/eidaws.federator/eidaws/federator/utils/middleware.py", line 46, in exception_handling_middleware\n    return await handler(request)\n', '  File "/usr/local/src/eidaws/eidaws.federator/eidaws/federator/utils/remote.py", line 16, in middleware\n    return await super().middleware(request, handler)\n', '  File "/var/www/eidaws-federator/venv/lib/python3.7/site-packages/aiohttp_remotes/x_forwarded.py", line 94, in middleware\n    return await handler(request)\n', '  File "/var/www/eidaws-federator/venv/lib/python3.7/site-packages/aiohttp/web_urldispatcher.py", line 948, in _iter\n    resp = await method()\n', '  File "/usr/local/src/eidaws/eidaws.federator/eidaws/federator/utils/view.py", line 52, in get\n    return await processor.federate(timeout=self.client_timeout)\n', '  File "/usr/local/src/eidaws/eidaws.federator/eidaws/federator/utils/process.py", line 104, in wrapper\n    return await coro(self, *args, **kwargs)\n', 'RuntimeError: coroutine ignored GeneratorExit\n']

Note the timestamps of the log messages.

The issue seems to be reproducible; however, so far it has only been observed with eidaws-wfcatalog requests.
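
For context, this message is what CPython raises when a coroutine is closed but suspends again instead of finishing. The following minimal sketch reproduces the message in isolation. It is not taken from the eidaws code base, the names `_suspend`, `stubborn` and `close_and_capture` are hypothetical, and whether the federator reaches this state in the same way is an assumption that would still need to be verified.

```python
import types


@types.coroutine
def _suspend():
    # A bare yield hands control back to whoever drives the coroutine.
    yield


async def stubborn():
    try:
        await _suspend()
    except GeneratorExit:
        # Suspending again while being closed is what triggers the error.
        await _suspend()


def close_and_capture():
    coro = stubborn()
    coro.send(None)  # run up to the first suspension point
    try:
        coro.close()  # throws GeneratorExit into the coroutine
    except RuntimeError as exc:
        return str(exc)
    return None


print(close_and_capture())
```

In real code the second suspension is usually hidden in cleanup logic (for example an `await` inside an `except BaseException` or `finally` block) that runs while the coroutine is being closed.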


Dependencies:

# pip freeze
aiodns==2.0.0
aiofiles==0.6.0
aiohttp==3.7.4.post0
aiohttp-cors==0.7.0
aiohttp-remotes==1.0.0
aioredis==1.3.1
async-timeout==3.0.1
attrs==20.3.0
brotlipy==0.7.0
cached-property==1.5.2
cchardet==2.1.7
cffi==1.14.5
chardet==4.0.0
ConfigArgParse==1.4
# Editable install with no version control (eidaws.federator==0.12.0)
-e /usr/local/src/eidaws/eidaws.federator
# Editable install with no version control (eidaws.utils==0.1)
-e /usr/local/src/eidaws/eidaws.utils
hiredis==2.0.0
idna==3.1
importlib-metadata==3.0.0
intervaltree==3.1.0
jsonschema==3.2.0
lxml==4.6.3
marshmallow==3.2.1
multidict==4.7.6
pkg-resources==0.0.0
pycares==3.1.1
pycparser==2.20
pyrsistent==0.17.3
python-dateutil==2.8.1
PyYAML==5.4.1
six==1.15.0
sortedcontainers==2.3.0
typing-extensions==3.7.4.3
webargs==5.5.3
yarl==1.5.1
zipp==3.4.1

Wrong route selection in case of stations on different networks, but same names

There are two stations "EF02", one after 2021 in the 2D network in Germany, one around 2012 in the CL network in Greece:
http://eida-federator.ethz.ch/fdsnws/station/1/query?station=EF02&format=text

A query asking for the time window of the first but the coordinates of the second erroneously returns the first, which does not satisfy the coordinate constraints:
http://eida-federator.ethz.ch/fdsnws/station/1/query?starttime=2022-06-01T00%3A00%3A00&endtime=2022-06-10T00%3A00%3A00&minlatitude=38.4137&maxlatitude=38.4138&minlongitude=21.94&maxlongitude=21.95&format=text&level=channel&nodata=404
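
The expected behaviour can be sketched as follows: the time window and the bounding box must both hold for the same station epoch. This is only an illustration of the intended semantics, not the routing code itself. The `StationEpoch` class, the `matches` helper and all coordinates and epoch dates below are hypothetical placeholder values; only the query parameters are taken from the URL above.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class StationEpoch:
    network: str
    station: str
    start: datetime
    end: datetime
    latitude: float
    longitude: float


def matches(epoch, start, end, minlat, maxlat, minlon, maxlon):
    """True only if one and the same epoch satisfies time AND area."""
    overlaps = epoch.start < end and epoch.end > start
    inside = (
        minlat <= epoch.latitude <= maxlat
        and minlon <= epoch.longitude <= maxlon
    )
    return overlaps and inside


# Placeholder epochs and coordinates, for illustration only.
EPOCHS = [
    StationEpoch("2D", "EF02", datetime(2021, 1, 1), datetime(2023, 1, 1), 50.0, 10.0),
    StationEpoch("CL", "EF02", datetime(2012, 1, 1), datetime(2013, 1, 1), 38.41375, 21.945),
]

selected = [
    e
    for e in EPOCHS
    if matches(
        e,
        datetime(2022, 6, 1),
        datetime(2022, 6, 10),
        38.4137,
        38.4138,
        21.94,
        21.95,
    )
]
print(selected)  # no epoch satisfies both constraints
```

With these placeholder values neither epoch qualifies, so the request should end up as a no-data response rather than returning the 2D station.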

Allow non-standard URL base paths

Allow harvesting of routes that include non-standard FDSN base URL paths, i.e. base paths which do not fully match the pattern

/fdsnws/<service>/<majorversion>/

as specified by https://www.fdsn.org/webservices/FDSN-WS-Specification-Commonalities-1.2.pdf. As long as services implement the FDSN webservice specifications apart from the URL base path, both eidaws-federator and eidaws-stationlite should be able to cope with this particular minor deviation.

Note, for eidaws-federator clients nothing changes since the service uses standardized FDSN base paths. Non-standard URL base paths are exclusively used internally.
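
As a rough sketch of the distinction, the check below tests whether a URL path starts with the standard pattern and splits a routed URL into base path and method. The regular expression, the function names and the `example.org` URLs are assumptions made for illustration; they are not part of the eidaws code base.

```python
import re
from urllib.parse import urlsplit

# Standard pattern: /fdsnws/<service>/<majorversion>/
STANDARD_BASE = re.compile(r"^/fdsnws/(?P<service>[a-z]+)/(?P<major>\d+)/")


def has_standard_base_path(url):
    """True if the URL path starts with the standard FDSN base path."""
    return STANDARD_BASE.match(urlsplit(url).path) is not None


def split_base_path(url):
    """Split the URL path into (base path, method), e.g. method 'query'."""
    base, _, method = urlsplit(url).path.rpartition("/")
    return base + "/", method


print(has_standard_base_path("http://example.org/fdsnws/station/1/query"))
print(has_standard_base_path("http://example.org/custom/fdsnws/station/1/query"))
print(split_base_path("http://example.org/custom/fdsnws/station/1/query"))
```

A harvester that accepts non-standard base paths would keep the base path as found in the routing configuration instead of rejecting URLs for which the first check fails.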

Epochs returned by `eidaws-stationlite` for `service=station&level=network|station`

Currently, all epochs returned by eidaws-stationlite are based on the underlying ChannelEpochs.

In the context of stream epoch canonicalization this is not ideal when querying fdsnws-station metadata for level=network|station. Instead, it would be better to rely on the underlying NetworkEpoch or rather StationEpoch, respectively, in order to maximize cache hit performance.

Routing harvesting duplicates channelepochs (and related epochs) in the routing DB

Pre-existing channel epochs are not recognized: each harvesting run of the routing declaration introduces all channel epochs again as new ones, rather than updating the last-seen dates of the pre-existing ones.
(This behaviour is not observed with network epochs and station epochs; however, those still have the issue that the lastseen date is not updated, cf. #31.)
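
The expected behaviour amounts to an "update or insert" step per harvested epoch. The sketch below uses a plain dictionary as a stand-in for the routing DB; the key layout and the `upsert_epoch` helper are hypothetical and only illustrate the idea.

```python
from datetime import datetime


def upsert_epoch(store, key, seen_at):
    """Insert a new epoch or refresh lastseen of an existing one.

    Returns True if a new entry was created, False if only lastseen changed.
    """
    created = key not in store
    store[key] = seen_at  # value is the lastseen timestamp
    return created


store = {}
# Hypothetical identity of a channel epoch: codes plus epoch boundaries.
key = ("CH", "HASLI", "", "HHZ", datetime(2010, 1, 1), None)

first = upsert_epoch(store, key, datetime(2021, 4, 1))
second = upsert_epoch(store, key, datetime(2021, 4, 2))
print(first, second, len(store))
```

After two harvesting runs the store still holds a single entry, and its lastseen value reflects the second run.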

Optionally force HTTP while harvesting

Due to https://github.com/EIDA/etc/issues/36, routes are now configured via HTTPS, too. However, eidaws-federator's backend is not yet prepared to fully rely on HTTPS. Since all EIDA endpoint nodes still provide HTTP, and eidaws-federator's backend does not strictly require HTTPS, a simple workaround is to (optionally) override HTTPS routing URLs with the corresponding HTTP routing URLs while harvesting. This solution should work as long as:

  • EIDA endpoint nodes provide HTTP APIs
  • eidaws-federator does not implement restricted data access.
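
A minimal sketch of such an override, assuming it is applied to each routing URL during harvesting. The function name `force_http` and the `example.org` URL are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def force_http(url):
    """Return the URL with an https scheme replaced by http.

    URLs using any other scheme are returned unchanged. An explicit port
    such as :443 is NOT rewritten; that case would need extra handling.
    """
    parts = urlsplit(url)
    if parts.scheme != "https":
        return url
    return urlunsplit(parts._replace(scheme="http"))


print(force_http("https://example.org/fdsnws/station/1/query?net=CH"))
```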

`fdsnws-station` query filter parameters for `eidaws-stationlite`

With #12 eidaws-stationlite returns fully canonicalized stream epochs for service=station requests. However, endpoint requests can be further canonicalized by means of resolving as many query filter parameters as possible already at eidaws-stationlite, instead of forwarding those to the endpoints. Query filter parameters which still should be resolved at eidaws-stationlite include:

  • the time parameters startbefore, startafter, endbefore, endafter
  • includerestricted
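
For the time parameters, the filtering could look roughly like the sketch below. The function `epoch_matches` is hypothetical, and treating an open epoch (no end time) as ending infinitely far in the future is an assumption of this sketch.

```python
from datetime import datetime


def epoch_matches(start, end, startbefore=None, startafter=None,
                  endbefore=None, endafter=None):
    """Check an epoch against the fdsnws-station time filter parameters.

    end=None denotes an open epoch, assumed here to end in the far future.
    """
    if startbefore is not None and not start < startbefore:
        return False
    if startafter is not None and not start > startafter:
        return False
    if endbefore is not None and (end is None or not end < endbefore):
        return False
    if endafter is not None and end is not None and not end > endafter:
        return False
    return True


closed = (datetime(2012, 1, 1), datetime(2013, 1, 1))
opened = (datetime(2021, 1, 1), None)

print(epoch_matches(*closed, startbefore=datetime(2015, 1, 1)))
print(epoch_matches(*opened, endbefore=datetime(2030, 1, 1)))
print(epoch_matches(*opened, endafter=datetime(2030, 1, 1)))
```

Epochs rejected by this check would not need to be forwarded to the endpoints at all.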

Allow harvesting from files

Allow specifying eidaws-routing localconfig configuration files by means of file URIs when harvesting. Currently, when harvesting, localconfig destinations must be specified with a remote URL. However, this requires a webserver to be up and running (though usually this is the case when e.g. deploying eidaws-stationlite behind a reverse-proxy).
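
One possible way to support both kinds of destinations is to dispatch on the URI scheme, as in the sketch below. The function `read_localconfig` is hypothetical and uses only the Python standard library; the actual harvester may fetch remote resources differently.

```python
from pathlib import Path
from urllib.parse import urlsplit
from urllib.request import url2pathname, urlopen


def read_localconfig(uri, timeout=30):
    """Return the raw bytes of a localconfig given a file or HTTP(S) URI."""
    parts = urlsplit(uri)
    if parts.scheme == "file":
        # Convert the URI path into a local filesystem path.
        return Path(url2pathname(parts.path)).read_bytes()
    if parts.scheme in ("http", "https"):
        with urlopen(uri, timeout=timeout) as resp:
            return resp.read()
    raise ValueError(f"unsupported URI scheme: {parts.scheme!r}")
```

A local configuration file would then be referenced as, for example, `Path("routing.xml").resolve().as_uri()`, without a webserver being involved.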

Upgrade to `aiohttp>=3.8.0`

Note that upgrading currently leads to the following error, e.g.:

Mar  8 11:12:44 22ed0a6a9060 nginx: 172.22.0.1 - - 08/Mar/2022:11:12:44 +0000 "GET /eidaws/routing/1/query?sta=BFO&service=station&cha=HHE HTTP/1.1" 404 326 0.000 "-" "curl/7.58.0"
Mar  8 11:13:10 22ed0a6a9060 <EIDA> 2022-03-08T11:13:10+0000 INFO eidaws.federator.middleware 56 misc.py:228 - [aed64c07-6a6e-46dd-a638-95068a52f91f]  2022-03-08T11:13:10.883844 "GET /fedws/station/text/1/query?sta=BFO&cha=HHE&format=text HTTP/1.1' '-' 'curl/7.58.0'
Mar  8 11:13:10 22ed0a6a9060 <EIDA> 2022-03-08T11:13:10+0000 CRITICAL eidaws.federator.middleware 56 middleware.py:66 - [aed64c07-6a6e-46dd-a638-95068a52f91f] Local Exception: <class 'RuntimeError'>
Mar  8 11:13:10 22ed0a6a9060 <EIDA> 2022-03-08T11:13:10+0000 CRITICAL eidaws.federator.middleware 56 middleware.py:67 - [aed64c07-6a6e-46dd-a638-95068a52f91f] Traceback information: ['Traceback (most recent call last):\n', '  File "/usr/local/src/eidaws/eidaws.federator/eidaws/federator/utils/middleware.py", line 46, in exception_handling_middleware\n    return await handler(request)\n', '  File "/usr/local/src/eidaws/eidaws.federator/eidaws/federator/utils/remote.py", line 16, in middleware\n    return await super().middleware(request, handler)\n', '  File "/var/www/eidaws-federator/venv/lib/python3.8/site-packages/aiohttp_remotes/x_forwarded.py", line 103, in middleware\n    return await handler(request)\n', '  File "/var/www/eidaws-federator/venv/lib/python3.8/site-packages/aiohttp/web_urldispatcher.py", line 954, in _iter\n    resp = await method()\n', '  File "/usr/local/src/eidaws/eidaws.federator/eidaws/federator/utils/view.py", line 52, in get\n    return await processor.federate(timeout=self.client_timeout)\n', '  File "/usr/local/src/eidaws/eidaws.federator/eidaws/federator/utils/process.py", line 104, in wrapper\n    return await coro(self, *args, **kwargs)\n', '  File "/usr/local/src/eidaws/eidaws.federator/eidaws/federator/utils/process.py", line 299, in federate\n    self._routed_urls, routes = await self._route()\n', '  File "/usr/local/src/eidaws/eidaws.federator/eidaws/federator/utils/process.py", line 254, in _route\n    async with req() as resp:\n', '  File "/var/www/eidaws-federator/venv/lib/python3.8/site-packages/aiohttp/client.py", line 1138, in __aenter__\n    self._resp = await self._coro\n', '  File "/var/www/eidaws-federator/venv/lib/python3.8/site-packages/aiohttp/client.py", line 466, in _request\n    with timer:\n', '  File "/var/www/eidaws-federator/venv/lib/python3.8/site-packages/aiohttp/helpers.py", line 701, in __enter__\n    raise RuntimeError(\n', 'RuntimeError: Timeout context manager should be used inside a task\n']
Mar  8 11:13:10 22ed0a6a9060 nginx: 172.22.0.1 - - 08/Mar/2022:11:13:10 +0000 "GET /fdsnws/station/1/query?sta=BFO&cha=HHE&format=text HTTP/1.1" 500 475 0.007 "-" "curl/7.58.0"

As a temporary workaround, pin aiohttp==3.7.4 (see #26).

Improve route cleanup

Cleaning up routes is not automated yet. That is, once a route has been harvested, it remains in the routing definitions and is never removed. Clients of eidaws-federator do not notice this; however, it

  • forces eidaws-federator to perform endpoint requests just to receive no data (i.e. HTTP 204|404 status codes) and therefore
  • degrades performance.

Circular spatial query filter parameters for `eidaws-stationlite`

This one falls into a similar category as #14.

Currently, for fdsnws-station requests, only rectangular spatial query filter parameters are resolved by eidaws-stationlite, while circular spatial query filter parameters are not.

As an optimization, and in order to avoid broadcast requests across all of EIDA including all stream epochs, resolve circular spatial query filter parameters at eidaws-stationlite as well. This includes:

  • latitude | lat
  • longitude | lon
  • minradius
  • maxradius

Note that circular spatial query filter parameters are optional w.r.t. fdsnws-station (see https://www.fdsn.org/webservices/fdsnws-station-1.1.pdf). However, they are provided by most of EIDA's fdsnws-station endpoint implementations.
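
Resolving the circular parameters boils down to an angular distance check per station. The sketch below uses the haversine formula on a spherical Earth, with radii in degrees and defaults of 0 and 180 as in fdsnws-station. The function names are hypothetical, and the spherical approximation is an assumption of this sketch.

```python
import math


def angular_distance_deg(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in degrees (haversine)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    )
    # Clamp to 1.0 to guard against rounding slightly above the domain.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def within_circle(sta_lat, sta_lon, latitude, longitude,
                  minradius=0.0, maxradius=180.0):
    """True if the station lies between minradius and maxradius (degrees)."""
    dist = angular_distance_deg(latitude, longitude, sta_lat, sta_lon)
    return minradius <= dist <= maxradius


print(within_circle(0.0, 1.0, latitude=0.0, longitude=0.0, maxradius=2.0))
print(within_circle(0.0, 10.0, latitude=0.0, longitude=0.0, maxradius=2.0))
```

Only stations passing this check would then be routed, which avoids the broadcast across all endpoints.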

Thanks for reporting, @jfclinton.
