
routing's People

Contributors

andres-h, damb, javiquinte, jbienkowski, jollyfant, petrrr, pevans-gfz

routing's Issues

Change the masterTable option for a new synchronisation feature

From the beginning, the RS has been able to read a masterTable.xml file listing networks that are excluded from the internal synchronisation with the other RS nodes.
The idea was to include there networks from data centres not belonging to EIDA (or to our internal trusted circle of DCs). These have the highest priority when matching input parameters.
However, this is becoming cumbersome, and some new features, such as the station name cache, are starting to cause problems with this "exceptional case".

This should be handled in a way closer to how the synchronised nodes are handled. Those routes follow a workflow similar to the local ones, and the overall behaviour shows no problems.
To achieve this we need to add a new import option, similar to the URL pointing to a static file: just the file name of the masterTable, which then follows the usual workflow.

All code related to the masterTable should be removed after this change.

Include routes from masterTable if there is no input?

As implemented today, there are two internal sets of routes in the RS:
the normal routing table, which can be synchronized, for instance, with the other EIDA nodes, and a second "master table", which contains routes that have the highest priority and are not synchronized. This was conceived as a way to include IRIS and other DCs not belonging to EIDA in the RS.
Today, routes from the master table and the routing table are not mixed. If someone requests routes for an NSLC, it is first looked up in the master table, and only if nothing is found there is it looked up in the routing table.
A user came with a use case in which the whole list of routes is requested (no input to use as a filter). Today this is not possible.
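
The lookup order described above can be sketched roughly as follows. This is only an illustration, not the RS implementation: the function names and the dictionary-based tables (network code mapped to a list of URLs) are assumptions made for the example.

```python
def lookup(net, master_table, routing_table):
    """Return routes for one network code, checking the master table first."""
    # Hypothetical data model: each table maps a network code to a list of URLs.
    if net in master_table:
        return master_table[net]
    # Only consulted when the master table has no entry for the network.
    return routing_table.get(net, [])


def all_routes(master_table, routing_table):
    """Return every route when no filter is given, master entries taking priority."""
    merged = dict(routing_table)
    merged.update(master_table)  # master entries override routing entries
    return merged
```

With a merged view like all_routes, a request without any filter could be answered while keeping the priority of the master entries.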

How should this work? Probably we should remove the master table and make the implementation even simpler. If master routes are needed, they should be included in the local configuration.

Geolocation within HTTP POST

Hi Javier,
while requesting information from eidaws-routing using the HTTP POST format, it seems that the current implementation does not accept any geolocation information:

$ cat routing.post
format=post
minlatitude=-70.8
maxlatitude=-70.
AW * * * 2017-01-01 2018-01-01
$ wget -O - --post-file routing.post "http://geofon.gfz-potsdam.de/fdsnws/station/1/query"
--2018-05-14 15:39:56--  http://geofon.gfz-potsdam.de/fdsnws/station/1/query
Resolving geofon.gfz-potsdam.de (geofon.gfz-potsdam.de)... 139.17.3.177
Connecting to geofon.gfz-potsdam.de (geofon.gfz-potsdam.de)|139.17.3.177|:80... connected.
HTTP request sent, awaiting response... 400 Bad Request
2018-05-14 15:39:56 ERROR 400: Bad Request.

However using a GET request gives a valid result:

$ wget -O - "http://geofon.gfz-potsdam.de/eidaws/routing/1/query?format=post&net=AW&start=2017-01-01&end=2018-01-01&minlatitude=-70.8&maxlatitude=-70"
--2018-05-14 15:50:13--  http://geofon.gfz-potsdam.de/eidaws/routing/1/query?format=post&net=AW&start=2017-01-01&end=2018-01-01&minlatitude=-70.8&maxlatitude=-70
Resolving geofon.gfz-potsdam.de (geofon.gfz-potsdam.de)... 139.17.3.177
Connecting to geofon.gfz-potsdam.de (geofon.gfz-potsdam.de)|139.17.3.177|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 107 [text/plain]
Saving to: ‘STDOUT’

-                     0%[                    ]       0  --.-KB/s               http://geofon.gfz-potsdam.de/fdsnws/dataselect/1/query
AW VNA1 * * 2017-01-01T00:00:00 2018-01-01T00:00:00
-                   100%[===================>]     107  --.-KB/s    in 0s      

2018-05-14 15:50:13 (6.46 MB/s) - written to stdout [107/107]

The eidaws-routing documentation does not say anything about geolocation parameters using HTTP POST (i.e. the corresponding format).

cheers, Daniel

Alternate input at test/testService.py

Suggestion for input at test/testService.py.
if no args:
read baseURL variable from routing.cfg (default)
else:
read args

eg.1
./testService.py
(reads baseURL from routing.cfg)

eg.2
./testService.py url
(reads url)
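
A minimal sketch of the suggested behaviour, using only the standard library. The section name "Service" and the function name resolve_base_url are assumptions for illustration; the real layout of routing.cfg may differ.

```python
import configparser


def resolve_base_url(argv, cfg_path="routing.cfg", section="Service", option="baseURL"):
    """Return the URL given on the command line, else the one from routing.cfg."""
    # argv[0] is the script name; argv[1], if present, is the URL to test.
    if len(argv) > 1:
        return argv[1]
    config = configparser.ConfigParser()
    config.read(cfg_path)
    # Hypothetical section name; raises configparser.NoSectionError if it is absent.
    return config.get(section, option)
```

In testService.py this could be called as resolve_base_url(sys.argv), so that running the script without arguments keeps the current default.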

Problem with date filters (start, end)

With both start and end it seems to work OK:
start=2010-01-01T00:00:00Z&end=2012-01-01T00:00:00Z
but with only start it returns 204:
start=2010-01-01T00:00:00Z

Tested against the BGR deployment

If no endtime is defined wrong routes are defined

Improve the messages from updateAll.py

When a static file is specified in the configuration for the synchronization, the output message is not clear. It shows an error when the URL is tested as a Routing Service, and only afterwards checks whether it is a static URL. A warning would probably be better.

Add nodata query parameter

I would like to propose adding a nodata query parameter, as defined by the FDSNWS specification, to the service. This would help in understanding the behaviour, especially when the service is used in a browser window, e.g. for debugging.

Update testRoute to check the same cases as testService

testRoute checks the functionality of the RouteCache object without any network connection; it only exercises the program locally, with local objects read from the configuration file.
It should match testService as closely as possible, since testService performs a full check of coherency and functionality.

POST 500

Hi Javier, I installed the new WebDC 3 and made a request for FDSN-Station but it broke routing:

req.txt:

service=station
format=json
NL OPLO -- BHE 2007-05-01T00:00:00.0000Z 2017-05-23T23:59:59.0000Z
NL VKB -- BHE 2007-05-01T00:00:00.0000Z 2017-05-23T23:59:59.0000Z
NL WIT -- BHE 2007-05-01T00:00:00.0000Z 2017-05-23T23:59:59.0000Z

curl -X POST -d req.txt http://geofon.gfz-potsdam.de/eidaws/routing/1/query

and the service returns an internal service error.

Include support for virtual networks

Virtual networks should be included in the configuration. They need to be synchronized and translated to real network-station codes for responses.

Include index for fields in the cache

Filtering stations by name or location can be done at the very beginning, which would be much faster. Once the filtering is done, the normal procedure follows. To do that I need to introduce an index for the fields in the cache.

Inconsistencies in application.wadl

The following problems have been detected by the ObsPy people.

  • |<resources base="http://localhost/fdsnws/routing">| -> This should be the actual URL of the service.
  • |<param name="format" style="query" type="xsd:string" default="json">| -> The default value is |"xml"| and not |"json"|. The list of options also does not list the XML possibility.
  • |<param name="endtime" style="query" type="xsd:dateTime" default="tomorrow"/>| -> |"tomorrow"| is not a valid |xsd:dateTime|.
  • |<param name="service" style="query" type="xsd:string" default="dataselect"/>| -> This should probably also list all available options.

Python3

Hi Javier, it seems the routing only works with Python3 now. Is that correct? When trying to run it says:

Cannot import module request urllib.request. This changes a lot of the installation procedures since you need mod_wsgi for Python3.

M

Input as option for test/testRoute.py

Suggestion for input at test/testRoute.py.
if no args:
read routing.xml (default)
else:
read args

eg.1
./testRoute.py
(reads routing.xml)

eg.2
./testRoute.py file
(reads file)

HTTP Status 500 instead of 400

Dear colleagues,

there might be an issue while parsing request parameters, since with the request:

$ cat eida-routing.500 
service=station
format=post
OrderedDict([('network', 'GR'), ('station', '*'), ('location', '*'), ('channel', 'BHZ'), ('starttime', '2017-01-01T00:00:00'), ('endtime', '2017-01-02T00:00:00')])
$ wget -O - --post-file eida-routing.500 "http://geofon.gfz-potsdam.de/eidaws/routing/1/query"
--2018-02-09 12:30:29--  http://geofon.gfz-potsdam.de/eidaws/routing/1/query
Resolving geofon.gfz-potsdam.de (geofon.gfz-potsdam.de)... 139.17.3.177
Connecting to geofon.gfz-potsdam.de (geofon.gfz-potsdam.de)|139.17.3.177|:80... connected.
HTTP request sent, awaiting response... 500 Internal Server Error
2018-02-09 12:30:29 ERROR 500: Internal Server Error.

I receive HTTP Status Code 500. However, code 400 would be more appropriate, I guess.

Kind regards,
Daniel
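
One way to obtain a 400 for such input is to validate every line of the POST body before processing it. The sketch below is an assumption about how this could look, not the service's actual parser: the names BadRequest and parse_post_body are hypothetical, and stream lines are assumed to have six whitespace-separated fields (network, station, location, channel, start, end), as in the examples on this page.

```python
class BadRequest(Exception):
    """Hypothetical exception that the web layer would translate to HTTP 400."""
    status = 400


def parse_post_body(body):
    """Split a POST body into key=value parameters and stream lines."""
    params, streams = {}, []
    for lineno, line in enumerate(body.splitlines(), 1):
        fields = line.split()
        if not fields:
            continue  # ignore empty lines
        if len(fields) == 1 and "=" in line:
            key, value = fields[0].split("=", 1)
            params[key] = value
            continue
        if len(fields) != 6:
            # Malformed input is a client error, not a server error.
            raise BadRequest(f"line {lineno}: expected 6 fields, got {len(fields)}")
        streams.append(tuple(fields))
    return params, streams
```

The OrderedDict line from the request above does not split into six fields, so a parser of this kind would reject it as a client error instead of failing later with an unhandled exception.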

Adding WFCatalog to routing

Hi Javier, hope you are doing well. It would be nice to add the WFCatalog to the routing service. I can help with the implementation if you can point me to the routines where services are resolved.

Here are five addresses of running catalogues:

$nodesWFCatalog = (object) Array (
  "ODC" => "http://www.orfeus-eu.org/eidaws/wfcatalog/alpha",
  "NOA" => "http://eida.gein.noa.gr/eidaws/wfcatalog/alpha",
  "INGV" => "catalog.data.ingv.it/wfcatalog/1",
  "BGR" => "http://eida.bgr.de/wfcatalog/alpha",
  "GFZ" => "http://geofon.gfz-potsdam.de/eidaws/wfcatalog/alpha"
);

Best,
Mathijs

Route merging

This proposal aims to merge routes returned by the routing service. Consider the following request:

http://geofon.gfz-potsdam.de/eidaws/routing/1/query?net=Z3&format=get

This returns an excessive number of routes, e.g.:

http://www.orfeus-eu.org/fdsnws/dataselect/1/query?sta=A339A&start=2015-01-01T00:00:00&net=Z3
http://www.orfeus-eu.org/fdsnws/dataselect/1/query?sta=A338A&start=2015-01-01T00:00:00&net=Z3
http://www.orfeus-eu.org/fdsnws/dataselect/1/query?sta=A337A&start=2015-01-01T00:00:00&net=Z3
http://www.orfeus-eu.org/fdsnws/dataselect/1/query?sta=A336A&start=2015-01-01T00:00:00&net=Z3
http://www.orfeus-eu.org/fdsnws/dataselect/1/query?sta=A335A&start=2015-01-01T00:00:00&net=Z3
... and many, many more.

Most of these routes can be combined into a single route using comma-delimited station lists (e.g. for the case above):

http://www.orfeus-eu.org/fdsnws/dataselect/1/query?sta=A335A,A336A,A337A,A338A,A339A&start=2015-01-01T00:00:00&net=Z3

This would improve federated services as the number of requests to each node is reduced.
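
As a rough sketch of the proposed merging, routes that differ only in their station parameter can be grouped and their station codes joined with commas. The function name merge_routes is hypothetical, and the code works on the URLs of the "get" format only.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def merge_routes(urls, key="sta"):
    """Merge URLs that are identical except for the value of the `key` parameter."""
    groups = {}
    for url in urls:
        parts = urlsplit(url)
        query = parse_qsl(parts.query)
        stations = [value for name, value in query if name == key]
        rest = tuple((name, value) for name, value in query if name != key)
        # URLs sharing endpoint and all other parameters fall into the same group.
        group_key = (parts.scheme, parts.netloc, parts.path, rest)
        groups.setdefault(group_key, []).extend(stations)

    merged = []
    for (scheme, netloc, path, rest), stations in groups.items():
        query = list(rest)
        if stations:
            query.insert(0, (key, ",".join(sorted(set(stations)))))
        # Keep commas and colons readable instead of percent-encoding them.
        merged.append(urlunsplit((scheme, netloc, path, urlencode(query, safe=",:"), "")))
    return merged
```

Applied to the five Z3 routes listed above, this yields the single combined URL shown in the proposal.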

Latitude and longitude should be cached to support discovery

When a route contains only network information, some station-level attributes should also be cached. See also the related issue #3.
The additional problem with latitude and longitude is that supporting them implies extending the API.

Permanent networks with end date far in the future

If time windows have unrealistic dates (e.g. year 2999), problems can appear when trying to calculate overlaps. There is now a fixed maximum date (2100), but this should be solved in a better way.
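
One possible alternative to a fixed maximum date is to represent an open end as None and treat it as unbounded in the overlap test. This is only a sketch of that idea; the function name overlaps is an assumption, not existing code.

```python
from datetime import datetime


def overlaps(start_a, end_a, start_b, end_b):
    """Return True when two time windows overlap; None means unbounded on that side."""
    # Window A must start before window B ends (always true if either is unbounded).
    a_starts_before_b_ends = end_b is None or start_a is None or start_a < end_b
    # Window B must start before window A ends.
    b_starts_before_a_ends = end_a is None or start_b is None or start_b < end_a
    return a_starts_before_b_ends and b_starts_before_a_ends
```

With this convention a permanent network needs no artificial end date, so no comparison ever involves a year such as 2999.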

Wildcard not working unless a letter is present (new bug)

For instance:

http://rz-vm258.gfz-potsdam.de/eidaws/routing/nightly/query?net=*&format=post&service=wfcatalog

returns an HTTP error 204, but

http://rz-vm258.gfz-potsdam.de/eidaws/routing/nightly/query?net=I*&format=post&service=wfcatalog

returns

http://catalog.data.ingv.it/wfcatalog/1/query
IX * * * 2005-01-01T00:00:00 2017-09-08
IV * * * 1988-01-01T00:00:00 2017-09-08

http://www.orfeus-eu.org/eidaws/wfcatalog/1/query
IU * * * 1980-01-01T00:00:00 2017-09-08
IP * * * 1980-01-01T00:00:00 2017-09-08
II * * * 1980-01-01T00:00:00 2017-09-08
IB * * * 1980-01-01T00:00:00 2017-09-08

http://geofon.gfz-potsdam.de/eidaws/wfcatalog/alpha/query
IS * * * 1980-01-01T00:00:00 2017-09-08
IQ * * * 1980-01-01T00:00:00 2017-09-08
IO * * * 1980-01-01T00:00:00 2017-09-08

This bug must have been introduced in the last 2 days, so focus on the changes from the last commits.

Automatically generate Virtual Network entry

Since the routing.xml file explicitly lists every station, and since Virtual Networks can consist of networks that belong to different nodes (e.g. _NFOCRL), there should be a way to handle the addition and removal of stations, which can happen daily (e.g. new stations have just been added or removed).

I would suggest that all involved nodes add the corresponding entries to their routing.xml, in the same way as if they were the ones hosting the virtual network. However, there would be a tag indicating the "master" node and the "slave" ones for the specific virtual network. Every day the "master" node will collect the changes for its routing.xml by querying the routing service of all other nodes [or only of the "slave" ones, but this means that the master node needs to know the list of the other involved nodes].

For example, let's say we have:

Node no1:
Network: A
Stations: A1,A2

Node no2:
Network: B
Stations: B1,B2

and assume that virtual network _X is defined on node no2 and consists of A2, B1 and B2.
So the master node is no2.

Thus,
node no1 will have:

<ns0:vnetwork networkCode="_X" state="slave">
<ns0:stream networkCode="A" stationCode="A2" locationCode="*" streamCode="*" start="2018-01-01T00:00:00" end=""/>

node no2 will have:

<ns0:vnetwork networkCode="_X" state="master">
<ns0:stream networkCode="B" stationCode="B1" locationCode="*" streamCode="*" start="2018-01-01T00:00:00" end=""/>
<ns0:stream networkCode="B" stationCode="B2" locationCode="*" streamCode="*" start="2018-01-01T00:00:00" end=""/>

Now,
Every day node no1 will do nothing, since it has no "master" tag. Node no2, on the other hand, will query the routing service of each node to collect the stations involved in the specific virtual network, and will create a new virtual network entry without the state (exactly as it is right now) that contains all stations, i.e.:

<ns0:vnetwork networkCode="_X">
<ns0:stream networkCode="A" stationCode="A2" locationCode="*" streamCode="*" start="2018-01-01T00:00:00" end=""/>
<ns0:stream networkCode="B" stationCode="B1" locationCode="*" streamCode="*" start="2018-01-01T00:00:00" end=""/>
<ns0:stream networkCode="B" stationCode="B2" locationCode="*" streamCode="*" start="2018-01-01T00:00:00" end=""/>

The above tag <ns0:vnetwork networkCode="_X"> (and what it contains) would be re-created daily, on the master node only, of course.

Since this happens only once a day, it would not harm performance. Performance could be improved further if each virtual network maintains a list of the involved nodes (for that particular virtual network). This means, however, that the node operators need to update the list of nodes for the specific network whenever it changes, which is probably not that often.

In that case we would have:
node no1:

<ns0:vnetwork networkCode="_X">
<ns0:stream networkCode="A" stationCode="A2" locationCode="*" streamCode="*" start="2018-01-01T00:00:00" end=""/>

node no2:

<ns0:vnetwork networkCode="_X" list="http://nodeno1"> #comma separated
<ns0:stream networkCode="A" stationCode="A2" locationCode="*" streamCode="*" start="2018-01-01T00:00:00" end=""/>

and the routing service on the master node would still generate the full entry daily:

<ns0:vnetwork networkCode="_X">
<ns0:stream networkCode="A" stationCode="A2" locationCode="*" streamCode="*" start="2018-01-01T00:00:00" end=""/>
<ns0:stream networkCode="B" stationCode="B1" locationCode="*" streamCode="*" start="2018-01-01T00:00:00" end=""/>
<ns0:stream networkCode="B" stationCode="B2" locationCode="*" streamCode="*" start="2018-01-01T00:00:00" end=""/>

To sum up:

  1. All involved nodes have a virtual network entry in state "slave" that includes ONLY their own stations (those belonging to this particular virtual network).
  2. The master node does exactly the same as in step 1, except that it sets the state to "master".

(The master node will also see that a <ns0:vnetwork networkCode="_X"> entry including all involved stations is generated automatically in its routing.xml every day.)

No way to filter routes properly if the request contains only the station

If the request has only a station code, there is no way to filter out routes containing only a network code.
For instance: route XX.* and query *.APE. This can never be filtered out because there is no match.
One possible solution could be to query a list of station codes from the station-WS service if it's available.

Insufficient routes selected when reused network code and many epochs

If a network code has been reused (e.g. ZE) the entry in the cfg file will look like:

<ns0:route networkCode="ZE" stationCode="" locationCode="" streamCode="*">
<ns0:dataselect address="http://geofon.gfz-potsdam.de/fdsnws/dataselect/1/query" priority="1" start="2010-03-17T00:00:00" end="2011-10-22T00:00:00" />
<ns0:dataselect address="http://geofon.gfz-potsdam.de/fdsnws/dataselect/1/query" priority="1" start="1996-09-14T00:00:00" end="1997-12-31T00:00:00" />
<ns0:dataselect address="http://geofon.gfz-potsdam.de/fdsnws/dataselect/1/query" priority="1" start="2012-04-19T00:00:00" end="2014-12-31T00:00:00" />
<ns0:dataselect address="http://geofon.gfz-potsdam.de/fdsnws/dataselect/1/query" priority="1" start="1999-05-01T00:00:00" end="1999-11-01T00:00:00" />
</ns0:route>

If only the network code is used to filter we expect to receive the 4 epochs, but only the first one is retrieved.

GET http://geofon.gfz-potsdam.de/eidaws/routing/1/query?net=ZE&service=station&format=get

gives the following response:

http://geofon.gfz-potsdam.de/fdsnws/station/1/query?end=2011-10-22T00:00:00&start=2010-03-17T00:00:00&net=ZE
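
The expected behaviour can be sketched as returning every epoch that overlaps the requested window, rather than stopping at the first match. The function name select_epochs and the (start, end) tuple representation are assumptions for illustration.

```python
from datetime import datetime


def select_epochs(epochs, start=None, end=None):
    """Return all (start, end) epochs overlapping the requested window.

    A missing request start or end (None) leaves that side of the window open.
    """
    selected = []
    for ep_start, ep_end in epochs:
        if end is not None and ep_start >= end:
            continue  # epoch begins after the requested window
        if start is not None and ep_end is not None and ep_end <= start:
            continue  # epoch finished before the requested window
        selected.append((ep_start, ep_end))
    # Sort by epoch start only, since an epoch end may be None.
    return sorted(selected, key=lambda epoch: epoch[0])
```

For the ZE entry above, a request without time constraints would then produce four routes, one per epoch.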

Error when start or end time are not provided

Queries like

http://geofon.gfz-potsdam.de/eidaws/routing/1/query?net=GE

will give an error claiming that starttime could not be converted. If starttime is provided the same problem happens with endtime.

Old routes are removed even if the datacenter does not provide new ones

When a datacenter is unresponsive and does not provide its routes, the old ones present in the cache should remain available, to avoid a complete lack of data.
Cached files are stored in data/routing-XXXX.xml, where XXXX is the institution.
However, INGV has not responded for 2 days and now the old routes are gone.
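
A minimal sketch of the desired behaviour: the cached file is replaced only after new routes have been downloaded successfully. The function name refresh_cache and the fetch callable (assumed to return the new file content as bytes) are hypothetical and not part of updateAll.py.

```python
import os
import tempfile


def refresh_cache(cache_path, fetch):
    """Replace cache_path with freshly fetched routes; keep the old file on failure."""
    try:
        data = fetch()  # hypothetical callable returning the new content as bytes
    except Exception:
        # Sketch only: any download problem leaves the cached routes untouched.
        return False
    if not data:
        return False  # an empty answer is treated like a failed download

    directory = os.path.dirname(os.path.abspath(cache_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        # Swap the file in only after it has been written completely.
        os.replace(tmp_path, cache_path)
    finally:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
    return True
```

Writing to a temporary file in the same directory and swapping it in at the end also avoids leaving a half-written cache file if the update is interrupted.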

all user configuration inside routing.cfg

I suggest that the path inside the routing.wsgi file should be a variable read from the routing.cfg file, so that the user only needs to configure a single file (routing.cfg).
