eida / routing
Server side application to provide the Routing Service used in EIDA
License: GNU General Public License v3.0
Hi Javier, it seems the routing only works with Python3 now. Is that correct? When trying to run it says "Cannot import module request urllib.request". This changes a lot of the installation procedures since you need mod_wsgi for Python3.
M
When a route contains only network information, some attributes at the station level should also be cached. See also related issue #3
The extra problem with latitude and longitude is that this implies that the API will be extended to allow this!
Hi Javier, hope you are doing well. It would be nice to add the WFCatalog to the routing service. I can help with the implementation if you can point me to the routines where services are resolved.
Here are five addresses of running catalogues:
$nodesWFCatalog = (object) Array (
"ODC" => "http://www.orfeus-eu.org/eidaws/wfcatalog/alpha",
"NOA" => "http://eida.gein.noa.gr/eidaws/wfcatalog/alpha",
"INGV" => "catalog.data.ingv.it/wfcatalog/1",
"BGR" => "http://eida.bgr.de/wfcatalog/alpha",
"GFZ" => "http://geofon.gfz-potsdam.de/eidaws/wfcatalog/alpha"
);
Best,
Mathijs
Virtual networks should be included in the configuration. They need to be synchronized and translated to real network-station codes for responses.
Check that synchronization works as expected.
If the request has only a station code, there is no way to filter out routes containing only a network code.
For instance: route XX.* and query *.APE. This can never be filtered out because there is no match.
One possible solution could be to query a list of station codes from the station-WS service if it's available.
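The mismatch can be illustrated with a small sketch. It assumes that routes and queries are (network, station) pairs using `*` and `?` wildcards; `may_overlap` and `route_may_match` are hypothetical helpers, not functions of the Routing Service. Without a station list the route XX.* cannot be discarded for the query *.APE; with one (for example obtained from a Station-WS), it can.

```python
from fnmatch import fnmatchcase


def has_wildcard(code):
    """Return True if the code contains a wildcard character."""
    return "*" in code or "?" in code


def may_overlap(a, b):
    """Conservative check: could patterns a and b describe the same code?"""
    if has_wildcard(a) and has_wildcard(b):
        return True  # cannot be ruled out without more information
    if has_wildcard(a):
        return fnmatchcase(b, a)
    return fnmatchcase(a, b)


def route_may_match(route, query, known_stations=None):
    """Decide whether a route has to be kept for a query.

    known_stations is an optional list of station codes of the routed
    network (hypothetical input, e.g. fetched from a Station-WS).
    """
    (rnet, rsta), (qnet, qsta) = route, query
    if not (may_overlap(rnet, qnet) and may_overlap(rsta, qsta)):
        return False
    if known_stations is not None:
        return any(fnmatchcase(s, rsta) and fnmatchcase(s, qsta)
                   for s in known_stations)
    return True


# Route XX.* against query *.APE
print(route_may_match(("XX", "*"), ("*", "APE")))
print(route_may_match(("XX", "*"), ("*", "APE"), ["AAA", "BBB"]))
```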
Queries like
http://geofon.gfz-potsdam.de/eidaws/routing/1/query?net=GE
will give an error claiming that starttime could not be converted. If starttime is provided the same problem happens with endtime.
Since the routing.xml file explicitly lists every station, and Virtual Networks can consist of networks that belong to different nodes (e.g. _NFOCRL), there should be a way to handle the additions and removals of stations that occur daily (e.g. new stations have just been added, or existing ones removed).
I would suggest that all involved nodes add the corresponding entries to their routing.xml in a similar way, as if they were the ones hosting the virtual network. However, there would be a tag that indicates which node is the "master" and which are the "slaves" for that specific virtual network. Every day the "master" node would collect the changes to its routing.xml by querying the routing service of all other nodes [or only the "slave" ones, but this means that the master node needs to know the list of the other involved nodes].
For example, let's say we have:
Node no1:
Network: A
Stations: A1,A2
Node no2:
Network: B
Stations: B1,B2
and assuming that we have virtual network _X on node no2, which consists of A2, B1, B2.
So the master node is no2.
Thus,
node no1 will have:
<ns0:vnetwork networkCode="_X" state="slave">
<ns0:stream networkCode="A" stationCode="A2" locationCode="*" streamCode="*" start="2018-01-01T00:00:00" end=""/>
node no2 will have:
<ns0:vnetwork networkCode="_X" state="master">
<ns0:stream networkCode="B" stationCode="B1" locationCode="*" streamCode="*" start="2018-01-01T00:00:00" end=""/>
<ns0:stream networkCode="B" stationCode="B2" locationCode="*" streamCode="*" start="2018-01-01T00:00:00" end=""/>
Now,
Node no1 does nothing on a daily basis, since it has no "master" tag. Node no2, on the other hand, queries the routing service of each node to collect the stations involved in the specific virtual network, and creates a new virtual network entry without the state attribute (exactly as it is right now) that contains all stations, i.e.:
<ns0:vnetwork networkCode="_X">
<ns0:stream networkCode="A" stationCode="A2" locationCode="*" streamCode="*" start="2018-01-01T00:00:00" end=""/>
<ns0:stream networkCode="B" stationCode="B1" locationCode="*" streamCode="*" start="2018-01-01T00:00:00" end=""/>
<ns0:stream networkCode="B" stationCode="B2" locationCode="*" streamCode="*" start="2018-01-01T00:00:00" end=""/>
The above tag <ns0:vnetwork networkCode="_X"> (and what it contains) would be re-created daily, only on the master node of course.
Since this would happen only once a day, it would not harm performance. Performance could be improved further if each virtual network maintains a list of the involved nodes (for that particular virtual network). This means, however, that node operators need to update the list of nodes for the specific network whenever it changes, which is probably not that often.
In that case we would have:
node no1:
<ns0:vnetwork networkCode="_X">
<ns0:stream networkCode="A" stationCode="A2" locationCode="*" streamCode="*" start="2018-01-01T00:00:00" end=""/>
node no2:
<ns0:vnetwork networkCode="_X" list="http://nodeno1"> #comma separated
<ns0:stream networkCode="A" stationCode="A2" locationCode="*" streamCode="*" start="2018-01-01T00:00:00" end=""/>
and the routing service on the master node would still generate the full entry daily:
<ns0:vnetwork networkCode="_X">
<ns0:stream networkCode="A" stationCode="A2" locationCode="*" streamCode="*" start="2018-01-01T00:00:00" end=""/>
<ns0:stream networkCode="B" stationCode="B1" locationCode="*" streamCode="*" start="2018-01-01T00:00:00" end=""/>
<ns0:stream networkCode="B" stationCode="B2" locationCode="*" streamCode="*" start="2018-01-01T00:00:00" end=""/>
To sum up:
(Also, the primary node will observe that every day a <ns0:vnetwork networkCode="_X"> entry including all involved stations is automatically generated in its routing.xml.)
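The daily merge step on the master node could be sketched as follows. This is only an illustration under assumptions: each stream is modelled as a plain tuple (networkCode, stationCode, locationCode, streamCode, start, end), and `merge_vnetwork` is a hypothetical helper; how the streams are actually collected from the other nodes is left out.

```python
def merge_vnetwork(master_streams, streams_from_other_nodes):
    """Build the full stream list of a virtual network on the master node.

    master_streams: streams declared locally on the master node.
    streams_from_other_nodes: one list of streams per other node.
    Duplicates are dropped and the result is sorted for stable output.
    """
    merged = set(master_streams)
    for streams in streams_from_other_nodes:
        merged.update(streams)
    return sorted(merged)


# The _X example: node no1 contributes A2, node no2 (master) B1 and B2.
node1 = [("A", "A2", "*", "*", "2018-01-01T00:00:00", "")]
node2 = [("B", "B1", "*", "*", "2018-01-01T00:00:00", ""),
         ("B", "B2", "*", "*", "2018-01-01T00:00:00", "")]

for stream in merge_vnetwork(node2, [node1]):
    print(stream)
```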
Dear colleagues,
there might be an issue while parsing request parameters, since with the request:
$ cat eida-routing.500
service=station
format=post
OrderedDict([('network', 'GR'), ('station', '*'), ('location', '*'), ('channel', 'BHZ'), ('starttime', '2017-01-01T00:00:00'), ('endtime', '2017-01-02T00:00:00')])
$ wget -O - --post-file eida-routing.500 "http://geofon.gfz-potsdam.de/eidaws/routing/1/query"
--2018-02-09 12:30:29-- http://geofon.gfz-potsdam.de/eidaws/routing/1/query
Resolving geofon.gfz-potsdam.de (geofon.gfz-potsdam.de)... 139.17.3.177
Connecting to geofon.gfz-potsdam.de (geofon.gfz-potsdam.de)|139.17.3.177|:80... connected.
HTTP request sent, awaiting response... 500 Internal Server Error
2018-02-09 12:30:29 ERROR 500: Internal Server Error.
I receive HTTP Status Code 500. However, code 400 would be more appropriate, I guess.
Kind regards,
Daniel
If the end date is not present in the query, it is ignored in the output, but for the POST format this is wrong.
Add a test with the following URL:
http://localhost/eidaws/routing/1/query?net=GE&start=2010-01-01T00:00:00Z&format=post
Hi Javier, I installed the new WebDC 3 and made a request for FDSN-Station but it broke routing:
req.txt:
service=station
format=json
NL OPLO -- BHE 2007-05-01T00:00:00.0000Z 2017-05-23T23:59:59.0000Z
NL VKB -- BHE 2007-05-01T00:00:00.0000Z 2017-05-23T23:59:59.0000Z
NL WIT -- BHE 2007-05-01T00:00:00.0000Z 2017-05-23T23:59:59.0000Z
curl -X POST -d req.txt http://geofon.gfz-potsdam.de/eidaws/routing/1/query
and the service returns an internal service error.
Still running on mod_wsgi under Apache, but full Python3 compatibility is needed.
Virtual Networks are created only from local routing files. The section with virtual networks is skipped in the remote ones.
For instance:
http://rz-vm258.gfz-potsdam.de/eidaws/routing/nightly/query?net=*&format=post&service=wfcatalog
returns an HTTP error 204, but
http://rz-vm258.gfz-potsdam.de/eidaws/routing/nightly/query?net=I*&format=post&service=wfcatalog
returns
http://catalog.data.ingv.it/wfcatalog/1/query
IX * * * 2005-01-01T00:00:00 2017-09-08
IV * * * 1988-01-01T00:00:00 2017-09-08
http://www.orfeus-eu.org/eidaws/wfcatalog/1/query
IU * * * 1980-01-01T00:00:00 2017-09-08
IP * * * 1980-01-01T00:00:00 2017-09-08
II * * * 1980-01-01T00:00:00 2017-09-08
IB * * * 1980-01-01T00:00:00 2017-09-08
http://geofon.gfz-potsdam.de/eidaws/wfcatalog/alpha/query
IS * * * 1980-01-01T00:00:00 2017-09-08
IQ * * * 1980-01-01T00:00:00 2017-09-08
IO * * * 1980-01-01T00:00:00 2017-09-08
This bug must have been introduced in the last 2 days, so focus on the changes from the last commits.
http://geofon.gfz-potsdam.de/eidaws/routing/1/query?service=station&format=get&maxlon=48.34717&minlon=43.75462&minlat=-23.41386&maxlat=-20.79577
The error is that a str does not have an isoformat method. This usually happens when datetimes and strings are mixed because of the different ways of getting the routes.
item['end'] is of type datetime.datetime but is supposed to be a string. Concatenating it with another string using the + operator raises an error only when the format is POST. The query to test it is:
http://rz-vm258.gfz-potsdam.de/eidaws/routing/1/query?net=GE&format=post
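One way to make the output code robust against mixed types is to normalise every date field right before it is written. The sketch below is not the service's code: `as_iso` and `post_line` are hypothetical helpers, and apart from 'end' the dictionary keys are assumed for illustration.

```python
import datetime


def as_iso(value):
    """Return an ISO 8601 string for a datetime, a date, a str or None."""
    if value is None:
        return ""
    if isinstance(value, (datetime.datetime, datetime.date)):
        return value.isoformat()
    return str(value)


def post_line(item):
    """Format one stream line of a POST-style response."""
    fields = [item["net"], item["sta"], item["loc"], item["cha"],
              as_iso(item["start"]), as_iso(item["end"])]
    return " ".join(fields)


# Works whether the dates arrive as datetime objects or as strings.
item = {"net": "GE", "sta": "*", "loc": "*", "cha": "*",
        "start": "1993-01-01T00:00:00",
        "end": datetime.datetime(2020, 1, 1)}
print(post_line(item))
```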
This proposal aims to merge routes returned by the routing service. Consider the following request:
http://geofon.gfz-potsdam.de/eidaws/routing/1/query?net=Z3&format=get
returns an excessive amount of routes e.g.:
http://www.orfeus-eu.org/fdsnws/dataselect/1/query?sta=A339A&start=2015-01-01T00:00:00&net=Z3
http://www.orfeus-eu.org/fdsnws/dataselect/1/query?sta=A338A&start=2015-01-01T00:00:00&net=Z3
http://www.orfeus-eu.org/fdsnws/dataselect/1/query?sta=A337A&start=2015-01-01T00:00:00&net=Z3
http://www.orfeus-eu.org/fdsnws/dataselect/1/query?sta=A336A&start=2015-01-01T00:00:00&net=Z3
http://www.orfeus-eu.org/fdsnws/dataselect/1/query?sta=A335A&start=2015-01-01T00:00:00&net=Z3
... and many, many more.
Most of these routes can be combined into a single route using comma-delimited station lists (e.g. for the case above):
http://www.orfeus-eu.org/fdsnws/dataselect/1/query?sta=A335A,A336A,A337A,A338A,A339A&start=2015-01-01T00:00:00&net=Z3
This would improve federated services as the number of requests to each node is reduced.
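A minimal sketch of the proposed merge, assuming the routes are available as GET URLs like the ones above. `merge_routes` is a hypothetical helper: it groups URLs that differ only in the sta parameter and joins their station codes with commas. Routes with different time windows or networks stay separate.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def merge_routes(urls):
    """Merge URLs that are identical except for their 'sta' parameter."""
    groups = {}
    for url in urls:
        parts = urlsplit(url)
        params = parse_qsl(parts.query)
        stations = [v for k, v in params if k == "sta"]
        rest = tuple(sorted((k, v) for k, v in params if k != "sta"))
        key = (parts.scheme, parts.netloc, parts.path, rest)
        groups.setdefault(key, []).extend(stations)

    merged = []
    for (scheme, netloc, path, rest), stations in groups.items():
        params = list(rest)
        if stations:
            params.append(("sta", ",".join(sorted(set(stations)))))
        # Keep commas and colons readable instead of percent-encoding them.
        query = urlencode(params, safe=",:")
        merged.append(urlunsplit((scheme, netloc, path, query, "")))
    return merged


base = "http://www.orfeus-eu.org/fdsnws/dataselect/1/query"
routes = ["%s?sta=%s&start=2015-01-01T00:00:00&net=Z3" % (base, sta)
          for sta in ("A339A", "A338A", "A337A", "A336A", "A335A")]
print(merge_routes(routes))
```

The parameter order in the merged URL differs from the input, which should not matter for FDSN web services.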
The call to the method total_seconds of a datetime.timedelta makes it incompatible with versions older than Python 2.7.
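A possible workaround is a small compatibility helper that falls back to the arithmetic equivalent given in the Python documentation for timedelta.total_seconds. The helper names are hypothetical.

```python
import datetime


def total_seconds_fallback(delta):
    """Equivalent of timedelta.total_seconds() for Python < 2.7."""
    return (delta.microseconds
            + (delta.seconds + delta.days * 24 * 3600) * 10 ** 6) / 10.0 ** 6


def total_seconds(delta):
    """Use the built-in method when present, the fallback otherwise."""
    if hasattr(delta, "total_seconds"):
        return delta.total_seconds()
    return total_seconds_fallback(delta)


delta = datetime.timedelta(days=1, seconds=1, microseconds=500000)
print(total_seconds(delta))
```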
It was actually never properly used and everyone is handling the updates via cronjobs.
I would like to propose that a nodata query parameter, as defined by the FDSNWS specification, be added to the service. This would help in understanding the behavior, especially when used in a browser window, e.g. for debugging.
Since the beginning there has been the capability to have a masterTable.xml file in which some networks can be included, and these will not be part of the internal synchronisation with the other RS nodes.
The idea was to include there networks from data centres not belonging to EIDA (or our internal trusted circle of DCs). These have the highest priority when matching input parameters.
However, this is becoming cumbersome, and some new features, such as the station name cache, are starting to cause problems with this "exceptional case".
This should be turned into a case more similar to that of the nodes being synchronised. Those routes follow a workflow similar to the local ones and the overall behaviour shows no problem at all.
To achieve this we need to include a new import option similar to the URL pointing to a static file: just the file name of the masterTable, which then follows the usual workflow.
All code related to the masterTable should be removed after this change.
http://geofon.gfz-potsdam.de/fdsnws/dataselect/1/query?end=2006-01-01T00:00:00&sta=BG23&start=2001-01-01T00:00:00&net=ZV
http://geofon.gfz-potsdam.de/fdsnws/dataselect/1/query?end=2006-01-01T00:00:00&sta=BM12&start=2001-01-01T00:00:00&net=ZV
http://geofon.gfz-potsdam.de/fdsnws/dataselect/1/query?end=2006-01-01T00:00:00&sta=BM13&start=2001-01-01T00:00:00&net=ZV
http://geofon.gfz-potsdam.de/fdsnws/dataselect/1/query?end=2006-01-01T00:00:00&sta=BM14&start=2001-01-01T00:00:00&net=ZV
http://geofon.gfz-potsdam.de/fdsnws/dataselect/1/query?end=2006-01-01T00:00:00&sta=BM15&start=2001-01-01T00:00:00&net=ZV
http://geofon.gfz-potsdam.de/fdsnws/dataselect/1/query?end=2006-01-01T00:00:00&sta=NALB&start=2001-01-01T00:00:00&net=ZV
http://geofon.gfz-potsdam.de/fdsnws/dataselect/1/query?end=2004-11-17T00:00:00&sta=IMA&start=2002-01-01T00:00:00&net=ZO
http://geofon.gfz-potsdam.de/fdsnws/dataselect/1/query?end=2004-11-17T00:00:00&sta=IMO&start=2002-01-01T00:00:00&net=ZO
@nikosT proposed that we could offer another option, apart from having a list of servers, in the synchronize option in the configuration file.
This will make it easier to maintain for the node operators, because we would need to update only the file pointed to by the URL.
If time windows have unrealistic dates (e.g. year 2999), some problems could appear when trying to calculate overlaps. Now there is a fixed maximum date (2100), but it should be solved in a better way.
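One alternative to a fixed maximum date, sketched here only as an idea, is to represent an open-ended window with None and handle it explicitly when computing overlaps. `overlap` is a hypothetical helper and not the service's current logic.

```python
import datetime


def overlap(start1, end1, start2, end2):
    """Return the overlapping window (start, end) or None if there is none.

    An end of None means the window is still open. No sentinel date is
    needed, so far-future end dates are compared like any other date.
    """
    start = max(start1, start2)
    ends = [end for end in (end1, end2) if end is not None]
    end = min(ends) if ends else None
    if end is not None and start >= end:
        return None
    return (start, end)


year = lambda y: datetime.datetime(y, 1, 1)
print(overlap(year(2010), year(2999), year(2015), None))
print(overlap(year(2010), year(2012), year(2015), None))
```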
The parsing specification is too strict. Only the long format including milliseconds is accepted.
For instance, 2000-02-02 would not be accepted.
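A more lenient parser could try several formats in turn, from the most to the least detailed. This is a sketch only: `parse_date_lenient` and the list of accepted formats are assumptions, not the behaviour of the existing parser.

```python
import datetime

# Candidate formats, most detailed first (an assumed selection).
FORMATS = (
    "%Y-%m-%dT%H:%M:%S.%f",
    "%Y-%m-%dT%H:%M:%S",
    "%Y-%m-%dT%H:%M",
    "%Y-%m-%d",
)


def parse_date_lenient(text):
    """Parse a date string in any of FORMATS, with an optional trailing Z."""
    text = text.strip()
    if text.endswith("Z"):
        text = text[:-1]
    for fmt in FORMATS:
        try:
            return datetime.datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError("Unrecognized date format: %r" % text)


print(parse_date_lenient("2000-02-02"))
print(parse_date_lenient("2010-01-01T00:00:00Z"))
```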
I suggest that the path inside the routing.wsgi file should be a variable read from the routing.cfg file, so that the user would need to configure only one file (routing.cfg).
Do it like SC3 is doing it.
When a static file is specified in the configuration for the synchronization, the output message is not clear. It shows an error when the URL is tested to be a Routing Service and only after that checks if it is a static URL. Probably a warning would be better.
Check if there is a convertor from WADL to Swagger.
Hi Javier,
while requesting information from eidaws-routing using the HTTP POST format, it seems that the current implementation does not accept any geolocation information:
$ cat routing.post
format=post
minlatitude=-70.8
maxlatitude=-70.
AW * * * 2017-01-01 2018-01-01
$ wget -O - --post-file routing.post "http://geofon.gfz-potsdam.de/fdsnws/station/1/query"
--2018-05-14 15:39:56-- http://geofon.gfz-potsdam.de/fdsnws/station/1/query
Resolving geofon.gfz-potsdam.de (geofon.gfz-potsdam.de)... 139.17.3.177
Connecting to geofon.gfz-potsdam.de (geofon.gfz-potsdam.de)|139.17.3.177|:80... connected.
HTTP request sent, awaiting response... 400 Bad Request
2018-05-14 15:39:56 ERROR 400: Bad Request.
However using a GET request gives a valid result:
$ wget -O - "http://geofon.gfz-potsdam.de/eidaws/routing/1/query?format=post&net=AW&start=2017-01-01&end=2018-01-01&minlatitude=-70.8&maxlatitude=-70"
--2018-05-14 15:50:13-- http://geofon.gfz-potsdam.de/eidaws/routing/1/query?format=post&net=AW&start=2017-01-01&end=2018-01-01&minlatitude=-70.8&maxlatitude=-70
Resolving geofon.gfz-potsdam.de (geofon.gfz-potsdam.de)... 139.17.3.177
Connecting to geofon.gfz-potsdam.de (geofon.gfz-potsdam.de)|139.17.3.177|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 107 [text/plain]
Saving to: ‘STDOUT’
- 0%[ ] 0 --.-KB/s http://geofon.gfz-potsdam.de/fdsnws/dataselect/1/query
AW VNA1 * * 2017-01-01T00:00:00 2018-01-01T00:00:00
- 100%[===================>] 107 --.-KB/s in 0s
2018-05-14 15:50:13 (6.46 MB/s) - written to stdout [107/107]
The eidaws-routing documentation does not say anything about geolocation parameters using HTTP POST (i.e. the corresponding format).
cheers, Daniel
The following problems have been detected by the ObsPy people.
<resources base="http://localhost/fdsnws/routing">
-> This should be the actual URL of the service.
<param name="format" style="query" type="xsd:string" default="json">
-> The default value is "xml" and not "json". The list of options also does not list the XML possibility.
<param name="endtime" style="query" type="xsd:dateTime" default="tomorrow"/>
-> "tomorrow" is not a valid xsd:dateTime.
<param name="service" style="query" type="xsd:string" default="dataselect"/>
-> This should probably also list all available options.
New functionality must be documented. Also the specification should be updated, as more parameters are going to be accepted (e.g. minlat, maxlat, etc.)
This should be replaced by the sample file.
Suggestion for input at test/testService.py.
if no args:
read baseURL variable from routing.cfg (default)
else:
read args
eg.1
./testService.py
(reads baseURL from routing.cfg)
eg.2
./testService.py url
(reads url)
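The suggestion above could look roughly like this. It is a sketch under assumptions: the section name "Service" is a guess (only the baseURL variable is named in the suggestion), and `resolve_base_url` is a hypothetical helper.

```python
import configparser
import sys


def resolve_base_url(argv, cfg_path="routing.cfg",
                     section="Service", option="baseURL"):
    """Return the URL given on the command line, else the one in the cfg.

    The section name is an assumption; adjust it to the real routing.cfg.
    """
    if len(argv) > 1:
        return argv[1]
    config = configparser.ConfigParser()
    config.read(cfg_path)
    return config.get(section, option)


if __name__ == "__main__":
    try:
        print(resolve_base_url(sys.argv))
    except configparser.Error as error:
        print("Could not read baseURL from routing.cfg: %s" % error)
```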
As it is implemented today, there are internally two sets of routes in the RS.
The normal routing table, which can be synchronized, for instance, with the other EIDA nodes, and a second "master table", which contains some routes that have the highest priority and are not synchronized. This was thought of as a way to include IRIS and other DCs not belonging to EIDA in the RS.
Today, routes from the master and the routing table are not mixed. If someone requests routes for an NSLC, it is first looked up in the master table, and only if nothing is found there is it searched for in the routing table.
A user came with a use case in which the whole list of routes is requested (no input to use as a filter). Today this is not possible.
How should this work? Probably we should remove the master table and make the implementation even simpler. If master routes are needed, these should be included in the local configuration.
If a network code has been reused (e.g. ZE) the entry in the cfg file will look like:
<ns0:route networkCode="ZE" stationCode="" locationCode="" streamCode="*">
<ns0:dataselect address="http://geofon.gfz-potsdam.de/fdsnws/dataselect/1/query" priority="1" start="2010-03-17T00:00:00" end="2011-10-22T00:00:00" />
<ns0:dataselect address="http://geofon.gfz-potsdam.de/fdsnws/dataselect/1/query" priority="1" start="1996-09-14T00:00:00" end="1997-12-31T00:00:00" />
<ns0:dataselect address="http://geofon.gfz-potsdam.de/fdsnws/dataselect/1/query" priority="1" start="2012-04-19T00:00:00" end="2014-12-31T00:00:00" />
<ns0:dataselect address="http://geofon.gfz-potsdam.de/fdsnws/dataselect/1/query" priority="1" start="1999-05-01T00:00:00" end="1999-11-01T00:00:00" />
</ns0:route>
If only the network code is used to filter we expect to receive the 4 epochs, but only the first one is retrieved.
GET http://geofon.gfz-potsdam.de/eidaws/routing/1/query?net=ZE&service=station&format=get
gives the following response:
@nikosT found that the response for the version method returns the wrong version number.
A variable is not being defined if an Exception is raised.
testRoute checks the functionality of the RouteCache object without any network connection; it exercises only the local program with local objects read from the configuration file.
Try to match testService as closely as we can; it performs a full check of coherency and functionality.
If a network from the normal routes is requested as well as one from the master table, the POST output format gives an error. For instance,
In the example provided for the synchronize option there is only one data centre, and the syntax for more than one is not clear.
Suggestion for input at test/testRoute.py.
if no args:
read routing.xml (default)
else:
read args
eg.1
./testRoute.py
(reads routing.xml)
eg.2
./testRoute.py file
(reads file)
The order in which routes are read seems to be incorrect. First the XML is read, then the binary version overrides everything (if present), and then the synchronization takes place (but only sometimes).
With start and end seems to work OK:
start=2010-01-01T00:00:00Z&end=2012-01-01T00:00:00Z
but with only start it returns 204:
start=2010-01-01T00:00:00Z
Tested against the BGR deployment
Some methods (e.g. str2date) and classes (e.g. Stream and TW) should be used more throughout the code in order to make it more compact and coherent.
If routes with different priorities are read for the same stream, these are loaded with a wrong priority.
Filter of stations by name or location can be done at the very beginning. It would be much faster. Follow the normal procedure once it was already filtered. To do that I need to introduce an index for the fields in the cache.
When a datacenter is unresponsive and does not provide its routes, the old ones present in the cache should still be available to avoid a complete lack of data.
Cached files are in data/routing-XXXX.xml, where XXXX is the institution.
However, INGV has not responded in 2 days and now the old routes are gone.
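The desired behaviour can be sketched as an update that replaces the cached file only after a successful download. `refresh_cache` is a hypothetical helper, and the download itself is passed in as a callable so that the sketch stays independent of how routes are actually fetched.

```python
import os
import tempfile


def refresh_cache(fetch, cache_path):
    """Replace cache_path with fresh data only if fetch() succeeds.

    fetch is a callable returning the new file content as bytes and
    raising an exception on failure. On failure or empty content the
    existing cached file is left untouched. Returns True if updated.
    """
    try:
        data = fetch()
    except Exception:
        return False  # keep serving the old routes
    if not data:
        return False
    directory = os.path.dirname(cache_path) or "."
    # Write to a temporary file first, then swap it in atomically.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    os.replace(tmp_path, cache_path)
    return True
```

Writing to a temporary file and renaming it afterwards also avoids leaving a truncated cache behind if the process is interrupted mid-write.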
For instance, for 3D the system keeps only the stations from 3D-2010 and not from 3D-2014.