hapi-server / servers
Catalogs of known HAPI servers
This repository will become very large because of the history of changes to the metadata. I think it would be better to store this information in a new repository named server-metadata.
Alternatively, we could keep the metadata out of a repository entirely and just push it to a directory visible from, say, hapi-server.org/server-metadata. The root directory would contain the latest metadata, with YYYY-MM-DD subdirectories for old metadata.
Hello,
I would like to use hapi-client in our project SciQLop/spwc to add more servers and perhaps deprecate our direct REST API usage.
For us, SPWC is an abstraction for accessing remote data with an efficient multi-layer caching system; it should also provide all the information needed to build a tree for each server, as shown here.
I'm not sure whether this is a client or server limitation, or whether I missed something, but I see no way to generate such a tree. First, the fact that we need to make multiple requests per server to list datasets, parameters, and parameter descriptions makes it quite inefficient. Second, information such as the spacecraft/observatory name appears to be missing.
It would be nice to have the same level of information as we get here for AMDA:
http://amda.irap.omp.eu/AMDA//data/WSRESULT/obsdatatree_impex_20201118_AmdaLocalDataBaseParameters.xml
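To make the request concrete, here is a minimal sketch of building a per-server tree from HAPI responses. The `/catalog` and `/info` endpoints come from the HAPI specification; the grouping key is hypothetical, since current HAPI metadata has no standard spacecraft/observatory field — which is exactly the gap this issue raises.

```python
# Sketch: build a dataset tree from HAPI /catalog and /info responses.
# The "group by first path component of the id" rule is an assumption;
# a real fix would need an observatory field in the info metadata.

import json
from urllib.request import urlopen

def fetch_json(url):
    """Fetch and decode one JSON document (one HTTP round trip per call)."""
    with urlopen(url) as resp:
        return json.load(resp)

def build_tree(catalog, infos):
    """Group datasets into {group: {dataset_id: [parameter names]}}.

    catalog -- the "catalog" list from /catalog (dicts with an "id" key)
    infos   -- dict mapping dataset id to its /info response
    """
    tree = {}
    for entry in catalog:
        dsid = entry["id"]
        info = infos.get(dsid, {})
        # No standard spacecraft key exists, so fall back to the first
        # path-like component of the dataset id (purely illustrative).
        group = dsid.split("/")[0] if "/" in dsid else "ungrouped"
        params = [p["name"] for p in info.get("parameters", [])]
        tree.setdefault(group, {})[dsid] = params
    return tree

# Usage with canned responses (no network needed):
catalog = [{"id": "ACE/MAG"}, {"id": "ACE/SWE"}, {"id": "dst"}]
infos = {
    "ACE/MAG": {"parameters": [{"name": "Time"}, {"name": "B_GSE"}]},
    "ACE/SWE": {"parameters": [{"name": "Time"}, {"name": "Np"}]},
    "dst": {"parameters": [{"name": "Time"}, {"name": "dst"}]},
}
tree = build_tree(catalog, infos)
```

Note that even this sketch needs one `/info` request per dataset, which illustrates the inefficiency complained about above.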
Last summer a student at APL wrote a script to test HAPI servers for aliveness. This
was meant to replace my ping test with one that digs a little deeper into each server while still completing within a couple of minutes, making it suitable to run hourly.
I've had a heck of a time dealing with servers that do not provide a sampleStartDate and sampleStopDate. There are really no rules about startDate and stopDate, which are required, and requesting the last day of the interval often returns no data. The script then enlarges the interval and tests again, repeating this several times before giving up and declaring the server broken. Because of this I've been unable to run the test reliably. I first resorted to fixed hashes (repeating the randomly picked dataset), but even that has proven unreliable.
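The fallback described above can be sketched as follows. The field names (`sampleStartDate`, `sampleStopDate`, `stopDate`) match the HAPI info schema; the widening factor and retry count are assumptions, not what the APL script actually uses.

```python
# Sketch: choose test intervals for an aliveness check. If the info
# response has sample dates, use them; otherwise start from the last
# day before stopDate and widen the window on each retry. The factor
# of 10 per retry and max_tries=4 are illustrative guesses.

from datetime import datetime, timedelta

def parse(t):
    # HAPI times are ISO 8601; truncate to day resolution for this sketch.
    return datetime.strptime(t[:10], "%Y-%m-%d")

def candidate_intervals(info, max_tries=4):
    """Yield (start, stop) datetime pairs to try, narrowest first."""
    if "sampleStartDate" in info and "sampleStopDate" in info:
        yield parse(info["sampleStartDate"]), parse(info["sampleStopDate"])
        return
    stop = parse(info["stopDate"])
    days = 1
    for _ in range(max_tries):
        yield stop - timedelta(days=days), stop
        days *= 10   # widen by an order of magnitude each retry

info = {"startDate": "1998-01-01T00:00:00Z", "stopDate": "2023-01-01T00:00:00Z"}
tries = list(candidate_intervals(info))
```

A caller would request data for each interval in turn and declare the server broken only after all of them come back empty.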
Bob and I agreed that the indexing should produce two files. The first lists datasets, units, and labels; the second holds only the reported coverage. This allows a bit more precision when glancing for changes: a dataset simply growing doesn't obscure a change caused by a new dataset being published. For example, here are the two files for LISIRD:
http_lasp.colorado.edu_lisird_hapi.json
http_lasp.colorado.edu_lisird_hapi_coverage.json
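The split might look roughly like this. The parameter field names mirror HAPI info responses, but the exact layout of the two files is an assumption; the real layout is whatever the linked LISIRD files contain.

```python
# Sketch: split collected /info responses into the two indexes described
# above -- one for names/units/labels, one for coverage -- so a growing
# stopDate only changes the coverage file.

def split_index(infos):
    """infos: {dataset_id: info dict} -> (catalog_index, coverage_index)"""
    catalog_index, coverage_index = {}, {}
    for dsid, info in infos.items():
        catalog_index[dsid] = [
            {"name": p.get("name"),
             "units": p.get("units"),
             "label": p.get("label")}
            for p in info.get("parameters", [])
        ]
        coverage_index[dsid] = {"startDate": info.get("startDate"),
                                "stopDate": info.get("stopDate")}
    return catalog_index, coverage_index

# Usage with an illustrative dataset:
infos = {"noaa_dst": {
    "startDate": "1957-01-01T00:00:00Z",
    "stopDate": "2023-01-01T00:00:00Z",
    "parameters": [{"name": "Time", "units": "UTC"},
                   {"name": "dst", "units": "nT", "label": "DST"}],
}}
catalog_index, coverage_index = split_index(infos)
```

Each dict would then be written out as its own JSON file, matching the two links above.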
This is implemented and will run for the first time automatically tomorrow (Sun Jan 15).
The cron job that scans the HAPI servers for their info responses and compiles the data into searchable JSON files (https://github.com/hapi-server/servers/blob/master/index/makeGiantCatalog.jy) causes problems for the CDAWeb server. I caused problems for them yesterday by running it twice while I was making changes, without any pauses between calls. Bernie suggested that any trivial pause is not going to fix the problem.
I've disabled this scan for now.
The script that builds the index for all servers should also compute a nominal cadence for datasets that don't have a cadence in their metadata.
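One plausible way to derive a nominal cadence, sketched below, is to take the median spacing of a short sample of timestamps and format it as an ISO 8601 duration (the format HAPI uses for `cadence`). The sampling strategy is an assumption, not what the indexing script does.

```python
# Sketch: estimate a nominal cadence from a small sample of timestamps
# (e.g. the first few records of a /data request). Median spacing is
# used so a single gap in the sample doesn't skew the estimate.

from datetime import datetime
from statistics import median

def nominal_cadence(times):
    """times: list of ISO 8601 strings -> ISO 8601 duration string."""
    ts = [datetime.strptime(t, "%Y-%m-%dT%H:%M:%SZ") for t in times]
    gaps = [(b - a).total_seconds() for a, b in zip(ts, ts[1:])]
    seconds = median(gaps)
    return "PT{:g}S".format(seconds)

cadence = nominal_cadence([
    "2020-01-01T00:00:00Z",
    "2020-01-01T00:01:00Z",
    "2020-01-01T00:02:00Z",
    "2020-01-01T00:03:00Z",
])
```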
We should consider introducing all.json and dev.json to replace all.txt and dev.txt. It seems strange that we didn't use JSON, since it's self-documenting and supports schemas.
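A conversion could be as simple as the sketch below. The `{"servers": [{"url": ...}]}` shape is purely an assumption for illustration; the actual schema of all.json would be decided in this issue.

```python
# Sketch: turn the one-URL-per-line all.txt into a hypothetical
# all.json, skipping blank lines and comments. The output shape is an
# assumption, not an agreed schema.

import json

def txt_to_json(text):
    urls = [ln.strip() for ln in text.splitlines()
            if ln.strip() and not ln.lstrip().startswith("#")]
    return json.dumps({"servers": [{"url": u} for u in urls]}, indent=2)

doc = txt_to_json(
    "https://cdaweb.gsfc.nasa.gov/hapi\n"
    "http://lasp.colorado.edu/lisird/hapi\n"
)
```

A JSON schema could then validate entries and allow extra per-server fields (a human-readable name, a contact address) that a flat text file can't carry.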
Bob and I were thinking it would be nice to have some way of aggregating all the labels for datasets on HAPI servers, which clients could use to locate data. I could write an Autoplot script to go through the known servers and create a JSON file containing the dataset identifiers and descriptions. This file would be posted here whenever changes are detected, so that we could see the history and evolution of the servers, and so there is a known place where clients can search for parameters. Bob's use case was DST, which might be "dst" on one server and "DST" on another, and it's not even clear who is hosting it.
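The client-side search over such an aggregate might look like the sketch below. It matches case-insensitively so "dst" and "DST" both hit, per Bob's use case; the server URLs and labels are illustrative.

```python
# Sketch: search an aggregated label index ({server: {dataset: [labels]}})
# case-insensitively, returning (server, dataset) pairs. The aggregate
# itself would be the JSON file the Autoplot script produces.

def search_labels(aggregate, query):
    q = query.lower()
    hits = []
    for server, datasets in aggregate.items():
        for dsid, labels in datasets.items():
            if any(q in lbl.lower() for lbl in labels):
                hits.append((server, dsid))
    return hits

# Illustrative aggregate: the same quantity labeled differently per server.
aggregate = {
    "http://serverA/hapi": {"indices/dst": ["DST index"]},
    "http://serverB/hapi": {"geomag": ["dst", "kp"]},
}
hits = search_labels(aggregate, "DST")
```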
The process that indexes the servers appears to be failing, and has been for maybe nine months. It looks like it fails on one of the servers and the entire process halts. I put in what I think is a fix, and also added printing of which server is being indexed, since the logging doesn't seem to indicate this.
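The shape of that fix is presumably a per-server try/except, sketched below so one broken server can no longer halt the whole run. `index_one` stands in for the real per-server work in makeGiantCatalog.jy.

```python
# Sketch: index each server independently, printing which server is
# being worked on (so a hang is attributable) and collecting failures
# instead of letting one exception abort the whole scan.

def index_all(servers, index_one):
    results, failures = {}, {}
    for url in servers:
        print("indexing", url)      # make the current server visible in logs
        try:
            results[url] = index_one(url)
        except Exception as exc:    # broad on purpose: keep the scan going
            failures[url] = str(exc)
    return results, failures

# Usage with a stand-in indexer (no network needed):
def fake_index(url):
    if "broken" in url:
        raise RuntimeError("timeout")
    return {"datasets": 3}

ok, bad = index_all(["http://good/hapi", "http://broken/hapi"], fake_index)
```

A failure report written alongside the index would also make a nine-month silent outage visible much sooner.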