metadatacenter / cedar-cadsr-tools
Ingestor to transform caDSR XML-based CDEs into CEDAR metadata
License: Other
No error is reported, but the CADSR-VS value sets are not being updated nightly.
For CDEs, we currently select the last entry among the values in DataElement/VALUEDOMAIN/PermissableValues/PermissableValues_ITEM/MEANINGCONCEPTS
as the concept code for each answer to a CDE-based question. We should select the first entry instead: the first element is the most specific, and the concepts become more general further down the list.
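The proposed rule can be sketched as follows. This is a minimal illustration, assuming the MEANINGCONCEPTS entries for one permissible value have already been read into a list in document order; `ConceptSelector` and `mostSpecificConcept` are hypothetical names, not classes from the ingestor.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical helper for illustration; not part of the ingestor code base.
final class ConceptSelector {
  private ConceptSelector() {}

  // Returns the first entry, which is assumed to be the most specific concept,
  // or an empty Optional when there are no entries.
  static Optional<String> mostSpecificConcept(List<String> meaningConcepts) {
    if (meaningConcepts == null || meaningConcepts.isEmpty()) {
      return Optional.empty();
    }
    return Optional.ofNullable(meaningConcepts.get(0));
  }
}
```

Returning an `Optional` keeps the "no concept available" case explicit for permissible values that carry no MEANINGCONCEPTS entries.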
Some CDEs have a complex controlled term that is made of several atomic concepts joined together.
For example, the label "Frozen Section Cell Pellet Formation" has the concept code C49338,C12508,C45813,
where each code corresponds to a particular concept in NCIT:
C49338: Frozen Section Diagnosis
C12508: Cell
C45813: Pellet Formation
The caDSR XML does not indicate the source of controlled terms. Most are from NCIT, but not all.
Denise talked about potentially extending their XML to provide this information.
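For the composite concept codes described above, one possible first step is to split the comma-separated value into its atomic codes. The sketch below assumes the composite code arrives as a single comma-separated string, as in the example; `CompositeConceptCode` and `atomicCodes` are hypothetical names. It does not try to resolve which terminology each code comes from, since the XML does not provide that information.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; not part of the ingestor code base.
final class CompositeConceptCode {
  private CompositeConceptCode() {}

  // Splits a comma-separated composite code into trimmed, non-empty atomic codes,
  // preserving their original order.
  static List<String> atomicCodes(String composite) {
    if (composite == null || composite.trim().isEmpty()) {
      return Collections.emptyList();
    }
    return Arrays.stream(composite.split(","))
        .map(String::trim)
        .filter(code -> !code.isEmpty())
        .collect(Collectors.toList());
  }
}
```

With the example above, `CompositeConceptCode.atomicCodes("C49338,C12508,C45813")` yields the three codes C49338, C12508, and C45813 in that order.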
Currently, the JAXB2 Maven plugin works only with Java 8. An update of the plugin to support Java 10 is in progress:
mojohaus/jaxb2-maven-plugin#43 (comment)
When it is completed, we can upgrade the reader to use Java 11.
Requires metadatacenter/cedar-project#772
Support schema:identifier, skos:prefLabel, skos:altLabel fields in generated CDE fields.
Investigate error with CDE import.
02:04:47.919 INFO o.m.c.ingestor.util.CedarServices - 11400/70886 CDEs retrieved.
02:04:48.878 INFO o.m.c.ingestor.util.CedarServices - 11500/70886 CDEs retrieved.
02:04:49.746 INFO o.m.c.ingestor.util.CedarServices - 11600/70886 CDEs retrieved.
02:04:50.663 INFO o.m.c.ingestor.util.CedarServices - 11700/70886 CDEs retrieved.
02:04:51.775 ERROR o.m.c.i.t.c.CadsrCategoriesAndCdesUpdaterTool - Error: Server returned HTTP response code: 500 for URL: https://resource.metadatacenter.org/folders/https%3A%2F%2Frepo.metadatacenter.org%2Ffolders%2F1ee5ef41-0605-4c18-9054-b01eb4290339/contents-extract?resource_types=field&field_names=schema:identifier,pav:version,sourceHash,categories&offset=11700&limit=100
java.io.IOException: Server returned HTTP response code: 500 for URL: https://resource.metadatacenter.org/folders/https%3A%2F%2Frepo.metadatacenter.org%2Ffolders%2F1ee5ef41-0605-4c18-9054-b01eb4290339/contents-extract?resource_types=field&field_names=schema:identifier,pav:version,sourceHash,categories&offset=11700&limit=100
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
at java.base/sun.net.www.protocol.http.HttpURLConnection$10.run(HttpURLConnection.java:1974)
at java.base/sun.net.www.protocol.http.HttpURLConnection$10.run(HttpURLConnection.java:1969)
at java.base/java.security.AccessController.doPrivileged(Native Method)
at java.base/sun.net.www.protocol.http.HttpURLConnection.getChainedException(HttpURLConnection.java:1968)
at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1536)
at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1520)
at java.base/sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:250)
at org.metadatacenter.cadsr.ingestor.util.CedarServices.findCdeSummariesInFolder(CedarServices.java:123)
at org.metadatacenter.cadsr.ingestor.util.CdeUtil.getExistingCedarCdeSummaries(CdeUtil.java:133)
at org.metadatacenter.cadsr.ingestor.tools.cde.CadsrCategoriesAndCdesUpdaterTool.main(CadsrCategoriesAndCdesUpdaterTool.java:172)
at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:254)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.io.IOException: Server returned HTTP response code: 500 for URL: https://resource.metadatacenter.org/folders/https%3A%2F%2Frepo.metadatacenter.org%2Ffolders%2F1ee5ef41-0605-4c18-9054-b01eb4290339/contents-extract?resource_types=field&field_names=schema:identifier,pav:version,sourceHash,categories&offset=11700&limit=100
at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1924)
at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1520)
at java.base/java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:527)
at java.base/sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:334)
at org.metadatacenter.cadsr.ingestor.util.CedarServices.findCdeSummariesInFolder(CedarServices.java:106)
... 4 more
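The log above does not show whether the HTTP 500 at offset 11700 is transient or reproducible. If it turns out to be transient, one possible mitigation is to retry the failing page a few times before aborting the whole run. The sketch below is an assumption-laden illustration: `PageRetry`, `PageFetcher`, and `fetchWithRetry` are hypothetical names, and the callback stands in for the real HTTP call made in `CedarServices.findCdeSummariesInFolder`.

```java
import java.io.IOException;

// Hypothetical retry helper for illustration; not part of the ingestor code base.
final class PageRetry {
  private PageRetry() {}

  // Hypothetical callback that fetches one page of results.
  interface PageFetcher<T> {
    T fetch(int offset, int limit) throws IOException;
  }

  // Tries the same page up to maxAttempts times, doubling the delay after each failure.
  // Rethrows the last IOException if every attempt fails.
  static <T> T fetchWithRetry(PageFetcher<T> fetcher, int offset, int limit,
                              int maxAttempts, long initialDelayMillis)
      throws IOException, InterruptedException {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    long delay = initialDelayMillis;
    IOException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return fetcher.fetch(offset, limit);
      } catch (IOException e) {
        last = e;
        if (attempt < maxAttempts) {
          Thread.sleep(delay);
          delay *= 2;
        }
      }
    }
    throw last;
  }
}
```

If the error is reproducible for that specific page, retrying will not help, and the server-side cause of the 500 still needs to be investigated.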
The current workaround is to ignore the CEDAR certificate when running the Java upload code. This is insecure and needs a better solution. The error message received when executing the code is:
javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException:
PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException:
unable to find valid certification path to requested target
The implemented solution is based on this post: https://stackoverflow.com/questions/13626965/how-to-ignore-pkix-path-building-failed-sun-security-provider-certpath-suncertp
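One possible alternative to disabling certificate checks is to trust the CEDAR server certificate explicitly. The sketch below uses only standard JDK classes and builds an `SSLContext` from a single certificate file. The class name `CedarTrust`, the method name, and the idea that the certificate has been exported to a local X.509 file are assumptions for illustration; this is not how the ingestor currently works.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import java.security.cert.Certificate;
import java.security.cert.CertificateFactory;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManagerFactory;

// Hypothetical helper for illustration; not part of the ingestor code base.
final class CedarTrust {
  private CedarTrust() {}

  // Builds an SSLContext that trusts only the certificate stored in the given file
  // (PEM or DER encoded X.509). The default JDK trust store is not consulted.
  static SSLContext contextTrusting(Path certificateFile)
      throws IOException, GeneralSecurityException {
    Certificate certificate;
    try (InputStream in = Files.newInputStream(certificateFile)) {
      certificate = CertificateFactory.getInstance("X.509").generateCertificate(in);
    }

    // Start from an empty in-memory key store and add the certificate as trusted.
    KeyStore trustStore = KeyStore.getInstance(KeyStore.getDefaultType());
    trustStore.load(null, null);
    trustStore.setCertificateEntry("cedar", certificate);

    TrustManagerFactory factory =
        TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
    factory.init(trustStore);

    SSLContext context = SSLContext.getInstance("TLS");
    context.init(null, factory.getTrustManagers(), null);
    return context;
  }
}
```

The resulting context could be applied to a single connection with `HttpsURLConnection.setSSLSocketFactory(context.getSocketFactory())`, which keeps hostname and chain validation active for everything else. Importing the certificate into the JVM trust store with `keytool` is another common option.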
We currently support the 10 most common caDSR CDE datatypes (CHARACTER, NUMBER, java.lang.String, DATE, java.lang.Long, java.lang.Integer, java.lang.Double, java.util.Date, ALPHANUMERIC, ISO21090CDv1.0).
We do not support the remaining ~190. Some are fairly obscure (e.g., ISO21090TELURLv1.0, ISO21090.TEL.URL.v1.0, ISO21090BLv1.0) and are used in only a few CDEs. However, we cannot handle ~10% of caDSR CDEs because we do not handle these types.
Supporting some of these datatypes may be nontrivial if we also want to validate that supplied values conform to the specified datatype.
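To make the distinction between "supported" and "validated" concrete, here is a minimal sketch. The set contains exactly the ten datatype names listed above; the class `CadsrDatatypes` and its methods are hypothetical, and the value check covers only the three `java.lang` numeric types, using the JDK parsers (which are more lenient than a strict validator might be, for example `Double.parseDouble` accepts "NaN").

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper for illustration; not part of the ingestor code base.
final class CadsrDatatypes {
  private CadsrDatatypes() {}

  // The ten caDSR datatype names currently supported.
  static final Set<String> SUPPORTED = Collections.unmodifiableSet(new HashSet<>(Arrays.asList(
      "CHARACTER", "NUMBER", "java.lang.String", "DATE", "java.lang.Long",
      "java.lang.Integer", "java.lang.Double", "java.util.Date", "ALPHANUMERIC",
      "ISO21090CDv1.0")));

  static boolean isSupported(String datatype) {
    return datatype != null && SUPPORTED.contains(datatype.trim());
  }

  // Checks a value against its datatype. Only the java.lang numeric types are
  // actually parsed; other supported types are accepted without a value check.
  static boolean conforms(String datatype, String value) {
    if (datatype == null || value == null) {
      return false;
    }
    try {
      switch (datatype.trim()) {
        case "java.lang.Long":
          Long.parseLong(value);
          return true;
        case "java.lang.Integer":
          Integer.parseInt(value);
          return true;
        case "java.lang.Double":
          Double.parseDouble(value);
          return true;
        default:
          return isSupported(datatype);
      }
    } catch (NumberFormatException e) {
      return false;
    }
  }
}
```

Each additional datatype would need its own branch with a format-specific check, which is where most of the effort for the remaining types would go.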