A set of arche-core handlers implementing the ACDH business logic.
Checks metadata consistency, manages automatically managed metadata properties, etc.
License: MIT License
PID minting should be done after checks so it's impossible to end up with a PID minted when checks fail.
Currently we only check the minimum cardinality. We should also check maximum and exact cardinalities, taking into account that they mean maximum/exact per language.
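A minimal sketch of what a per-language cardinality check could look like. The function name, the `(text, lang)` input shape, and the choice to count the minimum across all languages are assumptions made for illustration only; they are not taken from the doorkeeper or arche-lib-schema code.

```python
from collections import Counter


def check_cardinality(values, min_count=None, max_count=None):
    """Check one property of one resource (hypothetical helper).

    values: list of (text, lang) tuples.
    The minimum is counted across all values (assumption), while the
    maximum is enforced separately for every language tag.
    Returns a list of error messages; empty when the rules are met.
    """
    errors = []
    if min_count is not None and len(values) < min_count:
        errors.append(f"at least {min_count} value(s) required, {len(values)} found")
    if max_count is not None:
        per_lang = Counter(lang for _, lang in values)
        for lang, count in sorted(per_lang.items()):
            if count > max_count:
                errors.append(
                    f"at most {max_count} value(s) per language allowed, "
                    f"{count} found for '{lang}'"
                )
    return errors
```

Under these assumptions an exact cardinality can be expressed by passing the same number as `min_count` and `max_count`.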
Depends on acdh-oeaw/arche-lib-schema#2
Such resources must have:
According to our ontology, e.g. hasCoverageStartDate has range http://www.w3.org/2001/XMLSchema#date, but currently the doorkeeper accepts dateTime strings as well. Either reject them or transform them to proper ISO dates.
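The "transform" option could be sketched as below. The function name and the accepted input patterns are assumptions for illustration; the sketch drops a time part, keeps an optional timezone (which xsd:date permits), and rejects everything else. Negative or more-than-four-digit years are not handled.

```python
import re
from datetime import date

# YYYY-MM-DD, optionally followed by a time part and/or a timezone.
_DATE_LIKE = re.compile(
    r"^(\d{4}-\d{2}-\d{2})"
    r"(?:T\d{2}:\d{2}:\d{2}(?:\.\d+)?)?"
    r"(Z|[+-]\d{2}:\d{2})?$"
)


def to_xsd_date(value):
    """Return value reduced to an xsd:date lexical form (hypothetical helper).

    Raises ValueError when the value is neither a date nor a dateTime.
    """
    match = _DATE_LIKE.match(value.strip())
    if match is None:
        raise ValueError(f"not a date or dateTime: {value!r}")
    day, timezone = match.group(1), match.group(2) or ""
    date.fromisoformat(day)  # rejects impossible dates such as 2021-02-30
    return day + timezone
```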
At the moment the check determines whether the content is a full biblatex record or only a set of properties by literally checking if the first character is @. This check fails for content beginning with whitespace (which is common when it comes from pretty-formatted XML). It should be addressed.
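One possible fix is to skip leading whitespace before looking at the first character. A minimal sketch, with a hypothetical function name:

```python
def is_full_biblatex_record(content):
    """Return True when the content starts with '@' after leading whitespace.

    Anything else is treated as a bare set of properties.
    """
    return content.lstrip().startswith("@")
```

`str.lstrip()` without arguments removes spaces, tabs, and newlines, which covers the indentation typically introduced by pretty-printed XML.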
Running the script ids_and_binaries.php failed to create handle PIDs.
Example resource:
<rdf:RDF xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" xmlns:acdh="https://vocabs.acdh.oeaw.ac.at/schema#" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xml:base="https://id.acdh.oeaw.ac.at/">
<acdh:Resource rdf:about="https://id.acdh.oeaw.ac.at/rita/editions/inventar__1756-03-15.xml">
<acdh:isPartOf rdf:resource="https://id.acdh.oeaw.ac.at/rita/editions"/>
<acdh:hasTitle xml:lang="de">1756 III 15 – 1756 III 29, Saalen (in der Benefiziaten-Behausung) [Stainer Sebastian Joseph (Benefiziat)]</acdh:hasTitle>
<acdh:hasNonLinkedIdentifier xml:lang="de">Archivsignatur: Südtiroler Landesarchiv (SLA) Inventare des Mittleren Pustertals Pos. Nr. 760 </acdh:hasNonLinkedIdentifier>
<acdh:hasCoverageStartDate rdf:datatype="http://www.w3.org/2001/XMLSchema#date">1756-03-15</acdh:hasCoverageStartDate>
<acdh:hasCoverageEndDate rdf:datatype="http://www.w3.org/2001/XMLSchema#date">1756-03-15</acdh:hasCoverageEndDate>
<acdh:hasRelatedDiscipline rdf:resource="https://vocabs.acdh.oeaw.ac.at/oefosdisciplines/601"/>
<acdh:hasMetadataCreator rdf:resource="https://d-nb.info/gnd/1043833846"/>
<acdh:hasCurator rdf:resource="https://d-nb.info/gnd/1043833846"/>
<acdh:hasContact rdf:resource="https://d-nb.info/gnd/1133094783"/>
<acdh:hasDepositor rdf:resource="https://d-nb.info/gnd/1133094783"/>
<acdh:hasPid>create</acdh:hasPid>
<acdh:hasLicensor rdf:resource="https://d-nb.info/gnd/16332395-1"/>
<acdh:hasLicense rdf:resource="https://vocabs.acdh.oeaw.ac.at/archelicenses/cc-by-4-0"/>
<acdh:hasOaiSet rdf:resource="https://vocabs.acdh.oeaw.ac.at/archeoaisets/kulturpool"/>
<acdh:hasOaiSet rdf:resource="https://vocabs.acdh.oeaw.ac.at/archeoaisets/clarin-vlo"/>
<acdh:hasLanguage rdf:resource="https://vocabs.acdh.oeaw.ac.at/iso6393/deu"/>
<acdh:hasOwner rdf:resource="https://d-nb.info/gnd/16332395-1"/>
<acdh:hasRightsHolder rdf:resource="https://d-nb.info/gnd/36165-3"/>
<acdh:hasCategory rdf:resource="https://vocabs.acdh.oeaw.ac.at/archecategory/text/tei"/>
<acdh:hasCreator rdf:resource="https://d-nb.info/gnd/1133094783"/>
<acdh:hasAvailableDate rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2021-04-01+02:00</acdh:hasAvailableDate>
</acdh:Resource>
</rdf:RDF>
The same data source as mentioned in the script can be found here: https://rita.acdh.oeaw.ac.at/archeutils/ids.xql
For all classes, new acdh:hasTitle values should only overwrite existing values in the same language. Existing values in other languages should be kept (this is not the case now because of the default RDF merging rules we use).
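The intended merge behaviour can be sketched as follows. The function name and the `lang -> list of titles` data shape are assumptions for illustration and do not mirror the actual RDF merging code.

```python
def merge_titles(existing, incoming):
    """Merge title values per language (hypothetical helper).

    existing, incoming: dicts mapping a language tag to a list of titles.
    Languages present in incoming replace the existing values for that
    language; all other languages are kept untouched.
    """
    merged = {lang: list(titles) for lang, titles in existing.items()}
    for lang, titles in incoming.items():
        merged[lang] = list(titles)
    return merged
```

For example, merging `{"en": ["New"]}` into `{"de": ["Alt"], "en": ["Old"]}` keeps the German title and replaces only the English one.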
This change will allow us to express domains (for properties, e.g. for default values as well for cardinality rules) in terms of a resource having a binary payload.
With the new approach to named entity handling we will, by design, leave "empty" resources in the repository. They should be accepted as long as they were covered by the checkRanges check. Unfortunately this is not easy to tell, because their class is derived from the property relating to them and the namespaces check is regex-based, so a precise at-the-end-of-transaction check would need to be pretty complex. Still, we can't simply drop the check, because we currently allow creation of "empty" resources for properties whose range isn't defined in the checkRanges (or which are out of the ontology).
It looks like metadata such as:
<sbj> <objectProperty> <non-standardized-URI> .
passes the doorkeeper check thanks to the URI standardization but still creates an ARCHE resource with the original non-standardized URI as an identifier.
Experienced by @csae8092 while ingesting arche_ttl.txt
If a resource belongs to the $cfg.doorkeeper.epicPid.clarinSet OAI-PMH set and PID generation is turned on, a separate PID should be generated for the CMDI record and stored in the $cfg.schema.cmdiPid. The PID should point to the additional ID created as {$cfg.schema.namespaces.id}/cmdi/{resource internal integer ID}.
Make use of arche-lib-schema PropertyDesc->defaultValue
The main challenge here is performance. We have a few big vocabularies (e.g. oefos and ISO languages) and we don't want reading them to kill the doorkeeper performance (which is not perfect even now).
acdh:isNewVersionOf can only form a tree structure (there can be no more than one triple pointing to a given resource with acdh:isNewVersionOf).
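A minimal sketch of the rule stated in the parentheses, assuming the relevant triples are available as `(newer, older)` pairs. The function name and input shape are hypothetical, and cycle detection is deliberately left out.

```python
from collections import defaultdict


def find_version_conflicts(pairs):
    """Find resources targeted by more than one acdh:isNewVersionOf triple.

    pairs: iterable of (newer, older) tuples, each meaning
    `newer acdh:isNewVersionOf older`.
    Returns a dict mapping every conflicting older resource to the sorted
    list of resources that claim to be its new version.
    """
    incoming = defaultdict(set)
    for newer, older in pairs:
        incoming[older].add(newer)
    return {
        older: sorted(newer_set)
        for older, newer_set in incoming.items()
        if len(newer_set) > 1
    }
```

An empty result means every resource has at most one direct successor.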