google / schemarama
Schemarama is a project exploring standards-based validation for structured data, especially Schema.org.
License: Apache License 2.0
It is difficult to add "check this URL" functionality to a client-side app, but a bookmarklet that injects JS into the currently viewed page can work similarly. We had some tests of this early on; where are we with this?
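As a rough illustration of the bookmarklet idea, here is a minimal sketch that builds a javascript: URL which injects a script element into whatever page is being viewed. The script location (https://example.org/schemarama-inject.js) and the function name are assumptions made up for this example; no such bundle is published by the project.

```javascript
// Build a `javascript:` URL that, when saved as a bookmark and clicked,
// injects a script element into the currently viewed page.
// `scriptUrl` is a hypothetical location for a validator bundle.
function buildBookmarklet(scriptUrl) {
  const code =
    "(function(){" +
    "var s=document.createElement('script');" +
    "s.src=" + JSON.stringify(scriptUrl) + ";" +
    "document.body.appendChild(s);" +
    "})();";
  // Percent-encode the code so the whole thing is a well-formed URL.
  return "javascript:" + encodeURIComponent(code);
}

const bookmarklet = buildBookmarklet("https://example.org/schemarama-inject.js");
console.log(bookmarklet);
```

The injected script would then be responsible for collecting the page's markup and running validation; that part is not shown here.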
Here is the current draft of an export in ShEx format. SHACL is also below. For contrast see also the last version of something close to the schemarama codebase (albeit not guaranteed to run right now):
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix schema: <http://schema.org/> .
@prefix sx: <http://www.w3.org/ns/shex#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
[] a sx:Schema ;
sx:shapes [ a sx:ShapeAnd ;
sx:shapeExprs ( [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:max 1 ;
sx:min 1 ;
sx:predicate schema:url ;
sx:valueExpr [ a sx:NodeConstraint ;
sx:semActs ( [ a sx:SemAct ;
sx:code "console.log('some url checking code here')" ;
sx:name <https://google.com/search/validation/valid-url> ] ) ] ] ] [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:max 1 ;
sx:min 1 ;
sx:predicate schema:claimReviewed ] ] [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:max -1 ;
sx:min 0 ;
sx:predicate schema:itemReviewed ;
sx:valueExpr [ a sx:ShapeAnd ;
sx:shapeExprs ( [ a sx:ShapeOr ;
sx:shapeExprs ( [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:predicate rdf:type ;
sx:valueExpr [ a sx:NodeConstraint ;
sx:values ( schema:CreativeWork ) ] ] ] [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:predicate rdf:type ;
sx:valueExpr [ a sx:NodeConstraint ;
sx:values ( schema:Claim ) ] ] ] ) ] [ a sx:ShapeOr ;
sx:shapeExprs ( [ a sx:ShapeNot ;
sx:shapeExpr [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:predicate rdf:type ;
sx:valueExpr [ a sx:NodeConstraint ;
sx:values ( schema:CreativeWork ) ] ] ] ] [ a sx:ShapeAnd ;
sx:shapeExprs ( [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:max -1 ;
sx:min 0 ;
sx:predicate schema:author ;
sx:valueExpr [ a sx:ShapeAnd ;
sx:shapeExprs ( [ a sx:ShapeOr ;
sx:shapeExprs ( [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:predicate rdf:type ;
sx:valueExpr [ a sx:NodeConstraint ;
sx:values ( schema:Organization ) ] ] ] [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:predicate rdf:type ;
sx:valueExpr [ a sx:NodeConstraint ;
sx:values ( schema:Person ) ] ] ] ) ] [ a sx:ShapeOr ;
sx:shapeExprs ( [ a sx:ShapeNot ;
sx:shapeExpr [ a sx:ShapeOr ;
sx:shapeExprs ( [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:predicate rdf:type ;
sx:valueExpr [ a sx:NodeConstraint ;
sx:values ( schema:Organization ) ] ] ] [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:predicate rdf:type ;
sx:valueExpr [ a sx:NodeConstraint ;
sx:values ( schema:Person ) ] ] ] ) ] ] [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:max 1 ;
sx:min 0 ;
sx:predicate schema:name ] ] ) ] ) ] ] ] [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:max -1 ;
sx:min 0 ;
sx:predicate schema:datePublished ;
sx:valueExpr [ a sx:NodeConstraint ;
sx:semActs ( [ a sx:SemAct ;
sx:code "console.log('some datetime checking code here')" ;
sx:name <https://google.com/search/validation/valid-date-time> ] ) ] ] ] ) ] ) ] [ a sx:ShapeOr ;
sx:shapeExprs ( [ a sx:ShapeNot ;
sx:shapeExpr [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:predicate rdf:type ;
sx:valueExpr [ a sx:NodeConstraint ;
sx:values ( schema:Claim ) ] ] ] ] [ a sx:ShapeAnd ;
sx:shapeExprs ( [ a sx:ShapeAnd ;
sx:shapeExprs ( [ a sx:ShapeOr ;
sx:shapeExprs ( [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:max -1 ;
sx:min 1 ;
sx:predicate schema:appearance ] ] [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:max -1 ;
sx:min 1 ;
sx:predicate schema:firstAppearance ] ] ) ] [ a sx:ShapeOr ;
sx:shapeExprs ( [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:max -1 ;
sx:min 1 ;
sx:predicate schema:appearance ] ] [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:max -1 ;
sx:min 1 ;
sx:predicate schema:firstAppearance ] ] ) ] ) ] [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:max -1 ;
sx:min 0 ;
sx:predicate schema:appearance ;
sx:valueExpr [ a sx:ShapeOr ;
sx:shapeExprs ( [ a sx:NodeConstraint ;
sx:pattern "https?://.*" ] [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:predicate rdf:type ;
sx:valueExpr [ a sx:NodeConstraint ;
sx:values ( schema:CreativeWork ) ] ] ] ) ] ] ] [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:max -1 ;
sx:min 0 ;
sx:predicate schema:firstAppearance ;
sx:valueExpr [ a sx:ShapeOr ;
sx:shapeExprs ( [ a sx:NodeConstraint ;
sx:pattern "https?://.*" ] [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:predicate rdf:type ;
sx:valueExpr [ a sx:NodeConstraint ;
sx:values ( schema:CreativeWork ) ] ] ] ) ] ] ] [ a sx:ShapeAnd ;
sx:shapeExprs ( [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:max -1 ;
sx:min 0 ;
sx:predicate schema:author ;
sx:valueExpr [ a sx:ShapeAnd ;
sx:shapeExprs ( [ a sx:ShapeOr ;
sx:shapeExprs ( [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:predicate rdf:type ;
sx:valueExpr [ a sx:NodeConstraint ;
sx:values ( schema:Organization ) ] ] ] [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:predicate rdf:type ;
sx:valueExpr [ a sx:NodeConstraint ;
sx:values ( schema:Person ) ] ] ] ) ] [ a sx:ShapeOr ;
sx:shapeExprs ( [ a sx:ShapeNot ;
sx:shapeExpr [ a sx:ShapeOr ;
sx:shapeExprs ( [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:predicate rdf:type ;
sx:valueExpr [ a sx:NodeConstraint ;
sx:values ( schema:Organization ) ] ] ] [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:predicate rdf:type ;
sx:valueExpr [ a sx:NodeConstraint ;
sx:values ( schema:Person ) ] ] ] ) ] ] [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:max 1 ;
sx:min 0 ;
sx:predicate schema:name ] ] ) ] ) ] ] ] [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:max -1 ;
sx:min 0 ;
sx:predicate schema:datePublished ;
sx:valueExpr [ a sx:NodeConstraint ;
sx:semActs ( [ a sx:SemAct ;
sx:code "console.log('some datetime checking code here')" ;
sx:name <https://google.com/search/validation/valid-date-time> ] ) ] ] ] ) ] ) ] ) ] ) ] ] ] [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:max -1 ;
sx:min 0 ;
sx:predicate schema:author ;
sx:valueExpr [ a sx:ShapeAnd ;
sx:shapeExprs ( [ a sx:ShapeOr ;
sx:shapeExprs ( [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:predicate rdf:type ;
sx:valueExpr [ a sx:NodeConstraint ;
sx:values ( schema:Organization ) ] ] ] [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:predicate rdf:type ;
sx:valueExpr [ a sx:NodeConstraint ;
sx:values ( schema:Person ) ] ] ] ) ] [ a sx:ShapeOr ;
sx:shapeExprs ( [ a sx:ShapeNot ;
sx:shapeExpr [ a sx:ShapeOr ;
sx:shapeExprs ( [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:predicate rdf:type ;
sx:valueExpr [ a sx:NodeConstraint ;
sx:values ( schema:Organization ) ] ] ] [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:predicate rdf:type ;
sx:valueExpr [ a sx:NodeConstraint ;
sx:values ( schema:Person ) ] ] ] ) ] ] [ a sx:ShapeAnd ;
sx:shapeExprs ( [ a sx:ShapeAnd ;
sx:shapeExprs ( [ a sx:ShapeOr ;
sx:shapeExprs ( [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:max -1 ;
sx:min 1 ;
sx:predicate schema:name ] ] [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:max -1 ;
sx:min 1 ;
sx:predicate schema:url ] ] ) ] [ a sx:ShapeOr ;
sx:shapeExprs ( [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:max -1 ;
sx:min 1 ;
sx:predicate schema:name ] ] [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:max -1 ;
sx:min 1 ;
sx:predicate schema:url ] ] ) ] ) ] [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:max -1 ;
sx:min 0 ;
sx:predicate schema:url ;
sx:valueExpr [ a sx:NodeConstraint ;
sx:semActs ( [ a sx:SemAct ;
sx:code "console.log('some url checking code here')" ;
sx:name <https://google.com/search/validation/valid-url> ] ) ] ] ] ) ] ) ] ) ] ] ] [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:max -1 ;
sx:min 1 ;
sx:predicate schema:reviewRating ;
sx:valueExpr [ a sx:ShapeAnd ;
sx:shapeExprs ( [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:predicate rdf:type ;
sx:valueExpr [ a sx:NodeConstraint ;
sx:values ( schema:Rating ) ] ] ] [ a sx:ShapeOr ;
sx:shapeExprs ( [ a sx:ShapeNot ;
sx:shapeExpr [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:predicate rdf:type ;
sx:valueExpr [ a sx:NodeConstraint ;
sx:values ( schema:Rating ) ] ] ] ] [ a sx:ShapeAnd ;
sx:shapeExprs ( [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:max 1 ;
sx:min 0 ;
sx:predicate schema:alternateName ] ] [ a sx:ShapeOr ;
sx:shapeExprs ( [ a sx:ShapeNot ;
sx:shapeExpr [ a sx:ShapeNot ;
sx:shapeExpr [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:max -1 ;
sx:min 1 ;
sx:predicate schema:name ] ] ] ] [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:max -1 ;
sx:min 1 ;
sx:predicate schema:alternateName ] ] ) ] [ a sx:ShapeNot ;
sx:shapeExpr [ a sx:ShapeAnd ;
sx:shapeExprs ( [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:max -1 ;
sx:min 1 ;
sx:predicate schema:alternateName ] ] [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:max -1 ;
sx:min 1 ;
sx:predicate schema:name ] ] ) ] ] [ a sx:ShapeOr ;
sx:shapeExprs ( [ a sx:ShapeNot ;
sx:shapeExpr [ a sx:ShapeAnd ;
sx:shapeExprs ( [ a sx:ShapeOr ;
sx:shapeExprs ( [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:max -1 ;
sx:min 1 ;
sx:predicate schema:ratingValue ] ] [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:max -1 ;
sx:min 1 ;
sx:predicate schema:bestRating ] ] [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:max -1 ;
sx:min 1 ;
sx:predicate schema:worstRating ] ] ) ] [ a sx:ShapeNot ;
sx:shapeExpr [ a sx:ShapeAnd ;
sx:shapeExprs ( [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:max -1 ;
sx:min 0 ;
sx:predicate schema:ratingValue ;
sx:valueExpr [ a sx:NodeConstraint ;
sx:pattern "-1" ] ] ] [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:max -1 ;
sx:min 0 ;
sx:predicate schema:bestRating ;
sx:valueExpr [ a sx:NodeConstraint ;
sx:pattern "-1" ] ] ] [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:max -1 ;
sx:min 0 ;
sx:predicate schema:worstRating ;
sx:valueExpr [ a sx:NodeConstraint ;
sx:pattern "-1" ] ] ] ) ] ] ) ] ] [ a sx:ShapeAnd ;
sx:shapeExprs ( [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:max -1 ;
sx:min 0 ;
sx:predicate schema:ratingValue ;
sx:valueExpr [ a sx:ShapeOr ;
sx:shapeExprs ( [ a sx:NodeConstraint ;
sx:pattern "(\\d+[\\.,]?\\d*)/(\\d+[\\.,]?\\d*)" ] [ a sx:NodeConstraint ;
sx:pattern "(\\d+[\\.,]?\\d*)%" ] [ a sx:NodeConstraint ;
sx:pattern "(\\d+[\\.,]?\\d*)" ] ) ] ] ] [ a sx:ShapeOr ;
sx:shapeExprs ( [ a sx:ShapeNot ;
sx:shapeExpr [ a sx:Shape ;
sx:expression [ a sx:TripleConstraint ;
sx:max -1 ;
sx:min 1 ;
sx:predicate schema:ratingValue ;
sx:valueExpr [ a sx:NodeConstraint ;
sx:pattern "(\\d+[\\.,]?\\d*)" ] ] ] ] [ a sx:Shape ;
sx:expression [ a sx:SemAct ;
sx:name <https://google.com/search/validation/valid-rating> ] ] ) ] ) ] ) ] ) ] ) ] ) ] ] ] ) ] .
Here is the corresponding SHACL. The internal representation is very similar and has a few other constructs:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix schema: <http://schema.org/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
[] a sh:NodeShape ;
schema:identifier "ValidSoClaimReview" ;
sh:property [ sh:maxCount 1 ;
sh:message "VALID_URL" ;
sh:minCount 1 ;
sh:path schema:url ],
[ sh:maxCount 1 ;
sh:minCount 1 ;
sh:path schema:claimReviewed ],
[ sh:and ( [ sh:or ( [ sh:class schema:CreativeWork ] [ sh:class schema:Claim ] ) ] [ sh:or ( [ sh:not [ sh:class schema:CreativeWork ] ] [ a sh:NodeShape ;
sh:property [ sh:and ( [ sh:or ( [ sh:class schema:Organization ] [ sh:class schema:Person ] ) ] [ sh:or ( [ sh:not [ sh:or ( [ sh:class schema:Organization ] [ sh:class schema:Person ] ) ] ] [ a sh:NodeShape ;
sh:property [ sh:maxCount 1 ;
sh:minCount 1 ;
sh:path schema:name ;
sh:severity sh:Warning ] ] ) ] ) ;
sh:minCount 1 ;
sh:path schema:author ;
sh:severity sh:Warning ],
[ sh:message "VALID_DATE_TIME" ;
sh:minCount 1 ;
sh:path schema:datePublished ;
sh:severity sh:Warning ] ] ) ] [ sh:or ( [ sh:not [ sh:class schema:Claim ] ] [ a sh:NodeShape ;
sh:and ( [ sh:or ( [ sh:property [ sh:minCount 1 ;
sh:path schema:appearance ] ] [ sh:property [ sh:minCount 1 ;
sh:path schema:firstAppearance ] ] ) ;
sh:severity sh:Warning ] [ sh:or ( [ sh:property [ sh:minCount 1 ;
sh:path schema:appearance ] ] [ sh:property [ sh:minCount 1 ;
sh:path schema:firstAppearance ] ] ) ;
sh:severity sh:Warning ] ),
( [ sh:property [ sh:and ( [ sh:or ( [ sh:class schema:Organization ] [ sh:class schema:Person ] ) ] [ sh:or ( [ sh:not [ sh:or ( [ sh:class schema:Organization ] [ sh:class schema:Person ] ) ] ] [ a sh:NodeShape ;
sh:property [ sh:maxCount 1 ;
sh:minCount 1 ;
sh:path schema:name ;
sh:severity sh:Warning ] ] ) ] ) ;
sh:minCount 1 ;
sh:path schema:author ;
sh:severity sh:Warning ] ] [ sh:property [ sh:message "VALID_DATE_TIME" ;
sh:minCount 1 ;
sh:path schema:datePublished ;
sh:severity sh:Warning ] ] ) ;
sh:property [ sh:or ( [ sh:datatype xsd:anyURI ] [ sh:class schema:CreativeWork ] ) ;
sh:path schema:appearance ],
[ sh:or ( [ sh:datatype xsd:anyURI ] [ sh:class schema:CreativeWork ] ) ;
sh:path schema:firstAppearance ] ] ) ] ) ;
sh:minCount 1 ;
sh:path schema:itemReviewed ;
sh:severity sh:Warning ],
[ sh:and ( [ sh:or ( [ sh:class schema:Organization ] [ sh:class schema:Person ] ) ] [ sh:or ( [ sh:not [ sh:or ( [ sh:class schema:Organization ] [ sh:class schema:Person ] ) ] ] [ a sh:NodeShape ;
sh:and ( [ sh:or ( [ sh:property [ sh:minCount 1 ;
sh:path schema:name ] ] [ sh:property [ sh:minCount 1 ;
sh:path schema:url ] ] ) ;
sh:severity sh:Warning ] [ sh:or ( [ sh:property [ sh:minCount 1 ;
sh:path schema:name ] ] [ sh:property [ sh:minCount 1 ;
sh:path schema:url ] ] ) ;
sh:severity sh:Warning ] ) ;
sh:property [ sh:message "VALID_URL" ;
sh:path schema:url ] ] ) ] ) ;
sh:minCount 1 ;
sh:path schema:author ;
sh:severity sh:Warning ],
[ sh:and ( [ a sh:NodeShape ;
sh:property [ sh:hasValue schema:Rating ;
sh:path rdf:type ] ] [ sh:or ( [ sh:not [ a sh:NodeShape ;
sh:property [ sh:hasValue schema:Rating ;
sh:path rdf:type ] ] ] [ a sh:NodeShape ;
sh:not [ sh:and ( [ sh:property [ sh:minCount 1 ;
sh:path schema:alternateName ] ] [ sh:property [ sh:minCount 1 ;
sh:path schema:name ] ] ) ] ;
sh:or ( [ sh:not [ sh:not [ sh:property [ sh:minCount 1 ;
sh:path schema:name ] ] ] ] [ sh:property [ sh:minCount 1 ;
sh:path schema:alternateName ] ] ),
( [ sh:not [ sh:and ( [ sh:or ( [ sh:property [ sh:minCount 1 ;
sh:path schema:ratingValue ] ] [ sh:property [ sh:minCount 1 ;
sh:path schema:bestRating ] ] [ sh:property [ sh:minCount 1 ;
sh:path schema:worstRating ] ] ) ] [ sh:not [ sh:and ( [ sh:property [ sh:path schema:ratingValue ;
sh:pattern "-1" ] ] [ sh:property [ sh:path schema:bestRating ;
sh:pattern "-1" ] ] [ sh:property [ sh:path schema:worstRating ;
sh:pattern "-1" ] ] ) ] ] ) ] ] [ sh:and ( [ sh:property [ sh:or ( [ sh:pattern "(\\d+[\\.,]?\\d*)/(\\d+[\\.,]?\\d*)" ] [ sh:pattern "(\\d+[\\.,]?\\d*)%" ] [ sh:pattern "(\\d+[\\.,]?\\d*)" ] ) ;
sh:path schema:ratingValue ] ] [ sh:or ( [ sh:not [ sh:property [ sh:minCount 1 ;
sh:path schema:ratingValue ;
sh:pattern "(\\d+[\\.,]?\\d*)" ] ] ] [ sh:message "VALID_RATING" ] ) ] ) ] ) ;
sh:property [ sh:maxCount 1 ;
sh:path schema:alternateName ] ;
sh:severity sh:Warning ] ) ] ) ;
sh:minCount 1 ;
sh:path schema:reviewRating ] .
These should also be visible in HTML somewhere, but for now it would be good to know how to go to https://google.github.io/schemarama/demo/ (or wherever), open the browser devtools, and quickly see this info.
We currently have SHACL/ShEx generated from an old release of schema.org; people will ask which release we support.
(We should also re-run the converters whenever the Schema.org project does a release.)
/cc @ericprud
We should have documentation showing how to set this up for (a) static serving (b) docker serving, and a live installation of at least one of these.
The original demos used server side processing for 3 things:
The whole thing can be run as a docker container but it would be good to have a simplified pure static version that could be run by anyone very easily. To do this:
I made a first attempt at the documentation below.
SchemaramaJS configures itself with various files loaded from relative URIs:
It will also typically serve icons associated with the hierarchy of services, e.g. initial demo uses:
The original demo shows a mix of shapes - some basic structures from Schema.org's definitions, and some associated with example online services. SchemaramaJS will try to load these upon initialization.
This can be quite large, e.g. looking at headers using
curl -s -D - -o /dev/null http://127.0.0.1:3002/shacl/shapes
Content-Disposition: inline; filename=full.shacl
Content-Type: application/octet-stream
Content-Length: 223194
We get a large dump of SHACL in RDF/Turtle syntax.
Similarly, here we are served (in demo configuration):
HTTP/1.0 200 OK
Content-Disposition: inline; filename=full.shexj
Content-Type: application/octet-stream
Content-Length: 633692
Last-Modified: Wed, 09 Mar
Similarly, for the ShEx version we get a large dump of ShEx in ShExJ syntax.
curl -s -D - http://127.0.0.1:3002/shacl/subclasses
This data file reproduces rdfs:subClassOf assertions from relevant schemas. It is in Turtle format, and is not tightly linked to SHACL, except by the fact that only the SHACL validator uses it; it is not passed to ShEx validator during setup. In principle it could be used for other purposes, and we could change the file/url path accordingly.
In the demo configuration, it contains every subtype-supertype relationship defined in schema.org (note that a type can therefore have multiple supertypes). Here are the lines relating to the ComedyClub type:
curl -s -D - http://127.0.0.1:3002/shacl/subclasses | grep ComedyClub
schema:ComedyClub rdfs:subClassOf schema:Place .
schema:ComedyClub rdfs:subClassOf schema:EntertainmentBusiness .
schema:ComedyClub rdfs:subClassOf schema:Organization .
schema:ComedyClub rdfs:subClassOf schema:LocalBusiness .
schema:ComedyClub rdfs:subClassOf schema:Thing .
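To show how a client might consume this file, here is a minimal sketch that turns lines of the form shown above into a map from each type to its supertypes. It is a line-based reader for this one-triple-per-line layout only, not a Turtle parser, and the function name is an assumption for this example rather than anything in SchemaramaJS.

```javascript
// Parse lines like "schema:A rdfs:subClassOf schema:B ." into a Map from
// each type to the Set of its declared supertypes.
// Lines that do not match (prefixes, blanks, other syntax) are skipped.
function parseSubclassLines(text) {
  const supertypes = new Map();
  const pattern = /^(\S+)\s+rdfs:subClassOf\s+(\S+)\s*\.\s*$/;
  for (const raw of text.split("\n")) {
    const match = pattern.exec(raw.trim());
    if (!match) continue;
    const [, sub, sup] = match;
    if (!supertypes.has(sub)) supertypes.set(sub, new Set());
    supertypes.get(sub).add(sup);
  }
  return supertypes;
}

const sampleSubclasses = [
  "schema:ComedyClub rdfs:subClassOf schema:Place .",
  "schema:ComedyClub rdfs:subClassOf schema:EntertainmentBusiness .",
  "schema:ComedyClub rdfs:subClassOf schema:Organization .",
  "schema:ComedyClub rdfs:subClassOf schema:LocalBusiness .",
  "schema:ComedyClub rdfs:subClassOf schema:Thing .",
].join("\n");

const comedyClubSupertypes = parseSubclassLines(sampleSubclasses).get("schema:ComedyClub");
console.log([...comedyClubSupertypes]);
```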
SchemaramaJS loads a JSON configuration file defining a hierarchy of services/applications that can be associated with the various validations being checked. In turn this file can include image URLs.
Demo config is this:
{
"nested": [
{
"service": "ServiceA"
},
{
"nested": [
{
"service": "ServiceBProduct1"
},
{
"service": "ServiceBProduct2"
},
{
"service": "ServiceBProduct3"
}
],
"service": "ServiceB"
},
{
"service": "ServiceC"
},
{
"service": "ServiceD"
}
],
"service": "Schema"
}
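For anyone writing their own hierarchy file, a minimal sketch of walking this structure may help. It flattens the nested services into a list with a depth value, which is roughly what a selector UI needs; the function name is an assumption for this example and is not the constructHierarchySelector used by the demo.

```javascript
// Depth-first walk over the { service, nested } structure, recording each
// service name together with its nesting depth (root is depth 0).
function flattenHierarchy(node, depth = 0, out = []) {
  out.push({ service: node.service, depth });
  for (const child of node.nested || []) {
    flattenHierarchy(child, depth + 1, out);
  }
  return out;
}

const demoHierarchy = {
  service: "Schema",
  nested: [
    { service: "ServiceA" },
    {
      service: "ServiceB",
      nested: [
        { service: "ServiceBProduct1" },
        { service: "ServiceBProduct2" },
        { service: "ServiceBProduct3" },
      ],
    },
    { service: "ServiceC" },
    { service: "ServiceD" },
  ],
};

const flatServices = flattenHierarchy(demoHierarchy);
for (const { service, depth } of flatServices) {
  console.log("  ".repeat(depth) + service);
}
```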
SchemaramaJS also uses a JSON service mapping file, which associates validation shapes (named in common across SHACL and ShEx) with the services described in /services:
{
"ValidSchemaAboutPage": "Schema",
"ValidSchemaAcceptAction": "Schema",
"ValidSchemaAccommodation": "Schema",
"ValidSchemaAccountingService": "Schema",
"ValidSchemaAchieveAction": "Schema",
"ValidSchemaAction": "Schema",
"ValidSchemaActionAccessSpecification": "Schema",
"ValidSchemaActionStatusType": "Schema",
"ValidSchemaActivateAction": "Schema",
"ValidSchemaAddAction": "Schema",
"ValidSchemaAdministrativeArea": "Schema",
"ValidSchemaAdultEntertainment": "Schema",
"ValidSchemaAggregateOffer": "Schema",
"ValidSchemaAgreeAction": "Schema",
"ValidSchemaAirline": "Schema",
"ValidSchemaAirport": "Schema", [...etc etc...]
"ValidSchemaWriteAction": "Schema",
"ValidSchemaXPathType": "Schema",
"ValidSchemaZoo": "Schema",
"ValidServiceBRecipe": "ServiceB",
"ValidServiceBProduct1Recipe": "ServiceBProduct1",
"ValidServiceBProduct2Recipe": "ServiceBProduct2",
"ValidServiceBProduct3Recipe": "ServiceBProduct3",
"ValidServiceARecipe": "ServiceA",
"ValidServiceDRecipe": "ServiceD",
"ValidServiceCRecipe": "ServiceC"
}
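Since the mapping goes from shape name to service, a consumer that wants "all shapes for ServiceB" has to invert it. Here is a minimal sketch of that inversion, using a few entries from the demo mapping; the function name is an assumption for this example.

```javascript
// Invert a { shapeName: serviceName } object into
// { serviceName: [shapeName, ...] }, preserving insertion order.
function groupShapesByService(shapeToService) {
  const byService = {};
  for (const [shape, service] of Object.entries(shapeToService)) {
    if (!byService[service]) byService[service] = [];
    byService[service].push(shape);
  }
  return byService;
}

const sampleMap = {
  ValidSchemaAboutPage: "Schema",
  ValidSchemaZoo: "Schema",
  ValidServiceBRecipe: "ServiceB",
  ValidServiceBProduct1Recipe: "ServiceBProduct1",
};

const shapesByService = groupShapesByService(sampleMap);
console.log(shapesByService);
```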
Finally, SchemaramaJS loads a collection of example tests. Each is an appropriately escaped text value, structured in a very plain JSON file:
{
"tests": [
"escaped markup here e.g. json-ld...",
"second example here e.g. microdata..."
]
}
No additional metadata is included; SchemaramaJS will try to figure out how to parse it.
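To make "figure out how to parse it" a bit more concrete, here is a minimal sketch of one way such format sniffing could work. This is a guess at a plausible heuristic, not SchemaramaJS's actual detection logic, and the function name and return values are assumptions for this example.

```javascript
// Guess the syntax of a test string using cheap textual checks.
// Returns "json-ld", "microdata", "rdfa", or "unknown".
function guessFormat(text) {
  const trimmed = text.trim();
  if (trimmed.startsWith("{") || trimmed.startsWith("[")) {
    try {
      JSON.parse(trimmed);
      return "json-ld";
    } catch (err) {
      // Not valid JSON; fall through to the HTML-based checks.
    }
  }
  if (/\bitemscope\b/i.test(trimmed)) return "microdata";
  if (/\b(typeof|vocab|property)\s*=/i.test(trimmed)) return "rdfa";
  return "unknown";
}

console.log(guessFormat('{"@context": "https://schema.org", "@type": "Person"}'));
console.log(guessFormat('<body itemscope itemtype="https://schema.org/Person">'));
```

A real implementation would also need to handle HTML pages that mix several syntaxes, which this sketch does not attempt.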
These files are all loaded by static/js/scc/core.js:
$(document).ready(async () => {
$.getJSON("https://api.ipify.org/?format=json", function(e) {
ip = e.ip;
});
await $.get(`shacl/shapes`, (res) => shaclShapes = res);
await $.get(`shacl/subclasses`, (res) => subclasses = res);
await $.get(`shex/shapes`, (res) => shexShapes = JSON.parse(res));
await $.get(`hierarchy`, (res) => {
hierarchy = res;
constructHierarchySelector(hierarchy, 0);
});
await $.get(`services/map`, (res) => shapeToService = res);
$.get(`tests`, (res) => initTests(res.tests));
shexValidator = new schemarama.ShexValidator(shexShapes, {annotations: annotations});
shaclValidator = new schemarama.ShaclValidator(shaclShapes, {
annotations: annotations,
subclasses: subclasses,
});
});
Relaying from Aaron:
I pasted the first http://schema.org/Event example into the tool and it passed as error free with ShEx validation, but generated 4 errors with SHACL...
(I have verified this -- @danbri)
{
"@context": "https://schema.org",
"@type": "MusicGroup",
"event": [
{
"@type": "Event",
"location": "Memphis, TN, US",
"offers": "ticketmaster.com/foofighters/may20-2011",
"startDate": "2011-05-20",
"url": "foo-fighters-may20-fedexforum"
},
{
"@type": "Event",
"location": "Council Bluffs, IA, US",
"offers": "ticketmaster.com/foofighters/may23-2011",
"startDate": "2011-05-23",
"url": "foo-fighters-may23-midamericacenter"
}
],
"image": [
"foofighters-1.jpg",
"foofighters-2.jpg",
"foofighters-3.jpg"
],
"name": "Foo Fighters",
"track": [
{
"@type": "MusicRecording",
"audio": "foo-fighters-rope-play.html",
"duration": "PT4M5S",
"inAlbum": "foo-fighters-wasting-light.html",
"interactionStatistic": {
"@type": "InteractionCounter",
"interactionType": "https://schema.org/ListenAction",
"userInteractionCount": "14300"
},
"name": "Rope",
"offers": "foo-fighters-rope-buy.html",
"url": "foo-fighters-rope.html"
},
{
"@type": "MusicRecording",
"audio": "foo-fighters-everlong-play.html",
"duration": "PT6M33S",
"inAlbum": "foo-fighters-color-and-shape.html",
"name": "Everlong",
"interactionStatistic": {
"@type": "InteractionCounter",
"interactionType": "https://schema.org/ListenAction",
"userInteractionCount": "11700"
},
"offers": "foo-fighters-everlong-buy.html",
"url": "foo-fighters-everlong.html"
}
],
"subjectOf": {
"@type": "VideoObject",
"description": "Catch this exclusive interview with Dave Grohl and the Foo Fighters about their new album, Rope.",
"duration": "PT1M33S",
"name": "Interview with the Foo Fighters",
"thumbnail": "foo-fighters-interview-thumb.jpg",
"interactionStatistic": {
"@type": "InteractionCounter",
"interactionType": "https://schema.org/CommentAction",
"userInteractionCount": "18"
}
}
}
At Google we're looking into publishing .ttl but the JS tools here need ShEx in JSON(-LD) form.
I believe everything is in the shexjs codebase as exposed at http://shex.io/webapps/shex.js/doc/shex-simple.html somewhere, @ericprud is considering API possibilities.
I believe there is code out there to auto-generate SHACL, and maybe ShEx too.
/cc @tombaker
According to PR #23, the current ShEx bundle has some issues and needs to be regenerated (or replaced with the shex npm module).
except for special cases
Implement remote web-pages loading and consequent validation:
A brief inspection of https://github.com/google/schemarama/tree/main/kgx/wikidata shows that those queries have mostly the same structure:
wikibase:label is invoked)
So the distinct part of each query selects props, and optionally maps them to bioschema.
I think it makes sense to extract these specific parts and then generate SPARQL from them.
This will help to:
BTW, have you considered generating extraction queries from Wikidata ShEx such as https://www.wikidata.org/wiki/EntitySchema:E258 ?
Wikidata has a hard timeout of 1 minute: it may even cut a response off in the middle, making it invalid.
I think that many of the queries in https://github.com/google/schemarama/tree/main/kgx/wikidata/basic will hit that timeout.
How to deal with it?
We have some code that first gets IDs, then batches them up into reasonable pieces to fetch the extra data...
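The "get IDs first, then batch" approach can be sketched as follows. This is a generic illustration, not the code referred to above: the function names, the batch size, and the ?item variable are all assumptions for this example.

```javascript
// Split a list into consecutive batches of at most `size` items.
function chunk(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Build a SPARQL VALUES clause restricting `variable` to the given Wikidata IDs.
function valuesClause(ids, variable = "?item") {
  return `VALUES ${variable} { ${ids.map((id) => `wd:${id}`).join(" ")} }`;
}

const idBatches = chunk(["Q1", "Q2", "Q3", "Q4", "Q5"], 2);
for (const batch of idBatches) {
  console.log(valuesClause(batch));
}
```

Each VALUES clause would be spliced into the detail query, so that every request stays small enough to finish within the timeout.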
Perhaps SEMICeu/dcat-ap_shacl#32 (comment) (DCAT, Dublin Core, FOAF vocabularies)?
When I was playing around with the demo, I noticed it didn't seem to report syntax errors but instead indicated: "The input doesn't contain structured data".
Is there any chance a future update will include syntax error reporting (similar to that of Google's Rich Result Test or schema.org's validator)?
We are probably doing something inefficient - needs profiling. Might be loading large data files?
If it can't be sped up, let's at least acknowledge this in the UI with an 'in progress' indicator, and something to avoid the button being re-clicked (which is a textbook example in some book I was reading on Functional JS :)
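One way to guard against re-clicks, independent of any UI framework, is to wrap the validation call so that repeated calls while one is still running reuse the pending promise. A minimal sketch, with names that are assumptions for this example:

```javascript
// Wrap an async task so that calls made while a previous call is still
// pending return the same promise instead of starting the task again.
function singleFlight(task) {
  let inFlight = null;
  return function (...args) {
    if (inFlight) return inFlight; // a run is in progress: reuse it
    inFlight = (async () => task(...args))().finally(() => {
      inFlight = null; // allow a fresh run once this one settles
    });
    return inFlight;
  };
}

let validationRuns = 0;
const validateOnce = singleFlight(() => {
  validationRuns += 1;
  return Promise.resolve("done");
});

const firstClick = validateOnce();
const secondClick = validateOnce(); // simulated re-click while still pending
firstClick.then((result) => console.log(result, "runs:", validationRuns));
```

A click handler could call validateOnce() and show the 'in progress' indicator until the returned promise settles.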
I tried a test with
{
"@context": "https://schema.org/",
"@type": "HeadphoneProduct", ...etc}
where some properties and specifically this type don't exist.
It gives only a disappearing popup: "validation error - cannot read property 'failures' of undefined".
Hello,
I have the attached html file. The file has 3x <script type="application/ld+json">
(one of which is empty).
Should schemarama be able to parse them? The cli doesn't seem to find them:
node cli --parse --input /tmp/3.html --format turtle
<http://example.org/> <fb:admins> "1825066490"@en-us;
<fb:app_id> "488770804557249"@en-us;
<http://ogp.me/ns#title> "Seoul Apartments & Vacation Rentals from $20 | HomeToGo"@en-us;
<http://ogp.me/ns#url> "https://www.hometogo.com/seoul/"@en-us;
<http://ogp.me/ns#description> "Click here and compare 16,895 vacation rentals from 19 providers in Seoul! โ Find deals & save up to 40% with HomeToGo."@en-us;
<http://ogp.me/ns#image> "//cdn2.hometogo.net/assets/media/pics/1200_628/585a9417b7c24.jpg"@en-us;
<http://ogp.me/ns#type> "website"@en-us;
<http://ogp.me/ns#site_name> "HomeToGo - search engine for vacation rentals"@en-us;
<http://ogp.me/ns#locale> "en_US"@en-us;
<article:author> "https://www.facebook.com/hometogo"@en-us.
_:df_1_0 <http://www.w3.org/1999/xhtml/vocab#role> <http://www.w3.org/1999/xhtml/vocab#combobox>.
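For comparison, here is a minimal sketch of pulling JSON-LD blocks out of an HTML string, skipping empty script elements and recording JSON syntax errors instead of silently dropping them. It uses a regular expression rather than an HTML parser, so it is only an approximation, and it is not how the schemarama CLI works internally; the function name is an assumption for this example.

```javascript
// Find <script type="application/ld+json"> elements in an HTML string and
// try to parse each non-empty one as JSON.
function extractJsonLd(html) {
  const pattern =
    /<script\b[^>]*type\s*=\s*["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  const blocks = [];
  for (const match of html.matchAll(pattern)) {
    const text = match[1].trim();
    if (text === "") continue; // empty script element: nothing to parse
    try {
      blocks.push({ ok: true, data: JSON.parse(text) });
    } catch (err) {
      blocks.push({ ok: false, error: err.message });
    }
  }
  return blocks;
}

const sampleHtml = [
  '<script type="application/ld+json">{"@type": "Person", "name": "A"}</script>',
  '<script type="application/ld+json">   </script>',
  '<script type="application/ld+json">{"@type": "Event",}</script>',
].join("\n");

const jsonLdBlocks = extractJsonLd(sampleHtml);
console.log(jsonLdBlocks.map((block) => block.ok));
```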
In the doc (schemarama/core/README.md) about the SHACL validation CLI, the required argument should be --shacl and not --shex.
I forget the terminology, but the AND/NOT/OR complexity here serves to suppress errors that are downstream of some more fundamental error. @ericprud et al have plans for doing this within ShEx in more standardized ways, so the actual intended shape content doesn't get lost in all the boolean trickery.
PREFIX : <http://schema.org/>
PREFIX validate: <https://google.com/search/validation/valid>
<S1> {
:url . + %validate:url{console.log('some url checking code here')%}
} AND {
:datePublished . ? %validate:date-time{console.log('some datetime checking code here')%}
} AND {
:claimReviewed .
} AND {
:itemReviewed {
a [:CreativeWork]
} AND (
NOT {
a [:CreativeWork]
} OR {
:author (
{
a [:Organization]
} OR {
a [:Person]
}
) AND (
NOT (
{
a [:Organization]
} OR {
a [:Person]
}
) OR {
:name . ?
}
)?
} AND {
:datePublished . ? %validate:date-time{console.log('some datetime checking code here')%}
}
)?
} AND {
:author (
{
a [:Organization]
} OR {
a [:Person]
}
) AND (
NOT (
{
a [:Organization]
} OR {
a [:Person]
}
) OR (
{
:name .
} OR {
:url .
}
) AND {
:url . * %validate:url{console.log('some url checking code here')%}
}
)?
} AND {
:reviewRating {
a [:Rating]
} AND (
NOT {
a [:Rating]
} OR {
:alternateName .
} AND (
(
NOT {
:name .
} OR {
:alternateName . ?
}
) AND (
NOT (
NOT {
:name .
}
) OR {
:alternateName . +
}
)
) AND NOT (
{
:alternateName .
} AND {
:name .
}
) AND (
NOT (
(
{
:ratingValue .
} OR {
:bestRating .
} OR {
:worstRating .
}
) AND NOT (
{
:ratingValue /-1/
} AND {
:bestRating /-1/
} AND {
:worstRating /-1/
}
)
) OR {
:ratingValue /([0-9]+[\.,]?[0-9]*)\/([0-9]+[\.,]?[0-9]*)/ OR /([0-9]+[\.,]?[0-9]*)%/ OR /([0-9]+[\.,]?[0-9]*)/ +
} AND (
NOT {
:ratingValue /([0-9]+[\.,]?[0-9]*)/ +
} OR {
} %validate:rating%
)
)
)+
}
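The recurring "NOT guard OR constraint" pattern in the shape above is logical implication: the constraint is only enforced when the guard holds, so a node that already fails the type check does not also produce downstream errors. A minimal sketch of that idea on plain objects follows; the helper names are assumptions for this example and have nothing to do with how a ShEx engine evaluates shapes.

```javascript
// implies(guard, constraint) is equivalent to (NOT guard) OR constraint.
const implies = (guard, constraint) => (node) => !guard(node) || constraint(node);

// Simple predicates over JSON-LD-like objects.
const hasType = (type) => (node) => [].concat(node["@type"] || []).includes(type);
const hasProperty = (name) => (node) => node[name] !== undefined;

// "If the item is a CreativeWork, it must have an author."
const creativeWorkNeedsAuthor = implies(hasType("CreativeWork"), hasProperty("author"));

console.log(creativeWorkNeedsAuthor({ "@type": "Claim" }));
console.log(creativeWorkNeedsAuthor({ "@type": "CreativeWork" }));
console.log(creativeWorkNeedsAuthor({ "@type": "CreativeWork", author: "Someone" }));
```

In the first call the guard is false, so the rule passes vacuously; the type mismatch itself is reported by a separate check.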
I use Schema.org properties inside HTML documents.
<body itemscope itemtype="https://schema.org/Person">
<h1 itemprop="name">anatol</h1>
Will schemarama validate my HTML documents?
e.g.
pip3 install -r requirements.txt
Fake @id field needs to be removed from the pretty markup representation
The failures array in the report currently does not list the identifier of the node that violates the shape, e.g.:
{
"property": "https://example.com/ssn",
"message": "More than 1 values",
"shape": "https://example.com/PersonShape",
"severity": "error"
},
This makes the report hard to use: which resource in my data is the culprit?
Is there an easy way to have the identifier of the incorrect node be part of the report?
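Until the report carries this itself, one possible workaround is to validate each candidate node separately and tag the resulting failures on the caller's side. The sketch below assumes a caller-supplied validateNode function that returns a report with a failures array shaped like the one above; that function and the node field are hypothetical and not part of the schemarama API.

```javascript
// Run a caller-supplied validation function once per node and attach the
// node identifier to every failure it reports.
// A missing report or missing `failures` array is treated as "no failures".
function failuresWithNode(nodeIds, validateNode) {
  const tagged = [];
  for (const node of nodeIds) {
    const report = validateNode(node) || {};
    for (const failure of report.failures || []) {
      tagged.push({ node, ...failure });
    }
  }
  return tagged;
}

// Stub standing in for a real validator call (hypothetical data).
const stubValidate = (node) =>
  node === "https://example.com/alice"
    ? {
        failures: [
          {
            property: "https://example.com/ssn",
            message: "More than 1 values",
            shape: "https://example.com/PersonShape",
            severity: "error",
          },
        ],
      }
    : undefined;

const taggedFailures = failuresWithNode(
  ["https://example.com/alice", "https://example.com/bob"],
  stubValidate
);
console.log(taggedFailures);
```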
https://github.com/structured-data/linter
/cc @gkellogg @jvandriel @jaygray0919
(there could also be a larger conversation about having the linter use https://github.com/ruby-rdf/shex and potential for interop testing...)
From a quick chat with @gkellogg:
'The structured view of examples would fit in pretty well with our system.'
For a concrete example, consider https://github.com/google/schemarama/blob/main/demo/validation/shex/specific/ServiceB/Dataset.shex
which was in fact based on Google's rules for describing datasets to Google Dataset Search. This all got anonymized for the opensource launch here for simplicity.
Here's a fragment of the ShExC:
<#ValidServiceBDataset> @<#ValidSchemaDataset> AND EXTRA a {
schema:description .
// techdoc:url "https://schema.org/url"
// techdoc:description "A short summary describing a dataset."
// techdoc:identifier "error";
schema:name .
// techdoc:url "https://schema.org/name"
// techdoc:description "A descriptive name of a dataset."
// techdoc:identifier "error";
schema:alternateName . +
// techdoc:url "https://schema.org/alternateName"
// techdoc:description "Alternative names that have been used to refer to this dataset, such as aliases or abbreviations."
// techdoc:identifier "warning";
schema:creator . +
// techdoc:url "https://schema.org/creator"
// techdoc:description "The creator or author of this dataset."
// techdoc:identifier "warning";
schema:citation .
// techdoc:url "https://schema.org/citation"
// techdoc:description "Identifies academic articles that are recommended by the data provider be cited in addition to the dataset itself."
// techdoc:identifier "warning";
...
It might be that the linking from bits of ShEx to supporting documents is better supported by out-of-band metadata, rather than inline. To be discussed!