cygri / void
An RDF schema and associated documentation for expressing metadata about RDF datasets
Home Page: http://www.w3.org/TR/void/
Raised by: the gang of four ;)
Content: need to decide if it is
void:Dataset <-[void:from]- void:Linkset - [void:to] -> void:Dataset
OR rather
void:Dataset -[void:contains]-> void:Linkset - [void:target] ->
void:Dataset
Original issue reported on code.google.com by Michael.Hausenblas
on 12 Jun 2008 at 1:17
The term "dataset" is used in voiD (inherited from the LOD community), but it
is also used in SPARQL in a different sense (as "RDF Dataset", see definition here:
http://www.w3.org/TR/rdf-sparql-query/#sparqlDataset )
We should have a section somewhere that clearly explains the difference.
If we get an extra section on describing SPARQL endpoints (aligned with the
SPARQL WG's output), that might be a good place.
Original issue reported on code.google.com by [email protected]
on 21 Sep 2009 at 4:37
How to describe the relationship between a dataset and its mirror?
E.g.: I have a copy of some part of dbpedia (but not an exact copy). How
should I link my dataset to dbpedia in the voiD description, so that people
can use my endpoint as an alternative to dbpedia, and know what caveats
there are in doing so?
Original issue reported on code.google.com by [email protected]
on 25 Sep 2009 at 9:09
The group acknowledges the need for such terms but has decided to wait and see what the
SPARQL WG does. There are some existing proposals such as [1] and we have
compiled some use cases which will be given to the SPARQL WG as input.
We will liaise with the WG on this issue.
[1] http://ontologi.es/sparql
Original issue reported on code.google.com by Michael.Hausenblas
on 27 Aug 2009 at 1:23
[deleted issue]
The voiD Guide is very brief in its definition of datasets and linksets. I
wrote some text about this for the LDOW'09 paper, where it became Sections 3.1
and 3.2. Is some of this worth lifting into the Guide?
The paper is here:
http://events.linkeddata.org/ldow2009/papers/ldow2009_paper20.pdf
Original issue reported on code.google.com by [email protected]
on 10 Aug 2009 at 7:08
Raised by: [http://semanticweb.org/wiki/Olaf_Hartig Olaf Hartig]
Content:
A description of an RDF dataset should contain information about the
provenance of the data in the dataset.
See also:
* http://community.linkeddata.org/MediaWiki/index.php?MetaLOD#Provenance_information
Original issue reported on code.google.com by Michael.Hausenblas
on 12 Jun 2008 at 1:13
This has been requested by Hugh Glaser and Ian Millard from ECS Southampton to
model their
CRS (Co-reference Service) deployments.
The idea is to add another method of accessing datasets, in addition to linked
data (simply
dereferencing the URIs), RDF data dumps, and SPARQL endpoints, all of which can
already be
described in voiD.
The proposal is to add one more property which can be used on datasets:
:MyDS a void:Dataset;
void:uriLookupEndpoint <http://example.com/lookup?uri=>;
.
The semantics of this: There is a service that will give you RDF triples about
a certain URI x. To
invoke it, you URL-encode x and append the result to the lookup endpoint URI,
and look up the
resulting URI.
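The invocation rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the endpoint is the placeholder from the proposal, the DBpedia URI is an arbitrary example, and the helper name lookup_url is made up for this sketch.

```python
from urllib.parse import quote


def lookup_url(endpoint: str, uri: str) -> str:
    """Build the URI to dereference for a void:uriLookupEndpoint.

    `endpoint` is the value of void:uriLookupEndpoint, `uri` is the
    resource we want triples about (both are illustrative here).
    """
    # safe="" makes quote() percent-encode ':' and '/' as well,
    # so the whole URI travels as a single query value.
    return endpoint + quote(uri, safe="")


print(lookup_url("http://example.com/lookup?uri=",
                 "http://dbpedia.org/resource/Berlin"))
# prints http://example.com/lookup?uri=http%3A%2F%2Fdbpedia.org%2Fresource%2FBerlin
```

A client would then issue a plain HTTP GET on the resulting URI and parse the response as RDF.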
Examples of services that can be described as a voiD dataset using this
approach:
* Sindice and other search engines that offer URI lookup APIs
* Hugh Glaser's CRS (Co-reference Service)
* Converter services that translate from other data formats into RDF
* Theoretically: caches that serve linked data for someone else (using the
original URIs)
Open issue: Should the object of void:uriLookupEndpoint be an RDF resource or a
literal?
Real-world example of a voiD description for a CRS:
<http://dblp.explorer.com/id/crs> a void:Dataset;
foaf:homepage <http://dblp.rkbexplorer.com/crs/>;
rdfs:label "Coreference Resolution Service for the DBLP dataset at rkbexplorer.com";
void:vocabulary <http://www.rkbexplorer.com/ontologies/coref>;
void:uriLookupEndpoint <http://dblp.rkbexplorer.com/crs/export/?type=uri&term=>;
.
Original issue reported on code.google.com by [email protected]
on 17 Dec 2008 at 2:26
A linkset is a kind of dataset, isn't it? Or not? Maybe the two classes are
disjoint? We should discuss
this and maybe express it explicitly in the vocabulary spec.
Original issue reported on code.google.com by [email protected]
on 10 Dec 2008 at 11:51
The talk given by Marc from Bio2RDF in the HCLS call in November 2008
resulted in a possible story of using voiD in Bio2RDF. But we did not have enough
information to understand their real problems, and their work was also just
at the starting point. We keep the story here for future reference.
Example Use Case - the Bio2RDF dataset: (JUN)
Basically, they have a lot of datasets now. For the sake of performance,
they are partitioning these datasets and storing them on separate SPARQL
endpoints. It is very possible that every SPARQL endpoint hosts the dataset
from a different source. At the same time, they want the ability to
trace the statements about a thing back to each of the SPARQL endpoints,
for example, to find out which SPARQL endpoint provides statements about a
UniProt gene. What they are doing now is creating a map of how the data
from different endpoints are linked with each other by properties, and
using this map to guide their query processing.
How can this be supported using voiD? I am thinking of the following process:
1. Bio2RDF gets the URI of a data resource, say :uniprot_genexx
2. they query for the dataset that this data resource belongs to, say
:uniprot_ds
3. they search for the links from/to :uniprot_ds and find the
linksets; one of them could be :uniprotToKegg_ls
4. they search for the datasets that :uniprot_ds is linked with/to, say
one of them is :kegg_ds
5. then, they can send SPARQL queries to :kegg_ds, to find out what they
say about :uniprot_genexx
Maybe, the linkset :uniprotToKegg_ls contains all the owl:sameAs statements
about :uniprot_genexx, which will help with the queries.
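The lookup process described above could be sketched roughly as follows. This is not how Bio2RDF works; the voiD metadata is held in plain Python dicts for illustration, and the endpoint URLs and the function name endpoints_to_query are hypothetical.

```python
# Hypothetical voiD-like metadata, kept in dicts instead of an RDF store.
RESOURCE_TO_DATASET = {":uniprot_genexx": ":uniprot_ds"}

# linkset -> (dataset on one side, dataset on the other side)
LINKSETS = {":uniprotToKegg_ls": (":uniprot_ds", ":kegg_ds")}

# dataset -> SPARQL endpoint (made-up URLs)
ENDPOINTS = {
    ":uniprot_ds": "http://example.org/uniprot/sparql",
    ":kegg_ds": "http://example.org/kegg/sparql",
}


def endpoints_to_query(resource):
    """Return the endpoints of all datasets linked to the resource's dataset."""
    home = RESOURCE_TO_DATASET[resource]      # which dataset holds the resource
    linked = set()
    for side_a, side_b in LINKSETS.values():  # linksets from/to that dataset
        if side_a == home:
            linked.add(side_b)
        elif side_b == home:
            linked.add(side_a)
    return sorted(ENDPOINTS[ds] for ds in linked)


print(endpoints_to_query(":uniprot_genexx"))
```

The returned endpoints are the ones that would then be asked what they say about :uniprot_genexx.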
Original issue reported on code.google.com by [email protected]
on 12 Dec 2008 at 3:07
Minor point: rdf:type can also be seen as "linking to external resources"
because the objects of
rdf:type statements reside on a different domain. We should clarify that this
does not constitute a
linkset, and that the void:vocabulary construct is to be used for classes.
Original issue reported on code.google.com by [email protected]
on 10 Nov 2009 at 2:15
There should be a property to be used in Linksets, which specifies the kind of
links that are
contained in the Linkset. Are they sameAs links? Are they seeAlso links? Are
they foaf:knows links?
Proposal:
void:linkPredicate a rdf:Property;
rdfs:domain void:Linkset;
rdfs:range rdf:Property;
rdfs:label "link predicate";
rdfs:comment "The linkset contains links of this RDF predicate.";
.
Original issue reported on code.google.com by [email protected]
on 23 Jun 2008 at 7:25
SPARQL endpoints provide access to an "RDF dataset", that is, a default graph
and zero or more
named graphs.
voiD could be handy for characterising the contents of these different graphs.
This can be
helpful for SPARQL user interfaces, for query optimization, and for federation.
I suppose one would define several void:subsets of the main void:Dataset, and
then express that
certain of these subsets reside in certain named graphs or in the default
graph. The SPARQL
endpoint URI (specified via void:sparqlEndpoint) could be used as a connecting
node to tie the
different graphs together.
Some of this will probably be provided by the SPARQL vocabulary under
development in the
SPARQL WG. So we could just wait and see, or try to anticipate their design.
A strawman from Gregory Williams is pasted below. I assume that the top-level
blank node
would be the SPARQL endpoint URI.
[] a sd:Service ;
sd:dataset [
sd:defaultGraph [
sd:datasetDescription <uri of voiD data>
] ;
sd:namedGraph [
sd:graphName <uri of named graph> ;
sd:datasetDescription <uri of voiD data>
]
] .
Original issue reported on code.google.com by [email protected]
on 21 Sep 2009 at 5:50
Some vocabularies define a class for documents that mainly use the vocabulary.
For example, FOAF
has foaf:PersonalProfileDocument. Shall we add something similar to void, e.g.
void:DatasetDescription? This would be a subclass of foaf:Document.
Benefit: It might encourage people to add metadata to their voiD files, e.g.
<> a void:DatasetDescription;
foaf:primaryTopic :MyDataset;
dc:modified "2009-02-15"^^xsd:date;
.
Original issue reported on code.google.com by [email protected]
on 16 Feb 2009 at 12:16
Stuart Williams commented on the picture at the start of Section 2:
| The diagram at the start of section 2 is actually a little confusing. It
| looks like it presents two datasets :DS1 and :DS2 (each being a collection
| of statements) and that each dataset 'contains' a named subset :LS1 and LS2
| respectively, of linksets - in the example expressing populations of links
| using foaf:knows, rdfs:seeAlso and owl:sameAs properties. However, as the
| later examples unfold, :LS1 and :LS2 are not 'subgraphs' of their
| respective graphs, they are (optionally) named linkset resources that act
| as statement subjects for some statements describing *a* particular
| linkset. Indeed even for the example illustrated there are (or would be)
| three defined linkset nodes (two in :DS1 and one in :DS2) and the regions
| that are demarcated as :LS1 and :LS2 don't exist quite as presented (AFAICT).
Opinions? Does it make sense that all the three arrows originate/end in the
same two :LS
resources?
Original issue reported on code.google.com by [email protected]
on 27 Aug 2009 at 8:50
Talis is spending considerable resources to improve the general situation
around dataset licensing.
They've contributed to the PDDL effort, there was a high-profile article by Ian
Davis [1], and several
Talis folks did a tutorial on dataset licensing at ISWC [2].
They tie licenses to void:Datasets, but at least in Ian's proposal, it doesn't
use dc:license, but a
separate vocabulary, the Waiver vocabulary. See [1] for example use.
Should we align this somehow? Document the use of the Waiver vocabulary? At
least I think we
should create consensus with them about the proper way of marking up different
licenses/waivers
in voiD.
[1] http://blogs.talis.com/nodalities/2009/07/linked-data-public-domain.php
[2] http://www.opendatacommons.org/events/iswc-2009-legal-social-sharing-data-web/
Original issue reported on code.google.com by [email protected]
on 10 Nov 2009 at 12:47
Jiri Prochazka complained about the Guide section on dataset partitioning:
“Another thing - dataset partitioning. Combination of dataset
categorization and partitioning led me to great confusion - I have
thought voiD also wanted to categorize the data in the dataset.
Better to put a notice that partitioning should be used carefully and
that it was designed for mirroring of datasets.”
I think the section should be improved to make the following points clear:
- partitioning is about the case where a voiD author wants to say something
that applies to just
a part of the dataset, and wants to stress that it does *not* apply to all of it
- a partition is itself a void:Dataset and can be described using any of the
usual properties
- list more examples why we would want to have partitions: different
provenance, different
publication dates, different SPARQL endpoints, different dumps, different
vocabularies used,
different topic
- probably also worth mentioning that the same mechanism can also be used to
*aggregate*
datasets
Original issue reported on code.google.com by [email protected]
on 2 Feb 2009 at 1:31
Pointed out by Simon Reinhardt on public-lod:
The example void:features in section 1.5 use dcterms:format in a poor way. It's
probably more
intended to be used on documents (rather than on features), and its object
should be a resource
rather than a literal.
We should probably change the example to something a bit more neutral, and make
clearer that it's
just an example and not something that everyone should do.
We should work towards collecting things that people want to express as
void:features, and maybe
include a list of predefined features in the next voiD version.
Original issue reported on code.google.com by [email protected]
on 30 Jan 2009 at 4:59
void:linkPredicate is currently only mentioned in passing. I'd like to see a
fuller discussion.
- What does it mean if no linkPredicate is stated for a linkset?
- What does it mean if a linkPredicate is stated?
- What does it mean if several linkPredicates are stated?
What can a consumer of the data expect in each of the cases above?
Original issue reported on code.google.com by [email protected]
on 10 Nov 2009 at 2:13
VoiD should have the ability to express that a dataset contains URIs of a
certain shape. For
example, DBpedia has resources of this shape:
http://dbpedia.org/resource/*
Knowing this can be useful to find datasets that contain information about a
given URI. Note that
Semantic Sitemaps can express this using the <sc:linkedDataPrefix> element.
This is especially
useful for integrating several SPARQL endpoints, because knowing which SPARQL
endpoint has
information about what URIs can save a lot of processing resources.
Proposal:
:MyDataset a void:Dataset;
void:uriPattern "^http://example\.com/data/".
The value of void:uriPattern would be a regular expression. Making this a regex
has one nice advantage. Assuming we want to know which dataset contains
information about a resource http://example.com/data/12341234, we can ask
SPARQL queries like this:
SELECT ?dataset
WHERE {
?dataset a void:Dataset .
?dataset void:uriPattern ?pattern .
FILTER(REGEX("http://example.com/data/12341234", ?pattern))
}
This will return all datasets matching the URI.
Two other possibilities would be:
void:uriPattern "http://example.com/data/{id}";
void:uriPrefix "http://example.com/data/";
The first one uses URI Templates (see [1]), the second one is a much simpler
prefix-based
solution.
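To compare the regex and prefix variants outside of SPARQL, here is a small Python sketch. The dataset names and patterns are made up for illustration, and re.search stands in for SPARQL's REGEX, which is likewise unanchored unless the pattern says otherwise.

```python
import re

# Hypothetical voiD metadata: dataset -> (regex pattern, plain prefix).
DATASETS = {
    ":MyDataset": (r"^http://example\.com/data/", "http://example.com/data/"),
    ":OtherDataset": (r"^http://example\.org/id/", "http://example.org/id/"),
}


def datasets_by_regex(uri):
    # Unanchored search; the "^" inside each pattern pins it to the start.
    return sorted(name for name, (pattern, _) in DATASETS.items()
                  if re.search(pattern, uri))


def datasets_by_prefix(uri):
    # The simpler variant: a plain string-prefix test, no regex engine needed.
    return sorted(name for name, (_, prefix) in DATASETS.items()
                  if uri.startswith(prefix))


print(datasets_by_regex("http://example.com/data/12341234"))
print(datasets_by_prefix("http://example.com/data/12341234"))
```

For patterns of this shape both variants select the same datasets; the regex form only pays off when the URI structure cannot be captured by a prefix.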
[1] http://bitworking.org/news/URI_Templates
Original issue reported on code.google.com by [email protected]
on 11 Dec 2008 at 10:25
To check if it makes sense to enable LRDD-based discovery for voiD, based
on [1] and [2].
[1] http://linkeddata.deri.ie/tr/2009-discovery
[2] http://uldis.deri.ie/
Original issue reported on code.google.com by Michael.Hausenblas
on 13 Aug 2009 at 10:17
Section 1 of the Guide says: "Note: It is assumed that the intersection of all
resources defines the
topic." And it gives the following example:
:DBLP a void:Dataset;
dcterms:subject <http://dbpedia.org/resource/Computer_science> ;
dcterms:subject <http://dbpedia.org/resource/Journal> ;
dcterms:subject <http://dbpedia.org/resource/Proceedings> .
Stuart Williams commented on this:
| Conjunctive use of dcterms:subject... whilst I think I understand the
| pragmatic appeal, given the example I think that the intersection of
| Computer Science (some conceptual domain of study/investigation); a Journal
| (a form of publication); and Proceedings (a different form of publication
| usually arising from a workshop or conference and IIUC disjoint with
| Journals); is empty. Yes I know that's very anal (and maybe big 'O'ist) of
| me. I think that you have several dimensions squeezed into one - here
| computer science truly is a subject domain, but journal and proceeding
| really are more modes or categories of publication than being subject
| domains.
I think he's right. I don't remember why we put that sentence in there. I would
prefer to simply
strike it.
Original issue reported on code.google.com by [email protected]
on 27 Aug 2009 at 8:44
Raised by: [http://rdf.ecs.soton.ac.uk/person/21 Hugh Glaser]
Content: backlink from any part of the dataset to the voiD description
See also:
Original issue reported on code.google.com by Michael.Hausenblas
on 12 Jun 2008 at 1:09
I was thinking that void:Dataset might get sub-classed to dcmitype:Dataset
(see <http://dublincore.org/documents/dcmi-type-vocabulary/>) *but* -
sometimes it may not be a true dataset but just a service
(dcmitype:Service).
See, there are different kinds of providers of Linked Data at the moment
(which might be defined as sub-classes of void:Dataset):
- Those that have RDF stored somewhere and just export it. The storage
might be a native RDF store or a relational database (which doesn't even
have to store the RDF directly but may convert on-the-fly from a regular
schema). Data may be produced locally or come from somewhere else, like a
dump of data that is to be exposed as Linked Data.
- Those that call a Web service and wrap its response as RDF on-the-fly.
- Those that call a Web service and cache the results for later requests.
I'm not sure what value there is in defining these sub-classes. Describing
the costs of making certain calls (in the case of wrappers) or the
up-to-dateness of the data (in the case of RDFised dumps) might hold more value.
But the point is that a wrapper is actually a service and not a dataset
itself (then again DBpedia is a service which exposes their dataset).
Thoughts?
Original issue reported on code.google.com by [email protected]
on 19 Jan 2009 at 10:54
As per my discussion on #swig:
http://chatlogs.planetrdf.com/swig/2009-05-27.html#T16-08-26
Original issue reported on code.google.com by [email protected]
on 27 May 2009 at 3:41
Raised by: [http://moustaki.org/foaf.rdf#c4dm Yves Raimond]
Content:
It should be possible to describe the content of a SPARQL end-point
(proposal add a void:example property).
See also:
* http://community.linkeddata.org/MediaWiki/index.php?MetaLOD#Requirements
* http://blog.dbtune.org/post/2008/06/12/Describing-the-content-of-RDF-datasets
Original issue reported on code.google.com by Michael.Hausenblas
on 12 Jun 2008 at 1:03
A property void:changeFrequency could give an estimate of update times,
just like <changefreq> in sitemaps
<http://www.sitemaps.org/protocol.php#changefreqdef> (which is also used in the
semantic sitemap extension: <http://sw.deri.org/2007/07/sitemapextension/#xml-tags>).
Maybe this should even be a scovo:Dimension to be used for void:statItem?
Someone please set this as an enhancement request.
Original issue reported on code.google.com by [email protected]
on 20 Jan 2009 at 9:19
The idea was mentioned to me by Alex Passant. It's something we should explore,
especially since
aggregates are going to be part of SPARQL2.
Original issue reported on code.google.com by [email protected]
on 10 Feb 2009 at 2:07
The licensing section of the VoiD Guide lists this URI for the GFDL:
http://www.gnu.org/copyleft/fdl.html
This should be:
http://www.gnu.org/licenses/fdl.html
A link to the list of all GNU licenses might also be a useful addition:
http://www.gnu.org/licenses/
Original issue reported on code.google.com by [email protected]
on 18 Jun 2009 at 11:10
This issue explores alignment between voiD and the DARQ Service Description
ontology (DOSE –
Description Of a SErvice).
An example DOSE description:
http://darq.sourceforge.net/index.html#Service_Descriptions
The DOSE ontology definition (incomplete, e.g. doesn't have selectivity):
http://darq.svn.sourceforge.net/viewvc/darq/darq/trunk/Schema/dose.n3?revision=9&view=markup
=== SUMMARY OF DOSE ===
The main item in a DOSE description is an sd:Service, which is a SPARQL service
(the ontology
comment suggests that it could be renamed to sd:Endpoint).
An endpoint can have these properties: sd:url (as a literal), sd:isDefinitive
(not clear what it
means), sd:totalTriples, and a number of sd:Capabilities and
sd:RequiredBindings.
An sd:Capability states that the service can process queries that include a
certain triple pattern.
A capability can be described in more detail using these properties:
sd:predicate (predicate of the
triple pattern), sd:triples (how many of the triples exist), sd:sofilter
(SPARQL FILTER expression
involving ?subject and ?object that holds for all triples; typically used to
constrain URIs using
regexes), sd:subjectSelectivity and sd:objectSelectivity (selectivity if the
subject/object is known).
sd:RequiredBindings come in two flavours, subjectBinding and objectBinding. A
subjectBinding
instance declares that a certain property may only occur with a fixed subject
(that is, no variable)
in the query.
=== ALIGNMENT VOID/DOSE ===
The simplest alignment would be to state that the object of void:sparqlEndpoint
is an sd:Service.
When a voiD description includes a SPARQL endpoint, then further DOSE
descriptions can simply
be added using the endpoint URI.
Another option would be to add triple pattern statistics directly to voiD. A
dataset could have
TripleSets. They would be something similar to a LinkSet, but not requiring
external URIs. A triple
set could have: predicate, number of triples, URI templates for subject and
object, selectivity
information (modelled as another stat?). Adding a RequiredBindings mechanism to
voiD seems a
bit more difficult since it's really a property of the SPARQL access mechanism
and not of the
dataset as such.
:myDataset void:contains [
a void:TripleSet;
void:predicate foaf:knows;
void:subjectTemplate "http://sn.example.com/user/{username}";
void:objectTemplate "http://sn.example.com/user/{username}";
void:statItem [
scovo:dimension void:numberOfTriples;
rdf:value 25432;
];
void:statItem [
scovo:dimension void:objectSelectivity;
rdf:value 0.00002;
];
] .
Original issue reported on code.google.com by [email protected]
on 11 Jul 2008 at 7:11
Which URI exactly do I use for any given vocabulary?
We say, the one that's the object of isDefinedBy triples for the vocab terms,
but often isDefinedBy is
not used in real-world vocabularies.
Should we say "downloadable location"? Should we say "namespace URI"? What
about trailing
hashes, leave them or remove them?
I would prefer having some really clear guidance in the Guide.
Original issue reported on code.google.com by [email protected]
on 10 Nov 2009 at 12:26
In the current examples, owl:sameAs is used to link a dataset to its website.
Yet, I think it would be better to make a difference between the dataset itself
and the website /
homepage that hosts / gives information about it.
foaf:topic is a solution (domain = owl:Thing / range = foaf:Document), but as
this is an IFP, we
cannot have 2 different datasets linked to dbpedia.org for example. It would
imply being careful
with that and using http://wiki.dbpedia.org/Downloads30#titles rather than
http://dbpedia.org to
make the link for each 'subdataset' hosted by dbpedia.
Original issue reported on code.google.com by [email protected]
on 13 Jun 2008 at 7:34
As pointed out by Jacco van Ossenbruggen on 2009-02-11:
'One of the key things I would like to see added is some simple versioning.
For interlinking, statistics and publishing this seems a crucial feature.
If you could only make the version of a Dataset explicit by adding a
void:version datatype property, this would also allow you to say
that, for example, a Linkset interlinks two specifically versioned
Datasets, or that a Dataset uses a specific version of a vocabulary, etc.'
Original issue reported on code.google.com by Michael.Hausenblas
on 11 Feb 2009 at 10:55
As recently discussed on #swig [1] one could 'Link header to point from the
RDF document response to the void description' (via Shepard).
Maybe we should address this in the voiD guide (eg in section 4. Consuming
Process)
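A consumer-side sketch of this idea, in Python. It assumes the server sends an HTTP Link header pointing at the voiD description and that the relation type is "describedby"; both the relation and the example URLs are assumptions for illustration, since no relation type is settled here. The parser is deliberately minimal and not a full implementation of the Link header grammar.

```python
import re

# One link-value: <target> followed by zero or more ";param" parts.
LINK_RE = re.compile(r'<([^>]*)>\s*((?:;\s*[^;,]+)*)')


def find_links(header, rel):
    """Return the targets in a Link header value that carry the given rel."""
    targets = []
    for target, params in LINK_RE.findall(header):
        m = re.search(r'rel\s*=\s*"?([^";]+)"?', params)
        # rel may hold several space-separated relation types
        if m and rel in m.group(1).split():
            targets.append(target)
    return targets


# Hypothetical header as it might accompany an RDF document response.
header = ('<http://example.com/void.ttl>; rel="describedby"; type="text/turtle", '
          '<http://example.com/page2>; rel="next"')
print(find_links(header, "describedby"))
```

A client would fetch the returned target and look for the void:Dataset description in it.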
Related:
+ http://esw.w3.org/topic/FindingResourceDescriptions
+ http://www.mnot.net/drafts/draft-nottingham-http-link-header-03.txt
[1] http://chatlogs.planetrdf.com/swig/2009-01-11#T12-38-19
Original issue reported on code.google.com by Michael.Hausenblas
on 11 Jan 2009 at 1:00
A discussion on the public-lod mailing list [1] raises the issue that linked
datasets benefit greatly in usability by having full text search over their
data.
Many datasets already provide this feature (with some notable exceptions),
and we should provide a way to include the search service of a dataset in
its voiD description.
One point to bear in mind may be that the service might be external to the
dataset itself - for example, if the dataset is deployed as RDFa, Google
could be used to provide the search service, using the site: syntax for
restricting results to a particular domain.
It may be worth looking at http://schemas.talis.com/2005/service/schema#
which provides generic terms for describing web services.
I'd probably like something more obvious and specific though - subclassing
from void:feature perhaps. eg: void:textSearchService
[1] http://lists.w3.org/Archives/Public/public-lod/2009Feb/0058.html
Original issue reported on code.google.com by [email protected]
on 9 Feb 2009 at 8:58
VoiD should be able to express which vocabulary is used in a dataset. Example
use cases would
be queries like:
“Give me datasets that contain FOAF data.”
“Does MusicBrainz use the Music Ontology?”
“Which vocabulary or ontology is used by the largest number of datasets?”
This information is also useful for tools like query builders, they can use it
to pre-populate fields
for selecting classes and properties.
We already have the void:linkPredicate property for linksets, but that has a
much more narrow
scope and doesn't address the use cases above.
Proposal 1:
Have a new property void:vocabulary with domain void:Dataset and range
owl:Ontology. It links a
dataset to a vocabulary. The vocabulary would be something like the formal FOAF
spec, that is,
the object of rdfs:isDefinedBy for the FOAF terms.
Proposal 2:
Maybe we want to list not just used vocabularies, but something more
fine-grained: some of the
classes and properties used in the dataset? I guess this will never be an
exhaustive list, but only
the most important/frequent ones:
void:prominentClass
void:prominentProperty
Original issue reported on code.google.com by [email protected]
on 4 Dec 2008 at 12:32
noticed a few typos and so on that I'd like to fix
Original issue reported on code.google.com by [email protected]
on 28 May 2009 at 8:28
We are aware of the following issues with the statistics mechanism as it stands
today:
1. In SCOVO, scovo:Items are grouped into scovo:Datasets, and there seems to be
an implicit
assumption that all items in such a dataset share the same dimensions. As
described here, we
attach items directly to a void:Dataset, which leads to mixing of items of
different dimensionality.
On the other hand, the correct SCOVO modelling would lead to awkwardly complex
notation for
simple statistics.
2. We encourage the use of classes and properties in places where SCOVO
requires an instance of
scovo:Dimension. This breaks the symmetry of the SCOVO model. SCOVO would
require us to
create a scovo:Dimension for each class or property. This would be quite
verbose.
3. Because of the issues above, SPARQLing for statistics can be awkward. It
will often require a
verbose check to make sure that an item has only certain dimensions and no
others.
Two possible approaches for fixing these issues:
1. Adapt SCOVO to better suit our needs, e.g. by making it a bit less verbose
(esp. around
definition of dimensions), making it easier to query (e.g.
scovo:numberOfDimensions on the
dataset, scovo:domainObject property for connecting dimensions to the domain,
removing the
subclassing of dcterms:Event) (downside: will still be verbose)
2. Create a new mechanism based on simple properties like void:numberOfTriples
and having a
powerful mechanism for specifying void:subsets (downside: how to do attribution
of statistics
with this is completely unclear)
We decided not to take action on those issues until after the first release of
the Guide.
Original issue reported on code.google.com by [email protected]
on 14 Jan 2009 at 9:10
Is the sitemaps ontology developed prior to sitemaps.xml usable? How do we
link void:Dataset to data access points (SPARQL endpoints, REST interfaces,
etc)?
Original issue reported on code.google.com by [email protected]
on 1 Jul 2008 at 1:42
Should voiD 2.0 define a list of “standard” feature URIs that people can
just use so they don't
need to define their own?
Example:
void:ContentNegotiation a void:Feature .
Ideally, such a list would be composed from examples found in the wild but I
don't know if
void:Feature is used at all out there in the wild.
A starting point might be to trawl the Linking Open Data wiki pages for dataset
descriptions such
as "It supports X, Y and Z". Or to look at the homepages of some datasets for
similar descriptions.
Original issue reported on code.google.com by [email protected]
on 27 Aug 2009 at 1:42
From: "Williams, Stuart (HP Labs, Bristol)"
Date: Mon, 23 Feb 2009 16:22:28 +0000
Subject: voID
Hello Richard, Michael,
Just had a rapid read through the voID Guide
(http://rdfs.org/ns/void-guide) and thought that I'd offer some comments
for whatever they may be worth...
I think that it would be useful to comment in the difference between the
SPARQL conceptualisation of dataset (a default graph and collection of
named graphs) and the voID conceptualisation of a dataset (which I think is
a single graph - though
Section 1:
Conjunctive use of dcterms:subject... whilst I think I understand the
pragmatic appeal, given the example I think that the intersection of
Computer Science (some conceptual domain of study/investigation); a Journal
(a form of publication); and Proceedings (a different form of publication
usually arising from a workshop or conference and IIUC disjoint with
Journals); is empty. Yes I know that's very anal (and maybe big 'O'ist) of
me. I think that you have several dimensions squeezed into one - here
computer science truly is a subject domain, but journal and proceeding
really are more modes or categories of publication than being subject
domains. I certainly think that the range of dcterms:subject should be
something like skos:Concept (not looked at skos of late). But I think that
composite subjects are hard.
Section 2:
The 'subset' property could do with being renamed 'hasSubset' or
'isSubSetOf' - I think that the sense of it is the former, but at least for
me the directionality does not stick in my memory for long.
The diagram at the start of section 2 is actually a little confusing. It
looks like it presents two datasets :DS1 and :DS2 (each being a collection
of statements) and that each dataset 'contains' a named subset :LS1 and LS2
respectively, of linksets - in the example expressing populations of links
using foaf:knows, rdfs:seeAlso and owl:sameAs properties. However, as the
later examples unfold, :LS1 and :LS2 are not 'subgraphs' of their
respective graphs, they are (optionally) named linkset resources that act
as statement subjects for some statements describing *a* particular
linkset. Indeed even for the example illustrated there are (or would be)
three defined linkset nodes (two in :DS1 and one in :DS2) and the regions
that are demarcated as :LS1 and :LS2 don't exist quite as presented (AFAICT).
Section 3:
I don't quite understand how you could attribute a value to
void:numberOfDocuments. Taking a deliberately obtuse stand, a dataset
contains triples, numbers of distinct subjects and objects makes sense,
but numbers of documents - doesn't seem to me to be a dimension of such a
dataset.
In the voID ontology, void:LinkSet is defined to be a subclass of
void:Dataset which gives some syntactic convenience in the re-use of
statistical properties (and probably some others too) but I'm not convinced
that ontologically a :LinkSet is a subclass of a :Dataset - particularly in
the form given where an instance of :LinkSet really can only establish that
a single given property is used to link between a pair of :Dataset.
Section 5.1:
Hmmm... lots of scope for confusion.
<document.rdf> dcterms:isPartOf <void.ttl#MyDataset> .
Kind of curious, having previously established sparql, uriLookup and dump
endpoints, why one would be remotely interested in <document.rdf> as being
a part of the dataset (unless separately it was a dataset in its own right
with its own set of endpoints etc).
Original issue reported on code.google.com by Michael.Hausenblas
on 4 Mar 2009 at 12:13
MetaVocab is a proposed “vocabulary for describing vocabularies”, dating back to 2002. It has found fairly widespread use because it is a documented module for RSS 1.0 feeds:
http://web.resource.org/rss/1.0/modules/admin/
and because it is used in the popular FOAF-a-matic. The vocabulary itself should be defined here, but seems to be a victim of bitrot:
http://webns.net/mvcb/
The two key terms of the vocabulary are generatorAgent (pointing to a URI that identifies the software that generated an RDF document) and errorReportsTo (usually a mailto: URI for contacting the webmaster responsible for an RDF document).
GeneratorAgent triples could be attached to void:Datasets and void:Linksets to point to the software that generated the dataset. ErrorReportsTo could also be provided with a linkset as a nice shortcut for getting in touch with the publisher. We could put examples into the Guide.
On the other hand, the vocabulary appears to suffer from some bitrot and isn't properly published according to best practice guidelines. Therefore, it might be better to first contact the authors and get them to fix it, or just copy the terms into our own namespace.
Original issue reported on code.google.com by [email protected]
on 18 Jul 2008 at 12:06
One wants not only a functional description but also to know about the
quality of a dataset. How are we gonna deal with this? Use a review
vocabulary such as http://hyperdata.org/xmlns/rev/ or invent new entities
for it?
Original issue reported on code.google.com by Michael.Hausenblas
on 23 Jan 2009 at 10:55
The Guide and vocabulary documentation leave this question unclear: Does the URI pattern "http://example.com/" match only the single URI "http://example.com/", or does it match any URI containing the string, e.g., "http://example.com/myresource"?
This should be stated explicitly, and all examples should clearly communicate this.
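The two readings in question can be made concrete with a small Python sketch. This is only an illustration of the ambiguity, not a statement of what voiD specifies: the function names are hypothetical, and treating the pattern as a regular expression with its special characters escaped is an assumption made for the example.

```python
import re


def matches_whole_uri(pattern: str, uri: str) -> bool:
    """Reading 1: the pattern must match the entire URI."""
    return re.fullmatch(pattern, uri) is not None


def matches_anywhere(pattern: str, uri: str) -> bool:
    """Reading 2: the pattern may match any part of the URI."""
    return re.search(pattern, uri) is not None


# Assumption: the pattern is the literal string from the issue, escaped so
# that "." is not interpreted as a regex wildcard.
pattern = re.escape("http://example.com/")

for uri in ("http://example.com/", "http://example.com/myresource"):
    print(uri, matches_whole_uri(pattern, uri), matches_anywhere(pattern, uri))
```

Under the first reading only "http://example.com/" itself matches; under the second, "http://example.com/myresource" matches as well. Whichever reading is chosen, the documentation should say so.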
Original issue reported on code.google.com by [email protected]
on 17 Nov 2009 at 3:25
http://dublincore.org/groups/collections/collection-application-profile/
This describes an interesting meta-model for collections, has some terms that could be useful to align to, and is also interesting from an editorial POV because the voiD Guide is, in some sense, also an “application profile” of several vocabularies.
Original issue reported on code.google.com by [email protected]
on 2 Feb 2009 at 1:12
Some datasets are actively seeking data. How can I indicate what my dataset
will accept, and how to submit it? RDF Forms?
Original issue reported on code.google.com by [email protected]
on 19 Jun 2008 at 5:22
We have void:exampleResource to give example URIs for a dataset. Now I have a scenario where I want examples for linksets. How to do that? The items in a linkset are triples, not simple resources, so it's not so easy.
The simplest answer would be: Just take your example triple and give either the subject or the object (depending on what you consider the better example) with void:exampleResource. For linksets that are hosted inside a larger dataset, probably the resource which is within the DS should be given, not the target.
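As a rough illustration of that rule, here is a minimal Python sketch. Everything in it is hypothetical: the function name, the example URIs, and the assumption that membership in the hosting dataset can be decided by a simple URI prefix check.

```python
from typing import Tuple

# (subject, predicate, object), all given as URI strings
Triple = Tuple[str, str, str]


def pick_example_resource(triple: Triple, hosting_prefix: str) -> str:
    """Return the end of an example link that lies inside the hosting dataset.

    Falls back to the subject when the prefix check does not single out
    the object (neither end matches, or both do).
    """
    subject, _predicate, obj = triple
    if obj.startswith(hosting_prefix) and not subject.startswith(hosting_prefix):
        return obj
    return subject


# Hypothetical example link from a linkset hosted in http://example.org/
example_link = (
    "http://example.org/people/alice",
    "http://www.w3.org/2002/07/owl#sameAs",
    "http://other.example.net/id/alice",
)

print(pick_example_resource(example_link, "http://example.org/"))
```

The returned URI is the one that would be given as the value of void:exampleResource on the linkset.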
If this design is acceptable, then we should briefly document it in the next Guide.
Original issue reported on code.google.com by [email protected]
on 15 Feb 2009 at 11:57
[deleted issue]
Looking at the SIOC ontology, two classes might be of interest for voiD:
sioc:Space: A Space is defined as being a place where data resides. [...]
I guess it would make sense to subclass void:DataSet from it.
siocs:Service (http://rdfs.org/sioc/services#Service): A Service is a web service associated with a Site or part of it.
Actually it says 'Site', but the related siocs:has_service property has no range, so a void:SparqlEndpoint could be a subclass of it, linked to the DataSet using siocs:has_service.
There are also service_endpoint / service_protocol properties that could be used.
Original issue reported on code.google.com by [email protected]
on 13 Jun 2008 at 7:25
These are requirements from Andy Gibson, as quoted below.
{{{
I'm now at the point where I want to describe the relationship between:
1) A graph (dataset) of asserted triples
- lets say a skos vocabulary
2) (optional) A graph containing some 'semantics'
- like an OWL ontology, which may of course be a part of 1)
- lets say I've extended SKOS with some of my own Classes / Properties
and I'm inferring broader/narrower relationships
3) A graph of inferred triples obtained by some reasoning process
- I can combine 1) with several different 2)s and get very different 3)s
This would allow me to effectively add metadata to a graph of *inferred*
triples about the dataset from which they were derived, what semantics were
applied to generate them and which reasoner was used etc. Crucially, I
would be able to find out if a dataset has been through any sort of
reasoning process, including any reasoning process inherent in a SPARQL
endpoint.
Like I said, this seems to me to be in the scope of VoID, and thought it
was worth raising. Would be glad to hear your thoughts. If you think its
valid perhaps you could forward it to the others.
Best regards,
Andrew
}}}
Original issue reported on code.google.com by [email protected]
on 13 Aug 2009 at 8:45