
Comments (11)

wetneb commented on June 3, 2024

Yes, totally - for instance, people have very uninformative types (Q5) so it would be better if we could use "occupation" or "position held" in some cases… For cases like these, I think we could try to revive Freebase's "query-based reconciliation", or take some inspiration from the SPARQL-based reconciliation services of the RDF extension… The standard reconciliation service API is useful for its simplicity and efficiency, but there is no hope to make it general enough to handle all cases.

from openrefine-wikibase.


thadguidry commented on June 3, 2024

@ettorerizza Well, that is because the query does not ask the question you're trying to ask, and that depends on what you're really trying to ask. My hunch is that you're asking for "populated classes", in other words, classes/types in Wikidata that actually have recommended properties that could be filled in?

That would be this query:
https://query.wikidata.org/#SELECT%20DISTINCT%20%3Fitem%20%3FitemLabel%20%3Fproperties_for_this_type%20%3Fproperties_for_this_typeLabel%20WHERE%20%7B%0A%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22%5BAUTO_LANGUAGE%5D%2Cen%22.%20%7D%0A%20%20OPTIONAL%20%7B%20%3Fitem%20wdt%3AP1963%20%3Fproperties_for_this_type.%20%7D%0A%7D%20ORDER%20BY%20%3Fitem%0A

This is a shortened listing of all subclasses of a class. Again, potentially anything can be placed in a bucket type (a bucket type is a type that may or may not have any corresponding populated properties):
https://query.wikidata.org/#SELECT%20%3Fsubclass%20%3FsubclassLabel%20%3Fclass%20%3FclassLabel%20WHERE%20%7B%0A%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22%5BAUTO_LANGUAGE%5D%2Cen%22.%20%7D%0A%20%20OPTIONAL%20%7B%20%3Fsubclass%20wdt%3AP279%20%3Fclass.%20%7D%0A%7D%0ALIMIT%20100
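For readers who would rather see the query text than the encoded link, here is a minimal Python sketch that recovers the SPARQL from the subclass link above. The helper name `sparql_from_wdqs_url` is made up for illustration; the only assumption is that the query is stored percent-encoded in the URL fragment, as it is in these links.

```python
from urllib.parse import unquote, urlsplit


def sparql_from_wdqs_url(url: str) -> str:
    """Decode the SPARQL text kept in the '#...' fragment of a query.wikidata.org link."""
    return unquote(urlsplit(url).fragment)


# The subclass listing link from this comment, split across lines for readability.
SUBCLASS_URL = (
    "https://query.wikidata.org/#SELECT%20%3Fsubclass%20%3FsubclassLabel%20"
    "%3Fclass%20%3FclassLabel%20WHERE%20%7B%0A%20%20SERVICE%20wikibase%3Alabel%20"
    "%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22%5BAUTO_LANGUAGE%5D%2Cen%22.%20%7D%0A"
    "%20%20OPTIONAL%20%7B%20%3Fsubclass%20wdt%3AP279%20%3Fclass.%20%7D%0A%7D%0ALIMIT%20100"
)

if __name__ == "__main__":
    # Prints the query as it would appear in the query service editor.
    print(sparql_from_wdqs_url(SUBCLASS_URL))
```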


wetneb commented on June 3, 2024

Wikidata's structure does not make it very easy as there is no clear-cut distinction between instances and classes: it is possible for some concept to be an instance of something and a subclass of something else.

One thing that was suggested to me would be to automatically add a special type ("class") to all items that have a "subclass of" (P279) statement. In this way you could reconcile only against this special type. I think that would work pretty well but that would not let you filter these classes to a particular domain.
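As a rough illustration of that suggestion, the sketch below computes the types an item would be indexed under. Everything here is hypothetical: the `CLASS_MARKER` value, the function name, the plain-dict representation of an item's statements, and the placeholder item id are assumptions for the example, not how the reconciliation service is actually implemented.

```python
# Hypothetical synthetic type; it does not correspond to a real Wikidata item.
CLASS_MARKER = "class"


def reconciliation_types(claims: dict) -> set:
    """Return the types to index an item under.

    `claims` maps property ids to lists of target item ids (assumed shape).
    The item keeps its "instance of" (P31) values and additionally gets the
    synthetic marker when it has at least one "subclass of" (P279) statement.
    """
    types = set(claims.get("P31", []))
    if claims.get("P279"):
        types.add(CLASS_MARKER)
    return types


if __name__ == "__main__":
    # "Q_PARENT" is a placeholder id, not a real item.
    print(reconciliation_types({"P31": ["Q5"]}))
    print(reconciliation_types({"P279": ["Q_PARENT"]}))
```

An item that is both an instance of something and a subclass of something else would carry both kinds of types, which matches the lack of a clear-cut distinction described above.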


ettorerizza commented on June 3, 2024

> One thing that was suggested to me would be to automatically add a special type ("class") to all items that have a "subclass of" (P279) statement.

I don't know the Wikidata API very well, but I suppose there is no way to filter entities by whether a property exists or not? (e.g. "skyphos" NOT EXISTS P31 | EXISTS P279)


wetneb commented on June 3, 2024

This is definitely doable in SPARQL, but that would be too expensive an API call for the reconciliation endpoint. Even setting performance aside, the main question for me is how we would expose that to the user.
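To make the SPARQL side concrete, here is a minimal sketch that builds such a query as a string: items with a given label that have a "subclass of" (P279) statement but no "instance of" (P31) statement. The function name and its parameters are made up for the example, it matches labels exactly rather than fuzzily, and it assumes the `rdfs:` and `wdt:` prefixes are predefined by the endpoint, as they are on the Wikidata Query Service. It only constructs the text and does not send anything.

```python
def class_only_query(label: str, lang: str = "en", limit: int = 10) -> str:
    """Build a SPARQL query for items labelled `label` that have P279 but no P31."""
    # Escape backslashes and double quotes so the label is a valid string literal.
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "SELECT ?item WHERE {\n"
        f'  ?item rdfs:label "{escaped}"@{lang} .\n'
        "  FILTER EXISTS { ?item wdt:P279 ?parent . }\n"
        "  FILTER NOT EXISTS { ?item wdt:P31 ?type . }\n"
        "}\n"
        f"LIMIT {limit}"
    )


if __name__ == "__main__":
    print(class_only_query("skyphos"))
```

Running one query of this kind per cell to reconcile is where the cost mentioned above comes from.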


ettorerizza commented on June 3, 2024

A sort of filter?

[Screenshot: 127.0.0.1:3333, 2018-04-08 16:13:44]


wetneb commented on June 3, 2024

Hmm… I'm not sure: this would not make it possible to reconcile to particular subclasses of a given class. Also, ideally I would prefer not to interfere too much with the existing reconciliation API.


ettorerizza commented on June 3, 2024

You may be right; I admit I have not thought much about it. My intuition was just this: now that we can enrich the data matched with Wikidata, it could be useful to filter in advance for the items that have the properties that interest us (population, geographical location...), which would avoid useless API calls. In other words, "match these names with Wikidata only if I can then get this or that property".

This would also allow a finer reconciliation. We cannot always rely on P31 and P279 in Wikidata, as they are too often misused. But as a general rule of thumb, a human must have at least one "given name" (P735).
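A minimal sketch of what I mean, in Python. The candidate structure (a dict with an `id` and a list of `properties`), the function name, and the placeholder ids are all invented for the example and say nothing about how the service represents candidates internally.

```python
def keep_candidates_with(candidates: list, required: list) -> list:
    """Keep only candidates that carry every property id listed in `required`."""
    needed = set(required)
    return [c for c in candidates if needed <= set(c["properties"])]


if __name__ == "__main__":
    # Placeholder ids, not real items.
    candidates = [
        {"id": "Q_A", "properties": ["P31", "P735"]},
        {"id": "Q_B", "properties": ["P31"]},
    ]
    # Only Q_A has a "given name" (P735) statement, so only Q_A is kept.
    print(keep_candidates_with(candidates, ["P735"]))
```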


ettorerizza commented on June 3, 2024

@thadguidry Weird that Q5 doesn't appear in the first results of your SPARQL query: http://tinyurl.com/yajxhnzc

What's wrong with it?


wetneb commented on June 3, 2024

There is no simple way to tell those things apart in Wikidata's modelling so let's close this.

