
entity-explosion's People

Contributors

99of9, connorshea, quiddity-wp, samwilson


entity-explosion's Issues

Developer edition: How can we better support developers with Entity Explosion

One possibility with Entity Explosion is to help developers see the JSON available for a source. Today I cheat by using Property:P3303 "third-party formatter URL".

Is there a better solution?

E.g.

  • Property:P4819 "Swedish Portrait Archive"

One example is how WikiAPIConnector is designed, where Andrew Lih is looking into defining different GLAM APIs in YAML, i.e. there is a need for easier access to the API...


Set up P6821 Alvin correctly

Alvin is Uppsala University's platform; its Wikidata property is Property:P6821.

Is there any way to set up the property better to support this?

  • We have the format as a regular expression: ^alvin-(record|person|place|organisation|work):[1-9]\d{0,6}$
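To make the format constraint concrete, here is a minimal Python sketch that checks candidate identifiers against the regular expression quoted above. The function name and the sample identifiers are made up for illustration; they are not taken from Alvin or from the addon.

```python
import re

# Format constraint quoted above for Property:P6821.
ALVIN_ID = re.compile(r"^alvin-(record|person|place|organisation|work):[1-9]\d{0,6}$")


def is_valid_alvin_id(value: str) -> bool:
    """Return True when the whole string satisfies the format constraint."""
    return ALVIN_ID.fullmatch(value) is not None


# Made-up identifiers, for illustration only.
for candidate in ["alvin-person:12345", "alvin-record:0123", "alvin-image:42"]:
    print(candidate, is_valid_alvin_id(candidate))
```

Note that the constraint only accepts the five listed record types and a numeric part of one to seven digits with no leading zero.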

**Wikicommons**: can we support them?

performance issue: matchpattern_match blocks the browser for more than 1s at every URL change

Environment

  • addon version: v0.9.1
  • browser: Firefox v102.0

Issue

The addon blocks the browser for more than 1s at every URL change, as matchpattern_match runs its 5000+ regexes on the new URL. From the addon logs:

matchpattern_match false on about:addons took 1345ms

It might be that some of those regexes are particularly slow, or simply that there are too many of them to run quickly.
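One possible mitigation, sketched here in Python rather than in the addon's own code, is to index the compiled patterns by hostname so that only a handful of regexes run per URL, and to skip non-HTTP URLs such as about:addons entirely. The rows, labels, and function name below are hypothetical, and this assumes each pattern can be tied to a single host, which may not hold for all of the real data.

```python
import re
from collections import defaultdict
from urllib.parse import urlsplit

# Hypothetical (host, label, pattern) rows; the real data has 5000+ entries.
ROWS = [
    ("www.deezer.com", "deezer-album",
     r"^https?://www\.deezer\.com/(?:[a-z]{2}/)?album/([1-9]\d+)"),
    ("example.org", "example", r"^https?://example\.org/id/(\d+)"),
]

# Build the index once at startup, compiling each regex a single time.
PATTERNS_BY_HOST = defaultdict(list)
for host, label, pattern in ROWS:
    PATTERNS_BY_HOST[host].append((label, re.compile(pattern)))


def match_url(url: str):
    """Return (label, identifier) for the first matching pattern, else None."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return None  # internal pages need no regex work at all
    for label, regex in PATTERNS_BY_HOST.get(parts.hostname or "", []):
        found = regex.match(url)
        if found:
            return label, found.group(1)
    return None


print(match_url("about:addons"))
print(match_url("https://www.deezer.com/fr/album/119673702"))
```

Whether this removes the delay would need to be measured; if a few individual regexes are pathologically slow, they would still have to be found and fixed separately.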

Use P8966, "URL match pattern", for a more inclusive entity match finder

The context

I have noticed that on Deezer, when visiting the page of an album, the URL contains the language the page is being displayed in. For example, all of the following URLs correspond to the album "Everyday Life" by Coldplay in the Deezer database. The Wikidata item that corresponds to this album is Q72087708.

https://www.deezer.com/us/album/119673702
https://www.deezer.com/ru/album/119673702
https://www.deezer.com/mx/album/119673702
https://www.deezer.com/es/album/119673702
https://www.deezer.com/fr/album/119673702
https://www.deezer.com/en/album/119673702

The problem

At the time of writing, the current revision of the property contains the following values for P1630, "formatter URL":

https://www.deezer.com/album/$1
https://www.deezer.com/us/album/$1
https://www.deezer.com/ru/album/$1

This causes the first two URLs in the group shown above to be matched with the album in Wikidata:

https://www.deezer.com/us/album/119673702
https://www.deezer.com/ru/album/119673702

but the other URLs won't be matched to the album:

https://www.deezer.com/mx/album/119673702
https://www.deezer.com/es/album/119673702
https://www.deezer.com/fr/album/119673702
https://www.deezer.com/en/album/119673702

The proposed solution

The problem would be solved if the value of P8966 were used instead. The value is shown below.

^https?:\/\/www\.deezer\.com\/(?:[a-z]{2}\/)?album\/([1-9]\d+)

All of the URLs shown above are matched by this regular expression, which can be checked with a tester such as regex101.com.
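The difference between the two approaches can be reproduced with a short Python sketch. The formatter URLs, the P8966 pattern, and the six album URLs are the ones quoted in this issue; the helper names are made up, and the prefix/suffix matching is only an assumption about how a formatter URL might be applied, not the addon's actual logic.

```python
import re

# P8966 "URL match pattern" value quoted in this issue.
URL_MATCH_PATTERN = re.compile(
    r"^https?:\/\/www\.deezer\.com\/(?:[a-z]{2}\/)?album\/([1-9]\d+)"
)

# P1630 "formatter URL" values quoted in this issue.
FORMATTER_URLS = [
    "https://www.deezer.com/album/$1",
    "https://www.deezer.com/us/album/$1",
    "https://www.deezer.com/ru/album/$1",
]

URLS = [
    f"https://www.deezer.com/{lang}/album/119673702"
    for lang in ("us", "ru", "mx", "es", "fr", "en")
]


def id_from_formatters(url: str):
    """Extract the identifier using only the listed formatter URLs."""
    for formatter in FORMATTER_URLS:
        prefix, _, suffix = formatter.partition("$1")
        if (url.startswith(prefix) and url.endswith(suffix)
                and len(url) > len(prefix) + len(suffix)):
            return url[len(prefix):len(url) - len(suffix)]
    return None


def id_from_pattern(url: str):
    """Extract the identifier using the P8966 regular expression."""
    found = URL_MATCH_PATTERN.match(url)
    return found.group(1) if found else None


for url in URLS:
    print(url, id_from_formatters(url), id_from_pattern(url))
```

With the formatter URLs alone, only the us and ru variants yield an identifier; the pattern yields 119673702 for all six.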


Personal thoughts

  • I think that if EntityExplosion only used P1630, "formatter URL", we would need to add every language code to every property that points to a website showing the language code in its URLs. A flaw of this approach is that some websites support some language codes while others do not. Therefore, ensuring that "formatter URL" is complete for a property is a non-trivial task, and there doesn't seem to be a bot doing this arduous work.
  • I concluded that the current implementation of EntityExplosion doesn't use P8966 because I searched for P8966 in the root folder of the source code using grep and found no match.

Missing URL formatter

Hello,
Thank you very much for your project, it gives me many ideas.

I discovered that the person ID for France Culture (wd:P5301) is not working in Entity Explosion. Exploring the code, it turns out that the query:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT DISTINCT ?prop ?regex ?formatter_url WHERE {
{?prop wdt:P1630 ?formatter_url .}
UNION
{?prop wdt:P3303 ?formatter_url .}
FILTER (CONTAINS( ?formatter_url, "$1" ) )
?prop wdt:P1793 ?regex .
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" } .
}

does not return the entity wd:P5301, since it has no regex defined using the property wdt:P1793. It does, however, have a regex defined by the path ?prop <http://www.wikidata.org/prop/P2302> ?o. ?o <http://www.wikidata.org/prop/qualifier/P1793> ?regex. Reformulating the above query as:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT DISTINCT ?prop ?regex ?formatter_url WHERE {
{?prop wdt:P1630 ?formatter_url .}
UNION
{?prop wdt:P3303 ?formatter_url .}
FILTER (CONTAINS( ?formatter_url, "$1" ) )
?prop <http://www.wikidata.org/prop/P2302> ?o.
?o <http://www.wikidata.org/prop/qualifier/P1793> ?regex.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" } .
}

makes it possible to obtain wd:P5301 in the answer set. It also returns more answers: 5660, against 4428 with the previous query. The two answer sets may differ in other ways as well.

What do you think of replacing the first query with the second one?
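For context on why the ?regex column matters, here is a rough Python sketch of how a formatter URL and an identifier regex could be combined into a single URL-matching pattern. This is an assumption about the general technique, not a copy of the addon's code; the function name, the formatter URL, and the identifier regex below are hypothetical.

```python
import re


def build_url_regex(formatter_url: str, id_regex: str):
    """Combine a formatter URL containing "$1" with an identifier regex."""
    prefix, _, suffix = formatter_url.partition("$1")
    core = id_regex
    # Drop the identifier regex's own anchors before embedding it.
    if core.startswith("^"):
        core = core[1:]
    if core.endswith("$") and not core.endswith("\\$"):
        core = core[:-1]
    return re.compile("^" + re.escape(prefix) + "(" + core + ")" + re.escape(suffix))


# Hypothetical property data, for illustration only.
pattern = build_url_regex("https://example.org/person/$1", r"^[a-z]+(?:-[a-z]+)*$")
found = pattern.match("https://example.org/person/jane-doe")
print(found.group(1) if found else None)
```

Under this scheme a property with a formatter URL but no regex in the result set cannot produce a pattern at all, which would be consistent with wd:P5301 not working.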

Firefox warning: Event pages are not supported

When loading in Firefox, the following warning is shown:

Reading manifest: Warning processing background.persistent: Event pages are not currently supported. This will run as a persistent background page.

It doesn't seem to affect operation.

Question: What is the correct setup for JSON?

If I go to the page en/article/NaimaSahlbom and try Entity Explosion, I get two entries for Svenskt kvinnobiografiskt lexikon, which is what I want.


Question:
Have I set up P4963 correctly, in the way you think we should, to get the best use out of Entity Explosion?
