skill-wiki's Introduction

Wikipedia

About

Query Wikipedia for answers to all your questions. Get just a summary, or ask for more to get in-depth information.

This Skill uses the Wikimedia API.

Examples

  • "Tell me about Elon Musk"
  • "Tell me about beans"
  • "Tell me something random"
  • "Check Wikipedia for beans"
  • "Tell me about the Pembroke Welsh Corgi"
  • "Search for chocolate"
  • "More information" (follow-up after an initial summary)
  • "Tell me more" (follow-up after an initial summary)

Credits

Mycroft AI (@MycroftAI)

Category

Information

Tags

#wikipedia #encyclopedia #general-knowledge #wiki #question #query

skill-wiki's People

Contributors

aiix, augustnmonteiro, chrisveilleux, dave-esch, devs-mycroft, dschweppe, e-gor, forslund, goldyfruit, kathyreid, ken-mycroft, krisgesling, luca-vercelli, luke5sky, matthewscholefield, thorstenmueller, tony763

skill-wiki's Issues

No image for "who is Julia Gillard"

Describe the bug
Title is displayed, but the image area is just black. The Busy Indicator is not shown.

To Reproduce
"Hey Mycroft, who is Julia Gillard"

Environment (please complete the following information):

  • Device type: Mark II
  • Mycroft-core version: v21.2.2
  • Other versions: container build 2021-12-13

Clean summary text further

Summary text has certain characters stripped out of it before it is presented on the GUI. This sometimes leaves double spaces, e.g. "some  word". Duplicate whitespace should be removed.
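
A minimal sketch of the cleanup requested above, assuming the summary is a plain Python string. The function name `collapse_whitespace` is hypothetical and not part of the Skill.

```python
import re


def collapse_whitespace(text):
    """Replace every run of whitespace with a single space and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


print(collapse_whitespace("some  word"))  # prints: some word
```

Note that `\s+` also collapses newlines and tabs, which is probably fine for a short spoken summary but would flatten paragraph breaks in longer text.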

Local wiki

If you want to run against a local mediawiki instance, you'll have to adjust the API_URL settings in the pip package's wikipedia.py (usually in /usr/local/lib/python3.6/dist-packages/wikipedia/) to match your local instance. If you've downloaded the simple wikipedia for simplicity's sake, then you may wish to force the lang to "simple" always, as you may not find answers otherwise.

Asking about "Milky Way" raises Page ID error

Hey Mycroft, tell me about the Milky Way

Mycroft CLI then outputs the text, but does not speak:

Let me look up the milky way

An error is then raised:

2020-02-17 11:21:10.322 | ERROR    | 14206 | WikipediaSkill | Error: Page id "milk way" does not match any pages. Try another id!

Asking the same query with the log level set to debug produces:

2020-02-17 11:21:05.767 | DEBUG    | 14206 | mycroft.skills.intent_service:handle_utterance:329 | Utterances: ['tell me about the milky way']
2020-02-17 11:21:06.190 | DEBUG    | 14206 | mycroft.skills.intent_service:handle_utterance:349 | Padatious intent: {'name': 'tronald-dump-skill.krisgesling:dump.tronald.intent', 'sent': 'tell me about the {topic}', 'matches': {'topic': 'milky way'}, 'conf': 0.35015021861149453}
2020-02-17 11:21:06.190 | DEBUG    | 14206 | mycroft.skills.intent_service:handle_utterance:350 |     Adapt intent: {'intent_type': 'mycroft-wiki.mycroftai:handle_wiki_query', 'mycroft_wiki_mycroftaiWikipedia': 'tell me about', 'mycroft_wiki_mycroftaiArticleTitle': 'the milky way', 'target': None, 'confidence': 0.375, '__tags__': [{'start_token': 0, 'entities': [{'key': 'tell me about', 'match': 'tell me about', 'data': [('tell me about', 'mycroft_wiki_mycroftaiWikipedia'), ('tell us', 'mycroft_weather_mycroftaiQuery')], 'confidence': 1.0}], 'confidence': 1.0, 'end_token': 2, 'match': 'tell me about', 'key': 'tell me about', 'from_context': False}, {'start_token': 3, 'entities': [{'key': 'the milky way', 'match': 'the milky way', 'data': [('the milky way', 'mycroft_wiki_mycroftaiArticleTitle')], 'confidence': 0.5}], 'confidence': 0.5, 'end_token': 5, 'match': 'the milky way', 'key': 'the milky way', 'from_context': False}], 'utterance': 'tell me about the milky way'}
2020-02-17 11:21:06.192 | DEBUG    | 14206 | urllib3.connectionpool | Starting new HTTPS connection (1): api.mycroft.ai:443
2020-02-17 11:21:06.197 | DEBUG    | 14206 | urllib3.connectionpool | Starting new HTTP connection (1): en.wikipedia.org:80
2020-02-17 11:21:06.538 | DEBUG    | 14206 | urllib3.connectionpool | http://en.wikipedia.org:80 "GET /w/api.php?list=search&srprop=&srlimit=5&limit=5&srsearch=the+milky+way&format=json&action=query HTTP/1.1" 301 0
2020-02-17 11:21:06.546 | DEBUG    | 14206 | urllib3.connectionpool | Starting new HTTPS connection (1): en.wikipedia.org:443
2020-02-17 11:21:07.306 | DEBUG    | 14206 | urllib3.connectionpool | https://api.mycroft.ai:443 "POST /v1/device/529c694d-82e6-417b-8576-cf4be48acb2b/metric/timing HTTP/1.1" 204 0
2020-02-17 11:21:07.650 | DEBUG    | 14206 | urllib3.connectionpool | https://en.wikipedia.org:443 "GET /w/api.php?list=search&srprop=&srlimit=5&limit=5&srsearch=the+milky+way&format=json&action=query HTTP/1.1" 200 282
2020-02-17 11:21:07.653 | DEBUG    | 14206 | urllib3.connectionpool | Starting new HTTP connection (1): en.wikipedia.org:80
2020-02-17 11:21:07.995 | DEBUG    | 14206 | urllib3.connectionpool | http://en.wikipedia.org:80 "GET /w/api.php?list=search&srprop=&srlimit=1&limit=1&srsearch=Milky+Way&srinfo=suggestion&format=json&action=query HTTP/1.1" 301 0
2020-02-17 11:21:08.000 | DEBUG    | 14206 | urllib3.connectionpool | Starting new HTTPS connection (1): en.wikipedia.org:443
2020-02-17 11:21:09.031 | DEBUG    | 14206 | urllib3.connectionpool | https://en.wikipedia.org:443 "GET /w/api.php?list=search&srprop=&srlimit=1&limit=1&srsearch=Milky+Way&srinfo=suggestion&format=json&action=query HTTP/1.1" 200 203
2020-02-17 11:21:09.034 | DEBUG    | 14206 | urllib3.connectionpool | Starting new HTTP connection (1): en.wikipedia.org:80
2020-02-17 11:21:09.374 | DEBUG    | 14206 | urllib3.connectionpool | http://en.wikipedia.org:80 "GET /w/api.php?prop=info%7Cpageprops&inprop=url&ppprop=disambiguation&redirects=&titles=milk+way&format=json&action=query HTTP/1.1" 301 0
2020-02-17 11:21:09.379 | DEBUG    | 14206 | urllib3.connectionpool | Starting new HTTPS connection (1): en.wikipedia.org:443
2020-02-17 11:21:10.318 | DEBUG    | 14206 | urllib3.connectionpool | https://en.wikipedia.org:443 "GET /w/api.php?prop=info%7Cpageprops&inprop=url&ppprop=disambiguation&redirects=&titles=milk+way&format=json&action=query HTTP/1.1" 200 235
2020-02-17 11:21:10.322 | ERROR    | 14206 | WikipediaSkill | Error: Page id "milk way" does not match any pages. Try another id!
2020-02-17 11:21:10.325 | DEBUG    | 14206 | urllib3.connectionpool | Starting new HTTPS connection (1): api.mycroft.ai:443
2020-02-17 11:21:11.449 | DEBUG    | 14206 | urllib3.connectionpool | https://api.mycroft.ai:443 "POST /v1/device/529c694d-82e6-417b-8576-cf4be48acb2b/metric/timing HTTP/1.1" 204 0

Haven't yet looked at the Skill, but the API response contains a valid page ID: https://en.wikipedia.org/?curid=2589714

I18N support for language-specific Wikipedia

The wikipedia.set_lang("fr") mechanism can be used to support queries made in languages other than English. This can be implemented as a language resource if there is a difference between the spoken language code and the wikipedia code.
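
As a rough sketch of the language resource idea, the mapping below converts a spoken language code into a Wikipedia language code. The names `WIKI_LANG_OVERRIDES` and `wiki_lang_code` are hypothetical, and the single override entry is only an illustration (Norwegian Bokmål is served from no.wikipedia.org).

```python
# Hypothetical overrides for cases where the spoken language code and the
# Wikipedia language code differ.
WIKI_LANG_OVERRIDES = {"nb": "no"}


def wiki_lang_code(spoken_lang):
    """Map a spoken language code such as "fr-fr" to a Wikipedia language code."""
    # Keep only the primary subtag, e.g. "fr-fr" -> "fr".
    primary = spoken_lang.lower().split("-")[0]
    return WIKI_LANG_OVERRIDES.get(primary, primary)


# The result would then be handed to the library, for example:
# wikipedia.set_lang(wiki_lang_code("fr-fr"))
print(wiki_lang_code("fr-fr"))  # prints: fr
```

Falling back to the primary subtag keeps the table small, since only the languages that actually differ need an entry.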

Handling nested descriptions for Summary

I think it's not uncommon to find nested descriptions in the summary that should be stripped out.

Example (German): (auch Bert Brecht; * 10. Februar 1898 als Eugen Berthold Friedrich Brecht in Augsburg; † 14. August 1956 in Berlin (Ost))

The regex pattern currently in place cuts from the first "(" to the first ")", leaving a summary like "Bertolt Brecht ) war ein einflussreicher". (Actually that's not the best example, but anything following the first ")" would remain in the summary.)

To handle nested descriptions the code should be changed to

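import re

# Note that "(" is added to [^)] so that only innermost groups match
pattern = r"\([^()]*\)|/[^/]*/"

# Repeat until nothing matches, removing nested groups from the inside out
while re.search(pattern, summary):
    summary = re.sub(pattern, "", summary)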

The pattern should also be expanded to search for (nested) "[" and "]":

pattern = r"\([^()]*\)|/[^/]*/|\[[^\[\]]*\]"

It also seems beneficial for the pattern to require a space in front of "(", "[" and "/" to reduce false positives:

pattern = r"\s\([^()]*\)|\s/[^/]*/|\s\[[^\[\]]*\]"
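
To check the proposal against the German example, here is a small self-contained sketch. The function name `strip_nested` is hypothetical, and the sample sentence is assembled from the fragments quoted in this issue.

```python
import re

# Final pattern from the proposal: a leading space, then an innermost
# (...), /.../ or [...] group.
PATTERN = re.compile(r"\s\([^()]*\)|\s/[^/]*/|\s\[[^\[\]]*\]")


def strip_nested(summary):
    """Remove bracketed groups repeatedly so nested groups disappear too."""
    while PATTERN.search(summary):
        summary = PATTERN.sub("", summary)
    return summary


text = (
    "Bertolt Brecht (auch Bert Brecht; * 10. Februar 1898 als Eugen "
    "Berthold Friedrich Brecht in Augsburg; † 14. August 1956 in "
    "Berlin (Ost)) war ein einflussreicher"
)
print(strip_nested(text))  # prints: Bertolt Brecht war ein einflussreicher
```

The first pass removes the inner " (Ost)" and the second pass removes the outer group, which is why the loop is needed.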

Stopping Mycroft from narrating the last query

Hi all,

I am using Picroft on a Raspberry Pi 3B+ and wanted to ask how to achieve the following.

Example:

User: Hey Mycroft tell me about the Himalayas?
Mycroft: Starts narrating answer.
(Between the Narration)
User asks: Hey Mycroft tell me about America?

Mycroft should then stop telling me about the Himalayas and tell me about America instead.

But right now it first completes the whole Himalayas narration and only then starts telling me about America.

I don't want to have to say "Hey Mycroft, stop" manually. It should be automatic.

So how can I achieve this?

Feature request: random page

Could a way be added to request the wiki page "Special:Random"? I realize there's a "random wiki fact" skill, but that works a little differently and anyway it seems to be non-functional at the moment.

Cannot use the word 'search' in other skills

It appears that because this skill is installed by default it blocks other skills from being able to use the word search.

For example, while writing my own skill, I found that the phrase "search confluence for chicken" results in Mycroft activating the wiki skill. It responds with:

 >> Just a moment while I look up chicken
 >> The chicken is a type of domesticated fowl, a
    subspecies of the red junglefowl .

To me this suggests that the wiki skill matches too broadly on the word "search". The same problem would occur with the word "check".

I believe it is unreasonable to expect a user to blacklist default skills in order to use other skills. A default skill should not have a monopoly on generic terms like "search". This forces non-default skills to use more awkward phrasing to be triggered. It also makes Mycroft seem "stupid" to the end user, as if it doesn't "understand".

This regex is too wide:

.*(wiki|for|about|wikipedia)

This needs to be more specific. Perhaps consider something like

(search | check) (wiki | wikipedia) for 
tell me about 
tell me something random

I see how this means that the regexes would match less but perhaps that is a good thing?
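
The difference can be illustrated with Python's `re` module. This is only a sketch: the `NARROW` pattern and the `wiki_topic` helper are hypothetical and do not reflect how the Skill actually registers its vocabulary or regexes.

```python
import re

# The broad pattern quoted above.
BROAD = re.compile(r".*(wiki|for|about|wikipedia)")

# A hypothetical narrower pattern built from the suggested phrases.
NARROW = re.compile(
    r"^(?:(?:search|check) (?:wiki|wikipedia) for|tell me about) (?P<topic>.+)$"
)


def wiki_topic(utterance):
    """Return the requested topic, or None if the utterance is not for the wiki skill."""
    match = NARROW.match(utterance)
    return match.group("topic") if match else None


print(BROAD.match("search confluence for chicken") is not None)  # prints: True
print(wiki_topic("search confluence for chicken"))               # prints: None
print(wiki_topic("search wikipedia for chicken"))                # prints: chicken
```

With the narrower pattern, "search confluence for chicken" is left free for another skill to claim.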

Using a text extraction method other than "exsentences" (wikipedia.summary)

While at times this works reasonably well, there are many cases where the summary is completely butchered.

(TextExtracts API)

There are various things to be aware of when using the API

We do not recommend the usage of exsentences. It does not work for HTML extracts and there are many edge cases for which it doesn't exist. For example "Arm. gen. Ing. John Smith was a soldier." will be treated as 4 sentences. We do not plan to fix this.

In our case the output would be "Arm. gen." if passing lines = 2.

wikipedia.summary
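
The failure mode is easy to reproduce with a naive splitter. The sketch below is not the TextExtracts implementation; it only assumes that a sentence ends at every period followed by whitespace, which is enough to show the effect on the quoted example. The function name `naive_first_sentences` is hypothetical.

```python
import re


def naive_first_sentences(text, count):
    """Split after every period followed by whitespace and keep `count` parts."""
    parts = re.split(r"(?<=\.)\s+", text)
    return " ".join(parts[:count])


text = "Arm. gen. Ing. John Smith was a soldier."
print(len(re.split(r"(?<=\.)\s+", text)))  # prints: 4
print(naive_first_sentences(text, 2))      # prints: Arm. gen.
```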

Font size Hack does not work on Mark-2 prototype

On my Mark-2 prototype using the BOM display the hack for the fontsize does not work well (The font is super small).
I fixed this differently in WikipediaDelegate.qml:
...
import QtQuick.Window 2.2
...
font.pointSize: Math.floor(20 * 4 / Screen.pixelDensity)
...

This provides the same font size on Neo running in a virtual machine and on the device.

Can't get articles starting with "pink"

This is weird... if I request an article starting with "pink", like "pink noise", "pink floyd", or "pink lemonade", I get "keyword not recognized". Requesting the article on "pink" by itself works, however.

Mycroft reads section dividers

In MediaWiki syntax, article sections are usually delimited as "==Section name==". On articles with shorter introductions, where Mycroft reaches a section heading, it reads the equals signs literally.
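
One possible fix is to strip the heading markers before the text is spoken. This is a sketch under the assumption that headings sit on their own line; the function name `strip_section_markers` is hypothetical.

```python
import re

# A heading line: leading "=" signs, the title, trailing "=" signs.
HEADING = re.compile(r"^[ \t]*=+[ \t]*(.+?)[ \t]*=+[ \t]*$", re.MULTILINE)


def strip_section_markers(text):
    """Replace "== Section name ==" lines with just the section name."""
    return HEADING.sub(r"\1", text)


print(strip_section_markers("Intro.\n== History ==\nMore."))
```

An alternative would be to stop reading at the first heading instead of keeping the section name, which may suit short spoken summaries better.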

More info about donald duck fails

When asking about Donald Duck and then asking for more info, the wiki lookup fails for some reason.

To Reproduce
Steps to reproduce the behavior:

  1. Ask "tell me about donald duck"
  2. Listen to the response
  3. Then ask "tell me more"
  4. See the error getting wiki summary for "donald du"

The input is actually "donald duck", but it seems to get changed somewhere in the Wikipedia lookup.

Improve readability of UI

Currently the text and the image background may be the same or similar colours.

We need to consider ways to ensure there is an appropriate level of contrast between the text and its background.

Possibilities:

  • text doesn't overlay the image
  • text element has a semi transparent background but still overlays the image
  • ?

I left out detecting the image colour and changing the font colour accordingly, as that seems overly complex and prone to error.

"Who is" vs "Tell me about"

Is your feature request related to a problem? Please describe.
"Who is" requests don't seem to trigger the Wikipedia Skill, or is DDG just much quicker?
By contrast, "tell me about" directly triggers the Wikipedia Skill.

Describe the solution you'd like
Either of these question bases should trigger the Common Query Framework, and Wikipedia should return its confidence quickly enough that it can attempt to answer.

"Tell me about Elon Musk" cannot find result

Describe the bug
"Elon Musk" is directly translatable to a specific wikipedia page, hence the Skill should easily handle it but doesn't. Need to investigate why as there are likely other requests failing in a similar fashion.

Speed up response times

Following on from PR #46, we should be able to speed up the response time from the Wikipedia service. A user shouldn't need to wait more than 10 seconds for an answer.

Options are to switch libraries and/or reduce the number of requests being made. Currently for a single search we can make up to 3 requests to wikipedia as we:

  1. List possible pages
  2. Fetch two line summary of most likely page.
  3. If too long, re-fetch one line summary
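
One way to drop the third request is to shorten the two-line summary locally instead of fetching it again. The sketch below assumes the summary is already available as a string; the function name `shorten_locally`, the `max_chars` default, and the naive sentence split are all hypothetical.

```python
import re


def shorten_locally(two_line_summary, max_chars=250):
    """Return the summary, cut down to its first sentence if it is too long."""
    if len(two_line_summary) <= max_chars:
        return two_line_summary
    # Naive split: a sentence ends at ".", "!" or "?" followed by whitespace.
    return re.split(r"(?<=[.!?])\s+", two_line_summary, maxsplit=1)[0]


print(shorten_locally("A long first sentence. Second sentence here.", max_chars=20))
# prints: A long first sentence.
```

The naive split will misfire on abbreviations, so a real implementation would need a more careful sentence boundary check.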
