
canarybot's Introduction

Hey, I'm Iván! 👋

I am an Elixir developer, although I use Python and JavaScript too, as well as many other technologies! And I don't like GitHub:

  • I prefer decentralised projects.
  • I prefer to use other forges, like Codeberg or sourcehut.

Thus I always try to keep all my code on Codeberg and use GitHub only to participate in other people's projects.

canarybot's Issues

Replace local libraries with the libraries from the Wikimedia Toolforge CDNJS

"Removes the duplicates automatically" enters a loop of actions

While testing the new functionality (#10) in PAWS, I found that when the "Removes the duplicates automatically" option is chosen, the script enters a loop of "Updating the log..." and "Description added to the file". As the messages say, the log really is updated and the description really is added each time they appear.

This gist shows three files. The first and the second show the error described above, but with different amounts of output (the second is longer). The third file is "duplicatedFullStopsDesc.csv", the file in which the descriptions have to be saved. It shows that the identifier and the description are repeated a different number of times each time I mark a description with the option mentioned above. This can be checked in the generated log too.

Tasks:

  • Review the structure of the code and find out why it happens.

Replace print functions with the logging system

Prints are not the best way to show in the terminal what the script is doing, nor are they the best way to test whether something works as expected. These prints aren't necessary because their purposes can be covered by:

  • Showing information about what the script is doing: logging system (log.py).
  • Testing how an executed piece of code is working: PySnooper.
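
A minimal sketch of the first point, assuming Python's standard logging module rather than the project's own log.py, whose interface is not shown here. The logger name and both helper functions are hypothetical.

```python
import logging

# Hypothetical logger name; the project's log.py may work differently.
logger = logging.getLogger("canarybot")


def configure_logging(level=logging.INFO):
    """Send log records to the terminal with a timestamp and a level."""
    handler = logging.StreamHandler()
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )
    logger.handlers.clear()  # avoid duplicated output if called twice
    logger.addHandler(handler)
    logger.setLevel(level)
    return logger


def report_removed(item_id, key):
    """Emit a log record where the script would otherwise call print()."""
    message = f"{item_id} {key} full stop removed"
    logger.info(message)
    return message
```

Calling configure_logging() once at start-up and then report_removed("Q178493", "it-desc") would write the same message a print would, but with a timestamp and a level that can be filtered.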

Make it easier to reuse the code

Some ideas to make it easier for other people to reuse the code:

  • requirements.txt
  • makefile to prepare the whole environment.
  • make the changes needed for the makefile described in this comment.

Any idea is welcome! :)

Remove full stop automatically when it finds the same description

The idea is to save the time spent clicking to remove a description that is exactly the same as one already approved for removal. For example:

  • "Grade II listed building in Powys." [action: Remove full stop]
    But the script could save time if the action were:
  • "Grade II listed building in Powys." [action: Remove identical full stops automatically]
    This means that if the script finds "Grade II listed building in Powys." again, it will not ask which action to perform (or offer to quit or skip); the full stop will be removed instantly.

This new action would save time and improve the efficiency of the script, but it shouldn't be used with every description, because it would overload the system for nothing. It has to be used for a specific kind of description, like the one mentioned.

How it would work

The script would have a new action: Remove identical full stops automatically. This action adds the description to a CSV file, previously created and loaded, with a single column named sentence. Before adding it, the script has to check whether the description is already in the CSV: if it is, it isn't added; if it isn't, it is added.

The descriptions saved in this CSV would be useful the next time the script is run, because the script would read this file, which would store both the old descriptions marked for automatic removal and the new ones.
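
The check-before-add step could be sketched as follows. This is only an illustration under assumptions: the function names are hypothetical, and the only detail taken from the issue is the single column named sentence.

```python
import csv
import os


def load_sentences(path):
    """Return the set of descriptions stored in the 'sentence' column."""
    if not os.path.exists(path):
        return set()
    with open(path, newline="", encoding="utf-8") as handle:
        return {row["sentence"] for row in csv.DictReader(handle)}


def add_sentence(path, description):
    """Append the description unless it is already stored.

    Returns True if the description was added, False if it was a duplicate.
    """
    if description in load_sentences(path):
        return False
    # Write the header only when the file is created or still empty.
    is_new_file = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["sentence"])
        if is_new_file:
            writer.writeheader()
        writer.writerow({"sentence": description})
    return True
```

On later runs, load_sentences() gives the script the set of descriptions whose full stop can be removed without asking.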

Tasks

  • Develop the action and everything it requires to work.
  • A system to avoid adding duplicate descriptions to the CSV.
  • Make it easy to generate the CSV file, in line with what is described in #5.
  • Test several times.

Run CanaryBot scripts outside PAWS

If someone needs to run CanaryBot outside PAWS, it is necessary to install and configure Pywikibot. This might be solved with:

  • Pywikibot instructions in the makefile.

Exception in try block breaks the script

There was an exception in the try statement of the "remove full stop" option. After reading the stack trace, I think the error is caused by log.check(e, logName, mode="csv"), because e is an OtherPageSaveError object, not a dictionary with the keys that the log.py module expects (time, item, key, msg).

I tried to solve it with #2e14789, but I have to wait for a similar issue to check the fix. This happened when I tried to remove the full stop of the item Q178493, in which the full stop is separated from the last word (senegalese .), so it is another error that I have to fix (#2):

  Q178493
 - it-desc:     calciatore senegalese  . 
 + Replacement: calciatore senegalese

This error is very important because it breaks the script: log.update() does not run properly and the HTML CSV viewer is not created.
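
The traceback shows log.update() reading info["time"], info["item"], info["key"] and info["msg"]. One possible way to avoid the TypeError is to turn the exception into a dictionary with those keys before logging it. This is a sketch only: the function name and the time format are assumptions, and a ValueError stands in for the Pywikibot exception so that the example runs on its own.

```python
from datetime import datetime


def exception_to_log_row(error, item_id, key, now=None):
    """Build a dict with the keys that log.update() reads from `info`."""
    now = now or datetime.now()
    return {
        "time": now.strftime("%Y-%m-%d %H:%M:%S"),  # assumed time format
        "item": item_id,
        "key": key,
        "msg": str(error),
    }


# ValueError is a stand-in for pywikibot's OtherPageSaveError.
info = exception_to_log_row(
    ValueError("modification-failed"), "Q178493", "it-desc"
)
# The same tuple that log.update() builds no longer raises TypeError.
row = info["time"], info["item"], info["key"], info["msg"]
```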

Stack trace:

Q178493 it-desc full stop removed                                                                                                    
WARNING: API error modification-failed: Item [[Q2821366|Q2821366]] already has label "Abdoulaye Diallo" associated with language code
using the same description text.                                                                                                     
Edit to page [[wikidata:Q178493]] failed:                                                                                            
modification-failed: Item [[Q2821366|Q2821366]] already has label "Abdoulaye Diallo" associated with language code it, using the same
ription text. [messages:[{'name': 'wikibase-validator-label-with-description-conflict', 'parameters': ['Abdoulaye Diallo', 'it', '[[Q
66|Q2821366]]'], 'html': {'*': 'El elemento <a href="/wiki/Q2821366" title="Q2821366">Q2821366</a> ya tiene la etiqueta "Abdoulaye Di
 asociada con el código de idioma it, usando el mismo texto de descripción.'}}]; help:See https://www.wikidata.org/w/api.php for API 
. Subscribe to the mediawiki-api-announce mailing list at &lt;https://lists.wikimedia.org/mailman/listinfo/mediawiki-api-announce&gt;
notice of API deprecations and breaking changes.]                                                                                    
                                                                                                                                     
Updating log...                                                                                                                      
                                                                                                                                     
Traceback (most recent call last):                                                                                                   
  File "/srv/paws/pwb/pywikibot/page.py", line 118, in handle                                                                        
    func(self, *args, **kwargs)                                                                                                      
  File "/srv/paws/pwb/pywikibot/page.py", line 4059, in editEntity                                                                   
    baserevid=baserevid, **kwargs)                                                                                                   
  File "/srv/paws/pwb/pywikibot/site.py", line 1312, in callee                                                                       
    return fn(self, *args, **kwargs)                                                                                                 
  File "/srv/paws/pwb/pywikibot/site.py", line 7668, in editEntity                                                                   
    data = req.submit()                                                                                                              
  File "/srv/paws/pwb/pywikibot/data/api.py", line 2195, in submit                                                                   
    raise APIError(**result['error'])                                                                                                
pywikibot.data.api.APIError: modification-failed: Item [[Q2821366|Q2821366]] already has label "Abdoulaye Diallo" associated with lan
 code it, using the same description text. [messages:[{'name': 'wikibase-validator-label-with-description-conflict', 'parameters': ['
laye Diallo', 'it', '[[Q2821366|Q2821366]]'], 'html': {'*': 'El elemento <a href="/wiki/Q2821366" title="Q2821366">Q2821366</a> ya ti
a etiqueta "Abdoulaye Diallo" asociada con el código de idioma it, usando el mismo texto de descripción.'}}]; help:See https://www.wi
a.org/w/api.php for API usage. Subscribe to the mediawiki-api-announce mailing list at &lt;https://lists.wikimedia.org/mailman/listin
diawiki-api-announce&gt; for notice of API deprecations and breaking changes.]                                                       
                                                                                                                                     
During handling of the above exception, another exception occurred:                                                                  
                                                                                                                                     
Traceback (most recent call last):                                                                                                   
  File "fullstopschecker.py", line 96, in editDesc                                                                                   
    itemPage.editDescriptions(replacement, summary=summary["removed"])                                                               
  File "/srv/paws/pwb/pywikibot/page.py", line 4091, in editDescriptions                                                             
    self.editEntity(data, **kwargs)                                                                                                  
  File "/srv/paws/pwb/pywikibot/page.py", line 136, in wrapper                                                                       
    handle(func, self, *args, **kwargs)                                                                                              
  File "/srv/paws/pwb/pywikibot/page.py", line 128, in handle                                                                        
    raise pywikibot.OtherPageSaveError(self, err)                                                                                    
pywikibot.exceptions.OtherPageSaveError: Edit to page [[wikidata:Q178493]] failed:                                                   
modification-failed: Item [[Q2821366|Q2821366]] already has label "Abdoulaye Diallo" associated with language code it, using the same
ription text. [messages:[{'name': 'wikibase-validator-label-with-description-conflict', 'parameters': ['Abdoulaye Diallo', 'it', '[[Q
66|Q2821366]]'], 'html': {'*': 'El elemento <a href="/wiki/Q2821366" title="Q2821366">Q2821366</a> ya tiene la etiqueta "Abdoulaye Di
 asociada con el código de idioma it, usando el mismo texto de descripción.'}}]; help:See https://www.wikidata.org/w/api.php for API 
. Subscribe to the mediawiki-api-announce mailing list at &lt;https://lists.wikimedia.org/mailman/listinfo/mediawiki-api-announce&gt;
notice of API deprecations and breaking changes.]                                                                                    
                                                                                                                                     
During handling of the above exception, another exception occurred:                                                                  
                                                                                                                                     
Traceback (most recent call last):                                                                                                   
  File "fullstopschecker.py", line 314, in <module>                                                                                  
    checkDesc(query, editMode)                                                                                                       
  File "fullstopschecker.py", line 242, in checkDesc                                                                                 
    edit = editDesc(item, key, description, newDescription, count, editMode, editGroup, logName)                                     
  File "fullstopschecker.py", line 113, in editDesc                                                                                  
    log.check(e, logName, mode="csv")                                                                                                
  File "/home/paws/CanaryBot/log.py", line 105, in check                                                                             
    update(info, script, fullNameLog, nowFormat, generateHTML, mode)                                                                 
  File "/home/paws/CanaryBot/log.py", line 62, in update                                                                             
    row = info["time"], info["item"], info["key"], info["msg"]                                                                       
TypeError: 'OtherPageSaveError' object is not subscriptable                                                                          
<class 'TypeError'>                                                                                                                  
CRITICAL: Closing network session.                                                                                                   
@PAWS:~/CanaryBot$

"WARNING: API error modification-failed"

The bot got this error when it tried to edit the item Q26682121.

Change tried

   Q26682121
 - en-desc:     Redgrave, Mid Suffolk, Suffolk, IP22.
 + Replacement: Redgrave, Mid Suffolk, Suffolk, IP22

Error

WARNING: API error modification-failed: Item [[Q26545861|Q26545861]] already has label "Pond Farm House" associated with language code en, using the same description text.
Edit to page [[wikidata:Q26682121]] failed:
modification-failed: Item [[Q26545861|Q26545861]] already has label "Pond Farm House" associated with language code en, using the same description text. [messages:[{'name': 'wikibase-validator-label-with-description-conflict', 'parameters': ['Pond Farm House', 'en', '[[Q26545861|Q26545861]]'], 'html': {'*': 'El elemento <a href="/wiki/Q26545861" title="Q26545861">Q26545861</a> ya tiene la etiqueta "Pond Farm House" asociada con el código de idioma en, usando el mismo texto de descripción.'}}]; help:See https://www.wikidata.org/w/api.php for API usage. Subscribeto the mediawiki-api-announce mailing list at &lt;https://lists.wikimedia.org/mailman/listinfo/mediawiki-api-announce&gt; for notice of API deprecations and breaking changes.]

Related issues

  • #1 (closed and solved).
  • #2 (pending).

Broken URL in the list of logs (logslist.html)

When a log is added to the list of logs (logslist.html), the URL is broken because it lacks the subdirectory in which the logs are stored.

  • 2018-10-20-descriptionCheckList.html (incorrect)
  • logs/2018-10-20-descriptionCheckList.html (correct)

The file that needs to be fixed is createhtml.sh, specifically the last line.
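
The actual fix belongs in createhtml.sh, whose contents are not shown here. As an illustration of the intended result only, the logic can be sketched in Python; the function name is hypothetical.

```python
import posixpath

LOGS_DIR = "logs"  # subdirectory named in the issue


def log_href(filename, logs_dir=LOGS_DIR):
    """Return the relative URL of a log file, adding the subdirectory once."""
    if filename.startswith(logs_dir + "/"):
        return filename  # already points inside the logs directory
    return posixpath.join(logs_dir, filename)
```

The startswith check keeps the function safe to call on a URL that already contains the subdirectory.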

Show the label of the item

Sometimes it might be useful to know the label, in the respective language, of the description being edited. For that reason I think it could be interesting to add the label to this layout:

   Q93304
 - es-desc:     las leyes babilónicas de la antigüedad.
 + Replacement: las leyes babilónicas de la antigüedad

As I show in the next block:

   Q93304<tab>Código de Hammurabi
 - es-desc:     las leyes babilónicas de la antigüedad.
 + Replacement: las leyes babilónicas de la antigüedad

Or as I show in this last one:

   Q93304
   Código de Hammurabi
 - es-desc:     las leyes babilónicas de la antigüedad.
 + Replacement: las leyes babilónicas de la antigüedad

Some relevant questions to consider when this idea is developed:

  • If the label is very long, what should CanaryBot do? Show it shortened or complete?
  • ...
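
A minimal sketch of the first layout (identifier, tab, label), with one possible answer to the long-label question: shorten it. The function name and the 40-character limit are assumptions, not decisions taken in this issue.

```python
def format_item_header(item_id, label, max_length=40):
    """Return the item ID and its label separated by a tab.

    Labels longer than max_length are cut and end with "...".
    """
    if len(label) > max_length:
        label = label[: max_length - 3].rstrip() + "..."
    return f"{item_id}\t{label}"
```

For example, format_item_header("Q93304", "Código de Hammurabi") keeps the label complete because it is shorter than the limit.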

Move Utilities class to utils.py

Make an independent file for the Utilities class (utils.py). Tasks to do:

  • Migrate the code to utils.py.
  • Simplify the use of the class in bot.py by replacing the repeated u = Utilities() with something like:
import fullstopschecker as fsc  # assumed alias for the module that defines checkDesc
import utils as u

def removeFullStop(edit):
    # Code
    fsc.checkDesc(query, edit)

if __name__ == '__main__':
    # Code
    # After the projects prompt
    edit = u.editMode()
    # Before the available tasks prompt
    # Then, in the specific task, edit is passed as an argument:
    removeFullStop(edit)
  • Update the calls to the Utilities class after replacing u = bot.Utilities() in fullstopschecker.py with import utils as u.

Exception when the full stop is separated from the last word

This issue is derived from #1. The description that caused the error was the it-desc of the item Q178493.

  Q178493
 - it-desc:     calciatore senegalese  . 
 + Replacement: calciatore senegalese

The stack trace says the error occurs because the script tries to add the same description, so it gets an APIError. The reason why this happened might be the regular expression used to match the full stop:

item.descriptions[key] = re.sub("\\.$", redFullStop, item.descriptions[key])

But I still have to check it.
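
A minimal sketch of a pattern that also covers a full stop separated from the last word by whitespace. The function name is hypothetical, and an empty replacement string stands in for redFullStop, whose value is not shown here. Note that the API message in the stack trace reports a label and description conflict with Q2821366, so whether the regular expression is the real cause remains unconfirmed.

```python
import re

# Optional whitespace, one full stop, optional whitespace, end of string.
TRAILING_FULL_STOP = re.compile(r"\s*\.\s*$")


def remove_trailing_full_stop(description):
    """Drop a final full stop together with the whitespace around it."""
    return TRAILING_FULL_STOP.sub("", description)
```

With this pattern, "calciatore senegalese  . " and "calciatore senegalese." both become "calciatore senegalese", while a description without a final full stop is returned unchanged.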

Stack trace:

Q178493 it-desc full stop removed                                                                                                    
WARNING: API error modification-failed: Item [[Q2821366|Q2821366]] already has label "Abdoulaye Diallo" associated with language code
using the same description text.                                                                                                     
Edit to page [[wikidata:Q178493]] failed:                                                                                            
modification-failed: Item [[Q2821366|Q2821366]] already has label "Abdoulaye Diallo" associated with language code it, using the same
ription text. [messages:[{'name': 'wikibase-validator-label-with-description-conflict', 'parameters': ['Abdoulaye Diallo', 'it', '[[Q
66|Q2821366]]'], 'html': {'*': 'El elemento <a href="/wiki/Q2821366" title="Q2821366">Q2821366</a> ya tiene la etiqueta "Abdoulaye Di
 asociada con el código de idioma it, usando el mismo texto de descripción.'}}]; help:See https://www.wikidata.org/w/api.php for API 
. Subscribe to the mediawiki-api-announce mailing list at &lt;https://lists.wikimedia.org/mailman/listinfo/mediawiki-api-announce&gt;
notice of API deprecations and breaking changes.]                                                                                    
                                                                                                                                     
Updating log...                                                                                                                      
                                                                                                                                     
Traceback (most recent call last):                                                                                                   
  File "/srv/paws/pwb/pywikibot/page.py", line 118, in handle                                                                        
    func(self, *args, **kwargs)                                                                                                      
  File "/srv/paws/pwb/pywikibot/page.py", line 4059, in editEntity                                                                   
    baserevid=baserevid, **kwargs)                                                                                                   
  File "/srv/paws/pwb/pywikibot/site.py", line 1312, in callee                                                                       
    return fn(self, *args, **kwargs)                                                                                                 
  File "/srv/paws/pwb/pywikibot/site.py", line 7668, in editEntity                                                                   
    data = req.submit()                                                                                                              
  File "/srv/paws/pwb/pywikibot/data/api.py", line 2195, in submit                                                                   
    raise APIError(**result['error'])                                                                                                
pywikibot.data.api.APIError: modification-failed: Item [[Q2821366|Q2821366]] already has label "Abdoulaye Diallo" associated with lan
 code it, using the same description text. [messages:[{'name': 'wikibase-validator-label-with-description-conflict', 'parameters': ['
laye Diallo', 'it', '[[Q2821366|Q2821366]]'], 'html': {'*': 'El elemento <a href="/wiki/Q2821366" title="Q2821366">Q2821366</a> ya ti
a etiqueta "Abdoulaye Diallo" asociada con el código de idioma it, usando el mismo texto de descripción.'}}]; help:See https://www.wi
a.org/w/api.php for API usage. Subscribe to the mediawiki-api-announce mailing list at &lt;https://lists.wikimedia.org/mailman/listin
diawiki-api-announce&gt; for notice of API deprecations and breaking changes.]          
