languagetool-community-website's Introduction

languagetool-community-website's People

Contributors

danielnaber, dpelle, jaumeortola, mailaender, milekpl, paolob67, stevio89

languagetool-community-website's Issues

Error Compilation error: startup failed: unable to resolve class XMLValidator

| Compiling 32 source files.
| Error Compilation error: startup failed:
/home/matthias/Projekte/LanguageTool/languagetool-community-website/grails-app/controllers/org/languagetool/RuleEditorController.groovy: 91: unable to resolve class XMLValidator 
 @ line 91, column 22.
           XMLValidator validator = new XMLValidator()
                        ^

/home/matthias/Projekte/LanguageTool/languagetool-community-website/grails-app/controllers/org/languagetool/RuleEditorController.groovy: 91: unable to resolve class XMLValidator 
 @ line 91, column 34.
           XMLValidator validator = new XMLValidator()
                                    ^

2 errors

Use HTTPS URL in GitHub description

Could you please use the more secure HTTPS URL in the GitHub description?

$ curl -I http://community.languagetool.org/
HTTP/1.1 301 Moved Permanently
Date: Wed, 15 Nov 2017 13:29:07 GMT
Server: Apache/2.4.18 (Ubuntu)
Location: https://community.languagetool.org/
Content-Type: text/html; charset=iso-8859-1

$ curl -I https://community.languagetool.org/
HTTP/1.1 200 OK
Date: Wed, 15 Nov 2017 13:29:01 GMT
Server: Apache-Coyote/1.1
Content-Type: text/html;charset=utf-8
Content-Language: de-DE
Content-Length: 8552

This would also save one request.

Integration of various dictionaries

Hello there, I have been looking into LanguageTool for some time to help us (a small team of freelance copyeditors) with certain basic editing tasks. We would like to be part of this online community and help enhance the system for various editing tasks with our input and copyediting experience.

For starters, I have some questions; clarity on these issues would be of great help.

  1. Which dictionaries are integrated into the system for spell checking? We could not find that information in grammar.xml or on the other community pages.
  2. Are the dictionaries separate for US English and UK English?
  3. Is there a possibility to integrate scientific dictionaries such as Index Medicus?

Ignore anchors in Wikipedia titles

After navigating to a certain section or editing it, the URL will contain an anchor ("#blablahla") at the end. The WikiCheck bookmarklet should strip it; currently it does not, so the page is not found at all.
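
A minimal sketch of the intended behavior, written in Python only for illustration (the actual bookmarklet is JavaScript, and the function name `strip_anchor` is hypothetical). It uses the standard-library `urldefrag` to drop the fragment before the URL is passed on:

```python
from urllib.parse import urldefrag


def strip_anchor(url: str) -> str:
    """Return the URL without its fragment ("#section") part."""
    # urldefrag splits the URL into (url_without_fragment, fragment).
    return urldefrag(url).url


# A section link and the plain page link resolve to the same page URL.
print(strip_anchor("https://en.wikipedia.org/wiki/Grammar#History"))
print(strip_anchor("https://en.wikipedia.org/wiki/Grammar"))
```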

Rule editor: correction attribute is not filled

The correction attribute is not filled in automatically, even when it should be, i.e., when a suggestion is generated. Maybe a field should be added to the form to make sure that the correction is filled in.

Otherwise, when the rule is pasted into the rule file, it will cause errors because of the missing correction attribute value. This can be misleading to the user.

Check evaluation does not work

As soon as the evaluation is run, the following message appears:

Error: XML validation failed: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 48; cvc-complex-type.4: Attribute 'id' must appear on element 'category'.

error in rule creator with a correct example

An error occurs when a correct example is added in the rule creator (expert mode):

<rule id="CONFUSION_OF_BED_BAD" name="confusion of bed/bad">
    <pattern>
        <token>bed</token>
        <token>English</token>
    </pattern>
    <message>Did you mean <suggestion>bad English</suggestion>?</message>
    <example correction="bad English">Sorry for my <marker>bed English</marker>.</example>
    <example>Sorry for my bad English</example>
</rule>

Error:
Sorry, an error occurred trying to check your rule: No signature of method: org.languagetool.RuleEditorController.cleanMarkers() is applicable for argument types: (org.languagetool.rules.CorrectExample) values: [Sorry for my bad English]

Prevent users from misusing the email field when suggesting words for the dictionary

On https://community.languagetool.org/suggestion/edit, we're receiving a lot of suggestions that look like this:
[Screenshot: 2017-02-03-222748_1024x768_scrot]
In other words, some users are apparently unaware that they are supposed to enter an email address.
Instead they end up sending us a misspelled word plus a bona fide correction (which can be annoying if it happens too often, because it costs us time).

Suggestion:

  • Add a basic sanity check to the email text field. Checking for the presence of the "@" character would be sufficient for this particular use case.
  • If the above check fails, display some kind of pop-up that explains what the whole process is intended for, something like 'Are you sure you want the word "WORD" to be added to the LanguageTool dictionary? Please double-check its spelling. Click "Yes" to continue, "No" to cancel.'
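
The sanity check suggested above could look roughly like this. It is a sketch in Python rather than the site's actual server code, and `looks_like_email` is a hypothetical name. It goes slightly beyond the bare "@" test by also requiring text on both sides of it, which is an assumption on my part:

```python
def looks_like_email(value: str) -> bool:
    """Very loose check: something, then "@", then something."""
    local, sep, domain = value.strip().partition("@")
    # sep is empty when no "@" is present at all.
    return bool(sep) and bool(local) and bool(domain)


# A corrected word typed into the email field would be rejected,
# which is the moment to show the explanatory pop-up.
for entry in ["user@example.org", "immernoch", ""]:
    print(repr(entry), looks_like_email(entry))
```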

XML comment regarded as text in rule editor

Rule editor regards XML comments as text when editing, leading to a failure of the rule.

Steps to reproduce:
Go to http://community.languagetool.org/rule/show/DET_NOM_SING?subId=1&lang=es

Click on Mostrar XML

Notice there is a TODO comment:

<rule id="DET_NOM_SING" name="Concordancia singular en Determinante + nombre">
  <pattern case_sensitive="yes">
    <token postag="D.{3}S.*" postag_regexp="yes"><exception postag="DI0CS0"/></token>
    <token postag="N.{2}P.*" postag_regexp="yes"><!--TODO: Include adjectives: N.{2}P.*|AQ.{2}P. --><exception negate_pos="yes" postag="N.{2}P.*" postag_regexp="yes"/><exception regexp="yes">[Bb]otones|\p{Lu}\p{L}*</exception></token>
  </pattern>
  <message>Posible falta de concordancia de número entre «\1» y «\2».</message>
  <short>Concordancia de número dudosa.</short>
  <example type="incorrect">Acércame <marker>la sillas</marker>, por favor.</example>
  <example type="correct">Acércame las sillas, por favor.</example>
</rule>

Now click on Mostrar en el Editor de Reglas.

image

Notice there is a warning sign. The plain text of the token is populated with the XML comment with the TODO tag:

<pattern case_sensitive='yes'>
  <token postag='D.{3}S.*' postag_regexp='yes'><exception postag='DI0CS0'></exception></token>
  <token postag='N.{2}P.*' postag_regexp='yes'>TODO: Include adjectives: N.{2}P.*|AQ.{2}P. <exception postag='N.{2}P.*' postag_regexp='yes' negate_pos='yes'></exception><exception regexp='yes'>[Bb]otones|\p{Lu}\p{L}*</exception></token>
 </pattern>

Check evaluation then fails and the resulting XML rule is broken.
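
To illustrate the expected behavior: when the text of a token is collected, comment nodes should be skipped. The following is only a sketch using Python's standard `xml.dom.minidom`, not the rule editor's actual code; `token_text` is a hypothetical helper and the shortened token below is made up for the example:

```python
from xml.dom import minidom


def token_text(token_xml: str) -> str:
    """Collect only the direct text of a <token>, ignoring XML comments."""
    token = minidom.parseString(token_xml).documentElement
    parts = [
        node.data
        for node in token.childNodes
        if node.nodeType == node.TEXT_NODE  # comments are COMMENT_NODE
    ]
    return "".join(parts).strip()


xml = '<token postag="N"><!--TODO: Include adjectives--><exception postag="A"/></token>'
print(repr(token_text(xml)))  # the comment does not end up in the token text
```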

Errors in text extraction on the community website?

In WikiCheck, I get numerous false alarms for sentences like this:

'''Opieka artystyczna''': [[Mariusz Arno Jaworowski]]&lt;br /&gt; 

(br is highlighted as if it were an abbreviation).

I have since immunized the appropriate entities, and I cannot reproduce the problem with the command line, the GUI, or org.languagetool.dev.wikipedia.Main. So why is there a match on the community website?

The error is displayed for this page:

https://pl.wikipedia.org/wiki/Iron_Man_3

There is also no problem getting the text from it using check.getPlainText().

Could a way to blacklist user suggestions be added?

There are a handful of very common spelling errors that we receive as user suggestions over and over again. There are probably fewer than ten of them ('immernoch' and 'vorallem' are examples from German), but having to uncheck them again and again can become slightly annoying for the admins who look after the user suggestions.

Would it be possible to

  • add a text file with these unwanted words that serves as a blacklist,
  • filter user suggestions against this blacklist before they are saved on the server side?
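
Roughly what I have in mind, as a Python sketch rather than actual server code. The function names and the file format (one word per line, `#` for comments) are assumptions, and so is the case-insensitive comparison:

```python
def load_blacklist(lines):
    """Build a set of unwanted words from the lines of a text file."""
    return {
        line.strip().lower()
        for line in lines
        if line.strip() and not line.lstrip().startswith("#")
    }


def filter_suggestions(suggestions, blacklist):
    """Keep only the suggestions that are not on the blacklist."""
    return [word for word in suggestions if word.strip().lower() not in blacklist]


# In practice the lines would come from open("blacklist.txt", encoding="utf-8").
blacklist = load_blacklist(["# common German errors", "immernoch", "vorallem"])
print(filter_suggestions(["Immernoch", "Haus", "vorallem"], blacklist))
```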

words with diacritics in rule editor

Characters with diacritics are not handled properly in the rule editor (in both expert and simple mode). I have not seen this behavior before.

Put a word like "pàgina" in an example sentence, and it is analyzed as having three tokens: "p" "?" "gina".
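
The symptom can be reproduced outside the editor if the text passes through a character set that cannot represent "à". This is only a guess at the cause, not a diagnosis of the site. The Python sketch below uses a made-up `tokenize` helper, not LanguageTool's tokenizer:

```python
import re


def tokenize(text: str):
    """Split into runs of word characters and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


word = "p\u00e0gina"  # "pàgina" with a precomposed à

# Encoding to ASCII with replacement turns the unencodable "à" into "?".
damaged = word.encode("ascii", errors="replace").decode("ascii")

print(tokenize(word))     # one token when the text is handled as Unicode
print(tokenize(damaged))  # ['p', '?', 'gina']
```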

Highlight special characters for admins overseeing addition of new user suggestions to dictionary

This issue is about the page https://community.languagetool.org/suggestion/edit.

While working on the user suggestions for German, I have more than once been one step shy of adding unwanted words with special characters that are visually almost indistinguishable from their "normal" counterparts.
This screenshot shows a good example:
[Screenshot: 2017-02-25-203635_1024x768_scrot]
What looks like the two-character string 'fi' in the second token is actually the ligature 'ﬁ' (a single character).
Similar, more frequent difficulties include the distinction between the German 'ß' and the Greek lowercase beta. Depending on the font, they can be difficult to distinguish.
In my opinion it is far from trivial to do, because the set of expected standard characters is different for each language, but is it possible to highlight "unexpected" characters for the admins in some way (different background color or something)? The effort and cognitive load of checking every suggestion for those characters is too high.
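
As a rough idea of how such a check could work, here is a Python sketch. The expected character set for German is my own assumption and certainly incomplete, and `unexpected_characters` is a hypothetical name. Every character outside the per-language set is reported together with its Unicode name, which the page could then use for highlighting:

```python
import unicodedata

# Assumed set of "normal" characters for German suggestions.
EXPECTED_DE = set(
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "\u00e4\u00f6\u00fc\u00c4\u00d6\u00dc\u00df-"  # ä ö ü Ä Ö Ü ß and hyphen
)


def unexpected_characters(word, expected=EXPECTED_DE):
    """Return (character, Unicode name) pairs for characters outside the set."""
    return [
        (ch, unicodedata.name(ch, "UNKNOWN"))
        for ch in word
        if ch not in expected
    ]


print(unexpected_characters("\ufb01nden"))   # ligature instead of "fi"
print(unexpected_characters("Gro\u03b2e"))   # Greek beta instead of "ß"
print(unexpected_characters("Stra\u00dfe"))  # nothing to report
```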

Add some new features in the language rules lists

Hello,
Taking https://community.languagetool.org/?lang=en as an example, would it be possible to add a few features to the language rule lists?

First, automatic numbering of the rule names.
When a new rule is created, could it be added to the list, sorted alphabetically, and then numbered automatically?

Second, when viewing a rule description, could navigation arrows ("previous rule" and "next rule") be added, so that we do not have to return to the rule list?
I think that would be much easier and take less time.
Regards
Pierre

unable to test rules with unification

The online editor does not support unification checks at all. Suppose you have a rule for a language that has standard unification defined in its grammar.xml, for example Polish:


<rule id="TEST" name="unification test">
    <pattern>
        <unify><feature id="gender"/><feature id="number"/>
        <token skip="-1" postag="(?:subst|ger):.*" postag_regexp="yes"/>
        <marker>
          <token inflected='yes' regexp='yes'>który|jaki</token>
        </marker>
        </unify>
    </pattern>
    <message><suggestion>test</suggestion>?</message>
    <example correction="test">To były rozmowy, <marker>które</marker> nie miały sensu.</example>
</rule>

You get the following error message:

Error: XML validation failed: org.xml.sax.SAXParseException; lineNumber: 15; columnNumber: 20; cvc-id.1: There is no ID/IDREF binding for IDREF 'gender'.

Table content is ignored by WikiCheck

Compare this page:

http://community.languagetool.org/wikiCheck/pageCheck?url=http%3A%2F%2Fpl.wikipedia.org%2Fwiki%2FWWE_Elimination_Chamber&enabled=IMIONA_Z_APOSTROFAMI

The rule IMIONA_Z_APOSTROFAMI should match on this page (there is still an error there, even two), but it doesn't show anything. The errors are correctly detected here:

http://community.languagetool.org/feedMatches/list?lang=pl&notFixedFilter=120&categoryFilter=B%C5%82%C4%99dy+ortograficzne&filter=IMIONA_Z_APOSTROFAMI

It seems it's because tables are not handled yet. They end up in TextConverter.visit(AstNode n) where nothing happens.

New Rule editor mixes up matches

For the suggestion:

\1 to bardzo rzadki wyraz, oznaczający zwolennika sanacji. Czy chodziło o '\1'?

I also have "sa" and "se" in regexp_replace boxes for \1 (there's only one box).

I get:

<message>
<match no="1" regexp_match="sa" regexp_replace="se"/> to bardzo rzadki wyraz, oznaczający zwolennika sanacji. Czy chodziło o <suggestion>\1</suggestion>?
</message>

But there's just one box for regex replacement, so I expected it to be applied to the one contained in the suggestion:

<message>
<match no="1"/> to bardzo rzadki wyraz, oznaczający zwolennika sanacji. Czy chodziło o <suggestion><match no="1" regexp_match="sa" regexp_replace="se"/></suggestion>?
</message>

The bug is that there's no code that recognizes that the same match number is used twice. It may have a different role then!
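
Detecting the situation is simple in principle. The Python sketch below is not the editor's code, and `repeated_backrefs` is a hypothetical name. It counts how often each `\N` reference occurs in the message, so that the editor could offer one replacement box per occurrence instead of one per number:

```python
import re
from collections import Counter


def repeated_backrefs(message: str):
    """Return the match numbers that are referenced more than once."""
    counts = Counter(re.findall(r"\\(\d+)", message))
    return sorted(int(number) for number, count in counts.items() if count > 1)


message = r"\1 to bardzo rzadki wyraz, oznaczający zwolennika sanacji. Czy chodziło o '\1'?"
print(repeated_backrefs(message))  # [1]
```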

Some rules do not run in WikiCheck but run in Recent Changes...
