languagetool-org / languagetool-community-website
the website https://community.languagetool.org
License: GNU Lesser General Public License v2.1
| Compiling 32 source files.
| Error Compilation error: startup failed:
/home/matthias/Projekte/LanguageTool/languagetool-community-website/grails-app/controllers/org/languagetool/RuleEditorController.groovy: 91: unable to resolve class XMLValidator
@ line 91, column 22.
XMLValidator validator = new XMLValidator()
^
/home/matthias/Projekte/LanguageTool/languagetool-community-website/grails-app/controllers/org/languagetool/RuleEditorController.groovy: 91: unable to resolve class XMLValidator
@ line 91, column 34.
XMLValidator validator = new XMLValidator()
^
2 errors
Could you please use the more secure HTTPS URL in the GitHub description?
$ curl -I http://community.languagetool.org/
HTTP/1.1 301 Moved Permanently
Date: Wed, 15 Nov 2017 13:29:07 GMT
Server: Apache/2.4.18 (Ubuntu)
Location: https://community.languagetool.org/
Content-Type: text/html; charset=iso-8859-1
$ curl -I https://community.languagetool.org/
HTTP/1.1 200 OK
Date: Wed, 15 Nov 2017 13:29:01 GMT
Server: Apache-Coyote/1.1
Content-Type: text/html;charset=utf-8
Content-Language: de-DE
Content-Length: 8552
This would also save one request.
Hello there, I have been looking into LanguageTool for some time to help us (a small team of freelance copyeditors) with certain basic editing tasks. We would like to be part of this online community and help enhance the system for various editing tasks with our input and copyediting experience.
For starters, I had some questions, and clarity on these issues would be of great help.
There is some bug that is hard to debug when processing this URL:
When I click Polish, then LanguageTool WikiCheck, I'm transferred to the English LanguageTool WikiCheck (Chrome).
After navigating to a certain section or editing it, the URL contains an anchor ("#blablahla") at the end. The WikiCheck bookmarklet should strip it; right now it does not, so the page is not found at all.
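The stripping step described here can be sketched as follows. This is only an illustration of the logic in Python; the actual bookmarklet is JavaScript, and the anchor name `#Obsada` is made up for the example.

```python
from urllib.parse import urldefrag

def strip_anchor(url):
    """Return the URL without its trailing '#fragment' part, if any."""
    # urldefrag splits a URL into (url_without_fragment, fragment).
    return urldefrag(url).url

# Hypothetical anchor appended after navigating to a section.
print(strip_anchor("https://pl.wikipedia.org/wiki/Iron_Man_3#Obsada"))
```

A URL without an anchor passes through unchanged, so the step is safe to apply unconditionally.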
The correction attribute is not filled in automatically, even when it should be, i.e., when the suggestion is generated. Maybe a field should be added to the form to make sure that the correction is filled in.
Otherwise, when the rule is pasted into the rule file, it will raise errors because of the missing correction attribute value. This can be misleading to the user.
As soon as the evaluation is run, the following message appears:
Error: XML validation failed: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 48; cvc-complex-type.4: Attribute 'id' must appear on element 'category'.
An error occurs when a correct example is added in the rule creator (expert mode):
<rule id="CONFUSION_OF_BED_BAD" name="confusion of bed/bad">
<pattern>
<token>bed</token>
<token>English</token>
</pattern>
<message>Did you mean <suggestion>bad English</suggestion>?</message>
<example correction="bad English">Sorry for my <marker>bed English</marker>.</example>
<example>Sorry for my bad English</example>
</rule>
Error:
Sorry, an error occurred trying to check your rule: No signature of method: org.languagetool.RuleEditorController.cleanMarkers() is applicable for argument types: (org.languagetool.rules.CorrectExample) values: [Sorry for my bad English]
On https://community.languagetool.org/suggestion/edit, we're receiving a lot of suggestions that look like this:
In other words, some users are apparently unaware that they are supposed to enter an email address.
Instead they end up sending us a misspelled word plus a bona fide correction (which can be annoying if it happens too often, because it costs us time).
Suggestion:
For Polish, there is an important mistake -- missing comma between two component sentences in a compound sentence -- which is difficult to fix using WikiCheck because the correction box is tiny. Make the correction box proportional to the length of the error.
See for example:
Rule editor regards XML comments as text when editing, leading to a failure of the rule.
Steps to reproduce:
Go to http://community.languagetool.org/rule/show/DET_NOM_SING?subId=1&lang=es
Click on Mostrar XML
Notice there is a TODO comment:
<rule id="DET_NOM_SING" name="Concordancia singular en Determinante + nombre">
<pattern case_sensitive="yes">
<token postag="D.{3}S.*" postag_regexp="yes"><exception postag="DI0CS0"/></token>
<token postag="N.{2}P.*" postag_regexp="yes"><!--TODO: Include adjectives: N.{2}P.*|AQ.{2}P. --><exception negate_pos="yes" postag="N.{2}P.*" postag_regexp="yes"/><exception regexp="yes">[Bb]otones|\p{Lu}\p{L}*</exception></token>
</pattern>
<message>Posible falta de concordancia de número entre «\1» y «\2».</message>
<short>Concordancia de número dudosa.</short>
<example type="incorrect">Acércame <marker>la sillas</marker>, por favor.</example>
<example type="correct">Acércame las sillas, por favor.</example>
</rule>
Now click on Mostrar en el Editor de Reglas.
Notice there is a warning sign. The plain text of the token is populated with the XML comment with the TODO tag:
<pattern case_sensitive='yes'>
<token postag='D.{3}S.*' postag_regexp='yes'><exception postag='DI0CS0'></exception></token>
<token postag='N.{2}P.*' postag_regexp='yes'>TODO: Include adjectives: N.{2}P.*|AQ.{2}P. <exception postag='N.{2}P.*' postag_regexp='yes' negate_pos='yes'></exception><exception regexp='yes'>[Bb]otones|\p{Lu}\p{L}*</exception></token>
</pattern>
The evaluation check fails and the resulting XML rule is corrupted.
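A minimal sketch of the distinction the editor would need to make, assuming a DOM-style parser: comment nodes and text nodes are different node types, so a token's plain text can be collected without picking up the TODO comment. The helper name `own_text` is hypothetical and the snippet is shortened from the rule above; this is not the rule editor's actual code.

```python
from xml.dom import minidom, Node

# Shortened version of the second token from the rule above.
TOKEN_XML = (
    '<token postag="N.{2}P.*" postag_regexp="yes">'
    '<!--TODO: Include adjectives: N.{2}P.*|AQ.{2}P. -->'
    '<exception regexp="yes">[Bb]otones</exception>'
    '</token>'
)

def own_text(element):
    """Concatenate only direct text children; comments and child elements are skipped."""
    return "".join(
        child.data
        for child in element.childNodes
        if child.nodeType in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE)
    ).strip()

token = minidom.parseString(TOKEN_XML).documentElement
print(repr(own_text(token)))  # the comment is not part of the token text
```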
In WikiCheck, I get numerous false alarms for sentences like this:
'''Opieka artystyczna''': [[Mariusz Arno Jaworowski]]<br />
(br is highlighted as if it were an abbreviation).
Now, I immunized the appropriate entities, and I cannot reproduce this using the command line, the GUI, or org.languagetool.dev.wikipedia.Main. So why is there a match on the community website?
The error is displayed for this page:
https://pl.wikipedia.org/wiki/Iron_Man_3
And there is no problem in getting text from it using check.getPlainText()
There are a handful of very common spelling errors that we receive as user suggestions over and over again. There are probably fewer than ten of those ('immernoch' and 'vorallem' are examples for German), but having to uncheck them again and again can become slightly annoying for the admins who look after the user suggestions.
Would it be possible to
Characters with diacritics are not handled properly in the rule editor (both expert and simple mode). I have never seen this behavior before.
Put a word like "pàgina" in an example sentence, and it is analyzed as having three tokens: "p" "?" "gina".
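The symptom is consistent with the text passing through a charset that cannot represent "à" somewhere between the form and the tokenizer. That cause is an assumption, not something confirmed in this report. The sketch below reproduces the same three-token split with a toy tokenizer (not LanguageTool's).

```python
import re

def tokenize(text):
    """Toy tokenizer: runs of word characters, or single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)

word = "p\u00e0gina"  # "pàgina" with a precomposed à

# Assumed failure mode: the text is squeezed through a charset that lacks "à",
# so the character is replaced by "?".
mangled = word.encode("ascii", errors="replace").decode("ascii")

print(tokenize(word))     # ['pàgina']
print(tokenize(mangled))  # ['p', '?', 'gina']
```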
This issue is about the page https://community.languagetool.org/suggestion/edit.
While working on the user suggestions for German, I have more than once been one step shy of adding unwanted words with special characters that are visually almost indistinguishable from their "normal" counterparts.
This screenshot shows a good example:
What looks like the two-character string 'fi' in the second token is actually the ligature 'ﬁ' (a single character).
Similar, more frequent difficulties include the distinction between the German 'ß' and the Greek lowercase beta. Depending on the font, they can be difficult to distinguish.
In my opinion it is far from trivial to do, because the set of expected standard characters is different for each language, but is it possible to highlight "unexpected" characters for the admins in some way (different background color or something)? The effort and cognitive load of checking every suggestion for those characters is too high.
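One possible approach, sketched under the assumption that each language gets a hand-maintained set of expected characters: report every character outside that set together with its Unicode name, so the admin page could highlight it. The set `EXPECTED_DE` and the function name are illustrative only.

```python
import unicodedata

# Assumption: a hand-picked set of expected characters for German.
# A real list would need review by the language maintainers.
EXPECTED_DE = set(
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "äöüÄÖÜß-' "
)

def unexpected_chars(text, expected=EXPECTED_DE):
    """Return (index, char, Unicode name) for every character outside the expected set."""
    return [
        (i, ch, unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if ch not in expected
    ]

print(unexpected_chars("Gra\ufb01k"))   # contains the single-character ligature
print(unexpected_chars("Stra\u03b2e"))  # Greek beta instead of ß
print(unexpected_chars("Straße"))       # nothing to report
```

Reporting the Unicode name alongside the position makes visually identical characters distinguishable without relying on the font.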
Hello,
Do you think it would be possible to add a few features to the language rule lists (for example, the one at https://community.languagetool.org/?lang=en)?
First, I would suggest automatic numbering of the rule names.
When a new rule is created, could it be added to the list, sorted alphabetically, and then numbered automatically?
Second, when we are viewing a rule description, could navigation arrows ("previous rule" and "next rule") be added, so that we don't have to return to the rule list?
I think this would be much easier and would take less time.
Regards
Pierre
Please, could you add other ways to log in to http://community.languagetool.org/user/login ?
Transifex already allows it with Twitter, Linkedin, Google, Facebook:
https://www.transifex.com/signin/?next=/projects/p/languagetool/
I think you should also add this possibility (with the websites listed above + GitHub too).
Thank you.
The online editor does not offer unification checks at all. If you have a rule for a language with standard unification defined in grammar.xml, for example for Polish:
<rule id="TEST" name="unification test">
<pattern>
<unify><feature id="gender"/><feature id="number"/>
<token skip="-1" postag="(?:subst|ger):.*" postag_regexp="yes"/>
<marker>
<token inflected='yes' regexp='yes'>który|jaki</token>
</marker>
</unify>
</pattern>
<message><suggestion>test</suggestion>?</message>
<example correction="test">To były rozmowy, <marker>które</marker> nie miały sensu.</example>
</rule>
You get the following error message:
Error: XML validation failed: org.xml.sax.SAXParseException; lineNumber: 15; columnNumber: 20; cvc-id.1: There is no ID/IDREF binding for IDREF 'gender'.
It'd be useful if WikiCheck could check (and display) whether a Wikipedia article is protected or semi-protected, because regular users cannot edit such articles.
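A rough sketch of how such a check might look, assuming the MediaWiki API query `action=query&prop=info&inprop=protection` and its default JSON layout (pages keyed by page ID). The function names are hypothetical, and the sample response is trimmed by hand rather than captured from a live request.

```python
from urllib.parse import urlencode

def protection_query_url(lang, title):
    """Build a MediaWiki API URL that asks for a page's protection settings."""
    params = {
        "action": "query",
        "prop": "info",
        "inprop": "protection",
        "titles": title,
        "format": "json",
    }
    return f"https://{lang}.wikipedia.org/w/api.php?" + urlencode(params)

def edit_protection_level(response):
    """Return the edit protection level (e.g. 'autoconfirmed', 'sysop') or None."""
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        for entry in page.get("protection", []):
            if entry.get("type") == "edit":
                return entry.get("level")
    return None

# Illustrative response shape only; not real data for this article.
sample = {
    "query": {
        "pages": {
            "123": {
                "title": "Iron Man 3",
                "protection": [
                    {"type": "edit", "level": "autoconfirmed", "expiry": "infinity"}
                ],
            }
        }
    }
}

print(protection_query_url("pl", "Iron Man 3"))
print(edit_protection_level(sample))
```

An `autoconfirmed` level would correspond to a semi-protected article; `None` means no edit protection was reported.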
See:
languagetool-org/languagetool#100
Compare this page:
The rule IMIONA_Z_APOSTROFAMI should work for the page (there is an error still there, even two) but it doesn't show anything. The errors are correctly detected here:
It seems it's because tables are not handled yet. They end up in TextConverter.visit(AstNode n), where nothing happens.
For the suggestion:
\1 to bardzo rzadki wyraz, oznaczający zwolennika sanacji. Czy chodziło o '\1'?
I also have "sa" and "se" in regexp_replace boxes for \1 (there's only one box).
I get:
<message>
<match no="1" regexp_match="sa" regexp_replace="se"/> to bardzo rzadki wyraz, oznaczający zwolennika sanacji. Czy chodziło o <suggestion>\1</suggestion>?
</message>
But there's just one box for regex replacement, so I expected it to be applied to the one contained in the suggestion:
<message>
<match no="1"/> to bardzo rzadki wyraz, oznaczający zwolennika sanacji. Czy chodziło o <suggestion><match no="1" regexp_match="sa" regexp_replace="se"/></suggestion>?
</message>
The bug is that there's no code that recognizes that the same match number is used twice. It may have a different role then!
For example, PRZE:
But see here:
What's the problem? Maybe there's a version mismatch between Recent Changes check and the WikiCheck? Anyway, it's quite counterintuitive.