josephwright / translator
Easy translation of strings in LaTeX
License: Other
Beamer example with Turkish produces an error
Too many }'s. \providetranslation{Four}{D\"[o}rt}
The issue is described here: https://tex.stackexchange.com/questions/542179/fixing-beamer-turkish-translation
\documentclass[turkish]{beamer}
\begin{document}
\begin{frame}
\today
\end{frame}
\end{document}
A given word can have many meanings. The meaning conveyed by a specific occurrence of a word depends on its context. Conversely, a given meaning can be conveyed by many words. In a given context, some of these words may be more suitable than others to convey the meaning. Therefore, when translating a word from one language to another, the context in which the word is used is crucial for determining the suitable translation.
In the current implementation of the translator package, a word's context, on the rare occasions it is given at all, is part of the key. For instance, in the basic standard dictionary there are two keys for the entry encl: encl (plural) and encl (singular).
One problematic implication of making context part of the key shows up when a new language is added to an existing dictionary type, for instance when translator-basic-dictionary-Hebrew.dict is added to the list of dictionaries of type translator-basic-dictionary. The new language may require more context on a certain key in order to produce a suitable translation. For instance, the key author, which is part of the basic standard dictionaries, takes two different grammatical forms in Hebrew, depending on the author's gender: a male author is מחבר, whereas a female author is מחברת. There is no gender-neutral form in Hebrew, and using one of these forms where the other is expected would be inappropriate. In the current architecture of the translator package, accommodating Hebrew means adding a new pair of keys to the dictionary, author (male) and author (female), and, which is the important and problematic part, replacing every use of the old key, e.g. in the beamer class, with one of the two new keys according to context. Doing so will break existing code, and will require users of languages in which this gender distinction plays no part (e.g. English) to provide duplicate translations for the new contextualized keys. More importantly, this can't be done by the author of the beamer class, because they can't tell in advance whether the author referred to by this key is male or female.
I submit that this situation is unacceptable and unsustainable.
In its stead a mechanism should be set up that would decouple keys from their contexts, and enable the unobtrusive addition of an arbitrary number of context indicators to every key without affecting existing usages of the key, and with each language being able to choose which context indicators of a given key it needs and will respond to.
This requires serious thought, but as a first "draft" I suggest adding a context option to the \providetranslation command. In this way the writer of translator-basic-dictionary-English.dict can specify \providetranslation{author}{author}, whereas the writer of translator-basic-dictionary-Hebrew.dict can specify \providetranslation{author}[context=male]{מחבר} and \providetranslation{author}[context=female]{מחברת}.
Now, to "deploy" a translation, the class or package that uses the \translate command must use it inside a command or environment that is part of the class/package public user interface available to end users. For instance, if the beamer class wants to use the translator package to translate the key author, it must do so inside the command \author, which is made available for end users of the beamer class. The \author command, in turn, must now be equipped with a context option that end users can optionally use to supply context. A Hebrew user of this command would use it in the following manner: \author[context=female]{לאה גולדברג}, or \author[context=male]{סייד קשוע}, whereas an English user can use it either with or without the context option: \author{Jack London} or \author[context=male]{Jack London}. For the English user the context option will be ignored, since the basic dictionary doesn't specify any context indicators for the key author. However, the translator package can now choose the correct translation from the Hebrew version of the basic dictionary based on the context supplied by the end user.
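The lookup rule proposed above (use the contextual entry if one exists, otherwise fall back to the context-free entry, otherwise to the key itself) can be sketched in a few lines of LaTeX. This is only an illustration: \ctxprovide and \ctxtranslate are hypothetical names that are not part of the translator package, the sketch ignores the language dimension entirely, and the German forms Autor/Autorin stand in for the Hebrew ones so that the file compiles with plain pdfLaTeX.

```latex
\documentclass{article}
\makeatletter
% Hypothetical helpers, NOT part of the translator package.
% \ctxprovide{<key>}[<context>]{<text>} stores <text> under key + context;
% the [<context>] argument is optional and defaults to empty.
\newcommand*\ctxprovide[1]{%
  \@ifnextchar[{\ctx@provide{#1}}{\ctx@provide{#1}[]}}
\def\ctx@provide#1[#2]#3{\@namedef{ctx@#1@#2}{#3}}
% \ctxtranslate[<context>]{<key>} tries key + context first, then the
% context-free entry, and finally prints the key itself.
\newcommand*\ctxtranslate[2][]{%
  \@ifundefined{ctx@#2@#1}%
    {\@ifundefined{ctx@#2@}{#2}{\@nameuse{ctx@#2@}}}%
    {\@nameuse{ctx@#2@#1}}}
\makeatother

% A dictionary that needs no context for this key:
\ctxprovide{editor}{editor}
% A dictionary that distinguishes by gender:
\ctxprovide{author}[male]{Autor}
\ctxprovide{author}[female]{Autorin}

\begin{document}
\ctxtranslate[female]{author}  % contextual entry exists and is used
\ctxtranslate[male]{editor}    % no contextual entry: context-free entry
\ctxtranslate{author}          % no context-free entry: the key itself
\end{document}
```

The point of the fallback chain is that a dictionary which never mentions a context keeps working unchanged, while a dictionary that needs the distinction can opt in key by key.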
Dear Joseph Wright,
Is it possible to add \DeclareOption{arabic}{\trans@use@and@alias{arabic}{Arabic}} to the translator package for Arabic dictionaries?
In French, months are usually typeset in lowercase, except at the beginning of a sentence. Hence, it would be nice to provide lowercase month names along with the uppercase ones in translator-months-dictionary-*.dict.
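Until something like this exists in the shipped dictionaries, a document can define its own lowercase entries. The following is a minimal sketch under stated assumptions: the lowercase keys january and february are hypothetical and are not assumed to exist in any translator-months-dictionary-*.dict file; only \providetranslation with the to option and \translate are taken from the package.

```latex
\documentclass[french]{article}
\usepackage[T1]{fontenc}
\usepackage[french]{babel}
\usepackage[french]{translator}

% Hypothetical lowercase keys, defined locally in the preamble.
\providetranslation[to=French]{january}{janvier}
\providetranslation[to=French]{february}{f\'evrier}

\begin{document}
% Mid-sentence use, where French expects a lowercase month name.
Le 3 \translate{january}, puis le 10 \translate{february}.
\end{document}
```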
Would it be possible to choose the language for the translations in \sisetup without having it set as a global option?
translator currently defines language aliases only if the language is in the package option list. In my opinion it would make sense to define them always, so that casual translations such as those done by siunitx also work when the language is not given as a document option:
\documentclass{article}
\usepackage[ngerman]{babel}
\usepackage{siunitx}
\providetranslation[to = German]{sheep}{Schaf}
\begin{document}
\SIrange{1}{4}{\meter}, \translate{sheep}
\languagealias{ngerman}{German}
\SIrange{1}{4}{\meter}, \translate{sheep}
\end{document}
Hello, and sorry if this is a dumb question.
I am trying to translate some words to French. Here is my minimal working example:
\documentclass[french]{article}
\usepackage[french]{babel}
\usepackage[french]{translator}
\begin{document}
\translate{author}
\translate{Monday}
\translate{January}
\end{document}
What I expect is to see the words "auteur Lundi Janvier" (French translations of "author Monday January"); what I get is the original English words: "author Monday January".
Am I missing something?
I am using Debian testing. I get the issue with latex, pdflatex and lualatex. I get the issue with the package installed by apt, and also when I compile this example inside a clone of this repository.
Any ideas?
-- Louis
$ pdflatex --version
pdfTeX 3.14159265-2.6-1.40.20 (TeX Live 2019/Debian)
kpathsea version 6.3.1
Copyright 2019 Han The Thanh (pdfTeX) et al.
There is NO warranty. Redistribution of this software is
covered by the terms of both the pdfTeX copyright and
the Lesser GNU General Public License.
For more information about these matters, see the file
named COPYING and the pdfTeX source.
Primary author of pdfTeX: Han The Thanh (pdfTeX) et al.
Compiled with libpng 1.6.37; using libpng 1.6.37
Compiled with zlib 1.2.11; using zlib 1.2.11
Compiled with xpdf version 4.01
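One possible explanation, which the report itself does not confirm, is that the example never asks translator to load any dictionary, so no French entries are available and \translate prints the key unchanged. A minimal sketch of that reading follows; it assumes the standard months dictionary contains the key January, and it makes no claim about whether author or Monday exist as keys in the shipped dictionaries.

```latex
\documentclass[french]{article}
\usepackage[french]{babel}
\usepackage[french]{translator}

% Assumed missing step: request the dictionary kinds that hold the keys.
% The matching French files are then looked up when the document starts.
\usedictionary{translator-basic-dictionary}
\usedictionary{translator-months-dictionary}

\begin{document}
\translate{January}
\end{document}
```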
A UTF-8 BOM (EF, BB, BF; U+FEFF) appears to have snuck into the first bytes of translator-basic-dictionary-Russian.dict (and possibly other files), leading to my TeX installation choking on it. A \DeclareUnicodeCharacter{FEFF}{} right before \begin{document} fixes the issue, but it would be better to take the BOM out. (Note that the BOM is invisible in some, but not all, text editors.)
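For reference, the workaround described above might look as follows in a complete file. This is a sketch that assumes pdfLaTeX with UTF-8 input and installed T2A fonts; the key Figure is only a placeholder for any entry of the Russian basic dictionary.

```latex
\documentclass[russian]{article}
\usepackage[utf8]{inputenc}
\usepackage[T2A]{fontenc}
\usepackage[russian]{babel}
\usepackage[russian]{translator}
\usedictionary{translator-basic-dictionary}

% Workaround from the report: make U+FEFF (the BOM) expand to nothing,
% so a stray BOM at the start of a .dict file is silently ignored.
\DeclareUnicodeCharacter{FEFF}{}

\begin{document}
\translate{Figure}
\end{document}
```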
Would it be possible to add in the documentation a comparison of the current package with the translation
one which seems to provide the same features?
Dear Joseph,
I have compiled a Hungarian translator-months-dictionary. Please, update your translator package with it.
Thanks a lot
jf20191031
translator-months-dictionary-Hungarian.txt
Please find attached all the standard dictionaries, translated to Hebrew.
translator-basic-dictionary-Hebrew.txt
translator-bibliography-dictionary-Hebrew.txt
translator-environment-dictionary-Hebrew.txt
translator-months-dictionary-Hebrew.txt
translator-numbers-dictionary-Hebrew.txt
translator-theorem-dictionary-Hebrew.txt
Important notes.
The standard, and most commonly used, Hebrew equivalent of "and" is not a standalone word, but a prefix. This requires special handling that I don't know how to do. A solution to the problem in the case of biblatex was suggested here.
Similar remarks apply to the Hebrew equivalents of "to" and "from", though these words may have appropriate standalone versions, depending on context. For instance, "from (sender of an email)" may have a reasonable standalone Hebrew translation, but "from (place)" does not have a standalone Hebrew translation, only a prefix. This is very context-sensitive.
This issue surfaces also in the "in" entry of the bibliography dictionary.
By the way, similar considerations (probably even more complicated than in Hebrew) apply to agglutinative languages, such as Finnish and Turkish, perhaps not with respect to the word "and", but with respect to "in" and "from".
Hebrew grammar is heavily gendered. In particular, the words for author and authors are gendered, and - especially in the case of a single author - cannot be given a single word translation. Ideally there should be three, or even four versions: author (male), author (female), author (male/female), author (female/male), and the same for authors. The same applies to the words "editor"/"editors" in the bibliography dictionary. This issue also surfaces in the numbers dictionary, especially in the ordinals.
By the way, this is also relevant to many European languages, and I'm surprised this has not yet come up with respect to other languages with uploaded dictionaries, such as French, German, Italian, Spanish.
Hebrew grammar is heavily sensitive to singular-plural distinctions. Hebrew has a very small number of collective nouns (not sure this is the linguistically correct terminology; perhaps "uncountable nouns" is better) in comparison with English. For instance the words "sheep" and "work" have different grammatical forms in Hebrew depending on whether they refer to an individual sheep/piece of work, or to a multitude of sheep/pieces of work. In particular, the term "related work" should have two versions: "related work (singular)", and "related work (plural)". However, if "related work" is intended to be used as a section heading, I guess the plural form is the correct one, which is the form I used in the attached files.
This issue surfaces also in the "Tech. Rep." entry in the bibliography dictionary. Is "Tech. Rep." an abbreviation for "Technical Report" or "Technical Reports"?
In the bibliography dictionary, is "ed." an abbreviation of "edition" or of "editor"? I translated it as "edition". The same applies to "eds."
The environment dictionary: context is everything for translating these terms appropriately.
A general word of advice/opinion: the translator package, in its very essence, deals with languages, and should probably have been designed, and should be developed, in consultation with a linguist, or at least a polyglot.
Syntax error in line 19 of translator-numbers-dictionary-Turkish.dict
-\providetranslation{Four}{D\"[o}rt}
+\providetranslation{Four}{D\"{o}rt}
I suggest that the manual state that the translator package ignores case when searching for dictionary files to load. Thus, if a dictionary file called MyDiCt-MyLaNg.DiCt exists in the working directory, it will be loaded by the following commands
\usedictionary{mydict}
\uselanguage{mylang}
as well as by the following commands
\usedictionary{MydicT}
\uselanguage{myLang}
etc.
This is a consequence of the fact that the translator code loads the file using the LaTeX command \IfInputFileExists (see here), which performs a case-insensitive search.
This has various consequences that can be tricky to predict. For instance, if two dictionary files whose names are the same except for capitalization are saved in the working directory, it can be tricky to predict which will be loaded.
Even if the search matches only a single dictionary file, the behavior can be perplexing for a user who is under the impression that the search is case sensitive. For instance, the following code prints "A phrase in German", but if you replace every occurrence of german by German and vice versa, except for the babel option, it will print phrase.
\begin{filecontents*}[overwrite]{Test-German.dict}
\providetranslation{phrase}{A phrase in German}
\end{filecontents*}
\documentclass{article}
\usepackage[german]{babel}
\usepackage{translator}
\usedictionary{Test}
\uselanguage{german}
\begin{document}
\translate{phrase}
\end{document}