translator's People

Contributors

bisgardo, houdik, jacopodonati, josephwright, kernela, liuq, naaci, ndandanov, paternal, samcarter, sfr682k, tijssen

translator's Issues

Enhancement request: Decouple keys from context, and set up a mechanism to handle context

A given word can have many meanings. The meaning conveyed by a specific occurrence of a word depends on its context. Conversely, a given meaning can be conveyed by many words. In a given context, some of these words may be more suitable than others to convey the meaning. Therefore, when translating a word from one language to another, the context in which the word is used is crucial for determining the suitable translation.

In the current implementation of the translator package, a word's context, on the rare occasions it is given at all, is part of the key. For instance, in the basic standard dictionary there are two keys for the entry encl: encl (plural) and encl (singular).

One problematic implication of including context in keys shows up when a new language is added to an existing dictionary type, for instance when translator-basic-dictionary-Hebrew.dict is added to the dictionaries of type translator-basic-dictionary. The new language may require more context for a certain key in order to produce a suitable translation. For instance, the key author, which is part of the basic standard dictionaries, takes two different grammatical forms in Hebrew, depending on the author's gender: a male author is מחבר, whereas a female author is מחברת. There is no gender-neutral form in Hebrew, and using one form where the other is expected would be inappropriate. In the current architecture of the translator package, accommodating Hebrew requires adding a new pair of keys to the dictionary, author (male) and author (female). The important and problematic part is that every use of the old key, e.g. in the beamer class, must then be replaced by one of the two new keys, according to context. Doing so breaks existing code, and forces users of languages in which this gender distinction plays no part (e.g. English) to provide duplicate translations for the new contextualized keys. More importantly, the author of the beamer class cannot do this at all, because they cannot tell in advance whether the author referred to by this key is male or female.

I claim that this situation is unacceptable and unsustainable.

In its stead a mechanism should be set up that would decouple keys from their contexts, and enable the unobtrusive addition of an arbitrary number of context indicators to every key without affecting existing usages of the key, and with each language being able to choose which context indicators of a given key it needs and will respond to.

This requires serious thought, but as a first "draft" I suggest adding a context option to the \providetranslation command. In this way the writer of translator-basic-dictionary-English.dict can specify \providetranslation{author}{author}, whereas the writer of translator-basic-dictionary-Hebrew.dict can specify \providetranslation{author}[context=male]{מחבר} and \providetranslation{author}[context=female]{מחברת}.

Now, to "deploy" a translation, the class or package that uses the \translate command must use it inside a command or environment that is part of the class/package public user interface that is available to the end users. For instance, if the beamer class wants to use the translator package to translate the key author, it must do so inside the command \author, which is made available for end users of the beamer class to use. The \author command, in turn, must now be equipped with a context option, that the end users can optionally use to supply context. A Hebrew user of this command would use it in the following manner: \author[context=female]{לאה גולדברג}, or \author[context=male]{סייד קשוע}, whereas an English user of this command can use it either with or without the context option: \author{Jack London} or \author[context=male]{Jack London}. For the English user the context option will be ignored, since the basic dictionary doesn't specify any context indicators for the key author. However, the translator package can now choose the correct translation from the Hebrew version of the basic dictionary based on the context supplied by the end user.
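The lookup behaviour described above (use the contextualized translation when the dictionary has one, otherwise ignore the context) can be emulated today with a few lines of plain LaTeX. The following is only a rough sketch and not part of the translator package: the commands \providectxtranslation and \translatectx are hypothetical names invented for this illustration, the storage scheme is an assumption, and French is used instead of Hebrew only so that the example compiles with pdflatex without extra font setup.

```latex
\documentclass{article}
\usepackage[french]{babel}

\makeatletter
% Hypothetical helper (not a translator command).
% #1 = language, #2 = key, #3 = context (may be empty), #4 = translation
% Stores the translation in the macro \ctx@<language>@<key>@<context>.
\newcommand\providectxtranslation[4]{%
  \expandafter\def\csname ctx@#1@#2@#3\endcsname{#4}}

% Hypothetical helper (not a translator command).
% #1 = optional context, #2 = key
% Tries <key> with the given context first, then <key> without context,
% and finally falls back to printing the key itself.
\newcommand\translatectx[2][]{%
  \ifcsname ctx@\languagename @#2@#1\endcsname
    \csname ctx@\languagename @#2@#1\endcsname
  \else\ifcsname ctx@\languagename @#2@\endcsname
    \csname ctx@\languagename @#2@\endcsname
  \else
    #2%
  \fi\fi}
\makeatother

% A key that needs context in this language ...
\providectxtranslation{french}{author}{male}{auteur}
\providectxtranslation{french}{author}{female}{autrice}
% ... and a key for which this language defines no context indicators.
\providectxtranslation{french}{title}{}{titre}

\begin{document}
\translatectx[female]{author} % prints "autrice"

\translatectx[female]{title}  % context is ignored: prints "titre"
\end{document}
```

Existing uses of a key keep working in this scheme because a call without the optional argument simply looks up the context-free entry; each language decides for itself which contexts it registers.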

Add arabic option

Dear Joseph Wright,
Would it be possible to add \DeclareOption{arabic}{\trans@use@and@alias{arabic}{Arabic}} to the translator package
for Arabic dictionaries?

Lowercase along with uppercase months

In French, months are usually typeset in lowercase, except at the beginning of a sentence. Hence, it would be nice to provide in translator-months-dictionary-*.dict lowercase along with uppercase months.
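Until the dictionaries offer both forms, one possible workaround is to fetch the translation into a macro and lowercase it at the point of use. This is a minimal sketch, assuming that the months dictionary provides the key January and that \translatelet behaves as described in the translator manual; the macro name \trmonth is made up for this example.

```latex
\documentclass{article}
\usepackage[french]{babel}
\usepackage[french]{translator}
% Request the months dictionary explicitly.
\usedictionary{translator-months-dictionary}

\begin{document}
% Store the translation of the key "January" in \trmonth ...
\translatelet\trmonth{January}
% ... and force lowercase for mid-sentence use.
en \MakeLowercase{\trmonth}
\end{document}
```

This only changes the casing at the call site; it does not replace proper lowercase entries in translator-months-dictionary-*.dict, which would still be the cleaner solution.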

predefine language alias outside package options

translator currently defines language aliases only if the language is in the package option list. In my opinion it would make sense to define them always, so that casual translations, such as those done by siunitx, also work if the language is not given as a document option:

\documentclass{article}
\usepackage[ngerman]{babel}
\usepackage{siunitx}
\providetranslation[to = German]{sheep}{Schaf}
\begin{document}
\SIrange{1}{4}{\meter}, \translate{sheep}

\languagealias{ngerman}{German}
\SIrange{1}{4}{\meter}, \translate{sheep}
\end{document}

\translate does not translate

Hello, and sorry if this is a dumb question.

I am trying to translate some words to French. Here is my minimal working example:

\documentclass[french]{article}

\usepackage[french]{babel}
\usepackage[french]{translator}


\begin{document}

\translate{author}
\translate{Monday}
\translate{January}

\end{document}

What I expect is to see the words "auteur Lundi Janvier" (French translations of "author Monday January"); what I get is the original English words: "author Monday January".

Am I missing something?

I am using Debian testing. I get the issue with latex, pdflatex and lualatex, both with the package installed by apt and when I compile this example in this cloned repository.

Any idea?

-- Louis

$ pdflatex --version
pdfTeX 3.14159265-2.6-1.40.20 (TeX Live 2019/Debian)
kpathsea version 6.3.1
Copyright 2019 Han The Thanh (pdfTeX) et al.
There is NO warranty.  Redistribution of this software is
covered by the terms of both the pdfTeX copyright and
the Lesser GNU General Public License.
For more information about these matters, see the file
named COPYING and the pdfTeX source.
Primary author of pdfTeX: Han The Thanh (pdfTeX) et al.
Compiled with libpng 1.6.37; using libpng 1.6.37
Compiled with zlib 1.2.11; using zlib 1.2.11
Compiled with xpdf version 4.01

UTF-8 BOM in Russian dictionary

A UTF-8 BOM (EF, BB, BF; U+FEFF) appears to have snuck into the first bytes of translator-basic-dictionary-Russian.dict (and possibly other files), leading to my TeX installation choking on it. A \DeclareUnicodeCharacter{FEFF}{} right before \begin{document} fixes the issue, but it would be better to take the BOM out. (Note that the BOM is invisible in some, but not all, text editors.)
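For reference, the workaround mentioned above would sit in the preamble roughly as follows. This is a sketch only, assuming pdflatex (where \DeclareUnicodeCharacter is available) and a T2A font setup for the Cyrillic output; the key author is used merely as an example of an entry in the basic dictionary.

```latex
\documentclass{article}
\usepackage[T2A]{fontenc}
\usepackage[russian]{babel}
\usepackage[russian]{translator}
\usedictionary{translator-basic-dictionary}

% Workaround: let U+FEFF (the BOM) expand to nothing, so that a BOM at the
% start of a .dict file is silently ignored when the file is read.
\DeclareUnicodeCharacter{FEFF}{}

\begin{document}
\translate{author}
\end{document}
```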

A Hebrew translation of all the standard dictionaries

Please find attached all the standard dictionaries, translated to Hebrew.

translator-basic-dictionary-Hebrew.txt
translator-bibliography-dictionary-Hebrew.txt
translator-environment-dictionary-Hebrew.txt
translator-months-dictionary-Hebrew.txt
translator-numbers-dictionary-Hebrew.txt
translator-theorem-dictionary-Hebrew.txt

Important notes.

  1. The standard, and most commonly used Hebrew equivalent of and is not a standalone word, but a prefix. This requires special handling that I don't know how to do. A solution to the problem in the case of biblatex was suggested here.

    Similar remarks apply to the Hebrew equivalents of "to" and "from", though these words may have appropriate standalone versions, depending on context. For instance, "from (sender of an email)" may have a reasonable standalone Hebrew translation, but "from (place)" does not have a standalone Hebrew translation, only a prefix. This is very context-sensitive.

    This issue surfaces also in the "in" entry of the bibliography dictionary.

    By the way, similar considerations (probably even more complicated than in Hebrew) apply to agglutinative languages, such as Finnish and Turkish, perhaps not w.r.t. the word "and", but with respect to "in" and "from".

  2. Hebrew grammar is heavily gendered. In particular, the words for author and authors are gendered, and - especially in the case of a single author - cannot be given a single word translation. Ideally there should be three, or even four versions: author (male), author (female), author (male/female), author (female/male), and the same for authors. The same applies to the words "editor"/"editors" in the bibliography dictionary. This issue also surfaces in the numbers dictionary, especially in the ordinals.

    By the way, this is also relevant to many European languages, and I'm surprised this has not yet come up with respect to other languages with uploaded dictionaries, such as French, German, Italian, Spanish.

  3. Hebrew grammar is heavily sensitive to singular-plural distinctions. Hebrew has a very small number of collective nouns (not sure this is the linguistically correct terminology; perhaps "uncountable nouns" is better) in comparison with English. For instance the words "sheep" and "work" have different grammatical forms in Hebrew depending on whether they refer to an individual sheep/piece of work, or to a multitude of sheep/pieces of work. In particular, the term "related work" should have two versions: "related work (singular)", and "related work (plural)". However, if "related work" is intended to be used as a section heading, I guess the plural form is the correct one, which is the form I used in the attached files.

    This issue surfaces also in the "Tech. Rep." entry in the bibliography dictionary. Is "Tech. Rep." an abbreviation for "Technical Report" or "Technical Reports"?

  4. In the bibliography dictionary, is "ed." an abbreviation of "edition" or of "editor"? I translated it as "edition". The same applies to "eds."

  5. The environment dictionary: context is everything for translating these terms appropriately.

  6. A general word of advice/opinion: the package translator, in its very essence, deals with languages, and should probably have been designed, and should be developed in consultation with a linguist, or at least a polyglot.

The manual should state that the search for dictionary files is case-insensitive

I suggest that the manual state that the translator package ignores case when searching for dictionary files to load. Thus, if a dictionary file called MyDiCt-MyLaNg.DiCt exists in the working directory, it will be loaded by the following commands

\usedictionary{mydict}
\uselanguage{mylang}

as well as by the following commands

\usedictionary{MydicT}
\uselanguage{myLang}

etc.

This is a consequence of the fact that the translator code loads the file using the LaTeX command \IfInputFileExists (see here), which performs a case-insensitive search.

This has various consequences that can be tricky to predict. For instance, if two dictionary files whose names are the same except for capitalization are saved in the working directory, it can be tricky to predict which will be loaded.

Even if the search matches only a single dictionary file, the behavior can be perplexing for a user who is under the impression that the search is case sensitive. For instance the following code prints "A phrase in German", but if you replace every occurrence of german by German and vice versa except for the babel option, it'll print phrase.

\begin{filecontents*}[overwrite]{Test-German.dict}
\providetranslation{phrase}{A phrase in German}
\end{filecontents*}

\documentclass{article}
\usepackage[german]{babel}
\usepackage{translator}

\usedictionary{Test}
\uselanguage{german}
\begin{document}

\translate{phrase}

\end{document}
