Comments (8)
I started and pretty much finished this (here https://github.com/elixir-lang/gettext/blob/duplicate-translations/lib/gettext/po/parser.ex#L66-L95) and it works pretty smoothly. I think there's one inconsistency though: when there are duplicate translations, a Gettext.PO.SyntaxError
is raised in Gettext.PO.parse_string!/1
and Gettext.PO.parse_file!/1
. This happens because the parser returns a {:error, line, reason}
tuple where there are duplicate translations, just like when there's any kind of other parsing/tokenizing error.
Personally, I'd really like to raise something like Gettext.PO.DuplicateTranslationError
instead of a syntax error since the syntax is fine (one could argue with this, but I think this error is not what most people would call a "syntax error" even if it technically was).
I can think of two solutions: we make the parser return {:error, type, line, reason}
instead of {:error, line, reason}
(I'm not a fan of this one) or we check for duplicate translations outside of the parser; if we went with the second suggestion, I really have no idea where we should perform such check; Gettext.PO.parse_string/1
or Gettext.PO.parse_file
are still parse-related (as the name suggests), so I'm not sure we could check there (assuming that's the natural place where you would want to check, right after calling Gettext.PO.Parser.parse/1
).
What do you think? /cc @josevalim
from gettext.
I think SyntaxError is fine for now. We can discuss if it is an issue in the future.
Btw, there is an easier/faster way to implement the duplicates check. You can call Enum.reduce using a HashDict as accumulator. For every entry, you check if there was something in the HashDict already. The downside is that we are not going to accumulate all translations (which I think is fine) but on the upside it is going to be much faster in the common case (where everything worksβ’).
from gettext.
@josevalim mmm, but I still need to jump out of Enum.reduce
early if I find a duplicate, right?
If I had something like this
Enum.reduce translations, HashDict.new, fn
%Translation{id: id, po_source: {_, line}}, acc ->
Dict.update(acc, id, [line], &[line|&1])
# same thing for PluralTranslation
end
I could still keep track of all the duplicate translations, but to find them I would have to build the entire HashDict
and then find duplicates in it, right? I'm sorry I'm having a little trouble getting you on this one :).
(btw, it would still be faster using the HashDict
and Enum.reduce
the way I used them before, but I don't think you meant to use them like that :))
from gettext.
The code you wrote is literally group_by implementation. What I mean is that you would use Enum.reduce with a HashDict as soon as you find a duplicate, you would raise. The upside is that we avoid traversing the structures multiple times. The downside is that we raise only on the first duplicate (which is fine imo).
from gettext.
@josevalim indeed that's group_by
, you're right π₯
Ok, if you want to raise at this level I get what you're saying. I was trying to avoid raising in the parser since the parser always returns {:error, line, reason}
when there's an error and it's Gettext.PO.parse_string!/1
or Gettext.PO.parse_file!/1
that do the raising; if we go with your suggestion, we'll have to rescue inside the non bang version of these two functions, right?
from gettext.
@josevalim what about this:
defp check_for_duplicates(translations) do
res = Enum.flat_map_reduce translations, HashDict.new, fn t, acc ->
if line = Dict.get(acc, translation_id(t)) do
{:halt, [line, elem(t.po_source, 1)]}
else
{[t], Dict.put_new(acc, translation_id(t), elem(t.po_source, 1))}
end
end
case res do
{_, [l1, l2]} -> {:error, l2, "found duplicates of this translation"}
{_, _} -> :ok
end
end
where translation_id/1
just extracts id
for %Translation{}
s and {id, id_plural}
for %PluralTranslations{}
.
This way we still stop at the first duplicate translation we find, but we return {:error, line, reason}
instead of raising (while still traversing the structure only once).
Btw, raising as you suggested could solve us the problem of raising a DuplicateTranslationError
instead of a SyntaxError
, so that's another advantage; at the same time, I guess people don't expect parse_string/1
and parse_file/1
to raise if they have a bang variant. Wdyt?
from gettext.
Ah, yes, I am sorry. If you want to exit early from a reduce, in this case, it is fine to use throw/catch.
from gettext.
Close by #25!
from gettext.
Related Issues (20)
- check-up-to-date fails even though files are just extracted HOT 8
- Flaky test in `gettext.extract`
- Duplicate Filename in Reference when `:write_reference_line_numbers` is set to `false`
- Gettext.PluralFormError for plural form "1" in "ja" locale HOT 9
- `gettext.merge` FunctionClauseError HOT 4
- Module is not loaded because :nofile HOT 4
- Mention file path in plural forms deprecation warning HOT 4
- Plural Forms warning occurs in newly generated language file HOT 2
- bump a new version for #359 HOT 1
- Duplicate msgid with singular and plural form HOT 5
- Running `mix gettext.extract` doesn't extract new messages with Elixir 1.15 HOT 2
- Retain custom flags during merge HOT 3
- Interpolation option set but not working HOT 4
- Duplicate references in POT files and warnings about redefining modules HOT 5
- compile depend excoveralls - origin/httpc failed HOT 1
- Add `Gettext.example` macro HOT 6
- If changes to `.po` file are discarded (accidentally), they're not added back HOT 17
- Allow to transform messages at compile time HOT 3
- `expo.msguniq` merges translations with different plurals HOT 4
- Locale changes between static mount and liveview HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
π Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. πππ
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google β€οΈ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from gettext.