Comments (14)
Here are some of the synonyms Wikipedia's ISO_3166_name template lists for Pakistan [1]:
{{ISO 3166 name|Pakistan}} Pakistan
{{ISO 3166 name|PK}} Pakistan
{{ISO 3166 name|PAK}} Pakistan
{{ISO 3166 name|586}} Pakistan
{{ISO 3166 name|Pakistán}} Pakistan
{{ISO 3166 name|پاکستان}} Pakistan
{{ISO 3166 name PK}} Pakistan
I guess most of them are abbreviations... But they also recognize some variant toponyms:
- Pakistán
- پاکستان -- The Urdu name for Pakistan
We probably wouldn't want to list every translation of a country name into every other language as a synonym. But it would make sense, and be useful, to at least have each country's own official name for itself (in whatever language is used there). (For Germany, for example, we ought to recognize Bundesrepublik Deutschland.) It would be neat, at least...
(Some would probably disagree though, since Carmen is probably primarily for English speakers and not for those with other native languages...)
If Wikipedia can do country name synonyms with the constraints of the MediaWiki _template_ "language" (it's impressive that they can), we could certainly do this in Ruby... :-)
from carmen.
Oops, I forgot/didn't notice that Carmen already supports different localizations of the countries list (by setting Carmen.default_locale = :de).
That's probably enough for most people's localization needs, though I still think it would be cool to be able to see (or search by) a country's own official name for itself (پاکستان). But that is definitely not the main purpose of this issue, so I don't want to distract from that...
from carmen.
Some other problem countries (mentioned on http://en.wikipedia.org/wiki/Template:ISO_3166_name) include "Cases where the name is comma mangled":
irb -> Carmen::country_code('Republic of Macedonia')
=> nil
irb -> Carmen::country_name('MK')
=> "Macedonia, the Former Yugoslav Republic of"
According to http://en.wikipedia.org/wiki/Republic_of_Macedonia, it is "officially the Republic of Macedonia (Република Македонија)". It has the other (ISO) name for historical reasons: "It became a member of the United Nations in 1993 but, as a result of a dispute with Greece over its name, it was admitted under the provisional reference of the former Yugoslav Republic of Macedonia,..."
irb -> Carmen::country_code('The Democratic Republic of Congo')
=> nil
irb -> Carmen::country_code('Democratic Republic of Congo')
=> nil
irb -> Carmen::country_name('CD')
=> "Congo, the Democratic Republic of the"
http://en.wikipedia.org/wiki/Democratic_Republic_of_the_Congo
Not to be confused with the neighbouring Republic of the Congo.
The Democratic Republic of the Congo is a state located in Central Africa. It is the second largest country in Africa by area and the eleventh largest in the world. With a population of nearly 71 million,[1] the Democratic Republic of the Congo is...
That's not confusing!
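As a rough illustration of the "comma mangled" problem, here is a minimal sketch that puts an inverted ISO-style name back into natural word order. The helper name unmangle_country_name is hypothetical and is not part of Carmen. It assumes the part after the first comma is a qualifier that belongs in front, which does not hold for every name (a list-like name such as "Bonaire, Sint Eustatius and Saba" would be reordered wrongly).

```ruby
# Hypothetical helper, not part of Carmen.
# Turns an inverted name such as "Congo, the Democratic Republic of the"
# back into natural word order. Names without a comma are returned unchanged.
def unmangle_country_name(name)
  head, tail = name.split(/,\s*/, 2)
  return name if tail.nil?

  # Move the trailing qualifier in front of the head noun and
  # capitalize a leading lowercase "the".
  "#{tail} #{head}".sub(/\Athe /, "The ")
end

puts unmangle_country_name("Congo, the Democratic Republic of the")
# prints "The Democratic Republic of the Congo"
puts unmangle_country_name("Macedonia, the Former Yugoslav Republic of")
# prints "The Former Yugoslav Republic of Macedonia"
```

The result could then be stored as one more alias next to the ISO name, so that both spellings find the same country.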
from carmen.
I think you touched on 3 issues:
- Some of the data in Carmen is out of date. This is easy enough to fix. It is mostly due to the data being originally scraped from the ISO list, which is rather out of date at this point.
- Countries often have aliases, and Carmen should take these into account. This is a good point. I'm working on a rewrite, and I'll make sure that this case is handled by the new code.
- Localization is a mess. I'm planning on moving away from the custom locale setter we use and standardizing on Rails-style i18n. There could potentially be a fallback for those not using Rails.
from carmen.
I work as a guide on a variety of Vietnam topics, such as Vietnam guides, Vietnam Airlines, and Vietnam visas. By habit, I understand the name to be Việt Nam in Vietnamese and Vietnam in English, although in some cases people write Viet nam. Names are sometimes difficult.
from carmen.
This issue is still in desperate need of attention. We use Carmen to validate state and country name inputs which are often user provided. We often do not know if the input is a code, or an actual name, so we typically pass the input through both the .coded and .named queries to find a result.
Here are some really basic examples I came up with to demonstrate how Carmen can fail in providing the correct response:
1.9.3p327 :085 > Carmen::Country.named("United States", fuzzy: true)
=> <#Carmen::Country name="United States Minor Outlying Islands">
1.9.3p327 :067 > Carmen::Country.coded("U.S.A.")
=> nil
1.9.3p327 :068 > Carmen::Country.named("U.S.A.")
=> nil
1.9.3p327 :069 > Carmen::Country.named("U.S.A.", fuzzy: true)
=> <#Carmen::Country name="Russian Federation">
1.9.3p327 :071 > Carmen::Country.named("US", fuzzy: true)
=> <#Carmen::Country name="Austria">
1.9.3p327 :072 > Carmen::Country.named("United States of America", fuzzy: true)
=> nil
1.9.3p327 :063 > Carmen::Country.coded("UK")
=> nil
1.9.3p327 :065 > Carmen::Country.named("UK", fuzzy: true)
=> <#Carmen::Country name="Ukraine">
1.9.3p327 :073 > Carmen::Country.named("South Korea", fuzzy: true)
=> nil
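One way to avoid the fuzzy mismatches above is to normalize the input and then try an exact code match followed by an exact alias match. The sketch below only illustrates that idea: CODES, ALIASES, normalize, and resolve_country are all hypothetical stand-ins, and none of them is Carmen data or Carmen API.

```ruby
# Hypothetical stand-in data; a real application would query Carmen instead.
CODES = {
  "US" => "United States",
  "GB" => "United Kingdom",
  "KR" => "Korea, Republic of"
}.freeze

# Hypothetical alias table: normalized spelling => ISO alpha-2 code.
ALIASES = {
  "usa" => "US",
  "united states" => "US",
  "united states of america" => "US",
  "uk" => "GB",
  "united kingdom" => "GB",
  "south korea" => "KR"
}.freeze

# Lowercase, drop punctuation such as the dots in "U.S.A.",
# and collapse runs of whitespace.
def normalize(input)
  input.to_s.downcase.gsub(/[^\p{L}\p{N}\s]/, "").gsub(/\s+/, " ").strip
end

# Try the input as a code first, then as an exact alias.
# Returns an ISO alpha-2 code, or nil when nothing matches.
def resolve_country(input)
  key = normalize(input)
  code = key.upcase
  return code if CODES.key?(code)

  ALIASES[key]
end

p resolve_country("U.S.A.")      # => "US"
p resolve_country("UK")          # => "GB"
p resolve_country("South Korea") # => "KR"
p resolve_country("Atlantis")    # => nil
```

Because every match is exact after normalization, an unknown input returns nil rather than the nearest-looking country.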
Has this component of the 1.0 rewrite been worked on at all? Is there something I can do to assist?
Thanks!
from carmen.
> Carmen::Country.named 'Russia'
=> nil
That's definitely not what I expected. Could we add aliases for countries to allow multiple names for one country? E.g. ["Russia", "Russian Federation"].
Thanks!
from carmen.
Closing here, but referencing in our v2 ticket. Aliasing in general is definitely something we are thinking about!
from carmen.
Just curious, did you folks find a solution for looking up country synonyms/aliases (e.g. USA, US, America, United States, United States of America)? A project I'm working on has a similar problem, and this thread came up in a Google search.
from carmen.
@chriddyp It's something we are still working on implementing, but I think we agreed on a tentative solution if I remember correctly. We would love a PR though if you have any ideas or if you implement something on your existing project and want to backport it into a Carmen PR. :)
from carmen.
@cdainmiller looks like we'll probably start with the regexes maintained in this module: https://github.com/vincentarelbundock/countrycode/blob/master/data/countrycode_data.csv
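For readers unfamiliar with the regex approach, here is a minimal sketch of matching free-form input against one pattern per country. The patterns are illustrative assumptions written for this example and are not copied from the countrycode data; COUNTRY_PATTERNS and match_country are hypothetical names.

```ruby
# Illustrative patterns only. They are assumptions for this sketch and are
# not taken from the countrycode data set.
COUNTRY_PATTERNS = {
  "US" => /\A(u\.?s\.?(a\.?)?|america|united states( of america)?)\z/i,
  "RU" => /\Aruss/i,
  "BS" => /bahamas/i
}.freeze

# Returns the code of the first pattern that matches, or nil.
def match_country(input)
  text = input.to_s.strip
  COUNTRY_PATTERNS.each do |code, pattern|
    return code if pattern.match?(text)
  end
  nil
end

p match_country("U.S.A.")             # => "US"
p match_country("Russian Federation") # => "RU"
p match_country("The Bahamas")        # => "BS"
p match_country("United States Minor Outlying Islands") # => nil
```

Anchoring the United States pattern with \A and \z is what keeps "United States Minor Outlying Islands" from matching it; unanchored patterns such as the Bahamas one are more forgiving but can collide as the list grows.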
from carmen.
@chriddyp Fantastic, thanks for the heads up.
from carmen.
Looking up a country by name is required.
pry(main)> Carmen::Country.named("The Bahamas")
=> nil
pry(main)> Carmen::Country.named("USA")
=> nil
from carmen.
I've overcome it like this:
- Added a new key to locales
ru:
  world:
    by:
      common_name: !!null
      name: Беларусь
      name_aliases:
        - Белорусь
        - Белорус
        - Белоруссия
        - Белорусия
        - Беларус
        - Беларуссия
        - Беларусия
      official_name: Республика Беларусь
- Loaded custom locales in an initializer
Carmen.i18n_backend.append_locale_path(Rails.root.join("config", "locales", "countries"))
- Added a method to Carmen::Country
Carmen::Country.class_eval do
  # Aliases from the custom locale file, plus the regular and official names.
  def name_aliases
    Array.wrap(
      Carmen.i18n_backend.translate(path("name_aliases"))
    ) + [name, official_name].compact
  end
end
Carmen::Country#path is a public method, so you could also build the translation key from outside the class and avoid monkey patching.
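A method like name_aliases only lists the names; to search by them you still need a lookup. The sketch below shows one possible way, using a plain Struct as a stand-in for Carmen::Country so that it runs without the gem. The Country struct, the sample data, ALIAS_INDEX, and find_by_alias are all hypothetical.

```ruby
# Stand-in for Carmen::Country so the sketch runs without the gem.
Country = Struct.new(:code, :name, :official_name, :extra_aliases) do
  # Same shape as the name_aliases method above: extra aliases plus
  # the regular and official names, with nils dropped.
  def name_aliases
    (Array(extra_aliases) + [name, official_name]).compact
  end
end

# Illustrative sample data, not read from Carmen's locale files.
COUNTRIES = [
  Country.new("BY", "Беларусь", "Республика Беларусь", ["Белоруссия", "Беларуссия"]),
  Country.new("RU", "Россия", "Российская Федерация", nil)
].freeze

# Build a reverse index once: downcased alias => country.
ALIAS_INDEX = COUNTRIES.each_with_object({}) do |country, index|
  country.name_aliases.each { |name| index[name.downcase] = country }
end.freeze

# Case-insensitive exact lookup by any known alias; returns nil if unknown.
def find_by_alias(query)
  ALIAS_INDEX[query.to_s.strip.downcase]
end

p find_by_alias("белоруссия")&.code           # => "BY"
p find_by_alias("Российская Федерация")&.code # => "RU"
p find_by_alias("Atlantis")                   # => nil
```

Building the index once keeps each lookup a single hash access instead of a scan over every country's alias list.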
from carmen.