google / emoji4unicode
Automatically exported from code.google.com/p/emoji4unicode
License: Apache License 2.0
Project name: emoji4unicode
Project location: http://code.google.com/p/emoji4unicode/
Google uses Private Use mappings to represent Emoji ("picture character") symbols in Unicode text. These characters are commonly used by Japanese cell phone carriers. This project makes these mappings available.
Google and other members of the Unicode Consortium are also developing a proposal to add standardized Emoji symbol characters to Unicode. This project also provides data and tools for developing that proposal. The tools are Python scripts that perform consistency checks, report on the data, and generate charts.
The project documentation is available at http://sites.google.com/site/unicodesymbols/Home/emoji-symbols and its subpages.
MarkD commenting on issue 19:
I'd suggest all the circled ones together, either before or after all the
squared ones.
e-B2B
e-B3D
e-B43
e-B50
next to the other circled ones.
Original issue reported on code.google.com by katmomoi
on 5 Dec 2008 at 5:45
Compare again the sets of ARIB and Emoji symbols. Review currently proposed
unifications, and review whether we should unify more symbols.
Original issue reported on code.google.com by markus.icu
on 26 Nov 2008 at 10:27
For example, in cross-references.
Original issue reported on code.google.com by markus.icu
on 15 Dec 2008 at 8:41
William Overington pointed me to a thread at
http://forum.high-logic.com/viewtopic.php?p=10544#p10544 where he made the following comment:
In the http://www.unicode.org/~scherer/emoji4u ... t/utc.html document as
it stands as I write this post, I did notice one item straightaway.
About a fifth of the way down is an item labelled as YEOMAN OF THE GUARD
with the note • Beefeater, British below it. Yet the illustration is of a
Guardsman of a Foot Guard regiment, not of a Yeoman of the Guard, which is
a different group.
The Yeomen of the Guard wear Tudor-style uniforms, with a flat hat. They
are typically shown on the television at the State Opening of Parliament in
England. They are not the same group as the Yeoman Warders of the Tower of
London. Readers who know of the Gilbert and Sullivan Opera named Yeoman of
the Guard may also know that that name is in that Opera being wrongly
applied to the Yeoman Warders of the Tower of London.
There are pictures on the following page.
http://en.wikipedia.org/wiki/Yeomen_of_the_Guard
http://en.wikipedia.org/wiki/Yeomen_Warders
http://en.wikipedia.org/wiki/Foot_Guards
Original issue reported on code.google.com by markus.icu
on 22 Dec 2008 at 6:44
ARIB-9003=U+26CE TRAFFIC WARNING just looks like another heavy exclamation
point. We should propose changing its name to something descriptive of the
shape rather than its Japanese TV Symbols semantics.
Please add comments for further proposed changes of names of characters in
the encoding pipeline.
Original issue reported on code.google.com by markus.icu
on 13 Dec 2008 at 6:54
SoftBank symbols have old and new IDs. The current data contains the old
IDs. Add the new IDs and show them in the generated charts.
Original issue reported on code.google.com by markus.icu
on 24 Nov 2008 at 10:16
See http://www.unicode.org/charts/symbols.html
especially
Geometrical Shapes http://www.unicode.org/charts/PDF/U25A0.pdf
Arrows http://www.unicode.org/charts/PDF/U2190.pdf
and http://www.unicode.org/charts/PDF/U27F0.pdf
and http://www.unicode.org/charts/PDF/U2900.pdf
Miscellaneous Symbols and Arrows
http://www.unicode.org/charts/PDF/U2B00.pdf
(e.g. squares like e-B6B, e-B71, etc.)
See Open Issues: https://docs.google.com/Doc?docid=ddsrrpj5_44cnr2fj64
Original issue reported on code.google.com by markus.icu
on 2 Dec 2008 at 6:11
... and report to people involved in starting the font design
Do not count unified symbols nor in_proposal="no" symbols.
Maybe print at the bottom of the HTML charts?
Original issue reported on code.google.com by markus.icu
on 27 Nov 2008 at 12:04
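A minimal sketch of the count described above, assuming each symbol is an <e> element and that a unification is recorded in a `unicode` attribute. Only `in_proposal="no"` is taken from the issue text; the `unicode` attribute, the root element name, and the sample names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Invented sample data; the real emoji4unicode.xml layout may differ.
SAMPLE = """<emoji4unicode>
  <e id="4B0" name="SAMPLE NEW SYMBOL A"/>
  <e id="B94" name="SAMPLE NEW SYMBOL B"/>
  <e id="B98" name="SAMPLE UNIFIED SYMBOL" unicode="261D"/>
  <e id="1E3" name="SAMPLE EXCLUDED SYMBOL" in_proposal="no"/>
</emoji4unicode>"""


def count_new_symbols(xml_text):
    """Count symbols that need a new glyph design.

    Skips symbols marked in_proposal="no" and symbols unified with an
    existing character (assumed here to carry a `unicode` attribute).
    """
    root = ET.fromstring(xml_text)
    return sum(
        1
        for e in root.iter("e")
        if e.get("in_proposal") != "no" and not e.get("unicode")
    )


print(count_new_symbols(SAMPLE))
```

The resulting number could then be appended to the bottom of the generated HTML chart.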
Per comment by KameOyaji:
http://groups.google.com/group/emoji4unicode/browse_thread/thread/b2e536ad9835946
e-4C5: BOUTIQUE 109 should not be in the UTC proposal since it refers to a
fashion chain boutique store in Tokyo -- the one in Shibuya being most famous.
The Softbank's name for this Emoji is: Shibuya. Not Shibuya 109. But it
still remains true that the image has 109 logo in it.
I propose to
1) change the name to SHIBUYA
2) move it to the Softbank specific section and move it out of the UTC proposal
Original issue reported on code.google.com by katmomoi
on 6 Dec 2008 at 6:36
OPENWAVE:
http://www.unicode.org/~scherer/emoji4unicode/snapshot/full.html#e-B89
is a business logo and should be moved to KDDI specific section and be
removed from the UTC version of the table.
Original issue reported on code.google.com by katmomoi
on 2 Dec 2008 at 11:40
- if unified with an existing character,
verify that the name is the same as that character's
- otherwise, verify that the name is not already used in Unicode
Original issue reported on code.google.com by markus.icu
on 24 Nov 2008 at 10:32
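The two name checks above could be sketched with Python's standard `unicodedata` module. This is an illustration, not the project's actual test: the function name `check_name` is hypothetical, and `unicodedata` reflects whatever Unicode version the running Python ships with, which is newer than Unicode 5.1, so names that were encoded later will also be reported as taken.

```python
import unicodedata


def check_name(proposed_name, unified_with=None):
    """Return an error string, or None if the name passes the check.

    unified_with: the existing character the symbol is unified with,
    or None if the symbol is proposed as a new character.
    """
    if unified_with is not None:
        # Unified symbol: the name must equal the existing character's name.
        actual = unicodedata.name(unified_with, None)
        if actual != proposed_name:
            return (f"name mismatch: {proposed_name!r} vs. "
                    f"U+{ord(unified_with):04X} {actual!r}")
        return None
    # New symbol: the name must not already be used.
    try:
        taken = unicodedata.lookup(proposed_name)
    except KeyError:
        return None  # unused in this Python's Unicode database
    code_points = " ".join(f"U+{ord(c):04X}" for c in taken)
    return f"name already used by {code_points}"


print(check_name("VICTORY HAND", "\u270c"))
print(check_name("FOUNTAIN"))
```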
Optionally generate a shorter version of the chart, with fewer details,
less scrolling, fewer printed pages.
We don't seem to need the carrier codes except for examining cross-mappings.
Original issue reported on code.google.com by markus.icu
on 4 Dec 2008 at 12:36
Generate an HTML chart suitable for a font designer.
Add support for <design> sub-elements of <e> in emoji4unicode.xml.
Show Unicode code points for the new symbols font, instead of symbol IDs.
Probably something like base+ID where base is a Unicode Private Use code
point (Maybe U+E000 if we want to use BMP code points, or U+FF000 if we
want to avoid collisions with other PUA uses.) The "ID" is our symbol ID.
(A font designer will need a code point, and we can easily read our ID from
that if we know the base.)
Hide irrelevant data, like carrier Unicode/Shift-JIS/JIS codes.
Original issue reported on code.google.com by markus.icu
on 26 Nov 2008 at 11:17
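The base+ID idea above can be sketched as follows, assuming symbol IDs are the hexadecimal digits after the "e-" prefix (as in e-4B0). The function and constant names are hypothetical.

```python
PUA_BASE_BMP = 0xE000       # BMP Private Use Area
PUA_BASE_PLANE15 = 0xFF000  # plane 15 Private Use Area, fewer collisions


def symbol_id_to_code_point(symbol_id, base=PUA_BASE_BMP):
    """Map a symbol ID such as 'e-4B0' to a Private Use code point."""
    if not symbol_id.startswith("e-"):
        raise ValueError(f"not a symbol ID: {symbol_id!r}")
    return base + int(symbol_id[2:], 16)


def code_point_to_symbol_id(code_point, base=PUA_BASE_BMP):
    """Recover the symbol ID from a code point, given the same base."""
    return f"e-{code_point - base:03X}"


cp = symbol_id_to_code_point("e-4B0")
print(f"U+{cp:04X}", code_point_to_symbol_id(cp))
```

With either base the ID stays readable in the low hex digits of the code point, which is the property the issue asks for.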
Add data (at least code points & names) for ISO 10646 AMD6 characters and
check for character name problems.
Original issue reported on code.google.com by markus.icu
on 15 Dec 2008 at 5:51
For the font design, we want to know how many symbols use representative
images from which carrier. Also, given some existing or assumed resources
for some of the symbol sets, we want to count remaining symbols where we
may need to start from scratch.
Original issue reported on code.google.com by markus.icu
on 5 Dec 2008 at 8:29
Consider adding per-symbol anchors into the generated HTML charts to make
it easy to jump to a symbol row via #e-4B0 or similar.
Original issue reported on code.google.com by markus.icu
on 24 Nov 2008 at 10:25
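One way the per-symbol anchors might be generated is to put the symbol ID into the `id` attribute of each table row. The row layout and the function name below are assumptions, not the chart generator's real output.

```python
from html import escape


def symbol_row(symbol_id, name):
    """Build one chart row whose id attribute doubles as a link target.

    symbol_id is the bare ID such as '4B0'; the anchor becomes '#e-4B0'.
    """
    anchor = escape(f"e-{symbol_id}", quote=True)
    return (
        f'<tr id="{anchor}">'
        f'<td><a href="#{anchor}">{anchor}</a></td>'
        f"<td>{escape(name)}</td>"
        f"</tr>"
    )


print(symbol_row("4B0", "SAMPLE NAME"))
```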
We received data with the official names.
Original issue reported on code.google.com by markus.icu
on 2 Dec 2008 at 1:04
covered by round-trip mappings with symbols
Original issue reported on code.google.com by markus.icu
on 27 Nov 2008 at 12:07
The ARIB-9138 (U+26F7) character has a ski with a person on it.
All 3 carriers have just a ski and a boot on it. If the ARIB name were
something like "SKIING", this would be acceptable but the proposed name is
"SKIER". Unless the ARIB name can be changed, I suggest that we keep it as
SKIING or something like "A SKI AND A BOOT".
Original issue reported on code.google.com by katmomoi
on 16 Dec 2008 at 1:58
e-4B4: BLACK CROSS ON SHIELD
This unification was made in r65:
http://code.google.com/p/emoji4unicode/source/detail?r=65
There is a new Unicode character ARIB-9109 (U+26E8) but this is a black
cross within a shield. If you look at Emoji for 3 carriers, you see that
all of them have a house and a cross on it.
I feel that we should not unify this unless the ARIB proposal name can be
changed to something like "HOSPITAL CROSS".
Original issue reported on code.google.com by katmomoi
on 16 Dec 2008 at 1:49
While DoCoMo and KDDI have just a single "clock symbol", SoftBank has 12
symbols, one per full hour. We currently round-trip e-027 "10 oclock" with
SoftBank #239=#old45 which makes sense given the continuity in the old
numbers. However, the current SoftBank #239 image is quite different from
their other clocks, and both the (new?) image and the new numbers (they now
jump from #369 "9 oclock" to #370 "11 oclock") suggest that they have
changed their preference.
It seems like we should have a round-trip between e-02A "clock symbol" and
SoftBank #239=#old45, and a fallback from e-027 "10 oclock" to that same
SoftBank symbol.
Original issue reported on code.google.com by markus.icu
on 5 Dec 2008 at 5:00
Unify e-B82 key/password with ARIB-9071=U+26BF, which is a key inside a
square?
Original issue reported on code.google.com by markus.icu
on 25 Nov 2008 at 5:28
A suggestion from: Werner LEMBERG <[email protected]> to the [email protected].
"Maybe a typo: Ophiucus is normally written as Ophiuchus (at least in
the Astronomical world)."
Sometimes we see the spelling OPHIUCUS but the majority of web references use:
OPHIUCHUS
I propose to rename this entry to OPHIUCHUS.
Original issue reported on code.google.com by katmomoi
on 18 Dec 2008 at 7:30
I think e-4BC FOUNTAIN can be unified with ARIB 9125 = new U+26F2
Original issue reported on code.google.com by [email protected]
on 12 Dec 2008 at 1:19
James Kass reports the following misspellings:
e-355 SPEAK NO EVIL MONEY
e-502 BOOK WITH VERICAL FILL
Original issue reported on code.google.com by markus.icu
on 19 Dec 2008 at 6:56
From Asmus Freytag 2008-feb-01:
Clock faces, computer/document icons, as well as a rather significant
number of other symbols are present in the suite of wingdings fonts
distributed by Microsoft. A cross mapping to these would be a useful
exercise - not the least because these fonts represent existing black
and white interpretations of the glyph shape(s) for such symbols. These
glyphs might represent possible starting points for representative
glyphs, should these characters be encoded.
See Open Issues: https://docs.google.com/Doc?docid=ddsrrpj5_44cnr2fj64
Original issue reported on code.google.com by markus.icu
on 2 Dec 2008 at 6:11
For correctness' sake, add the name of "Satoru Takabayashi (revisions)"
under "Main authors:" in trunk/data/emoji4unicode.xml
Original issue reported on code.google.com by katmomoi
on 27 Nov 2008 at 9:13
In discussing issue #16 Mark/Kat/Markus looked at the adjacent e-B98..e-B9C
symbols with hands and index fingers pointing in various directions.
We noticed that e-B98 shows the palm of the hand while the other four show
the back of the hand.
We agreed to unify e-B98 -- rather than e-B99 -- with U+261D WHITE UP
POINTING INDEX because the existing characters all show the palm of the hand.
We agreed to disunify e-B99..e-B9C from existing characters and give the
new symbols names like the existing ones but with "BACKHAND" inserted.
Original issue reported on code.google.com by markus.icu
on 4 Dec 2008 at 9:12
e-1E3 does not round-trip to any carrier and does not have any
representation. I assume it's a Google-created symbol. If so, then it
should be in_proposal="no".
Original issue reported on code.google.com by markus.icu
on 5 Dec 2008 at 1:09
They look to me like they are phone IME status indicators and belong in
"Communication (4. Artifacts)" or "Phone Specific (5.
Activities/Work/Entertainment)".
Original issue reported on code.google.com by markus.icu
on 5 Dec 2008 at 1:16
The charts should show whether we are unifying with an existing (Unicode
5.1) character or an upcoming one (Unicode 5.2/AMD6). For upcoming ones we
should not show the actual characters because there won't be fonts for
them.
Original issue reported on code.google.com by markus.icu
on 16 Dec 2008 at 10:45
Its description says "Personal Digital Cellular Symbol/Logo. ?"
Original issue reported on code.google.com by markus.icu
on 2 Dec 2008 at 6:16
Notes from feedback from UTC meeting 2008-feb-05:
KDDI 279 Top Secret Sign =U+3299?
Softbank 201 Existence Sign =U+3292
Softbank 203 Monthly Sign =U+328A (there may be additional characters in
the vicinity of U+328A that can be unified with other symbols in question
here)
KDDI 384 Service Sign =U+32DA?
KDDI 402 Celebration Sign =U+3297?
TODO: Need to verify unifications and apply them in the table.
Discuss: Any further feedback on the proposed unifications? Characters like
U+3299 are much less styled (they look like the Han characters with
enclosing circle) than the symbols in the Emoji context.
Peter Edberg: Docs use KDDI #279 & U+3299 together
Mark: Consider whether plain text distinction
Peter: Maybe VS?
Markus: JIS source separation but otherwise unify
See Open Issues: https://docs.google.com/Doc?docid=ddsrrpj5_44cnr2fj64
Original issue reported on code.google.com by markus.icu
on 2 Dec 2008 at 6:14
no intersection of Emoji unifications with Shift-JIS round-trips
not clear if this is the right thing to do if the Shift-JIS round-trip is
outside the JIS X 0208 part
Original issue reported on code.google.com by markus.icu
on 24 Nov 2008 at 10:33
e-B94 SCISSOR IN HAND GAME is currently disunified from U+270C VICTORY HAND
According to my notes from the 2008-aug UTC meeting:
Ken agreed with the disunification because they are distinct gestures in
the real world.
Mark recommended to unify now, and if the distinction is really needed
later, then add a specific character at that time.
The meeting did not end conclusively on this topic.
Original issue reported on code.google.com by markus.icu
on 2 Dec 2008 at 6:10
For the encoding proposal, new symbol characters will need proposed Unicode
code points. We should try to fill existing blocks with similar symbols,
and otherwise add symbols on plane 1.
Original issue reported on code.google.com by markus.icu
on 2 Dec 2008 at 5:11
In the HTML chart, in a per-carrier-per-symbol cell, if there is no
corresponding carrier symbol and there is also no text_fallback for the
symbol, show the geta mark (U+3013 〓) in text fallback style, rather than
"-".
This is how charts used to show this information with previous tools.
(Before this project was created.)
Original issue reported on code.google.com by markus.icu
on 27 Nov 2008 at 7:25
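The cell logic requested above might look like the sketch below. The function name and the style labels "symbol" and "fallback" are hypothetical; only the geta mark and the `text_fallback` notion come from the issue.

```python
GETA_MARK = "\u3013"  # 〓


def carrier_cell(carrier_symbol, text_fallback):
    """Return (text, style) for one per-carrier-per-symbol chart cell."""
    if carrier_symbol:
        return carrier_symbol, "symbol"
    if text_fallback:
        return text_fallback, "fallback"
    # No carrier symbol and no text fallback: geta mark instead of "-".
    return GETA_MARK, "fallback"


print(carrier_cell(None, None))
```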
According to my notes from the 2008-aug UTC meeting, the discussion ended
with Ken's recommendation to unify e-B07 wavy length mark with U+3030 wavy
dash (without change in properties), and to encode e-B08 looped length mark
as a new compatibility character.
Original issue reported on code.google.com by markus.icu
on 2 Dec 2008 at 6:09
Look for "May unify with..." or "Possibly unify with..." and similar
comments.
For example, one of the happy faces should be unified with U+263A.
Original issue reported on code.google.com by markus.icu
on 5 Dec 2008 at 5:40
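A search for such comments could be sketched with a regular expression. The pattern below also accepts a few close variants of the two phrases; the input format (a dict from symbol ID to comment text) is an assumption.

```python
import re

# Matches "May unify with", "Possibly unify with", and similar wording.
UNIFY_HINT = re.compile(
    r"\b(?:may|might|could|possibly)\s+(?:be\s+)?unif(?:y|ied)\s+with\b",
    re.IGNORECASE,
)


def find_unify_hints(comments):
    """Return the IDs of symbols whose comment suggests a unification."""
    return [sid for sid, text in comments.items() if UNIFY_HINT.search(text)]


sample = {
    "e-001": "May unify with U+263A.",
    "e-002": "A plain descriptive comment.",
    "e-003": "possibly unify with an existing character",
}
print(find_unify_hints(sample))
```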
emoji4unicode/src$ ./emoji4unicode_test.py
source separation error: e-AF8 = U+2191 = Shift-JIS-81AA
source separation error: e-AF9 = U+2193 = Shift-JIS-81AB
source separation error: e-AFA = U+2192 = Shift-JIS-81A8
source separation error: e-AFB = U+2190 = Shift-JIS-81A9
They are e-AF8 UPWARDS ARROW .. e-AFB LEFTWARDS ARROW.
Disunify and rename with prefix "HEAVY"?
Original issue reported on code.google.com by markus.icu
on 6 Dec 2008 at 12:56
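The kind of check that produces the errors above can be approximated with Python's built-in `shift_jis` codec: a unification is flagged when the target character already round-trips to a double-byte Shift-JIS code. This is a sketch under that assumption and is not the code in emoji4unicode_test.py; the function names are hypothetical.

```python
def shift_jis_round_trip(ch):
    """Return the Shift-JIS bytes for ch if it round-trips, else None."""
    try:
        encoded = ch.encode("shift_jis")
    except UnicodeEncodeError:
        return None
    return encoded if encoded.decode("shift_jis") == ch else None


def source_separation_errors(unifications):
    """unifications: iterable of (symbol_id, code_point) pairs."""
    errors = []
    for symbol_id, code_point in unifications:
        sjis = shift_jis_round_trip(chr(code_point))
        # Only double-byte codes are treated as conflicts in this sketch.
        if sjis is not None and len(sjis) == 2:
            errors.append(
                f"source separation error: {symbol_id} = "
                f"U+{code_point:04X} = Shift-JIS-{sjis.hex().upper()}"
            )
    return errors


for line in source_separation_errors([("e-AF8", 0x2191), ("e-B94", 0x270C)]):
    print(line)
```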
Check that our <ann>otations follow the syntax of a small subset of the
Unicode NamesList.txt file format. (Except that we don't use the TAB which
starts most CHAR_ENTRY lines; in an <ann> element that's implied.)
Format the recognized types of annotations as described in NamesList.html.
For example, it says there:
COMMENT_LINE: TAB "*" SP EXPAND_LINE
// * is replaced by BULLET, output line as comment
Original issue reported on code.google.com by markus.icu
on 15 Dec 2008 at 8:45
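A small validator and formatter for <ann> text might look like this. The COMMENT_LINE rule ("*" becomes BULLET) is quoted in the issue; the handling of "x" (cross-reference, shown as a right arrow) and "=" (alias, kept as is) follows my reading of the NamesList format and should be verified against NamesList.html. The leading TAB is omitted, as the issue describes.

```python
import re

# Recognized prefixes and their displayed replacements.
PREFIXES = {
    "*": "\u2022",  # comment line: BULLET
    "x": "\u2192",  # cross-reference: RIGHTWARDS ARROW (assumption)
    "=": "=",       # alias line: kept unchanged (assumption)
}
ANN_LINE = re.compile(r"^([*x=]) (\S.*)$")


def format_annotation(line):
    """Validate one annotation line and return its display form."""
    match = ANN_LINE.match(line)
    if match is None:
        raise ValueError(f"unrecognized annotation syntax: {line!r}")
    prefix, text = match.groups()
    return f"{PREFIXES[prefix]} {text}"


print(format_annotation("* Beefeater, British"))
```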
Should the SoftBank symbols #205, #206, #207 (見出しボタン 1..3) map to
e-B63=KDDI #40, e-B64=KDDI #41 and e-B67 as the data currently does?
These symbols are used as menu selectors.
Also review the name of e-B67 with regard to being in a series with the
other two SoftBank symbols.
Original issue reported on code.google.com by markus.icu
on 25 Nov 2008 at 5:25
Should we rename the 4 operators to "HEAVY" followed by exactly the names
of the corresponding normal characters?
That would mean
e-B51 HEAVY PLUS -> HEAVY PLUS SIGN
e-B52 HEAVY MINUS -> HEAVY HYPHEN-MINUS
e-B53 HEAVY TIMES -> HEAVY MULTIPLICATION SIGN
e-B54 HEAVY DIVISION -> HEAVY DIVISION SIGN
Original issue reported on code.google.com by markus.icu
on 5 Dec 2008 at 1:22
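The proposed names can be derived mechanically from the names of the corresponding normal characters. The sketch below uses the standard `unicodedata` module; the pairing of each symbol ID with a base character is my reading of the list above.

```python
import unicodedata

# Symbol ID -> corresponding normal operator character.
OPERATORS = {
    "e-B51": "+",
    "e-B52": "-",
    "e-B53": "\u00d7",  # multiplication sign
    "e-B54": "\u00f7",  # division sign
}


def heavy_name(ch):
    """Prefix the character's Unicode name with 'HEAVY'."""
    return "HEAVY " + unicodedata.name(ch)


for symbol_id, ch in OPERATORS.items():
    print(symbol_id, heavy_name(ch))
```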
For example, we have HOURGLASS and HOURGLASS 2. It would be good to have
somewhat more interesting name variations than appending "2".
Original issue reported on code.google.com by markus.icu
on 5 Dec 2008 at 6:18
e-7E5: The original Softbank data says the name is 「RV者」.
This is clearly a typo. I propose to correct this to 「RV車」.
Original issue reported on code.google.com by katmomoi
on 13 Dec 2008 at 7:46
... isn't it always called a Four Leaf Clover?
asks David Starner
Original issue reported on code.google.com by markus.icu
on 22 Dec 2008 at 7:04
According to my notes from the 2008-aug UTC meeting, there was consensus to
rename the flags from
FLAG OF JAPAN
to
FLAG SYMBOL JP
etc.
(using 2-letter country codes)
There was an open issue about what to use for representative glyphs -- real
flags, or flags with the 2-letter codes in them, or something else.
Original issue reported on code.google.com by markus.icu
on 2 Dec 2008 at 6:15
We should post a public review of the chart by mid-December 2008 and invite
feedback. I will start a cover page on the unicodesymbols site.
Original issue reported on code.google.com by markus.icu
on 8 Dec 2008 at 9:36
This issue is intended for the collection of various additions of
font/glyph design instructions via <design> sub-elements of appropriate <e>
elements in emoji4unicode.xml. We can check in several of these together.
e-83B KEYPAD 10: Should look like digits 10 enclosed by a keycap like
U+20E3.
(We should move KEYPAD 10 to immediately after KEYPAD 0.)
Original issue reported on code.google.com by markus.icu
on 5 Dec 2008 at 5:12
We have annotations on symbols that are unified with existing characters.
We should propose those annotations as additional annotations on their
characters.
We can also collect additional annotations for other existing/upcoming
characters in this issue.
Original issue reported on code.google.com by markus.icu
on 15 Dec 2008 at 8:26