kanjivg / kanjivg
Kanji vector graphics
Home Page: http://kanjivg.tagaini.net
License: Other
KanjiVG: Kanji Vector Graphics

Licence
-------
KanjiVG is copyright Ulrich Apel and released under the Creative Commons Attribution-Share Alike 3.0 licence: http://creativecommons.org/licenses/by-sa/3.0/

See the file COPYING for more details.

Documentation
-------------
The project's documentation is at https://kanjivg.tagaini.net/.
Stroke 3 (the bottom of the water radical) should be written left to right
The strokes 9, 10 and 11 are inconsistent.
I don't know what would be the right way to fix this. Maybe we should split this into two variants.
VtLstVtLst variant: Draw the vertical stroke last on both halves.
VtLstVt5 variant: The vertical is drawn last in the left half, but in the right half the vertical is drawn as its fifth stroke.
Btw, I have corrected strokes 5 and 6 in another commit, c650ed5.
Strokes 7 and 8 are put as different types in the XML and the SVG in kanji 86fb.
I've found some characters where the stroke numbers are placed next to different strokes than the order defined by the order of the stroke paths:
共: 1 and 2 are swapped
主-VtLst: 3 and 4 are swapped.
住-VtLst: 1 and 2 are swapped and 6 and 7 are swapped
庫: Strokes 2 and 3
炭: Strokes 4 and 5
I intend to figure out how to use github at some point, so I can submit actual changes--I'm assuming it's the order of the path elements that's correct--but I thought I'd mention what I've found in case I don't get around to it.
When I find more, should I edit this issue or open new ones?
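A rough way to look for more of these automatically might be to compare where each number is placed with where each path starts. The sketch below is only a heuristic and rests on assumptions: that the strokes are `<path>` elements listed in drawing order, and that each number is a `<text>` element positioned by `transform="matrix(1 0 0 1 x y)"`. The function name `misplaced_numbers` and the inline `SAMPLE` are made up for illustration. Numbers are drawn near, not on, the start of a stroke, so expect some false positives.

```python
import re
import xml.etree.ElementTree as ET

SVG = "{http://www.w3.org/2000/svg}"
NUM = r"[-+]?(?:\d*\.\d+|\d+)"

# Hypothetical two-stroke file in which the labels 1 and 2 are swapped.
SAMPLE = """<svg xmlns="http://www.w3.org/2000/svg">
<g><path d="M20,30c10,0,40,0,60,0"/><path d="M50,10c0,20,0,60,0,80"/></g>
<g><text transform="matrix(1 0 0 1 45 8)">1</text>
<text transform="matrix(1 0 0 1 12 28)">2</text></g>
</svg>"""


def misplaced_numbers(svg_text):
    """Return the stroke numbers whose label lies nearest to another stroke's start."""
    root = ET.fromstring(svg_text)
    starts = []
    for path in root.iter(SVG + "path"):
        d = path.get("d", "")
        # The start point is the coordinate pair after the initial moveto.
        m = re.match(r"\s*[Mm]\s*(%s)[\s,]*(%s)" % (NUM, NUM), d)
        if m is None:
            raise ValueError("path does not start with a moveto: %r" % d)
        starts.append((float(m.group(1)), float(m.group(2))))

    suspicious = []
    for text in root.iter(SVG + "text"):
        number = int(text.text)
        # Assumed form: matrix(1 0 0 1 x y); the last two values are x and y.
        values = re.findall(NUM, text.get("transform", ""))
        x, y = float(values[-2]), float(values[-1])
        nearest = 1 + min(
            range(len(starts)),
            key=lambda i: (starts[i][0] - x) ** 2 + (starts[i][1] - y) ** 2,
        )
        if nearest != number:
            suspicious.append(number)
    return suspicious


print(misplaced_numbers(SAMPLE))
```

Anything this reports would still need a manual look, since it cannot tell whether the numbers or the path order are at fault.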
These Kanji are not inside their frames
5618-variants (reported already in #38)
751e
751e-Kaisho
9f4b-Kaishovt12
Each of these characters appears to be missing one of the two short marks at the top of the radical (⻍).
It's more logical to have the two vertical lines come first in the right part, like how it is in kakijun. z-one March 01, 2011, at 11:56 PM
There are two stroke orders I found for radical 臼:
A. http://kakijun.main.jp/page/usu200.html
B. as a component in http://sugiura5.gsid.nagoya-u.ac.jp/cgi-bin/komori/ww2k.cgi?code=1967
Kakijun says for 臼:
三画目、四画目の順序は逆でも構いません
(It's OK for the 3rd and 4th strokes to be reversed)
A similar thing is written for 潟 (6 and 7 can be reversed).
I think the stroke order can be left as it is (in order B), but the 6th stroke should be written from left to right.
There have been 168 form changes in JIS X 0213:2004. KanjiVG seems to be based on JIS X 0208 and doesn't cover them.
Wikipedia has a list of the items:
http://ja.wikipedia.org/wiki/JIS_X_0213#JIS_X_0213:2004.E3.81.AE.E6.94.B9.E6.AD.A3
JIS X 0213 standard:
http://www.itscj.ipsj.or.jp/ISO-IR/233.pdf
The older JIS X 0208:
http://www.itscj.ipsj.or.jp/ISO-IR/087.pdf
A table of the characters is given below.
Practical note:
On Windows systems, MS Meiryo should be able to display them correctly, while MS Mincho displays the old forms (probably depends on version).
The characters are:
5026 倦, 50C5 僅, 5132 儲, 514E 兎, 51A4 冤, 537F 卿, 53A9 厩, 53C9 叉
53DB 叛, 53DF 叟, 54AC 咬, 54E8 哨, 55B0 喰, 5632 嘲, 5642 噂, 564C 噌
56C0 囀, 5835 堵, 5A29 娩, 5C51 屑, 5C60 屠, 5DF7 巷, 5E96 庖, 5EDF 廟
5EFB 廻, 5F98 徘, 5FBD 徽, 6062 恢, 6108 愈, 6241 扁, 633A 挺, 633D 挽
6357 捗, 6372 捲, 63C3 揃, 647A 摺, 64B0 撰, 64E2 擢, 65A7 斧, 6666 晦
6753 杓, 6756 杖, 6897 梗, 68D8 棘, 6962 楢, 696F 楯, 698A 榊, 6994 榔
69CC 槌, 6A0B 樋, 6A3D 樽, 6A59 橙, 6ADB 櫛, 6B4E 歎, 6C72 汲, 6DEB 淫
6EA2 溢, 6EBA 溺, 6F23 漣, 7015 瀕, 701E 瀞, 7026 瀦, 7058 灘, 7078 灸
707C 灼, 7149 煉, 714E 煎, 717D 煽, 723A 爺, 724C 牌, 7259 牙, 727D 牽
72E1 狡, 7337 猷, 7511 甑, 7515 甕, 7526 甦, 75BC 疼, 77A5 瞥, 7941 祁
7947 祇, 795F 祟, 79B0 禰, 79E4 秤, 7A17 稗, 7A7F 穿, 7AC8 竈, 7B08 笈
7B75 筵, 7BAD 箭, 7BB8 箸, 7BC7 篇, 7BDD 篝, 7C3E 簾, 7C7E 籾, 7C82 粂
7FEB 翫, 7FF0 翰, 8171 腱, 817F 腿, 818F 膏, 8258 艘, 8292 芒, 82A6 芦
8328 茨, 845B 葛, 84EC 蓬, 8511 蔑, 853D 蔽, 85A9 薩, 85AF 薯, 85F7 藷
8654 虔, 86F8 蛸, 8703 蜃, 8755 蝕, 87F9 蟹, 8805 蠅, 8956 襖, 8A0A 訊
8A1D 訝, 8A3B 註, 8A6E 詮, 8AB9 誹, 8AFA 諺, 8B0E 謎, 8B2C 謬, 8C79 豹
8CED 賭, 8FBB 辻, 8FBF 辿, 8FC2 迂, 8FC4 迄, 8FE6 迦, 9017 逗, 9019 這
9022 逢, 903C 逼, 9041 遁, 905C 遜, 9061 遡, 912D 鄭, 914B 酋, 91DC 釜
9306 錆, 9375 鍵, 939A 鎚, 9453 鑓, 9699 隙, 9744 靄, 9771 靱, 9784 鞄
9798 鞘, 97AD 鞭, 98F4 飴, 9905 餅, 990C 餌, 9910 餐, 9957 饗, 99C1 駁
9A19 騙, 9BAB 鮫, 9BD6 鯖, 9C2F 鰯, 9C52 鱒, 9D09 鴉, 9D60 鵠
(all in kanjivg) and
5C62 屢 (not in kanjivg)
The following among these are Jouyou-Kanji and may have higher priority:
50C5 僅, 5632 嘲, 6357 捗, 6897 梗, 6DEB 淫, 6EBA 溺, 714E 煎, 7259 牙
7BB8 箸, 8328 茨, 845B 葛, 8511 蔑, 853D 蔽, 8A6E 詮, 8B0E 謎, 8CED 賭
905C 遜, 9061 遡, 91DC 釜, 9375 鍵, 9699 隙, 9905 餅, 990C 餌
The viewer confirms this. The strokes are nowhere near each other and do not resemble 心 at all.
There are a number of files where the order of strokes and numbers does not match. In the majority of cases, it is the numbers that are placed incorrectly, although there might be a small number of cases where the actual stroke order is wrong.
In 274 files some or all of the stroke numbers are missing. This affects all kana and latin characters. Most of them were probably lost in the conversion to the combined xml/svg format ("kanji" directory). They still exist in the old "SVG" directory:
https://github.com/KanjiVG/kanjivg/tree/5e8ff1bed36d8e11866f83b67f5f5b5e5af384e0
The topic of misplaced stroke numbers was discussed a while ago in
https://groups.google.com/forum/?fromgroups=#!topic/kanjivg/-0qmqfLj_aE
Repeating my last post there:
This issue also covers #34, #30, #29, #28, #27, #19, #15, #14 (partial), #9, #8, #6, #2.
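For the missing numbers, a count like the 274 above could be reproduced by comparing the number of stroke paths with the number of number labels in each file. This is a minimal sketch, assuming strokes are `<path>` elements and numbers are non-empty `<text>` elements in the SVG namespace; `missing_number_count`, `scan` and `SAMPLE` are hypothetical names, not part of kvg.py.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

SVG = "{http://www.w3.org/2000/svg}"

# Hypothetical file with two strokes but only one number label.
SAMPLE = b"""<svg xmlns="http://www.w3.org/2000/svg">
<g><path d="M20,30c10,0,40,0,60,0"/><path d="M50,10c0,20,0,60,0,80"/></g>
<g><text transform="matrix(1 0 0 1 12 28)">1</text></g>
</svg>"""


def missing_number_count(svg_data):
    """Return how many stroke paths have no matching number label."""
    root = ET.fromstring(svg_data)
    paths = sum(1 for _ in root.iter(SVG + "path"))
    labels = sum(1 for t in root.iter(SVG + "text") if (t.text or "").strip())
    return max(paths - labels, 0)


def scan(directory):
    """Yield (file name, missing count) for every affected .svg file in a directory."""
    for svg_file in sorted(Path(directory).glob("*.svg")):
        missing = missing_number_count(svg_file.read_bytes())
        if missing:
            yield svg_file.name, missing


print(missing_number_count(SAMPLE))
```

Running `scan("kanji")` from a checkout would list the affected files, assuming the combined files live in that directory as described above.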
The kanjivg.xml file produced by kvg.py produces an error when you try to parse it, for example with Python's ElementTree, xml.etree.ElementTree.parse(kanjiVgFile) (with the file open and xml imported): "xml.etree.ElementTree.ParseError: unbound prefix: line 425, column 0"
The reason appears to be that the "kvg:" XML namespace isn't defined.
Adding an xmlns to the kanjivg tag seems to help:
index 5ad048b..85dd1e6 100755
--- a/kvg.py
+++ b/kvg.py
@@ -75,7 +75,7 @@ def release():
out.write(licenseString)
out.write("\nThis file has been generated on %s, using the latest KanjiVG data\nto this date." % (dat
out.write("\n-->\n")
- out.write("<kanjivg>\n")
+ out.write('<kanjivg xmlns:kvg="http://kanjivg.tagaini.net/format.html">\n')
for f in files:
data = open(os.path.join(datadir, f)).read()
data = data[data.find("<svg "):]
(See the W3C recommendation for tons of boring details on XML namespaces.)
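The effect can be reproduced without the real file. The snippet below is a small illustration, not taken from kvg.py: `BROKEN` and `FIXED` are made-up one-element documents, and the namespace URI is the one used in the patch above.

```python
import xml.etree.ElementTree as ET

KVG_NS = "http://kanjivg.tagaini.net/format.html"  # URI from the patch above

# Made-up miniature documents: one without and one with the declaration.
BROKEN = '<kanjivg><g kvg:element="一"/></kanjivg>'
FIXED = '<kanjivg xmlns:kvg="%s"><g kvg:element="一"/></kanjivg>' % KVG_NS


def parses(xml_text):
    """Return True if ElementTree accepts the document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def element_names(xml_text):
    """Read the kvg:element attribute of every <g>, using the expanded name."""
    root = ET.fromstring(xml_text)
    return [g.get("{%s}element" % KVG_NS) for g in root.iter("g")]


print(parses(BROKEN), parses(FIXED), element_names(FIXED))
```

Note that once the prefix is declared, ElementTree exposes the attribute under its expanded `{uri}element` name rather than as `kvg:element`, so any consumer has to use the same URI.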
Currently both kanji for うつ are drawn starting from the left tree radical. However, the correct stroke order is: first the upper middle radical (缶 for 鬱, メ for 欝), then left and right tree, then the rest. Reference: http://kakijun.main.jp/page/utsu200.html http://kakijun.main.jp/page/utsu25200.html
Some latin letters and punctuation marks (! . ; i j ?) have missing dots:
00021
0002e
0003a
0003b
00069
0006a
030fb
0ff01
0ff1a
0ff1f
(4 of them already reported in issue #36)
Stroke 6 to be specific
These characters look like they are left-justified now. This affects files in range
00021-0007a (latin letters and punctuation)
03041-03096 (hiragana)
030a1-030fa (katakana)
0ff01-0ff1f (full-width latin punctuation)
Note that there are only 3 files in the full-width latin range (?) Wouldn't it make more sense to use only the full-width range and drop the standard latin range? As I understand it, full-width letters are drawn on a square cell like kanji, and they are used to fit Western characters into Eastern typesetting. Normal latin letters are drawn on a rectangular cell.
(Example: JRJR)
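To pick out the affected files by name, something like the following could work. It is a sketch that assumes file names are the hexadecimal code point followed by an optional `-Variant` suffix and `.svg`; the table simply restates the four ranges listed above, and `affected_range` is a hypothetical helper name.

```python
# The four ranges listed above, as (first, last, label).
AFFECTED_RANGES = [
    (0x00021, 0x0007A, "latin letters and punctuation"),
    (0x03041, 0x03096, "hiragana"),
    (0x030A1, 0x030FA, "katakana"),
    (0x0FF01, 0x0FF1F, "full-width latin punctuation"),
]


def affected_range(filename):
    """Return the range label for a name like '03042.svg', or None if unaffected."""
    # Drop the extension and any variant suffix, then read the hex code point.
    stem = filename.split(".")[0].split("-")[0]
    code = int(stem, 16)
    for first, last, label in AFFECTED_RANGES:
        if first <= code <= last:
            return label
    return None


print(affected_range("03042.svg"))
```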
It seems to be OK, but the strokes are unordered and do not correspond to the numbers.
I fixed the stroke numbers for 漕/06f15-KaishoVtLst.svg and 膿/081bf-KaishoVtLst.svg and, while doing so, noticed that some of the kvg:NN attributes are wrong, not only in those files but also in the one I used as reference, 糟/07cdf-KaishoVtLst.svg.
Some strokes that are horizontal have kvg:type="㇑", that is, the file claims they are vertical. The grouping (<g kvg:element="NN">...</g>) doesn't make a lot of sense, either.
I think this may affect several kanji variants with a "曲" element with the two verticals drawn last. These seem to be:
05102-KaishoVtLst.svg
066f2-KaishoVtLst.svg
066f9-KaishoVtLst.svg
069fd-KaishoVtLst.svg
06f15-KaishoVtLst.svg
06fc3-KaishoVtLst.svg
079ae-KaishoVtLst.svg
07cdf-KaishoVtLst.svg
081bf-KaishoVtLst.svg
0825a-KaishoVtLst.svg
08276-KaishoVtLst.svg
08c4a-KaishoVtLst.svg
08ec6-KaishoVtLst.svg
08fb2-KaishoVtLst.svg
0906d-KaishoVtLst.svg
091b4-KaishoVtLst.svg
09c67-KaishoVtLst.svg
I don't want to muck around in there to fix them all by hand. (Some seem to be OK. I didn't check all of them.)
Maybe this could be done by a (somewhat ad-hoc-ish) script of some sort.
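As a starting point for such a script, here is a sketch that only finds the suspicious strokes and does not fix anything. It assumes the path data uses only M, L, C and S commands (absolute or relative) with plain decimal numbers, and it matches the type attribute by its local name so it does not depend on the exact kvg namespace URI. The names `endpoints`, `suspicious_verticals` and `SAMPLE` are invented for this example, and the namespace URI in `SAMPLE` is a placeholder.

```python
import re
import xml.etree.ElementTree as ET

TOKEN = re.compile(r"([MmLlCcSs])|([-+]?(?:\d*\.\d+|\d+))")
ARG_COUNT = {"m": 2, "l": 2, "c": 6, "s": 4}

# Hypothetical file: s1 is typed as vertical but runs left to right.
SAMPLE = """<svg xmlns="http://www.w3.org/2000/svg" xmlns:kvg="http://kanjivg.tagaini.net">
<path id="s1" kvg:type="㇑" d="M20,40c15,1,45,-1,60,0"/>
<path id="s2" kvg:type="㇑" d="M50,10c0,20,1,60,0,80"/>
<path id="s3" kvg:type="㇐" d="M20,60c15,1,45,-1,60,0"/>
</svg>"""


def endpoints(d):
    """Return (start, end) of a path that begins with M and uses only M, L, C, S."""
    x = y = 0.0
    start = None
    command, args = None, []
    for letter, number in TOKEN.findall(d):
        if letter:
            command, args = letter, []
            continue
        args.append(float(number))
        if len(args) == ARG_COUNT[command.lower()]:
            # The last coordinate pair of each segment is its end point.
            ex, ey = args[-2], args[-1]
            if command.islower():  # relative coordinates
                ex, ey = x + ex, y + ey
            x, y = ex, ey
            if start is None:
                start = (x, y)
            args = []
    return start, (x, y)


def suspicious_verticals(svg_text):
    """Return ids of paths typed ㇑ that move further sideways than downwards."""
    flagged = []
    for element in ET.fromstring(svg_text).iter():
        if not element.tag.endswith("path"):
            continue
        stroke_type = next(
            (v for k, v in element.attrib.items() if k.endswith("}type")), ""
        )
        if not stroke_type.startswith("㇑"):
            continue
        (x0, y0), (x1, y1) = endpoints(element.get("d"))
        if abs(x1 - x0) > abs(y1 - y0):
            flagged.append(element.get("id"))
    return flagged


print(suspicious_verticals(SAMPLE))
```

Comparing only the start and end points is crude, but for straight strokes like the ones in 曲 it should be enough to produce a list for manual review.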
Strokes 8, 9 should be interchanged.
http://www.sp.cis.iwate-u.ac.jp/icampus/u/akanji.jsp?k=%E8%99%90
http://www.yookoso.com/pages/kanji.php?file=display&jisdec=13652
http://sugiura5.gsid.nagoya-u.ac.jp/cgi-bin/komori/ww2k.cgi?code=2152
http://kakijun.main.jp/page/09190200.html
This character seems to be a variation of
http://www.mdbg.net/chindict/chindict.php?page=chardict&cdcanoce=0&cdqchi=%E8%99%90
in which the last horizontal stroke intersected the vertical line and was written last. In the current Japanese character it does not intersect and the stroke order has changed.
shouldn't the right component be 月 instead of 肉? (better fit with the meaning)
"tare" position is set to the parent element, which also creates one useless group.
The following files are empty:
0002e
0003a
030fb
0ff1a
They are supposed to contain some kind of dots or colons.
I noticed that hiragana & katakana defined by KanjiVG are not centered in their square. They appear on the left of the square while the kanji are perfectly centered.
The Tangorin project uses KanjiVG to render stroke order diagrams. When I looked up 半 there, I found that it draws the second stroke—the upper-right diagonal stroke—last, with all other strokes in their proper sequence. Oddly enough, when I looked at 半 on jisho.org (another project which uses KanjiVG), it has the correct stroke order.
The 1st and 2nd strokes may be swapped.
The files need to be renamed and un-jinmeied, since they have their own unicode codepoints.
63b4-jinmei is 6451 and
848b-jinmei is 8523.
This is a random find, I didn't do a systematic check of the jinmei variants.
The mirrored strokes resembling "E" are connected in these two kanji; strokes 9 and 10 should be merged into one. When the two "E" are separated, the whole left E must be written first and then the whole right E, so the order is definitely wrong.
These two kanji seem to be the ones that did not undergo a change with JIS X 0213:2004 (others, like 溲, 艘, 叟, had the mirrored "E"s separated).
The only reference I have:
http://kakijun.main.jp/page/u_j064200.html
http://kakijun.main.jp/page/soua13200.html
Even if the radical is 工 (48) 攴 (66), doesn't it make more sense to use the so-called 攵 (nobun/ノ-文) variant as the component? Side note: why do so many components not show up in the component search display ;_; (additional ones show up depending on the components previously selected). Also, it should maybe be noted that some users may be used to different styles of Chinese characters, such as 令 as mentioned above. (Interesting fact: calligraphy is the only legal use of traditional characters in China.) Tae August 21, 2010, at 07:02 AM
The stroke paths are correct. But the text numbers are wrong: 2 should be on the down stroke, 3 on the lower diagonal stroke on the left.
These characters appear to be missing the short slanting stroke just above 日 on the right side.
createsvgfiles.py was deleted in commit f056bce
I have no clue if viewer.py is useful for anything or if it is obsolete too, just tried to run it and was puzzled by missing reference which google knows nothing about :)
Must be "26951HzFst.svg"
澀 Is the 16 strokes variant more used than the 17 strokes variant? z-one March 01, 2011, at 04:41 AM
梍 The last part is either 七 or 匕. Currently 匕 is shown in the structure but then shouldn't the stroke direction be the opposite? z-one March 01, 2011, at 12:16 AM
懋 The top middle element comes first according to kakijun. z-one February 27, 2011, at 11:25 AM
顏 The stroke order for the part that looks like an X is most probably mixed, because in other characters the order is always the opposite. z-one February 22, 2011, at 02:26 AM
Many kanji that include 瓦 (hex codes: 0x74e7 0x74e9 0x74ee 0x74f0 0x74f1 0x74f2 0x74f7 0x74f8 0x7503 0x7504 0x7505 0x750c 0x750d 0x750e 0x7511 0x7513 0x7515). Stroke 2 of the component should maybe be split into two distinct strokes.
first radical of 捌 is inconsistent with the above/below kanji
in 喚, maybe ㇟a should be ㇟a/㇏
Characters like 餅 and 遡 don't show JIS X 0213:2004 standard forms. This is a case of being old rather than being wrong, but they ought to be updated eventually. December 16, 2010, at 07:20 AM
筆 the 6th stroke is incorrect. November 2, 2010
爨 first the top left, then the top middle and then the top right. z-one August 23, 2010, at 08:52 PM
叟 is 10 strokes and not 9. Because of this error the kanji 艘 is missing a stroke. (The last stroke in the list does nothing.) z-one April 17, 2010, at 10:20 PM
Kanji like 羸, 蠃 and 贏 that have 月 and 凡 might not have the correct stroke order. The middle element between these two should come first. z-one April 15, 2010, at 04:08 AM
尨 The top dot of 尤 should come last. z-one April 15, 2010, at 04:08 AM Need to be double-checked Gnurou April 22, 2010, at 05:11 PM
慥 The part that looks like 牛 should follow the part's stroke order with the two horizontal lines first. z-one April 15, 2010, at 04:46 AM Doubt it - according to kakijun it is, but the same site agrees with KanjiVG for the stroke order of the right component. Gnurou April 22, 2010, at 05:11 PM
陞 The part in top right corner might not have the correct stroke order, as it differs from 升. z-one April 21, 2010, at 01:00 PM
襾 (and kanji containing it, like 覊), the short horizontal stroke at the middle should be last in my opinion (though it is a bit difficult to confirm). z-one January 16, 2010, at 04:18 AM
黽 As far as I know the 5th and 6th strokes are swapped, and the 7th and 8th strokes as well. z-one January 12, 2010, at 07:26 AM
縛 and 博: their component is said to be 尃, but shouldn't it be 専 instead? Gnurou October 19, 2009, at 08:57 AM
滿 and compounds Gnurou October 19, 2009, at 08:28 AM
厖 Gnurou June 10, 2009, at 10:06 PM
叟 and all kanji that use it - are there two ways to draw this kanji? Gnurou June 10, 2009, at 10:05 PM No, looks like the way used in 搜 is the right one, as the XML data and other sources seem to confirm.
頃 Gnurou June 10, 2009, at 10:05 PM
嚢 Gnurou June 10, 2009, at 10:05 PM
鋏 Gnurou June 10, 2009, at 10:05 PM
令 and all kanji using it: bottom part looks different on some fonts? Which one is right? Gnurou June 10, 2009, at 10:05 PM
禸 and all kanji that use it: missing component information for ム and 冂? Gnurou June 24, 2009, at 12:23 AM
As reported by a Tagaini user: there is a problem with the keys 黑 and 黒. They are the same key but they are listed differently, i.e. 黙 is listed only under 黑. Gnurou July 13, 2009, at 01:31 AM
飴 the 3rd stroke is incorrect (vertical, but should be horizontal) and the 7th and 8th stroke are wrong eno October 08, 2009, at 08:56 AM
As reported by Jan Eichhorn:
Maybe the new thing is that some were lost in the xml=>svg unification. Kanji are affected too. There is also a smaller problem with dots missing in latin characters (ij.;!). These are already missing in the first version found on github. I don't know what came before that. If the dots have ever existed, they might have been a victim of conversion too (someone assuming there can be no "circle" elements).
剝 (0x525D) and 0x20B9F are given as the main variant in the official jouyou list (other variants are allowed). They are variants of the much more common
剥 (0x5265)
叱 (0x53F1)
The differences are in the pig snout radical 彑 and the spoon radical 匕. For both I couldn't find any match of the required variant in kanjivg.
Would it be possible to change the package download links to a static one? Or create separate links? This would make it easier to sync to the latest version. All the other dictionary sources like jmdict, jmnedict, kanjidic, etc. provide static links to the latest version.
For example, change:
https://github.com/downloads/KanjiVG/kanjivg/kanjivg-20120219.xml.gz
to:
https://github.com/downloads/KanjiVG/kanjivg/kanjivg-current.xml.gz
The text under the link already includes the upload date, so people won't be confused.
Thank you.
The 4 variants of 05618 are all displaced to the right (start points have an offset of ~200 points). 05618.svg is not affected.
Latin letters from the normal range (00020-...) should be moved to the full-width range (0FF00-...), since they graphically represent full-width characters, being drawn on a square canvas. The normal range should be abandoned.
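If the files were moved, the new names could be computed rather than typed by hand. Unicode places the full-width forms of U+0021..U+007E at U+FF01..U+FF5E, a fixed offset of 0xFEE0. The sketch below assumes file names of the form five-digit lowercase hex, optional `-Variant` suffix, `.svg`; `fullwidth_name` is a hypothetical helper.

```python
FULLWIDTH_OFFSET = 0xFEE0  # U+0021..U+007E correspond to U+FF01..U+FF5E


def fullwidth_name(filename):
    """Map e.g. '00021.svg' to '0ff01.svg'; other names are returned unchanged."""
    stem, dot, extension = filename.partition(".")
    base, dash, variant = stem.partition("-")
    code = int(base, 16)
    if not 0x21 <= code <= 0x7E:
        return filename
    return "%05x%s%s%s%s" % (code + FULLWIDTH_OFFSET, dash, variant, dot, extension)


for name in ("00021.svg", "0003a.svg", "04e00.svg"):
    print(name, "->", fullwidth_name(name))
```

The space character (U+0020) is deliberately left out, since its full-width counterpart is U+3000 and does not follow the same offset.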