[KS] unicode
Frank Hoffmann
hoffmann at koreanstudies.com
Sat May 30 01:28:10 EDT 2015
This is fun!
Professor Muller, a quick question then:
As Professor Cheong just wrote, the Korean variants (of characters that
are pronounced in more than one way in Korean) are the ONLY ones that
are encoded as such in Unicode, by "doubling" the glyphs and then
putting these grouped together into a separate 'block' within each font
(and then also assigning the code page accordingly).
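
To make the "doubling" concrete, here is a minimal sketch that uses only Python's standard `unicodedata` module. It looks up which ordinary (unified) character a code point in the "CJK Compatibility Ideographs" block (U+F900 to U+FAFF) is a duplicate of. The helper name `unified_counterpart` is made up for illustration, and the two sample code points are just examples from the start of the block.

```python
import unicodedata


def unified_counterpart(ch):
    """Return the unified ideograph that a compatibility ideograph
    decomposes to, or None if the character has no such single-character
    canonical decomposition. (Illustrative helper, not a standard API.)"""
    decomp = unicodedata.decomposition(ch)
    # Empty string: no decomposition. "<...>" prefix: a compatibility
    # (non-canonical) mapping, which is not what we are looking for here.
    if not decomp or decomp.startswith("<"):
        return None
    parts = decomp.split()
    if len(parts) != 1:
        return None
    return chr(int(parts[0], 16))


# Two sample code points from the beginning of the block.
for cp in (0xF900, 0xF90A):
    target = unified_counterpart(chr(cp))
    print(f"U+{cp:04X} {chr(cp)} is a duplicate of U+{ord(target):04X} {target}")
```

One consequence worth keeping in mind: because the duplicate is tied to the ordinary character in this way, software that applies Unicode normalization will replace the duplicate with the ordinary character, so the distinction may not survive every processing step.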
As I just now found, and also confirmed in the CHINESE font (within the
Google "Noto" fonts that you introduced us to), these extra glyph
blocks do physically exist in all three fonts: Korean, Japanese, and
Chinese. But only the KOREAN and the CHINESE ones have the code pages
that allow us users to access the various versions of these
"double-pronunciation" characters. In the Japanese font the glyphs are
there but cannot be accessed, as they are not assigned in the code page.
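
The difference between "the glyph exists" and "the glyph can be accessed" can be sketched as a simple check: given the set of code points that a font maps to glyphs, see how many fall inside the U+F900 to U+FAFF block. This is a sketch under assumptions: the function name and the sample sets below are invented for illustration and are not read from the actual Noto fonts. Extracting the real mapping from a font file would need a font library and is left out here.

```python
# Range of the "CJK Compatibility Ideographs" block.
COMPAT_BLOCK = range(0xF900, 0xFAFF + 1)


def compat_coverage(mapped_code_points):
    """Return the sorted code points in the compatibility block that the
    font maps to a glyph. `mapped_code_points` is any iterable of ints
    (hypothetical input; how it is obtained is outside this sketch)."""
    mapped = set(mapped_code_points)
    return sorted(cp for cp in COMPAT_BLOCK if cp in mapped)


# Invented sample data, NOT taken from real fonts:
font_with_mapping = {0x0041, 0x4E8B, 0xF900, 0xF90A}
font_without_mapping = {0x0041, 0x4E8B}

for label, mapped in (("with mapping", font_with_mapping),
                      ("without mapping", font_without_mapping)):
    hits = compat_coverage(mapped)
    print(f"{label}: {len(hits)} code point(s) reachable in the block")
```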
Through Professor Cheong's mail I now understand that the Japanese (who
would, in my opinion, benefit the most from this) do not have such
special glyph blocks for their Kanji with double/triple/... readings
-- no "CJK Compatibility Ideographs" block within Unicode -- and after
he said so I now also clearly see that. For those who do not feel like
opening a font themselves, please see attached screen shot of that "CJK
Compatibility Ideographs" block from within one of the Noto fonts, so
you get a clear idea of what we are talking about. The fonts start with
ASCII, then all kinds of other special symbols, then the very large
block (maybe 40,000 or more?) of Chinese characters, and then -- you
see it in that screen shot -- Middle Korean letters, followed by the
block of "CJK Compatibility Ideographs" (as you see, not too many in
Hanmun). (See end of message!)
My question is this: The Japanese (or better, the Japanese fonts)
still have not assigned a code block for those ALREADY EXISTENT Korean
special characters (to the entire "CJK Compatibility Ideographs" block,
that is). Why is this, given that the Chinese fonts do? There is
obviously no technical reason not to, since the glyphs are already there.
The other, FAR MORE IMPORTANT question:
If this doubling of glyphs is something only the Koreans do (if it only
found its way into Unicode as some sort of compromise, a historical
decision based on the politics of dealing with people who did not fully
understand the system) -- then I wonder if there is another technical
means (via the code pages) to do a complete reversal? I think there is
no other way. If you enter, as in Andrew's earlier example, こと and
ごと for 事 and both times get the identical glyph assigned (for the
same Kanji), with the same Unicode encoding, then one can obviously not
go back to こと or ごと from 事. So, I do not understand yet why
Unicode did not "go the Korean way" in this case?
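
The reversal problem can be shown in a few lines. The sketch below uses a toy two-entry table (made up for illustration, not a real input-method dictionary) in which both readings produce the same character. Once only the character is stored, nothing in the text says which reading was typed.

```python
from collections import defaultdict

# Toy reading-to-character table (illustrative only).
READING_TO_CHAR = {"こと": "事", "ごと": "事"}


def invert(table):
    """Group readings by the character they produce."""
    inverse = defaultdict(list)
    for reading, ch in table.items():
        inverse[ch].append(reading)
    return dict(inverse)


def recover_reading(ch, table):
    """Return the reading for `ch` only if it is unambiguous, else None."""
    candidates = invert(table).get(ch, [])
    return candidates[0] if len(candidates) == 1 else None


for reading, ch in READING_TO_CHAR.items():
    print(f"{reading} -> {ch} (U+{ord(ch):04X})")

print("candidates for 事:", invert(READING_TO_CHAR)["事"])
print("recovered reading:", recover_reading("事", READING_TO_CHAR))
```

Both readings end up at the same code point, so `recover_reading` has two candidates and cannot choose between them. Under the "Korean way", each reading would instead have its own code point, and the lookup back would be unambiguous as long as those code points are preserved.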
Best,
Frank
-------------- next part --------------
A non-text attachment was scrubbed...
Name: CJK-Compatibility-Ideographs.jpg
Type: image/jpeg
Size: 489253 bytes
Desc: not available
URL: <http://koreanstudies.com/pipermail/koreanstudies_koreanstudies.com/attachments/20150529/fb251ce2/attachment.jpg>
-------------- next part --------------
--------------------------------------
Frank Hoffmann
http://koreanstudies.com