[KS] unicode

Otfried Cheong otfried at airpost.net
Sat May 30 08:13:58 EDT 2015


On Sat, May 30, 2015, at 16:59, Charles Muller wrote:
> ISO was forced to make a number of dramatic shifts in policy based on 
> technical developments in computing and the various demands of national 
> bodies, and I do know that the Compatibility Ideographs for Korean were 
> established based on some kind of misunderstanding by an early Korean 
> IRG team over the glyph/character/codepoint issue.

I think in this particular case this is an unfair accusation.  The
original IRG (when it was still called the CJK-JRG) worked within a
given set of rules, most importantly the source separation rule.  They
had no choice but to provide separate code points for "characters" that
already had distinct code points in one of the encodings that Unicode
had - a priori - accepted as official.

Since KSC made a distinction between U+907C and U+F9C3, they really had
no choice but to provide two code points.  With hindsight, it might have
been better to dedicate a "variant marker" for this - but given the state
of technology in the early 1990s, this would not have been acceptable
for round trip preservation of an official national encoding.
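
As a small illustration of the two code points, here is a sketch using
Python's standard unicodedata module (Python and the helper name
fold_compat are my own choice for illustration, not anything the IRG
used).  It relies on the fact that U+F9C3 has a canonical decomposition
to U+907C, so any Unicode normalization form folds the two together:

```python
import unicodedata

UNIFIED = "\u907c"  # the unified ideograph
COMPAT = "\uf9c3"   # the Korean compatibility ideograph

def fold_compat(text: str) -> str:
    """Hypothetical helper: apply NFC, which replaces a CJK
    compatibility ideograph by its canonical equivalent."""
    return unicodedata.normalize("NFC", text)

if __name__ == "__main__":
    # The two are distinct code points as far as comparison goes ...
    print(COMPAT == UNIFIED)
    # ... but the character database records one as the canonical
    # decomposition of the other,
    print(unicodedata.decomposition(COMPAT))
    # so normalization erases the distinction KSC made.
    print(fold_compat(COMPAT) == UNIFIED)
```

In other words, the distinction survives only as long as nobody
normalizes the text.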

And without guaranteeing round trip preservation, Unicode would have died at
that point. (Surprisingly, again in hindsight, it turns out that round
trip preservation may not have been as important as it may have seemed
at that time.  Today, the mapping tables used by various vendors differ
slightly, and so converting from a national encoding to Unicode and back
on a different computer is not guaranteed to give you back the same
data.)
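
A rough sketch of what round trip preservation means in practice,
assuming Python's bundled "euc_kr" codec as a stand-in for the national
encoding (its mapping table is one vendor's choice, which is exactly the
caveat above; the function names are hypothetical):

```python
import unicodedata

UNIFIED = "\u907c"
COMPAT = "\uf9c3"

def round_trip(text: str, codec: str = "euc_kr") -> str:
    """Convert Unicode -> legacy encoding -> Unicode."""
    return text.encode(codec).decode(codec)

def round_trip_after_folding(text: str, codec: str = "euc_kr") -> str:
    """Same, but normalize to NFC first, which replaces the
    compatibility ideograph by the unified one."""
    folded = unicodedata.normalize("NFC", text)
    return round_trip(folded, codec)

if __name__ == "__main__":
    # With two separate code points the legacy data comes back intact.
    print(round_trip(COMPAT) == COMPAT)
    print(round_trip(UNIFIED) == UNIFIED)
    # The legacy encoding really does store them as different bytes.
    print(COMPAT.encode("euc_kr") != UNIFIED.encode("euc_kr"))
    # Once the two are folded together, the original can no longer
    # be recovered from the round trip.
    print(round_trip_after_folding(COMPAT) == COMPAT)
```

This only shows the behaviour of one mapping table on one machine; it
says nothing about what another vendor's table would return.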

I think the IRG was well aware at that time that these Korean duplicate
characters were an undesirable anomaly that violated the spirit of Han
unification - which shows in the fact that they were relegated to the
block of legacy code points.

There are certainly some other early decisions where I fully agree with
Professor Muller that they were at best misguided - for instance the
decision to waste 11172 good code points on Hangul syllables, many of
which have never been and will never be used in any text document.  At
that time already many Koreans agreed that the previous set of commonly
used Hangul syllables was enough, and that support for encoding all
possible remaining syllables using combining Jamo characters would have
been much more useful.
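
For reference, the figure 11172 is 19 leading consonants x 21 vowels x
28 trailing consonants (counting "no trailing consonant" as one of the
28).  The sketch below, in Python with a hypothetical helper named
compose, uses the constants of the Unicode Standard's Hangul syllable
composition arithmetic to show that every precomposed syllable is
equivalent to a short sequence of combining Jamo:

```python
import unicodedata

# Constants from the Unicode Standard's Hangul composition arithmetic.
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28

def compose(lead: str, vowel: str, trail: str = "") -> str:
    """Hypothetical helper: map conjoining Jamo to the precomposed
    syllable.  No range checking is done on the arguments."""
    l = ord(lead) - L_BASE
    v = ord(vowel) - V_BASE
    t = ord(trail) - T_BASE if trail else 0  # index 0 = no trailing
    return chr(S_BASE + (l * V_COUNT + v) * T_COUNT + t)

if __name__ == "__main__":
    # Size of the precomposed block.
    print(L_COUNT * V_COUNT * T_COUNT)
    # Three Jamo (U+1112, U+1161, U+11AB) versus one code point.
    jamo = "\u1112\u1161\u11ab"
    syllable = compose(*jamo)
    print(hex(ord(syllable)))
    # Normalization converts between the two representations.
    print(unicodedata.normalize("NFC", jamo) == syllable)
    print(unicodedata.normalize("NFD", syllable) == jamo)
```

So the Jamo sequences alone can already express every syllable in the
block; the precomposed code points add no expressive power.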

Kind regards,
  Otfried 




