Mailing List Archive



Re: [tlug] Re: Unicode



On Sat, Jul 12, 2003 at 01:44:32PM +0900,
Charles Muller wrote:
> Shimpei Yamashita wrote:
> 
> > Jim, what I don't quite understand is this: exactly what problem is Unicode
> > meant to solve anyway? 
> 
> I am sure that Jim can provide his own answer, but I think the answer to
> this question is obvious: It is meant to solve the problem of incompatible
> coding systems, which was a severe impediment to the exchange of
> information. Especially in the case of Han characters, which have a high
> degree of graphical and semantic equivalence, it was ridiculous to continue on
> in a situation where people using computers in Japan, Korea, China, and
> Taiwan could not talk to each other.

I certainly appreciate the need for an omnifarious text coding standard in
which different languages can express themselves without stepping on each
other's toes.

But that, and combining kanji glyphs, seem to be orthogonal problems to me.
In different CJK nations, the characters don't necessarily look the same, they
aren't read the same, and they don't even always mean the same thing. If all
you wanted to do was to create a coding standard in which no two languages ever
clashed with each other, you could have given each language's glyphs different
code points. So why was this not done? I'm sure there were good rationales
behind it--code point economy? ease of lookup?--but it doesn't follow
automatically from Unicode's goal as you stated it.
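
The unification in question can be seen in a minimal sketch, assuming Python 3
and its standard codecs. The character 中 is only an illustrative choice, and
the helper names are hypothetical: four regional legacy encodings each have
their own bytes for it, yet all of them decode to a single Unicode code point.

```python
# Illustrative character: 中 (U+4E2D), present in all four legacy sets below.
CHAR = "\u4e2d"

# Python standard-library codec names for four regional legacy encodings.
LEGACY_CODECS = ["euc_jp", "gb2312", "big5", "euc_kr"]


def legacy_bytes(ch):
    """Map each codec name to the bytes that codec uses for ch."""
    return {codec: ch.encode(codec) for codec in LEGACY_CODECS}


def decoded_code_points(encoded):
    """Decode each byte string and collect the resulting code points."""
    return {ord(raw.decode(codec)) for codec, raw in encoded.items()}


encoded = legacy_bytes(CHAR)
for codec, raw in encoded.items():
    # Show the bytes each regional encoding assigns to the character.
    print(f"{codec:8} {raw.hex()}")

# Every legacy byte sequence collapses to the same unified code point.
print([f"U+{cp:04X}" for cp in decoded_code_points(encoded)])
```

Once decoded, the string keeps no record of which regional encoding it came
from, so any per-language distinction has to travel outside the code point
itself, for example in the font or in markup.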

> > It's easy to dismiss Unicode opponents as nationalist
> > counter-revolutionaries, but it isn't clear to me (yet) that the Unicode camp
> > has addressed their grievance adequately.
> 
> As far as I can tell, it is because the grievances are largely based on
> misunderstandings of what Unicode is supposed to do. Almost all of the
> grievances that I have heard from anti-Unicode people have been quibbles
> about small, idiosyncratic differences in glyph representation, which can
> very easily be handled at the level of font, and thus there is no problem
> assigning a single code point.

As an academic, I'd hope you're above hand-waving problems away as "very
easy" when you're explaining things to an amateur. The fact that you need font
information, as well as coding information, in order to have a completely
accurate rendition of the intended text implies that you're trying to
hand-wave away a fundamental problem: Unicode is less information-complete in
representing Japanese text than ISO-2022, EUC-JP, etc. So the Unicode
consortium made a sacrifice. 

I'm not saying that this is necessarily bad; clearly some smart people thought
this was acceptable, or possibly good. So what was the rationale behind the
sacrifice? By rationale, BTW, I'm asking how that particular decision to make
a sacrifice made Unicode a *better* product than if you chose not to combine
the code points--surely *something* must have been gained in exchange for even
a very tiny step backwards for Japanese expression.

> There are of course a very small percentage of _bimyou_ cases where
> expert-level debate needs to take place to determine whether or not a
> character is a variant of another (and if so, what kind of variant). But the
> fact that more of these did not get hashed out at the early stages is again,
> from what I understand, due more to the problems of non-cooperation rather
> than unawareness or arbitrary forcing on the part of the Unicode consortium.

OK, that was the process. However, as an end-user (aka luser), I don't give a
cow what the *process* was; I care about the *results*, because the results
are what I will be using every day, not the process. If you want to present
Unicode as an acceptable alternative for expressing Japanese, it needs to be
done in ways other than "well, too bad about your problems, because you people
were un-cooperative while we were working on it and we figured they were minor
anyway" (sorry for paraphrasing your argument, but I can't draw any other
conclusion from what you've said so far). Again, what I'd like to hear is how
sacrificing expressivity in certain fringe cases made Unicode better as an
overall product.

> The other thing that I would like to stress is that from the early days up
> to the present, the Unicode consortium has been quite open to suggestions
> and reasonable proposals set forth by properly accredited groups and
> individuals, and therefore the Unicode character set continues to grow and
> be refined.

If I hadn't known beforehand that you are a well-intentioned person, this
sentence would have made me very angry. You know you're talking to a
properly non-accredited individual who has no inside contacts; ergo, lacking
other information, the above paragraph is exactly equivalent to "plebeian,
the Unicode consortium is not interested in your needs or thoughts"! I assume
that wasn't the precise message you wanted to convey, though. What did you
really want to say?

> I don't say that Unicode is problem-free. But I can tell you that people
> like myself who work with classical East Asian literary texts would still be
> in the dark ages if Unicode had not come along. Maybe some day in the future
> Unicode will be replaced with something better, and if so, that's fine. But
> to have left things in the fragmentary form they were would have been
> absurd.

I may have missed your previous posts on this, but how exactly does
Unicode help you? And would the matter of combining code points have any effect
on your work? I'd be curious to know.

Shimpei.

