Mailing List Archive

Re: tlug: Mule-begotten problems for Emacs and Gnus



Stephen Turnbull writes:
> The Unicode issue, as such, is a red herring.  Unicode is a Western
> imperialist plot in the minds of many Orientals.  Europeans are not
Imperialist plot? Hmmm. "Imperialist plot" appears a bit
far-fetched to me. It is at least an attempt to create a global
unified code to represent all characters that are in use. If it
is inadequate, it should be pointed out where, and proposals for
how to fix that should be made. There is no doubt that a unified
global code is of utmost importance. Just imagine if every
country had its own email, news, HTML etc. protocols. This would
effectively render international networking impossible.
> going to change from one byte ISO 8859 encodings to two byte Unicode
> encodings for most purposes; why should the Orientals be restricted to
Well, obviously. It IS a technical advantage to have a character
set with only very few characters.
> Unicode?  And in fact there is a real loss to Orientals in using
> Unicode (the "Han unification" problem), unlike the Eastern Europeans
I have to admit that I am not familiar with that problem, but
16 bits yield a character space of 65536 characters. The Chinese
use about 10000 Kanji or so, the Japanese 4000 or so. The
Chinese Kanji and the Japanese Kanji overlap by 95% or so, thus
yielding a total character count of maybe 11000. Most other
languages use alphabets of some kind in which the character
count rarely exceeds 200. Let's assume there are 20 of these
alphabets around, so there's an additional 4000 characters. Thus
we end up with 15000 characters in total. This is not even a
quarter of the available character space. So where is the
problem?
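
As a rough check of the arithmetic above, here is a minimal Python
sketch. Every figure in it is one of the estimates from the
paragraph, not a measured count, and the variable names are only
illustrative.

```python
# Back-of-the-envelope check of the estimate in the text.
# All counts are the rough guesses quoted above, not real statistics.
CODE_SPACE = 2 ** 16         # number of values in a 16-bit code space
han_total = 11_000           # combined Chinese/Japanese estimate from the text
alphabets = 20               # assumed number of alphabetic scripts
chars_per_alphabet = 200     # assumed upper bound per alphabet

alphabet_total = alphabets * chars_per_alphabet
total = han_total + alphabet_total
fraction = total / CODE_SPACE

print("code space:     ", CODE_SPACE)
print("alphabet total: ", alphabet_total)
print("estimated total:", total)
print("fraction used:   %.3f" % fraction)
```

Under these assumptions the estimate stays below a quarter of the
16-bit space; the conclusion is only as good as the input guesses.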

It may well be that the character numbering in such a case is
not very systematic, but this shouldn't be much of a problem:
table-driven libraries etc. could be made available to hide that
fact from the programmer. (Actually, it's only a problem with
sorts. In Japanese, because of the ON/KUN readings, a
dictionary-driven sort is required anyway.)
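
To illustrate the point about dictionary-driven sorting, here is a
minimal Python sketch. The READINGS table is a tiny hypothetical
stand-in for a real reading dictionary, and ordering the kana by
code point is only an approximation of proper Japanese collation.

```python
# Sorting by raw code value versus sorting through a reading table.
# READINGS is a hypothetical miniature dictionary for illustration only.
READINGS = {
    "東京": "とうきょう",
    "大阪": "おおさか",
    "京都": "きょうと",
    "名古屋": "なごや",
}


def reading_key(word):
    """Return the kana reading if known, else fall back to the word itself."""
    return READINGS.get(word, word)


words = ["東京", "大阪", "京都", "名古屋"]

by_code_point = sorted(words)                 # order given by character numbers
by_reading = sorted(words, key=reading_key)   # order given by the table

print("by code point:", by_code_point)
print("by reading:   ", by_reading)
```

The two orderings differ, and only the table-driven one follows the
readings, whatever numbering the character set happens to use.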
> The real issue is not Unicode qua Unicode; while the Europeans in
> general would love to use Unicode to handle Oriental character sets,
> many Orientals are adamantly opposed.  Books have been written about
But what do they propose instead? The present state of things
is several (mostly incompatible) encoding schemes per country. In
Japan, at least 3 are in general use: JIS, Shift-JIS and EUC. As
far as I know, the situation in China is much the same. Of course
the Chinese and the Japanese encoding schemes are incompatible.
This sounds VERY MUCH like a Tower-of-Babel story to me. It is
obvious that this is SERIOUSLY hampering global networking and
global software development.
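
The incompatibility of the three Japanese encodings can be seen
with a minimal Python sketch. It assumes that "JIS" here means the
7-bit ISO-2022-JP form, which Python ships as the "iso2022_jp"
codec; the sample string and the helper try_decode are only
illustrative.

```python
# One string, three byte sequences: the Japanese encodings named above.
# Assumption: "JIS" is taken to mean ISO-2022-JP (codec "iso2022_jp").
text = "日本語"  # "Japanese language"

encodings = ["iso2022_jp", "shift_jis", "euc_jp"]
encoded = {name: text.encode(name) for name in encodings}

for name, data in encoded.items():
    print(f"{name:12} {data.hex(' ')}")


def try_decode(data, name):
    """Decode bytes with the given codec; return None if they are invalid."""
    try:
        return data.decode(name)
    except UnicodeDecodeError:
        return None


# Reading Shift-JIS bytes as if they were EUC does not give the text back.
misread = try_decode(encoded["shift_jis"], "euc_jp")
print("Shift-JIS read as EUC:", misread)
```

Each encoding round-trips with its own codec, but bytes written in
one scheme cannot be read correctly under another without knowing
which scheme was used.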
> how Unicode will be the demise of the Japanese language, for example.
>....(Very interesting description of Emacs internals)
To put it bluntly: as far as I understand, Emacs development is
seriously suffering from the fact that there is no global unified
encoding scheme in general use by now. Lots of time is actually
wasted on customizing the software for the individual encoding
schemes.

To make things even worse, a glance at the names of
implementers of free software shows that the vast majority of
them are of Euro-American origin. It is safe to assume that most
of them don't have a background in non-European linguistics.
Thus their work will always implicitly be Western-centered.
However, if a universal unified character encoding scheme were
in general use, those people would use that and the problems
would vanish for the most part (specific entry methods would
still be necessary, but the rest would be the same anywhere).

It is easy to see that this would solve a LOT of problems. To
maintain that an attempt to create a unified character encoding
scheme like Unicode is an "imperialistic plot" appears to me
unfair to the implementers and unconstructive as well. It
doesn't get us anywhere. There is no doubt that there might be
incompatibilities with older standards. But that's life. Unified
standards have always proven to be a big advantage in the long
run.

BTW, the Linux 2.0.x kernels all have Unicode built in. So, is
the Linux community planning an imperialistic plot against
Asians (an interesting question, as some of its members are in
fact Asians...)?

                                Karl-Max Wagner
                                karlmax@example.com
---------------------------------------------------------------
Next TLUG Nomikai: 14 January 1998 19:15  Tokyo station
Yaesu Chuo ticket gate.  Or go directly to Tengu TokyoEkiMae 19:30
Chuo-ku, Kyobashi 1-1-6, EchiZenYa Bld. B1/B2 03-3275-3691
Next Saturday Meeting: 14 February 1998 12:30 Tokyo Station
Yaesu Chuo ticket gate.
---------------------------------------------------------------


