Re: tlug: Mule-begotten problems for Emacs and Gnus
- To: tlug@example.com
- Subject: Re: tlug: Mule-begotten problems for Emacs and Gnus
- From: Karl-Max Wagner <karlmax@example.com>
- Date: Fri, 9 Jan 1998 06:55:52 +0000 (GMT)
- Content-Transfer-Encoding: 7bit
- Content-Type: text/plain; charset=US-ASCII
- In-Reply-To: <m0xpm7D-00012bC@example.com> from "Stephen J. Turnbull" at Jan 7, 98 12:25:11 pm
- Reply-To: tlug@example.com
- Sender: owner-tlug@example.com
Stephen Turnbull writes:

> The Unicode issue, as such, is a red herring. Unicode is a Western
> imperialist plot in the minds of many Orientals. Europeans are not

Imperialist plot? Hmm. That appears a bit far-fetched to me. Unicode is at least an attempt to create a global unified code that represents all characters in use. If it is inadequate, it should be pointed out where, and proposals for fixing it should be made. There is no doubt that a unified global code is of the utmost importance. Just imagine if every country had its own email, news, HTML, etc. protocols: that would effectively render international networking impossible.

> going to change from one byte ISO 8859 encodings to two byte Unicode
> encodings for most purposes; why should the Orientals be restricted to

Well, obviously. It IS a technical advantage to have a character set with only very few characters.

> Unicode? And in fact there is a real loss to Orientals in using
> Unicode (the "Han unification" problem), unlike the Eastern Europeans

I have to admit that I don't know that problem, but 16 bits yield a character space of 65536 characters. The Chinese use about 10000 Kanji or so, the Japanese about 4000. The Chinese Kanji and the Japanese Kanji overlap by 95% or so, yielding a total character count of maybe 11000. Most other languages use alphabets of some kind whose character count rarely exceeds 200. Let's assume there are 20 of these alphabets around; that adds another 4000 characters. Thus we end up with about 15000 characters in total, which is not even a quarter of the available character space (a back-of-the-envelope check appears below). So where is the problem? The character numbering in such a scheme may lack a bit of systematics, but that shouldn't be much of a problem: table-driven libraries etc. could be made available to hide that fact from the programmer. (Actually, it is only a problem for sorting, and in Japanese the ON/KUN readings mean a dictionary-driven sort is required anyway; see the second sketch below.)

> The real issue is not Unicode qua Unicode; while the Europeans in
> general would love to use Unicode to handle Oriental character sets,
> many Orientals are adamantly opposed. Books have been written about

But what do they propose instead? The present state of things is several (mostly incompatible) encodings per country. In Japan, at least three are in general use: JIS, Shift-JIS and EUC (the third sketch below shows how they differ). As far as I know, the situation in China is much the same. Of course, the Chinese and the Japanese encoding schemes are incompatible with each other. This sounds VERY MUCH like a Tower-of-Babel story to me. It is obvious that this seriously hampers global networking and global software development.

> how Unicode will be the demise of the Japanese language, for example.
> ....(Very interesting description of Emacs internals)

To put it bluntly: Emacs development is seriously suffering from the fact that no globally unified encoding scheme is in general use yet. As far as I understand it, a lot of time is wasted customizing the software for the individual encoding schemes. To make things even worse, a glance at the names of the implementers of free software shows that the vast majority of them are of Euro-American origin. It is safe to assume that most of them have no background in non-European linguistics, so their work will always be implicitly Western-centered.
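For what it's worth, here is the back-of-the-envelope check mentioned above, as a small Python sketch. All of the counts are the rough estimates from this message, not authoritative figures.

    # Rough check of the character-space estimate; every count here is
    # the approximate figure quoted above, not authoritative data.
    code_space = 2 ** 16        # 16-bit Unicode: 65536 code points

    han_total = 11000           # Chinese + Japanese Kanji after ~95% overlap
    alphabets = 20 * 200        # ~20 alphabets of at most ~200 characters each
    total = han_total + alphabets

    print(total)                # 15000
    print(total / code_space)   # ~0.23 -- not even a quarter of the space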
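The second sketch illustrates the dictionary-driven sort: raw code points cannot order Kanji words sensibly, but a table of kana readings can. The three-entry table is a made-up example, not real dictionary data.

    # Dictionary-driven sort: order Kanji words by a kana reading looked
    # up in a table rather than by their raw code points.
    readings = {
        "東京": "とうきょう",   # Tokyo
        "大阪": "おおさか",     # Osaka
        "京都": "きょうと",     # Kyoto
    }
    words = ["東京", "大阪", "京都"]

    print(sorted(words))                             # raw code-point order
    print(sorted(words, key=lambda w: readings[w]))  # 大阪, 京都, 東京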
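The third sketch shows the incompatibility of the three Japanese encodings concretely; it uses modern Python codec names purely for illustration ('iso2022_jp' is the 7-bit JIS encoding).

    # The same Japanese string under the three encodings in common use in
    # Japan; each produces different, mutually incompatible byte sequences.
    text = "日本語"   # "the Japanese language"

    for codec in ("iso2022_jp", "shift_jis", "euc_jp"):
        print(codec, text.encode(codec))

    # A program that decodes everything to one unified internal code
    # (Unicode) only has to deal with these encodings at its borders.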
However, if a universal unified character encoding scheme were in general use, those people would simply use it, and the problems would vanish for the most part (specific input methods would still be necessary, but the rest would be the same everywhere). It is easy to see that this would solve a LOT of problems. To maintain that an attempt to create a unified character encoding scheme like Unicode is an "imperialist plot" appears unfair to the implementers, and unconstructive as well; it doesn't get us anywhere. There will no doubt be incompatibilities with older standards, but that's life: unified standards have always proven to be a big advantage in the long run.

BTW, the Linux 2.0.x kernels all have Unicode built in. So, is the Linux community planning an imperialist plot against Asians (an interesting question, as some of its members are in fact Asians...)?

Karl-Max Wagner
karlmax@example.com

---------------------------------------------------------------
Next TLUG Nomikai: 14 January 1998, 19:15, Tokyo Station
Yaesu Chuo ticket gate. Or go directly to Tengu TokyoEkiMae,
19:30, Chuo-ku, Kyobashi 1-1-6, EchiZenYa Bld. B1/B2, 03-3275-3691.
Next Saturday Meeting: 14 February 1998, 12:30, Tokyo Station
Yaesu Chuo ticket gate.
---------------------------------------------------------------
a word from the sponsor:
TWICS - Japan's First Public-Access Internet System
www.twics.com  info@example.com  Tel: 03-3351-5977  Fax: 03-3353-6096
- Follow-Ups:
  - Re: tlug: Mule-begotten problems for Emacs and Gnus
    - From: "Stephen J. Turnbull" <turnbull@example.com>
- References:
  - tlug: Mule-begotten problems for Emacs and Gnus
    - From: "Stephen J. Turnbull" <turnbull@example.com>