Re: tlug: Mule-begotten problems for Emacs and Gnus




At present I'm somewhat overwhelmed by the amount of information and
ideas that my two public requests (one here on tlug and one on ding)
and one private request (to Erik Naggum) have generated. I'm going
over this material now. 

But one small point on the Unicode-Mule issue: I vaguely remember
that, perhaps as much as a year before the early development versions
of the merged Mule-Emacs started to appear in the .notready directory
of etlport.etl.go.jp:/pub/mule, someone, perhaps Handa-san
himself(???), said that the reason Mule was not going to be merged
with Emacs proper was that RMS insisted that Mule first be rewritten
using Unicode. Does anyone else remember this?

Another small point: As Stephen has indicated, it turns out that the
64K-worth of code points in Unicode is not enough. They wrestle with
this problem on the Unicode list occasionally. The biggest consumer is
the Unihan repertoire, which uses up some 20,000 code points and still
omits most kanji (strictly in terms of numbers, not current
usage). There are about 50,000 kanji in Morohashi's big dictionary and
some 40,000 in the old Kangxi dictionary (1712 AD), for example. There
would have been a solution for this, and a good one, but it would have
required a lot of clear-headed work (without political interference):
encode the *hemigrams* of kanji in the Unicode standard rather than
give a code point to each whole kanji, a choice that condemns Unicode
to NEVER being able to encode all kanji. I say "never" because even if
every single known kanji were encoded in a future, extended 32-bit
"Unicode" (estimates run at more than 75,000 kanji!), there would
still remain the inability to represent new kanji. Even though new
kanji would be but a tiny, tiny fraction of the total, they could not
be directly represented in Unicode, whereas new words, nonsense words,
whatever, in languages written in the Latin script, for example, face
no such limitation.
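
As a rough sanity check on that arithmetic, here is a minimal sketch
(in Python), using only the estimates quoted above; the counts are
illustrative, not authoritative:

    # Capacity check for a 16-bit character set, using the figures
    # cited in this post (estimates, not normative counts).

    BMP_CODE_POINTS = 2 ** 16    # 65,536 cells in 16-bit Unicode
    ALL_KNOWN_KANJI = 75000      # high-end estimate cited above

    # Even devoting the entire 16-bit space to kanji alone falls short:
    print(ALL_KNOWN_KANJI - BMP_CODE_POINTS)    # -> 9464 kanji homeless

    # In practice the space must also hold every other script, so the
    # real shortfall is far larger.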

Representing the hemigrams, instead of whole kanji, would require a
relatively modest number of code points (< 3,000), and with this
approach all 70,000-odd kanji could be represented, as well as any new
kanji that are invented. (See the contest for Japanese school kids to
invent new kanji that shows up in Japanese newspapers about once a
year, as an oddball example of new kanji.) When I say this requires
hard work, I know from experience, since I have been working on a
system for representing all kanji in terms of their hemigrams for the
past couple of years. The methodology was first demonstrated to me at
UC Berkeley by Professor Peter A. Boodberg nearly thirty years ago.
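
To make the compositional idea concrete, here is a minimal sketch
(again in Python). The component names, code point values, and layout
operators below are hypothetical illustrations of the general
approach, not my actual system:

    # Hypothetical registries: a full scheme would assign fixed code
    # points to roughly 3,000 components ("hemigrams") and a handful
    # of layout operators.
    components = {"water": 0x001, "tree": 0x002, "sun": 0x003}  # tiny sample
    operators  = {"left-right": 0xF00, "top-bottom": 0xF01}

    def encode(operator, *parts):
        """Spell one character as [operator, component, component, ...]."""
        return [operators[operator]] + [components[p] for p in parts]

    # E.g. a character built from a "water" part beside a "tree" part:
    print(encode("left-right", "water", "tree"))   # -> [3840, 1, 2]

    # The combinatorics is the point: two-part compositions alone
    # dwarf any whole-character inventory.
    n_components, n_operators = 3000, 10
    print(n_operators * n_components ** 2)         # -> 90,000,000 pairs

A new kanji then costs nothing but a new sequence over the existing
inventory, which is exactly the property that whole-character encoding
lacks.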

Jon Babcock
jon@example.com


