Mailing List Archive

Re: tlug: A couple of questions about Unicode



On Sat, 10 Jan 1998, Ken Schwarz wrote:

> implementation.  UCS-2 means all data is managed in 2-octet or 16-bit
> words (vs. the 4-octet or 32-bit words of UCS-4).  However, Level-3
> means that characters may be combined without restriction, so it is
> wrong to assume that all characters are expressed in 16-bits.  The
> "ch" and "ll" characters in Spanish, for example, are considered
> single characters of 4-octets.  Unicode is not simply a wide-char
> version of 8-bit char data; it is a multibyte encoding.  In this way,
> going with Unicode to avoid the complexities of multibyte handling is
> misguided.

Wow!  Everything I've read so far has said that Unicode is fixed-width.
Where have you read that those Spanish chars are 32 bits?  How could,
e.g., "ch" be distinguished from "c" "h"?  What does it mean to be a
single char?  (That it should be displayed with a single glyph?
That two separate glyphs should not be split across lines?  Or
is it a char in the sense that "qu" could be a char in English,
since "q" is always followed by "u"?)
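
One way to look at the "ch" versus "c" "h" question is to list the code
points in a string.  The sketch below is only an illustration, not taken
from the thread: the helper name code_points is hypothetical, and it relies
on Python's standard unicodedata module.  As far as I know, Unicode assigns
no single code point to the Spanish "ch", so it is simply two code points,
whereas an accented letter such as "é" can be written either as one
precomposed code point or as a base letter plus a combining mark.

```python
import unicodedata

def code_points(text):
    """Return the code points of text as 'U+XXXX' strings (hypothetical helper)."""
    return [f"U+{ord(ch):04X}" for ch in text]

digraph = "ch"            # Spanish "ch": two ordinary letters
precomposed = "\u00e9"    # "é" as a single code point
decomposed = "e\u0301"    # "e" followed by COMBINING ACUTE ACCENT

print(code_points(digraph))      # ['U+0063', 'U+0068']
print(code_points(precomposed))  # ['U+00E9']
print(code_points(decomposed))   # ['U+0065', 'U+0301']

# NFC normalization composes the base letter and the combining mark
# into the single precomposed code point.
print(unicodedata.normalize("NFC", decomposed) == precomposed)  # True
```

The combining sequence is the case where one displayed character spans more
than one 16-bit unit, which may be what the Level-3 remark refers to.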

I cannot accept that Unicode is multibyte, rather than fixed-width.
I know that there are multibyte encodings, e.g., UTF-8, but a major
feature of Unicode is that it's fixed-width.  Can you quote a reference?
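
The fixed-width versus multibyte distinction can be checked by encoding a
few characters and counting bytes.  This is a rough sketch under stated
assumptions: the helper encoded_width and the sample characters are my own
choices, and Python's "utf-16-be" codec stands in for UCS-2, which matches
it only for characters in the 16-bit range.

```python
def encoded_width(ch, encoding):
    """Return the number of bytes ch occupies in the given encoding (hypothetical helper)."""
    return len(ch.encode(encoding))

# Sample characters chosen for illustration only.
samples = {
    "c": "LATIN SMALL LETTER C",
    "\u00f1": "LATIN SMALL LETTER N WITH TILDE",
    "\u65e5": "CJK ideograph U+65E5",
}

for ch, name in samples.items():
    utf8 = encoded_width(ch, "utf-8")       # varies: 1, 2, and 3 bytes here
    utf16 = encoded_width(ch, "utf-16-be")  # 2 bytes for each of these
    utf32 = encoded_width(ch, "utf-32-be")  # always 4 bytes
    print(f"{name}: UTF-8={utf8} UTF-16={utf16} UTF-32={utf32}")

# A code point above U+FFFF does not fit in one 16-bit unit;
# UTF-16 encodes it as a surrogate pair (4 bytes).
print(encoded_width("\U0001D11E", "utf-16-be"))  # 4
```

For these samples the 16-bit form is constant-width per code point while
UTF-8 is not; a combining sequence would still take more than one unit.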

--
J. David Beutel       "You're inhabited by the society you live in through
11011011 jdb@example.com  your use of language." McCorduck on Turkle on Lacan

---------------------------------------------------------------
Next TLUG Nomikai: 14 January 1998 19:15  Tokyo station
Yaesu Chuo ticket gate.  Or go directly to Tengu TokyoEkiMae 19:30
Chuo-ku, Kyobashi 1-1-6, EchiZenYa Bld. B1/B2 03-3275-3691
Next Saturday Meeting: 14 February 1998 12:30 Tokyo Station
Yaesu Chuo ticket gate.
---------------------------------------------------------------


