Mailing List Archive

Re: SJIS & HTML - potential trouble?



>>>  HTML text has to be in ISO-Latin 1 character set, which
>>> has reserved space for the first byte of SJIS double-byte characters.
>
>Not really. ISO-Latin 1 (aka ISO 8859-1) is an extension on the 128
>characters of ISO 646 (aka ASCII). There is no "reserved space" for SJIS's
>first bytes - you need to tell your browser it is one or the other,
>because they overlap.
>
128..159 hold no printable characters in Latin-1 (only control codes), and
129..159 is the first-byte range for most kanji.
I forgot about the half-width katakana (161-223), and there is another set
of first-byte ranges at 224-239, plus 240-252 (user-defined).
This is such fun, isn't it :-).
At least this only messes up the display a bit (umlauts, etc. in European
names and the copyright symbol come out as katakana or stray kanji), but it
shouldn't mess up anything serious.
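
For anyone who wants to check a particular byte, here is a rough Python
sketch of the ranges above. The function name sjis_byte_role and its labels
are made up for illustration, and the ranges are the commonly documented
Shift JIS ones rather than anything taken from a particular browser.

```python
def sjis_byte_role(b: int) -> str:
    """Classify one byte value (0-255) by the role it plays in Shift JIS.

    Hypothetical helper for illustration; ranges are the usual Shift JIS
    layout discussed in this thread.
    """
    if not 0 <= b <= 0xFF:
        raise ValueError("expected a byte value in 0..255")
    if b <= 0x7F:
        return "single byte (ASCII range)"
    if 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xEF:
        return "first byte of a double-byte character"
    if 0xA1 <= b <= 0xDF:
        return "half-width katakana"
    if 0xF0 <= b <= 0xFC:
        return "first byte (user-defined area)"
    return "unassigned"  # 0x80, 0xA0, 0xFD-0xFF


if __name__ == "__main__":
    # A few Latin-1 characters that show up in European names and notices.
    for ch in "©Äü":
        b = ch.encode("latin-1")[0]
        print(f"{ch!r} is byte {b}: {sjis_byte_role(b)}")

    # What a Shift JIS reader makes of the Latin-1 copyright byte.
    print(b"\xa9".decode("shift_jis"))
```

The copyright symbol is byte 169 in Latin-1, which lands in the half-width
katakana range, so it is shown as a single katakana character. A byte such
as 252 (u-umlaut) is instead taken as the first byte of a double-byte
character and swallows whatever byte follows it.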

