
Mailing List Archive
[tlug] Why is Shift_JIS bad?
- Date: Sat, 31 Aug 2002 10:20:24 +1000 (EST)
- From: Jim Breen <jwb@example.com>
- Subject: [tlug] Why is Shift_JIS bad?
I agree with all the comments made so far about Shift_JIS. Some, e.g.
sign-extension in char, apply equally well to EUC and UTF-8. (FWIW I always
handle EUC/Shift_JIS/etc. as "unsigned char").
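
To illustrate the sign-extension point: where plain char is signed, a byte such as 0xA4 turns negative when promoted to int, so range tests and table lookups go wrong. The following is only a minimal sketch; the function names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Return the byte value 0..255 regardless of whether plain char is signed.
   Without the cast, (int)c can be negative for bytes >= 0x80. */
int byte_value(char c)
{
    return (unsigned char)c;
}

/* Count the bytes in s that have the high bit set, walking the string
   through an unsigned char pointer so the comparison is always safe. */
size_t count_high_bytes(const char *s)
{
    const unsigned char *p;
    size_t n = 0;

    assert(s != NULL);
    for (p = (const unsigned char *)s; *p != '\0'; p++) {
        if (*p >= 0x80)
            n++;
    }
    return n;
}
```

With the EUC bytes 0xA4 0xA2 followed by an ASCII "A", count_high_bytes() reports 2, and byte_value() on the first byte gives 0xA4 rather than a negative number.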
My two gripes about Shift_JIS are:
(a) wasted code space. By making room for the JIS X 0201 hankaku kana, a large
proportion of the 2**14 code space is effectively wasted. That's why usage of
JIS X 0212 never got anywhere, and why JIS X 0213 has been designed to squeeze
in. (Having said that, this is a fairly subtle point as it only concerns a few
people who want arcane kanji. Also Unicode is blowing this problem away.)
(b) it is trickier to handle internally. With EUC you can do all sorts of
single-character activities (index, rindex, strchr, etc.) with total safety.
Also you can scan backwards through a string and reliably detect whether
you are inside a Japanese character. With Shift_JIS both are much trickier.
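
A concrete case of point (b): in Shift_JIS the second byte of a two-byte character can fall in the ASCII range, e.g. the kanji 表 is 0x95 0x5C, and 0x5C is the backslash. A plain strchr() therefore finds a "backslash" inside the kanji, whereas in EUC every byte of a Japanese character has the high bit set, so the same search is safe. The sketch below shows one way to search Shift_JIS correctly by always scanning forward from the start; sjis_is_lead() and sjis_strchr() are hypothetical helpers written for this example, not library functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Lead byte of a two-byte Shift_JIS character: 0x81-0x9F or 0xE0-0xFC. */
static int sjis_is_lead(unsigned char b)
{
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

/* Like strchr() for an ASCII character c, but skips over the trail bytes
   of two-byte characters so they are never mistaken for ASCII. */
const char *sjis_strchr(const char *s, char c)
{
    const unsigned char *p = (const unsigned char *)s;

    assert(s != NULL);
    while (*p != '\0') {
        if (sjis_is_lead(*p)) {
            if (p[1] == '\0')
                break;              /* truncated two-byte character */
            p += 2;                 /* skip lead and trail byte together */
        } else {
            if (*p == (unsigned char)c)
                return (const char *)p;
            p++;
        }
    }
    return NULL;
}
```

For the Shift_JIS bytes 95 5C 61 5C 62 (表a\b), strchr() with a backslash returns a pointer to offset 1, inside the kanji, while sjis_strchr() returns offset 3. On the EUC form C9 BD 61 5C 62, plain strchr() already returns offset 3.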
The ONLY advantage of Shift_JIS is that hankaku kana is carried as a single
byte rather than two bytes as in EUC. Conversion between EUC and Shift_JIS at
the string-level is close to trivial.
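
As a rough illustration of how small the string-level conversion is, here is a sketch of the EUC to Shift_JIS direction using the usual row/column arithmetic for JIS X 0208. The function name euc_to_sjis() and its return convention are assumptions made for this example. It handles ASCII, hankaku kana (0x8E prefix) and JIS X 0208 only; JIS X 0212 (0x8F prefix) is rejected because Shift_JIS has no place for it.

```c
#include <assert.h>
#include <stddef.h>

/* Convert a NUL-terminated EUC string to Shift_JIS.
   Returns the number of bytes written (excluding the NUL), or -1 if the
   input is malformed, not representable, or the buffer is too small. */
int euc_to_sjis(const char *euc, char *out, size_t outsize)
{
    const unsigned char *p = (const unsigned char *)euc;
    unsigned char *o = (unsigned char *)out;
    size_t n = 0;

    assert(euc != NULL && out != NULL);
    if (outsize == 0)
        return -1;
    while (*p != '\0') {
        if (n + 3 > outsize)            /* room for 2 bytes plus the NUL */
            return -1;
        if (*p < 0x80) {                /* ASCII passes through */
            o[n++] = *p++;
        } else if (*p == 0x8E) {        /* hankaku kana: drop the prefix */
            if (p[1] < 0xA1 || p[1] > 0xDF)
                return -1;
            o[n++] = p[1];
            p += 2;
        } else if (p[0] >= 0xA1 && p[0] <= 0xFE &&
                   p[1] >= 0xA1 && p[1] <= 0xFE) {
            unsigned j1 = p[0] - 0x80;  /* JIS row byte, 0x21-0x7E */
            unsigned j2 = p[1] - 0x80;  /* JIS column byte, 0x21-0x7E */
            unsigned s1 = ((j1 - 0x21) >> 1) + 0x81;
            unsigned s2;

            if (s1 > 0x9F)
                s1 += 0x40;             /* jump over the kana range */
            if (j1 & 1) {
                s2 = j2 + 0x1F;
                if (s2 >= 0x7F)
                    s2++;               /* 0x7F is never a trail byte */
            } else {
                s2 = j2 + 0x7E;
            }
            o[n++] = (unsigned char)s1;
            o[n++] = (unsigned char)s2;
            p += 2;
        } else {
            return -1;                  /* 0x8F (JIS X 0212) or malformed */
        }
    }
    o[n] = '\0';
    return (int)n;
}
```

For example, the EUC bytes C9 BD (表) come out as 95 5C, and the two-byte hankaku kana 8E B1 shrinks to the single byte B1. The output is never longer than the input, so a buffer of strlen(euc) + 2 bytes is always enough for this sketch.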
Jim
--
Jim Breen (j.breen@example.com http://www.csse.monash.edu.au/~jwb/)
Computer Science & Software Engineering, Tel: +61 3 9905 3298
P.O Box 26, Monash University, Fax: +61 3 9905 5146
Clayton VIC 3800, Australia ジム・ブリーン@モナシュ大学