Mailing List Archive
Re: [tlug] OT-Japanese in PHP
- Date: Sat, 21 May 2005 19:00:07 +0900
- From: "Stephen J. Turnbull" <stephen@example.com>
- Subject: Re: [tlug] OT-Japanese in PHP
- References: <1644176805052007365b9e63fe@example.com><8764xdxkk1.fsf@example.com><16441768050520233017005052@example.com><20050521.160204.760347749.dave@?om>
- Organization: The XEmacs Project
- User-agent: Gnus/5.1006 (Gnus v5.10.6) XEmacs/21.5 (cilantro, linux)
>>>>> "David" == David E <dave@?om> writes:

>> The generally accepted idea is that since Shift_JIS was created
>> by Japanese people for Japanese people, then it handles the
>> Japanese language better than UTF-8, which is not true (^_^)

No, and in fact up until the most recent revision of the JIS standard,
there were a few people in Hokkaido who could only type their
addresses in UTF-8.

David> I've heard the "UTF-8" messes up some characters objection
David> from Japanese developers several times, though I haven't been
David> able to get an actual example of it.  Urban myth, perhaps?

Yes and no.  Of course technically for all national standard
characters Unicode must round-trip; in that sense it is a myth.
However, Unicode cannot include characters that are not standardized
by certain recognized bodies (don't ask me; all I know is that the
above-mentioned northerners "lucked out" because the characters they
need are originally Han, not Yamato characters, while Ukrainian
Cyrillic users were not so lucky -- until they got a country, they
couldn't properly view their traditional literature in Unicode).  A
lot of such characters are present in many Shift JIS encoded fonts,
especially on platforms like Fujitsu, NEC, and IBM.  Of course,
they're in JIS private space, so why you couldn't do the same with
UTF-8, I don't know.

The other problem is that UTF-8 doesn't give a clue about which
language is being used, while lots of naive users and even some
programmers don't realize that charset and language are distinct
concepts.  This means that font selection does require some smarts
(but not all that much; kana are unique to Japanese and ubiquitous in
that language, ditto Hangul for Korean, and the simplified and
traditional forms of Chinese hanzi have been deemed different, so
Taiwanese and Mandarin can be distinguished quite reliably, too).  So,
for example, if you bring up XEmacs 21.5 in a POSIX environment, it
prefers Chinese fonts (even for kana!).
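The "smarts" needed for font selection can be sketched in a few lines.
This is only an illustration of the kana/Hangul heuristic described
above, not what XEmacs or any browser actually does; the function name
and the returned labels are made up for the example, and it does not
attempt the simplified-versus-traditional Chinese distinction.

```python
def guess_cjk_language(text: str) -> str:
    """Guess a language label from the scripts present in `text`.

    Hypothetical helper: the first kana or Hangul character found
    decides the answer; Han characters alone are not decisive.
    """
    saw_han = False
    for ch in text:
        cp = ord(ch)
        # Hiragana letters (U+3041..U+3096) and katakana letters
        # (U+30A1..U+30FA) are used only for Japanese.
        if 0x3041 <= cp <= 0x3096 or 0x30A1 <= cp <= 0x30FA:
            return "ja"
        # Precomposed Hangul syllables (U+AC00..U+D7A3) mean Korean.
        if 0xAC00 <= cp <= 0xD7A3:
            return "ko"
        # CJK Unified Ideographs are shared by Chinese, Japanese and
        # Korean, so just remember that we saw one.
        if 0x4E00 <= cp <= 0x9FFF:
            saw_han = True
    return "han-only" if saw_han else "unknown"
```

A string such as "漢字とかな" comes back as "ja" because of the kana,
while a kanji-only string such as "東京都" stays ambiguous, which is
exactly the case where a font chooser has to fall back on a default.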
My personal feeling is that bloody-minded nationalism is responsible
for much of this; everybody has ASCII envy (the Chinese went so far as
to create a Unicode derivative in which GB2312 plays the same role as
ISO 8859/1 does for standard Unicode, i.e., a subset with the same code
points as in the non-Unicode standard).  And some influential Japanese
have this crazy idea that some 0s and 1s have more "Yamato damashii"
than other 0s and 1s.

Of course, looking at my own university's home pages, maybe it's not
nationalism.  Maybe it's just plain economic protectionism.  Any
third-rate San Francisco designer in combination with a gaggle of
programmers from Bangalore could do a much more attractive job at half
the price, but their systems would choke on Shift JIS.

David> Anyway, the reason I suggested setting the output encoding
David> in php.ini to SJIS, for a beginner is that it's likely to
David> be the easiest for him to get started.

This is reasonable, but it should be accompanied with a FIXME
comment. :-)  And once they've gotten a little past that point, they
should be negotiating language, charset, and the like.

Admittedly, I don't do any of this on my own home pages.  I do use
META elements and ISO-2022-JP rather than Shift JIS.  And I don't
pretend to be a professional....

David> Then there's also the fact that if you're working with a
David> web designer, trying to get them to do their HTML in
David> anything but Shift_JIS is almost always waaay more trouble
David> than making your scripts deal with SJIS output.

Heh.  iconv is your friend.

"Promise her anything, but give her UTF-8."

-- 
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
Ask not how you can "do" free software business;
ask what your business can "do for" free software.
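As a rough illustration of the "iconv is your friend" advice, the same
transcoding step can be written with Python's built-in codecs.  This is
a minimal sketch, not the poster's setup: the function names are
hypothetical, and it assumes the designer's files are plain Shift_JIS
(pages using vendor extensions may need Python's "cp932" codec instead).

```python
def sjis_to_utf8(data: bytes, errors: str = "strict") -> bytes:
    """Transcode Shift_JIS bytes (e.g. a designer's HTML) to UTF-8."""
    return data.decode("shift_jis", errors=errors).encode("utf-8")


def utf8_to_sjis(data: bytes, errors: str = "strict") -> bytes:
    """Transcode UTF-8 bytes to Shift_JIS for legacy output.

    With errors="strict" this raises UnicodeEncodeError for any
    character that Shift_JIS cannot represent; errors="replace"
    substitutes "?" instead.
    """
    return data.decode("utf-8").encode("shift_jis", errors=errors)
```

Keeping the scripts in UTF-8 internally and converting only at the
edges means the strict mode reports exactly which characters would be
lost in the Shift_JIS direction, rather than silently mangling them.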