
[tlug] Re: font/char set question
- Date: Sun, 29 Jul 2007 13:51:27 +1000
- From: "Jim Breen" <jimbreen@example.com>
- Subject: [tlug] Re: font/char set question
steven smith <sjs@example.com> wrote:
> I just had something interesting happen. The string below
> came across in another list:
> 现代汉语词典
> It's the name of a Chinese Kanji dictionary.
>
> I searched for it on amazon.jp. The search dialog looked ok
> when I pasted but the return was:
> "Your search " 代 典" did not match any products."
> and the dialog looked like : 代 典
> I don't know what this will look like on other browsers.
>
> What am I hitting here? Is the font used in the
> Amazon.co.jp non utf-8 (iso2022-jp maybe)?
That's more or less the story. Amazon.co.jp's pages, and presumably
their search system, are all in Shift_JIS, i.e. they are limited to the
JIS X 0208 character set. Most of the hanzi above are not in that set.
(BTW, the font is not the issue here. It's all to do with character
sets.)
> Do they use a
> utf-8 that doesn't support the (I'm assuming) chinese only
> kanji in this string?
They (Amazon) don't use UTF-8 at all, AFAICT.
> Is there a translation between
> character sets going on somewhere that's dropping the
> chinese characters?
If you try to paste a UTF-8 string into a WWW form set for
Shift_JIS, a conversion will be done (what software actually
does it depends on OS, etc.). Any characters that have no
Shift_JIS equivalent get some substitution, e.g. blanks.
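
For anyone who wants to see the effect for themselves, here is a rough
Python sketch of that kind of lossy conversion. It is not what any
browser or Amazon actually runs; the function name
simulate_form_conversion and the choice of a blank as the substitute
are my own assumptions for illustration. It relies only on Python's
standard "shift_jis" codec, which covers JIS X 0208.

```python
def simulate_form_conversion(text, target="shift_jis", substitute=" "):
    """Mimic a lossy conversion of Unicode text to a legacy encoding.

    Hypothetical helper for illustration only: real browsers and
    operating systems choose their own substitution (blanks, '?',
    numeric character references, and so on).
    """
    kept = []
    dropped = []
    for ch in text:
        try:
            # Succeeds only if the character exists in the target set.
            ch.encode(target)
            kept.append(ch)
        except UnicodeEncodeError:
            # No equivalent in the target encoding: substitute it.
            kept.append(substitute)
            dropped.append(ch)
    return "".join(kept), dropped


result, dropped = simulate_form_conversion("现代汉语词典")
print(repr(result))   # what survives the conversion
print(dropped)        # the characters that had no Shift_JIS equivalent
```

Only the two characters that also exist in JIS X 0208 come through,
which matches what the search dialog showed.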
Jim
--
Jim Breen
Honorary Senior Research Fellow
Clayton School of Information Technology,
Monash University, VIC 3800, Australia
http://www.csse.monash.edu.au/~jwb/