
Mailing List Archive
Re: [tlug] font/char set question
steven smith <sjs@example.com> wrote:
> I just had something interesting happen. The string below
> came across in another list:
> 现代汉语词典
> It's the name of a Chinese Kanji dictionary.
>
> I searched for it on amazon.jp. The search dialog looked ok
> when I pasted ...
When I pasted it in the field using Konqueror,
only the first and last characters showed as kanji.
The others appeared as mid-dots.
When I pasted it in the field using Firefox, it appeared OK.
> but the return was:
> "Your search "代典" did not match any products."
> and the dialog looked like : 代典
> I don't know what this will look like on other browsers.
Firefox responded as you show.
Konqueror responded with '"代典"の検索に一致する商品はありませんでした。'
(roughly: no products matched the search for "代典"),
where the missing characters appeared as hollow rectangles.
> What am I hitting here? Is the font used in the
> Amazon.co.jp non utf-8 (iso2022-jp maybe)?
Read the headers.
wget -S -O /dev/null http://amazon.jp/ 2>&1 | grep Content-Type
It seems they are using Shift_JIS. Perhaps they expect form input in Shift_JIS as well.
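For anyone who would rather script that header check, here is a minimal Python sketch. The header value is an assumed example of what the wget line might print, not a captured response, and charset_from_content_type is a name made up for illustration.

```python
from email.message import Message


def charset_from_content_type(header_value):
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() parses the header parameters and lowercases the result.
    return msg.get_content_charset()


# Assumed example value; the real one should be read from the server's response.
print(charset_from_content_type("text/html; charset=Shift_JIS"))
```

If the function returns None, the server did not declare a charset in the header and the page's own meta tags would be the next place to look.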
I did:
[jep@example.com ~]$ echo $LANG
en_US.UTF-8
[jep@example.com ~]$ echo '现代汉语词典' | iconv -f utf-8 -t shift_jis
iconv: illegal input sequence at position 0
[jep@example.com ~]$ echo '代汉语词典' | iconv -f utf-8 -t shift_jis
��iconv: illegal input sequence at position 3
[jep@example.com ~]$ echo '代汉词典' | iconv -f utf-8 -t shift_jis
��iconv: illegal input sequence at position 3
[jep@example.com ~]$ echo '代语词典' | iconv -f utf-8 -t shift_jis
��iconv: illegal input sequence at position 3
[jep@example.com ~]$ echo '代词典' | iconv -f utf-8 -t shift_jis
��iconv: illegal input sequence at position 3
[jep@example.com ~]$ echo '代典' | iconv -f utf-8 -t shift_jis
���T
[jep@example.com ~]$
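The same elimination can be done one character at a time. This is only a sketch, assuming Python's built-in shift_jis codec covers about the same characters as whatever the site accepts; split_by_encodability is a hypothetical helper, not part of any library.

```python
def split_by_encodability(text, encoding="shift_jis"):
    """Split text into characters that can and cannot be encoded in `encoding`."""
    encodable, rejected = [], []
    for ch in text:
        try:
            ch.encode(encoding)
            encodable.append(ch)
        except UnicodeEncodeError:
            rejected.append(ch)
    return "".join(encodable), "".join(rejected)


kept, dropped = split_by_encodability("现代汉语词典")
print(kept)     # 代典
print(dropped)  # 现汉语词

# Silently dropping the unencodable characters leaves only the bytes for 代典.
print("现代汉语词典".encode("shift_jis", errors="ignore"))
```

The two characters that survive are the ones the site echoed back, which would fit a browser dropping whatever it could not encode for a Shift_JIS form. That is a guess about the browser, not something verified here.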
Hmmm. Is the original string really UTF-8? So I tried treating it as Shift_JIS:
[jep@example.com ~]$ echo '现代汉语词典' | iconv -f shift_jis -t shift_jis
现代汉语词�iconv: illegal input sequence at position 16
[jep@example.com ~]$ echo -n '现代汉语词典' | wc
0 1 18
[jep@example.com ~]$ echo '现代汉语典' | iconv -f shift_jis -t shift_jis
现代汉语典
[jep@example.com ~]$ echo '词' | iconv -f shift_jis -t shift_jis
��iconv: illegal input sequence at position 2
[jep@example.com ~]$
So the bytes of the initial string mostly pass as shift_jis, although the
18-byte count from wc would also fit six 3-byte UTF-8 characters.
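A hedged cross-check on the encoding question, sketched in Python. It assumes the pasted text reached the shell unchanged; decodes_as is a hypothetical helper. A same-to-same conversion only tests whether the bytes happen to form valid sequences in that encoding, so it helps to compare both candidates side by side.

```python
def decodes_as(data, encoding):
    """Return True if `data` is a valid byte sequence in `encoding`."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


text = "现代汉语词典"
raw = text.encode("utf-8")

print(len(raw))                                  # 18, the same count wc reported
print([len(ch.encode("utf-8")) for ch in text])  # [3, 3, 3, 3, 3, 3]
print(decodes_as(raw, "utf-8"))                  # True
print(decodes_as(raw, "shift_jis"))
```

Six characters at three bytes each account for all 18 bytes, so the byte count is at least consistent with the string being UTF-8 all along.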
It's time for me to punt to others.
> I just had something interesting happen. The string below
> came across in another list:
> 现代汉语词典
Steven, can you forward that email as an attachment?
Admins, is such a crossposting OK?