Mailing List Archive



Re: [tlug] font/char set question



steven smith <sjs@example.com> wrote:

> I just had something interesting happen.  The string below
> came across in another list:
> 现代汉语词典
> It's the name of a Chinese Kanji dictionary.
> 
> I searched for it on amazon.jp.  The search dialog looked ok
> when I pasted ... 

When I pasted it in the field using Konqueror, 
only the first and last characters showed as kanji. 
The others appeared as mid-dots. 

When I pasted it in the field using Firefox, it appeared OK. 

> but the return was:
> "Your search "代典" did not match any products."
> and the dialog looked like : 代典
> I don't know what this will look like on other browsers.

Firefox responded as you show. 

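Konqueror responded with '"代典"の検索に一致する商品はありませんでした。'
(the Japanese version of the same "did not match any products" message),
where the characters it could not render appeared as hollow rectangles.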

> What am I hitting here?  Is the font used in the
> Amazon.co.jp non utf-8 (iso2022-jp maybe)? 

Read the headers. 

   wget -S -O /dev/null http://amazon.jp/ 2>&1 | grep Content-Type

It seems they are serving Shift_JIS. Perhaps they expect form input in that encoding as well.
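
The charset can also be pulled out of a Content-Type value programmatically. The following is a minimal Python sketch, assuming a header value of the form `text/html; charset=Shift_JIS`; that value is a hypothetical example, not something captured from the actual server, and `charset_of` is a name made up for illustration.

```python
from email.message import Message


def charset_of(content_type):
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


# Hypothetical header value, for illustration only.
print(charset_of("text/html; charset=Shift_JIS"))
```

The standard library's header parser is used here rather than a hand-written split on `;`, so quoting and letter case are handled for free.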

I did: 

   [jep@example.com ~]$ echo $LANG
   en_US.UTF-8
   [jep@example.com ~]$ echo '现代汉语词典' | iconv -f utf-8 -t shift_jis
   iconv: illegal input sequence at position 0
   [jep@example.com ~]$ echo '代汉语词典' | iconv -f utf-8 -t shift_jis
   ��iconv: illegal input sequence at position 3
   [jep@example.com ~]$ echo '代汉词典' | iconv -f utf-8 -t shift_jis
   ��iconv: illegal input sequence at position 3
   [jep@example.com ~]$ echo '代语词典' | iconv -f utf-8 -t shift_jis
   ��iconv: illegal input sequence at position 3
   [jep@example.com ~]$ echo '代词典' | iconv -f utf-8 -t shift_jis
   ��iconv: illegal input sequence at position 3
   [jep@example.com ~]$ echo '代典' | iconv -f utf-8 -t shift_jis
   ���T
   [jep@example.com ~]$ 
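
The trial-and-error above can be done in one pass by testing each character separately. This is only a sketch, assuming Python's built-in `shift_jis` codec is a fair stand-in for the iconv table (the two may differ in edge cases); `shift_jis_encodable` is a hypothetical helper name.

```python
def shift_jis_encodable(text):
    """Pair each character with True if Python's shift_jis codec can encode it."""
    result = []
    for ch in text:
        try:
            ch.encode("shift_jis")
            result.append((ch, True))
        except UnicodeEncodeError:
            result.append((ch, False))
    return result


# Print code points rather than glyphs so the output is safe on any terminal.
for ch, ok in shift_jis_encodable("现代汉语词典"):
    print(f"U+{ord(ch):04X}", "encodable" if ok else "not in Shift_JIS")
```

Only the second and last characters should come back as encodable, which matches the two characters that survived in the search result.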

Hmmm. Is the original string really UTF-8? So I tried: 

   [jep@example.com ~]$ echo '现代汉语词典' | iconv -f shift_jis -t shift_jis
   现代汉语词�iconv: illegal input sequence at position 16
   [jep@example.com ~]$ echo -n '现代汉语词典' | wc
         0       1      18
   [jep@example.com ~]$ echo '现代汉语典' | iconv -f shift_jis -t shift_jis
   现代汉语典
   [jep@example.com ~]$ echo '词' | iconv -f shift_jis -t shift_jis
   ��iconv: illegal input sequence at position 2
   [jep@example.com ~]$ 

So the initial string seems to be closer to shift_jis than utf-8. 
It's time for me to punt to others. 
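
For whoever picks this up: the wc output above reports 18 bytes for six characters, i.e. three bytes per character, which is what UTF-8 uses for these code points, whereas Shift_JIS uses two bytes per kanji. A strict decode is a more direct test than a shift_jis-to-shift_jis round trip. The sketch below assumes the string is pasted exactly as it appeared in the message; `looks_like_utf8` is a hypothetical helper name.

```python
def looks_like_utf8(raw):
    """Return True if the bytes decode cleanly as UTF-8 in strict mode."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


raw = "现代汉语词典".encode("utf-8")  # bytes as a UTF-8 terminal would send them
print(len(raw))                                      # 18
print(looks_like_utf8(raw))                          # True
print(looks_like_utf8("代典".encode("shift_jis")))   # False
```

Passing as Shift_JIS is weak evidence on its own, since many byte pairs found in UTF-8 text also happen to fall inside valid Shift_JIS ranges.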

> I just had something interesting happen.  The string below
> came across in another list:
> 现代汉语词典

Steven, can you forward that email as an attachment? 
Admins, is such a crossposting OK?


