tlug Mailing List Archive
Re: [tlug] Re: font/char set question
- Date: Mon, 30 Jul 2007 08:47:37 +1000
- From: "Jim Breen" <>
- Subject: Re: [tlug] Re: font/char set question
On 30/07/07, Josh Glover <> wrote:
> On 29/07/07, Jim Breen <> wrote:
> > They (Amazon) don't use UTF-8 at all, AFAICT.
>
> You cannot tell further than the presentation layer, ブリーン舌皛逅跂勉闕生。In fact,
> Amazon uses nothing but UTF-8 internally. Japanese pages get
> Shit_JIS'd by Gurupa[2], British and US pages get ASCII'd, European
> pages get Latin-1'd, and Chinese pages get... er, encoded. (Josh knows
> not of the Joyo Amazon stuff.)

Thanks for this insight into Amazon's internals.

> > If you try to paste a UTF-8 string into a WWW form set for
> > Shift_JIS, a conversion will be done (what software actually
> > does it depends on OS, etc.). If matches can't be made, some
> > substitution, e.g. blanks, will be done.
>
> I'm pretty sure that our search system honours the encoding you input.
> Tragically, the output will be in Shit_JIS, so you won't be able to
> read it.

Great pity.

> But I *know* that if you enter UTF-8, it handles it
> correctly, because I do that all the time. I have to experiment with
> characters outside of Shit_JIS, but I'm pretty sure I've input
> Bulgarian / Russian in UTF-8 on and gotten sane search
> results.

However, since the WWW form is in Shift_JIS, browsers are unlikely to
send field contents that are not in either ISO 646 (ASCII) or
JIS X 0208.

> [1] I also work for Amazon, and I *do* work on the website platform,
> though most of my work is mobile-centric
> [2]

Gurupa sounds interesting. I wish they'd change their Japanese output
over to UTF-8.

Jim

--
Jim Breen
Honorary Senior Research Fellow
Clayton School of Information Technology,
Monash University, VIC 3800, Australia
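
The conversion discussed above (text entered as Unicode, then squeezed into a form declared as Shift_JIS) can be sketched roughly as follows. This is only an illustration under assumptions: Python's built-in shift_jis codec stands in for whatever the browser or OS actually uses, the helper names fits and to_form_bytes are made up for this sketch, and the "?" substitution is Python's own replacement behaviour, not necessarily what any particular browser sends.

```python
# Rough sketch: what can survive a trip through a Shift_JIS form field?
# Assumption: Python's "shift_jis" codec approximates the browser-side
# conversion. The helper names below are hypothetical.

def fits(text, encoding="shift_jis"):
    """Return True if every character of text is representable in encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def to_form_bytes(text, encoding="shift_jis", errors="replace"):
    """Encode text for submission, substituting characters that have no match.

    errors="replace" substitutes "?"; errors="xmlcharrefreplace" substitutes
    a numeric character reference such as "&#233;".
    """
    return text.encode(encoding, errors=errors)


if __name__ == "__main__":
    for sample in ["Linux", "日本語", "Привет", "café"]:
        print(repr(sample), fits(sample), to_form_bytes(sample))
```

Cyrillic passes the fits check because JIS X 0208 has a Cyrillic row, which is consistent with Russian or Bulgarian search terms surviving a Shift_JIS form. The "é" in "café" has no JIS X 0208 code point, so it is substituted.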
- Follow-Ups:
  - Re: [tlug] Re: font/char set question
    - From: Josh Glover
- References:
  - [tlug] Re: font/char set question
    - From: Jim Breen
  - Re: [tlug] Re: font/char set question
    - From: Josh Glover