Mailing List Archive


Re: International support



On Thu, Feb 22, 2001 at 02:19:43PM -0800, Soma Interesting wrote:
> At 05:23 AM 2/23/2001 +0900, you wrote:
> > > require PHP/Postgres to be able to interpret the data. However, if I
> > > compile Postgres with locale support for the character set/language in
> > > question, then Postgres will be able to sort Japanese. Is this right?
> >
> >You won't get back a sort that is meaningful to most human readers
> >of the data, but that's not the fault of the software.  What
> >ordering are you trying to produce?
> 
> The database can't sort alphabetically even once locale support for that 
> character set or language is installed?

That's right.  If all you have in your data is a bunch of Kanji strings
(Chinese characters), there is no simple means of determining how they are
pronounced; you would need to do a linguistic parse of each string, and even
then you will end up with ambiguities. This is one reason why Japanese is
sometimes referred to as "the Devil's tongue": a given Kanji character
typically has two, and often more than two, possible pronunciations.  Chinese,
which offers one-to-one correspondences between characters and readings, is
easy by comparison, despite what people in the China trade will tell you about
the size of that language's character set.

If you want to sort by pronunciation, you need to store the phonetic
reading of each string, written in hiragana or katakana, alongside the Kanji
in your data.  Either of those character sets (a hundred some-odd characters
in both cases) should be sortable.
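
As a rough illustration of the point above, here is a minimal Python sketch
that sorts records on a stored kana reading rather than on the Kanji itself.
The sample names, the (kanji, reading) record layout, and the helper names
to_hiragana and sort_by_reading are all made up for this example. It assumes
that plain code-point order of hiragana is close enough to the usual
a-i-u-e-o order; voiced and small kana sort next to their plain forms, which
is not exactly how a dictionary collates them.

```python
# Hypothetical records: each one carries the Kanji string plus its reading,
# entered by hand in hiragana or katakana.
people = [
    ("山田", "やまだ"),
    ("青木", "あおき"),
    ("田中", "タナカ"),   # reading stored in katakana
    ("佐藤", "さとう"),
]


def to_hiragana(text):
    """Fold katakana into hiragana so mixed readings compare consistently.

    Katakana U+30A1..U+30F6 sit exactly 0x60 above the matching hiragana.
    """
    return "".join(
        chr(ord(ch) - 0x60) if "\u30a1" <= ch <= "\u30f6" else ch
        for ch in text
    )


def sort_by_reading(records):
    """Sort (kanji, reading) pairs by the normalized reading."""
    return sorted(records, key=lambda rec: to_hiragana(rec[1]))


by_reading = [kanji for kanji, _ in sort_by_reading(people)]
by_kanji = sorted(kanji for kanji, _ in people)

print(by_reading)  # ordered by pronunciation
print(by_kanji)    # ordered by raw Kanji code points, for comparison
```

With this data the reading-based sort yields Aoki, Satou, Tanaka, Yamada,
while sorting on the Kanji column alone gives an order with no phonetic
meaning. In a database the same idea amounts to keeping a separate reading
column and ordering on that column.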

Frank Bennett
