


Re: [tlug] Search MySQL for Japanese Names



Dave M G writes:

 > I'm still just a little confused over this "decomposed" part of the story.

Decomposed form is a Unicode concept, and it also applies to halfwidth
katakana.  Why handle it?  Because you'll see it anyway: people put
dakuten or handakuten on characters that don't have a composed form,
and halfwidth katakana always require the decomposed form.  For example, I
pronounce my name "Steven", in katakana スティーヴェン but in hiragana
(which for reasons I don't understand is occasionally demanded in
furigana) すてぃーう゛ぇん.  Notice how the "ve" is decomposed in the
hiragana form.
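
To make the composed/decomposed distinction concrete, here is a small
Python sketch (Python is my choice for illustration; the thread itself
is about PHP and MySQL).  It takes the precomposed katakana "vu" and
shows what canonical decomposition produces.  The helper name
code_points is made up for this example.

```python
import unicodedata

def code_points(text):
    """Return the code points of text as 'U+XXXX' strings (illustrative helper)."""
    return [f"U+{ord(ch):04X}" for ch in text]

composed = "\u30F4"  # KATAKANA LETTER VU, a single precomposed character

# Canonical decomposition: base kana followed by the combining dakuten.
decomposed = unicodedata.normalize("NFD", composed)

# Canonical composition puts the pair back together.
recomposed = unicodedata.normalize("NFC", decomposed)

print(code_points(composed))    # ['U+30F4']
print(code_points(decomposed))  # ['U+30A6', 'U+3099']
print(recomposed == composed)   # True
```

Note that text typed by hand often uses the spacing dakuten U+309B
after the kana instead of the combining mark U+3099, so real data may
need both handled.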

 > However, I'm not sure on how to get it.

I wouldn't worry about it.  It's mostly a matter of keeping names in
one consistent form.  If you have good Unicode support, the system
provides a normalization routine to do it.  With Unicode support you
can also remove the (han)dakuten easily: decompose the text, then
filter out the marks so that only the base kana remain.  That allows a
kind of fuzzy matching that is often useful (Japanese speakers are
sometimes unsure whether a given name has nigori or not).
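
A minimal sketch of that fuzzy matching in Python, under a few
assumptions: the function name fuzzy_key is hypothetical, the set of
marks to drop is my own reading of "(han)dakuten" (combining, spacing,
and halfwidth variants), and how the resulting key would be stored or
queried in MySQL is not covered here.

```python
import unicodedata

# Assumed set of voicing marks to discard:
# combining, spacing, and halfwidth dakuten/handakuten.
VOICING_MARKS = {
    "\u3099", "\u309A",  # combining dakuten / handakuten
    "\u309B", "\u309C",  # spacing dakuten / handakuten
    "\uFF9E", "\uFF9F",  # halfwidth dakuten / handakuten
}

def fuzzy_key(name):
    """Decompose name, then drop every (han)dakuten, keeping the base kana."""
    decomposed = unicodedata.normalize("NFD", name)
    return "".join(ch for ch in decomposed if ch not in VOICING_MARKS)

# Names that differ only in nigori end up with the same key.
print(fuzzy_key("ヤマザキ") == fuzzy_key("ヤマサキ"))  # True
```

Comparing keys rather than the raw strings is what makes the match
fuzzy; the original spelling should still be kept for display.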

 > "decomposed" form. Is "decomposed" the term most often used for this
 > kind of thing? Google isn't giving me much love when I search on it in
 > relation to katakana.

Try searching for "Unicode normal form NFD" if you're curious.  (You
probably want Wikipedia, not the Unicode Technical Report. :-)

Can't help you with PHP (never touch the stuff) or MySQL,
unfortunately.

