Mailing List Archive


tlug: Two Qs re translation project



(1)

After looking around and thinking about our requirements,
I've tentatively settled on MySQL as the database engine
to use for the dictionary of equivalents in our translation
project.  If I should think twice and consider an
alternative, please let us know.

(2)

One of the things that I'll need to do with the statutory
material (on the English and on the Japanese side) is quick
searching for words and phrases.  Does anyone have information
on Japanese-capable search engines that can be run under
Linux?  I remember there was a mention of Glimpse for
Japanese, but if I recall correctly, the patch is not being
maintained.

I also have a related question that someone (Steve
Turnbull?) will be able to help with.  The Japanese data is stored
in EUC.  In EUC encoding, could a one-byte search engine
capable of indexing 8-bit text be used?  In other words,
if there is a string made up of four bytes:

  [A] [B] [C] [D]

where A and C are the first bytes of two-byte characters
in EUC-JP encoding, and we run a search using a single-byte
search engine for a single arbitrary two-byte character, is it
possible that our character's underlying encoding could
be [B] [C]?  Or is it logically impossible in EUC-JP
encoding to get crossed up in this way?

In other words, what are the legal bounds of the first and
the second bytes in EUC-JP encoding?
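
For reference, here is a minimal sketch of the crossed-up case described
above.  It assumes the usual EUC-JP layout: ASCII in 0x00-0x7F, JIS X 0208
characters as two bytes that both fall in 0xA1-0xFE, half-width katakana
as 0x8E plus one byte, and JIS X 0212 as 0x8F plus two bytes.  Because the
first and second bytes of a JIS X 0208 character share the same range, a
[B] [C] pair can itself look like a legal character.  The function names
(char_starts, naive_find, aligned_find) are made up for illustration and
do not come from any search engine.

```python
def char_starts(data: bytes):
    """Yield the byte offset at which each EUC-JP character begins."""
    i = 0
    while i < len(data):
        yield i
        b = data[i]
        if b == 0x8F:                        # SS3: JIS X 0212, three bytes in total
            i += 3
        elif b == 0x8E or 0xA1 <= b <= 0xFE:  # SS2 katakana or JIS X 0208, two bytes
            i += 2
        else:                                # ASCII (or a stray byte), one byte
            i += 1


def naive_find(haystack: bytes, needle: bytes):
    """Plain byte search that ignores character boundaries."""
    hits = []
    pos = haystack.find(needle)
    while pos != -1:
        hits.append(pos)
        pos = haystack.find(needle, pos + 1)
    return hits


def aligned_find(haystack: bytes, needle: bytes):
    """Byte search that only accepts matches starting on a character boundary."""
    return [i for i in char_starts(haystack) if haystack.startswith(needle, i)]


# [A] [B] [C] [D] for the two hiragana "あい" is A4 A2 A4 A4 in EUC-JP.
data = "あい".encode("euc_jp")
straddling = data[1:3]   # [B] [C] = A2 A4, both bytes inside 0xA1-0xFE

print(naive_find(data, straddling))     # [1]: a false hit across two characters
print(aligned_find(data, straddling))   # []: rejected once boundaries are respected
```

So a single-byte engine can report hits that straddle two characters; one
possible workaround, under the assumptions above, is to post-filter its
hits against character boundaries as aligned_find does.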

Cheers,
Frank B

--------------------------------------------------------------------
Next Nomikai Meeting: February 18 (Fri) 19:00 Tengu TokyoEkiMae
Next Technical Meeting:  March 11 (Sat) 13:00 Temple University Japan
* Topic: TBD
--------------------------------------------------------------------
more info: http://www.tlug.gr.jp        Sponsor: Global Online Japan

