Mailing List Archive


Re: new webpage: rikai.com



On Tue, 12 Sep 2000, Stephen J. Turnbull wrote:

> But parsing Japanese is much harder, as is deinflection.  Doing it
> well really requires a tool like chasen.

  Interesting. I had never heard of chasen. What do you think of the
parsing in Rikai? I don't do anything as complex as frequency analysis.
That always seemed like a cheat to me anyway: you get the more frequent
(pronounced 'easy') words correct by guessing and screw up exactly the
ones that are hardest. Jim Breen put it best: Yippie, I parsed keredomo!
  But it certainly does raise the accuracy rates for marketing material.
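
  For illustration only, here is a rough sketch of what a dictionary-only
parser with no frequency analysis can look like. It uses greedy
longest-match, which is an assumption made for this example and not a
description of how Rikai actually parses. The names segment, TOY_LEXICON
and max_len are hypothetical, and the lexicon is a handful of made-up
entries rather than edict.

```python
def segment(text, lexicon, max_len=8):
    """Greedy longest-match: at each position, take the longest lexicon entry.

    Characters that match no entry are emitted one at a time, so the
    function always terminates and never drops input.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for n in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + n]
            if n == 1 or piece in lexicon:
                tokens.append(piece)
                i += n
                break
    return tokens


# Toy lexicon (hypothetical entries, for illustration only).
TOY_LEXICON = {"けれども", "けれど", "行く", "私", "は"}

if __name__ == "__main__":
    print(segment("私は行くけれども", TOY_LEXICON))
```

  With no frequency data, a scheme like this only finds words that are in
the lexicon in exactly the form they appear in the text, which is why
deinflection matters so much.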

> Of course, what you could do is set it up so that while you're
> downloading and installing chasen and edict in the background, your
> client is looking up on rikai in the foreground.

  Exactly. If there are serious arguments for doing this client-side, I
certainly don't see them holding water for more than another year or two.
We're talking 20-40k of data to send around, based on many megs of
dictionary files. Light makes the trip pretty damn fast. One can use Jim's
dictionary lookup program on a docomo phone. Why on earth would you want
to download and constantly update edict, kanjidic, namdict, etc., unless
you're a hacker trying out new ideas (a small percentage of users), or
using it for proprietary business purposes (in which case Stephen is
unlikely to shed tears for you, I would guess, so write me offline)?

  Otherwise, if someone wants to sponsor some mirror hosting in Australia
or wherever, please let me know.

-- Todd

