Mailing List Archive

Re: new webpage: rikai.com



>>>>> "Simon" == Simon Cozens <simon@example.com> writes:

    Simon> Incidentally, I'd be very interested in taking this
    Simon> application and trying to make it work client-side. Doing
    Simon> it server side seems remarkably inefficient.

Well, now that you're threatening to _do_ something about it, I'll
comment.

The point of rikai.com is that clients don't have to maintain a
dictionary and an instance of chasen.  The English-to-Japanese version
is pretty trivial: filter all the tags out (surely there are libraries
for this), break things into words, deinflect the words (I have a
25-line Python hack that does a pretty good job, the Perl version
would be similar, and there are probably library versions that are
_much_ better), and start looking up in edict.
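
For concreteness, here is a rough sketch of those four steps in Python. It is not the 25-line hack mentioned above. The suffix table, the function names, and the plain dict standing in for a parsed dictionary file are all made up for illustration, and the suffix rules will miss or mangle plenty of real words.

```python
import re
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collect text nodes and drop every tag."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def strip_tags(html):
    """Step 1: filter the tags out, keeping only the text."""
    parser = _TextOnly()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical (suffix, replacement) rules; a real deinflector needs far more.
SUFFIX_RULES = [
    ("ies", "y"),
    ("ing", ""),
    ("ing", "e"),
    ("ed", ""),
    ("ed", "e"),
    ("es", ""),
    ("s", ""),
]


def candidates(word):
    """Step 3: yield the word itself, then guessed base forms."""
    yield word
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix) and len(word) > len(suffix) + 1:
            yield word[: -len(suffix)] + replacement


def lookup_page(html, dictionary):
    """Steps 2 and 4: split into words, then look up the first form that hits.

    `dictionary` is assumed to be a plain dict of headword -> gloss.
    Returns {surface word: (matched headword, gloss)}.
    """
    results = {}
    for word in re.findall(r"[a-z]+", strip_tags(html).lower()):
        for form in candidates(word):
            if form in dictionary:
                results[word] = (form, dictionary[form])
                break
    return results
```

Calling lookup_page(page_source, entries) with a small headword-to-gloss dict is enough to try it out. Words with no matching candidate are simply left out of the result.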

But parsing Japanese is much harder, as is deinflection.  Doing it
well really requires a tool like chasen.

Of course, what you could do is set it up so that while you're
downloading and installing chasen and edict in the background, your
client is looking up on rikai in the foreground.
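
One way that arrangement might look, sketched in Python. Everything here is hypothetical: remote_lookup stands in for whatever the client does to query rikai, and install_local stands in for the download-and-install step, which is assumed to return a local lookup function when it finishes. Neither is a real API.

```python
import threading


class FallbackLookup:
    """Use the remote lookup until the local one has finished installing."""

    def __init__(self, remote_lookup, install_local):
        self._remote = remote_lookup
        self._local = None
        self._ready = threading.Event()
        # Run the slow install in the background so lookups are not blocked.
        worker = threading.Thread(
            target=self._install, args=(install_local,), daemon=True
        )
        worker.start()

    def _install(self, install_local):
        # install_local() is assumed to return a callable taking a word.
        self._local = install_local()
        self._ready.set()

    def lookup(self, word):
        """Answer from the local tools once ready, else from the remote service."""
        if self._ready.is_set():
            return self._local(word)
        return self._remote(word)

    def wait_until_local(self, timeout=None):
        """Block until the local install is done; True if it finished in time."""
        return self._ready.wait(timeout)
```

If the install fails, the background thread just ends and lookups keep going to the remote side, which seems like the right default for this kind of client.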

-- 
University of Tsukuba                Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
Institute of Policy and Planning Sciences       Tel/fax: +81 (298) 53-5091
_________________  _________________  _________________  _________________
What are those straight lines for?  "XEmacs rules."

