Re: tlug: Wnn

To: tlug@example.com
Subject: Re: tlug: Wnn
From: "Stephen J. Turnbull" <turnbull@example.com>
Date: Tue, 13 May 1997 10:51:33 +0900
In-reply-to: Your message of "Mon, 12 May 1997 23:02:23 +0900." <Pine.LNX.3.96.970512225513.390C-100000@example.com>
Reply-To: tlug@example.com
Sender: owner-tlug
--------------------------------------------------------
>>>>> "Craig" == Craig Oda <craig@example.com> writes:

    Craig> On Mon, 12 May 1997, Dennis McMurchy wrote:

    >> Most Linux books in the vernacular seem very Canna-oriented.
    >> I'm not sure why.  Does anyone know?  Anyone seen good

Probably because Canna is free and supported.  Wnn 6 is very
expensive, and Wnn 4 is not supported (maybe a little bit if you can
handle the nihongo mailing list; I've never bothered).

    >> comparison of Wnn vs. Canna.  I remember Steve saying he
    >> thought Wnn was a memory hog.

    Craig> Dennis, you noticed this too, huh?  I was wondering about
    Craig> the exact same thing.  I started off with Wnn, moved to
    Craig> Canna, but moved back to Wnn on my current system when I
    Craig> ran into compilation problems.  I also had problems using

Different drummer, Craig?  Everybody else compiles Canna OK, and has
problems with Wnn....

    Craig> kinput2-canna.  Wnn works fine for me.  I think it is more

No problems with kinput2-canna, either....

    Craig> sophisticated in lexical parsing than the Win '95 J stuff.
    Craig> However, if canna is better, I would prefer to learn the
    Craig> better system.

I think that Wnn 4 has a slight lead over current Canna still in terms
of "guessing" correct henkan on the basis of usage frequencies, and
maybe a smarter learning algorithm.  Both are much better than any
Win95 or previous Microsoft-kei, at least for the ignorant gaijin who
doesn't much like function keys and prefers Ctrl-G to the mu-henkan
key.  I have not tried MS-IME97 yet, though.
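
As a rough illustration of what "guessing on the basis of usage
frequencies" with learning can mean, here is a toy sketch.  It is not
how Wnn or Canna are actually implemented; the class name, the method
names, and the three-word dictionary are all made up for the example.

```python
from collections import defaultdict

# Hypothetical toy dictionary: reading -> candidate spellings.
CANDIDATES = {
    "kanji": ["漢字", "感じ", "幹事"],
}


class FrequencyRanker:
    """Rank henkan candidates by how often the user has picked each one."""

    def __init__(self, candidates):
        self.candidates = candidates
        self.counts = defaultdict(int)  # (reading, spelling) -> times chosen

    def rank(self, reading):
        options = self.candidates.get(reading, [])
        # sorted() is stable, so candidates with equal counts keep
        # their dictionary order.
        return sorted(options, key=lambda s: -self.counts[(reading, s)])

    def learn(self, reading, chosen):
        """Record that the user accepted `chosen` for `reading`."""
        self.counts[(reading, chosen)] += 1
```

Calling learn("kanji", "感じ") once is enough to move 感じ to the front
of rank("kanji") in this toy; a real server presumably weighs much
more than a raw count.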

Other than that, I don't think there is much difference between the
two.  Both Wnn and Canna use syntactic information to help narrow the
choices; both have trouble at romaji/nihongo boundaries, although
different kinds of trouble.  Both definitely don't like gaijin who
often henkan in the "wrong" place (I always do stuff like "nihongo"
<henkan> "dehanasimasu" <henkan> and get 「日本語出話します」).
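
The boundary problem can be mimicked with a toy model.  The sketch
below assumes a made-up three-entry lexicon and one made-up grammar
rule (a particle is only allowed right after a noun in the same
chunk); neither is taken from Wnn or Canna.  It only shows why
splitting the input before the particle can change the result.

```python
# Toy lexicon (hypothetical): romaji -> list of (spelling, part of speech).
LEXICON = {
    "nihongo": [("日本語", "noun")],
    "de": [("で", "particle"), ("出", "stem")],
    "hanasimasu": [("話します", "verb")],
}


def convert(chunk):
    """Greedy longest-match conversion of one henkan chunk."""
    out, pos, prev = [], 0, None
    while pos < len(chunk):
        for end in range(len(chunk), pos, -1):
            entries = LEXICON.get(chunk[pos:end])
            if entries:
                # Toy rule: a particle needs a noun before it in this chunk.
                usable = [e for e in entries
                          if e[1] != "particle" or prev == "noun"]
                if usable:
                    spelling, prev = usable[0]
                    out.append(spelling)
                    pos = end
                    break
        else:
            out.append(chunk[pos])  # pass unknown characters through
            pos += 1
    return "".join(out)
```

Here convert("nihongodehanasimasu") sees the noun before "de" and
picks the particle, while convert("nihongo") + convert("dehanasimasu")
has lost that context and falls back to the kanji stem.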

    Craig> What is this xim thing?  I have a few files in the
    Craig> Debian-JP distribution, maybe I'll give it a read.
    Craig> Interesting thing is that I have xpostit and xcalendar
    Craig> working with kinput2.  Does this xim thing do the same
    Craig> thing as kinput2 and use a standard protocol as opposed to

XIM _is_ a protocol, allegedly supported by kinput2.  In fact, it is
_the_ _standard_ protocol specified by the X Consortium.  I don't know
whether there are advantages to XIM over kinput2 protocol; I imagine
not for Japanese but possibly for other languages.  One advantage to
XIM for people who speak European languages other than English is that
compose keys and dead keys can be (and are, in fact) implemented in
Xlib, eliminating the delays and synchronization problems of separate
servers.  Then the necessary user interface components for each
language can be added as a relatively small subroutine library.  But
since the protocol is the same, the user interface is more consistent.
But kinput (as an input method, handling raw keyboard input) and Wnn
and Canna (as henkan servers, handling complicated translations
involving dictionary lookup) require an external server program.

    Craig> the kinput2 protocol?  I guess kinput2 protocol is
    Craig> non-standard?  I know that Stephen Turnbull brought the xim
    Craig> docs to the last TLUG meeting, but it was so over my head
    Craig> at the time that I didn't understand it.  As I get more
    Craig> comfortable with Japanese input, I'm starting to become
    Craig> more interested in a standard input method for all
    Craig> programs.

Most Japanese programs for Un*x support kinput or kinput2.  The most
important exception is Emaxen, which support wnn and canna protocols
natively with the patches originally developed for Mule and now
apparently ported to XEmacs.  Mule also supports Quail, a vast
improvement on SKK in terms of multilingualization; I don't think it's 
all that much better in terms of Japanese support as such.

>>>>> "Jason" == Jason Molenda <crash@example.com> writes:

    Jason> I've used SKK for my henkan/FEP thing for a year or two
    Jason> now.  I got fed up with these huge systems like wnn which
    Jason> were a real pain to compile/understand (plus there was lots
    Jason> of documentation which I didn't want to read).

    Jason> It doesn't get much easier than SKK.  No fancy-pants
    Jason> lexical stuff, no servers, no configure scripts.  Just an
    Jason> elisp file and a big plain-text dictionary.  To install,
    Jason> you just copy the elisp & dictionary files in the right
    Jason> place.

The main operational thing to watch out for with SKK (at least the
vintage 1992 version) is that it doesn't parse long phrases; you need
to do the conversions as soon as you have the yomi completed.  (The
name stands for Simple Kana to Kanji conversion, and it converts one
word at a time.)  If you're using GNU Mule,
you might consider Quail instead.  Get the LEIM (library of emacs
input methods) from a recent GNU Emacs/Mule beta site (I think
ftp.lab.kdd.co.jp has this under FSF/Emacs/Mule or something like
that).  Quail is basically SKK by another name, with slightly faster
lookups (the dictionary itself is now compiled LISP code; technically
speaking, I think it's been made into an obarray) and better
standardization if you are really multilingual (see below).  Quail
should work with non-Mule Emaxen, I think.
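
For a feel of how little machinery a plain-text dictionary needs,
here is a minimal sketch in Python (SKK itself is elisp).  It assumes
the usual "yomi /candidate/candidate/" line layout with ";" comment
lines; the sample entries and function names are made up, and real
dictionaries have more entry types than this handles.

```python
def parse_skk_line(line):
    """Split 'yomi /cand1/cand2/' into (yomi, [cand1, cand2])."""
    yomi, _, rest = line.strip().partition(" ")
    return yomi, [c for c in rest.split("/") if c]


def load_dictionary(text):
    """Build a yomi -> candidates mapping from dictionary text."""
    entries = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith(";"):
            continue  # skip blank lines and comments
        yomi, candidates = parse_skk_line(line)
        entries[yomi] = candidates
    return entries


SAMPLE = """\
;; toy dictionary, not real SKK data
にほんご /日本語/
かんじ /漢字/感じ/幹事/
"""
```

A lookup is a single exact match on the yomi, so a whole phrase such
as にほんごではなします finds nothing; that is the "convert as soon as
the yomi is complete" behavior in miniature.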

The big defect to SKK/LEIM+Quail is that it makes you Emacs-dependent; 
there won't be an interface to XPostIt for example.  On the other
hand, it is truly multi-lingual, designed for straightforward extension
to new languages.  When XIM finally gets going, that will be true for
all X (and possibly non-X; XIM doesn't need a window) programs.
However, that is not yet available in a single consistent interface as
far as I can tell.

>>>>> "Dennis" == Dennis McMurchy <denismcm@example.com> writes:

    Dennis> On Mon, 12 May 1997, Stephen J. Turnbull wrote:

    >> I want my X-I-M ("kanji for nothing and Devanagari for
    >> free")....

    Dennis>   What is this xim?  I think this is the second reference
    Dennis> today.  If it's real and does Devanagari, I'm interested.
    Dennis> Mule doesn't, strangely, although support may be in the
    Dennis> works.

The Devanagari part was a tease, pure and simple.  XIM doesn't do
it yet, as far as I know.  But you can be rather sure that neither Wnn
nor Canna and a fortiori not kinput2 will ever do it, while XIM will
certainly do so (unless X goes away....).

It's not so strange when you consider that Devanagari is not monotonic
in phonemes (although syllables go from left to right, as I recall,
various phonemes may modify the glyph denoting a word on the right or
the left of the last phoneme).  This is a not-easy problem (although
since it's not NP-complete, we shouldn't consider it "hard" :-).  It
was pretty cool to watch on-screen in the demo.  I guess Hangul
composition probably looks pretty much the same.
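
One concrete case of the non-monotonic behavior is the short-i vowel
sign (U+093F): it is typed and stored after its consonant but drawn
to the left of it.  The sketch below handles only that one sign plus
virama-joined consonant clusters; the function name is made up, and
real Devanagari shaping involves a good deal more than this.

```python
VOWEL_SIGN_I = "\u093f"  # DEVANAGARI VOWEL SIGN I, drawn left of its consonant
VIRAMA = "\u094d"        # DEVANAGARI SIGN VIRAMA, joins consonants in a cluster


def visual_order(logical):
    """Return the code points in rough left-to-right drawing order."""
    out = []
    for ch in logical:
        if ch == VOWEL_SIGN_I and out:
            # Walk back over the consonant cluster this sign belongs to.
            start = len(out) - 1
            while start >= 2 and out[start - 1] == VIRAMA:
                start -= 2
            out.insert(start, ch)
        else:
            out.append(ch)
    return out
```

So the glyph that appears on screen can jump leftward when the vowel
is typed, which is the kind of thing that makes the on-screen
composition interesting to watch.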

Mule's LEIM does now do Devanagari, at least in an alpha version.
Kawabata Taichi, a graduate student at Todai, demonstrated a "Trial
Implementation of Devanagari Script on Mule" at the Second
Multilingualization Conference held at ETL in Tsukuba at the end of
March.  I don't have an email address for him, unfortunately, but he
seems to be working closely with Handa Ken'ichi at ETL.  I don't have
Handa-san's address, either, but handa@example.com is a good guess, and
it might be more polite to try mule-etl@example.com or
m17n-sec@example.com first (the latter is the address for
Conference-related email, so it may be defunct).  No notes were
distributed, but a conference volume will be produced and I assume
registered participants will get copies; I'll let you know if and when
:-)

It's not Mule-related, but there was a presentation on Indian
information processing, including Devanagari, in the (First) M17N IP
Conference last year; it doesn't look like all that much (although
there seems to be a large appendix of Devanagari script) but if you
send me a snail-mail address I'll make a xerox copy and send it out.
I'd scan it and put it on the Web but (a) that's slow, and (b) I am a 
little uneasy about copyright.

-- 
                            Stephen J. Turnbull
Institute of Policy and Planning Sciences                    Yaseppochi-Gumi
University of Tsukuba                      http://turnbull.sk.tsukuba.ac.jp/
Tel: +81 (298) 53-5091;  Fax: 55-3849              turnbull@example.com