Mailing List Archive

Re: Japanese input (was RE: tlug: Japanese)



>>>>> "Matt" == Matthew J Francis <asbel@example.com> writes:

    >> King Canute told the tide to go back to the sea.  I'm afraid
    >> you guys are doing the same thing.

    Matt> (Kingy didn't have nuclear explosives. Even the tide can be
    Matt> persuaded if you cut a clear enough channel... :)

Not for long.  I take it you aren't a Dutchman, and that you don't
live in Kyushu or near the Mississippi River.

I wrote:

    >> There is such a thing as UCS-4, of course, but as far as I know
    >> it is 100% empty except for the BMP and private space and the
    >> reserved non-characters at FFFF and FFFE in every plane.  This
    >> means that the system you are putting in place now must be
    >> completely flexible with respect to the standards that will
    >> undoubtedly evolve for the use of UCS-4.
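
To make the "FFFF and FFFE in every plane" remark concrete, here is a
minimal sketch, assuming the usual view of a UCS-4 value as a plane
number followed by a 16-bit offset within that plane.  The function
names are made up for illustration; they come from no library.

```python
def plane_of(code_point: int) -> int:
    """Return the 64K-plane number of a code point (the BMP is plane 0)."""
    if code_point < 0:
        raise ValueError("code point must be non-negative")
    return code_point >> 16


def is_plane_final_noncharacter(code_point: int) -> bool:
    """True for the reserved positions xxFFFE and xxFFFF of any plane."""
    if code_point < 0:
        raise ValueError("code point must be non-negative")
    # The low 16 bits are FFFE or FFFF exactly when masking off the
    # lowest bit leaves FFFE.
    return (code_point & 0xFFFE) == 0xFFFE
```

For example, is_plane_final_noncharacter(0x1FFFF) is true and
plane_of(0x1FFFF) is 1, while 0xFFFD in the BMP is an ordinary
position.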

>>>>> "Gaspar" == Gaspar Sinai <gsinai@example.com> writes:

    Gaspar> I think UTF8 is a perfect choice, because it can encode
    Gaspar> UCS4. But I also feel that MS-biased Unicode Consortium

Putting it that way, UTF-8 means nothing, since it's a perfectly
general method of encoding 31-bit integers.  I.e., it has no semantics.
So we're going back to the Space = Alt-3 Alt-2 method of character
input?  I thought not.  And internally, let's not talk about using
UTF-8; I can see no objection to fixed-width CARD31 buffers that
can't be met _in practice_ by using CARD8, CARD16, or CARD31
buffers as appropriate to the linguistic situation at hand.
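
A rough sketch of both halves of that argument, under stated
assumptions.  The first function is the original UTF-8 bit layout (one
to six bytes), which accepts any 31-bit integer whether or not a
character is assigned to it.  The second picks the narrowest
fixed-width buffer for a given text.  Both function names are
hypothetical, and "CARD31" is used here only as a label for a 31-bit
cell.

```python
from typing import Iterable


def utf8_encode_31bit(value: int) -> bytes:
    """Encode any integer in 0..2**31-1 using the 1-to-6-byte UTF-8 layout."""
    if not 0 <= value <= 0x7FFFFFFF:
        raise ValueError("value does not fit in 31 bits")
    if value < 0x80:
        return bytes([value])
    # (exclusive upper limit, total bytes, lead-byte marker bits)
    for limit, length, marker in (
        (0x800, 2, 0xC0),
        (0x10000, 3, 0xE0),
        (0x200000, 4, 0xF0),
        (0x4000000, 5, 0xF8),
        (0x80000000, 6, 0xFC),
    ):
        if value < limit:
            break
    # Peel off six bits at a time for the continuation bytes.
    tail = []
    for _ in range(length - 1):
        tail.append(0x80 | (value & 0x3F))
        value >>= 6
    # Whatever bits remain go into the lead byte.
    return bytes([marker | value] + tail[::-1])


def narrowest_buffer(code_points: Iterable[int]) -> str:
    """Name the smallest fixed-width cell that holds every code point."""
    largest = max(code_points, default=0)
    if largest < 0x100:
        return "CARD8"
    if largest < 0x10000:
        return "CARD16"
    return "CARD31"
```

The encoder never asks what the integer means, which is the "no
semantics" point; the chooser shows that a text confined to one small
repertoire need not pay for 31-bit cells.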

    Gaspar> will never put any practical meaning into UCS4 as MS-NT
    Gaspar> only supports UCS2 and it would be a lot of work for them.

The Unicode Consortium and ISO-10646 try to work together, and the
Consortium has done much of the practical work.  However, technically
speaking, and potentially practically speaking, they are quite
different things.  But it's not M$-bias that created those wide open
spaces; it's that there are not yet any practical proposals of what to
put in there, primarily because everybody is fighting over the empty
space in the BMP.  That's cynical, too, but at least we can see the
end of it.

    Matt> Homework time; I don't know about the details of UCS-4. In
    Matt> any case, an inflexible solution is a bad one, which I would
    Matt> hope not to be involved in writing.

You're missing my point then.  Those wide open spaces mean it needs to 
be infinitely flexible.  (That's an exaggeration, of course.)

But for the near future that I see we will need to handle the entire
Babel of character sets, including some that don't exist yet.  More
flexibility....

    Gaspar> I don't think simple things like character input need a
    Gaspar> genius.  Moreover, when people are too capable they use
    Gaspar> up their energy by over-complicating things.

Character input doesn't take a genius.  But it is inherently complex.

    Gaspar> Just take X11 for example. It does not say how to make
    Gaspar> graphical interfaces and it tries to be very general.
    Gaspar> Result: no drag and drop support, different look-and-feel,
    Gaspar> and simple things are impossible to accomplish - for one:
    Gaspar> try to make a cursor with more than two colors in X11 --
    Gaspar> impossible.

Why do you want a cursor with more than two colors?

If you need one, make a 16x16 sprite.  It can be done, and it's not
terribly inefficient.

    Gaspar> People try to defend input methods in X11. But here comes
    Gaspar> multi-language input. It turns out that you have to throw
    Gaspar> away all the library routines in X11 that are supposed to
    Gaspar> help you, to make it work with X11. Sometimes I think that

Who told you that?  I know how to make it work.  I'm not going to say
it will be efficient.  I'm not going to say it will be the best way.
But I do know how to make it work.  Hint:  setlocale(3).
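
For readers unfamiliar with the hint: setlocale(3) is the C library
call that selects the locale a program runs in.  A minimal sketch of
that first step, using Python's standard wrapper around it; the
function name and the fallback to the "C" locale are assumptions made
for illustration, and none of the X11 calls that would follow are
shown.

```python
import locale


def init_ctype_locale() -> str:
    """Adopt the user's LC_CTYPE locale from the environment.

    Falls back to the portable "C" locale if the environment names a
    locale this system does not have installed.
    """
    try:
        # An empty string means "use LANG / LC_* from the environment".
        return locale.setlocale(locale.LC_CTYPE, "")
    except locale.Error:
        return locale.setlocale(locale.LC_CTYPE, "C")
```

The return value is the name of the locale now in effect, which a
program could log before going on to open an input method.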

    Gaspar> GGI and Berlin are the only way to go. But still X11 is

URL!  URL!  URL!  We all know what X11 is ... what are these?

    Gaspar> usable we just need to tweak it at the right place.

    Gaspar> o We need an input conversion server, with a standardized
    Gaspar>   communication method that does not need X11 running.

Why?  Real multilingual interfaces may as well be graphical; I can
imagine a Unicode terminal, but that isn't going to satisfy anybody
who's ever needed a gaiji.  Damned if I'll use svgalib as my primary
user interface ;-)

And we have that input conversion protocol, in the wnn/?server
protocol.  Or maybe Canna's is better.  Neither needs X11 running.

    Gaspar> o X11 input methods like kinput2 could be modified to
    Gaspar>   handle this protocol. Old apps won't notice a thing.

But will this new protocol handle all the messy state and preedit
stuff as well as XIM?  There's nothing in the XIM protocol that
requires X (and not very much that requires a windowing system).

    Gaspar> o Languages to be added dynamically to the input
    Gaspar>   conversion server.

Why?  Why not run multiple backends, as today where many systems run
Wnn and Canna concurrently?  Or are you talking about the input
manager (like kinput2)?

You don't want to talk about Emacs, but have you looked at the huge
coding effort that is the LEIM?  LEIM already has this.

    Gaspar> Drawbacks: You need to display the converted characters -
    Gaspar> but wait a minute - all apps CAN display this.

Only with the graphical interface.  But you just threw that away.  I
know from XEmacs experience (lurking on the beta list, not my own
personal experience) that making a multilingual app talk to a
monolingual TTY can be painful.  More complexity with each TERM that
you support (many of which probably haven't been invented yet).

    Gaspar> When I ported Canna and kinput2 to alpha (64 bit) I worked
    Gaspar> day and night for more than a week. I don't feel much
    Gaspar> attached to the "heritage" here.  Old codes are not even
    Gaspar> 64 bit clean - easier to rewrite than port.

But you kept the protocols.  By definition.  Dirty code is unclean,
true.  But let's separate the protocols from their implementations,
please.

    Matt> Dreams are indeed nice, but they're also useless unless you
    Matt> *try to make them happen*. I also have enormous respect for
    Matt> those who have gone before me, but sometimes it takes a
    Matt> bunch of heretics to make interesting stuff happen :)

I'm doing so.  I just have a different opinion (on the small side)
than most people have of just how big the mouth of Stephen J. Turnbull 
is, and I choose to bite off a reasonable-sized mouthful.

    Gaspar> I think what is really needed is an open environment and
    Gaspar> open standard, that is decided by us.

    Matt> If the whole thing turns out 100 times worse than I imagine,
    Matt> then the very least that will have been accomplished is to
    Matt> make other people think about how they could do it
    Matt> better. And that in itself will have been success
    Matt> enough. Competition stimulates...

Go ahead.  If it's good, I'll jump in.  But I know better than to try
to design one from scratch, myself.  I can serve much better
elsewhere.  There are plenty of attempts out there to try to improve on.

    Gaspar> Emacs is very powerful. But it is too complicated for the
    Gaspar> occasional user.  I would like to make Linux available for
    Gaspar> all users - non-professional users are still in the
    Gaspar> majority....

You're missing my point.  I'm not advocating Emacs for the masses
(here; it can be done, and both XEmacs.org and the FSF are pointing in
that direction).  I am advocating Emacs as a development environment
for multilingual applications, because it has an established set of
standards, APIs and protocols, that can be used or altered fairly
easily in a Very High Level Language.

ciao.

--------------------------------------------------------------
Next TLUG Meeting: 13 June Sat, Tokyo Station Yaesu gate 12:30
Featuring Stone and Turnbull on .rpm and .deb packages
Next Nomikai: 17 July, 19:30 Tengu TokyoEkiMae 03-3275-3691
After June 13, the next meeting is 8 August at Tokyo Station
--------------------------------------------------------------
Sponsor: PHT, makers of TurboLinux http://www.pht.co.jp

