Mailing List Archive

Re: tlug: Re: Japanese input



>>>>> "Matt" == Matthew J Francis <asbel@example.com> writes:

    Matt>   One solution to this could be the language-tagging method
    Matt> specified in Unicode Technical Report #7 (available from

This is getting on towards supporting locales, don't you think?

    Matt> unicode.org). I do think this is a problem that needs
    Matt> shaving carefully with Occam's razor, though.

Yup.  And then look in the mirror of what the customer wants.

    Matt> 1. A somewhat "smart" text renderer, with a set of defaults
    Matt> for how different codepages are displayed (e.g. "English
    Matt> l->r, Hebrew r->l, Japanese r->l and vertical with breaking
    Matt> every 20 lines"). If that is really insufficient control,
    Matt> then...

    >> Hebrew, unfortunately, is not r->l.  It's bidirectional.

    Matt> Ouch. Is that bidirectional as in "Can be written either
    Matt> r->l or l->r", in the same way that Japanese can be written
    Matt> either horizontally or vertically, or "Always both r->l and
    Matt> l->r", as in Boustrophedon?  (Or something even more
    Matt> awkward?)

That last is ancient Greek for writing on stone tablets, i.e., r->l
followed by l->r, so that you don't have to lug those heavy stone
chisels and the scaffolding back to the other end of the Parthenon
when you do a CR?  ;-)

Look at any commercial truck in Japan; it'll be written back to front.
On both sides.  :-)  So Japanese, even the modern dialect, is
_horizontally_ bidirectional by user preference.

Hebrew (and Arabic) both use l->r notation for numerals and European
language inclusions (very common in modern Hebrew, as you can
imagine), while the normal direction is r->l.
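
To make the mixed-direction point concrete, here is a rough Python
sketch that splits a string into runs according to the Unicode
bidirectional class of each character.  It is only an illustration,
not the Unicode bidirectional algorithm; the name `direction_runs`
and the rule that neutral characters (spaces, punctuation) stick to
the preceding run are assumptions of mine.

```python
import unicodedata

# Bidirectional classes from the Unicode character database.
RTL_CLASSES = {"R", "AL"}        # Hebrew letters, Arabic letters
LTR_CLASSES = {"L", "EN", "AN"}  # Latin letters, European and Arabic numbers


def direction_runs(text):
    """Group text into (direction, substring) runs, in stored order."""
    runs = []
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls in RTL_CLASSES:
            direction = "rtl"
        elif cls in LTR_CLASSES:
            direction = "ltr"
        else:
            # Assumption: neutrals inherit the direction of the previous run.
            direction = runs[-1][0] if runs else "neutral"
        if runs and runs[-1][0] == direction:
            runs[-1] = (direction, runs[-1][1] + ch)
        else:
            runs.append((direction, ch))
    return runs


if __name__ == "__main__":
    # Hebrew word, then a numeral and a European-language inclusion.
    for direction, chunk in direction_runs("\u05e9\u05dc\u05d5\u05dd 1998 Linux"):
        print(direction, repr(chunk))
```

For the sample string this yields one r->l run (the Hebrew word and
its trailing space) followed by one l->r run ("1998 Linux"), which is
the situation a text widget has to lay out on a single line.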

>      Matt> (1) would correspond to the existing sort of "text edit
>      Matt> control"; if you need to do something more than the
>      Matt> simplistic markup it would provide, then you have need of
>      Matt> something more like a Word Processor.
>  
>  Arabic.  Devanagari.

    Matt> I know a (very) little about Arabic, but can you please
    Matt> enlighten as to what problems lie with Devanagari?

Arabic is context-dependent in the sense of initial, medial, and final 
forms; a good Arabic font contains several thousand glyphs to get the
various forms and their joins correct, thus-I-have-heard.
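
The positional forms can be seen in the Unicode character database
itself.  The sketch below, in Python, looks through the Arabic
Presentation Forms-B block (U+FE70..U+FEFF) for the compatibility
characters that decompose to a single base letter.  The helper name
`presentation_forms` is hypothetical, and a real renderer would pick
glyphs from the font's shaping tables rather than from these
compatibility code points; this only shows that one letter maps to
several shapes.

```python
import unicodedata


def presentation_forms(letter):
    """Map form name ('initial', 'medial', ...) to its presentation character."""
    forms = {}
    for cp in range(0xFE70, 0xFF00):  # Arabic Presentation Forms-B
        ch = chr(cp)
        # Decomposition looks like "<initial> 0628"; empty if none.
        parts = unicodedata.decomposition(ch).split()
        if (len(parts) == 2
                and parts[0].startswith("<")
                and int(parts[1], 16) == ord(letter)):
            forms[parts[0].strip("<>")] = ch
    return forms


if __name__ == "__main__":
    beh = "\u0628"  # ARABIC LETTER BEH
    for form, ch in sorted(presentation_forms(beh).items()):
        print(f"{form:9s} U+{ord(ch):04X} {unicodedata.name(ch)}")
```

BEH comes back with isolated, final, initial, and medial forms; ALEF
(U+0627), which does not join to the following letter, comes back
with only isolated and final.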

In Devanagari, the glyphs combine in complicated topologies like
Korean Hangul; it just so happens that within a "syllable" the order
is typically right to left, but syllables are ordered left-to-right.
I have seen this script entered, and the combining and scrambling of
the glyphs is reminiscent of the Kama Sutra.
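
A small Python sketch of the reordering, using one syllable as the
example.  In the stored text the consonant KA comes first and the
vowel sign I second, but a renderer draws the vowel sign to the left
of the consonant.  The function name `describe` is my own; the code
only lists the stored order and does no rendering.

```python
import unicodedata


def describe(text):
    """Return (code point, general category, name) for each stored character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch))
        for ch in text
    ]


if __name__ == "__main__":
    syllable = "\u0915\u093f"  # KA followed by VOWEL SIGN I
    for code, category, name in describe(syllable):
        print(code, category, name)
```

The second character has category Mc (spacing combining mark), so a
widget cannot treat it as an independent cell the way it would a
Latin letter.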

    Matt> I see mutterings in various places about a supplied sample
    Matt> algorithm for marking up text of varying direction, but as I
    Matt> don't yet have the Unicode book I can't comment on that. In
    Matt> general, I think that if someone wants to use languages of
    Matt> varying direction together, there will be some way to figure
    Matt> out a good markup automatically; and, if there is not, then
    Matt> there just won't be a sensible way to do it.

I think automatic is quite possible.  But...

Good and automatic, not soon.  But eventually---and therefore you must 
plan your protocol for upward compatibility, and in the meantime
provide sensible ways for the user to force the desired behavior.

Large projects will not consider a widget set that looks like it will
need a protocol revision in the near future.

    >> The issue here is that from a QC standpoint, _all_ widgets are
    >> worthy of consideration, and are you really sure they all use
    >> labels?

    Matt> Now looking closely; this does appear to be true for GTK's
    Matt> built-in widgets (There are relatively speaking few enough
    Matt> of those anyway). I don't see why it shouldn't also be true
    Matt> with Qt and Motif.

If you did an appropriate grep for all possible ways of putting text
on the screen, then I'll take that "appear" as pretty authoritative.
Otherwise, you haven't opened the can of worms yet.
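
By "an appropriate grep" I mean something along the lines of the
Python sketch below, which scans source text for calls that draw
strings directly instead of going through a label widget.  The list
of function names is illustrative and certainly incomplete (a few
Xlib and GDK text-drawing calls), and `find_text_calls` is a
hypothetical helper name.

```python
import re

# Illustrative, incomplete list of direct text-drawing calls.
TEXT_DRAWING_CALLS = [
    "XDrawString",
    "XDrawImageString",
    "XmbDrawString",
    "XwcDrawString",
    "gdk_draw_string",
    "gdk_draw_text",
]

PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in TEXT_DRAWING_CALLS) + r")\s*\("
)


def find_text_calls(source):
    """Return (line number, function name) for each direct drawing call."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in PATTERN.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits


if __name__ == "__main__":
    sample = (
        "static void paint(void) {\n"
        "    XDrawString(dpy, win, gc, 0, 12, s, n);\n"
        "    gdk_draw_text (d, font, gc, 0, 12, s, n);\n"
        "}\n"
    )
    for lineno, name in find_text_calls(sample):
        print(lineno, name)
```

Every hit is a place where text reaches the screen without the
widget set's label code being involved, and so a place the QC has to
cover separately.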

    Matt> Considering that programs (not including Emacs) which choose
    Matt> to display and input text in non-widget-set ways are quite
    Matt> probably less internationalis{ed,able} than most even now,
    Matt> the QC burden is mostly theirs rather than the widget
    Matt> set's. Although it would be nice if everyone's QC problems
    Matt> were solvable in one stroke, I suggest no such thing.

This is true.  What I am saying is that there are large projects like
XEmacs that would like to move to a standard widget set, but will not
do so unless the QC is shown to be quite high, and the maintainers
responsive to bug reports.

>  If you don't get that, then projects like XEmacs are not going to use
>  the widgets; better we live with and incrementally improve the poor
>  man's widget set we've got, than introduce a whole new set of
>  compatibility issues.

    Matt> As far as I can see, (X)Emacs is unlikely to move to a
    Matt> general widget set in any case, at least in the foreseeable
    Matt> future. As you say, they prefer the solid stability of their
    Matt> custom widgets.

That's because there is no reasonably good set of widgets out there
yet.  Motif is proprietary; XEmacs supports it but it's buggy on many
platforms, and Lesstif is very buggy and incomplete.  The others are
much worse.  The issue comes up every three months, however, and there
would be a lot of enthusiasm for such a port if something that
satisfies portability, free source, non-bugginess, and completeness
can be found.  "Non-bugginess" includes "doesn't interfere with
XEmacs's way of doing things" (admittedly, not even Xt satisfies this
perfectly).

    >> As for unportable ... XEmacs can be configured as a widget
    >> which can replace most text widgets.  Hmm....

    Matt> Yet strangely enough, I don't see anyone doing that. Whether
    Matt> it's not what people want, or genuinely just through lack of
    Matt> good PR I couldn't say.

It's an awfully heavy-weight way to enter name and address in a web
browser HTML order form, wouldn't you say?

But there are commercial projects using XEmacs as a widget, though
what for I don't know.  I just know that the one guy on the beta list
who represents a company doing this is a very prolific source of bug
reports and patches ;-)



--------------------------------------------------------------
Next Nomikai: 17 July, 19:30 Tengu TokyoEkiMae 03-3275-3691
Next Meeting: 8 August, Tokyo Station Yaesu central gate 12:30
*** 20 June: TLUG will be at the Tokyo Linux Fair
http://tlug.linux.or.jp/projects/linux-fair/fair.html
--------------------------------------------------------------
Sponsor: PHT, makers of TurboLinux http://www.pht.co.jp

