Mailing List Archive



Re: [tlug] i18n Primer



>>>>> "Jim" == Jim Breen <Jim.Breen@example.com> writes:

    Jim> Naruhodo. I am bearing a huge load of guilt because I have
    Jim> done very little towards it.

Damn.  I was hoping to offload a few of my 16 tons.  :-)  Let me see if
I can make some time next term.

BTW, you can feel guilty if you want, but documentation is unfun to
start with, and I18N strikes me as the least fun of the lot.  If you
look at what folks like Suzuki et al do (Kokusaika Puroguramingu), or
the X11 standards, or Pango (what a distasteful name! and even uglier
when written as παν語---Owen Taylor must be a dyslexic with a
really sick middle ear), or Xft, you can see that people enjoy
thinking up beautiful frameworks and even implementing them, but
nobody documents APIs, let alone usage.

    >>> editions.  My article in the LJ in 1999 was too heavy on
    >>> theory, not very strong on practice, as was my chapter in
    >>> Wrox's Professional Linux Programming.

    Jim> 'Twas early days though. I think things are a bit clearer
    Jim> now in a practical sense.

Sure.  The big boys like Red Hat and SuSE, much as they are a pain in
the neck that sits under the XEmacs maintainer hat, really have done
us a beeeeg favor by providing resources to practical I18N efforts.

    Jim> Um. I'm not so sure about [Unicode internals]. I can think of
    Jim> many situations where you can comfortably leave the internals
    Jim> in UTF8. The hit converting UTF8->Unicode->UTF8 while
    Jim> working on a large file can be horrible. For example, the
    Jim> main part of the internal format of my JMdict is in EUC, and
    Jim> I can open it in an EUC-capable editor in 3 seconds. Opening
    Jim> the UTF8 version in something like Yudit takes about as long
    Jim> as making a cup of coffee.

I'll do some time trials later, but I simply don't believe that is
necessary.  (I believe your observations are accurate, I just think
that proper programming should avoid the problem.)  Would you say
450MHz Pentium II (Xeon) with 256MB of memory is a reasonably
conservative testbed?  I can downgrade that to a 120MHz Pentium
Obsoletus with 80MB, if you like.  ;-)  Reading in a file on a
reasonably modern machine should happen just about as fast as the PCI
bus will permit, with any normal external->internal representation
conversion (obviously if your internal representation is bzip2 things
are gonna be a bit slow...).
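
For anyone who wants to try a time trial of their own, here is a minimal
sketch in Python. It is not the trial promised above, and it measures only
the external->internal decoding step, not disk I/O or editor display. The
sample line and the function names (build_sample, time_decode, run_trial)
are made up for illustration; a real test would use the actual dictionary
file.

```python
import time


def build_sample(repeats=20000):
    # Hypothetical stand-in for a dictionary file: kanji, kana and ASCII mixed.
    line = "辞書 [じしょ] /(n) dictionary/lexicon/\n"
    return line * repeats


def time_decode(data, encoding, rounds=5):
    """Return the best wall-clock time in seconds to decode `data`."""
    best = float("inf")
    for _ in range(rounds):
        start = time.perf_counter()
        data.decode(encoding)  # external bytes -> internal Unicode string
        best = min(best, time.perf_counter() - start)
    return best


def run_trial(repeats=20000):
    """Encode the same text as EUC-JP and UTF-8, then time decoding each."""
    text = build_sample(repeats)
    results = {}
    for encoding in ("euc_jp", "utf_8"):
        raw = text.encode(encoding)
        assert raw.decode(encoding) == text  # round trip must be lossless
        results[encoding] = (len(raw), time_decode(raw, encoding))
    return results


if __name__ == "__main__":
    for enc, (size, secs) in run_trial().items():
        print(f"{enc}: {size} bytes decoded in {secs * 1000:.2f} ms")
```

If the two decode times come out within the same ballpark, that would
suggest a slow editor is losing its time somewhere other than the encoding
conversion itself. The numbers will of course vary by machine and Python
version.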

    >>> As long as you use a language (a p-language, for example) or
    >>> toolkit (GTK) that supports Unicode internally, you generally
    >>> do not have to worry about issues like font handling or input
    >>> methods.  Those are somebody else's problem.  ;-)

    Jim> Many people think these are the problem, but I agree that if
    Jim> you have done the earlier bits properly, fonts and inputs are
    Jim> a done deal.

So there you are, Josh.  :-)

-- 
Institute of Policy and Planning Sciences     http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.
