Mailing List Archive



Re: [tlug] Kanji Email on XP: Mozilla vs. Firefox/Thunderbird (Re:font encoding for mozilla)



>>>>> "sjs" == sjs  <sjs@example.com> writes:

    sjs> I can't see anything that says "Use this font for display of
    sjs> ISO2022."

You won't.  There are two coded character sets that are used with
Japanese: JIS and Unicode.  Unicode is a superset of JIS.
(Technically, there are several variants of JIS, and JIS is divided
into several parts, but this is not really a problem at the level of
getting "konnichiwa" into email.)  If you know the JIS or Unicode code
point, then you can tell X or GTK to blit a glyph on the screen.

Unfortunately, the story doesn't end there.  Just as your Mozilla
package may come as a .deb or .rpm, or gzipped or bzip2ed, tarred or
CPIOed, in files or network streams the character content may be
represented in several different ways.  For JIS the usual ones are
7-bit (sometimes called just JIS), 8-bit (EUC-JP), and Shift JIS.  In
the case of 7-bit you need to distinguish between ASCII and JIS ("#1"
might be "#1" or it might be a full-width "1"), so you add a
sprinkling of escape sequences, and get ISO-2022-JP.  For Unicode
things are much more pleasant: UTF-8 and the two flavors of UTF-16.
(There's also UTF-32, but absolutely nobody uses that "on the wire".)
It is these encodings (what Unicode calls "Transformation Formats")
that you need to know to decode a web page or an email, not the
character set.
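
To make the distinction concrete, here is a minimal sketch using
Python's standard codecs (the codec names are Python's spellings, not
anything Mozilla uses internally). It encodes "konnichiwa" five ways,
then shows that the two bytes "#1" decode differently depending on
which ISO-2022-JP escape sequence is in effect.

```python
# One string, one character repertoire, five different byte streams.
text = "こんにちは"  # "konnichiwa"

encodings = ["iso2022_jp", "euc_jp", "shift_jis", "utf-8", "utf-16-be"]
encoded = {name: text.encode(name) for name in encodings}

for name, data in encoded.items():
    print(f"{name:11} {len(data):2} bytes  {data!r}")

# ISO-2022-JP is 7-bit: escape sequences switch between ASCII and JIS.
TO_JIS = b"\x1b$B"    # ESC $ B selects JIS X 0208
TO_ASCII = b"\x1b(B"  # ESC ( B selects ASCII

# The same two bytes mean different things depending on the mode.
as_ascii = b"#1".decode("iso2022_jp")
as_jis = (TO_JIS + b"#1" + TO_ASCII).decode("iso2022_jp")
print(repr(as_ascii), repr(as_jis))

# EUC-JP carries the same JIS code points with the high bit set.
jis_payload = encoded["iso2022_jp"][len(TO_JIS):-len(TO_ASCII)]
euc_from_jis = bytes(b | 0x80 for b in jis_payload)
print(euc_from_jis == encoded["euc_jp"])
```

Every one of those byte strings decodes back to the same five
characters, which is why the font question and the encoding question
are separate.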

For ugly bitmap fonts, if you have access to the command line on a
system running X, run
"xlsfonts -fn '-*-*-*-*-*-*-*-*-*-*-*-*-JISX02*-*'".
The ones you want will contain JISX0208, JISX0212, or JISX0213 at the
right end.  Those that have JISX0201 contain JIS Roman and halfwidth
katakana, neither of which is useful.

There must be an equivalent tool for Xft/pango, but I don't know what
it is offhand.  But for Xft/pango you can (in theory) just ask if a
font supports Japanese.  (You can even query the exact repertoire as a
subset of Unicode.)
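
One candidate is fontconfig's fc-list, which accepts a pattern such as
":lang=ja". The sketch below wraps it from Python; the helper names
fc_list_command and fonts_for_language are hypothetical, and it
assumes fontconfig is installed (it returns an empty list otherwise).

```python
import shutil
import subprocess


def fc_list_command(lang="ja"):
    """Build an fc-list call that selects fonts claiming to cover `lang`."""
    # ":lang=ja" is a fontconfig pattern; "family" limits the output
    # to the family name element.
    return ["fc-list", f":lang={lang}", "family"]


def fonts_for_language(lang="ja"):
    """Return sorted family names, or [] if fc-list is missing or fails."""
    if shutil.which("fc-list") is None:
        return []
    result = subprocess.run(
        fc_list_command(lang),
        capture_output=True,
        encoding="utf-8",
        errors="replace",
        check=False,
    )
    if result.returncode != 0:
        return []
    names = {line.strip() for line in result.stdout.splitlines()}
    return sorted(name for name in names if name)


if __name__ == "__main__":
    for family in fonts_for_language("ja"):
        print(family)
```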

    sjs> am looking into whether ISO8059 is a subset of ISO2022

You mean ISO 8859?  It's conformant to ISO 2022, but not a subset.
For one thing, it's 8 bit and the version of ISO 2022 you're referring
to is undoubtedly the 7-bit code used in email.
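
A quick way to see the 8-bit versus 7-bit difference, sketched with
Python's standard codecs (the sample strings are arbitrary):

```python
# ISO 8859-1 puts accented letters in the upper half (bytes >= 0x80).
latin1 = "café".encode("iso8859_1")

# ISO-2022-JP, as used in email, stays entirely below 0x80.
jis = "日本語".encode("iso2022_jp")

high_bit_latin1 = [hex(b) for b in latin1 if b >= 0x80]
high_bit_jis = [hex(b) for b in jis if b >= 0x80]

print(latin1, high_bit_latin1)
print(jis, high_bit_jis)
```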



-- 
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.

