Re: tlug: XEmacs and Kanji detection



--------------------------------------------------------
tlug note from "Stephen J. Turnbull" <turnbull@example.com>
--------------------------------------------------------
>>>>> "Craig" == Craig Oda <craig@example.com> writes:

    Craig> and unfortunately, it didn't work.  However, it made me
    Craig> think about what else could affect things.  Then, it hit
    Craig> me.  The "I'm so lame" thought again.  When I went through
    Craig> the disk wipe, I erased my .bash_profile and my LANG
    Craig> variable was unset.  I set LANG to Japanese with:
   
    Craig>     cow:~$ export LANG=ja_JP.EUC

I've never found locale to be useful for much of anything; kterm
hardcodes it, and GNU Mule ignores it.  So this didn't occur to me.
(Who looks lame now?)  Yeah, that seems to do the trick (dammit; it
means that getting m17n right is going to be really mendoukusai;
locale-related functions are dispersed all over xemacs, but not in
Mule source files :-( ), although I'm using an XIM-based XEmacs, not a
standalone henkan server version.

The locale processing also seems to be inconsistently implemented,
since "LANG=ja_JP.EUC" results in SJIS being correctly displayed (and
"LANG=ja_JP.SJIS" produces correct results for EUC).  Furthermore,
using `(setenv "LANG" "C")' or any other locale does not affect this,
so the locale of the XEmacs process seems to be fixed at invocation,
and only used to invoke Mule features.  (I suspected that XEmacs would
handle ISO-2022-JP when compiled without Mule, because it gets it
right when EUC and SJIS are mojibake.  But that is not the case.)
It's really not clear to me what all is going on here.  The Mule
features of XEmacs seem to be really hacked in.
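
The "fixed at invocation" behaviour is not peculiar to XEmacs.  In any
process the C library locale is read from the environment only when
setlocale() is called, so changing LANG afterwards does nothing by
itself.  A minimal sketch in Python (not XEmacs code; the function name
and the locale name ja_JP.eucJP are only illustrative, and the locale
need not be installed).  Whether XEmacs behaves this way for the same
reason is an assumption, not something confirmed from its source.

```python
import locale
import os

def locale_after_env_change(new_lang):
    """Return the LC_CTYPE setting before and after changing LANG."""
    # Calling setlocale() with no second argument only queries the setting.
    before = locale.setlocale(locale.LC_CTYPE)
    # Roughly what (setenv "LANG" ...) does inside a running editor.
    os.environ["LANG"] = new_lang
    # The process locale is unchanged until setlocale(category, "") runs again.
    after = locale.setlocale(locale.LC_CTYPE)
    return before, after

before, after = locale_after_env_change("ja_JP.eucJP")
print(before, after)
```

The new LANG value is still inherited by child processes started
afterwards; it is only the already-running process that keeps its old
locale.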

Re: w3.el.  I'm _not_ getting Japanese on the Mule homepage (the top
page is OK, since all the Japanese is in GIFs, but
http://www.etl.go.jp/Research/mulepage/MuleJapanese.html bombs).  I
don't seem to have any problems with the W3 multilingual example page
(http://www.ntt.co.jp/japan/note-on-JP/multi-example.html) except for
missing fonts.

It seems to choke on the (standard-compliant) MIME content-type
header.  If that (`Content-type: text/html; charset=iso-2022-jp') is
present, the non-ASCII characters turn into mojibake :-(.  (I've
confirmed this by prepending that to the header of multi-example.html
and getting mojibake, and taking out the charset reference in the
Content-type of MuleJapanese.html and getting Japanese.)  Sigh.
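
For comparison, a client that honours the charset parameter would do
something like the following.  This is a sketch in Python, not w3.el
code; the helper names and the ISO-8859-1 fallback are assumptions made
for the example.

```python
from email.message import Message

def charset_of(content_type, default="iso-8859-1"):
    """Extract the charset parameter from a Content-type header value."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns None when no charset parameter is given.
    return msg.get_content_charset() or default

def decode_body(raw, content_type):
    """Decode the raw bytes of a page according to its Content-type header."""
    return raw.decode(charset_of(content_type), errors="replace")

raw = "日本語".encode("iso2022_jp")      # bytes as a server would send them
good = decode_body(raw, "text/html; charset=iso-2022-jp")
bad = decode_body(raw, "text/html")      # no charset: falls back, gives mojibake
print(good)
print(repr(bad))
```

With the charset present the text round-trips; without it the escape
sequences are read as Latin-1 and come out as garbage.  That is the
opposite of the behaviour described above, where the header is what
breaks the display.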

w3.el doesn't do JPEGs.  Pity, that.

Craig, I have a couple of questions.

Do you get the right charsets when switching from an SJIS server to an
ISO-2022 server in w3.el?  Do you need to do anything else (in
particular I'm thinking of the liblocale dodge that works with
Netscrap)?  Do you get messages about not being able to set locale,
using C/POSIX instead?  (I do, unless I use LD_PRELOAD=liblocale.so,
since I have XIM compiled in, and any keyboard input gives instant
crash.  At least it's for reasons I already understand.  I think; if I 
do understand, you probably don't get those messages or need to use
liblocale.so with XEmacs/Mule/Canna.)

For those not using XIM, the following probably does not apply.

For those of you who like to do advanced stuff with your Asian
languages (I'm thinking in particular of Jim Breen and Dennis
McMurchy) there are a couple of gotchas in my environment (XFree86
3.2A) that may or may not apply to yours.  I don't even know if
they'll cause problems (I don't have the fonts &c installed to test),
but I thought I'd mention them.

First, if you use the liblocale.so workaround to substitute X locale
functions for libc locale functions, the locale databases in
/usr/X11R6/lib/X11/locale/ja* specify the character set as JIS X
0208-1983, but allow JIS X 0208-1990 fonts to be substituted.  I don't
think this matters much (2 characters, right?), but that's the first
thing I noticed.  Second (more important) all of the locale
information pertaining to JIS X 0212-1990 is commented out.  If you
have JIS X 0212 fonts, I think you can probably just uncomment that.

I don't know enough about Chinese to make any comments about support
for it (the multi-example.html page displays the Chinese I have fonts
for, namely GB, anyway), but you might find stuff in locale/zh* that
helps solve any problems.

Lists of all defined locales and aliases are in
/usr/X11R6/lib/X11/locale/locale.{dir,alias}.
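
A quick way to look up an alias is to read the file into a table.  The
sketch below is in Python and assumes the usual layout of locale.alias:
one "alias  full-name" pair per line, separated by whitespace, with "#"
starting a comment.  The function name and the sample lines are
hypothetical, not copied from an XFree86 3.2A installation, so check
them against the real file.

```python
def parse_locale_alias(text):
    """Build a dict mapping locale aliases to full locale names."""
    aliases = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and padding
        if not line:
            continue
        parts = line.split()
        if len(parts) != 2:                    # skip anything unexpected
            continue
        alias, target = parts
        # Some versions of the file write the alias with a trailing colon.
        aliases[alias.rstrip(":")] = target
    return aliases

# Hypothetical excerpt, for illustration only.
sample = """\
# alias        full locale name
ja_JP.EUC      ja_JP.eucJP
ja_JP.ujis:    ja_JP.eucJP
"""
aliases = parse_locale_alias(sample)
print(aliases.get("ja_JP.EUC"))
```

To use it on a real system, pass in the contents of
/usr/X11R6/lib/X11/locale/locale.alias instead of the sample string.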

    Craig> and got it working.  Now, I'm happily in XEmacs and using
    Craig> Japanese.  I just have to spend some time setting my
    Craig> Xresources to my liking and I'll be all set.  I'm in mew
    Craig> right now and things look good.

    Craig>   ありがとうございました。

どういたしまして。

"It's getting better all the time....  Better, better, better!"

-- 
                            Stephen J. Turnbull
Institute of Policy and Planning Sciences                    Yaseppochi-Gumi
University of Tsukuba                      http://turnbull.sk.tsukuba.ac.jp/
Tel: +81 (298) 53-5091;  Fax: 55-3849              turnbull@example.com

