Mailing List Archive
Re: tlug: XEmacs and Kanji detection
- To: tlug@example.com
- Subject: Re: tlug: XEmacs and Kanji detection
- From: "Stephen J. Turnbull" <turnbull@example.com>
- Date: Wed, 04 Jun 1997 23:25:21 +0900
- In-reply-to: Your message of "03 Jun 1997 18:23:01 -0400." <m267vvnx6i.fsf@example.com>
- Reply-To: tlug@example.com
- Sender: owner-tlug
--------------------------------------------------------
tlug note from "Stephen J. Turnbull" <turnbull@example.com>
--------------------------------------------------------

>>>>> "Steve" == Steve Dunham <dunham@example.com> writes:

    Steve> I believe the locale is passed to the input method. (I
    Steve> suspect this is why the Solaris input method didn't work
    Steve> for me.)

Dunno about Solaris, but under Linux, XIM will definitely crash your
XEmacs+Mule if XMODIFIERS sets an input method and that input method
fails to open for any reason.

However, locale is used in a bunch of places in XEmacs. For starters,
`fgrep -l locale /var/project/xemacs-20.2/src/*.[ch]' lists 25 files.

    >> The locale processing also seems to be inconsistently
    >> implemented, since "LANG=ja_JP.EUC" results in SJIS being
    >> correctly displayed (and "LANG=ja_JP.SJIS" produces correct
    >> results for EUC).

    Steve> Umm, what do you want it to do? Disable EUC support?

Well, there are times when I'd like to be able to do something like
that, since there are ambiguous files. I have yet to figure out how
to get a buffer to be reread in various different encodings
conveniently. One can set `buffer-file-coding-system' and relatives,
but this is pretty clumsy.

    Steve> Currently it only uses the LANG variable to determine the
    Steve> "default" language for startup.

Thanks for confirming that.

    Steve> XEmacs, by default, seems to have iso-2022 filters in the
    Steve> loading process. The SJIS detection seems to be added when
    Steve> you load the Japanese language stuff. This happens when
    Steve> LANG=ja or you pick the Japanese menu item.

To be picky, iso-2022 seems to be enabled by Mule; it doesn't work if
you compile XEmacs without Mule. (I had some spare cycles while I was
lecturing the other day....)
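
As an aside on the "ambiguous files" point: a rough Python sketch (not what XEmacs or Mule actually does internally; `candidate_encodings` and `ESCAPES` are made-up names) shows why the escape sequences make ISO-2022-JP easy to spot, while the same 8-bit bytes can be legal in both EUC-JP and SJIS. It assumes that strict decoding with Python's stock `iso2022_jp`, `euc_jp` and `shift_jis` codecs is a fair stand-in for "recognizable".

```python
# Illustrative sketch only: guess which Japanese encodings a byte string
# could plausibly be in. Names here are hypothetical, not an Emacs API.

# ISO-2022-JP announces JIS X 0208 with an escape sequence:
# ESC $ B (1983 edition) or ESC $ @ (1978 edition).
ESCAPES = (b"\x1b$B", b"\x1b$@")


def candidate_encodings(data: bytes) -> list[str]:
    """Return the encodings under which `data` decodes without error."""
    # An explicit escape sequence settles the question immediately.
    if any(esc in data for esc in ESCAPES):
        return ["iso2022_jp"]

    # Without escapes, all we can do is try each 8-bit encoding in turn.
    found = []
    for name in ("euc_jp", "shift_jis"):
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        found.append(name)
    return found


if __name__ == "__main__":
    text = "\u3042\u3044\u3046"  # hiragana "a i u"
    for enc in ("iso2022_jp", "euc_jp", "shift_jis"):
        print(enc, "->", candidate_encodings(text.encode(enc)))
```

Hiragana encoded as EUC-JP comes back as a candidate for both EUC-JP and SJIS, because every one of those bytes also happens to be a legal SJIS half-width katakana code. That is the sense in which a detector has to guess or be told.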

I would guess that SJIS can _only_ be recognized if you assume
Japanese; I'm not sure why EUC doesn't get recognized (my guess is
that most EUC files do not conform to the ISO-2022 standard of
starting out in ASCII (ISO-Latin-1?) and shifting into Japanese; if
they did, they'd probably be recognized).

    Steve> (You can probably do this in .emacs too, you should be able
    Steve> to select Japanese and "Save Options", but I haven't tried
    Steve> it.)

This works; Jason Molenda mentioned it.

    Steve> Anyways, you can change the "default" language in the menu:
    Steve> "Options/Language Environment".

But this is too late for XIM (as I mentioned earlier).

    Steve> File encoding can be set on a per buffer basis using C-x
    Steve> C-n f (type C-x C-n C-h for a list of bindings).

Unfortunately, this doesn't work. :-) You need to use
`set-buffer-file-coding-for-read' which isn't bound. A trivial
complaint since one can bind it oneself.

    >> Furthermore, using `(setenv "LANG" "C")' or any other locale
    >> does not affect this, so the locale of the XEmacs process seems
    >> to be fixed at invocation, and only used to invoke Mule
    >> features.

    Steve> Ahh, digging around, I found yet another reason for this:
    Steve> in the lisp directory, there is a
    Steve> locale/ja/locale-start.el. Apparently, there is no
    Steve> directory for other locales. (XEmacs desperately needs
    Steve> testers for non-Japanese MULE stuff.)

Do you know where this is used? All this file does is set up a
localized version of the opening "splash frame", and a localized
version of the usage message.

    Steve> Why do you get a feeling that the mule features are hacked
    Steve> in? It feels fairly clean to me. There are probably some
    Steve> necessary differences from the gnu emacs MULE, because of

Yes, and all the ones I've seen so far I like ;-)

    Steve> design differences in the core editor. But as far as I can
    Steve> tell, MULE is nicely integrated into the editor.

Aaaah, maybe I misspoke (I'm not sure about that yet).
Maybe it's the XIM that's not properly integrated into Mule. But Mule
(from the little I've read of the code) seems not to be designed to
integrate external henkan servers into its multilingual features. As
far as I can tell, you either use a server or you don't, it's more or
less fixed at startup (for XIM, at compilation in the cases of native
Canna and Wnn support), and it is not selectable by menu. This is
maybe a Mule problem more than an XEmacs problem.

    >> It seems to choke on the (standard-compliant) MIME content-type
    >> header. If that (`Content-type: text/html;
    >> charset=iso-2022-jp') is present, the non-ASCII characters turn
    >> into mojibake :-(.

    Steve> Sounds like a bug in w3.el... It seems to be ignoring the
    Steve> charset specification in the HTTP header. Does it work in
    Steve> MULE?

Dunno for sure; can't get the 3.0.x version of w3.el to work with GNU
Emacs/Mule (I think this is due to my config stepping on it; but I'm
not sure where). But w3-2.2.26 shows the same bug with GNU
Emacs/Mule. And it's worse than ignoring; it only gets it wrong when
the charset spec is present. Anyway, I've reported the bug.

    >> Do you get the right charsets when switching from an SJIS
    >> server to an ISO-2022 server in w3.el? Do you need to do
    >> anything else (in particular I'm thinking of the liblocale
    >> dodge that works with Netscrap)? Do you get messages about not
    >> being able to set locale, using C/POSIX instead?

    Steve> This is the funny thing: I get those messages from XEmacs
    Steve> for "LANG=de", perl does the same thing. But I don't get
    Steve> any messages from Netscape (which displays correctly

Netscape probably does not use libc localization, it probably uses
Motif localization. Unless you mean it displays localized messages to
std{err,out}.

    Steve> localized text) or various GNU utilities (again with
    Steve> varying degrees of localized text).

But that's not the explanation for GNU. GNU probably is just more
robust for some reason. Ah, yes.
Looking at /usr/share/locale/*, you'll see that locale `de' doesn't
have full support, evidently because the German countries don't share
currency and date conventions and so on. So all that is there is the
LC_MESSAGES subdirectory. Evidently perl and XEmacs either have a use
for money :-) and GNU utilities don't, or they are less careful about
checking all that kind of stuff, while the GNU utilities only request
the locale functions they expect to need. I'll have to try this out.
Yup, perl only complains about character sorting and typing, and
doesn't complain if you specify `LANG=de_DE'.

    Steve> I fully expect that message for LANG=ja, since I don't have
    Steve> any "ja" locale in /usr/share/locale. (This needs to be
    Steve> fixed.)

Do you know if anyone is working on it, or where to find out? I
wonder if the i18n features properly support LC_COLLATE and LC_CTYPE
for wc/mb character sets ... I suppose they must.

    Steve> This is why the liblocale.so is needed. Apparently,
    Steve> it calls some internal X functions to trick it into
    Steve> thinking it has a wide character locale on systems lacking
    Steve> the "ja" locale (English Solaris ships without it).

    Steve> I believe the X locale functions will work in conjunction
    Steve> with libc locale

In one sense, yes. The models are different. X puts everything into
one big text file, and is very concerned about character sets and
encodings, while ignoring money (I can't :) and messages (ditto).
Linux's implementation of C/POSIX splits things into what are
apparently compiled objects for each category, and doesn't seem to
care about character sets (due to Unicode?). So evidently the models
are pretty well orthogonal to each other, which is the impression I
get from the O'Reilly "R5 Update" volume, which says that X i18n is
built on ANSI C i18n. Apparently the liblocale.so hack works only
because most programs never actually call any locale functions except
for setlocale() (which is as it should be), and neither does Linux....
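
The per-category behaviour described above (a locale such as `de' that only partly exists, with some programs complaining and others not) can be probed with a small Python sketch. It assumes Python's `locale.setlocale` is a thin wrapper over the C library call, and `probe_locale` is a hypothetical helper name; what it reports depends entirely on which locales are installed on the machine.

```python
# Illustrative sketch only: ask the C library, one category at a time,
# whether it will accept a given locale name.
import locale

CATEGORY_NAMES = (
    "LC_CTYPE",
    "LC_COLLATE",
    "LC_MONETARY",
    "LC_NUMERIC",
    "LC_TIME",
    "LC_MESSAGES",
)


def probe_locale(name):
    """Map each available LC_* category to True if setlocale() accepts `name`."""
    results = {}
    for label in CATEGORY_NAMES:
        category = getattr(locale, label, None)
        if category is None:
            # Some platforms do not define every category (LC_MESSAGES, say).
            continue
        saved = locale.setlocale(category)  # no second argument: query only
        try:
            locale.setlocale(category, name)
            results[label] = True
        except locale.Error:
            # The C library refused this name for this category.
            results[label] = False
        finally:
            locale.setlocale(category, saved)  # restore the previous setting
    return results


if __name__ == "__main__":
    for name in ("C", "de", "de_DE", "ja"):
        print(name, probe_locale(name))
```

A program that only asks for the categories it needs would correspond to checking one entry of this table, while one that sets everything at once fails if any single category is missing.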

    >> Lists of all defined locales and aliases are in
    >> /usr/X11R6/lib/X11/locale/locale.{dir,alias}.

    Steve> The X locale information is there. The libc locales are in
    Steve> /usr/share/locale (on Debian).

    Steve> スチェヴ・ダーナム dunham@example.com

-----------------------------------------------------------------
a word from the sponsor will appear below
-----------------------------------------------------------------
The TLUG mailing list is proudly sponsored by TWICS - Japan's First
Public-Access Internet System. Now offering 20,000 yen/year flat
rate Internet access with no time charges. Full line of corporate
Internet and intranet products are available. info@example.com
Tel: 03-3351-5977 Fax: 03-3353-6096
- References:
- Re: tlug: XEmacs and Kanji detection
- From: Steve Dunham <dunham@example.com>