Mailing List Archive



[tlug] Emacs IME, locale, encodings, R, aarrrrgggghhhh!!!!



Stuart Luppescu writes:

 > Trying to get it to display the encodings of the vector elements gives
 > this:
 > 
 > > Encoding(house.names)
 > [1] "unknown" "unknown" "unknown" "unknown" "unknown"

Looks like Emacs is not passing the appropriate environment to the
inferior R, or not converting the strings to the appropriate encoding
when sending them to R.

In a fresh Emacs, try M-x setenv RET LC_CTYPE RET ja_JP.UTF-8 RET, and
M-: (setq default-process-coding-system '(utf-8 . utf-8)) RET
(the value is a cons of the decoding and encoding systems),
then run R and try the program.

About the output in the rxvt:

 > [1] "name"               "å¹³å±"               "ã©ããã®ãã³ã·ã§ã³"
 > [4] "湯河åãã³ã·ã§ã³"   "ç±æµ·ãã³ã·ã§ã³"

Something screwy is going on here.  The character count of each
Japanese string should be a multiple of 3 (each of these kana and kanji
is 3 octets in UTF-8), but that holds only for the third.
There are other issues; it's quite obvious that what rxvt displays is
not valid UTF-8.  In XEmacs using decode-coding-region on the first
gives "平å±" and on the fourth gives "湯河åãã³ã·ã§ã³", the others
come back unchanged (meaning that they are not valid UTF-8).  I don't
know much about rxvt (is urxvt now the default rxvt? does it normally
display Japanese correctly?).  Does

    echo 平屋 どこかのマンション 湯河原マンション 熱海マンション

do the right thing?
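
The octet counting and validity check can be reproduced outside Emacs.
This is a minimal Python sketch, assuming the four names are exactly the
ones in the echo line; the function names are made up for the example.

```python
# The four strings from the echo test.
names = ["平屋", "どこかのマンション", "湯河原マンション", "熱海マンション"]


def utf8_octets(s):
    """Number of octets in the UTF-8 encoding of s."""
    return len(s.encode("utf-8"))


def is_valid_utf8(raw):
    """True if the octets in raw decode as strict UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def as_latin1(s):
    """What s looks like when its UTF-8 octets are displayed as Latin-1."""
    return s.encode("utf-8").decode("latin-1")


if __name__ == "__main__":
    for name in names:
        octets = utf8_octets(name)
        # Every character here is 3 octets, so octets == 3 * len(name).
        print(len(name), octets, octets % 3 == 0)
        # One Latin-1 character per octet; some are invisible controls.
        print(repr(as_latin1(name)))
```

Note that Latin-1 code points 0x80 to 0x9F are control characters, which
a terminal or mail archive may silently drop, so counting the visible
characters can come out short of a multiple of 3.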

 > [1] "unknown" "UTF-8"   "UTF-8"   "UTF-8"   "UTF-8"  

This is expected (R reports "unknown" for pure-ASCII strings, because
the same octets are valid in almost any encoding).
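
A quick way to see why ASCII can't be pinned to one encoding, as a
Python sketch (the list of encodings is just a sample I picked, not
anything R consults):

```python
# Encodings a Japanese setup is likely to run into, plus plain ASCII.
encodings = ["ascii", "utf-8", "latin-1", "euc_jp", "shift_jis"]


def distinct_encodings(s):
    """Set of distinct octet sequences s produces across the encodings."""
    return {s.encode(enc) for enc in encodings}


if __name__ == "__main__":
    # "name" is pure ASCII: every encoding yields the same octets.
    print(distinct_encodings("name"))
    # A Japanese string differs between UTF-8 and EUC-JP.
    print("平屋".encode("utf-8"), "平屋".encode("euc_jp"))
```

Since the octets for "name" are identical in all of them, there is
nothing to tell the encodings apart.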

If that doesn't give you a hint (I suspect it's not enough), feel free
to send me the R program (zip it to be absolutely sure that your mail
client doesn't futz with it), and also the value of Emacs's
default-process-coding-system (that's the XEmacs name, which *should*
be the same in Emacs, but if not, try C-h a "process.*coding" and
tell me the names and values of every variable listed ;-).

