Mailing List Archive



[tlug] Emacs IME, locale, encodings, R, aarrrrgggghhhh!!!!



TLUGers, I beg your indulgence for a long, complicated set of issues.

I use anthy-ibus for Japanese input in general and also anthy for
Japanese input in emacs, but recently it has stopped working in emacs:
I type something in, do the conversion (変換), hit Enter, and the text
disappears. There no longer seems to be an anthy package in melpa; I
guess it was removed. So I found ddskk in Portage and installed it. It
works OK, but it's giving me problems in processing Japanese strings in R.

I have an R character vector, which I typed into emacs. (I'm running R
in an inferior process in emacs.) It looks like this:

house.names <- c("name", "平屋", "どこかのマンション", "湯河原マンション", "熱海マンション")

Typing `house.names` into the *R* buffer in emacs should print the
contents of the vector, but it gives me this:

> house.names
[1] "name"      "  "        "         " "         " "        " 

Trying to get it to display the encodings of the vector elements gives
this:

> Encoding(house.names)
[1] "unknown" "unknown" "unknown" "unknown" "unknown"

Even if I try to force the encoding to UTF-8 with enc2utf8() it still
tells me "unknown". 
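
A minimal sketch of checks that might help narrow this down, run in the
same *R* buffer. It uses only base R; the vector `x` and its Unicode
escapes (standing in for "平屋") are illustrative and not part of the
original program. It shows two things: Encoding() reports "unknown" for
pure-ASCII strings such as "name" regardless of locale, so that element
alone is not a symptom, and charToRaw() shows the bytes R actually holds,
independent of how emacs or the terminal renders them.

```r
# Illustrative vector: "\u5e73\u5c4b" is 平屋 written as Unicode escapes,
# so the source stays ASCII no matter how the editor sends it to R.
x <- c("name", "\u5e73\u5c4b")

# ASCII-only strings are never marked; strings built from \u escapes
# are stored as UTF-8 and marked as such.
enc <- Encoding(x)
print(enc)

# The bytes actually stored for the second element.
bytes <- charToRaw(x[2])
print(bytes)

# What locale this particular R process believes it is running in.
print(Sys.getlocale("LC_CTYPE"))
print(l10n_info()[["UTF-8"]])   # TRUE only when the session locale is UTF-8
```

If the last line prints FALSE inside emacs but TRUE under R CMD BATCH,
the inferior R process is probably not inheriting ja_JP.UTF-8, which
would be one possible explanation for the blank strings.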

Thinking this is perhaps an emacs problem, I tried running the R
program in BATCH mode at the command line, and got this:

[1] "LC_CTYPE=ja_JP.UTF-8;LC_NUMERIC=C;LC_TIME=ja_JP.UTF-8;LC_COLLATE=ja_JP.UTF-8;LC_MONETARY=ja_JP.UTF-8;LC_MESSAGES=en_US;LC_PAPER=en_US;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US;LC_IDENTIFICATION=C"
[1] "name"               "å¹³å±"               "ã©ããã®ãã³ã·ã§ã³"
[4] "湯河åãã³ã·ã§ã³"   "ç±æµ·ãã³ã·ã§ã³"    
[1] "unknown" "UTF-8"   "UTF-8"   "UTF-8"   "UTF-8"  

Running it in BATCH does print out the vector elements, and does get
the encoding set correctly (except for the first element), but the
elements are not displayed correctly (though at least they're not
blank). Later in the program, where the vector elements appear as
graph axis tick labels, they just print out as dots.
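
The garbled characters look consistent with UTF-8 bytes being displayed
as Latin-1: å, ¹ and ³ are the Latin-1 characters for the bytes E5 B9 B3,
which is the UTF-8 encoding of 平. The sketch below reproduces that
effect on purpose. It only illustrates the suspected mechanism, with
hypothetical variable names; it does not identify where in the chain
(R, the .Rout file, the pager, or the terminal) the misreading happens.

```r
# UTF-8 bytes for 平屋, written with Unicode escapes.
utf8_bytes <- charToRaw("\u5e73\u5c4b")

# Deliberately mislabel the same bytes as Latin-1.
misread <- rawToChar(utf8_bytes)
Encoding(misread) <- "latin1"

# Convert the mislabelled string to UTF-8 and inspect its code points:
# each original byte has become its own character.
codepoints <- utf8ToInt(enc2utf8(misread))
print(length(codepoints))            # one character per original byte
print(as.hexmode(codepoints[1:5]))   # e5 b9 b3 e5 b1, i.e. å ¹ ³ å ±
```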

The output printed above is as it appears in an rxvt-unicode terminal,
with TakaoPMincho specified as the font and ja_JP.UTF-8 as the
locale.

I realize this is a conglomeration of a lot of problems; if anyone can
help with any of them (I'm not expecting anyone to solve all of them),
I will appreciate it.

-- 
Stuart Luppescu
Chief Psychometrician (ret.)
UChicago Consortium on School Research
http://consortium.uchicago.edu



