Mailing List Archive


Re: [tlug] Emacs IME, locale, encodings, R, aarrrrgggghhhh!!!!



On Mon, 2021-03-08 at 16:58 +0900, Stephen J. Turnbull wrote:
> Looks like Emacs is not passing the appropriate environment to the 
> inferior R, or not converting the strings to the appropriate encoding
> when sending them to R.
> 
> In a fresh Emacs, try M-x setenv RET LC_CTYPE RET ja_JP.UTF-8 RET, and
> M-: (setq default-process-coding-system 'utf-8) RET,
> 
> then run R and try the program.

Didn't change anything.
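
For reference, the idea behind the setenv suggestion (a child process only sees the locale variables its parent hands it) can be sketched outside Emacs. This is a minimal Python illustration, not what Emacs or ESS actually does; the helper name child_sees is made up for the example, and a POSIX sh on the PATH is assumed.

```python
import os
import subprocess

def child_sees(var, value):
    """Start a child shell with `var` set to `value` and report what the child sees.

    Hypothetical helper for illustration; assumes a POSIX `sh` is available.
    """
    env = dict(os.environ)   # start from the parent's environment
    env[var] = value         # override one variable for the child only
    result = subprocess.run(
        ["sh", "-c", 'printf "%s" "$' + var + '"'],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout

# The parent's own environment is left untouched; only the child gets the value.
print(child_sees("LC_CTYPE", "ja_JP.UTF-8"))
```

This only shows that the variable reaches the child. Whether the ja_JP.UTF-8 locale is actually installed, and whether the child honors it, are separate questions.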

> About the output in the rxvt:
> 
>  > [1] "name"               "å¹³å±"               "ã©ããã®ãã³ã·ã§ã³"
>  > [4] "湯河åãã³ã·ã§ã³"   "ç±æµ·ãã³ã·ã§ã³"
> 
> Something screwy is going on here.  Counting the characters in each 
> Japanese string does not give a multiple of 3 except for the third,
> which it should (all Japanese characters are 3 octets in UTF-8).
> There are other issues; it's quite obvious that what rxvt displays is
> not valid UTF-8.  In XEmacs using decode-coding-region on the first
> gives "平å±" and on the fourth gives "湯河åãã³ã·ã§ã³", the others
> come back unchanged (meaning that they are not valid UTF-8).  I don't 
> know much about rxvt (is urxvt now the default rxvt? does it normally
> display Japanese correctly?)  

I don't know. My regular terminal (wterm) I know does very poorly with
non-latin characters, so I installed rxvt-unicode (urxvt). It did not
seem any different from wterm. :shrug:
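
For what it's worth, the octet arithmetic in the quoted paragraph can be checked with a short Python sketch. It also tries one possible reading of the garbled display, which is only an assumption and not something confirmed in this thread: that the UTF-8 octets were shown as Latin-1, with octets in the 0x80-0x9F range (invisible C1 control characters) not showing up at all. The function names are made up for the example.

```python
# The four strings under discussion (the same ones used in the echo test).
STRINGS = ["平屋", "どこかのマンション", "湯河原マンション", "熱海マンション"]

def utf8_octets(s):
    """Number of octets in the UTF-8 encoding of s."""
    return len(s.encode("utf-8"))

def misread_as_latin1(s, drop_c1=True):
    """Decode the UTF-8 octets of s as if they were Latin-1.

    With drop_c1=True, octets 0x80-0x9F are dropped first, because in
    Latin-1 they are C1 control characters with no visible glyph.
    This models an assumed misreading, not a known rxvt behavior.
    """
    raw = s.encode("utf-8")
    if drop_c1:
        raw = bytes(b for b in raw if not 0x80 <= b <= 0x9F)
    return raw.decode("latin-1")

for s in STRINGS:
    # characters, UTF-8 octets, visible characters after the assumed misreading
    print(len(s), utf8_octets(s), len(misread_as_latin1(s)))
```

Under that assumption the first string comes out as 5 visible characters rather than 6, which would be one way to end up with counts that are not multiples of 3 even though the underlying octets are valid UTF-8.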

> Does
>     echo 平屋 どこかのマンション 湯河原マンション 熱海マンション
> do the right thing?

Nope. When I pasted that in, I got
 echo ?? ????????? ???????? ???????  
?? env_comp~ env_comp ???????

A string of question marks just about sums up my feelings about this.
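
Incidentally, the pattern of question marks (2, 9, 8 and 7) matches the character counts of the four strings, one "?" per character rather than per octet. A minimal Python sketch reproduces the same pattern by forcing the text into ASCII with a placeholder for everything that does not fit. That the terminal or shell is doing something similar because it is running in a non-UTF-8 locale is an assumption on my part, and the function name is made up for the example.

```python
LINE = "平屋 どこかのマンション 湯河原マンション 熱海マンション"

def to_ascii_with_placeholders(text):
    """Replace every character that ASCII cannot represent with '?'."""
    return text.encode("ascii", errors="replace").decode("ascii")

print(to_ascii_with_placeholders(LINE))
```

If that is what is happening, the characters are being lost on the way in, before echo or R ever sees them.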

>  > [1] "unknown" "UTF-8"   "UTF-8"   "UTF-8"   "UTF-8"  
>
> This is expected (R uses "unknown" for ASCII because it could be
> almost anything).

Ah, that makes sense.

> If that doesn't give you a hint (I suspect it's not enough), feel free 
> to send me the R program (zip it to be absolutely sure that your mail
> client doesn't futz with it), and also the value of Emacs's 
> default-process-coding-system

This says utf-8.

>  (that's the XEmacs name which *should* 
> be the same as Emacs, but if not, try C-H a "process.*coding" and
> tell me the names and values of every variable listed ;-).

C-H a didn't do anything besides move the cursor back.

I would send you the program but it's a dumb little thing, very poorly
programmed, that I wrote to help my wife understand the costs of real
estate we're thinking of buying. It's not important, and TBH I'd be
embarrassed to show it to other people. But thanks for offering to look
at it.

-- 
Stuart Luppescu
Chief Psychometrician (ret.)
UChicago Consortium on School Research
http://consortium.uchicago.edu



