Mailing List Archive



Re: [tlug] Emacs IME, locale, encodings, R, aarrrrgggghhhh!!!!



On 2021-03-09 10:47 +0900 (Tue), Stuart Luppescu wrote:

> On Mon, 2021-03-08 at 16:58 +0900, Stephen J. Turnbull wrote:
> > I don't know much about rxvt (is urxvt now the default rxvt? does it
> > normally display Japanese correctly?)  
> 
> I don't know. My regular terminal (wterm) I know does very poorly with
> non-latin characters, so I installed rxvt-unicode (urxvt).

Yes, urxvt has succeeded rxvt on most systems that I'm aware of, and does
handle UTF-8 well. (urxvt has been my primary terminal for many years now,
and I'm pleased with it. I use Vim, though, and in particular I don't nest
window systems. I've never quite understood the appeal of Bash in an Emacs
window in a tmux window in an X11 window. :-P)

On 2021-03-09 15:37 +0900 (Tue), Stephen J. Turnbull wrote:

> It appears to me that the program that Emacs is saving is properly
> encoded in UTF-8, although it's very hard to be sure when data written by
> emacs is being massaged by R, rxvt, and email in transmission.

`xxd` or another hexdump program may be handy to confirm that file contents
and output are indeed UTF-8.
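
If `xxd` isn't to hand, the same check can be sketched in a few lines of Python. This is only an illustration, not something from the thread: the helper names `hexdump` and `is_utf8` are made up here, and the output format only loosely imitates `xxd`.

```python
def hexdump(data: bytes, width: int = 16) -> str:
    """Return a simple xxd-like dump: an offset, then the bytes in hex."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{byte:02x}" for byte in chunk)
        lines.append(f"{offset:08x}: {hex_part}")
    return "\n".join(lines)


def is_utf8(data: bytes) -> bool:
    """True if the bytes decode cleanly as UTF-8, False otherwise."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Each of these two kanji takes three bytes in UTF-8.
sample = "世界".encode("utf-8")
print(hexdump(sample))
print(is_utf8(sample))
```

To check a real file, read it in binary mode and pass the bytes to the same two functions. Note that a clean decode only shows the bytes are valid UTF-8; it doesn't prove they're the characters you meant.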

Perhaps one way of debugging this would be to write "世界おはよう" programs
(that just print that string) in Bash (just an `echo` command) and R,
confirm with a hexdump tool that the contents of those files are UTF-8, run
them from the same command line you use to run Emacs to confirm that those
work, and then try running them from within Emacs. You can add code to
print out the values of various environment variables as well to see if
those are being passed through correctly. And these programs, I suspect,
you could share without fear of someone dissing your code. :-)
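
As a rough sketch of such a test program, here is the idea in Python rather than Bash or R (a substitution made purely for illustration). The choice of `LANG`, `LC_ALL` and `LC_CTYPE` as the variables worth printing is an assumption, and the names `emit` and `report` are hypothetical.

```python
import locale
import os
import sys

GREETING = "世界おはよう"
# Assumed to be the variables of interest; add others as needed.
LOCALE_VARS = ("LANG", "LC_ALL", "LC_CTYPE")


def emit(text, stream=None):
    """Write text as raw UTF-8 bytes, bypassing Python's locale-based encoding."""
    stream = stream or sys.stdout
    raw = text.encode("utf-8") + b"\n"
    buffer = getattr(stream, "buffer", None)
    if buffer is not None:
        buffer.write(raw)
        buffer.flush()
    else:
        # Text-only streams (e.g. io.StringIO) have no byte buffer.
        stream.write(raw.decode("utf-8"))


def report(env=os.environ):
    """Return NAME=value lines for the locale-related environment variables."""
    return [f"{name}={env.get(name, '<unset>')}" for name in LOCALE_VARS]


if __name__ == "__main__":
    emit(GREETING)
    for line in report():
        emit(line)
    emit(f"stdout encoding: {sys.stdout.encoding}")
    emit(f"preferred encoding: {locale.getpreferredencoding(False)}")
```

Running this once from the shell and once from within Emacs, then comparing the two outputs, should show whether the environment is being passed through unchanged.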

> Only the encoding, UTF-8.  That's why programmers should love Unicode --
> it should make text encoding issues moot (and will, *some*day ;-).

Well, yes, it does, once you've dealt with UTF-8 vs. UTF-16 vs. unencoded
UCS-2, big- vs. little-endian UTF-16/UCS-2, the presence or absence of byte
order marks....
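
To make that concrete, here is a small Python sketch that encodes one character several ways and looks for a leading byte order mark. The helper names `encode_variants` and `leading_bom` are invented for this example. The codec names and `codecs.BOM_*` constants are from Python's standard library.

```python
import codecs


def encode_variants(text):
    """Encode the same text under several Unicode encoding schemes."""
    names = ("utf-8", "utf-8-sig", "utf-16-be", "utf-16-le", "utf-16")
    return {name: text.encode(name) for name in names}


def leading_bom(data):
    """Name the encoding implied by a leading BOM, or None if there is none.

    Only UTF-8 and UTF-16 are checked; a UTF-32-LE BOM also begins with
    ff fe and would be misreported here as utf-16-le.
    """
    boms = (
        ("utf-8", codecs.BOM_UTF8),
        ("utf-16-be", codecs.BOM_UTF16_BE),
        ("utf-16-le", codecs.BOM_UTF16_LE),
    )
    for name, bom in boms:
        if data.startswith(bom):
            return name
    return None


for name, raw in encode_variants("世").items():
    print(f"{name:10} {raw.hex(' ')}  BOM: {leading_bom(raw)}")
```

The plain "utf-16" codec writes a BOM and uses the machine's native byte order, so that one line of output can differ between systems. The explicit "-be" and "-le" codecs write no BOM at all.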

cjs
-- 
Curt J. Sampson      <cjs@example.com>      +81 90 7737 2974

To iterate is human, to recurse divine.
    - L Peter Deutsch

