Mailing List Archive


[tlug] Re: UTF-8 Terminal Emulators?



Ryan Shaw <ryan.shaw@example.com> writes:

> Mike Fabian wrote:
>
>> Ryan Shaw <ryan.shaw@example.com> writes:
>> 
>> > I've tried using "xterm -u8" but I get empty boxes when I try to
>> > print UTF-8 strings.
>> 
>> You need to specify suitable fonts. Which fonts are suitable depends
>> on what parts of Unicode you want to display.
>
> I see. That explains it.
>
>> LC_CTYPE=en_US.UTF-8 works just as well if you only want to display
>> Japanese but don't need to input it.
>
> Well, I usually have my LC_CTYPE set to ja_JP.EUC-JP because usually
> when I work with Japanese, I work in EUC.

But that won't work well in a UTF-8 terminal emulator.

If you do this, for example:

    LANG=ja_JP.UTF-8 LC_CTYPE=ja_JP.eucJP xterm -u8 -fn -misc-fixed-medium-r-normal--18-120-100-100-c-90-iso10646-1 -fw -misc-fixed-medium-r-normal-ja-18-120-100-100-c-180-iso10646-1

you have the following locale settings in your UTF-8 xterm:

    mfabian@example.com:~$ locale
    LANG=ja_JP.UTF-8
    LC_CTYPE=ja_JP.eucJP
    LC_NUMERIC="ja_JP.UTF-8"
    LC_TIME="ja_JP.UTF-8"
    LC_COLLATE=POSIX
    LC_MONETARY="ja_JP.UTF-8"
    LC_MESSAGES="ja_JP.UTF-8"
    LC_PAPER="ja_JP.UTF-8"
    LC_NAME="ja_JP.UTF-8"
    LC_ADDRESS="ja_JP.UTF-8"
    LC_TELEPHONE="ja_JP.UTF-8"
    LC_MEASUREMENT="ja_JP.UTF-8"
    LC_IDENTIFICATION="ja_JP.UTF-8"
    LC_ALL=
    mfabian@example.com:~$ 

Now 'cat some-utf-8-encoded-file' will work correctly.  But 'less
some-utf-8-encoded-file' won't work correctly by default.  less checks
LC_CTYPE for the default charset, and because this is different from
the real capabilities of the terminal, you will see mojibake.

    LESSCHARSET=utf-8 less test-texts/yuki.utf-8

works, but setting LC_CTYPE incorrectly defeats the automatic detection.
All programs that write messages to stdout/stderr using gettext will
display mojibake as well, for example:

    mfabian@example.com:~$ ls non-existant-file
    ls: non-existant-file: [... mojibake ...]

because they assume that the terminal handles euc-jp if LC_CTYPE is
set like that. But it doesn't do that if you force UTF-8 with the '-u8'
option.
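
The mismatch described above can be checked from the shell before
starting a program. The following is only a minimal sketch: the function
names are made up for illustration, and real programs such as less or
gettext-based tools typically ask the C library for the codeset rather
than parsing the variable. It follows the POSIX precedence for the
character type category (LC_ALL, then LC_CTYPE, then LANG).

```shell
#!/bin/sh
# Sketch with hypothetical helper names; not how less or gettext do it.

# Print the locale name that decides the character set.
# POSIX precedence: LC_ALL, then LC_CTYPE, then LANG (empty counts as unset).
effective_ctype() {
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "${LC_CTYPE:-}" ]; then
        printf '%s\n' "$LC_CTYPE"
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"
    else
        printf 'POSIX\n'
    fi
}

# Extract the codeset from a name like ja_JP.eucJP or de_DE.UTF-8@euro.
# Prints an empty line if the name carries no codeset.
codeset_of() {
    case $1 in
        *.*) cs=${1#*.}; printf '%s\n' "${cs%%@*}" ;;
        *)   printf '\n' ;;
    esac
}

# Succeed only if the effective locale name says UTF-8.
is_utf8_ctype() {
    case $(codeset_of "$(effective_ctype)") in
        [Uu][Tt][Ff]-8|[Uu][Tt][Ff]8) return 0 ;;
        *) return 1 ;;
    esac
}

if is_utf8_ctype; then
    echo "LC_CTYPE matches a UTF-8 terminal"
else
    echo "LC_CTYPE is not UTF-8: expect mojibake in a UTF-8 terminal"
fi
```

With the settings shown earlier (LANG=ja_JP.UTF-8, LC_CTYPE=ja_JP.eucJP),
effective_ctype yields ja_JP.eucJP, so the check reports a mismatch even
though LANG names a UTF-8 locale.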

> Fortunately, when I tried the -fn and -fw options as you recommended
> with the -u8 option, I was able to view Japanese UTF-8 strings without 
> having to change my LC_CTYPE.

Without the '-u8' option, xterm will use UTF-8 mode in UTF-8 locales
and will not use UTF-8 mode in other locales. '-u8' forces UTF-8 mode
even in locales which don't use UTF-8, like ja_JP.eucJP.

But why would you want to do that? It just gives you a lot of mojibake
in the terminal.

The intention of the '-u8' option is to be able to use xterm in UTF-8
mode even on systems which don't have UTF-8 locales. On systems which
do support UTF-8 locales, the '-u8' option is deprecated and one
should set LC_CTYPE correctly instead.
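
As a rough illustration of setting LC_CTYPE instead of using '-u8', the
sketch below derives a UTF-8 locale name from an existing one and prints
the command that would start xterm with it. The helper name
utf8_variant_of is hypothetical, and the sketch assumes the resulting
locale is actually installed on the system, which it does not verify.

```shell
#!/bin/sh
# Sketch: build a UTF-8 locale name so the terminal can be started with
# a matching LC_CTYPE rather than being forced into UTF-8 mode.
# utf8_variant_of is a made-up helper name.
utf8_variant_of() {
    base=${1%%[.@]*}              # ja_JP.eucJP -> ja_JP, de_DE@euro -> de_DE
    printf '%s.UTF-8\n' "$base"
}

# Print the command instead of running it, so it can be inspected first.
# The fallback ja_JP.eucJP is only an example value.
target=$(utf8_variant_of "${LC_CTYPE:-ja_JP.eucJP}")
echo "LC_CTYPE=$target xterm"
```

Whether the derived name exists can be checked with 'locale -a' before
using it. Suitable fonts still have to be given with -fn and -fw as
before.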

See also:

    http://mail.nl.linux.org/linux-utf8/2001-05/msg00063.html

-- 
Mike Fabian   <mfabian@example.com>   http://www.suse.de/~mfabian
睡眠不足はいい仕事の敵だ。
