tlug.jp Mailing List Archive
Re: [tlug] using eucjp on Linux
- Date: Tue, 24 Dec 2013 08:36:38 +0100
- From: Christian Horn <chorn@example.com>
- Subject: Re: [tlug] using eucjp on Linux
- References: <20131223213339.GA29849@fluxcoil.net> <CAKXLc7dotHh9TtDXK+tsV1HygUbM-cy6W4ur0DPms8QSd1ccvw@mail.gmail.com>
- User-agent: Mutt/1.5.20 (2009-06-14)
On Tue, Dec 24, 2013 at 01:22:08PM +0900, Kalin KOZHUHAROV wrote:
> On Tue, Dec 24, 2013 at 6:33 AM, Christian Horn <chorn@example.com> wrote:
> > I try to get a better understanding on encodings and am puzzled
> > about the following.
> >
> > In a utf8 xterm date and outputting utf8 files works fine, also date:
> >
> > [chris@hive ~]$ echo $LC_ALL
> > ja_JP.utf8
> > [chris@hive ~]$ cat test_utf8
> > 日本語
> > [chris@hive ~]$ date
> > 2013年 12月 23日 月曜日 22:20:21 CET
> >
> so, works as expected.
>
> Can you try the following in this terminal:
> `LC_ALL=ja_JP.eucjp date|iconv -f eucjp`
> `LC_ALL=ja_JP.eucjp date|xxd`

[chris@hive ~]$ LC_ALL=ja_JP.eucjp date|iconv -f eucjp
2013年 12月 24日 火曜日 08:15:31 CET
[chris@hive ~]$ LC_ALL=ja_JP.eucjp date|xxd
0000000: 3230 3133 c7af 2031 32b7 ee20 3234 c6fc  2013.. 12.. 24..
0000010: 20b2 d0cd cbc6 fc20 3038 3a31 353a 3438   ...... 08:15:48
0000020: 2043 4554 0a                              CET.

The xxd output is the same as for "LC_ALL=ja_JP.utf8 date|xxd" so
seems like the LC_ALL=ja_JP.eucjp has no effect?

> > I converted the file to eucjp,
> >
> to make sure just run `cat test_utf8| iconv -f utf8 -t eucjp` instead
> of converting off-side.

[chris@hive ~]$ cat test_utf8| iconv -f utf8 -t eucjp
F|K\8l

> > I think I have the locale,
> >
> Can you confirm by running `locale -a |grep -i euc` ?

[chris@hive ~]$ locale -a |grep -i euc
ja_JP.eucjp
japanese.euc
[...]

> Also what does `locale` show "after the switch" ?

[chris@hive ~]$ locale
LANG=ja_JP.utf8
LC_CTYPE="ja_JP.eucjp"
LC_NUMERIC="ja_JP.eucjp"
LC_TIME="ja_JP.eucjp"
LC_COLLATE="ja_JP.eucjp"
LC_MONETARY="ja_JP.eucjp"
LC_MESSAGES="ja_JP.eucjp"
LC_PAPER="ja_JP.eucjp"
LC_NAME="ja_JP.eucjp"
LC_ADDRESS="ja_JP.eucjp"
LC_TELEPHONE="ja_JP.eucjp"
LC_MEASUREMENT="ja_JP.eucjp"
LC_IDENTIFICATION="ja_JP.eucjp"
LC_ALL=ja_JP.eucjp

> > but I
> > fail to get the eucjp encoded file displayed.
> > Also the date output is not correct:
> >
> > [chris@hive ~]$ LC_ALL=ja_JP.eucjp luit
> > [chris@hive ~]$ locale charmap
> > EUC-JP
> > [chris@hive ~]$ cat test_eucjp
> > F|K\8l
> > [chris@hive ~]$ date
> > 2013G/ 127n 23F| 7nMKF| 22:21:45 CET
> > [chris@hive ~]$ cat test_utf8
> > f%f,h*
> >
> > Running these commands in a terminal "xterm -en eucjp".
> >
> > I think I am missing something.. any ideas?
> >
> quite a few things may be going on... try the above commands and let's see.
> Also for xterm, show the output of `xrdb -q|grep -i XTerm`

No output:
[chris@hive ~]$ xrdb -q|grep -i XTerm
[chris@hive ~]$

> These days I use mostly x11-terms/xfce4-terminal, but I just tried the
> following and it works fine:
>
> $ LC_ALL=ja_JP.eucjp xterm
> $ date <-- in the new terminal
> $ date
> 2013年 12月 24日 火曜日 13:09:46 JST
> $ date |xxd
> 0000000: 3230 3133 c7af 2031 32b7 ee20 3234 c6fc  2013.. 12.. 24..
> 0000010: 20b2 d0cd cbc6 fc20 3133 3a30 393a 3437   ...... 13:09:47
> 0000020: 204a 5354 0a                              JST.
>
> which is EUC-JP.

The "date" output here is not as expected, but "xxd" seems to show that
"date" is properly presented to the terminal/luit in eucjp.

> Finally after some tinkering, here is a unit test for you to check
> that most things around locales are fine (start from UTF8 locale):
>
> for l in utf8 eucjp sjis; do echo -e "$l\t $(LC_ALL=ja_JP.utf8 date
> +%A)"; LC_ALL=ja_JP.$l date +%A|xxd; echo; done
> utf8 火曜日
> 0000000: e781 abe6 9b9c e697 a50a                 ..........
>
> eucjp 火曜日
> 0000000: b2d0 cdcb c6fc 0a                        .......
>
> sjis 火曜日
> 0000000: 89ce 976a 93fa 0a                        ...j...
>
> If you run it today (Tuesday), you may check this as well:
> $ for l in utf8 eucjp sjis; do echo -e "$l\t $(LC_ALL=ja_JP.utf8 date
> +%A)"; LC_ALL=ja_JP.$l date +%A|xxd; echo; done|md5sum -c <(echo
> "bc2ec6dc8e941801ee37286c8b28c277 -")
> -: OK
> (it should print OK, be careful with spaces).

The shift-jis test fails, this fedora here is missing that locale.
The existing ja_JP.ujis locale turns out to be an "eucjp" alias.
LC_ALL=ja_JP.sjis falls back to "LC_ALL=C".

[chris@hive ~]$ for l in utf8 eucjp sjis ujis; do echo -e "$l\t $(LC_ALL=ja_JP.utf8 date +%A)"; LC_ALL=ja_JP.$l date +%A|xxd; echo; done
utf8 火曜日
0000000: e781 abe6 9b9c e697 a50a                 ..........

eucjp 火曜日
0000000: b2d0 cdcb c6fc 0a                        .......

sjis 火曜日
0000000: 5475 6573 6461 790a                      Tuesday.

ujis 火曜日
0000000: b2d0 cdcb c6fc 0a                        .......

> BTW, last time I checked (2-5 years ago) luit is not needed explicitly.

me-- for not doublereading the mails..

Seems like the issue is more in the terminal area?
So far only tried uxterm and gnome-terminal as alternatives, both
have the same results for the above commands.

Any ideas welcome..
Christian
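
Editorial note on the garbled strings in this message: the sequences "F|K\8l" and "f%f,h*" appear to be what remains of the EUC-JP and UTF-8 bytes of 日本語 once bit 7 of every byte is cleared and the resulting control characters are dropped. The sketch below reproduces that with plain POSIX tools. The helper name strip_high_bit is hypothetical, and the byte values are written out as octal escapes so the example does not depend on any installed locale. It only shows that the bytes are consistent with a cleared high bit; it does not identify which component (luit, the terminal, or something else) clears it.

```shell
#!/bin/sh
# Hypothetical helper: clear bit 7 of every byte on stdin and print
# only the results that are printable ASCII (0x20..0x7e).
strip_high_bit() {
  for b in $(od -An -v -tu1); do      # one decimal value per input byte
    b=$((b & 127))                    # clear the high bit
    if [ "$b" -ge 32 ] && [ "$b" -le 126 ]; then
      printf '%b' "\\0$(printf '%o' "$b")"   # emit the byte via an octal escape
    fi
  done
}

# EUC-JP bytes of 日本語: c6 fc  cb dc  b8 ec
printf '\306\374\313\334\270\354' | strip_high_bit; echo   # prints F|K\8l

# UTF-8 bytes of 日本語: e6 97 a5  e6 9c ac  e8 aa 9e
printf '\346\227\245\346\234\254\350\252\236' | strip_high_bit; echo   # prints f%f,h*
```

The same holds for the date line: c7 af (年 in EUC-JP, per the xxd dump above) becomes "G/", which matches "2013G/". Assuming the files from the transcript exist, `iconv -f utf8 -t eucjp test_utf8 | strip_high_bit` should give the first string as well.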
- Follow-Ups:
- Re: [tlug] using eucjp on Linux
- From: Christian Horn
- Re: [tlug] using eucjp on Linux: xxd -g 1
- From: jep200404
- References:
- [tlug] using eucjp on Linux
- From: Christian Horn
- Re: [tlug] using eucjp on Linux
- From: Kalin KOZHUHAROV