Mailing List Archive
Re: [tlug] Translating old to new kanji forms using tr
- Date: Wed, 29 Jun 2005 12:06:11 +0900
- From: "Stephen J. Turnbull" <stephen@example.com>
- Subject: Re: [tlug] Translating old to new kanji forms using tr
- References: <42C1507D.5070608@example.com>
- Organization: The XEmacs Project
- User-agent: Gnus/5.1006 (Gnus v5.10.6) XEmacs/21.5 (cilantro, linux)
>>>>> "David" == David Riggs <dariggs@example.com> writes:

    David> I need to go back and forth between the old (旧字) and the
    David> modern kanji forms. I have a list of corresponding old and
    David> new forms, and the standard utility "tr" works fine for a
    David> test case:

tr(1) is byte-oriented, as far as I know, and any resemblance to success is using up your good karma. What is happening is that you are feeding the UTF-8 byte sequences "E4 BB 8F" (仏) and "E4 BD 9B" (佛) to tr, and it is mapping E4->E4, BB->BD, and 8F->9B for you byte-by-byte. As far as I know, byte orientation is the case for all of "the usual utilities", except that cut(1) claims to know about characters now.

    David> Which seems to make all the usual utilities work just fine
    David> with kanji inside the "konsole" (or plain old xterm as far
    David> as that goes).

That's because usage like "grep '[あ-ん]' file" is probably relatively unusual for us gaijin.

Your best bet is to use a language like Python or **** that supports Unicode internally. They generally have functions that emulate the standard command line utilities but work on Unicode strings as well as on unibyte strings. With **** you can probably write a one-liner.

--
School of Systems and Information Engineering    http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
    Ask not how you can "do" free software business;
    ask what your business can "do for" free software.
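For illustration, here is a minimal Python 3 sketch of the difference described in the message above. It is not David's script or a tested tool: the three-pair table OLD_TO_NEW and the function names byte_tr, to_new, and to_old are hypothetical, made up for this example (David's actual list of pairs does not appear in the thread). The first function imitates what a byte-oriented tr does with the bytes E4 BB 8F and E4 BD 9B; the other two translate whole characters with str.translate.

```python
# Hypothetical sample table of old -> new forms; a real list would be much longer.
OLD_TO_NEW = {"佛": "仏", "國": "国", "學": "学"}

# Character-level tables: str.maketrans accepts a dict of 1-character keys.
TO_NEW = str.maketrans(OLD_TO_NEW)
TO_OLD = str.maketrans({new: old for old, new in OLD_TO_NEW.items()})

# Byte-level table imitating a byte-oriented tr fed 仏 and 佛 as UTF-8:
# it maps E4->E4, BB->BD, 8F->9B, one byte at a time.
BYTE_TABLE = bytes.maketrans(b"\xe4\xbb\x8f", b"\xe4\xbd\x9b")


def byte_tr(text: str) -> str:
    """Translate byte by byte, the way a byte-oriented tr would."""
    raw = text.encode("utf-8").translate(BYTE_TABLE)
    # Byte-wise mapping can yield invalid UTF-8, so decode leniently.
    return raw.decode("utf-8", errors="replace")


def to_new(text: str) -> str:
    """Replace old forms with modern forms, one character at a time."""
    return text.translate(TO_NEW)


def to_old(text: str) -> str:
    """Replace modern forms with old forms, one character at a time."""
    return text.translate(TO_OLD)


if __name__ == "__main__":
    print(byte_tr("仏"))   # the lucky test case: the bytes line up
    print(byte_tr("今"))   # 今 is E4 BB 8A, so its BB byte is rewritten too
    print(to_new("佛教"), to_old("仏教"))
```

The byte-level version gets the test case right only because each byte of 仏 happens to map to the matching byte of 佛; any other character containing BB or 8F is changed as well. The character-level version touches only the characters listed in the table. One caveat on the reverse direction: some modern forms correspond to more than one old form, so a simple one-to-one table like TO_OLD is an assumption that will not hold for every character.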
- References:
- [tlug] Translating old to new kanji forms using tr
- From: David Riggs