
Mailing List Archive
Re: [tlug] Japanese encoding
- Date: Fri, 23 Aug 2002 15:17:27 +1000 (EST)
- From: Jim Breen <jwb@example.com>
- Subject: Re: [tlug] Japanese encoding
[Brett Robson (Re: [tlug] Japanese encoding) writes:]
>In a loose moment Jim wrote:
>> > the details at:
>> http://www.csse.monash.edu.au/~jwb/wwwjdicinf.html#examp_tag
>> "the collection is in need of considerable editing."
>>
>> I am going to have a very large amount of free time from next week, I'm
>> happy to help.
It's a little early to turn it loose for hand editing. I looked into
possibly setting up a CVS system, but really I want Windblows, Mac, etc. people
in on it too if they can contribute. What I think I'll do eventually is
have it up for rsync collection (rsync is available for Windows), and have
a very standard way of submitting updates so I can run them through a utility.
In the meantime, I'd like to do some more reduction of duplicates using
software. I have tracked down and eliminated the straight replications
in the Japanese text. What I'd like to do is zoom in on things like:
私達はよくいっしょにお昼を食べます。
私達はよく一緒にお昼を食べます。
and knock out the first because 一緒に/いっしょに are the same. At
present I'm doing this by eye, noting where the English sentences are the same.
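The by-eye check described above could be partly automated. The following is only a rough sketch, assuming the collection can be read as (Japanese, English) pairs; the function name `candidate_duplicates` and the English translations in the sample are made up for illustration and are not taken from the actual file.

```python
from collections import defaultdict

def candidate_duplicates(pairs):
    """Group Japanese sentences that share the same English translation.

    `pairs` is an iterable of (japanese, english) tuples. Returns a dict
    mapping each normalized English sentence to the list of distinct
    Japanese sentences paired with it, keeping only groups of two or more.
    """
    groups = defaultdict(list)
    for ja, en in pairs:
        # Normalize case and whitespace so trivial differences still match.
        key = " ".join(en.lower().split())
        if ja not in groups[key]:
            groups[key].append(ja)
    return {en: js for en, js in groups.items() if len(js) > 1}

# Hypothetical sample data; the English text is invented for the example.
sample = [
    ("私達はよくいっしょにお昼を食べます。", "We often eat lunch together."),
    ("私達はよく一緒にお昼を食べます。", "We often eat lunch together."),
    ("私達はカヌーを借りた。", "We rented a canoe."),
]

for english, japanese in candidate_duplicates(sample).items():
    print(english)
    for sentence in japanese:
        print("   ", sentence)
```

This only produces candidates for review. Deciding which variant to knock out (kana versus kanji spelling) would still need a person or a further rule.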
Examples like:
私達はカヌーを借りた。
私達はカヌ−を借りた。
are horrors to pin down.
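Pairs like the one above differ only in the long-vowel mark: one uses the katakana prolonged sound mark (U+30FC) and the other a look-alike dash. A possible way to catch them, sketched here under the assumption that the text is already decoded to Unicode, is to map dash-like characters to U+30FC when they directly follow a katakana character before comparing. The list of look-alikes and the name `normalize_long_vowel` are illustrative guesses, not a survey of what actually occurs in the file.

```python
import re

# Characters that are sometimes typed in place of the katakana long-vowel
# mark. This set is an assumption; extend it to match the real data.
DASH_LIKE = "\u002d\u2010\u2014\u2015\u2212\uff0d"

# A dash-like character immediately preceded by a katakana letter
# (U+30A1 to U+30FA) or by the long-vowel mark itself (U+30FC).
_PATTERN = re.compile(
    "(?<=[\u30a1-\u30fa\u30fc])[" + re.escape(DASH_LIKE) + "]"
)

def normalize_long_vowel(text):
    """Replace dash look-alikes after katakana with U+30FC."""
    return _PATTERN.sub("\u30fc", text)

good = "私達はカヌ\u30fcを借りた。"   # proper long-vowel mark
bad = "私達はカヌ\u2212を借りた。"    # minus sign used instead

print(normalize_long_vowel(bad) == good)
```

Because the replacement only fires after katakana, a genuine minus sign or hyphen elsewhere in a sentence is left alone.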
Then there are the Japanese bloopers:
私達は皆事態は深刻だと考えた。We all regarded the situation as serious.
私達は皆自体は深刻だと考えた。We all regarded the situation as serious.
Way to go.
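For bloopers like the 事態/自体 pair above, where the English is identical and the Japanese differs by a couple of characters, one option is to print just the differing spans so a reviewer can see at a glance what changed. This is a sketch using Python's standard `difflib`; the helper name `differing_spans` is hypothetical.

```python
import difflib

def differing_spans(a, b):
    """Return (span_in_a, span_in_b) for each place where a and b differ."""
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [
        (a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]

first = "私達は皆事態は深刻だと考えた。"
second = "私達は皆自体は深刻だと考えた。"

for left, right in differing_spans(first, second):
    print(left, "<->", right)
```

The tool cannot tell which spelling is the error. It only narrows down where to look.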
Jim
--
Jim Breen (j.breen@example.com http://www.csse.monash.edu.au/~jwb/)
Computer Science & Software Engineering, Tel: +61 3 9905 3298
P.O Box 26, Monash University, Fax: +61 3 9905 5146
Clayton VIC 3800, Australia ジム・ブリーン@モナシュ大学