
Re: [tlug] [OT] Strip Kanji from a document for study purposes
- Date: Thu, 20 Jul 2006 09:10:16 +1000 (EST)
- From: Jim Breen <Jim.Breen@example.com>
- Subject: Re: [tlug] [OT] Strip Kanji from a document for study purposes
[Dave M G (Re: [tlug] [OT] Strip Kanji from a document for study purposes) writes:]
JB > There is an option to stop a word/phrase being displayed more than
JB > once. It is the checkbox labelled "no repeated translations".
>>
>> Actually, I saw that and tried it. But it doesn't prevent it from
>> happening, so I figured it meant something other than what I thought it
>> would mean.
Examples? It works as expected for me. (It only applies to the main file,
usually "glossdic", and doesn't apply to hiragana-only matches, although
I could fix that.)
>> If I understand correctly, "no repeated translations" will stop a word
>> from being translated twice if it appears twice in the same sentence.
No. It works on the whole document. I just dump out the text to the
display and flush out accumulated glosses at sentence end, or when the
char/line limit is reached.
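
To illustrate the behaviour described above, here is a minimal Python sketch. It is not the WWWJDIC code; the function name, the token list, the toy glossary and the "。" sentence terminator are all assumptions made for the example, and the char/line limit is left out for brevity. It keeps one set of already-glossed words for the whole document and flushes the accumulated glosses at each sentence end.

```python
def gloss_text(tokens, glossary, no_repeats=True, sentence_end="。"):
    """Return output lines: each sentence, followed by the glosses gathered for it.

    Hypothetical sketch only. `tokens` is a pre-segmented list of words and
    `glossary` maps a word to its translation; both are assumptions.
    """
    seen = set()     # words already glossed anywhere in the document
    text = []        # tokens of the sentence currently being built
    pending = []     # glosses accumulated since the last flush
    out = []

    for tok in tokens:
        text.append(tok)
        # Gloss the word unless repeats are suppressed and it was seen before.
        if tok in glossary and not (no_repeats and tok in seen):
            seen.add(tok)
            pending.append(f"{tok}: {glossary[tok]}")
        # Flush the text and its glosses at the end of each sentence.
        if tok == sentence_end:
            out.append("".join(text))
            out.extend(pending)
            text, pending = [], []

    # Flush whatever is left if the document does not end with a terminator.
    if text:
        out.append("".join(text))
        out.extend(pending)
    return out


if __name__ == "__main__":
    tokens = ["猫", "は", "魚", "を", "食べる", "。", "猫", "は", "寝る", "。"]
    glossary = {"猫": "cat", "魚": "fish", "寝る": "to sleep"}
    for line in gloss_text(tokens, glossary):
        print(line)
```

Because the set of seen words is never reset between sentences, the second occurrence of a word is skipped wherever it appears in the document, not just within one sentence.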
JB > See how you go saving to a text file and hitting it with an editor.
JB > That's what I have done. I find I have to trim out the words I know
JB > to get it down to a usable study list.
>> I've had some experience in creating study lists using OpenOffice's
>> Calc. If I can scrape out the lists from WWWJDIC's output, I might be
>> able to format them in a way that can sweep out duplicates with some
>> relatively simple regular expressions.
"sort -u" is good for duplicates 8-)}
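
For anyone who would rather do the same thing in a script, a rough Python analogue of "sort -u" might look like the sketch below. The function name and the file name "glosses.txt" are hypothetical, and the behaviour is only similar: unlike "sort -u" it trims surrounding whitespace and drops blank lines, and it sorts by code point rather than by the current locale.

```python
def unique_sorted(lines):
    """Return the distinct non-blank lines in sorted order.

    Hypothetical helper, roughly comparable to `sort -u` on a saved gloss list.
    """
    # A set comprehension removes duplicates; strip() trims newlines and spaces.
    cleaned = {line.strip() for line in lines if line.strip()}
    # sorted() orders by Unicode code point, not by locale collation rules.
    return sorted(cleaned)


if __name__ == "__main__":
    # "glosses.txt" is an assumed name for the text file saved from the output.
    try:
        with open("glosses.txt", encoding="utf-8") as handle:
            for entry in unique_sorted(handle):
                print(entry)
    except FileNotFoundError:
        print("glosses.txt not found; save the gloss output there first")
```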
Cheers
Jim
--
Jim Breen http://www.csse.monash.edu.au/~jwb/
Clayton School of Information Technology, Tel: +61 3 9905 9554
Monash University, VIC 3800, Australia Fax: +61 3 9905 5146
(Monash Provider No. 00008C) ジム・ブリーン@モナシュ大学