
Re: [tlug] searching for kanji strings, ignore punctuation and endof lines
- Date: Mon, 16 Jan 2006 17:00:48 +0900
 
- From: Ramil Sagum <ramil@example.com>
 
- Subject: Re: [tlug] searching for kanji strings, ignore punctuation and endof lines
 
- References: <43CB4F48.1060200@example.com>
 
- User-agent: Mozilla Thunderbird 1.0.7 (Windows/20050923)
 
David Riggs wrote:
> If I could take a two line unit spat out by grep -A2, then process it
> as a separate set, I could do it rather easily. Strip out stuff after 
> the match for the first kanji: newline, punctuation, and line numbers. 
> Then if there is a match print out the working data area.
How about making a second copy of the text with the punctuation stripped
(preserving the line count) and then searching for the phrase there?
It's a bit of a kludge, but if disk space isn't a problem, this is an easy
way. Since you have to do this a lot, searching the processed copy might even
give you the speed boost you need. (I'm assuming your haystack won't change
much. Will it always be the CBETA canon?)
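
A rough Python sketch of that idea, in case it helps. It is only an
illustration under some assumptions: the text is already decoded to Unicode,
"punctuation" means any character in a Unicode "P" category, and whitespace is
dropped too. The names strip_line, build_index and find_phrase are made up for
this example, and it does not deal with line-number prefixes inside the text.
Rather than writing the stripped copy to disk, it keeps it in memory along with
the offset at which each original line starts, so a match that runs across a
line break can still be reported against the line where it begins.

```python
import unicodedata
from bisect import bisect_right


def strip_line(line):
    """Return the line without whitespace and Unicode punctuation."""
    return "".join(
        ch for ch in line
        if not ch.isspace() and not unicodedata.category(ch).startswith("P")
    )


def build_index(lines):
    """Join the stripped lines and record where each original line starts."""
    stripped = [strip_line(line) for line in lines]
    starts = []
    pos = 0
    for s in stripped:
        starts.append(pos)
        pos += len(s)
    return "".join(stripped), starts


def find_phrase(lines, phrase):
    """Return 1-based numbers of the lines on which a match begins."""
    text, starts = build_index(lines)
    needle = strip_line(phrase)
    if not needle:
        return []
    hits = []
    i = text.find(needle)
    while i != -1:
        # bisect_right gives the count of lines starting at or before i,
        # which is the 1-based number of the line containing offset i.
        hits.append(bisect_right(starts, i))
        i = text.find(needle, i + 1)
    return hits
```

For a fixed haystack, the result of build_index could be computed once and
reused for every search, which is where the speed boost would come from.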
-moogs