
Mailing List Archive
Re: [tlug] a japanese dictionary: regex v. db query
On Tue, 4 Apr 2006, Jim wrote:
>
> "Stephen J. Turnbull" wrote:
>
> > What is the regular expression for "all characters with 16 strokes"?
>
> Oh boy. That made me think.
> That was the right question to highlight the limitations of regexes.
>
> I can not think of how to express "all characters with 16 strokes"
> in the present schemes of regexes as I know them.
But 'man re_syntax' describes character classes, usable inside a bracket
expression, such as:
[:digit:] which matches any digit, or
[:punct:] which matches any punctuation character.
Why not extend that syntax to include things like:
[:stroke=16:]
to match any character with 16 strokes. Or even:
[:rad=<code>:]
to match any character containing the radical at codepoint 'code'. You
could probably convert this internally to an SQL search including just
about any character property you have stored in the database.
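
As a rough illustration of that conversion, here is a minimal Python sketch.
Everything in it is an assumption made for the example: the `kanji` table and
its columns, the handful of sample rows, the helper name `expand`, and the
choice to treat `[:stroke=N:]` as a standalone token the way it is written
above. It rewrites each token into an ordinary bracket expression by asking
the database which characters have that stroke count, then hands the result
to the normal regex engine.

```python
import re
import sqlite3

# Hypothetical schema and sample rows, for illustration only; a real
# dictionary would already have a much larger table of this kind.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE kanji (literal TEXT PRIMARY KEY, strokes INTEGER)")
db.executemany(
    "INSERT INTO kanji VALUES (?, ?)",
    [("日", 4), ("本", 5), ("語", 14), ("機", 16), ("橋", 16), ("頭", 16)],
)

# Matches the proposed [:stroke=N:] token, written standalone as in the post.
STROKE_CLASS = re.compile(r"\[:stroke=(\d+):\]")


def expand(pattern, conn=db):
    """Replace each [:stroke=N:] with a bracket expression built from SQL."""
    def to_bracket(match):
        rows = conn.execute(
            "SELECT literal FROM kanji WHERE strokes = ? ORDER BY literal",
            (int(match.group(1)),),
        ).fetchall()
        if not rows:
            return "(?!)"  # no such characters: a pattern that never matches
        return "[" + "".join(re.escape(row[0]) for row in rows) + "]"
    return STROKE_CLASS.sub(to_bracket, pattern)


# Word search: "日本" followed by any character with 14 strokes.
word_re = re.compile(expand("日本[:stroke=14:]"))
print(word_re.search("日本語の辞書").group(0))  # prints 日本語
```

A `[:rad=<code>:]` token could presumably be handled the same way, with a
different column in the WHERE clause. Expanding to a literal character list
keeps the regex engine itself unchanged, at the cost of one query per token.
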
I would consider this more useful for a word search than for single-kanji
searches. REs become useful when there are potentially many characters in
the search target... or for someone stuck with a text-only interface to
the database ;-).
--
Joe Larabell -- Synopsys VCS Support US: larabell@example.com
http://wwwin.synopsys.com/~larabell/ Japan: larabell@???