
Mailing List Archive
Re: [tlug] [OT] Regular Expressions to find Japanese Text
- Date: Wed, 09 Aug 2006 14:41:42 +1000 (EST)
- From: Jim Breen <Jim.Breen@example.com>
- Subject: Re: [tlug] [OT] Regular Expressions to find Japanese Text
[Dave M G (Re: [tlug] [OT] Regular Expressions to find Japanese Text) writes:]
JB > That <br> is redundant. I may remove it at some stage. Better to extract
JB > between <li> and the next <li> or the terminal </ul>.
>> If I may make a suggestion:
>>
>> I recommend that if you do remove the <br> tag, which is definitely
>> redundant, you should replace it with a closing </li> tag. This will
>> make it more compatible with strict XHTML. I think evolving towards
>> XHTML compliance with your HTML output would be a very good thing.
Done.
>> Also, in the case of parsing as I'm doing, finding the next <li> tag or
>> terminal </ul> tag might be complicated by line breaks between them. Not
>> insurmountable, just complicated, and simply not an issue if the HTML
>> was XHTML compliant.
That particular <ul> list should have </li> terminations on all sites within
24 hours.
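
For anyone parsing the list before or after that change, here is a rough Python sketch of the extraction described above: take everything from each <li> up to the next <li>, a closing </li>, or the terminal </ul>, whichever comes first, and let the match run across line breaks. The sample markup and the names ITEM and list_items are made up for illustration; the real pages may differ.

```python
import re

# Hypothetical sample markup: one item spans two lines, one is closed
# with </li>, and one is terminated only by the closing </ul>.
html = """<ul>
<li>first entry
continues here
<li>second entry</li>
<li>third entry
</ul>"""

# Lazy match from <li> up to (but not including) the next </li>, <li>
# or </ul>. re.DOTALL lets "." match newlines, so line breaks inside an
# item do not end the match early.
ITEM = re.compile(r"<li>(.*?)(?=</li>|<li>|</ul>)", re.DOTALL | re.IGNORECASE)

def list_items(fragment):
    """Return the text of each list item with whitespace runs collapsed."""
    return [" ".join(body.split()) for body in ITEM.findall(fragment)]

for item in list_items(html):
    print(item)
```

The lookahead keeps the terminating tag out of the match, so the same pattern should cope with both the unclosed and the </li>-terminated form of the list.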
JB > Well, 付ける;着ける [つける] has 25 meanings grouped in 11 senses.
JB > Readings are a bit harder to count, but I think there are entries with
JB > 5 or 6.
>> That's very helpful to know.
>>
>> What do you mean by "grouped in 11 senses"? That there are eleven
>> semi-colons dividing up the 25 meanings?
No, 24 semi-colons, but each new sense starts with the sense-number in
parentheses, e.g.
付ける .... (v1,vt) (1) to attach; to join; to add; to append;
to affix; to stick; to glue; to fasten; to sew on; to apply (ointment);
(2) to furnish (a house with); (3) to wear; to put on; (4) to keep a
diary; ...
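
As a rough illustration of that layout, the Python sketch below splits a gloss string on the parenthesised sense numbers and then on semi-colons. The string is only the excerpt quoted above (senses 1 to 4, with the elided remainder left out), and the names SENSE_MARK and split_senses are made up for the example.

```python
import re

# Only the excerpt quoted above; the rest of the entry is not reproduced.
gloss = ("(v1,vt) (1) to attach; to join; to add; to append; "
         "to affix; to stick; to glue; to fasten; to sew on; to apply (ointment); "
         "(2) to furnish (a house with); (3) to wear; to put on; (4) to keep a diary;")

# A sense marker is a number alone in parentheses, e.g. "(2)".
# "(v1,vt)" and "(ointment)" are not all digits, so they are left alone.
SENSE_MARK = re.compile(r"\((\d+)\)")

def split_senses(text):
    """Map each sense number to the list of meanings given under it."""
    # With a capturing group, re.split returns
    # [text before first marker, number, body, number, body, ...].
    parts = SENSE_MARK.split(text)
    senses = {}
    for number, body in zip(parts[1::2], parts[2::2]):
        senses[int(number)] = [m.strip() for m in body.split(";") if m.strip()]
    return senses

senses = split_senses(gloss)
for number, meanings in senses.items():
    print(number, meanings)
```

Counting the lists gives the meanings per sense, and the number of keys gives the number of senses, which is the distinction between the two figures.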
Jim
--
Jim Breen http://www.csse.monash.edu.au/~jwb/
Clayton School of Information Technology, Tel: +61 3 9905 9554
Monash University, VIC 3800, Australia Fax: +61 3 9905 5146
(Monash Provider No. 00008C) ジム・ブリーン@モナシュ大学