tlug.jp Mailing List Archive
Re: [tlug] Japanese regex question
- Date: Sun, 01 Jan 2006 23:30:30 +0100
- From: Gábor Farkas <gabor@example.com>
- Subject: Re: [tlug] Japanese regex question
- References: <200508241701.55144.jq@example.com> <20050825183913.O88704@example.com> <200508251253.47083.jq@example.com> <20050826113217.J88704@example.com> <87zmr2me23.fsf@example.com>
- User-agent: Thunderbird 1.5 (Macintosh/20051201)
Stephen J. Turnbull wrote:
>>>>>> "Tod" == Tod McQuillin <devin@example.com> writes:
>
>     Tod> Yeah but the regex engine doesn't know it's not ascii.
>
> Urk.  "Unidentified unibyte ASCII-superset", if you please!
>
>     Tod> Unless you use unicode, it will interpret the strings as
>     Tod> strings of 8-bit bytes, not as non-ascii multibyte
>     Tod> characters.
>
> Nice call!  For those of you who haven't thought carefully about it
> yet, those matching 4/6 and 5/7 first-nibble pairs in the ambiguous
> match positions are a dead giveaway.
>
> We had a post on this kind of issue (ambiguous matches in UTF-8) a
> couple months back, too.  It's worth trying to remember this one.
>
>     Tod> Probably the only proper way to do this is to convert
>     Tod> everything to unicode first.
>
> This is all so stupid.  XEmacs has been doing this (badly) for almost
> a decade, Mule for another 3 or 4 years longer than that.  Why Perl
> and Python failed to seize the opportunity to do it right when they
> added Unicode support I'll never know.
>

sorry to jump in so late...

could you please describe to me what Python is doing wrong regarding
unicode?

thanks,
gabor

--
Flexibility is overrated
Constraints are liberating
  -- David Heinemeier Hansson, Secrets behind Ruby on Rails
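For readers following the quoted point that a regex engine treats undecoded text as "strings of 8-bit bytes", here is a minimal sketch of how a byte-level search can match across a character boundary. It is not taken from the thread: the EUC-JP encoding, the sample strings, and all variable names are assumptions chosen for illustration, and it uses Python 3 syntax rather than the Python of the time.

```python
import re

# Illustrative data (not from the thread): katakana I followed by
# hiragana A. The needle, hiragana I, does not occur in this text.
haystack = "イあ"
needle = "い"

# Assumed encoding for the sketch: EUC-JP, where every byte of a
# two-byte character has the high bit set.
haystack_bytes = haystack.encode("euc_jp")  # bytes a5 a4 a4 a2
needle_bytes = needle.encode("euc_jp")      # bytes a4 a4

# Byte-level search: the engine sees four unrelated bytes, so the
# trail byte of the first character plus the lead byte of the second
# look like the needle.
byte_match = re.search(re.escape(needle_bytes), haystack_bytes)

# Character-level search on decoded text: no such straddling is possible.
text_match = re.search(re.escape(needle), haystack)

print("bytes:", haystack_bytes.hex(), "needle:", needle_bytes.hex())
print("byte-level match found:", byte_match is not None)
print("character-level match found:", text_match is not None)
```

The byte-level search reports a hit at offset 1, in the middle of the first character, while the search on decoded strings finds nothing. That is the reasoning behind the suggestion to convert everything to Unicode before matching.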
- Follow-Ups:
- Re: [tlug] Japanese regex question
- From: Stephen J. Turnbull