Mailing List Archive
Re: [tlug] Japanese regex question
- Date: Fri, 26 Aug 2005 11:36:35 +0900 (JST)
- From: Tod McQuillin <devin@example.com>
- Subject: Re: [tlug] Japanese regex question
- References: <200508241701.55144.jq@example.com> <20050825183913.O88704@example.com> <200508251253.47083.jq@example.com>
On Thu, 25 Aug 2005, Jonathan Byrne wrote:

> On Thursday 25 August 2005 02:40, Tod McQuillin wrote:
>
>> Just a guess -- have you given the 'i' flag (case insensitivity)
>> somehow?
>
> Actually, now that you mention it, yes. [...]
>
> Not sure if this is our problem, b/c there was no ASCII involved in the
> strings that were matched, but I'll look into it.

Yeah but the regex engine doesn't know it's not ascii. Unless you use unicode, it will interpret the strings as strings of 8-bit bytes, not as non-ascii multibyte characters.

Which means that if the encoding happens to include upper/lowercase letters as part of the string when interpreted as bytewise ascii ... you lose if 'i' was specified.

Even though, as you say, there was no ASCII involved in your strings, there was in fact a 'j' and 'J' ascii byte in there, because the encoding dictated it.

Probably the only proper way to do this is to convert everything to unicode first.

-- Tod
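To make the byte-level effect concrete, here is a minimal Python sketch. It rests on assumptions that are not in the thread: the encoding is taken to be Shift_JIS, and the two katakana strings are made up for illustration. In Shift_JIS, カ encodes as the bytes 0x83 0x4A and ニ as 0x83 0x6A, so their trailing bytes are ASCII 'J' and 'j'.

```python
import re

# Hypothetical sample strings; Shift_JIS is an assumed encoding,
# not necessarily the one used in the original problem.
ka = "カ"
ni = "ニ"

ka_bytes = ka.encode("shift_jis")  # trailing byte is ASCII 'J' (0x4A)
ni_bytes = ni.encode("shift_jis")  # trailing byte is ASCII 'j' (0x6A)

# Bytewise matching without the 'i' flag: the two strings differ.
bytes_exact = re.search(re.escape(ka_bytes), ni_bytes) is not None

# Bytewise matching with the 'i' flag: 'J' and 'j' fold together,
# so two unrelated characters appear to match.
bytes_icase = re.search(re.escape(ka_bytes), ni_bytes, re.IGNORECASE) is not None

# Matching on decoded (Unicode) strings with the 'i' flag:
# katakana has no case, so there is no false match.
text_icase = re.search(re.escape(ka), ni, re.IGNORECASE) is not None

print(ka_bytes, ni_bytes)
print(bytes_exact, bytes_icase, text_icase)
```

Under these assumptions only the bytewise, case-insensitive search reports a match, which is the failure described above; decoding to Unicode before matching avoids it.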
- Follow-Ups:
- Re: [tlug] Japanese regex question
- From: Brett Robson
- Re: [tlug] Japanese regex question
- From: Stephen J. Turnbull
- References:
- [tlug] Japanese regex question
- From: Jonathan Byrne
- Re: [tlug] Japanese regex question
- From: Tod McQuillin
- Re: [tlug] Japanese regex question
- From: Jonathan Byrne