
Re: [tlug] Japanese regex question
On Wed, 24 Aug 2005 22:27:28 -0400
Josh Glover <jmglov@example.com> wrote:
> On 8/24/05, Brett Robson <b-robson@example.com> wrote:
>
> > I don't know how much experience you have with J-regex but the biggest
> > issue is anchoring. Because it's double byte you can't be sure you're
> > matching from the first byte of a character.
>
> I am fairly certain that with Unicode, Perl 5.8 regular expressions
> handle the multi-byte encoding properly and do not treat strings as
> arrays of bytes.
>
But he said he was using raw ISO 2022 encoding, and those numbers look
correct for katakana in 2022. I was thinking, though, that he would
probably be better off converting to Unicode internally for that exact
reason.
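
A minimal sketch of the anchoring problem, written in Python rather than
Perl and assuming the encoding in question is ISO-2022-JP (the thread only
says "2022"). The sample string and both patterns are made up for
illustration. The string contains only hiragana, yet a byte-level pattern
for a katakana character can still match, because the trail byte of one
character followed by the lead byte of the next looks like a katakana byte
pair. Decoding to Unicode first avoids this, since the regex then works on
whole characters.

```python
import re

# Hiragana only: U+3045 (small u) followed by U+3044 (i). No katakana here.
text = "\u3045\u3044"

# Encode to ISO-2022-JP (assumed encoding). Between the escape sequences,
# the two-byte payload is 0x24 0x25, 0x24 0x24.
raw = text.encode("iso2022_jp")

# Naive byte pattern for one JIS X 0208 katakana character:
# lead byte 0x25, trail byte in 0x21-0x76.
katakana_bytes = re.compile(rb"\x25[\x21-\x76]")

# The same idea on decoded text: Unicode katakana U+30A1 through U+30F6.
katakana_text = re.compile("[\u30a1-\u30f6]")

# The byte-level search starts mid-character and reports a false match.
byte_hit = katakana_bytes.search(raw)

# The character-level search on decoded text finds nothing, as it should.
text_hit = katakana_text.search(raw.decode("iso2022_jp"))

print("byte-level match:", byte_hit is not None)
print("text-level match:", text_hit is not None)
```

The byte pattern matches the second byte of the first character plus the
first byte of the second, which is exactly the "can't be sure you're
matching from the first byte" issue quoted above.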
Brett