Mailing List Archive
Re: [tlug] Japanese regex question
- Date: Tue, 30 Aug 2005 08:24:24 +0900
- From: "Ben K. Bullock" <benkasminbullock@example.com>
- Subject: Re: [tlug] Japanese regex question
- References: <200508241701.55144.jq@example.com> <200508251253.47083.jq@example.com> <20050826113217.J88704@example.com> <87zmr2me23.fsf@example.com> <30ce843605082808003eac8faa@example.com> <87y86mkrrg.fsf@example.com> <20050828173528.796c3073@example.com> <87u0h9l14p.fsf@example.com> <001401c5ac6d$06b587e0$0b01a8c0@example.com> <8764tolodu.fsf@example.com> <d8fcc0800508291038504fc464@example.com>
----- Original Message -----
From: "Josh Glover" <jmglov@example.com>
To: <tlug@example.com>
Sent: Tuesday, August 30, 2005 2:38 AM
Subject: Re: [tlug] Japanese regex question

> On 8/29/05, Stephen J. Turnbull <stephen@example.com> wrote:
>
>> >>>>> "Ben" == Ben K Bullock <benkasminbullock@example.com> writes:
>>
>>     Ben> So it's actually a very sensible compromise to have a utf-8
>>     Ben> handle, I think; it doesn't break legacy code.
>>
>> Please note that my proposal also does not turn the existing breakage
>> in legacy code into a showstopper; it simply requires the programmer
>> or user to admit that the legacy code is broken by _explicitly_ using
>> a backward compatibility option.
>
> Your proposal is excellent, Stephen, and you will be happy to know
> that is how things work (or rather, will work) in Perl 6. There is a
> 'use Perl5' pragma that causes Just-In-Time compilation of Perl 5
> source to Parrot bytecode.
>
> What Ben says about the 'use utf-8' pragma is true of Perl 5.6, but
> not of Perl 5.8, which is the first Perl version to use Unicode
> internally for *all strings*. The Perl documentation agrees with
> Stephen that 'use utf-8' was a poor design choice:
>
> http://search.cpan.org/~jhi/perl-5.8.0/pod/perluniintro.pod#Perl's_Unicode_Support

Thanks for that link. I read it carefully, but I can't see any
disagreement between it and what I said:

>>>>>>>>>>>>>>>> Begin quote of me

To get Perl to use UTF-8, try

use utf8;

Then each Unicode character is exactly equivalent to an ASCII character
for every purpose. That's all you need to make, for example, "." in a
regular expression match any Unicode character, or to use UTF-8
variable names in your code, or to make

length ("馬鹿") == 2;

rather than 4 or 6, etc. etc. In future versions of Perl, "use utf8;"
is going to become a non-functioning command and utf8 will be switched
on by default.

<<<<<<<<<<<<< End quote of me.
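To make the "2 rather than 4 or 6" point concrete, here is a minimal sketch, assuming Perl 5.8 or later with the core Encode module and a source file saved as UTF-8. The names $chars and $bytes are just made up for illustration. The byte string stands in for what Perl would hold if the same literal were read without the pragma (6 bytes in UTF-8; a two-byte encoding such as EUC-JP would presumably be where the 4 comes from).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;                 # string literals in this file are decoded to characters
use Encode qw(encode);

# Illustrative names only: the same text as characters and as raw bytes.
my $chars = "馬鹿";                      # two characters under "use utf8"
my $bytes = encode("UTF-8", $chars);     # the underlying UTF-8 byte string

binmode STDOUT, ":encoding(UTF-8)";

printf "length as characters: %d\n", length $chars;
printf "length as bytes:      %d\n", length $bytes;

# "." matches one character of whatever the string is made of, so the
# same two-character pattern succeeds on $chars and fails on $bytes.
printf "characters match two dots: %s\n", ($chars =~ /^..$/ ? "yes" : "no");
printf "bytes match two dots:      %s\n", ($bytes =~ /^..$/ ? "yes" : "no");
```

I would expect this to report 2 and 6 for the lengths, with only the character string matching the two-dot pattern.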
I'm fairly sure this says the same thing as the page you linked, just
from a slightly different perspective. The perspective of the web page
is someone talking about the internals of Perl, and my perspective is
someone using Perl. Let's try to demonstrate this with a small example
program:

>>>>>>>>>>>>>>>>>> Please save this as utf-8 or it won't work.

#!/usr/bin/perl
use warnings;
use strict;
use utf8;
binmode STDOUT, ":utf8";
my $ushi = "馬鹿";
print "$ushi\n" if ($ushi =~ /^..$/);
#my $牛 = "馬鹿";
#print "$牛\n" if ($牛 =~ /../);

<<<<<<<<<<<<<<<<<<<< End of example program.

If you really think the "use utf8;" pragma is unnecessary for practical
programming, I'd invite you to play with this by commenting out the
"use utf8;" line and seeing what Perl (my version is 5.8.6) actually
does. If you are feeling very bold, you could also try commenting out
the "use utf8;" line and uncommenting the two commented-out "my $牛"
lines as well. If you want to try reading from or writing to a utf8
file, doing that without the "binmode" might be interesting as well.

Anyway, let me repeat that I don't see any disagreement at all between
what I wrote and the contents of the page you mentioned. The problem is
that people working on code internals (which is what the page actually
seems to be talking about) might not be the best people to describe
what Perl actually does from a user point of view.

B. Bullock.