Re: [tlug] Japanese regex question



On 8/29/05, Stephen J. Turnbull <stephen@example.com> wrote:

> >>>>> "Ben" == Ben K Bullock <benkasminbullock@example.com> writes:
> 
>     Ben> So it's actually a very sensible compromise to have a utf-8
>     Ben> handle, I think; it doesn't break legacy code.
> 
> Please note that my proposal also does not turn the existing breakage
> in legacy code into a showstopper; it simply requires the programmer
> or user to admit that the legacy code is broken by _explicitly_ using
> a backward compatibility option.

Your proposal is excellent, Stephen, and you will be happy to know
that this is how things work (or rather, will work) in Perl 6. There
is a 'use Perl5' pragma that causes Just-In-Time compilation of Perl 5
source to Parrot bytecode.

What Ben says about the 'use utf8' pragma is true of Perl 5.6, but
not of Perl 5.8, which is the first Perl version to use Unicode
internally for *all strings*. The Perl documentation agrees with
Stephen that 'use utf8' was a poor design choice:

http://search.cpan.org/~jhi/perl-5.8.0/pod/perluniintro.pod#Perl's_Unicode_Support

> If we had started making legacy code embarrassing in 1995, most of the
> people on this list wouldn't know there was a problem, because most of
> the programs they use were written since then.

Very true.

-Josh

