Mailing List Archive
Re: [tlug] unicode and Perl- how to pass command line unicodearguments
- Date: Wed, 15 Feb 2006 10:49:58 +0900
- From: "Stephen J. Turnbull" <stephen@example.com>
- Subject: Re: [tlug] unicode and Perl- how to pass command line unicodearguments
- References: <43F12D6F.1020706@example.com><30ce84360602141624p348b3cacm@example.com>
- Organization: The XEmacs Project
- User-agent: Gnus/5.1007 (Gnus v5.10.7) XEmacs/21.5-b23 (daikon, linux)
>>>>> "Ian" == Ian Wells <ijw@example.com> writes:

    Ian> The reason I was being argumentative with Steve and Python is
    Ian> that while I can see that Python has the same problem with
    Ian> source code encoding, from what he's saying it seems to take
    Ian> the approach that the encoding is set string by string,
    Ian> rather than setting for the whole (or remainder of) the file.
    Ian> Is that what you meant, Steve?

No.  In Python 2.x, there is a natural language text object, which for
historical reasons is called "Unicode" and whose literals are denoted
u"string".  Then there is raw memory, which for historical reasons is
called "string" and whose literals are denoted "string".  For
historical reasons, the raw memory object has continued to be heavily
abused as a container of natural language text.

What I don't like about Perl, as I understand your description, is
that Perl mandates that abuse (whatever happened to "there's always
more than one way to do it"? :-)

As Gabor pointed out, there is a flexible way of making Python as
DWIM-witted as Perl.  You can set the encoding for the file in the way
which has become common for many text editors (including Emacsen and
IIRC vim), by putting a specially-formatted comment (aka coding
cookie) at the top of the file.

    Ian> in Perl, I don't have to ever specify u"string".  This is a
    Ian> good thing, in my opinion, because I want strings to be
    Ian> stored as decoded (once I've set the source file coding) and
    Ian> not as binary data 99% of the time, and I'm prepared to use
    Ian> \x.. for the other 1%.

But according to you, this is exactly what Perl doesn't do.  It
decodes the text, then stores it as binary data, and depends on you to
not do something stupid.  This can work, but (a) it depends on
programmer discipline and (b) it is modal.  I.e., the "use utf8;"
declaration is at the top of the file, which the programmer may or may
not ever look at carefully.  In fact I would guess that it might
actually be in some other file entirely, since it's part of the
language.  The Python cookie can't be in another file, since it only
refers to the text of the file currently being read.

Both approaches have serious problems.  Python's is more readable
IMO, but the "convert at variable initialization" approach is the most
readable (though verbose).

-- 
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.
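
The distinction drawn in the message between natural language text and
raw memory, and the "convert at variable initialization" approach, can
be sketched roughly as follows.  This is only an illustration, not code
from the thread.  It assumes Python 3, where the text type the message
calls "Unicode" is named str and the raw memory type is named bytes;
the variable names are made up for the example.

```python
# -*- coding: utf-8 -*-
# The comment above is a coding cookie. It describes the encoding of
# this file only. (Python 3 already defaults to UTF-8 for source files,
# so here it is shown just to illustrate the form.)

# Raw memory: a sequence of bytes with no encoding attached.
# These nine bytes happen to be the UTF-8 encoding of three kanji.
raw = b"\xe6\x97\xa5\xe6\x9c\xac\xe8\xaa\x9e"

# "Convert at variable initialization": decode explicitly, once, at the
# point where the value enters the program, and name the encoding.
text = raw.decode("utf-8")

print(len(raw))   # counts bytes
print(len(text))  # counts characters

# Going back to raw memory is an explicit step as well.
assert text.encode("utf-8") == raw
```

The verbosity is the point of this style: the encoding is stated next
to the data it applies to, rather than in a declaration elsewhere.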