
Mailing List Archive
Re: [tlug] unicode and Perl - how to pass command line unicode arguments
Stephen J. Turnbull wrote:
>>>>>>"gabor" == gabor <gabor@example.com> writes:
>
>
> gabor> in python byte-strings are objects and unicode-strings are
> gabor> objects too. you create a byte string for example like
> gabor> this:
>
> gabor> string1 = "byte string"
>
> Unfortunately, "これは日本語です。" will produce a string which is
> encoded Japanese (with whatever encoding the file is saved in), but
>
> gabor> string2 = u"byte string"
>
> u"これは日本語です。" does not produce Unicode-encoded Japanese. It
> may work with PEP 263 coding cookies, but this is unreliable in the
> Japanese environment (because of the multiplicity of incompatible
> encodings).
could you explain this part to me? why would the coding declaration in
your own source code be unreliable? :)
for example, this works fine:
=======
#!/usr/bin/python
# -*- coding: utf-8 -*-
text = u"これは日本語です"
print len(text)
========
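the output is 8, one for each character, so the literal really was
decoded into a unicode string (the utf-8 bytes alone would be 24 long).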
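to compare the character count with the byte count directly, here is a
small sketch. it assumes python 3 (where every str is unicode and bytes is
the byte string type), not the python 2 of the script above, and the names
data, char_count and byte_count are made up for the illustration:

```python
# -*- coding: utf-8 -*-
# sketch for python 3: str is the unicode type, bytes is the byte string type
# (in the python 2 script above these were u"..." and "...")

text = "これは日本語です"      # unicode string, as in the script above
data = text.encode("utf-8")    # the same text as a utf-8 byte string

char_count = len(text)         # counts characters
byte_count = len(data)         # counts bytes; each of these characters takes 3

print(char_count)              # 8
print(byte_count)              # 24
print(data.decode("utf-8") == text)   # True: decoding gets the text back
```

so a length of 8 means python counted characters, not bytes.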
> I argued strenuously for an XML-like "default to UTF-8" policy with
> optional codecs for loading Python code, but Guido refused on the
> basis of backward compatibility (ie, lots of Europeans were using 8
> bit encodings in existing production code).
>
hmm.. i would also prefer to use utf-8 as the default instead of ascii..
btw. the ascii default does not even help people who use latin-1: without
that pep263-setting, auto-converting a latin-1 bytestring that contains
non-ascii characters to unicode will end with an exception
(UnicodeDecodeError).
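
a rough sketch of that failure, assuming python 3, where the conversion has
to be spelled out with decode() (python 2 did it implicitly with the ascii
codec). try_decode is a made-up helper, not a library function:

```python
# -*- coding: utf-8 -*-
# sketch for python 3: decode a latin-1 byte string with two different codecs

def try_decode(raw, codec):
    """Hypothetical helper: return the decoded text, or None if decoding fails."""
    try:
        return raw.decode(codec)
    except UnicodeDecodeError:
        return None

latin1_bytes = "café".encode("latin-1")   # b'caf\xe9', one byte above 0x7f

ascii_result = try_decode(latin1_bytes, "ascii")      # ascii cannot decode 0xe9
latin1_result = try_decode(latin1_bytes, "latin-1")   # the right codec works

print(ascii_result)    # None
print(latin1_result)   # café
```

the bytes are fine, it is only the ascii assumption that breaks.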
gabor