
Re: [tlug] utf form problems: solution
- Date: Sun, 02 Mar 2008 10:22:59 -0800
- From: steven smith <sjs@example.com>
- Subject: Re: [tlug] utf form problems: solution
David Shanahan wrote:
Sounds like perl might be treating your string as bytes rather than a
Unicode string,
then munging them on output.
You should read the docs on the Encode module;
perldoc Encode
You may need to do
$string = decode("utf8", param("stuff"));
to tell perl that the %E5%8A%A9 bytes from the form are actually a utf8 string.
Also look at the docs on the -C option ...
I needed three things: -CO for printing UTF-8 to stdout, decode to tell perl
that the input string was utf8, and 'use utf8' because the script file itself contained utf8.
This was my minimal debug script for testing the post form.
#!.../perl -CO
# -CO: write UTF-8 to stdout. The POST body itself is URL-encoded ASCII.
use strict;
use warnings;
use utf8; # this file contains UTF-8 characters
use Encode;
use URI::Escape;
my $input;
# Read the POST body; CONTENT_LENGTH is set by the web server.
read(STDIN, $input, $ENV{CONTENT_LENGTH}, 0);
# Unescape the %XX sequences to bytes, then tell perl the bytes are UTF-8.
my $unencoded = decode('utf8', uri_unescape($input));
print "Content-Type: text/plain; charset=UTF-8\n\n",
      "今日は世界\n", # "hello world" in Japanese
      $unencoded;
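For anyone following along, here is a small sketch of why the decode step matters. It is only an illustration, not part of the CGI script: the value %E5%8A%A9 is taken from David's example, the variable names are made up, and binmode stands in for the -CO switch so the snippet can be run straight from a shell.

```perl
use strict;
use warnings;
use Encode qw(decode);
use URI::Escape qw(uri_unescape);

# Example form value from the thread: one character, percent-encoded as UTF-8.
my $escaped = '%E5%8A%A9';

# uri_unescape gives back three raw bytes, not one character.
my $bytes = uri_unescape($escaped);

# decode turns those bytes into a Perl character string.
my $chars = decode('UTF-8', $bytes);

# Assumption: binmode is used here in place of -CO on the #! line.
binmode(STDOUT, ':encoding(UTF-8)');
printf "bytes: %d, characters: %d\n", length($bytes), length($chars);
print "$chars\n";
```

Without the decode call perl still sees three separate bytes, and encoding them again on output is what produces the garbage.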
Next I'll work on reading stdin safely.
Thanks everyone!
Steve S.