Mailing List Archive
Re: [tlug] utf form problems: solution
- Date: Sun, 02 Mar 2008 10:22:59 -0800
- From: steven smith <sjs@example.com>
- Subject: Re: [tlug] utf form problems: solution
- References: <47C3B7E0.9020906@sonic.net> <47C3BB7C.2000605@ldp.jp> <47C4491A.6010002@sonic.net> <47C49FF8.1090107@samsara.bebear.net> <47C4B08D.907@sonic.net> <82c89d700802271034s26138f60k6c1b3c54e0648764@mail.gmail.com>
- User-agent: Thunderbird 1.5.0.14ubu (X11/20080227)
David Shanahan wrote:
Sounds like perl might be treating your string as bytes rather than a unicode string, then munching them on output.
You should read the docs on the Encode module: perldoc Encode
You may need to do $string = decode("utf8", param("stuff")); to tell perl that the %E5%8A%A9 bytes from the form are actually a utf8 string.
Also look at the docs on the -C option ...
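To make the bytes-versus-characters point concrete, here is a small sketch using the %E5%8A%A9 value from the advice above. It is only an illustration: the regex is a core-only stand-in for uri_unescape from URI::Escape, and the variable names are made up for this example.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Percent-encoded form value quoted in the message above.
my $raw = '%E5%8A%A9';

# Core-only stand-in for URI::Escape::uri_unescape (an assumption made
# so the sketch has no CPAN dependency): turn each %XX into one octet.
(my $bytes = $raw) =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;

# Tell perl those octets are UTF-8, giving a character string.
my $chars = decode('UTF-8', $bytes);

printf "bytes: %d, characters: %d, code point: U+%04X\n",
    length($bytes), length($chars), ord($chars);
# prints "bytes: 3, characters: 1, code point: U+52A9"
```

Before decode() perl sees three unrelated octets; afterwards it sees one character, which is what an output layer such as -CO expects to be handed.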
I needed -CO for printing to stdout, decode() to tell perl the string was utf8, and 'use utf8' because the file itself contained utf8. This was my minimal debug script for testing the post form.
#!.../perl -CO
# -CO: output unicode to stdout. Input is ascii (percent-encoded).
use utf8;          # this file contains utf-8 characters
use Encode;
use URI::Escape;

my $input;
read(STDIN, $input, $ENV{CONTENT_LENGTH}, 0);

# Tell perl this is a utf-8 string.
my $unencoded = decode('utf8', uri_unescape($input));

print "content-type: text/plain; charset=UTF-8\n\n",
      "今日は世界\n",   # "hello world" in Japanese
      $unencoded;
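For anyone adapting the script, here is a hedged variant that sets the output encoding with binmode instead of -CO and bounds the read from stdin. This is not what I actually ran: read_body, decode_body and the 64 KB limit are hypothetical names and values chosen for the sketch, and the regex again stands in for uri_unescape. One reason to consider binmode is that, as far as I know, perl 5.10.1 and later refuse -C on the #! line unless it is also given on the command line.

```perl
use strict;
use warnings;
use utf8;                      # this file contains UTF-8 characters
use Encode qw(decode);

my $MAX_BODY = 64 * 1024;      # hypothetical upper bound on the POST body

# Read exactly $declared bytes from $fh, or return '' if the declared
# length is missing, not a plain number, or larger than $MAX_BODY.
sub read_body {
    my ($fh, $declared) = @_;
    return '' unless defined $declared && $declared =~ /^\d+$/;
    return '' if $declared > $MAX_BODY;
    my $buf = '';
    while (length($buf) < $declared) {
        my $n = read($fh, $buf, $declared - length($buf), length($buf));
        last unless $n;        # stop on EOF (0) or error (undef)
    }
    return $buf;
}

# Undo form encoding ('+' and %XX), then decode the octets as UTF-8.
sub decode_body {
    my ($raw) = @_;
    (my $bytes = $raw) =~ tr/+/ /;
    $bytes =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return decode('UTF-8', $bytes);
}

binmode(STDOUT, ':encoding(UTF-8)');   # takes the place of -CO
my $text = decode_body(read_body(\*STDIN, $ENV{CONTENT_LENGTH}));
print "Content-Type: text/plain; charset=UTF-8\n\n", "今日は世界\n", $text, "\n";
```

Run outside a CGI environment, CONTENT_LENGTH is unset, so nothing is read from stdin and only the header and the test string are printed.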
Next I will work on reading stdin safely.
Thanks everyone!
Steve S.