Mailing List Archive



Re: [tlug] utf form problems



Marty Pauley wrote:
Hello
...
I'm pretty sure the problem is how you are getting the value for
$input.  Please post that bit of code.

    my $input = undef;
    if (exists $ENV{CONTENT_LENGTH}) {
        read(STDIN, $input, $ENV{CONTENT_LENGTH}, 0);
    }



It looks like the encoded string being sent to the server is wrong.

No, the encoded string is correct.
Thank you.


I have debug statements to look at what is arriving on stdin

You're reading stdin directly!? Don't do that.
What else would one do?  I know I should be checking the
input string before I do anything with it, but I'm not that
far yet.  So far the whole test form does nothing but submit
a hidden string to see if it arrives intact on the other end.
This is my first CGI in some time and my first ever using
UTF-8.  I'm still feeling my way around.
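
One common alternative to reading STDIN by hand is to let a form-parsing module such as CGI.pm collect the parameters (its param() method returns the submitted values). To show roughly what such a parser has to do with a body like the one in this test, here is a minimal sketch. The helper name parse_form_body is hypothetical, and the sketch assumes the form is submitted as application/x-www-form-urlencoded with UTF-8 text; it is not the code from the attached script.

```perl
use strict;
use warnings;
use Encode qw(decode);
use URI::Escape qw(uri_unescape);

# Hypothetical helper: turn a urlencoded form body into a hash of
# character strings. Assumes the browser sent UTF-8.
sub parse_form_body {
    my ($body) = @_;
    my %param;
    for my $pair (split /[&;]/, $body) {
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $name && length $name;
        $value = '' unless defined $value;
        for ($name, $value) {
            tr/+/ /;                                  # '+' encodes a space
            $_ = decode('UTF-8', uri_unescape($_));   # bytes -> characters
        }
        $param{$name} = $value;   # a repeated name keeps only the last value
    }
    return \%param;
}

my $params = parse_form_body('rtk_kanji_1=%E5%8A%A9');
printf "rtk_kanji_1 is %d character(s) long\n", length $params->{rtk_kanji_1};
```

The point of the sketch is only that unescaping and decoding are two separate steps; a real script would also want the length and validity checks mentioned above.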

 and the debug output in the browser looks like:
    $input = "rtk_kanji_1=%E5%8A%A9"
    $output = "rtk_kanji_1=%E5%8A%A9"

$ perl -MURI::Escape -wle 'print uri_unescape "%E5%8A%A9";'
助

ok... I'm getting more confused now.  From the statement
above it looks like what I should be doing is something like:
my $uncoded = uri_unescape($input);

At this point I'm even more confused than before.  If I step
through the debugger and set the input string by hand,
things look fine.  But I get mojibake on output from the
test form:
  To test that we send utf8, print a kanji:助
  Print input and unencoded strings:
  $input = "rtk_kanji_1=%E5%8A%A9"
  $unencoded = "rtk_kanji_1=助"
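
A likely explanation for output like this, offered as an assumption rather than a confirmed diagnosis: uri_unescape returns a string of three bytes (E5 8A A9), not one character. If STDOUT has a :utf8 layer, each of those bytes is treated as a Latin-1 character and encoded again on the way out, which produces mojibake. The sketch below shows the difference between the raw bytes, the double-encoded form, and the string after Encode::decode. The helper name decode_form_value is hypothetical.

```perl
use strict;
use warnings;
use Encode qw(decode encode);
use URI::Escape qw(uri_unescape);

# Hypothetical helper: unescape, then decode UTF-8 bytes into characters.
sub decode_form_value {
    my ($escaped) = @_;
    my $bytes = uri_unescape($escaped);   # byte string: E5 8A A9
    return decode('UTF-8', $bytes);       # character string: U+52A9
}

my $bytes   = uri_unescape('%E5%8A%A9');
my $doubled = encode('UTF-8', $bytes);    # what a :utf8 layer does to the raw bytes
my $chars   = decode_form_value('%E5%8A%A9');

print "raw bytes:      ", length($bytes),   "\n";   # 3
print "double-encoded: ", length($doubled), "\n";   # 6
print "decoded chars:  ", length($chars),   "\n";   # 1

binmode(STDOUT, ':encoding(UTF-8)');
print "decoded value:  $chars\n";
```

In the debug script this would mean decoding the result of uri_unescape before printing it through the UTF-8 output layer.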

I have pretty much a bare-bones test.  The HTML form is 13
lines, the debug script is 23 lines, and both are attached.
I'm not sure if tlug will pass on the attachments, so they
are also available at http://sonic.net/~sjs/t.html.txt and
http://sonic.net/~sjs/DEBUG.txt

I think it's time to go hit the gym and blow off some steam.
Maybe this will make more sense afterward.

Any ideas would be appreciated.

Thanks.
Steve S.






 
#!/nfs/httpd/cgi-bin/sjs/perl/bin/perl -w -CAO
$| = 1;

use utf8;
use open ':utf8';
use open ':std';

use URI::Escape;

print  "content-type: text/plain; charset=UTF-8\n\n";

print "To test that we send utf8, print a kanji:助\n";

my $input;
if (exists $ENV{CONTENT_LENGTH} &&  $ENV{CONTENT_LENGTH} < 50) {
	read(STDIN,$input,$ENV{CONTENT_LENGTH},0);
}
if ($input) {
	print "Print input and unencoded strings:\n";
	print "\$input = \"$input\"\n";
	my $unencoded = uri_unescape($input);
	print "\$unencoded = \"$unencoded\"\n";
}
