Re: [tlug] Re: Piping stderr?

At 27 Jun 2002 22:12:34 +0900,
Stephen J. Turnbull wrote:
>>  Oh I'm very sad, I can't exchange my super nice thesis with
>> you, because I needed to use my private area.. if only I could
>> use another encoding...
> _You_ can use anything you want.  Why do you think _I'm_ more likely to
> have support for your personal encoding than the Unicode Consortium is?
> You lose, either way.  Except that I'm _much_ more likely to have, or
> be able to find, Unicode support for unusual character sets.  All I
> need is a font and a CMap.

 You missed the point ;-).  The point is that a situation where I (or
somebody) need characters which are not defined will happen, and then
we can't exchange the plain text.

> Anyway, Adobe _solved_ the "gaiji problem."  Didn't you notice?

 ???  Is this something to do with the codeset?
Can I exchange my plain text which uses the private area?

>> And do you know CSI?  Or let's say wchar_t?
> wchar_t I know.  wchar_t is irrelevant to what we're talking about, a
> pure implementation detail.

 How can I do CSI without wchar_t?  I really would like to know.
Well, but that's not the point.  The point is the internal representation.

> I've read a little bit about CSI on the web.  As far as I can tell, it
> implies that I will have externally-encoded strings being processed
> inside my app.  CSI provides assistance in handling those strings, but
> if my app needs to understand the semantics of that text (eg, to
> create a file name to pass to the OS---in which case I _sure do_ need
> to know about Shift JIS, which embeds both of the common path
> separators in multibyte characters), it needs to handle the encoding
> of the text.

 You are still lost in the CSI forest ;-).  CSI is CodeSet Independent.
The policy is more or less that programs shouldn't look at the codeset,
not merely that they 'need not'.  So if escaping is needed in some
codeset, keep it separate from the program and ask the library to do it.
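
 A minimal sketch of why byte-level handling is codeset dependent, in
Python (the language and the helper names naive_has_backslash and
decoded_has_backslash are assumptions for illustration, not something
from this thread).  It shows only the backslash case: Shift JIS encodes
the character 表 with a trail byte equal to the backslash code, so a
byte scan and a character scan disagree.

```python
# Shift JIS encodes U+8868 (表) as the two bytes 0x95 0x5C; the trail
# byte 0x5C is also the single-byte code for the backslash separator.
SAMPLE = "表"


def naive_has_backslash(raw: bytes) -> bool:
    """Byte-level scan that ignores multibyte character boundaries."""
    return b"\\" in raw


def decoded_has_backslash(raw: bytes, encoding: str) -> bool:
    """Decode first, then look for the separator as a character."""
    return "\\" in raw.decode(encoding)


sjis = SAMPLE.encode("shift_jis")
print(sjis.hex())                                # the raw bytes
print(naive_has_backslash(sjis))                 # result of the byte scan
print(decoded_has_backslash(sjis, "shift_jis"))  # result of the character scan
```

 The decode step here stands in for "ask the library to do it": the
program never inspects lead and trail bytes itself.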

 But in this case, it's a problem of the SJIS codeset, not of CSI.
Of course, it's better to use a File System Safe encoding to create
new files, etc.  But there is no need to forbid using encodings other
than UTF-8.

> True?  Or does it do what I want, which is translate everything to
> a common representation internally?

 Sure, but the internal representation is abstracted by wchar_t, O.K.?
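
 As a rough analogy only (Python's str is standing in for wchar_t here,
and to_internal / to_external are hypothetical helper names), the idea
of one internal representation behind several external codesets can be
sketched like this:

```python
def to_internal(raw: bytes, encoding: str) -> str:
    """Decode externally encoded bytes into the internal representation."""
    return raw.decode(encoding)


def to_external(text: str, encoding: str) -> bytes:
    """Encode the internal representation back into an external codeset."""
    return text.encode(encoding)


TEXT = "日本語"
ENCODINGS = ("shift_jis", "euc_jp", "utf-8")

# The same text has a different byte sequence in each external codeset...
external = {enc: to_external(TEXT, enc) for enc in ENCODINGS}
# ...but a single form once it has been brought inside the program.
internal = {enc: to_internal(raw, enc) for enc, raw in external.items()}

print(len(set(external.values())))  # number of distinct byte sequences
print(len(set(internal.values())))  # number of distinct internal strings
```

 The program logic only ever sees the internal form; which codeset the
bytes came from is decided at the boundary.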

>> So you don't need to worry about SJIS going into your CSI
>> programs :-).
> No, I don't.  I won't have any, most especially not CGIs, unless I
> hear some advantage over the "check your Shift JIS at the door"
> policy.

 I would much prefer not to USE SJIS :-).
But CSI gives a choice: to use it or not.
Maybe you want to forbid using SJIS.  If so, strip the SJIS locale from
the system; then CSI programs will never be able to use SJIS, and
they fall back to the 'C' locale.
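
 A small sketch of that fallback, again in Python (effective_locale is
a hypothetical helper, and the locale name "ja_JP.SJIS" is only an
example; actual names vary between systems).  In C a failed setlocale()
returns NULL and leaves the current locale alone; Python's locale module
raises locale.Error instead, but the effect on the process is the same.

```python
import locale


def effective_locale(requested: str) -> str:
    """Try `requested`; report the locale actually in effect afterwards."""
    # Start from the "C" locale, as a freshly started C program would.
    locale.setlocale(locale.LC_ALL, "C")
    try:
        locale.setlocale(locale.LC_ALL, requested)
    except locale.Error:
        # The locale is not installed: nothing changed, "C" stays in effect.
        pass
    # Called without a second argument, setlocale only queries the setting.
    return locale.setlocale(locale.LC_ALL)


# The result depends on which locales are installed on the system.
print(effective_locale("ja_JP.SJIS"))
```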

Jiro SEKIBA | Web tools & AP Linux Competency Center, YSL, IBM Japan
