TLUG Mailing List

Re: [tlug] Re: Piping stderr?

Date: Mon, 01 Jul 2002 13:32:59 +0900

From: Jiro SEKIBA <jir@example.com>

Subject: Re: [tlug] Re: Piping stderr?

User-agent: Wanderlust/2.8.1 (Something) SEMI/1.14.3 (Ushinoya) FLIM/1.14.3(Unebigoryōmae) APEL/10.3 Emacs/20.7(i386-debian-linux-gnu) MULE/4.1 (AOI)
 Let me merge the threads.

At 28 Jun 2002 19:38:59 +0900,
Stephen J. Turnbull <stephen@example.com> wrote:

> But we're not talking about programs.  We're talking about the whole
> system.  CSI programs can do _nothing_ without the CSI library.  I
> just choose to use iconv(1,3) instead of something else.  Why are you
> allowed to use libcsi, but you won't let me even use standard features
> of libc?

 As you may have mentioned, all the CSI-related functions (mb*/wc*) are
standard features.  Most are defined in XPG4 (a few may be in XPG5),
while iconv is XPG5.  This means that if a system has iconv, it also
has the CSI-related functions.  You don't need libcsi :-).
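
 As a rough illustration of those standard mb*/wc* functions, here is a
minimal sketch that counts characters rather than bytes in whatever
codeset the current locale uses.  The function name count_characters is
made up for this example, and treating a decoded NUL as exactly one byte
is an assumption that holds for stateless encodings.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Count the characters (not bytes) in the first len bytes of s,
 * decoding with the codeset of the current locale.
 * Returns (size_t)-1 on an invalid or truncated sequence. */
size_t count_characters(const char *s, size_t len)
{
	mbstate_t state;
	size_t count = 0;

	memset(&state, 0, sizeof state);        /* initial conversion state */
	while (len > 0) {
		wchar_t wc;
		size_t n = mbrtowc(&wc, s, len, &state);

		if (n == (size_t)-1 || n == (size_t)-2)
			return (size_t)-1;      /* invalid or incomplete */
		if (n == 0)
			n = 1;                  /* assumption: a NUL is one byte */
		s += n;
		len -= n;
		count++;
	}
	return count;
}
```

 A caller would run setlocale(LC_ALL, "") first.  The same code then
counts HIRAGANA LETTER A as one character whether it arrives as two bytes
(EUC-JP) or three bytes (UTF-8), with no branch on the codeset.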

> So there is no loss whatsoever to writing the main program to handle
> UTF-8, and only UTF-8.

 As I said before, there may be code points that can't be mapped to UTF-8.
A program that hard-codes UTF-8 can't handle that case.

>> Separating codset dependent part from programs is the point,
> 
> Not to me.  To me the point is _supporting character sets_ for the
> user, while avoiding any _branches on coded character set_ in the
> program's logic to simplify the programmer's job.

 Yes, "supporting character sets" is the aim; CSI is a means.

 CSI is just like the Unices, and each codeset is like a device.
Access to each device is abstracted by an API (open/read/write, etc.).
Hard-coded programs, on the other hand, are like programs that use ioperm(2).
UTF-8 may be the USB device, but it's not the only one.

At 28 Jun 2002 20:47:01 +0900,
Stephen J. Turnbull <stephen@example.com> wrote:

>> For example, russian characters have 2 width in EUC-JP, but
>> in Unicode it's 1.  If programs knows, original encoding, it
>> can correct that information.
> 
> So much for CSI.  :-)

 Huh, yes: if UTF-8 were the only codeset, we wouldn't need CSI.
But the point is that we already have several codesets.

> Do you have an URL which standardizes this CSI API?  All Google turned
> up was a bunch of flamewars, with the Sun people saying "but we've
> already implemented CSI, and gotten it certified by the Chinese, so
> we'd have to have multiple binaries," and Markus Kuhn and Bruno Haible
> saying, "yeah, well, why shouldn't the rest of us take the opportunity
> to standardize on a single sane coded character set, and add the
> necessary properties to that standard"?

 It's defined in POSIX; check the Open Group's locale-related pages.

> It seems CSI is basically about sucking up to the Chinese and other
> nationalists, providing standard heuristics for Japanese TTYs
> (wcwidth), and fighting off the Microsoft dragon, while pretending to
> be "more general".

 No.  It is about abstracting codesets, that's all.
Even with the UTF-8 codeset, we still need to measure width.
In fact, uxterm uses wcwidth or a similar function.
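
 To make the width point concrete, here is a minimal sketch of measuring
terminal columns through the standard API.  The name display_columns and
the 256-character buffer limit are assumptions made for this example;
mbstowcs and wcswidth are the standard calls.

```c
#define _XOPEN_SOURCE 700       /* exposes wcwidth()/wcswidth() on glibc */
#include <stdlib.h>
#include <wchar.h>

/* Return the number of terminal columns needed for the multibyte
 * string s in the current locale, or -1 if s cannot be decoded,
 * is longer than 255 characters, or contains a non-printable
 * character. */
int display_columns(const char *s)
{
	wchar_t wide[256];
	size_t n = mbstowcs(wide, s, 256);

	if (n == (size_t)-1 || n >= 256)
		return -1;
	return wcswidth(wide, n);
}
```

 After setlocale(LC_ALL, ""), the width reported for a character comes
from the locale's own tables, so the program itself contains no
per-codeset width logic.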

> How about the users, who want to use characters, not code points?  And
> the programmers, who would like to be able to stop worrying about the
> damage that characters they don't know about can do to their file
> system (for example)?

 Strip such locales, that's all.

 Let me show you a simple CSI program.
#let me forget about error handling ;-)

-------------8<-------------8<-------------8<-------------
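#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <wchar.h>

int main(void)
{
	wchar_t wc;             /* slipper */
	char buffer[16];        /* getabako */
	ssize_t n;

	setlocale(LC_ALL, "");  /* magic spell: use the user's locale */

	n = read(0, buffer, sizeof buffer);     /* shoes in getabako */
	if (n <= 0)             /* bare minimum checks only */
		return 1;
	/* Genkan: decode one character, looking only at the bytes read */
	if (mbtowc(&wc, buffer, (size_t)n) <= 0)
		return 1;
	printf("%lc\n", (wint_t)wc);

	return 0;
}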
-------------8<-------------8<-------------8<-------------

 And wchar_t on Linux happens to be UCS-4.
I think that satisfies almost everything you want.
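
 Since "happens to be" is a property of the platform and not of the API,
a program that wants to rely on it can check instead of assuming.  A
minimal sketch: C99 defines the macro __STDC_ISO_10646__ when wchar_t
values are ISO/IEC 10646 code points.  The two function names below are
invented for this example.

```c
#include <limits.h>
#include <stddef.h>

/* Returns 1 if the implementation promises that wchar_t values are
 * ISO/IEC 10646 (Unicode) code points, 0 if it makes no such promise. */
int wchar_is_iso10646(void)
{
#ifdef __STDC_ISO_10646__
	return 1;
#else
	return 0;
#endif
}

/* Width of wchar_t in bits; UCS-4 needs 32. */
int wchar_bits(void)
{
	return (int)(sizeof(wchar_t) * CHAR_BIT);
}
```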
-- 
Jiro SEKIBA | Web tools & AP Linux Competency Center, YSL, IBM Japan
            | email: jir@example.com, jir@example.com
Follow-Ups:

Re: [tlug] Re: Piping stderr?
From: Stephen J. Turnbull
