Mailing List Archive
Re: [tlug] Re: Piping stderr?
- Date: Thu, 27 Jun 2002 19:50:28 +0900
- From: Jiro SEKIBA <jir@example.com>
- Subject: Re: [tlug] Re: Piping stderr?
- References: <3D109EC0.4070703@example.com><87n0trz38q.fsf@example.com><s3t3cvj6oad.fsf@example.com><87r8j1w8ol.fsf@example.com><877kktb463.wl@example.com><87u1nxuguj.fsf@example.com><87n0tlmn5r.wl@example.com><87bsa1nsvr.fsf@example.com><87vg895g7k.wl@example.com><871yaxnjei.fsf@example.com><87adpkj70k.wl@example.com><87u1nrao03.fsf@example.com><87znxjhlw4.wl@example.com><87ptyf91b3.fsf@example.com><87wusnhfg8.wl@example.com><876607wlpj.fsf@example.com><87eleuz973.wl@example.com><87d6ues4tn.fsf@example.com><87n0thf6m2.wl@example.com><873cv9owxa.fsf@example.com>
- User-agent: Wanderlust/2.8.1 (Something) SEMI/1.14.3 (Ushinoya) FLIM/1.14.3(Unebigoryōmae) APEL/10.3 Emacs/20.7(i386-debian-linux-gnu) MULE/4.1 (AOI)
At 27 Jun 2002 19:09:21 +0900,
Stephen J. Turnbull <stephen@example.com> wrote:

>> Unicode is not THE codeset, but ONE OF CODESETS.
>
> I think you have been listening to Ohta-san's propaganda for too long.

???? Who is that? I really don't know who he is (oh, maybe a Unicode
hater :-) and I have never heard any propaganda.

> Unicode doesn't have "lots" of characters. CNS 11643 has "lots" of
> characters, more than Unicode 2.1 (and probably more than 3.2, but I
> haven't checked). Unicode _is_ THE Universal Character Set (UCS).
> Plus a whole bunch of essential algorithms for handling text.

So? It is still one of the codesets. I may invent GCS (Galaxy
Character Set).

> And if you really really have to have some character (or character
> set) that Unicode doesn't provide, there are a hundred thousand
> private space code points reserved for _you personally_. What's the
> problem?

Oh, I'm very sad: I can't exchange my super nice thesis with you,
because I needed to use my private area... if only I could use another
encoding...

> Exactly. But one of the reasons it doesn't fallback into C may very
> well be because the I18N library _thinks_ it's OK, but it's broken.
> This is not sufficient reason for my system to crash; dunno how you
> feel about that....

Ah, that's a library bug. It has nothing to do with CSI design. Fix
the library so that it thinks it's broken :-)

>> If so, this IS the UTF-8 hard coded programs issue.
>
> Who said "hard code" UTF-8? In fact, I don't need the programs
> I'm talking about to _ever_ interpret UTF-8. They interpret ASCII;
> anything containing non-ASCII is part of a string or a comment, and
> will be passed on verbatim or ignored. Validation, if necessary,
> should be done by other programs or the library functions called. All
> that needs to be hard-coded is recognition of a character:
>
>     /* yes, I know there are much faster table-driven ways to do this */
>     if ((*p & 0x80) == 0x00)       /* ASCII */
>         length = 1;
>     else if ((*p & 0xE0) == 0xC0)  /* multibyte */
>         length = 2;
>     else if ((*p & 0xF0) == 0xE0)
>         length = 3;
>     else if ((*p & 0xF8) == 0xF0)
>         length = 4;
>     else if ((*p & 0xFC) == 0xF8)
>         length = 5;
>     else if ((*p & 0xFE) == 0xFC)
>         length = 6;
>     else  /* illegal first byte, including 10xxxxxx */
>         abort();

That can be done by the library, according to the locale, if you write
a CSI program.

>> If you have ten UTF-8 hard coded programs, you have to fix
>> each program. On the other hand, with a CSI design you just fix
>> the library. The programs don't need to be modified at all.
>
> Wrong. Dangerous, ugly stuff like Shift JIS will be wandering around
> _inside_ my program. To handle it correctly, I will need extra code.
> In _all_ my CSI programs.

???? What is wrong? Which sentence? Or the whole paragraph? I don't
understand. Sorry ;-). And do you know what CSI is? Or, let's say,
wchar_t? You don't need to be aware of SJIS/EUC/UTF-8; that's CSI.
I'm very confused.

>> Even if this is not what you mentioned, it shows the bad
>> thing of UTF-8 hard code programs.
>
> I'm not advocating doing _anything_ by hard-coding in each program.

But you wrote a hard-coded program just above....

> I'm advocating that simple applications that need to be robust should
> restrict themselves to a single small library intended to do just one
> well-defined thing well: process Unicode character streams, character
> by character. No bidi, no composed characters, no interpretation of
> surrogates (illegal in UTF-8 but I don't need to care). And no
> steenkin' Shift JIS, Big Five, or NEC kanji.

Ah, this IS your point; finally, I get you. Then use Unicode for your
own purposes. Unicode may be the best solution for you, but not for
everybody.
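
For readers following the two positions above, here is a minimal sketch in C that puts them side by side. It is only an illustration, not code from either poster. The function names `utf8_seq_len` and `csi_count_chars` are hypothetical, and returning 0 (rather than calling `abort()`) for an illegal lead byte is an assumption of this sketch. The first function is the hard-coded lead-byte test with the 1-to-6-byte layout quoted above; note that in C `==` binds more tightly than `&`, so each mask test needs its own parentheses. The second function never inspects the encoding and leaves character boundaries to the standard `mbrlen()` call, which follows the current `LC_CTYPE` locale.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hard-coded approach: sequence length implied by a UTF-8 lead byte,
 * using the 1-to-6-byte layout from the quoted message.
 * Returns 0 for an illegal lead byte (assumption of this sketch;
 * the quoted code calls abort() instead). */
int utf8_seq_len(unsigned char lead)
{
    if ((lead & 0x80) == 0x00) return 1;  /* 0xxxxxxx: ASCII */
    if ((lead & 0xE0) == 0xC0) return 2;  /* 110xxxxx */
    if ((lead & 0xF0) == 0xE0) return 3;  /* 1110xxxx */
    if ((lead & 0xF8) == 0xF0) return 4;  /* 11110xxx */
    if ((lead & 0xFC) == 0xF8) return 5;  /* 111110xx */
    if ((lead & 0xFE) == 0xFC) return 6;  /* 1111110x */
    return 0;                             /* 10xxxxxx, 0xFE, 0xFF */
}

/* CSI approach: count the characters in s without knowing the
 * encoding. mbrlen() decodes according to the current LC_CTYPE
 * locale, so the caller is expected to have called
 * setlocale(LC_CTYPE, "") beforehand if it wants the user's locale.
 * Returns (size_t)-1 if s contains an invalid or truncated sequence. */
size_t csi_count_chars(const char *s)
{
    mbstate_t state;
    size_t remaining = strlen(s);
    size_t count = 0;

    memset(&state, 0, sizeof state);      /* initial conversion state */
    while (remaining > 0) {
        size_t n = mbrlen(s, remaining, &state);
        if (n == (size_t)-1 || n == (size_t)-2)
            return (size_t)-1;            /* invalid or incomplete */
        if (n == 0)
            break;                        /* defensive: embedded NUL */
        s += n;
        remaining -= n;
        count++;
    }
    return count;
}
```

A caller would typically run `setlocale(LC_CTYPE, "")` once at startup and then use `csi_count_chars()` on any locale-encoded string, whereas `utf8_seq_len()` gives the same answer regardless of locale. Current definitions of UTF-8 limit sequences to four bytes, so a stricter version of the first function would also reject the 5- and 6-byte lead bytes.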
> Are you talking about SETI?[1] No sane earthling will design a
> character set to be incompatible with Unicode ever again.

GB18030. But you don't have to worry about it. You are happy with
Unicode, I know. Not even UCS-4, I'm very impressed.

> CSI means that arbitrarily stupid character encodings (Shift JIS is a
> leading example) can get inside my program. This means that _my_
> program needs to deal with _their_ brain damage. I don't want my
> program to ever deal with Shift JIS. If my users want to see Shift
> JIS, I'll translate at the program boundary. I don't have a problem
> with that.

Again, do you know what CSI is? Fortunately, you have the choice not
to use SJIS, so you don't need to worry about SJIS getting into your
CSI programs :-).

--
Jiro SEKIBA | Web tools & AP Linux Competency Center, YSL, IBM Japan | email: jir@example.com, jir@example.com
- Follow-Ups:
- Re: [tlug] Re: Piping stderr?
- From: Stephen J. Turnbull
- References:
- [tlug] Piping stderr?
- From: Josh Glover
- Re: [tlug] Piping stderr?
- From: Stephen J. Turnbull
- [tlug] Re: Piping stderr?
- From: Mike Fabian
- [tlug] Re: Piping stderr?
- From: Stephen J. Turnbull
- Re: [tlug] Re: Piping stderr?
- From: Jiro SEKIBA
- Re: [tlug] Re: Piping stderr?
- From: Stephen J. Turnbull
- Re: [tlug] Re: Piping stderr?
- From: Jiro SEKIBA
- Re: [tlug] Re: Piping stderr?
- From: Stephen J. Turnbull
- Re: [tlug] Re: Piping stderr?
- From: Jiro SEKIBA
- Re: [tlug] Re: Piping stderr?
- From: Stephen J. Turnbull
- Re: [tlug] Re: Piping stderr?
- From: Jiro SEKIBA
- Re: [tlug] Re: Piping stderr?
- From: Stephen J. Turnbull
- Re: [tlug] Re: Piping stderr?
- From: Jiro SEKIBA
- Re: [tlug] Re: Piping stderr?
- From: Stephen J. Turnbull
- Re: [tlug] Re: Piping stderr?
- From: Jiro SEKIBA
- Re: [tlug] Re: Piping stderr?
- From: Stephen J. Turnbull
- Re: [tlug] Re: Piping stderr?
- From: Jiro SEKIBA
- Re: [tlug] Re: Piping stderr?
- From: Stephen J. Turnbull
- Re: [tlug] Re: Piping stderr?
- From: Jiro SEKIBA
- Re: [tlug] Re: Piping stderr?
- From: Stephen J. Turnbull