Re: tlug: Character Encodings Again
- To: tlug@example.com
- Subject: Re: tlug: Character Encodings Again
- From: Matt Gushee <matt@example.com>
- Date: Mon, 2 Nov 1998 18:12:08 +0900
- Content-Type: text/plain; charset=ISO-2022-JP
- In-Reply-To: <Pine.LNX.3.95.981102152404.204D-100000@example.com>
- References: <199811010747.QAA06953@example.com> <13885.27568.164020.485043@example.com> <Pine.LNX.3.95.981102152404.204D-100000@example.com>
- Reply-To: tlug@example.com
- Sender: owner-tlug@example.com

TLUG saves the day again!

J. David Beutel writes:

> > 41377 94 8481
> [...]
> > 65185 94 32289
> >
> > Do these values ring a bell with anyone? (I've been told that one side

> You can see the pattern when you convert to hex. 41377 = A1A1,
> and 8481 = 2121. So, the right side is the JIS X 0208 kuten (1,1),
> and the left side is the same thing in EUC.

Duhh! How did I manage to miss that? I did convert the numbers to hex
(or I thought I did ... maybe I was trying to convert them *from* hex
to decimal).

> You may want the Unicode table, JIS0208.TXT, from unicode.org.

Got it, thanks.

Stephen J. Turnbull writes:

> Matt> In case you're wondering what all this is about: I'm trying
> Matt> to write an SGML declaration that will allow the use of
> Matt> kanji in markup (e.g., instead of <par></par>, you could
> Matt> have <段落></段落> ... and so on.

> I'm not sure whether content characters and markup characters are kept
> separate in the software; you may need to recompile nsgmls to handle
> 16-bit character sets.

Actually, I'm a step ahead of you on that one. I already made sure to
compile it with -DSP_MULTIBYTE (actually, I think that's the default
when you compile it from the Jade distribution ... anyway, I made sure
of it).

> I'm not sure you're allowed to use any character sets in markup except
> ASCII, ISO-8859-1, and Unicode (aka ISO-10646/UCS-2).

According to the original SGML standard, no. But there are Extended
Naming Rules which supposedly do allow you to use any character set,
provided that you specify it properly in the SGML declaration. Or so I
gather from several very cursory sources. My searches of the Web and
Deja News give me the strong impression that very few people, even
among SGML wizards, really understand how it works.

> Have you looked at the standard with respect to that?

Ha-ha. I really should one of these days, shouldn't I. Unfortunately
SGML is another one of those standards you have to buy from ISO ...
it's 200 bucks or so. So, one of these days ...

Thanks once again, guys!

Matt Gushee
Oshamanbe, Hokkaido

----------------------------------------------------------------
Next Nomikai: 20 November, 19:30 Tengu TokyoEkiMae 03-3275-3691
Next Technical Meeting: 12 December, 12:30 HSBC Securities Office
----------------------------------------------------------------
more info: http://tlug.linux.or.jp Sponsors: PHT, HSBC Securities
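
For anyone who wants to check the pattern J. David Beutel describes in the message above, here is a minimal Python sketch. It is not from the original thread, and the function names `euc_to_jis` and `jis_to_kuten` are made up for illustration. It assumes the left column holds two-byte EUC-JP codes and the right column holds the matching JIS X 0208 codes, both written in decimal, as the quoted explanation suggests.

```python
# Illustrative sketch only; the helper names are hypothetical.

def euc_to_jis(euc: int) -> int:
    """Clear the high bit of each byte of a two-byte EUC-JP code,
    giving the corresponding 7-bit JIS X 0208 code."""
    return euc & 0x7F7F


def jis_to_kuten(jis: int) -> tuple:
    """Split a two-byte JIS X 0208 code into (row, cell), i.e. kuten.
    Each byte is offset by 0x20 from its kuten number."""
    high, low = divmod(jis, 0x100)
    return (high - 0x20, low - 0x20)


# The two table rows quoted in the message: (left column, right column).
rows = [(41377, 8481), (65185, 32289)]

for euc, jis in rows:
    print(f"{euc} = {euc:04X}   {jis} = {jis:04X}   kuten {jis_to_kuten(jis)}")
    # The right column should be the left column with the high bits cleared.
    assert euc_to_jis(euc) == jis
```

Printing the values in hex makes the relationship visible at a glance: each EUC byte is the JIS byte with the top bit set, so masking with 0x7F7F converts one to the other.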
- References:
  - tlug: Character Encodings Again
    - From: Matt Gushee <matt@example.com>
  - tlug: Character Encodings Again
    - From: "Stephen J. Turnbull" <turnbull@example.com>
  - Re: tlug: Character Encodings Again
    - From: "J. David Beutel" <jdb@example.com>