Re: tlug: Umlauts & Kanji



--------------------------------------------------------
tlug note from "Stephen J. Turnbull" <turnbull@example.com>
--------------------------------------------------------
>>>>> "Jim" == Jim Breen <jwb@example.com> writes:

    Jim> I can't imagine needing to convert existing .html from one
    Jim> coding to another.

Beats trying to explain in Japanese to students how to view my
ISO-2022-JP pages when they've got their browser locked into SJIS.
Maybe this doesn't happen anymore....
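For what it's worth, converting a page between codings can be sketched in a few lines of Python. This is only an illustration, not something from the thread: the helper name `transcode` and the sample markup are made up, and the codec names are Python's standard ones.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Decode bytes from one coding and re-encode them in another."""
    return data.decode(source).encode(target)


# Hypothetical page fragment, stored as SJIS.
sjis_page = "<p>日本語</p>".encode("shift_jis")

# The same fragment as ISO-2022-JP, which uses only 7-bit bytes.
jis_page = transcode(sjis_page, "shift_jis", "iso2022_jp")
print(jis_page)
```

Note that this only rewrites the bytes. Any charset declaration inside the page, or in the server's headers, would still have to be changed by hand.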

Then there's server-parsed HTML.  I don't trust the server to do the 
right thing if you include an SJIS file into an EUC file, but I've
never tried.

    Jim> At present I think you need to do it by coding your Japanese
    Jim> as iso-2022-jp. For the two-byte codes such as EUC and SJIS,
                                 ~~~~~~~~
    SJT> 8-bit? -----------------+

    Jim> No, I meant two-byte, i.e. 16-bit (OK, EUC can be 3-bytes).

I'm still confused.  In what sense is JIS not 2 bytes?  In what sense
are half-width katakana two bytes?  The issue is the 8th bit, right?
That's why SJIS half-width kana are incompatible with umlauts AFAIK.
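The 8th-bit point can be checked with a small Python sketch. It assumes Python's standard `iso2022_jp`, `euc_jp`, `shift_jis`, and `latin-1` codecs; the helper name `uses_eighth_bit` is made up for illustration.

```python
def uses_eighth_bit(data: bytes) -> bool:
    """True if any byte in data has its high (8th) bit set."""
    return any(b >= 0x80 for b in data)


text = "日本語"
for codec in ("iso2022_jp", "euc_jp", "shift_jis"):
    encoded = text.encode(codec)
    # ISO-2022-JP stays 7-bit; EUC and SJIS both set the 8th bit.
    print(codec, encoded.hex(), uses_eighth_bit(encoded))

# One and the same byte, read under two different codings.
byte = b"\xc4"
as_latin1 = byte.decode("latin-1")   # A with umlaut (U+00C4)
as_sjis = byte.decode("shift_jis")   # half-width katakana TO (U+FF84)
print(as_latin1, as_sjis)
```

The last two lines show the collision: 0xC4 is an umlaut in ISO-8859-1 and a single-byte half-width kana in SJIS, so a page cannot carry both meanings at once.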

    Jim> the code ranges collide with those in iso-8859-1, so the
    Jim> normal way of doing "Latin" diacritic marks is not available.
    >>> I believe the Mule/W3 sample page includes EUC Japanese, not
    >>> ISO-2022-JP, but I could be wrong.  Aren't mode shifts
    >>> possible in 8-bit EUC?

    Jim> Have you got a URL for this? I thought Mule stuck to ISO-2022
    Jim> so that it could be multi-lingual.

The short answer is you're right about that page.  Mule, however, is
pretty darn good at guessing character sets (at least US-ASCII, tee
hee, and the 3 major Japanese dialects; I don't do any Chinese or
Korean so I dunno what would happen if you fed it Korean EUC, say).
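A very crude version of that kind of guessing can be sketched in Python by trial decoding. This is not Mule's algorithm, just an assumption-laden illustration: the function name `guess_japanese_encoding` and the order of candidates are made up, and short or ambiguous input can easily be misidentified.

```python
def guess_japanese_encoding(data: bytes) -> str:
    """Return the first candidate codec that decodes data without error."""
    # ISO-2022-JP is 7-bit and switches character sets with ESC sequences,
    # so look for ESC before falling back to plain ASCII.
    candidates = ["iso2022_jp"] if b"\x1b" in data else []
    candidates += ["ascii", "euc_jp", "shift_jis"]
    for codec in candidates:
        try:
            data.decode(codec)
        except UnicodeDecodeError:
            continue
        return codec
    return "unknown"


for codec in ("ascii", "iso2022_jp", "euc_jp", "shift_jis"):
    sample = ("hello" if codec == "ascii" else "日本語").encode(codec)
    print(codec, "->", guess_japanese_encoding(sample))
```

Trying EUC before SJIS is an arbitrary choice here; some byte strings are valid in both, and a real detector would weigh character frequencies rather than take the first codec that happens to decode.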

!@#$% where is IT!!!  Mule _used_ to come with a document called hello
... no, demo ... grrrrrr it's demo.PS and has eexec-encoded Postscript 
Type-1 inline ... ah, yes, NTT had it ... oh, no, NTT has been
downgraded from a toplevel domain to a mere .co.jp, but there it is...

       http://www.ntt.co.jp/japan/note-on-JP/multi-example.html

You're welcome :-)

    Jim> EUC doesn't have mode shifts, unless you call the "8F" a
    Jim> shift into JIS212. (I don't; I regard it as a 3-byte code.)

No, 'twas just a thinko.  Nishikimi, et al, "Maruchiringaru Kankyou no
Jitsugen" has a figure with the various EUC codes associated; guessing
the meaning was MUCH easier than reading the Japanese.  I thought there
were escape sequences that allowed shifting from one national version
of EUC to another, but I can't find them now :-)
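To make the "8F" point concrete, here is a rough Python sketch that splits a Japanese EUC byte string into characters by looking only at the lead byte. The function name `split_euc_jp` is made up, nothing is validated, and the final three bytes of the sample are simply an 8F-prefixed sequence chosen for illustration.

```python
def split_euc_jp(data: bytes) -> list:
    """Split EUC-JP bytes into per-character chunks (no validation)."""
    chars = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead == 0x8F:    # prefix for JIS X 0212: three bytes in total
            width = 3
        elif lead == 0x8E:  # prefix for half-width katakana: two bytes
            width = 2
        elif lead >= 0xA1:  # JIS X 0208: two bytes
            width = 2
        else:               # ASCII: one byte
            width = 1
        chars.append(data[i:i + width])
        i += width
    return chars


# ASCII, a JIS X 0208 kana, a half-width kana, then an 8F-prefixed sequence.
sample = "Aあｱ".encode("euc_jp") + b"\x8f\xb0\xa1"
print(split_euc_jp(sample))
```

Each character's width is known from its first byte alone, with no state carried over from earlier bytes, which fits the reading of 8F as part of a 3-byte code rather than a mode shift.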

    >>> Arena-I18N versions (well that was 6 months ago, actually) did
    >>> it this way....

    Jim> Yes that's how I understand it too.

Nishikimi et al claim that Arena-I18N did Unicode as of mid '96,
though.  If you have the fonts.

    Jim> Unicode will, of course, fix all this.
    >>> Uh-huh.  This is Heaven's Preordained Course (YOW! unified
    >>> OUTput AND inPUT METH'uds), but....

    Jim> Now, now. Any suggestions of BETTER ways of combining
    Jim> Japanese with languages like French or Swedish?

Oh, I'm just a Zippy fan.  I didn't mean to be a Pinhead about it.  It
is heaven's preordained course; after our last exchange I went and
asked around, and nobody hardly ever uses JIS-0212, let alone the last
90,000 code points of the CNS sets.  If you really need 'em, there
will still be UCS-4.  Maybe ... Mule will probably just live forever.
(Despite current rumors that the FSF is trying to make it unusable.)

See!  Despite appearances, I do listen.  (My ears are in my mouth....)

    Jim> Well, I'm sorry to say that I think NT users are likely to
    Jim> see Unicode a lot sooner than those of us using Unix
    Jim> variants. I could be wrong, and "uterm" might be just around
    Jim> the corner, but.....

It's called "9term", actually, and it comes as a Debian Linux
package.  But don't ask me, ask Craig Oda <craig@example.com>; he
actually has it installed....

Steve

-- 
                            Stephen J. Turnbull
Institute of Policy and Planning Sciences                    Yaseppochi-Gumi
University of Tsukuba                      http://turnbull.sk.tsukuba.ac.jp/
Tel: +81 (298) 53-5091;  Fax: 55-3849              turnbull@example.com
Next TLUG meeting is Saturday October 11, 1997