tlug: Character Encodings Again



>>>>> "Matt" == Matt Gushee <matt@example.com> writes:

    Matt> Speaking with some trepidation since this may be considered
    Matt> an inappropriate question under the new regime (if so, feel
    Matt> free to respond off-list):

Looks technical and Japanese, and nsgmls is open source, so I think
you're home free.

    Matt> In case you're wondering what all this is about: I'm trying
    Matt> to write an SGML declaration that will allow the use of
    Matt> kanji in markup (e.g., instead of <par></par>, you could
    Matt> have <段落></段落>, and so on). The (amended) SGML
    Matt> standard definitely allows this, and according to my limited
    Matt> understanding of the docs, nsgmls should support such
    Matt> documents, but I haven't been able to make it work.

I'm not sure whether content characters and markup characters are kept
separate in the software; you may need to recompile nsgmls to handle
16-bit character sets.  I believe that sgmls (note, no leading 'n')
worked with Japanese in linuxdoc-sgml-ja by pretending that the EUC
bytes were ISO-8859 characters, yuck.
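
To make that trick concrete, here is a minimal sketch in Python (my choice
for illustration; nothing in sgmls or linuxdoc-sgml-ja is being quoted, and
the variable names are made up). It assumes "EUC" means EUC-JP and uses
ISO-8859-1 as the 8-bit stand-in. Each kanji becomes two high-bit bytes,
which an 8-bit tool sees as two unrelated Latin-1 characters:

```python
# Sketch only: shows why an 8-bit parser can pass EUC-JP through untouched.
text = "段落"                             # the element name from the example

euc = text.encode("euc_jp")               # two bytes per kanji
as_latin1 = euc.decode("iso-8859-1")      # what an 8-bit tool "sees"

print(len(text), len(euc), len(as_latin1))

# Every EUC-JP byte of a JIS X 0208 kanji has the high bit set, so none of
# them collides with ASCII markup delimiters such as '<' or '>'.
high_bit_only = all(b >= 0xA1 for b in euc)

# The disguise is lossless: re-encoding as Latin-1 recovers the EUC bytes.
round_trip = as_latin1.encode("iso-8859-1").decode("euc_jp")
```

The parser never knows it handled kanji, which is why the approach works for
content but says nothing about what counts as a name character in markup.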

    Matt> 'japan.sgmldecl', in its original form, provides for
    Matt> Japanese characters only in content. So I'm trying to modify
    Matt> it, proceeding on the obvious assumption that I should just
    Matt> use the same numbers for the markup characters as for the
    Matt> content characters ... but it's not working.

I'm not sure you're allowed to use any character sets in markup except
ASCII, ISO-8859-1, and Unicode (aka ISO-10646/UCS-2).  Have you
looked at the standard with respect to that?  You almost certainly
aren't allowed to use control characters in markup, so ISO-2022-style
JIS, which switches character sets with escape sequences, is out.
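
A quick way to see the control-character problem, again as a hedged Python
sketch rather than anything taken from nsgmls (the helper control_bytes is
hypothetical, and Python's "iso2022_jp" and "euc_jp" codecs stand in for
"ISO-2022-style JIS" and EUC): encode the same element name both ways and
look for bytes in the control range.

```python
# Sketch only: compare control bytes in ISO-2022-JP and EUC-JP encodings.
ESC = 0x1B


def control_bytes(data: bytes) -> list[int]:
    """Return the C0 control bytes (and DEL) found in data."""
    return [b for b in data if b < 0x20 or b == 0x7F]


name = "段落"
jis = name.encode("iso2022_jp")   # switches character sets with ESC sequences
euc = name.encode("euc_jp")       # high-bit bytes only, no shifting

jis_controls = control_bytes(jis)
euc_controls = control_bytes(euc)

print("ISO-2022-JP contains ESC:", ESC in jis_controls)
print("EUC-JP control bytes:", euc_controls)
```

Whether EUC bytes are then acceptable as name characters is still up to the
SGML declaration and the parser build; this only shows why the escape-based
encoding is ruled out first.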

-- 
University of Tsukuba                Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
Institute of Policy and Planning Sciences        Telfax: +81 (298) 53-5091
__________________________________________________________________________
__________________________________________________________________________
What are those two straight lines for?  "Free software rules."
----------------------------------------------------------------
Next Nomikai: 20 November, 19:30   Tengu TokyoEkiMae 03-3275-3691
Next Technical Meeting: 12 December, 12:30 HSBC Securities Office
----------------------------------------------------------------
more info: http://tlug.linux.or.jp Sponsors: PHT, HSBC Securities