Mailing List Archive


Re: tlug: List of meta's for different languages



>>>>> "Craig" == Craig Toshio Oda <craigoda@example.com> writes:

    Craig> From: Darren Cook <darren@example.com>

    darren> Has anyone seen a list of what the meta tags should be for
    darren> each language (Japanese, Chinese, etc)? Actually more than
    darren> a standard, I'm looking for a guide for what to use so
    darren> that it will be recognized by both Explorer and Netscape,
    darren> and both versions 3 and 4.

    Craig> Use of character sets is specified in the HTML 4.0
    Craig> specification.  The section you want is
    Craig> http://www.w3.org/TR/REC-html40/charset.html#encodings

This is exactly what he doesn't want, I think.  He wants to know what
will work with real implementations, as opposed to conformant
implementations.  Netscape and MSIE have historically been extremely
keen to avoid conformance....

(1) Nonetheless, I strongly recommend that you choose from the IANA
list.  (Debian users can get a fairly recent IANA list from the
doc-rfc package.  doc-rfc contains the most useful RFCs as well, such
as RFC-MIME, RFC-821 (SMTP), RFC-822 (headers), RFC-1123 (messaging
application clarifications for 821, 822, and 1036), etc.)

Furthermore, pick ones that are real standards (ISO, ECMA, GB, CNS,
JIS, etc.) rather than "industry standards" such as Shift-JIS and
Windows-1252 and IBM-437.  The only exception to this last is Big5 for
Chinese.  CNS isn't really a good alternative AFAIK :-(.

(2) If you want both version 3 and version 4, you may be hosed.  It's
not possible AFAIK to augment the sets that MSIE and Netscrap
recognize (unlike w3.el and Arena, I think).  So you have to find the
ones in the intersection.  You can be sure (despite recommendation 1)
that both Netscape and MSIE will handle the principal Microsoft
abominations, so converting docs to Shift-JIS and Big5 will probably
work.  However, many other apps will not recognize the hyphenated
shift-jis tag (for one thing, unlike pretty much everything else, the
official IANA tag for Shift-JIS is shift_jis, with an underscore).  ISO-2022
conforming sets with registered final bytes should be pretty safe;
that means that the EUC variants (charset=euc-jp, for example) are
possibly a good choice.  However, early patchlevels of Netscrap 3 may
have only recognized x-euc-jp, not euc-jp.  (I think Jim Breen tested
several versions of Netscrap and MSIE and all recognized euc-jp,
though.)  euc-kr for Korean should be safe, but I don't recall the
right tags for Chinese (there being two conflicting standards, CNS and 
GB).
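
To make the tags above concrete, here is a minimal sketch in Python of
declaring a charset in a page and encoding the page to match.  The
META form is the one the HTML 4.0 charset section describes; the
function names and the language keys in the table are invented for
illustration, and the table lists only the labels mentioned in this
message.  Nothing here has been checked against any particular
browser version.

```python
# Charset labels mentioned in this message.  The dictionary keys are
# illustrative names, not any standard.
CHARSETS = {
    "japanese": "euc-jp",
    "japanese-ms": "shift_jis",      # IANA spelling: underscore, not hyphen
    "korean": "euc-kr",
    "chinese-traditional": "big5",
}


def meta_tag(charset):
    """Return an HTML 4 style META declaration for a charset label."""
    return (
        '<meta http-equiv="Content-Type" '
        'content="text/html; charset=%s">' % charset
    )


def encode_page(body, charset):
    """Wrap body in a minimal page declaring charset, then encode it.

    Assumes Python's codec registry knows the label; all four labels
    in CHARSETS are accepted by the standard library codecs.
    """
    page = "<html><head>%s</head><body>%s</body></html>" % (
        meta_tag(charset),
        body,
    )
    return page.encode(charset)
```

For example, encode_page(text, CHARSETS["japanese"]) yields bytes in
EUC-JP whose header says charset=euc-jp, so the declaration and the
actual encoding cannot drift apart.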

-- 
University of Tsukuba                Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
Institute of Policy and Planning Sciences        Tel/fax: +1 (298) 53-5091
---------------------------------------------------------------
Next Meeting: 10 October, 12:30 Tokyo Station Yaesu central gate
Next Nomikai: 20 November, 19:30  Tengu TokyoEkiMae 03-3275-3691
---------------------------------------------------------------
Sponsor: PHT, makers of TurboLinux http://www.pht.co.jp

