Mailing List Archive

Re: "My Kanpo" open law project



JB>> > What's the "[badchar]" problem? PDF funnies?
>> 
FB>> Ah.  The PDF is CID encoded.  I hacked in EUC mappings to cope with the
FB>> special vertical-text characters (I noticed that that came up in a recent
FB>> GhostScript-related post as well) used by Adobe.  But CID also offers a lot of
FB>> rare glyphs that don't have direct EUC-JP mappings.  

If they are kanji etc., Ken Lunde can probably tell you the mapping into
JIS X 0212. You can graft in an image for them, but that makes the result
less useful as general text.

FB>> That, and some other
FB>> characters just seem to get hosed by xpdf's mapping algorithm.  The converter
FB>> calls iconv to attempt a conversion to Unicode and back after the text stream
FB>> has been more or less tidied up; each character that causes iconv to whinge
FB>> gets clobbered and is replaced with that string. 
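
The round trip described above can be sketched roughly as follows. This is
only an illustration, not the converter's actual code: it uses Python's
built-in "euc_jp" codec as a stand-in for iconv, and the names BADCHAR,
_badchar and tidy_euc are made up for the example.

```python
import codecs

# Marker substituted for anything that fails to convert (name is hypothetical).
BADCHAR = "[badchar]"


def _badchar(err):
    """Codec error handler: emit the marker and resume after the bad input."""
    if isinstance(err, (UnicodeDecodeError, UnicodeEncodeError)):
        return (BADCHAR, err.end)
    raise err


# Register the handler so it can be named in decode()/encode() calls.
codecs.register_error("badchar", _badchar)


def tidy_euc(raw: bytes) -> bytes:
    """Convert EUC-JP bytes to Unicode and back, marking what does not survive."""
    text = raw.decode("euc_jp", errors="badchar")   # EUC-JP -> Unicode
    return text.encode("euc_jp", errors="badchar")  # Unicode -> EUC-JP


if __name__ == "__main__":
    # The 0xFF byte is not valid EUC-JP, so it is replaced by the marker.
    print(tidy_euc(b"ab\xffcd"))
```

Whether a given byte sequence is rejected depends on the conversion library
in use, so the exact set of clobbered characters will differ between this
sketch and an iconv-based converter.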

I do a Unicode->EUC conversion in WWWJDIC to handle text sent in from
IE5 JavaScript snippets. I looked at using iconv, but it's so bloody
locale- and implementation-specific, and since my WWWJDIC mirrors run on
AIX/Solaris/FreeBSD/multiple-Linices the safe solution was to do my own
conversion.
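
A table-driven conversion of this kind might look roughly like the sketch
below. It is not the WWWJDIC code: the table UNICODE_TO_JIS0208 is a
hypothetical four-entry excerpt (a real table has several thousand entries),
and the function name and the "?" fallback are assumptions. The one fixed
rule it relies on is that a JIS X 0208 code becomes EUC-JP by setting the
high bit of each of its two bytes.

```python
# Hypothetical excerpt of a Unicode -> JIS X 0208 table.
UNICODE_TO_JIS0208 = {
    0x6771: 0x456C,  # 東
    0x4EAC: 0x357E,  # 京
    0x30B8: 0x2538,  # ジ
    0x30E0: 0x2560,  # ム
}


def unicode_to_euc(text: str, fallback: bytes = b"?") -> bytes:
    """Convert text to EUC-JP using only the lookup table above."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            # ASCII passes through unchanged.
            out.append(cp)
            continue
        jis = UNICODE_TO_JIS0208.get(cp)
        if jis is None:
            # No mapping in the table: emit the fallback bytes instead.
            out += fallback
            continue
        # Set the high bit on each JIS byte to get the EUC-JP form.
        out.append((jis >> 8) | 0x80)
        out.append((jis & 0xFF) | 0x80)
    return bytes(out)


if __name__ == "__main__":
    print(unicode_to_euc("東京").hex())
```

Because nothing here depends on the platform's locale or iconv build, the
output is the same on every host, which is the property the do-it-yourself
approach is after.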

Jim
-- 
Jim Breen  [jwb@example.com  http://www.csse.monash.edu.au/~jwb/]
Visiting Professor, Institute for the Study of Languages and Cultures of 
Asia and Africa, Tokyo University of Foreign Studies, Japan
+81 3 5974 3880         [ジム・ブリーン@東京外大]

