Re: [tlug] Kanji file names-- how to change encoding, Mac OSX/Darwin file names



>>>>> "Alain" == Alain Hoang <hoanga@example.com> writes:

    Alain> David Riggs writes:

    >> My next question was going to be how to do this for Darwin
    >> i.e. Mac OS X file names, which I cannot read no matter what I
    >> do. Convmv manual seems to say that Apple has it so screwed up
    >> that it is not possible to read much less convert file names.

Eh?  I read and write files with Japanese names all the time from
XEmacs on Mac OS X.

One problem I have encountered with Apple is that at some level the OS
enforces UTF-8 file names.  But it's a "Foot, meet Bullet" problem: in
the XEmacs test suite, we try to write file names in various
encodings, and (e.g.) ISO-8859-2 fails.  In real life, I've never wanted
to write a non-UTF-8 file name on Mac OS X.
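
A minimal Python sketch of why such a name is a problem, assuming the
OS layer simply insists that file names be valid UTF-8 byte sequences
(the exact check Mac OS X performs is not shown here, and the file name
and the helper is_valid_utf8 are made up for illustration):

```python
# Sketch: a name encoded as ISO-8859-2 is usually not valid UTF-8,
# so a layer that insists on UTF-8 names has grounds to reject it.

def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True

name = "\u0141\u00f3d\u017a.txt"         # hypothetical file name "Łódź.txt"
latin2_bytes = name.encode("iso-8859-2")  # what a Latin-2 program would write
utf8_bytes = name.encode("utf-8")         # what a UTF-8 program would write

print(is_valid_utf8(latin2_bytes))        # False
print(is_valid_utf8(utf8_bytes))          # True
```

Re-encoding the name to UTF-8 before handing it to the OS avoids the
failure, which is why this rarely matters outside a test suite.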

Can you say more about the problems you're facing?  If worst comes to
worst you can just send the files as MIME attachments in mail
messages, for example.
	
    Alain> These UTF-8 normalization forms and their interactions when
    Alain> actually trying to deal with them are currently something
    Alain> that looks like some black magic

It's basically trivial.  In German, you can write ü as a single
precomposed character (U+00FC) or as u followed by a combining
diaeresis (U+0308); the two sequences are canonically equivalent.  The
normalization forms simply dictate maximally composed (NFC) and
maximally decomposed (NFD) forms, with rules (such as a canonical
ordering of combining marks) for handling cases where there are
multiple extrema.  Conformant software is supposed to handle both forms.
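
To make the composed and decomposed forms concrete, here is a small
sketch using Python's standard unicodedata module.  The choice of ü
and of a Vietnamese ệ as examples, and the helper name same_text, are
mine and only illustrative:

```python
import unicodedata

composed = "\u00fc"       # ü as one precomposed code point
decomposed = "u\u0308"    # u followed by COMBINING DIAERESIS

nfc = unicodedata.normalize("NFC", decomposed)   # compose as far as possible
nfd = unicodedata.normalize("NFD", composed)     # decompose as far as possible

print(composed == decomposed)                    # False
print(nfc == composed, nfd == decomposed)        # True True

# Two marks entered in different orders normalize to the same result,
# because normalization first puts combining marks in canonical order.
a = unicodedata.normalize("NFC", "e\u0323\u0302")  # dot below, then circumflex
b = unicodedata.normalize("NFC", "e\u0302\u0323")  # circumflex, then dot below
print(a == b == "\u1ec7")                        # True

def same_text(x: str, y: str) -> bool:
    """Compare two strings after bringing both to NFC."""
    return unicodedata.normalize("NFC", x) == unicodedata.normalize("NFC", y)

print(same_text(composed, decomposed))           # True
```

Software that compares or looks up strings can normalize both sides
this way so that either form is accepted.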

    Alain> The subtle differences of NFD and NFC manifested itself
    Alain> when I was trying to write some text files using Vietnamese
    Alain> in OS X then moved them over to a FreeBSD machine and
    Alain> noticed the accent marks weren't attached.  *sigh*

By "not attached" do you mean "not displayed as composed"?  The
necessary information to fix that is in the large UnicodeData.txt
table, which tells you which characters are composed from others.  If
you mean "lost", then you have seriously non-conforming software
somewhere in the pipeline.
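
The two cases can be told apart mechanically.  The following is only a
sketch: it assumes the text arrived in decomposed (NFD) form, and the
sample string and the helper names combining_marks and recompose are
hypothetical.  Python's unicodedata module carries the composition
data, so no table lookup by hand is needed:

```python
import unicodedata

def combining_marks(text: str) -> list:
    """Return the combining characters still present in text."""
    return [c for c in text if unicodedata.combining(c)]

def recompose(text: str) -> str:
    """Compose base letters and marks back into precomposed characters."""
    return unicodedata.normalize("NFC", text)

original = "Ti\u1ebfng Vi\u1ec7t"                    # "Tiếng Việt", precomposed
decomposed = unicodedata.normalize("NFD", original)  # stand-in for the moved file

# Case 1: marks are present but separate.  Recomposition restores the text.
print(len(combining_marks(decomposed)))              # 4
print(recompose(decomposed) == original)             # True

# Case 2: marks were dropped somewhere.  Nothing can bring them back.
stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
print(stripped)                                      # Tieng Viet
print(recompose(stripped) == original)               # False
```

If combining_marks() finds marks in the received text, the data is
intact and the problem is display or normalization; if it finds none
and the accents are gone, something upstream discarded them.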

I haven't tried this, but I should think Pango handles the composition
internally.

-- 
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.

