
Mailing List Archive
Re: [tlug] Reading kanji file name from Mac OS X
Thanks for the suggestions, Stephen. I can sometimes read some of the
file names in emacs, but it's a real hassle.
It seems that using WinXP is indeed the best way to handle this: the
Mac OS kanji file names are a mess.
Just for the record, here is my little note to myself about how to bring
over a dir tree of Mac files that have kanji, spaces, and upper case
ascii in the file names. It translates the names from sjis to utf-8,
converts spaces to ".", and lowercases ascii. And it converts the
content too!
Copy to usb in WinXP, mount it in Linux with -o iocharset=sjis, and then
do the following to convert to utf-8 and no-blank file names.
Go to the top level dir that contains the files from Mac. To clean up
Mac _directory_ names of blanks, converting the blank to a "." run
repeatedly until no errors are reported:
find * -type d -print0 | xargs -0 rename 's/ /./g'
(Several runs may be needed because once a higher-level folder with a
blank in its name has been renamed, the paths already listed for the
folders inside it no longer exist. To keep file names intact despite
the blanks, -print0 directs find to terminate each name with a null
character rather than a newline, and xargs -0 splits on nulls instead
of on whitespace.)
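The repeated runs can probably be avoided by renaming bottom-up, so
that everything inside a folder is handled before the folder itself.
What follows is only a sketch of that idea, not the procedure from my
notes: "fix_blanks" is a made-up helper name, and it uses plain mv
rather than rename. It only changes the last path component of each
entry, and it does not check whether the dotted name already exists.

```shell
# Hypothetical helper: replace blanks with "." in every file and
# directory name below the directory given as $1.
fix_blanks() {
  # -depth lists the contents of a directory before the directory
  # itself, so a parent keeps its old name until its children are done.
  find "$1" -depth -name '* *' -exec sh -c '
    for path do
      dir=$(dirname "$path")
      base=$(basename "$path")
      # Only the last path component is changed.
      new=$(printf "%s" "$base" | tr " " ".")
      mv -- "$path" "$dir/$new"
    done' sh {} +
}
```

Run from the top level dir as: fix_blanks .
Because files are matched too, this would cover both the directory
pass and the file pass in a single run.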
Then run for _files_ (should not be any more errors):
find * -type f -print0 | xargs -0 rename 's/ /./g'
Then convert kanji file names to utf-8, and also lowercase ascii. If
there are garbage or non-sjis kanji names, "convmv" will report them and
stop, and you may have to correct the non-sjis names manually, then
rerun. "convmv" does its own recursion into directories.
convmv -r -f sjis -t utf-8 --notest --lower *
Then convert the data, both line breaks and encoding (there should be
no blanks in names, but just in case use -print0):
find * -type f -print0 |xargs -0 recode -f sjis/cl..utf-8/
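If recode is not installed, the same content conversion can probably be
approximated with iconv and tr. This is a sketch under assumptions, not
what I actually ran: "sjis_to_utf8" is a made-up helper name, the local
iconv is assumed to know the charset name SHIFT_JIS, and the files are
assumed to use CR-LF line breaks (a file with bare CR line breaks would
need tr '\r' '\n' instead).

```shell
# Hypothetical helper: convert one file, given as $1, from sjis to
# utf-8 in place and drop the carriage returns.
sjis_to_utf8() {
  tmp=$(mktemp) || return 1
  # Convert into a temporary file first, so the original is left
  # untouched if iconv rejects the input.
  if iconv -f SHIFT_JIS -t UTF-8 "$1" > "$tmp"; then
    tr -d '\r' < "$tmp" > "$1"
    status=$?
  else
    status=1
  fi
  rm -f -- "$tmp"
  return "$status"
}
```

Unlike recode, this takes a single file, so it would be run once per
file, for example from a loop over the output of find.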
Hope it is useful to someone,
David Riggs
Kyoto