Mailing List Archive



Re: [tlug] Kanji file names-- how to change encoding from euc-jp to utf-8



David Riggs wrote: 

> I can mount and access kanji named files from Japanese 
> WinXP floppies... . I see the 
> file names as euc-jp kanji, though the file content is in sjis.
> 
> Anyway, what I want to do is to have those files on my utf-8  ext2 file 
> system, with the kanji file names intact, but changed to utf-8 encoding. 
> I can change the data with recode, but it does not do file names, as far 
> as I can see.

> I have been bumbling around with recode and shell scripts but its really 
> over my head.

Mine too. Let's try to swim. 

First, just try copying the files over to your mounted ext2 filesystem. 
That may take care of converting the filenames to utf-8, 
though I am only guessing that it would. 
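
If the plain copy leaves the names in euc-jp, something along these lines might do the renaming afterwards. This is only a sketch under assumptions of mine: that iconv is installed and knows EUC-JP, that no file name contains a newline, and that all the files sit in one directory. The function name convert_names is made up for illustration.

```shell
#!/bin/sh
# Sketch: rename each entry in one directory from an EUC-JP name to a UTF-8 name.
# Assumes iconv is available; convert_names is a hypothetical helper name.
convert_names() {
    dir=$1
    for old in "$dir"/*; do
        [ -e "$old" ] || continue          # directory was empty, glob did not match
        base=${old##*/}                    # strip the leading directory part
        # Convert only the name; skip the file if iconv cannot convert it.
        new=$(printf '%s' "$base" | iconv -f EUC-JP -t UTF-8) || continue
        # Plain ASCII names come out unchanged, so leave those alone.
        [ "$base" = "$new" ] || mv -- "$old" "$dir/$new"
    done
}
```

You would call it as convert_names /copy/of/floppy. Try it on a copy first, since a name that is already utf-8 may either be rejected by iconv or come out garbled.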

Changing the content takes a little more work. First you need to 
identify which files need to be changed, then you need to make the change. 

Let's say that you have a list of filenames, one per line, 
of files that you need to convert from sjis to utf-8, and 
that the list is saved in a file called, 
let's say, sjisfiles. 

Then you write a shell script named sjis2utf8: 

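   #!/bin/sh
   # Read one filename per line from stdin and recode each file in place.
   while IFS= read -r filename; do
      # -youroptions is a placeholder for your own recode charset request
      recode -youroptions "$filename"
   done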

then execute it as: 

   sjis2utf8 <sjisfiles

or, without the sjisfiles file:

   ls *.txt | sjis2utf8

or

   find /copy/of/floppy -type f -name '*.txt' | sjis2utf8
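
If recode is not to hand, iconv could stand in for the step inside the loop. Again this is only a sketch, not a tested recipe for your floppies: the function name sjis_to_utf8 is made up, and I am assuming that iconv's SHIFT_JIS matches what WinXP wrote. Windows really uses its own sjis variant, so a few characters may need a different charset name.

```shell
#!/bin/sh
# Sketch: convert one file's content from Shift_JIS to UTF-8, in place.
# Assumes iconv and mktemp are available; sjis_to_utf8 is a hypothetical name.
sjis_to_utf8() {
    tmp=$(mktemp) || return 1
    if iconv -f SHIFT_JIS -t UTF-8 "$1" > "$tmp"; then
        cat "$tmp" > "$1"    # overwrite the content, keep the file's permissions
        rm -f "$tmp"
    else
        rm -f "$tmp"         # conversion failed: leave the original untouched
        return 1
    fi
}
```

Inside the while loop you would then call sjis_to_utf8 "$filename" in place of the recode line.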

> This is another one of those, "surely those TLUGer's can do it in their 
> sleep" kind of questions. 

It is, but I'm not that good at scripts or i18n. 
My bad code above will provoke those who know to say 
"No! you (*#$&* idiot. Everyone knows you do it ..."


