[tlug] migrating Japanese filenames from old to new samba server



I apologize that I have contributed little to TLUG, in spite of benefiting so much. Thank you for the community and the knowledge.

I just got Samba running on a new RHEL5 file server. To test it, I connected from WinXP (JA version), saved an Excel file with a Japanese filename to the share, and then checked the file through GNOME and from the terminal on my Mac. I also tried mounting the share on Mac OS X. To my delight, not only can I read the filename, I can also refer to the file by name on the command line and run chown on it. Nirvana!

However, the rsnapshot backup server also runs RHEL5, and the Japanese-named files backed up from the old file server (RHEL3) are still unreadable there and cannot be changed from the command line. I therefore assume that the filenames on the old file server will have to be converted to UTF-8 before I share the files from the new file server.

The old RHEL3 file server specifies
dos charset = CP932
unix charset = CP932
in smb.conf, but on the new RHEL5 server the default 'dos charset' seems to be CP850:
[root@example.com ~]# testparm -v | grep "dos charset"
...
        dos charset = CP850

- 1 - I am sure someone has experience with this situation. What would be the best way to bring the files on the old file server over to the new machine so that I can manipulate the Japanese-named files from the command line? My initial plan was to rsync them over with ownership, groups, etc. intact. I have already prepared lists of usernames and directories to get it done.
- 2 - Should I change the 'dos charset' and/or the 'unix charset' in the smb.conf of the new RHEL5 file server? CP850 covers Western characters only, and even Microsoft claims that it no longer supports it. On the other hand, the new server is working really well, so obviously there is something I don't understand. If the 'dos charset' is CP850, shouldn't the server be sending garbled filenames to the Mac OS X and WinXP (JA) clients? The filesystem is Unicode, right? And Samba 3 can speak Unicode, so shouldn't I have to set the 'dos charset' to UTF-8, too?
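
To make question 2 more concrete, here is a minimal Python sketch. Python is used purely for illustration and is not part of the setup described above, and the filename is made up. It shows the same name as CP932 bytes, as UTF-8 bytes, and as UTF-16LE, and it checks that CP850 cannot represent the name at all. If the Samba 3 documentation is read correctly here, Unicode-capable clients such as WinXP are served in Unicode and 'dos charset' matters only for older clients, which would be consistent with the new server working despite the CP850 default. That reading is an assumption, not something verified on these machines.

```python
# Illustrative only: a made-up filename, not one taken from either server.
name = "日本語.xls"

cp932_bytes = name.encode("cp932")      # bytes on disk if 'unix charset' is CP932
utf8_bytes = name.encode("utf-8")       # bytes on disk if 'unix charset' is UTF-8
utf16_bytes = name.encode("utf-16-le")  # a Unicode wire form (assumed for newer clients)


def can_encode(text, charset):
    """Return True if every character of text exists in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def can_decode(raw, charset):
    """Return True if raw is a valid byte sequence in charset."""
    try:
        raw.decode(charset)
        return True
    except UnicodeDecodeError:
        return False


print("CP932 bytes:", cp932_bytes.hex())
print("UTF-8 bytes:", utf8_bytes.hex())
print("UTF-16LE bytes:", utf16_bytes.hex())
print("representable in CP850:", can_encode(name, "cp850"))
print("old CP932 bytes valid as UTF-8:", can_decode(cp932_bytes, "utf-8"))
```

The last check is the interesting one for the backup server: a name stored as CP932 bytes is not valid UTF-8, which would explain why a UTF-8 system cannot display or match it.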

I am sure this must have come up before, but a search for "Japanese samba" in the TLUG archives returned 0 records, and the chapter on Japanese charsets in the official Samba HOWTO is difficult enough that someone at my level cannot draw a clear conclusion from it:
http://us3.samba.org/samba/docs/man/Samba-HOWTO-Collection/unicode.html#id2670220

The Samba HOWTO mentions software (convmv) for converting the filenames in entire directories to a different charset, but the software's site lists many caveats:
http://j3e.de/linux/convmv/man/
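
For reference, the kind of conversion such a tool performs can be sketched in a few lines of Python. This is not convmv and does not reproduce its options or its safety checks. The function names plan_renames and apply_renames are hypothetical, and the sketch assumes that every name under the tree that is not already valid UTF-8 is CP932. It builds a list of proposed renames first, so the list can be reviewed before anything is touched.

```python
import os


def plan_renames(root, src="cp932", dst="utf-8"):
    """Return (old_path, new_path) byte-string pairs for names that would change.

    Hypothetical helper: nothing is renamed here, the plan is only collected.
    """
    plan = []
    # A bytes root makes os.walk return raw byte names instead of decoded text.
    # topdown=False lists the contents of a directory before the directory itself.
    for dirpath, dirnames, filenames in os.walk(os.fsencode(root), topdown=False):
        for raw in filenames + dirnames:
            try:
                raw.decode(dst)
                continue  # already valid in the target charset (plain ASCII included)
            except UnicodeDecodeError:
                pass
            try:
                converted = raw.decode(src).encode(dst)
            except UnicodeDecodeError:
                continue  # not valid in the source charset either; inspect by hand
            plan.append((os.path.join(dirpath, raw), os.path.join(dirpath, converted)))
    return plan


def apply_renames(plan):
    """Apply a reviewed plan in order, never overwriting an existing name."""
    for old, new in plan:
        if os.path.exists(new):
            continue  # name collision: skip rather than clobber
        os.rename(old, new)
```

A cautious way to use it would be to run plan_renames on a copy of the data, print the pairs, and call apply_renames only once the list looks right. A rename changes only the directory entry, so ownership and permissions stay as they were. One known weakness: a CP932 name whose bytes happen to form valid UTF-8 would be skipped.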

I tried to find the answer by Googling, but if I have missed the solution or the appropriate resource, please point me in the right direction.

Yoroshiku onegaishimasu.

--
Micheal Cooper
Miyazaki, Japan (GMT+9, no DST)
