Mailing List Archive



Re: [tlug] CVS and Japanese files



On Sat, Jun 22, 2002 at 10:44:12AM +0900,
Ryan Shaw wrote:
> Hello,
> 
> Have any of you who have used CVS ever had problems
> with it corrupting Japanese files? I have been using
> CVS for a long time with Japanese HTML/XML files and
> resource bundles, and have never had problems with
> corruption.
> 
> However, my new co-worker says that CVS always corrupts
> Japanese files, and demonstrated it with an SJIS resource
> file containing half-width katakana.

Lots of problems like that are discussed here:

http://www-vox.dj.kit.ac.jp/nishi/cvs/ml-log/threads.html

Basically, the simplest solution is to write everything in EUC or UTF-8. If
you absolutely have to write something in SJIS, you have to have CVS invoke
filters that convert SJIS files to EUC on their way in, and back to SJIS on
their way out. Oh, and Japanese filenames--not so common in web development,
I hope, but prevalent in Windows environments--tend to confuse CVS unless
you specifically patch it.
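
As a rough illustration of what such a pair of filters does, here is a minimal
Python sketch of the two conversions. The function names are made up for this
example, it relies on Python's built-in "shift_jis" and "euc_jp" codecs, and
it does not show how the filters get hooked into CVS.

```python
# Sketch only: byte-level conversion between Shift_JIS and EUC-JP.
# sjis_to_euc / euc_to_sjis are hypothetical names, not part of CVS.

def sjis_to_euc(data: bytes) -> bytes:
    """Convert Shift_JIS bytes to EUC-JP (the "way in" direction)."""
    return data.decode("shift_jis").encode("euc_jp")


def euc_to_sjis(data: bytes) -> bytes:
    """Convert EUC-JP bytes back to Shift_JIS (the "way out" direction)."""
    return data.decode("euc_jp").encode("shift_jis")


# Half-width katakana "te su to", then ": ", then the kanji for "nihongo".
sample = "\uff83\uff7d\uff84: \u65e5\u672c\u8a9e".encode("shift_jis")

stored = sjis_to_euc(sample)      # what would sit in the repository
restored = euc_to_sjis(stored)    # what a checkout would hand back

print(restored == sample)
```

Both directions fail loudly (UnicodeDecodeError) on bytes that are not valid
in the declared encoding, which is preferable to silently storing garbage.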

Hankaku also causes a problem if you are auto-converting: a string consisting
of EUC hankaku characters is also a valid SJIS byte sequence, so it may be
auto-detected as SJIS and come out as garbage. This will bite you if you have
hankaku in your source code.
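
To see why detection is ambiguous, here is a small Python sketch (again using
the built-in "euc_jp" and "shift_jis" codecs, with an example string chosen
for illustration). It encodes three hankaku characters as EUC and then decodes
the same bytes under both encodings.

```python
# Sketch only: the same bytes are valid in both EUC-JP and Shift_JIS.

text = "\uff71\uff72\uff73"            # half-width katakana a, i, u
euc_bytes = text.encode("euc_jp")      # each becomes 0x8E plus one byte

as_euc = euc_bytes.decode("euc_jp")      # the intended reading
as_sjis = euc_bytes.decode("shift_jis")  # also decodes without an error

print(euc_bytes.hex(" "))   # 8e b1 8e b2 8e b3
print(as_euc == text)       # True
print(as_sjis == text)      # False
print(ascii(as_sjis))       # escaped form of the misread characters
```

Since neither decode raises an error, a detector looking only at the bytes
has to guess which reading is meant; each 0x8E-prefixed pair reads as a
single double-byte character under SJIS.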

I'm not sure what a good solution is to this problem; personally, I think you
should just ban hankaku, but I hate hankaku with a passion (doing battle right
now with this young developer working with me who thinks hankaku is the best
thing since 2ch.net)....

-- 
Shimpei Yamashita                               http://www.shimpei.org/
You can't have everything. Where would you put it?    -- Steve Wright

