Re: [tlug] Coda



>>>>> "jb" == Jonathan Byrne <jq@example.com> writes:

    jb> Steve, your mention of Coda reminded me that Someday Real Soon
    jb> Now (TM) I want to try something like this with my Thinkpad
    jb> for keeping it in sync with my desktop machine.  Is that how
    jb> you're using it?  Is it pretty good?

It's pretty close to perfect for those of my needs it satisfies.

It doesn't keep you "in sync"; it's a caching distributed file system.
This means that (1) frequently accessed stuff will automatically be in
the cache (good), but (2) if it's something like your maildir, where
you might want to back up in a thread, you'll need to tweak cache
expiration priorities and explicitly hoard files in the cache
(possibly a PITA).

For true synchronization, something like Unison might be better, or
even a homebrew rsync script.  There's also Intermezzo, but Coda seems
to be a lot more mature than Intermezzo, and I don't really know what
Intermezzo is good at (compared to Coda).  Coda has a good mailing
list, and there's very little fanboying and lots of good information
about competing products, since Peter Braam was involved in all of AFS,
Coda, and Intermezzo, even though he has since moved on from Coda to
Intermezzo.  Heck, you'll even see the occasional post saying "for
your needs, Samba is the way to go".  :-)

What Coda does: when you open a file, it checks for a live server; if
it finds one, it checks for existence and version of the file in the
/coda filesystem.  Then it goes to the local cache and checks for a
local copy.  If found, and versions match, it uses it.  If no match,
it downloads it first.  If not found on the server, or server not
connected, it operates locally.  On close, it pushes any changes back
to the server.  Both transfers are straight copies, and the conflict resolution
semantics require that the open system call block until the copying is
complete.  (The close call doesn't block, but it will cause the
servers to refuse to serve the file to other clients until the copying
is done.)
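
The open-time decision described above can be sketched roughly as
follows.  This is only an illustration: the Server class, the
open_action function, and the use of plain integers as versions are
hypothetical names and simplifications, not Coda's actual client code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Server:
    """Hypothetical stand-in for a Coda server: maps path -> version."""
    reachable: bool
    versions: dict


def open_action(path: str, server: Optional[Server], cache: dict) -> str:
    """Return what the client does on open(), following the prose above.

    `cache` maps path -> version of the locally cached copy.
    Versions are plain integers here (an assumption for illustration).
    """
    if server is None or not server.reachable:
        return "use-local"        # server not connected: operate locally
    if path not in server.versions:
        return "use-local"        # not found on the server: operate locally
    if cache.get(path) == server.versions[path]:
        return "use-cache"        # local copy exists and versions match
    return "fetch-then-open"      # missing or stale: download first, open blocks
```

For example, open_action("notes", Server(True, {"notes": 2}), {"notes": 1})
returns "fetch-then-open", while the same call with server=None returns
"use-local".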

What's nice: reliable disconnected operation; reliable conflict
detection (no false negatives); reliable mirroring and automatic
failover across servers when you configure replicated volumes.  Works
and plays well with distributed authentication (Kerberos) (but all net
traffic is cleartext or trivially XOR-encrypted; adding encryption
shouldn't be hard, and is being requested a lot, but hasn't been done
yet).  Universal filesystem namespace organized into "cells" (sorta
like Internet domains or Kerberos realms), which you access by domain
name (eg, to look at the anonymous playpen for would-be Coda users,
you just access /coda/testserver.coda.cs.cmu.edu/, and the volumes
managed by that cell automagically appear as a Unix filesystem
mounted on that directory).  You can use DNS SRV records for service
discovery.

What's not nice: latency proportional to filesize on cache misses.
Files _must_ be smaller than total cache size.  Coda does not export
an existing file system, it creates its own---thus even on the server
you have to run the coda client to access the file system.  Both
server and client like to use a lot of RAM (not VM, RAM).  Conflict
resolution is very primitive: basically, you look at two versions of
the file and choose the one you like.  Not yet very scalable; the
local cache is more or less limited to about 1GB, the maximum file
size is 4GB (32-bit size_t), and there are restrictions on the size of
directories.
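
As a quick sanity check on the size limits above, here is a minimal
sketch.  The function name fits_in_cache and the constant are
hypothetical; the only inputs taken from the text are "smaller than
total cache size" and the 4GB (32-bit size_t) ceiling.

```python
MAX_FILE_BYTES = 2**32  # 4GB ceiling from a 32-bit size_t, per the text


def fits_in_cache(file_bytes: int, cache_bytes: int) -> bool:
    """True if a file could be cached under the two limits stated above."""
    # The file must be strictly smaller than the whole cache,
    # and its size must be representable in 32 bits.
    return file_bytes < cache_bytes and file_bytes < MAX_FILE_BYTES
```

With a 1GB cache (2**30 bytes), a 10MB file passes and a 1GB file does
not.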

There is no Mac client yet (needs a kernel module), but Apple has
recently shown interest (in the form of saying "if there seems to be
good press in it for us, we might throw in some programmer hours").

What this is good for: single-user, multiple machine r/w, strongly or
intermittently connected.  Multi-user disconnected operation with
shared read-often/write-rarely files.  Basically, push the probability
of simultaneous writes really low and Coda is way cool.

Use cases:  personal maildir good, big (> 10MB) mbox bad.  Single user
dev workspace good, multi-developer "whiteboard" bad.  Actually, for
files up to about 50kB, Coda over a 19.2kbps connection provided
satisfactory performance as long as I could multitask (eg, get coffee
or change diapers) while >10 kB copies were happening.  (I'm not
currently doing this because mostly I'm on the Mac at home, q.v.)
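
To put numbers on "latency proportional to filesize", here is a
back-of-the-envelope sketch.  It assumes "19.2" means 19.2 kbit/s and
that 1 kB is 1000 bytes, and it ignores protocol overhead, so the
results are lower bounds on the wait.  The function name is
hypothetical.

```python
def transfer_seconds(size_bytes: int, link_bits_per_second: float) -> float:
    """Idealized time to copy a file over a link, ignoring all overhead."""
    return size_bytes * 8 / link_bits_per_second


# A 50 kB file over an assumed 19.2 kbit/s link: about 21 seconds.
fifty_kb = transfer_seconds(50_000, 19_200)
# A 10 kB file over the same link: about 4 seconds.
ten_kb = transfer_seconds(10_000, 19_200)
```

Since every cache miss copies the whole file before open returns, these
waits apply even when only a few bytes of the file are actually read.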

In practice, I no longer use NFS or Samba at all, but my use cases
(maildir MUA, single developer workspaces) fit Coda to a "t".  My
former roadwarrior didn't have that much space, so I would hoard 200MB
of stuff that I might conceivably use, and this hoard evolved
gracefully over time.  Especially for a personal library of executable
scripts, /coda/<realm>/bin is a good thing to have.  (Binaries can be
problematic unless your hardware, kernel, and libc versions are quite
similar and stay that way.)  Ditto devo workspaces.  No more
forgetting that I tweaked something on one box or the other, etc.

-- 
Institute of Policy and Planning Sciences     http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.