
Re: [tlug] bash and grep and diff



Fredric Fredricson writes:

 > > *snort* Do you really think anybody doesn't know which is the
 > > "original" Linux?  C'mon.
 > As far as I know there are several "originals", even if Linus,
 > sort of, controls the "real original". I guess that Red Hat has
 > its own, for example. But that is totally beside the point, since
 > I do not advocate a central repository for open source projects.

It's not beside the point.  Linus is responsible for his version.  We
know what that is; he signs it.  Red Hat is responsible for the
versions it distributes, an authorized release engineer signs them.

 > > And all changes are tracked; they're just
 > > not necessarily all on the same storage volume.
 > So where exactly are they? It works fine for open source not
 > to know, perhaps, but not for commercial developers that have
 > to fix bugs in software written three years ago.

Why do you care where *all* the commits are?  Of course if you
distributed a version three years ago, you need to have a copy of that
repository.  You will then have a copy of the released version and all
ancestors (unless you've done something really foolish like use
shallow clones for official repos).
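
For example, digging up that three-year-old release in git is just a
matter of cloning and checking out the release tag (a sketch; the tag
name and server path are hypothetical):

    git clone company-server:/srv/repos/product.git
    cd product
    git log --oneline v2.3.0           # the release and all its ancestors
    git checkout -b bugfix-2.3 v2.3.0  # start the fix from that exact state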

What you don't necessarily have is copies of unreleased code from
experimental versions, or commits with logs like "system flakey,
committing before lunch", or with logs like "automatic commit for
autosave 2010-08-16 14:23".  But you can have those, too, if you want
them; it's a matter of policy, not a restriction of git.  OTOH, with a
centralized system, you are likely to end up with experimental changes
that never get recorded at all, and with coarser commits: whole
features committed at once rather than broken into more reviewable
changes.

Of course in a company it's possible to create and follow a policy of
making reviewable coherent changesets of reasonable size -- but it's
much easier to follow that practice with something like git, where you
don't need to worry about polluting the public repo with junk commits
(or embarrassing yourself by publishing stupid ideas) and you can
recreate a more presentable history ex post, then push policy-
conforming changesets to the central repository.  That's why OSS
projects took to DVCSes so quickly -- they're very productivity-
enhancing.  But managements worry a lot about keeping control, and are
less willing to introduce productivity-enhancing tools if they seem
like they might reduce control.
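
In git, that cleanup step might look like this (a sketch; the branch
name and commit count are illustrative):

    git rebase -i HEAD~12       # squash/reword the junk commits into
                                # coherent, reviewable changesets
    git push origin my-feature  # only the cleaned-up history goes public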

 > >   If you want to ensure that all changes are available to the
 > > company, you just do the same kind of thing that has always been
 > > done: workers use company controlled SANs, not local hard drives,
 > > for storage, only company-owned hardware is allowed to connect to
 > > the net, and maybe even no hardware that can connect to the repo
 > > is allowed to leave the premises.
 > I want all changes in one place, not scattered over a lot of
 > drives, company controlled or not. Managers and other programmers
 > should not have any problem finding the changes and descriptions.

You can have that, simply by using diskless workstations and insisting
that the programmers put their personal repos in a standard place on
the SAN.  What do you think github is, for example?
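
A minimal sketch of that arrangement (the SAN path is hypothetical):

    git init --bare /san/repos/$USER/project.git
    cd ~/work/project
    git remote add mine /san/repos/$USER/project.git
    git push mine --all   # every branch, junk commits and all, on the SAN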

 > >   >  I have used Perforce a lot and most of the things LT claims
 > >   >  to be next to impossible with all systems but git like
 > >   >  creating and merging branches, "diamond"-style merging (make
 > >   >  two branches, make changes, and merge these two branches to
 > >   >  a third branch, that later will be merged with main) is
 > >   >  easy and fast (Perforce helps you a lot).
 > >
 > > Could be.[1]  But I don't see what that has to do with commercial
 > > development needing a centralized repository.
 > I just point out that it does not take a distributed repository (or
 > git) to make merging easy. I definitely got the impression that LT
 > claimed this.

I don't know what Linus claimed, but what is needed for effective
merging is knowledge of history, especially cherry-picked changesets,
to avoid spurious conflicts.  The Darcs people claim that "patch
theory" can make improvements on that, but I've never seen a
demonstration.  "Reuse recorded resolution" functionality is also
helpful.  It turns out that the same feature (DAG orientation) that
makes efficient distributed VCS possible is also a good way to achieve
effective merging.  CVS and SVN don't have it; I don't know how
Perforce does merging.
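
("Reuse recorded resolution" is git's rerere, by the way.  A sketch of
the diamond merge described earlier, with made-up branch names:)

    git config rerere.enabled true  # remember how conflicts were resolved
    git checkout -b integration main
    git merge branch-a              # merge bases come out of the DAG
    git merge branch-b              # rerere replays earlier resolutions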

 > > It seems to me that commercial development requires identifying
 > > the official versions, but that can be determined by who signed
 > > the commit rather than where the repository is located.
 > And that person would be who? Staff change. That is, however, not
 > the main reason for a central repository; it's the history.

Huh?  In a company, the person has a role signature, of course, and
such signatures are recorded and authenticated by the company.
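
In git that might be a tag signed with a role key rather than a
personal one (the key ID here is hypothetical):

    git tag -u releng@example.com v2.3.0 -m "official 2.3.0 release"
    git verify-tag v2.3.0   # check it against the company's public key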

You need to realize that as soon as somebody fires up an editor and
inserts or deletes a character, you have history that is not in the
repository.  If you really want to preserve a lot of history, the best
way to do that is to make it easy to commit.  The big barrier to
committing in any multiperson project, and even many one-person
projects, is the bureaucratic hurdle of making the commit conform to
some policy.  With distributed VCS, that barrier becomes *much* lower,
because there's the company's repo (high standards) and *my* repo
(comfortable standards).
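
In *my* repo, a commit can be as cheap as this (a sketch):

    git commit -a -m "wip: test still flakey, committing before lunch"
    # nothing reaches the company's repo until I deliberately push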

 > >    Commercial development requires strong auditing features so
 > > you can track claims that code not owned by you is propagating
 > > through your system.  Bitkeeper has such features, and to some
 > > extent I expect they could be built on the "pickaxe" feature of
 > > git.  I suspect that what really is going on, though, is that
 > > owners of commercial code fear that it's too easy for their
 > > proprietary stuff to leak out if their developers are using
 > > DVCSes.
 > No, not really. What commercial developers fear the most (those
 > in the know) is that the code base becomes unmaintainable.

Enforcing a centralized repo has nothing to do with maintainability.
Good policy carefully followed and good programmers are what produce
maintainability.  There's no reason why it's easier to do that with an
enforced central repository.
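
(As for the "pickaxe" auditing mentioned above, it's a one-liner; the
search string here is made up:)

    git log -S'suspect_function' --all -p   # every commit, on any branch,
                                            # that added or removed it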

 > > Footnotes: [1] Of course, there are a lot of refugees from
 > > Perforce-land on the various DVCS user lists who say it was a
 > > pretty horrible experience.  Maybe you were just as lucky with
 > > Perforce as I've been with CVS? :-)
 > You should not rely on luck. That's why Perforce has atomic
 > submits of a set of files with transaction control. That alone
 > makes Perforce totally superior to cvs. I believe svn (and git)
 > have similar features.

All of the DVCSes are provably ACID.  And my point is not that I rely
on luck.  I don't.  I take advantage of it when things go my way,
though.  I never had a difficult merge in CVS, despite doing a lot of
things people said were crazy in CVS.  That's pure good luck, maybe,
but more likely it was systematic good luck -- i.e., the workflow and
particular tasks that were assigned to separate branches were well-
tuned to easy merging.  I'm suggesting that maybe the workflows of
those who say they had good results with commercial centralized
systems were similarly well-tuned.
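
(For the record, an atomic commit of a set of files in git; the file
names are made up:)

    git add src/parser.c src/parser.h tests/test_parser.c
    git commit -m "parser: reject empty input"   # all three, or nothing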



