How to cope with patches sanely

Manoj Srivastava srivasta at debian.org
Fri Feb 29 04:10:07 UTC 2008


On Fri, 29 Feb 2008 16:11:48 +1300, Sam Vilain <sam at vilain.net> said: 

> Manoj Srivastava wrote:
>>> Yes.  Feature branches are effectively forking a particular version
>>> of a project - this is not a problem, and is essential for efficient
>>> development.  People jumbling together changes in "trunk" branches
>>> is perhaps one of the worst upshots of the 2002-2006 or so obsession
>>> with poorly designed centralised systems and in my opinion sank many
>>> projects.
>> 
>> Err. If you go back and read this thread in the archive, you'll note
>> that I have stated that my feature branches are always kept up to
>> date with the latest upstream branch I am basing my Debian package
>> on.

> This technique is also called rebasing the patch set; it's fine, but
> it's just one approach.

        Actually, that is not it. I am not rebasing -- I am doing
 repeated merges. Arch does not rebase -- it just applies  the upstream
 delta, with full history. This allows me to replay into the integration
 branch at will.

>> When I have been creating patches for inclusion with upstream, I
>> essentially feed them the source patch and a changelog entry --
>> essentially, creating a single patch series; squashing the underlying
>> history.  Most upstream do not care about the messy history of my
>> development; and most do not grok arch well enough to pull directly.

> This is sometimes worthwhile and sometimes a bad idea.  The driving
> motive, if you want to aim for patches to be easily reviewed, is that
> each patch should introduce a single change, which is well explained.
> I agree that the upstream will not want a messy history; which is why
> you reshape the individual changes using a tool such as Quilt, Stacked
> Git, Guilt, Mercurial Queues, etc, so that they are more easily
> reviewed.

        I can do this by cherry picking the changes from my topic
 branch, and feeding them separately. Emacs and diff mode make it easy
 to split off chunks if I want to do it after the fact from the squashed
 diff; or I can regenerate changesets and cherry pick the series.

        And no, I can do this using plain old arch, and I don't really
 have to change my SCM.


>>> They mean that a later merge back the other way, to merge the
>>> feature branch into the target branch, can happen painlessly.
>>> ASSUMING that you're using a system which has commutative merge
>>> characteristics, such as git or mercurial.
>> 
>> I use Arch.

> Arch is critically deficient in this respect; it doesn't really have a
> concept of tracking branches, and merging is not commutative; if you
> merge a branch that just merged from your branch, an unnecessary new
> changeset is made.  But if you are rebasing then you don't need to
> worry about that.  As I said, it's just more work.

        Which is why we have sync-tree. Yes, I have to keep track myself
 of which delta I am currently merging, apply a merge only once
 into each branch, and immediately sync-tree with the other branches.

        And since it is all in one fully automated script, called
 arch_upgrade, that takes any new upstream, updates all my topic branches
 and my integration branch automatically, I don't see this as much more
 work. 
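
 As a rough illustration only, the "apply each delta once per branch"
 bookkeeping described above could be sketched in plain shell. Nothing
 here is part of arch, sync-tree, or arch_upgrade: the log directory,
 the function names, and the merge command passed in are all
 hypothetical.

```shell
#!/bin/sh
# Sketch: apply each upstream delta at most once per branch by keeping a
# plain-text log of what has already been applied.
# MERGE_LOG_DIR and both function names are assumptions for illustration.
MERGE_LOG_DIR=${MERGE_LOG_DIR:-.merge-log}

# already_applied BRANCH DELTA
# Succeeds if DELTA is recorded in the log for BRANCH.
already_applied() {
    [ -f "$MERGE_LOG_DIR/$1" ] && grep -qxF -- "$2" "$MERGE_LOG_DIR/$1"
}

# apply_once BRANCH DELTA COMMAND [ARGS...]
# Runs COMMAND only if DELTA has not been logged for BRANCH yet, and
# records DELTA once COMMAND has succeeded.
apply_once() {
    branch=$1
    delta=$2
    shift 2
    if already_applied "$branch" "$delta"; then
        return 0          # this delta is already in the branch
    fi
    "$@" || return 1      # the real merge command, supplied by the caller
    mkdir -p "$MERGE_LOG_DIR" || return 1
    printf '%s\n' "$delta" >> "$MERGE_LOG_DIR/$branch"
}
```

 The sketch assumes branch names contain no "/" (arch-style names such
 as flex--devo are fine), since each branch gets one log file.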

        Indeed, I have yet to see any porcelain that makes merging a
 new upstream commit into all my topic branches and the integration
 branch a single operation. I suspect that it is more work in git, at
 least until I can replicate my arch scaffolding for the git porcelain.
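
 For comparison, a minimal sketch of what such a single operation might
 look like on top of git. This is not the arch_upgrade script and not an
 existing git porcelain; the function name, the RUN variable, and the
 branch names in the usage note are assumptions. By default it only
 prints the commands it would run.

```shell
#!/bin/sh
# Sketch: merge a new upstream into every topic branch, then bring the
# integration branch up to date with upstream and with each topic branch.
# RUN defaults to "echo" so the commands are printed, not executed;
# set RUN to the empty string to run them for real.
RUN=${RUN-echo}

# merge_upstream_everywhere UPSTREAM INTEGRATION TOPIC...
merge_upstream_everywhere() {
    upstream=$1
    integration=$2
    shift 2
    # The remaining arguments are the topic branches.
    for topic in "$@"; do
        $RUN git checkout "$topic" || return 1
        $RUN git merge --no-edit "$upstream" || return 1
    done
    $RUN git checkout "$integration" || return 1
    $RUN git merge --no-edit "$upstream" || return 1
    for topic in "$@"; do
        $RUN git merge --no-edit "$topic" || return 1
    done
}
```

 A dry run such as "merge_upstream_everywhere upstream devo topic/a
 topic/b" prints the sequence of checkouts and merges. The function
 stops at the first failing command, so a merge conflict still has to be
 resolved by hand before rerunning.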

>>> Can you express this problem with reference to a particular history
>>> of an integration branch?  I will provide some short git commands to
>>> extract the information in the form you are after.
>> 
>> http://arch.debian.org/cgi-bin/archzoom.cgi/srivasta@debian.org--lenny?color=sunny?expand
>> Take any package. Say, flex. Or flex-old. You have all my feature
>> branches there. The --devo branch is the integration branch.  Please
>> show me an automated way you can grab the feature branches and
>> generate a quilt series that gives you the devo branch.  The diff.gz
>> is how we get from upstream to the devo branch (modulo ./debian); if
>> you can break that down nicely for the folks who want each feature
>> separate, that would work as well.

> Thanks for restating the problem clearly.  While the underlying
> problem is easily approached and I would still call it trivial, the
> details of what you are asking for make it impossible - because quilt
> series cannot contain merges (someone correct me here if it can and I
> can go forward).

        I don't use quilt, so I am not the one to answer this.

> Shipping changes for upstream inclusion as a *single* set of quilt
> patches is not possible if you are including merges, but if you allow
> the patches to be grouped, and introduce a new type of patch which
> encapsulates a merge (gitk has one example of this; it uses different
> identifiers to represent which file's lines are included), then it can
> be done.  The apply-patches script would need extending to support
> this, but I don't think that's particularly show-stopping.


        Great. This is, of course, a totally new proposal for debian
 source package format; and one on which I hold no opinion; and will let
 the interested parties and dpkg maintainers comment on it.

> However, ignoring the merges, so far we're not that far away from the
> "script" being 'git-log -p' or 'git format-patch upstreamrev'

        But not all Debian maintainers are using git; and my workflow is
 repeated and frequent cross merges, so it does not appear to be very
 interesting to me either. But I am sure there are people who will
 indeed find it useful.
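
 For those who do use git, the "script" from the quoted text might be
 sketched roughly as below. It assumes a linear history between the
 upstream revision and the branch: git format-patch emits patches only
 for non-merge commits, which is exactly the limitation discussed above.
 The function names and the output directory are hypothetical.

```shell
#!/bin/sh
# Sketch: export the commits between an upstream revision and a branch
# as a quilt-style patch directory with a "series" file.

# write_series DIR
# List the *.patch files in DIR, in sorted order, in DIR/series.
# format-patch numbers its output (0001-..., 0002-...), so sorted order
# matches commit order.
write_series() {
    dir=$1
    : > "$dir/series" || return 1
    for patch in "$dir"/*.patch; do
        [ -e "$patch" ] || continue     # directory holds no patches
        basename "$patch" >> "$dir/series"
    done
}

# export_quilt_series UPSTREAMREV BRANCH OUTDIR
export_quilt_series() {
    upstream=$1
    branch=$2
    outdir=$3
    mkdir -p "$outdir" || return 1
    # One patch file per non-merge commit in UPSTREAMREV..BRANCH.
    git format-patch -o "$outdir" "$upstream..$branch" >/dev/null || return 1
    write_series "$outdir"
}
```

 For example, "export_quilt_series upstreamrev devo debian/patches"
 would leave numbered patches and a series file in debian/patches,
 assuming those revision and branch names exist in the repository.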

> Also, having never really used arch: if you can provide me with the
> commands to get a copy of those branches (the man page is sadly not
> very forthcoming), I'll give the git-archimport script a whirl and
> see if I can get it imported and show how this can work in practice.
> If someone with git-archimport experience can perform this and publish
> the repositories somewhere, I'd be very grateful.

        Try 
 tla register-archive --present-ok http://arch.debian.org/arch/private/srivasta
 tla grab http://arch.debian.org/arch/private/srivasta/grab/flex

        This should grab a mostly empty directory structure in
 ./manoj-packages.

 cd manoj-packages/flex/upstream/
 archive=srivasta@debian.org--lenny
 # Proceed only if the archive has at least one upstream version.
 my_version=$(tla versions -A "$archive" flex--upstream | tail -n 1)
 if [ -n "$my_version" ]; then
   # Fetch the latest version of every branch in the flex category.
   for branch in $(tla branches -A "$archive" flex | sort); do
     version=$(tla versions -A "$archive" "$branch" | tail -n 1)
     # Skip branches with no versions and versions already checked out.
     if [ -n "$version" ] && [ ! -d "$version" ]; then
       tla get -A "$archive" "$version" "$version"
     fi
   done
 fi

        There. All branches, in one place.


>> If your code works well enough every single time a new upstream comes
>> around and I release a new version of flex or whatever, I'll throw in
>> the generated quilt patches.

> I think what is required is a rethink of the problem.  What are we
> trying to achieve, and are there other ways to achieve it that would
> solve the problem far more effectively?

        Remember, the problem is to  create a new Debian source package
 format that allows people using feature branches to present a set of
 patches that other people are already using -- in other words, a quilt
 series.

        If you propose a new patch format, you also have to convert a
 quilt series into the same format (or perhaps your format
 trivially reduces to a quilt series).

> Version control systems that have content-addressable filesystems
> (essentially, git and Monotone) are inherently efficient to
> distribute; as only the changes between versions need be distributed.
> The notion of stream compressing tarballs is archaic compared with
> being able to search for deltas anywhere in the source tree.

        Which is great, but I fear it will not fly as the one and only
 
> The essence of what I'm saying is to view a distributed git archive as
> a /replacement/ (or, if you prefer, complement) for the source
> archives; and going by previous results, this will result in an
> overall reduction of the size of the archive, faster distribution -
> even P2P - and preserving more history.

> Instead of distributing large source archive packages, the upstream
> sources are imported (perhaps as a tarball, or perhaps using a rich
> history-preserving git archive), and the patches are applied as
> commits on feature branches.  When you 'apt-get source', you are
> simply checking out the head version (and probably the upstream as
> well).  All the information you are after - individual changes from
> the upstream - are available.  If the upstream updates, then the
> source archive can represent that in the most convenient fashion to
> the maintainer - be it a rebase of the applied patches as you have
> used previously, or a simple merge.

> And please, I'm not looking to start a VCS flamewar here - I'm talking
> about git in its capacity as a file distribution and archival
> mechanism.  At this task, it excels.  It doesn't matter what the
> upstream uses; they can all be converted to git well.

        Heh. More power to you, friend, for proposing git as the
 replacement for wig & pen.  I wish you luck.

        manoj
-- 
May a hundred thousand midgets invade your home singing cheesy
lounge-lizard versions of songs from The Wizard of Oz.
Manoj Srivastava <srivasta at debian.org> <http://www.debian.org/~srivasta/>  
1024D/BF24424C print 4966 F272 D093 B493 410B  924B 21BA DABB BF24 424C



More information about the vcs-pkg-discuss mailing list