[Pkg-xen-devel] git workflow, redux

Thu Aug 23 23:08:32 BST 2018

Hi team,

On 08/23/2018 08:07 PM, Ian Jackson wrote:
> Summary:
> 
> I have tried the packaging-only repo and I really don't like it at
> all.  I don't know how anyone copes with this - such hard work!
> IMO we should switch to git-debrebase.  (As an alternative, if
> you don't trust git-debrebase because it's my own tool, gbp pq
> would be better, too, even though it's not as good as git-debrebase.)
> 
> Particularly, now that we have more people making substantial
> contributions, it is important to have something that works for as
> many as possible, and that is low friction for common tasks.

Ack, especially the low friction part.

Xen is a broad product that can be used in many different ways, and bugs
that appear might not be easily reproducible (nor fixes testable) by
anyone other than the reporter, who has the specific hardware and
configuration.

The users of Xen in Debian are usually a bit more technical, since it's
not very easy to get everything running correctly. I suspect that when
someone in our user group runs into problems, they don't just sit in a
corner and wait until someone else fixes their specific problem, but
are willing to participate in getting it done.

The "onboarding experience" of working with the packaging is a crucial
thing here. It's preferable that users can try things out themselves,
without having to ask someone else to apply some patch and make a test
build all the time. Our goal should be that the time spent on anything
other than carrying out the actual idea (a fix, an improvement,
whatever) for Xen is reduced to an absolute minimum.

So, yes, that's also the 'as many as possible' part indeed.

> Knorrie wrote on irc:
> 
>   ... I only want to discuss it if you have packaging scenarios and
>   want to do them in both ways and then discuss the result, so we can
>   discuss based on actual work/commands/etc instead of feelings
> 
> Well, I don't propose to actually try doing the same task two
> different ways, at least unless I completely fail to manage it the
> first way I try.

Yes. What I mean is that in discussions that have to result in a
decision to change things, or not, I'm allergic to arguments that are
based on generalizations and feelings ("much more work!", "hundreds of
things!", "so difficult!", "omg!", "but I'm used to this"), because they
express a feeling instead of facts, and/or often show a lack of
interest in any opinion other than one's current one.

> But I will consider some scenarios:
> 
>                     packaging-only      git-debrebase
> 
>   Build binaries    100 lines of        dgit sbuild -A
>   for release       instructions!         or
>                     + pbuilder/sbuild   dgit pbuilder [options]

The left column could be:

git clean -dfx; debian/rules orig; debian/rules debian/control; pdebuild
--use-pdebuild-internal --auto-debsign --debsign-k XXX --configfile
.pbuilderrc-sid

This is a oneliner, which can go into an alias, or just in the shell
history (ctrl-r!). This is not "100 lines of instructions" and a README
is something that you only have to read once for background information
to understand why you are doing what you are doing.
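
As a rough sketch of the "alias" idea, the same steps can be wrapped in
a shell function. The name xen_pbuild is made up for this example, and
XXX is still a placeholder for the signing key id. The steps run one
after another regardless of exit status, exactly like the ';'-separated
oneliner above.

```shell
# Hypothetical helper; the function name is an assumption, not part of
# the packaging. Run it from the top of the packaging tree.
xen_pbuild() {
    # Remove everything untracked, as in the oneliner above.
    git clean -dfx
    # Same rules targets as above; no '&&' on purpose, to keep the
    # behaviour identical to the ';'-separated oneliner.
    debian/rules orig
    debian/rules debian/control
    # XXX is a placeholder for your own key id.
    pdebuild --use-pdebuild-internal --auto-debsign --debsign-k XXX \
        --configfile .pbuilderrc-sid
}
```

Put it in your shell startup file and a build becomes a single word.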

>   Build binaries    ??? no idea         dpkg-buildpackage -uc -b
>   ad hoc for         README.source
>   testing            dirties the tree

What you express here, "it's dirty", is an "ugh! dirty! ew!" feeling,
because you apparently do not like the fact that actual instructions
about what to do are needed at all.

Do realize that someone else who has to use tooling you wrote has to go
through the same process of "wtf! no idea how to do this! what is this
shit! ew!" first, while it's the most normal thing in the world to you now.

>   New upstream      ??? no idea how     # choose VERSION and COMMIT
>   rebase              to do this        git tag upstream/VERSION COMMIT
>                     fix up conflicts    git debrebase new-upstream VERSION
>                      in quilt omg!      # fix up conflicts in git
>                                         # edit changelog at some point

A one-line edit in debian/changelog is all it takes. You're exaggerating
again, which is not helpful.

And yes, if there are conflicts (which I've only seen happening when
converting from 4.8 -> 4.9 -> 4.10 -> 4.11), doing this with quilt and
having a git diff that is a diff of diffs is absolutely suboptimal
compared to doing the same in "normal" git conflicts.

> 
>   Add a patch       obtain tree         # edit upstream files
>                     with upstream src   git commit
>                     do quilt stuff 
> 
>   Amend a patch     obtain tree         git debrebase -i
>                     with upstream src   # is just like git-rebase
>                     do quilt stuff
> 
>   Drop a patch      edit series or      git debrebase -i
>                      something ?        # is just like git-rebase

git revert abcd (where abcd is the commit that added the patch), and add
some information to the commit message about why it can be dropped.
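
A minimal sketch of that, in a throwaway repository. The file names,
commit messages and the reason given for dropping the patch are all
made up for the demo.

```shell
# Throwaway demo repo; nothing here touches a real packaging tree.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.org"

mkdir -p debian/patches
echo "base" > debian/changelog
git add .
git commit -q -m "Initial packaging"

# The commit that adds the (made-up) patch and lists it in series.
echo "fix" > debian/patches/0001-example.patch
echo "0001-example.patch" > debian/patches/series
git add .
git commit -q -m "Add example patch"
added=$(git rev-parse HEAD)

# Drop the patch by reverting the commit that added it, and record
# why in the commit message instead of keeping it in someone's head.
git revert --no-commit "$added"
git commit -q -m "Drop example patch" \
    -m "No longer needed (reason made up for this demo)."

git log --oneline
```

The history then shows both the addition and the removal, each with its
own explanation.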

> 
>   Cherry pick       git format-patch    git cherry-pick
>   from upstream     edit series?
>                     how to check it
>                     applies, or build?

Yes, it looks like the most interesting part of this discussion is going
to be about patch management. The packaging-only approach with quilt and
detached patches is absolutely painful compared to using tooling that
has been written with exactly the goal of doing that better.

>   Update d/control  edit control.in     # edit control.in
>                     run rules to        debian/rules debian/control
>                      see output,        git commit debian/control
>                     debdiff?            git diff
> 
>   git blame on      not possible        git blame
>    upstream source

Not possible? Well, not in the same repo no, but I can't imagine you
don't know how to do a git blame on the upstream source. Sorry, but this
does not remotely make sense at all.

> 
>   machinery needed  orig targets        README.source saying
>    in tree          genorig             `see dgit-maint-debrebase(7)'
>                     README.source       and mentioning d/control.in

Yes, the genorig and other really old python stuff is a dead horse that
is being dragged around. Instead of trying to rewrite it, it would be
preferable to use tooling that is well-maintained and have all of this
removed, since it's distracting from the actual work that should be
done, improving the Xen stuff.

>   push to salsa     check that          git debrebase prepush
>                      patches apply?     # ^ won't fail
>                     git push            git push
> 
>   uploading         100 lines of        git deborig
>                     pratting about      dgit push-source
>                     debsign, dput       (or dgit push if NEW)
> 
> You'll observe that the left hand column contains references to
> documentation.  It also contains some ??? because I haven't bothered
> to RTFM quilt recently.  (I used it once, a long time ago.  We have
> much better tools now.)
> 
> You'll see that the right hand column is wider and longer.  That
> is because it contains the complete actual command lines.
> 
> 
> In fact, when I was doing the upload to experimental just now, I so
> much missed git that I did the following:
>   * Get the relevant branch and tag from salsa
>   * Follow the instructions in README.source.md to generate a
>     source package
>   * dgit import-dsc
>   * lots of git diff (sadly this procedure produces a not very
>      helpful history)
>   * dgit sbuild -A
>   * dgit push

I'm not going to respond to these ramblings.

> I'm going to quote some snippets from irc:
> 
> 17:17 <babilen> My take on this is, is that I'm more than confident to
>   deal with any security issue that arises in the future with the
>   packaging style used for 4.8 in the stretch-security uploads
> 
> Likewise.  (I'd like to take this chance to publicly thank Wolodja for
> his work on the stretch security updates.)

Woohoo! +1 Thanks!

> 17:27 <Knorrie> also, a relevant question is whether we want to have a
>   patch-burden-heavy workflow in debian stable, or if we could just
>   follow the upstream stable branch, like the kernel team does with
>   stable kernel releases
> 
> I don't think it is realistic to expect the situation to be different
> in buster than it was in stretch.  When I uploaded 4.8 to stretch, it
> was an upstream RC without patches.  The way upstream provide security
> patches means we sometimes have to take upstream stable branches plus
> some patches from advisories; sometimes upstream unstable branches (!)
> and sometimes some unholy mixture.
> 
> It's awkward but git-debrebase makes this reasonably straightforward.
> Doing this stuff with quilt would be nearly unworkable.

Because of the complexity of the work that's flying by in the XSAs and
generally in the code, the thing I'm mostly interested in here is which
approach is the best to end up with a working result for everyone, which
is the thing that we call 'stable' in Debian.

Is it giving the stable-X branch to our users, because we know it passed
all the upstream tests? Or is it cobbling together our own collection of
changes?

Obviously, a patch rebase tool can make something apply and help it
compile again. But it does not detect the case where you have patches X
and Y but not Z, where Z seemed to be unrelated, yet leaving it out
breaks a corner case for some user, while the combination of all of
them is what stable-X upstream contains...

But, this is more about how upstream organizes things, how often stable
releases are tagged and there's more discussion about it upstream, which
should be participated in (about maturity level of the patch publishing
process etc), instead of discussing it here.

Also, I am fully aware of the fact that having someone actually working
at $upstream and also actively participating in our packaging is a
luxury. (Yay!) But, for doing the work, it's not optimal to sometimes
have to rely on this person to spot invalid combinations of added
patches.

Right here, I should be able to mention a specific case where this went
(almost) wrong. I can dig up things from IRC logs if anyone wants.

> 23:41 <Knorrie> babilen: doing stable security updates for a single
>   xen version is a whole different flow than organizing a rolling
>   release (with stable-backports) which follows latest upstream
>   release in unstable/testing (and upstream rc in experimental), while
>   also allowing to easily cherry-pick things or merge branches between
>   them
> 
> I think git-debrebase is going to be easier for all these things than
> the current approach.

Ok, let's try it! Thanks a lot for doing the above writeup. After
filtering out the irrelevant stuff, I certainly see that it is better
for us and our users to change and spend time to invest in improving the
external tooling itself instead of improving the tooling inside this
packaging-only repo.

The comment about me not trusting the tools you wrote is irrelevant,
since I don't see any reason to distrust them at all.

How to proceed:

Can you provide me with some rtfm pointers and instructions about how
you would like to see things being transformed? Or would you like to do
it together? Do we want to do this before or after Sep 10?

I actually already have some questions about the git-debrebase behavior
to start with:

1: meta information

When writing software and git commits, my main incentive is to have a
future reader understand what I'm doing and why. I tend to put quite
some effort into this when I do things, not least because...

I'm also on the other side a lot, unable to find information in the
history about *why* something was done instead of *what* was done, and
I see how it hurts future development. This is a major waste of time
and blocks progress. Don't hesitate to ask me for examples.

The quilt workflow allows one to add a patch and to add (meta-) comments
to the git commit adding that patch, why it needed to be added etc. When
directly adding patches at the git level, this is not possible. How is
this meta information stored for future readers? Does the reader have to
search in the debian/changelog and hope something is mentioned about it?
Is there something else? I'd like to see some solution/process for this,
instead of having the information in the head of the person who did it.

2: usage of git

One thing that is quite disturbing for me personally, is the result that
is left behind in a git repository when using git-debrebase. Instead of
having a meaningful history (which is my favourite thing to end up with,
something you can "browse" and read meaningful things, see above) git
seems to be used as some sort of ftp server, where a rm *; cp -a * is
done every time.

This results in the git history looking like a football-field wide rake,
listing the same patches over and over and over and over again, like the
history in the git-dpm repo that we have archived.

The relationship between patches and changes in them is not stored in
git history because every time things end up in new unrelated commits.

The history of the current master branch, doing the packaging from 4.8
via 4.9 and 4.10 to 4.11 and all the changes still does fit on your
screen. I like this. I can just read back what I did and why.

So my question is: how does debrebase care about future readers? How do
I quickly find out what *actual* changes have happened to the packaging
from the large commit blurb that I'm looking at? Does this need a README
with git commands to filter commits in debian/ or other things, or does
the debrebase tooling need to get features to analyze the git history
again to produce a useful overview of this information?
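
To make the "git commands to filter commits in debian/" idea concrete,
here is a minimal sketch using a plain git path filter in a throwaway
repository. The directory layout and commit messages are made up, and
this says nothing yet about how readable the result is on a real
git-debrebase branch.

```shell
# Throwaway demo repo mixing "upstream" and "packaging" commits.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.org"

mkdir -p debian xen
echo "v1" > xen/file.c
git add .
git commit -q -m "Upstream change"

echo "control" > debian/control
git add .
git commit -q -m "Packaging: add control"

echo "v2" > xen/file.c
git add .
git commit -q -m "Another upstream change"

echo "rules" > debian/rules
git add .
git commit -q -m "Packaging: add rules"

# Limit the log to commits that touch anything under debian/.
git log --oneline -- debian/
```

Of the four commits, only the two packaging ones are listed.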

Knorrie