[Daca-general] scan-build and metrics gsoc proposals and DACA

Michael Tautschnig mt at debian.org
Mon Mar 18 01:49:23 UTC 2013


[Keeping Sylvestre and Zack in CC, not sure you are subscribed]

Hi Raphael, hi all,

[...]
> And that's just the tip of the iceberg, but the real problem is proper job 
> scheduling and data processing (to e.g. generate the "dumb html" reports or 
> combined views, etc.)
> 
> So here's where, I believe, all three projects meet: there is currently no 
> proper infrastructure for doing that kind of thing.
> 
> For DACA I've an initial implementation of such a system using gearman jobs 
> as a way to do everything from notifying of a new package version to 
> responding to that "job" and triggering other jobs ("get the list of tools", 
> "call every tool", "store result", "notify of new result", "get the list of 
> result analysers", "trigger new jobs for every single tool", etc).
> This started well and the idea seemed good at first sight as you can connect 
> multiple job servers and workers and do all that stuff; but that's just one 
> part of it.
> It seems like what is actually needed is something like hadoop & friends, 
> and that's the point where I'm currently stuck with DACA. We don't even 
> have the proper stack in Debian.
> 
> What do you think about all this? Do you consider that it would be better to 
> re-think a little more the proposals and try to come up with something 
> bigger (but split so that more than one student can work on it)?
>
[...]

For lack of time, I haven't yet looked into the GSoC proposals. But I am facing
the very same issues in my current research, where I'm trying to run our
software verification tools on all the packages in Debian.  There is still quite
a bit of work ahead of me before I get there, but at least automated builds
using our own research compiler infrastructure are already happening (resulting
in more than 100 bug reports, usertagged goto-cc).

I really see data processing as the biggest challenge, and I don't yet have a
solution. But there are a few things I could contribute:

- One of my students (MSc project, not GSoC) is going to look into this over the
  next few months. I will be happy to share the results.
- I'd like to investigate whether Jenkins (with auto-generated jobs) could
  take care of all the scheduling and triggering, plus notifications, etc.
  This would immediately open up scaling to build farms, as Jenkins natively
  supports a master/slave setup.
- I am currently working towards acquiring hardware via the university. Once
  I've got some (more), I'll experiment with a Jenkins setup. If this succeeds,
  I'll happily share it.
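
To make the job chain from the quoted message concrete ("get the list of
tools", "call every tool", "store result", "notify of new result"), here is a
minimal sketch of the idea. It is an assumption-laden illustration only: it uses
a plain in-memory queue rather than gearman or Jenkins, and the tool names, the
package name and the run_pipeline function are all hypothetical.

```python
from collections import deque

# Hypothetical tool registry: the names and outputs are placeholders,
# not real DACA tools.
TOOLS = {
    "tool-a": lambda pkg: f"tool-a report for {pkg}",
    "tool-b": lambda pkg: f"tool-b report for {pkg}",
}


def run_pipeline(package, tools=TOOLS):
    """Handle a 'new package version' event by chaining jobs through a FIFO queue."""
    queue = deque([("new_version", package)])
    results = {}        # stands in for a result store
    notifications = []  # stands in for a notification channel

    while queue:
        job, payload = queue.popleft()
        if job == "new_version":
            # "get the list of tools", then schedule one job per tool
            for name in sorted(tools):
                queue.append(("run_tool", (payload, name)))
        elif job == "run_tool":
            pkg, name = payload
            queue.append(("store_result", (pkg, name, tools[name](pkg))))
        elif job == "store_result":
            pkg, name, report = payload
            results[(pkg, name)] = report
            queue.append(("notify", (pkg, name)))
        elif job == "notify":
            notifications.append(payload)
        else:
            raise ValueError(f"unknown job type: {job}")
    return results, notifications


if __name__ == "__main__":
    # "example_1.0-1" is a made-up package version used only for illustration.
    results, notifications = run_pipeline("example_1.0-1")
    for key in notifications:
        print(key, "->", results[key])
```

Each job only enqueues its follow-up jobs, so the same structure could in
principle be spread over several workers; the hard part mentioned above, the
data processing on the stored results, is not covered by this sketch.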

Best,
Michael


