[pkg-apparmor] UDD notifications script

Thu Feb 19 18:37:18 UTC 2015

Hi,

u wrote (19 Feb 2015 17:20:52 GMT) :
> I think I've applied everything you mentioned and share your vision.

Yay, congrats, I find it a lot more readable and cleaner. Thanks! :)

Want more? Here are a few more comments. Again, feel free to tell me
"now that's enough, thank you" or "patches welcome", about some or all
of the following comments.

* This looks suboptimal (but of course, as long as we haven't
  a massive amount of usertagged bugs, it should be just fine):

 	for bug in cursor.fetchall():
		buglist.append(bug)

  ... so I'm curious: can't we do that in one batch operation instead?

* When using the get_bug_list output (a list of pairs) in
  send_team_notification, we have to access it by indices: bug[0] to
  get the bug ID, bug[1] to get the added or removed usertag. It's not
  astonishingly obvious. I would recommend that get_bug_list outputs:

  - either a list of dictionaries so we can then write bug['id']
    and bug['usertag']
  - or a dictionary whose keys would be the bug IDs, and values would
    be the usertag; but wait! this would be too limiting, read on:

* I see we first retrieve a list of usertagged bugs (in get_bug_list),
  and then run one more query per bug that had a usertag added or
  removed, in order to get its title. Alternatively, get_bug_list
  could use a JOIN to retrieve all the data we need about all
  usertagged bugs in one go. No idea the higher number of queries
  (current implementation) costs more than retrieving the title of
  bugs that we won't use (suggested implementation). Anyhow, putting
  aside performance and resources considerations, I would find the
  implementation of send_team_notification more readable and elegant
  if get_bug_list returned a list of dictionaries, each such
  dictionary having keys (id, title, usertag). Food for thought :)

* pylint says that "def get_bug_title(bugid) :" has one too much
  space. It's also quite unhappy about a lot of other small things.
  Possibly it knows Python best practices better than you and I.
  I've no idea (really -- it might produce obsolete recommendations,
  and I've not checked). Similarly, pyflakes is a bit mean and detects
  two unused variables.

* In:

	cursor.execute("SELECT title from bugs WHERE id='%s'" % bugid)
	for bug in cursor.fetchall():
		return bug[0]

  ... we're somewhat acting defensively as if we didn't have
  a guarantee that the id is a unique identifier (is it the case?),
  which can arguably be a good thing.

  But we're actually not defensive in practice, since we're always
  returning the title of the first matching bug and ignoring any
  other. So, retrieving all the data using fetchall(), and then
  iterating on the first result only in a for loop feels... weird.

  So I would simply use fetchone() instead of fetchall(), and drop the
  for loop entirely.

  Now, if you really want to be more defensive, we can also count the
  number of matching bugs and throw an exception if it's 2 or more,
  but I suspect we could avoid that after a quick look at the
  UDD schema. What's an ID if it's not an ID? :)

* Mixing tabs and spaces for indentation produces very confusing
  results in my $EDITOR:

	except IOError as e:
		send_error_mail("Could not save state file.")
        return False

	return True

  ... here, the "return False" statement is really part of the
  exception handling block, even if it doesn't look like it when one
  has tabs display width = 8 spaces. I suggest applying the
  indentation changes suggested by pylint to avoid that :)

* In the read_statefile function, I believe that the "return old"
  statement has no good reason to be in the try block. In general it's
  good practice to guard as little code as possible with exception
  catching blocks. In this case, as a non-expert in Python, I have no
  idea if it may have any adversarial consequences in practice to
  leave it there, but IMO it would be good code hygiene to move it
  after the exception handling block. And by the way, it'll make it
  clearer what is the expected output of the function on success :)

* 'print "No new data."' might produce useless noise when run from
  cron, no?

* The 'else' on line 93 (at commit 40226e0) forces us to indent the
  main algorithm of this function. I generally prefer dropping such
  else clauses, and instead return earlier in case we cannot do our
  job... which is actually already done in this function, so the
  refactoring I'm suggesting is trivial.

* The send_team_notification function still uses the sender and
  receiver global variables. Any reason why?

* It seems that we're passing to compare_state:
   - a string we got from read_statefile
   - a list we got from get_bug_list
  ... and then compare_state treats both as strings. For some reason,
  it works anyway.

  I think I'd find the design cleaner if compare_state took structured
  data as input (a set, a dictionary, whatever it prefers). And then,
  read_statefile should handle deserialization and return data in that
  same format. And then, we would be handling {,}deserialization only
  in save_statefile and read_statefile, and using only structured data
  anywhere else, which feels better to me.

* compare_state has a comment that says "convert new state string to
  dictionary", while it's apparently (from my Python almost-newbie
  PoV) creating a Set object instead. Who's right? :)

* The "Send the message via our local SMTP server" comment feels
  wrong, now that you've made the SMTP relay configurable.

Cheers!
-- 
intrigeri