[Aptitude-devel] aptitude-qt VCS usage and code review

Fri Jul 9 21:38:17 UTC 2010

2010/7/9 Daniel Burrows <dburrows at google.com>:
> On Fri, Jul 9, 2010 at 2:14 AM, Piotr Galiszewski <piotr at galiszewski.pl>
> wrote:
>>
>> 2010/7/9 Daniel Burrows <dburrows at google.com>:
>> >   These are the ways of retrieving package information from a package
>>
>> > that spring to mind immediately:
>> >   1) Fully eager: as soon as the Package is created, copy all the values
>> > from
>> >       the PkgIterator over into the Package object (I believe the GTK+
>> > code
>> >       effectively does this).
>> >   2) Fully lazy: the first time a member of Package is accessed, copy
>> > all
>> > the
>> >       values from the PkgIterator into it (what the current Qt code
>> > does).
>> >   3) Memberwise lazy without caching: just store a PkgIterator; when a
>> >       component of the Package is requested, return the corresponding
>> >       member of the PkgIterator (what the curses frontend does).
>> >   4) Memberwise lazy with caching: when a component of the Package is
>> >       requested, copy the corresponding member of the PkgIterator into
>> > the
>> >       Package object and return it (I don't think any aptitude code does
>> > this).
>> >   Which of these did you try?  Do you have numbers showing how big an
>> > improvement you saw?  Could you try some of the others to compare?
>>
>> I tried second and third points. Do not have any hard numbers. My
>> assumptions were only based on feeling and time of response from GUI.
>> I do not know how to provide numbers, but I will try.
>
>   I might just instrument the GUI to run whatever you're testing immediately
> on startup, then exit, and use the "time" command to check how long it took.
> An alternative, maybe better, would be to save the system time before and
> after the process you're interested in and log it (say, using LOG_()).
>   When I'm timing stuff like this I generally run it twice to eliminate
> cache
> effects (the time on an empty cache is somewhat interesting, but I think
> that normally a user of the GUI interface will have the package database
> cached).
>

Getting numbers was much harder than I had thought. I was also not
sure what I should measure. I have decided to time creating list of
package objects with viewing and sorting by name on the list (only
name parameter is really used):

1) Fully eager
	2.75s
2) Fully lazy (name and current version are loaded on creating package object):
	0.39s
3) Memberwise lazy without caching:
	0.7s
4) Memberwise lazy with caching:
	0.4

Later I have tried to measure time of searching two characters long
word by name and description. Unfortunately, I was unable to get any
results (probably due to Qt signals and event queue)

I have used getrusage to get times.

My laptops has quite modern processor (c2d T 9400 (2x2.5) and 4 GB of
ram). On slower computers the difference will be greater.So it looks
like the caching gives some nice improvements. I do not have any
numbers, but I think that option number 4 will be even better than
option 2. With option number 2 first searching by description is quite
long. Probably time is similar to measured Fully eager time. Option 4
probably will be easier to implement. Most of the variables are
strings so one simple if with  empty() option should work. Current
behaviour of Qt frontend is little problematic, cause ensure_loaded
discards const qualifier of getters.. In 4 mutable should work

>>
>> The speed was
>> the easiest to observe during sorting and searching packages. On the
>> other hand, I think that returning of the most members will be short.
>> Only description loading and maybe something more is time consuming. I
>> will experiment with this
>
>   My main concern is that I don't want to get a bunch of complicated
> code to optimize something that doesn't need to be optimized.  But then,
> maybe it'll turn out that it does need to be optimized after all...
>

I completely understand you, But it looks for me that some
optimizations of program startup could be nice :)

>>
>> >   Is there any way to lose the QFoo types in Package?  I'm not sure we
>> > want to reuse it directly, but there might be a way to solve this
>> > problem
>> > more generally, as long as it doesn't require the use of Qt-specific (or
>> > GTK-specific) types.
>>
>> Yes. There are only QStrings and QStringLists. std::string and
>> std::list could also be used. Moreover, I think that this class
>> requires much more thinking. Maybe it will be better do extend
>> PackageInformation class (and rename this to package_information or
>> package ;) ) ?
>
>   Yeah, that could be a better place to start.
>

So, what are you proposing as my next task? :)

>>
>> There is also one more problem I am not entirely sure
>> hot is should be solved. What should be done after cache updating?
>> Packages could be changed so the state of package class also should be
>> updated. Maybe package objects should be stored in map/hashtable in
>> packages_pool instead of simple list?
>
>   You need to deal both with changes to package state, and with the
> whole cache being discarded and reloaded.
>   For changes to package state, you need some sort of map from
> package iterators to objects in your GUI model.  The backend supplies
> you with a set of packages whose state has changed, and you just
> need to find them and update them.  In the GTK+ code, this is a map
> from packages to iterators in the GTK+ tree model (we use a stock
> one).  If you're responsible for storing your own list, I'd recommend
> using boost::multi_index_container for this case (one index for the
> display order of the elements, another for finding elements by package
> iterator).

I have planned to use QMap for this, but it looks like
Boost.MultiIndex is a superior for this. For Qt model indexing is
required but it is also supported.

I have to learn more about boost :)

>   In the GTK+ and curses frontends, we deal with the cache being
> reloaded by throwing away all the currently displayed packages
> and rebuilding the view from scratch.  The only real alternative would
> be to somehow merge the new cache into the display, maybe on the
> basis of name and/or version strings?  I don't think it's worth the
> trouble for your first version.
>

That was also my plan. My only concern is about earlier opened
package_info tabs. They also contain reference to package object, so
they should be updated. But probably it is not big problem, cause I
could identify them also by name

>   Daniel
-- 
Regards
Piotr Galiszewski