[Teammetrics-discuss] MemoryError Observations

Andreas Tille andreas at an3as.eu
Thu Sep 22 08:58:59 UTC 2011


On Mon, Sep 19, 2011 at 04:58:41PM +0530, Sukhbir Singh wrote:
> Yes, we will have a MemoryError almost surely the first time the code
> runs *only* for SVN repositories. This is because each time Paramiko
> fetches data, it creates a new connection object.

Ahhh, that's bad.

> Possible solutions for this problem:
> 
> 1. I am not sure, but if we perform all the operations on Alioth and
> then fetch everything all at once (like you said :), it might work.
> But we need to test this because the size of the data is large given
> that pkg-perl itself has 77977 revisions. Whether this works or not,
> this is to be tested.

If you perform the complete data gathering on Vasks first, the
calculation time on that machine is no higher than if you fetch the data
in single steps with Paramiko.  The only difference is moving the data
from Vasks to blends.d.n.  I fail to see why doing it all at once should
be more complicated, time-consuming, or otherwise resource-hungry than
doing it in single steps.  I rather expect a performance gain from doing
it in one rush.
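
To illustrate option 1, here is a minimal sketch, assuming Python with
Paramiko: it opens one connection, runs a single command on Vasks, and
streams the result line by line instead of creating a new connection
object per fetch.  The function names, the host/user arguments, and the
remote command are hypothetical placeholders, not the existing
teammetrics code.

```python
import contextlib


def open_client(host, user):
    """Open one SSH connection (needs the third-party paramiko package)."""
    import paramiko  # imported lazily so the helpers below work without it

    client = paramiko.SSHClient()
    client.load_system_host_keys()
    client.connect(host, username=user)
    return client


def fetch_in_one_rush(client, command):
    """Run one remote command and yield its output line by line.

    `client` only needs an exec_command() method shaped like
    paramiko.SSHClient.exec_command, i.e. returning (stdin, stdout, stderr).
    Iterating over stdout avoids holding the whole result in memory.
    """
    _stdin, stdout, _stderr = client.exec_command(command)
    for line in stdout:
        yield line.rstrip("\n")


def gather(host, user, command, handle_line):
    """Open a single connection, stream everything, then close it."""
    with contextlib.closing(open_client(host, user)) as client:
        for line in fetch_in_one_rush(client, command):
            handle_line(line)
```

The remote command would be whatever script does the complete data
gathering on Vasks; only its output crosses the wire, and the connection
is created exactly once.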
 
> 2. What if we populate the 'teammetrics' database on blends.d.n from
> vasks _directly_ without having to go through our script? Then we
> eliminate the need for getting it through Paramiko and running it into
> memory problems. So we have a script on vasks that does this for us.

It might or might not work, but we would need to open an additional port
to the outside world, which I would not be too happy about if there are
other ways (like 1.).
 
> If we finish then we can present teams with some statistics :)

Yep.

Kind regards

       Andreas.

PS: Please note that I'll be basically offline from 25.9. to 6.10.
    If the hotels provide an internet connection, I might read my
    mail in the evening hours, but not very regularly.

-- 
http://fam-tille.de


