[Debian-med-packaging] Bug#776812: usage of -mtune=core2 ? (Was: Bug#776812: vsearch: FTBFS on non-x86: uses non-portable flags)

Tim Booth avarus at fastmail.fm
Mon Feb 2 11:18:15 UTC 2015


Hi All,

Sure, I can do some performance tests.  The tests supplied with the
software are specifically designed to measure performance, so I can run
those with various build flags and see what we get.  Enabling link-time
optimisation might also help here - I've not really played with that
myself yet.
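
A minimal sketch of how such a comparison could be scripted, assuming
a -O3 baseline and the flags mentioned in this thread (plus -flto for
link-time optimisation).  The names flag_sets and best_time are
hypothetical helpers, not part of vsearch or its test suite, and the
command passed to best_time would be whatever invokes the upstream
performance tests:

```python
import itertools
import subprocess
import sys
import time

# Flags discussed in this thread; the "-O3" baseline is an assumption.
BASELINE = ["-O3"]
TUNING = ["-mtune=generic", "-mtune=core2"]
OPTIONAL = ["-funroll-loops", "-ftree-vectorize", "-flto"]


def flag_sets():
    """Yield each flag list: baseline, one -mtune choice, any subset of OPTIONAL."""
    for tune in TUNING:
        for n in range(len(OPTIONAL) + 1):
            for extra in itertools.combinations(OPTIONAL, n):
                yield BASELINE + [tune] + list(extra)


def best_time(cmd, repeats=3):
    """Run cmd `repeats` times; return the fastest wall-clock time in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        # check=True raises if the benchmark command itself fails.
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return min(timings)
```

For each list from flag_sets() one would rebuild the package with those
flags and then call best_time() on the benchmark command.  Taking the
minimum over a few repeats is one common way to reduce noise from other
processes on the machine; it is a design choice here, not something the
upstream tests prescribe.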

The documentation says the software has a "64-bit design" so quite
probably we shouldn't be building this for i386.

Cheers,

TIM

On Mon, Feb 2, 2015, at 10:53 AM, Andreas Tille wrote:
> Hi Gert,
> 
> thanks for your helpful comments.
> 
> On Mon, Feb 02, 2015 at 11:38:20AM +0100, Gert Wollny wrote:
> > Hello, 
> > 
> > On Mon, 2015-02-02 at 07:51 +0100, Andreas Tille wrote:
> > > Hi Mentors,
> > 
> > > It is very important to build vsearch with the maximum optimisation for speed
> > > and thus I wonder whether dropping this option is a good idea or whether
> > > I should enable it on i386 and amd64 (the question extends also to
> > > freebsd-i386/freebsd-amd64 once another issue in freebsd with this
> > > package is solved).
> > 
> > On amd64, SSE/SSE2 is enabled by default. 
> > 
> > Tuning the code for a specific processor (i.e. core2) might not be such
> > a good idea; according to the GCC man page, one should use
> > -mtune=generic instead: 
> > 
> > "generic: 
> > 
> >  Produce code optimized for the most common IA32/AMD64/EM64T processors.
> > If you know the CPU on which your code will run, then you should use the
> > corresponding -mtune or -march option instead of -mtune=generic.  But,
> > if you do not know exactly what CPU users of your application will have,
> > then you should use this option.
> > As new processors are deployed in the marketplace, the behavior of this
> > option will change.  Therefore, if you upgrade to a newer version of
> > GCC, code generation controlled by this option will change to reflect
> > the processors that are most common at the time that version of GCC is
> > released. " 
> 
> Tim, could you clarify with upstream whether they agree that
> -mtune=generic is the option that should be used?  In that case the patch
> I prepared in advance in svn (x86_spezific_opts.patch) should be dropped.
>  
> > In addition, with itksnap I saw that -funroll-loops and -ftree-vectorize
> > improved performance a lot, and these are options that do not depend on
> > the architecture, but are also not enabled by default.
> > 
> > -funroll-loops may also slow down the code, so you should check this. It
> > is especially effective if there are many small loops of fixed size (as
> > is the case with ITK's types that are templated over dimensions). 
> > 
> > -ftree-vectorize may be useless on x86 without SSE but on amd64 it could
> > give some speedups.
> 
> Tim, could you do some performance checks?  I have no idea whether the
> usual upstream test suite is a proper check for this. 
> 
> Kind regards
> 
>       Andreas.
> 
> -- 
> http://fam-tille.de


-- 
Of course I'm a technophobe; I program computers for a living!


