<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<p>Package: Atlas</p>
<p>Version: 3.10.3</p>
<p>Tags: Upgrade</p>
<p>--</p>
<p>Hello,</p>
<p>Since the following announcement has been made, it would be good to
plan for the package upgrade. <br>
</p>
<p>Thanks<br>
</p>
<br>
ATLAS 3.10.3 should be noticeably faster than 3.10.2 on modern
hardware, <br>
but the 3.11 series is almost always much faster on such systems.
While <br>
I was able to backport support for modern architectures, and even <br>
provide some reasonable kernels for modern ISA extensions, the 3.11
<br>
series allows for much larger block factors and improved storage
formats <br>
that are required to get decent performance on many modern machines <br>
(including all AVX-enabled Intel chips). So, if you can use it,
3.11 is <br>
still the best for modern machines by a long way.<br>
<br>
I had hoped to have ATLAS 4.0 out by now, but various setbacks have
<br>
delayed the release, necessitating 3.10.3, since 3.10.2 was not <br>
installing well on modern machines.<br>
<br>
3.10.3 fixes these three bugs:<br>
<a class="moz-txt-link-freetext" href="http://math-atlas.sourceforge.net/errata3.10.2.html#herkNaN">http://math-atlas.sourceforge.net/errata3.10.2.html#herkNaN</a><br>
<a class="moz-txt-link-freetext" href="http://math-atlas.sourceforge.net/errata3.10.2.html#syr2kNaN">http://math-atlas.sourceforge.net/errata3.10.2.html#syr2kNaN</a><br>
<a class="moz-txt-link-freetext" href="http://math-atlas.sourceforge.net/errata3.10.2.html#rotmg">http://math-atlas.sourceforge.net/errata3.10.2.html#rotmg</a><br>
<br>
I have tested 3.10.3 to work on the following OSes:<br>
1. Linux<br>
2. Windows64 (cygwin64 builds now work!)<br>
3. AIX<br>
4. OS X<br>
<br>
For OSes 2-4, see special sections in the install guide for
additional help:<br>
<a class="moz-txt-link-freetext" href="http://math-atlas.sourceforge.net/atlas_install/node53.html">http://math-atlas.sourceforge.net/atlas_install/node53.html</a><br>
Hopefully other OSes (e.g., Windows32, Solaris) still work from
3.10.2 <br>
testing.<br>
<br>
Also note that clang can now be used to build ATLAS by adding:<br>
--force-clang=/path/to/clang<br>
to your configure line. For the open version of clang, performance
<br>
still tends to lag gcc, but is strongly improved from last release.
<br>
Apple's clang appears to be substantially faster, but I may be
mistaken.<br>
<br>
<br>
New architecture support available in 3.10.3 includes:<br>
1. ARM32: a7, a9, a15 (auto-detect of SOFT/HARD ABI)<br>
2. ARM64: xgene1, a53, a57<br>
3. Intel: Corei3 & Corei4 (skylake)<br>
4. IBM: Z series, POWER8 (including little/big endian)<br>
<br>
Support for modern vector extensions in atlas_simd.h:<br>
1. Intel AVX2<br>
2. IBM VSX & Z-series VX<br>
3. ARM64 Advanced SIMD<br>
4. ARM32 NEON (only if -Si ieee 0 flag is thrown)<br>
<br>
Regards,<br>
Clint<br>
<br>
ATLAS 3.10.3 released 07/28/16, highlights of changes from 3.10.2<br>
* Updated F77 L1BLAS testers to those used in LAPACK3.6.1<br>
* Fixed bug in rotmg revealed by LAPACK3.6.1 testers<br>
* Fixed bug in hprk/sprk that could cause NaN propagation in <br>
HERK/SYRK due<br>
to reading uninitialized memory in BETA=0 case<br>
* Fixed bug in threaded SYR2K/HER2K that could cause NaN
propagation due<br>
to reading uninitialized memory<br>
* Extended matrix/vector norm functions to detect NaNs<br>
* Extended configure:<br>
+ --force-clang=/path/to/clang : will use clang for all C
compilers,<br>
even goodgcc (assumes gcc flag & inline-assembly
compatibility)<br>
+ --cripple-atlas-performance: install despite failing
throttle check<br>
+ Can now use arch string rather than enum # for -A arg<br>
+ --force-tids now affects ATLrun.sh as well as threaded build<br>
+ ARM32 autodetects SOFTFP/HARDFP ABI<br>
* backport of config & archdefs for:<br>
+ POWER[7,8]le, IBMz[10,13,19], Corei[3,4], ARM[7,9,15,17],<br>
ARM64[xgene,a53,a57]<br>
+ archdefs for NEON ARMa[7,15]<br>
+ config support for IBM Z[9,196,12]<br>
* backport & extension of atlas_simd.h &
atlas_cplxsimd.h<br>
+ New SIMD kernels for: VSX, VXZ, AVX2, AdvancedSIMD, NEON<br>
* Fixed mflop test of PrintMMLine, that sometimes failed to
print<br>
valid mflop due to negative values from prior runs<br>
* Removed ATL_dmm6x1x60_sse2_32.c from z index files (not valid
cplx <br>
kern)<br>
* Forced MinGW comps to be ignored unless -Si nocygwin 1 is set<br>
* Added support for WOW64 detection & basic use, numerous
changes to <br>
make<br>
work on cygwin64<br>
* Fixed uninit nM in s[1,2]nxtune.c's RecDoubleNX<br>
<br>
-- <br>
**********************************************************************<br>
** R. Clint Whaley, PhD * Assoc Prof, LSU * <a class="moz-txt-link-abbreviated" href="http://www.csc.lsu.edu/~whaley">www.csc.lsu.edu/~whaley</a>
**<br>
**********************************************************************<br>
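<p>A note on the BETA=0 fixes in the changelog: the BLAS convention is
that when BETA is zero, C need not be set on input, so a routine must not
read it. The following is a minimal Python sketch of why that matters. It
is not ATLAS code; the function names <code>naive_update</code> and
<code>guarded_update</code> are hypothetical, and flat lists stand in for
the matrices.</p>
<pre>
```python
import math


def naive_update(alpha, prod, beta, c):
    """Compute alpha*prod + beta*c elementwise, always reading c."""
    # 0.0 * NaN is NaN under IEEE 754, so garbage in c leaks through
    # even when beta is zero.
    return [alpha * p + beta * old for p, old in zip(prod, c)]


def guarded_update(alpha, prod, beta, c):
    """Same update, but treat beta == 0 as 'c is write-only'."""
    if beta == 0.0:
        # c may hold uninitialized data, so it is never read here.
        return [alpha * p for p in prod]
    return [alpha * p + beta * old for p, old in zip(prod, c)]


# NaN stands in for uninitialized memory in the output buffer.
c_uninit = [float("nan"), 1.0]
prod = [2.0, 3.0]

print(naive_update(1.0, prod, 0.0, c_uninit))
print(guarded_update(1.0, prod, 0.0, c_uninit))
```
</pre>
<p>With BETA=0, the naive version returns NaN in the first element while
the guarded version returns only the scaled product, which is the
behaviour callers of SYRK/HERK rely on.</p>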
<div class="moz-signature">-- <br>
<br>
__________________________________________________________________________
<br>
thf - Thierry Fauck - <a class="moz-txt-link-abbreviated" href="mailto:tfauck@free.fr">tfauck@free.fr</a>
<br>
<i> pubkey: 4096R/FCC181CE</i>
<br>
<i> fingerprint: 5CCF 6B82 DE4E E72A A40B B63E A153 BF4F FCC1 81CE</i></div>
</body>
</html>