[Debichem-devel] request for building beta release of GROMACS 2018

Mark Abraham mark.j.abraham at gmail.com
Wed Dec 13 01:23:00 UTC 2017


Hi Andreas,

On Tue, Dec 12, 2017 at 11:36 PM Andreas Tille <andreas at an3as.eu> wrote:

> Hi Mark and Nicolas,
>
> On Tue, Dec 12, 2017 at 04:51:47AM +0000, Mark Abraham wrote:
> > Hi Debichem team, particularly Nicolas Breen,
> >
> > I'm the development manager for GROMACS, which you guys kindly package
> > and have given us some great feedback about. We've just released our
> > second beta tarball for our 2018 release, and we'd love to find out early
> > if there are any problems building or running the code. Is there anything
> > I can do to help get that process started? I note that the system has
> > detected that we made our 2018-beta1 a few weeks back, but 2018-beta2 will
> > be available when it next runs.
>
> As promised I have moved the gromacs packaging from SVN to Git[1].
> Since the above new version seems to be our new target I imported this
> as well and adapted the patches to the new upstream version (Nicolas,
> please double check).
>
> Unfortunately when trying to build I get two test failures:
>
> ...
>
> Test project /build/gromacs-2018~beta2/build/mpich
>       Start  1: TestUtilsUnitTests
>  1/24 Test  #1: TestUtilsUnitTests ...............   Passed    0.03 sec
>       Start  2: TestUtilsMpiUnitTests
>  2/24 Test  #2: TestUtilsMpiUnitTests ............***Failed    0.03 sec
> --------------------------------------------------------------------------
> The value of the MCA parameter "plm_rsh_agent" was set to a path
> that could not be found:
>
>   plm_rsh_agent: ssh : rsh
>
> Please either unset the parameter, or check that the path is correct
> --------------------------------------------------------------------------
>
>       Start  3: MdlibUnitTest
>  3/24 Test  #3: MdlibUnitTest ....................   Passed    0.01 sec
>       Start  4: AppliedForcesUnitTest
>  4/24 Test  #4: AppliedForcesUnitTest ............   Passed    0.01 sec
>       Start  5: ListedForcesTest
>  5/24 Test  #5: ListedForcesTest .................   Passed    0.01 sec
>       Start  6: CommandLineUnitTests
>  6/24 Test  #6: CommandLineUnitTests .............   Passed    0.02 sec
>       Start  7: EwaldUnitTests
>  7/24 Test  #7: EwaldUnitTests ...................   Passed    0.17 sec
>       Start  8: FFTUnitTests
>  8/24 Test  #8: FFTUnitTests .....................   Passed    0.05 sec
>       Start  9: GpuUtilsUnitTests
>  9/24 Test  #9: GpuUtilsUnitTests ................   Passed    0.01 sec
>       Start 10: HardwareUnitTests
> 10/24 Test #10: HardwareUnitTests ................   Passed    0.02 sec
>       Start 11: MathUnitTests
> 11/24 Test #11: MathUnitTests ....................   Passed    0.00 sec
>       Start 12: MdrunUtilityUnitTests
> 12/24 Test #12: MdrunUtilityUnitTests ............   Passed    0.01 sec
>       Start 13: MdrunUtilityMpiUnitTests
> 13/24 Test #13: MdrunUtilityMpiUnitTests .........***Failed    0.01 sec
> --------------------------------------------------------------------------
> The value of the MCA parameter "plm_rsh_agent" was set to a path
> that could not be found:
>
>   plm_rsh_agent: ssh : rsh
>
> Please either unset the parameter, or check that the path is correct
> --------------------------------------------------------------------------
> ...
> 24/24 Test #24: CompatibilityHelpersTests ........   Passed    0.00 sec
>
> 92% tests passed, 2 tests failed out of 24
>
> Label Time Summary:
> GTest       =   0.60 sec (24 tests)
> MpiTest     =   0.04 sec (2 tests)
> UnitTest    =   0.60 sec (24 tests)
>
> Total Test time (real) =   0.62 sec
>
> The following tests FAILED:
>           2 - TestUtilsMpiUnitTests (Failed)
>          13 - MdrunUtilityMpiUnitTests (Failed)
>
>
> For your reference I have committed a copy of the full build log in
> branch logs[2].  I admit the said error somehow rings a bell and I have
> seen this in some other Debian Science package.  It smells like this
> bug[3] and thus I'll add openssh-client to Build-Depends and try
> rebuilding.
>

Indeed, that is a useful step, although it exposes a further issue: when
ctest tries to launch these tests with multiple ranks, only one rank gets
launched.

We register the failing tests in a way that
* gives them the MpiTest label within CTest (so that is a way for you to
disable running them, if you run ctest manually rather than "make test" -
but I haven't yet found the DebiChem build recipe for GROMACS to make a
concrete suggestion), and
* uses the machinery in our cmake/gmxManageMpi.cmake to find an mpirun and
call it with multiple ranks - it's not necessary for the node running the
test to have a network, or even a network interface, but the MPI runtime
(e.g. openmpi-bin or mpich-bin) needs to (be able to) be configured to
start more than one rank
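
For the first point, a minimal sketch of excluding the MPI-labelled tests by
hand. It assumes ctest is invoked directly from the per-flavour CMake build
directory (build/mpich in the log above) rather than through "make test";
the variable name and the fallback echo are illustrative only and are not
taken from the DebiChem recipe.

```shell
#!/bin/sh
# Sketch only: assumes the current directory is a configured CMake build
# directory. CTEST_SKIP_MPI_ARGS is a hypothetical name.

# -LE excludes every test whose CTest label matches the given regex,
# so the tests registered with the MpiTest label are not run.
CTEST_SKIP_MPI_ARGS="-LE MpiTest"

if command -v ctest >/dev/null 2>&1 && [ -f CTestTestfile.cmake ]; then
    # Left unquoted on purpose so the option and its value split into two words.
    ctest $CTEST_SKIP_MPI_ARGS
else
    # Outside a build directory, just show what would be run.
    echo "would run: ctest $CTEST_SKIP_MPI_ARGS"
fi
```

The complementary "ctest -L MpiTest" would run only those tests, which may
be handy when checking whether the MPI runtime can start more than one rank.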

Skipping those tests is quite reasonable - the role of
TestUtilsMpiUnitTests is to serve as a sentinel for exactly these kinds of
infrastructure setup issues, and the role of MdrunUtilityMpiUnitTests is to
test our functionality for handling thread affinity in an MPI context. We'd
be interested to know that the latter passes on interesting hardware, but it
isn't worth much work unless there are others who need the ability to run an
MPI process with multiple ranks as part of their test process.

Digging through that has exposed a bug in our handling of other MPI tests,
which I will fix, so thanks for that. (MdrunMpiTests should be failing in
the same way.)

Minor suggestions:
* setting "export GTEST_COLOR=no" will make the failures a little more
readable because they won't have ANSI color codes in them
* cmake -DGMX_X11=on is reasonable for the non-MPI builds (it is off by
default); turning X11 support off makes good sense for the MPI builds,
particularly with -DGMX_BUILD_MDRUN_ONLY=on; currently the MPI builds turn
it on and then off, which works OK but is potentially a problem
* using ctest -V would help troubleshoot aspects of these MPI tests (if we
continue to test them) by hopefully letting us see the full command line in
the log
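
Put together, the suggestions above might look roughly like this in a build
script. This is a sketch under assumptions: the variable names are made up,
-DGMX_MPI=on does not appear in this message and is assumed to be how the
MPI flavours are enabled, and the commands are only printed rather than
executed.

```shell
#!/bin/sh
# Sketch only; variable names are hypothetical and not taken from the
# DebiChem recipe.

# Plain-text GoogleTest output, so failure logs carry no color escape codes.
export GTEST_COLOR=no

# Non-MPI flavour: X11 support on (it is off by default).
NONMPI_FLAGS="-DGMX_X11=on"

# MPI flavour: set X11 off once, instead of turning it on and then off.
# -DGMX_MPI=on is an assumption; it is not mentioned in the message.
MPI_FLAGS="-DGMX_MPI=on -DGMX_BUILD_MDRUN_ONLY=on -DGMX_X11=off"

# Verbose CTest output, intended to show the full command line of each test.
CTEST_FLAGS="-V"

# Print the commands rather than running them; the source path is a placeholder.
echo "cmake <gromacs-source-dir> $NONMPI_FLAGS"
echo "cmake <gromacs-source-dir> $MPI_FLAGS"
echo "ctest $CTEST_FLAGS"
```

Keeping one flag set per flavour avoids passing -DGMX_X11 twice with
conflicting values for the MPI builds.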

Mark

> Just to let you know about the status of the SVN-Git migration +
> upgrade.
>
> Kind regards
>
>       Andreas.
>
>
> [1] https://anonscm.debian.org/git/debichem/packages/gromacs.git
> [2] https://anonscm.debian.org/cgit/debichem/packages/gromacs.git/tree/gromacs_2018~beta2-1_amd64.build?h=logs&id=05b0694d8a5deaf36460fa021752c58e1a0b1cd8
> [3] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=882603
>
> --
> http://fam-tille.de
>