[Pkg-ofed-devel] lenny - openmpi problems

Yann JOBIC jobic at polytech.univ-mrs.fr
Wed Sep 16 15:47:31 UTC 2009


Guy Coates wrote:
> I am afraid I do not know what else to suggest; I have a couple of 
> test machines running in the same configuration as yours, and 
> everything works correctly.
>
>
> lspci:
> 0c:00.0 InfiniBand: Mellanox Technologies MT25418 [ConnectX IB DDR, 
> PCIe 2.0 2.5GT/s] (rev a0)
>
>
> opensm logfile:
>
> Sep 15 16:56:33 720298 [469406E0] 0x80 -> OpenSM 3.2.6_20090317
> Sep 15 16:56:33 720630 [469406E0] 0x02 -> osm_vendor_init: 1000 
> pending umads specified
> Sep 15 16:56:33 720743 [469406E0] 0x80 -> Entering DISCOVERING state
> Sep 15 16:56:33 720791 [469406E0] 0x02 -> osm_vendor_bind: Binding to 
> port 0x2c9030002faf6
> Sep 15 16:56:33 751935 [469406E0] 0x02 -> osm_vendor_bind: Binding to 
> port 0x2c9030002faf6
> Sep 15 16:56:33 753473 [469BC950] 0x80 -> Entering MASTER state
> Sep 15 16:56:33 761905 [469BC950] 0x80 -> SUBNET UP
>
>
> The ib0,ib1 interfaces all sit on-top of the base ib protocol. If 
> opensm/ibping is not working then none of the other protocols will work.
>
> I would suggest you un-configure your ipoib interfaces and start again 
> from section 4 in the HOWTO.
>
> Sorry I cannot be of more help.
>
> Cheers,
>
> Guy
>
Hello,

I reinstalled both Debian machines from scratch.
I installed everything as the HOWTO says and compiled the modules.
I tried two compilers. I first thought that gcc-4.1 was needed, since
"module-assistant prepare" installs gcc-4.1, so I tried both gcc-4.3
and gcc-4.1. Neither compilation gave any errors.

Things go wrong when I launch opensm.
I still get the error message that I sent yesterday:
Sep 16 11:34:55 Lidia kernel: [  173.877545] ib_query_port failed (-16) for mlx4_0
Sep 16 11:35:05 Lidia kernel: [  185.697568] Couldn't query port
Sep 16 11:35:15 Lidia kernel: [  197.549691] ib_query_port failed (-16) for mlx4_0
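
The -16 in these messages is a negated errno value; on Linux, errno 16
is EBUSY. A minimal Python sketch for decoding such codes from kernel log
text follows. The regular expression and the helper name
decode_kernel_errors are hypothetical, chosen only to fit the lines
above.

```python
import errno
import os
import re

# Matches kernel messages of the form "<function> failed (-<n>)".
# This pattern is an assumption based on the log lines quoted above.
FAILED_RE = re.compile(r"(\w+) failed \((-?\d+)\)")


def decode_kernel_errors(log_text):
    """Return (function, errno_name, description) for each failure found."""
    results = []
    for func, code in FAILED_RE.findall(log_text):
        num = abs(int(code))  # the kernel reports errors as negative errno values
        name = errno.errorcode.get(num, "UNKNOWN")
        results.append((func, name, os.strerror(num)))
    return results


if __name__ == "__main__":
    sample = ("Sep 16 11:34:55 Lidia kernel: [  173.877545] "
              "ib_query_port failed (-16) for mlx4_0")
    for func, name, text in decode_kernel_errors(sample):
        print(f"{func}: {name} ({text})")
```

Lines without a numeric code, such as "Couldn't query port", are
skipped.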

What firmware do you have on the MT25418 card?
Mine is 2.5.100, as reported by ibstat.

The strange thing is that ibping works with the binary version shipped
with OFED, but when I compile it myself, nothing works.
Is it possible that opensm and the modules are incompatible because of
some compilation parameters?
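
One narrow case of such a mismatch can be checked by comparing each
module's vermagic string with the running kernel release. A rough Python
sketch follows, assuming modinfo is installed and that mlx4_core, mlx4_ib
and ib_core are the modules of interest. The helper names are
hypothetical, and the check looks only at the kernel release token, not
at other build options.

```python
import platform
import subprocess


def vermagic_matches(vermagic, kernel_release):
    """True if the first token of a vermagic string equals the kernel release."""
    tokens = vermagic.split()
    return bool(tokens) and tokens[0] == kernel_release


def module_vermagic(module):
    """Ask modinfo for the vermagic field of an installed module."""
    out = subprocess.run(["modinfo", "-F", "vermagic", module],
                         capture_output=True, text=True, check=True)
    return out.stdout.strip()


if __name__ == "__main__":
    release = platform.release()  # same value as `uname -r`
    for module in ("mlx4_core", "mlx4_ib", "ib_core"):
        try:
            magic = module_vermagic(module)
        except (OSError, subprocess.CalledProcessError) as exc:
            print(f"{module}: could not query ({exc})")
            continue
        status = "ok" if vermagic_matches(magic, release) else "MISMATCH"
        print(f"{module}: {magic} -> {status}")
```

A MISMATCH here would suggest the module was built against a different
kernel than the one running; an "ok" does not rule out other causes.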

Thanks,

Cheers

Yann



