[Pkg-ofed-devel] Support for these OFED packages, libipathverbs1

Nasser Mohieddin Abukhdeir nasser.abukhdeir at mcgill.ca
Sat May 30 13:53:36 UTC 2009


Hello Guy:
    I've tried this and it works, thanks for that. Unfortunately I am 
still "in the woods": I get an error when trying to run an MPI job (the 
output is included below). I have the OFED packages installed (enough to 
get OpenSM installed) on our head node with the default configuration 
files untouched. When I check whether other hosts were detected 
(ibhosts), I get this error:

ibwarn: [5309] mad_rpc: _do_madrpc failed; dport (DR path slid 0; dlid 0; 0,1,3)
ibwarn: [5309] handle_port: NodeInfo on DR path slid 0; dlid 0; 0,1,3 failed, skipping port
ibwarn: [5309] mad_rpc: _do_madrpc failed; dport (DR path slid 0; dlid 0; 0,1,5)
ibwarn: [5309] handle_port: NodeInfo on DR path slid 0; dlid 0; 0,1,5 failed, skipping port
Ca    : 0x0011750000ff80c1 ports 1 "QLogic kernel.org driver rey2"

and here is the output when I actually try to run a job:

--BEGIN--
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
[node3][0,1,1][btl_openib_component.c:467:init_one_hca] error obtaining device context for ipath0 errno says Permission denied

[node3][0,1,0][btl_openib_component.c:467:init_one_hca] 
--------------------------------------------------------------------------
WARNING: There were errors during IB HCA initialization on host 'node3'.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
WARNING: There is at least on IB HCA found on host 'node3', but there is
no active ports detected. This is most certainly not what you wanted.
Check your cables and SM configuration.
--------------------------------------------------------------------------
error obtaining device context for ipath0 errno says Permission denied

--------------------------------------------------------------------------
WARNING: There were errors during IB HCA initialization on host 'node3'.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
WARNING: There is at least on IB HCA found on host 'node3', but there is
no active ports detected. This is most certainly not what you wanted.
Check your cables and SM configuration.
--------------------------------------------------------------------------

--END--
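
The output above shows three separate symptoms: the RLIMIT_MEMLOCK warning, "Permission denied" when opening ipath0, and no active ports. A minimal triage sketch for checking each one on a compute node might look like the following. The helper names (memlock_report, port_is_active) are made up for illustration, and the device paths are assumptions about where the verbs and ipath nodes live on this system; `ulimit -l` reports the limit in KiB, so 32 corresponds to the 32768 bytes in the warning.

```shell
#!/bin/sh
# Triage sketch for the three symptoms in the log above.
# Helper names and device paths are assumptions; adjust for your system.

# Turn a `ulimit -l` value (KiB, or "unlimited") into a readable report.
memlock_report() {
    case "$1" in
        unlimited)    echo "unlimited" ;;
        ''|*[!0-9]*)  echo "unknown" ;;
        *)            echo "limited: $(( $1 * 1024 )) bytes" ;;
    esac
}

# Succeed only for the sysfs port state line "4: ACTIVE".
port_is_active() {
    case "$1" in
        "4: ACTIVE") return 0 ;;
        *)           return 1 ;;
    esac
}

echo "memlock: $(memlock_report "$(ulimit -l 2>/dev/null || echo unknown)")"

# Device nodes the user running MPI must be able to open (assumed paths).
ls -l /dev/infiniband/uverbs* /dev/ipath* 2>/dev/null \
    || echo "(some expected device nodes were not found)"

# Port state as exported by the kernel, one file per HCA port.
for f in /sys/class/infiniband/*/ports/*/state; do
    [ -r "$f" ] || continue
    state=$(cat "$f")
    if port_is_active "$state"; then
        echo "$f: active ($state)"
    else
        echo "$f: NOT active ($state)"
    fi
done
```

Run it as the same user, and through the same login path, that launches the MPI processes, since the limit seen over a remote shell can differ from an interactive one. The locked-memory limit is commonly raised through a memlock entry in /etc/security/limits.conf, but whether that applies here depends on how the jobs are started.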

Guy Coates wrote:
> Hi Nasser,
>
> this is the right forum for discussing the particulars of the Debian
> OFED packages.
>
>
> There is a bug in the ofa-kernel-source package, which was causing the
> ib_ipath module not to be rebuilt, so you have ended up with a mixture
> of old and new kernel modules.
>
> I have now fixed the bug; if you download the new ofa-kernel-source
> package (ofa-kernel-source_1.4-3_all.deb)
>
> and rebuild the kernel modules:
>
>  module-assistant prepare
>  module-assistant clean ofa-kernel
>  module-assistant build ofa-kernel
>
> You will have to rmmod the existing infiniband modules or reboot the
> machine before you can use the new modules.
>
> You can check that the right one is being used by running
> modinfo ib_ipath.
>
> You should pick up the modules from:
>
> /lib/modules/2.6.26-2-amd64/updates
>
> rather than:
>
> /lib/modules/2.6.26-2-amd64/kernel/
>
> Cheers,
>
> Guy
>
>   
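
The modinfo check described in the quoted message can be sketched as a small script. It assumes `modinfo -F filename` is available to print only the module's file path; the helper name module_origin is made up for illustration. The `updates` pattern is tested first so that a path containing both `updates` and `kernel` components is still classified as the rebuilt module.

```shell
#!/bin/sh
# Sketch: report whether ib_ipath resolves to the rebuilt module under
# updates/ or to the stock module under kernel/.
# module_origin is a hypothetical helper, not part of any package.

module_origin() {
    case "$1" in
        */updates/*) echo "updates" ;;   # rebuilt by module-assistant
        */kernel/*)  echo "kernel" ;;    # shipped with the kernel image
        *)           echo "other" ;;
    esac
}

if command -v modinfo >/dev/null 2>&1; then
    path=$(modinfo -F filename ib_ipath 2>/dev/null)
    if [ -n "$path" ]; then
        echo "ib_ipath: $path ($(module_origin "$path"))"
    else
        echo "ib_ipath: module not found by modinfo"
    fi
else
    echo "modinfo is not installed"
fi
```

A result of "kernel" would suggest the old module is still the one being picked up. Note that modinfo shows the file that would be loaded, not necessarily the module currently in memory, which is why the rmmod or reboot step still matters.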


