Infiniband Quickstart Howto
Guy Coates
____________________________________________________________
Table of Contents
1. Introduction
1.1 What is OFED?
2. Installing the OFED Userspace Software
2.1 Installing prebuilt packages
2.2 Building packages from source
2.2.1 Install the prerequisite development packages
2.2.2 Checkout the svn tree
2.2.3 Install the upstream source (optional)
2.2.4 Build the packages.
3. OFED kernel modules
3.1 Building OFED kernel modules
4. Setting up a basic infiniband network
4.1 Upgrade your infiniband card and switch firmware
4.2 Choose a Subnet Manager
4.3 Load the kernel modules
4.4 (optional) Start opensm
4.5 Check network health
4.6 Check the extended network connectivity
4.7 Testing connectivity with ibping
4.8 Testing RDMA performance
5. IP over infiniband (IPoIB)
5.1 List the network devices
5.2 IP Configuration
5.3 Connected vs Unconnected Mode
5.4 TCP tuning
6. MPI
7. SDP
7.1 Configuration
7.2 Example Using SDP with Netpipe
8. SRP
8.1 Configuration
8.2 SRP daemon configuration
8.2.1 Determine the IDs of presented devices
8.2.2 Configure srp_daemon to connect to the devices
8.3 Multipathing, LVM and formatting.
9. Building Lustre against OFED
9.1 Check Compatibility
9.2 Install the OFED packages
9.3 Install a set of kernel modules
9.4 Configure lustre
10. Network Troubleshooting
11. Further Information
______________________________________________________________________
1. Introduction
This document describes how to install and configure the OFED
infiniband software on Debian. It is intended to be a quickstart
guide to configuring basic infiniband functionality. It is not a
replacement for the detailed documentation provided in the ofed-docs
package!
1.1. What is OFED?
OFED (OpenFabrics Enterprise Distribution) is the de facto infiniband
software stack on Linux. OFED provides a consistent set of kernel
modules and userspace libraries which have been tested together.
Further details of the OpenFabrics Alliance and OFED can be found at
http://www.openfabrics.org
2. Installing the OFED Userspace Software
Before you can use your infiniband network you will need to install
the OFED software on your infiniband clients. You can choose to use
the prebuilt packages on alioth, or build your own packages straight
from the alioth SVN repository.
2.1. Installing prebuilt packages
Download and install the packages at
https://alioth.debian.org/frs/?group_id=100311
Packages are grouped by OFED release. Unless you know what you are
doing, you should install all of the packages. Note that some OFED 1.4
packages are already in Debian Lenny. You can install them from your
usual repository.
2.2. Building packages from source
If you wish to build the OFED packages from the alioth svn repository,
use the following procedure.
2.2.1. Install the prerequisite development packages
aptitude install svn-buildpackage build-essential devscripts
2.2.2. Checkout the svn tree
svn co svn://svn.debian.org/pkg-ofed/
2.2.3. Install the upstream source (optional)
The upstream source tarballs need to be available if you want to build
proper Debian packages suitable for inclusion upstream. If you are
simply building packages for your own use, you can ignore this step.
cd pkg-ofed
mkdir tarballs
Populate the tarballs directory with the *.orig.tar.gz files available
from the "upstream source" release on
https://alioth.debian.org/frs/?group_id=100311
2.2.4. Build the packages.
cd into the package you wish to build, e.g. for libibcommon:
cd pkg-ofed/libibcommon
Link in the upstream tarballs directory (optional)
ln -s -f ../tarballs .
Run svn-buildpackage from within the trunk directory.
cd pkg-ofed/libibcommon/trunk
svn-buildpackage -uc -us -rfakeroot
The build process will generate a deb in the build-area directory.
Repeat the process for the rest of the packages. Note that some
packages have build dependencies on other OFED packages. The suggested
build order is:
libibcm
libibcommon
libibumad
libibmad
libnes
libsdp
dapl
opensm
infiniband-diags
ibutils
mstflint
perftest
qlvnictools
qperf
rds-tools
sdpnetstat
srptools
tvflash
ibsim
ofed-docs
ofa_kernel
ofed
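The per-package steps above can be wrapped in a small loop. The following is only a sketch: the function names build_pkg and build_all and the PKG_ROOT and DRY_RUN variables are made up for illustration, and it assumes every package keeps its packaging in a trunk directory as libibcommon does. By default it only prints the commands it would run. It does not install the resulting debs between builds, which you may need to do to satisfy build dependencies.

```shell
#!/bin/sh
# Sketch: build OFED packages in the order given on the command line.
# PKG_ROOT and DRY_RUN are illustrative names, not part of any OFED tool.
PKG_ROOT=${PKG_ROOT:-pkg-ofed}   # location of the svn checkout
DRY_RUN=${DRY_RUN:-1}            # 1 = only print the commands

build_pkg() {
    # $1 = package name, e.g. libibcommon
    if [ "$DRY_RUN" = 1 ]; then
        echo "cd $PKG_ROOT/$1/trunk && svn-buildpackage -uc -us -rfakeroot"
    else
        # Run in a subshell so the caller's working directory is unchanged.
        (cd "$PKG_ROOT/$1/trunk" && svn-buildpackage -uc -us -rfakeroot)
    fi
}

build_all() {
    # Stop at the first failure so a broken dependency is noticed early.
    for pkg in "$@"; do
        build_pkg "$pkg" || { echo "build failed: $pkg" >&2; return 1; }
    done
}
```

For example, "build_all libibcm libibcommon libibumad libibmad" prints the build commands for the first four packages; set DRY_RUN=0 to actually run them.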
3. OFED kernel modules
You will also require a set of OFED kernel modules which match the
version of the OFED userspace software you have installed. OFED
kernel modules are periodically merged into the mainline kernel, and
so the default Debian kernel will already contain a set of OFED
infiniband drivers. However, the drivers in the Debian kernel may not
match the OFED userspace version you have installed. Consult the table
below to determine what OFED version the Debian kernel contains.
Debian Kernel Version    OFED Version
<=2.6.26                 1.3
>=2.6.27                 1.4
If the Debian kernel modules do not match the OFED version you have
installed you can build a new set of modules using the
ofa-kernel-source package. If your kernel already includes the correct
OFED kernel modules you can skip the rest of this section. If you are
in doubt, you should build a new set of modules rather than relying on
the modules shipped with the kernel.
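The kernel to OFED mapping in the table can be expressed as a small helper. This is a sketch only: the function name ofed_for_kernel is invented here, and it encodes nothing beyond the two rows of the table (2.6.26 and older gives 1.3, anything newer gives 1.4).

```shell
#!/bin/sh
# Sketch: report which OFED version a Debian kernel ships, per the table.
ofed_for_kernel() {
    # $1 = kernel version such as 2.6.26-2-amd64 (default: running kernel)
    ver=${1:-$(uname -r)}
    ver=${ver%%-*}                      # drop a suffix such as -2-amd64
    old_ifs=$IFS; IFS=.
    set -- $ver                         # split the version on dots
    IFS=$old_ifs
    major=${1:-0}; minor=${2:-0}
    patch=${3:-0}; patch=${patch%%[!0-9]*}; patch=${patch:-0}
    # Pack the first three components into one integer for comparison.
    n=$(( major * 1000000 + minor * 1000 + patch ))
    if [ "$n" -le 2006026 ]; then echo 1.3; else echo 1.4; fi
}
```

Running "ofed_for_kernel" with no argument checks the running kernel; compare the result with the OFED release of the userspace packages you installed.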
3.1. Building OFED kernel modules
You can build new kernel modules using module-assistant.
aptitude install module-assistant
Ensure you have the ofa-kernel-source package installed, and then run:
module-assistant prepare
module-assistant clean ofa-kernel
module-assistant build ofa-kernel
This will create a deb containing the OFED kernel modules. The deb
contains replacements for existing kernel modules and so you will need
to either manually remove any infiniband modules which have already
been loaded, or reboot the machine, before you can use the new
modules.
The new kernel modules will be installed into /usr/lib//updates. They
will not overwrite the original kernel modules, but the kernel module
loader will pick up the modules from the updates directory in
preference. You can verify that the system is using the new kernel
modules by running the modinfo command.
# modinfo ib_core
filename: /lib/modules/2.6.22.19/updates/kernel/drivers/infiniband/core/ib_core.ko
author: Roland Dreier
description: core kernel InfiniBand API
license: Dual BSD/GPL
vermagic: 2.6.22.19 SMP mod_unload
Note that if you wish to rebuild the kernel modules (eg for a new
kernel version) then you must issue the module-assistant clean command
before trying a new build.
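The modinfo check can be scripted for several modules at once. The helper below is a sketch (the name from_updates_dir is made up); it only tests whether a module path, as printed by "modinfo -n", lies inside an updates directory.

```shell
#!/bin/sh
# Sketch: succeed if a module file path lies under an updates/ directory.
from_updates_dir() {
    # $1 = path to a .ko file, as printed by "modinfo -n <module>"
    case $1 in
        */updates/*) return 0 ;;
        *)           return 1 ;;
    esac
}
```

A possible use is a loop such as: for m in ib_core ib_ipoib; do from_updates_dir "$(modinfo -n $m)" || echo "$m: not from updates"; done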
4. Setting up a basic infiniband network
This section describes how to set up a basic infiniband network and
test its functionality.
4.1. Upgrade your infiniband card and switch firmware
Before proceeding you should ensure that the firmware in your switches
and infiniband cards is at the latest release. Older firmware
versions may cause interoperability and fabric stability issues. Do not
assume that just because your hardware has come fresh from the factory
that it has the latest firmware on it.
You should follow the documentation from your vendor as to how the
firmware should be updated.
4.2. Choose a Subnet Manager
Each infiniband network requires a subnet manager. You can choose to
run the OFED opensm subnet manager on one of the linux clients, or you
may choose to use an embedded subnet manager running on one of the
switches in your fabric. Note that not all switches come with a subnet
manager; check your switch documentation.
4.3. Load the kernel modules
Infiniband kernel modules are not loaded automatically. You should
add them to /etc/modules so that they are automatically loaded on
machine bootup. You will need to include the hardware specific modules
and the protocol modules.
/etc/modules:
# Hardware drivers
# Choose the appropriate modules from
# /lib/modules//updates/kernel/drivers/infiniband/hw
#
#mlx4_ib # Mellanox ConnectX cards
#ib_mthca # some mellanox cards
#iw_cxgb3 # Chelsio T3 cards
#iw_nes # NetEffect cards
#
# Protocol modules
# Common modules
ib_umad
ib_uverbs
# IP over IB
ib_ipoib
# scsi over IB
ib_srp
# IB SDP protocol
ib_sdp
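If you manage /etc/modules from a script, it helps to add each module only once. The following sketch assumes nothing beyond a plain text file with one module per line; the function name ensure_module_line is invented for this example.

```shell
#!/bin/sh
# Sketch: append a module name to a modules file unless it is already listed.
ensure_module_line() {
    # $1 = module name, $2 = modules file (default: /etc/modules)
    file=${2:-/etc/modules}
    # -x matches whole lines only, -F treats the name as a fixed string.
    grep -qxF "$1" "$file" 2>/dev/null || echo "$1" >> "$file"
}
```

For example, "ensure_module_line ib_ipoib" run as root adds ib_ipoib to /etc/modules the first time and does nothing on later runs.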
4.4. (optional) Start opensm
If you choose to use the opensm subnet manager, edit
/etc/default/opensm and add the port GUIDs of the interfaces on which
you wish to start opensm.
You can find the port GUIDs of your cards with the ibstat -p command:
# ibstat -p
0x0002c9030002fb05
0x0002c9030002fb06
/etc/default/opensm:
PORTS="0x0002c9030002fb05 0x0002c9030002fb06"
Note if you want to start opensm on all ports you can use the
PORTS="ALL" keyword.
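The PORTS line can be generated from the ibstat output rather than typed by hand. This is a sketch; the function name ports_line is made up, and it simply joins whatever GUIDs it reads on standard input.

```shell
#!/bin/sh
# Sketch: turn "ibstat -p" output into a PORTS= line for /etc/default/opensm.
ports_line() {
    # Reads one port GUID per line on stdin.
    guids=$(tr '\n' ' ' | sed 's/ *$//')   # join lines, trim trailing space
    printf 'PORTS="%s"\n' "$guids"
}
```

Running "ibstat -p | ports_line" on the host shown above would print the same PORTS line as in the example; review it before copying it into /etc/default/opensm.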
Start opensm:
#/etc/init.d/opensm start
If opensm has started correctly you should see SUBNET UP messages in
the opensm logfile (/var/log/opensm..log).
Mar 04 14:56:06 600685 [4580A960] 0x02 -> SUBNET UP
Note that you can start opensm on multiple nodes; one node will be the
active subnet manager and the others will put themselves into standby
and take over if the original subnet manager dies.
4.5. Check network health
You can now check the status of the local infiniband link with the
ibstat command. Connected links should be in the "LinkUp" state. The
following output is from a host with a dual ported card, only one of
which (port1) is connected.
# ibstat
CA 'mlx4_0'
CA type: MT25418
Number of ports: 2
Firmware version: 2.3.0
Hardware version: a0
Node GUID: 0x0002c9030002fb04
System image GUID: 0x0002c9030002fb07
Port 1:
State: Active
Physical state: LinkUp
Rate: 20
Base lid: 2
LMC: 0
SM lid: 1
Capability mask: 0x02510868
Port GUID: 0x0002c9030002fb05
Port 2:
State: Down
Physical state: Polling
Rate: 10
Base lid: 0
LMC: 0
SM lid: 0
Capability mask: 0x02510868
Port GUID: 0x0002c9030002fb06
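On hosts with several ports it can be easier to read a one-line summary per port. The sketch below parses ibstat output in the layout shown above; the function name port_states is invented, and the parsing assumes the field labels appear exactly as in this example.

```shell
#!/bin/sh
# Sketch: summarise ibstat output as "<CA> <port> <State> <Physical state>".
port_states() {
    # Reads ibstat output on stdin.
    awk -v q="'" '
        # CA header line, e.g.  CA <quote>mlx4_0<quote>
        $1 == "CA" && index($2, q) == 1 { ca = $2; gsub(q, "", ca) }
        # Port header line, e.g.  Port 1:
        $1 == "Port" && $2 ~ /^[0-9]+:$/ { port = $2; sub(/:/, "", port) }
        $1 == "State:" { state = $2 }
        $1 == "Physical" && $2 == "state:" { print ca, port, state, $3 }
    '
}
```

With the output above, "ibstat | port_states" would report port 1 as Active LinkUp and port 2 as Down Polling.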
4.6. Check the extended network connectivity
Once the host is connected to the infiniband network you can check the
health of all of the other network components with the ibhosts,
ibswitches and iblinkinfo commands.
ibhosts displays all of the hosts visible on the network.
# ibhosts
Ca : 0x0008f1040399d3d0 ports 2 "Voltaire HCA400Ex-D"
Ca : 0x0008f1040399d370 ports 2 "Voltaire HCA400Ex-D"
Ca : 0x0008f1040399d3fc ports 2 "Voltaire HCA400Ex-D"
Ca : 0x0008f1040399d3f4 ports 2 "Voltaire HCA400Ex-D"
Ca : 0x0002c9030002faf4 ports 2 "MT25408 ConnectX Mellanox Technologies"
Ca : 0x0002c9030002fc0c ports 2 "MT25408 ConnectX Mellanox Technologies"
Ca : 0x0002c9030002fc10 ports 2 "MT25408 ConnectX Mellanox Technologies"
ibswitches will display all of the switches in the network.
# ibswitches
Switch : 0x0008f104004121fa ports 24 "ISR9024D-M Voltaire" enhanced port 0 lid 1 lmc 0
iblinkinfo will show the status and speed of all of the links in the
network.
#iblinkinfo.pl
Switch 0x0008f104004121fa ISR9024D-M Voltaire:
1 1[ ] ==( 4X 5.0 Gbps Active / LinkUp)==> 2 1[ ] "MT25408 ConnectX Mellanox Technologies" ( )
1 2[ ] ==( 4X 5.0 Gbps Active / LinkUp)==> 13 1[ ] "MT25408 ConnectX Mellanox Technologies" ( )
1 3[ ] ==( 4X 5.0 Gbps Active / LinkUp)==> 4 1[ ] "MT25408 ConnectX Mellanox Technologies" ( )
1 4[ ] ==( 4X 5.0 Gbps Active / LinkUp)==> 26 1[ ] "MT25408 ConnectX Mellanox Technologies" ( )
1 5[ ] ==( 4X 5.0 Gbps Active / LinkUp)==> 27 1[ ] "MT25408 ConnectX Mellanox Technologies" ( )
1 6[ ] ==( 4X 5.0 Gbps Active / LinkUp)==> 24 1[ ] "MT25408 ConnectX Mellanox Technologies" ( )
1 7[ ] ==( 4X 5.0 Gbps Active / LinkUp)==> 28 1[ ] "MT25408 ConnectX Mellanox Technologies" ( )
1 8[ ] ==( 4X 5.0 Gbps Active / LinkUp)==> 25 1[ ] "MT25408 ConnectX Mellanox Technologies" ( )
1 9[ ] ==( 4X 5.0 Gbps Active / LinkUp)==> 31 1[ ] "MT25408 ConnectX Mellanox Technologies" ( )
1 10[ ] ==( 4X 5.0 Gbps Active / LinkUp)==> 32 1[ ] "MT25408 ConnectX Mellanox Technologies" ( )
1 11[ ] ==( 4X 5.0 Gbps Active / LinkUp)==> 33 1[ ] "MT25408 ConnectX Mellanox Technologies" ( )
1 12[ ] ==( 4X 5.0 Gbps Active / LinkUp)==> 29 1[ ] "MT25408 ConnectX Mellanox Technologies" ( )
1 13[ ] ==( 4X 5.0 Gbps Active / LinkUp)==> 30 1[ ] "MT25408 ConnectX Mellanox Technologies" ( )
14[ ] ==( 4X 2.5 Gbps Down / Polling)==> [ ] "" ( )
1 15[ ] ==( 4X 5.0 Gbps Active / LinkUp)==> 3 1[ ] "Voltaire HCA400Ex-D" ( )
1 16[ ] ==( 4X 5.0 Gbps Active / LinkUp)==> 10 1[ ] "Voltaire HCA400Ex-D" ( )
17[ ] ==( 4X 2.5 Gbps Down / Polling)==> [ ] "" ( )
18[ ] ==( 4X 2.5 Gbps Down / Polling)==> [ ] "" ( )
1 19[ ] ==( 4X 5.0 Gbps Active / LinkUp)==> 7 2[ ] "Voltaire HCA400Ex-D" ( )
1 20[ ] ==( 4X 5.0 Gbps Active / LinkUp)==> 6 2[ ] "Voltaire HCA400Ex-D" ( )
1 21[ ] ==( 4X 5.0 Gbps Active / LinkUp)==> 5 2[ ] "Voltaire HCA400Ex-D" ( )
1 22[ ] ==( 4X 5.0 Gbps Active / LinkUp)==> 21 1[ ] "Voltaire HCA400Ex-D" ( )
1 23[ ] ==( 4X 5.0 Gbps Active / LinkUp)==> 9 2[ ] "Voltaire HCA400Ex-D" ( )
1 24[ ] ==( 4X 5.0 Gbps Active / LinkUp)==> 8 1[ ] "Voltaire HCA400Ex-D" ( )
4.7. Testing connectivity with ibping
ibping is an infiniband equivalent to the icmp ping command. Choose a
node on the fabric and run an ibping server:
#ibping -S
Choose another node on your network, and then ping the port GUID of
the server. (ibstat on the server will list the port GUID).
#ibping -G 0x0002c9030002fc1d
Pong from test.example.com (Lid 13): time 0.072 ms
Pong from test.example.com (Lid 13): time 0.043 ms
Pong from test.example.com (Lid 13): time 0.045 ms
Pong from test.example.com (Lid 13): time 0.045 ms
4.8. Testing RDMA performance
You can test the latency and bandwidth of a link with the ib_rdma_lat
and ib_rdma_bw commands.
To test the latency, start the server on a node:
#ib_rdma_lat
and then start a client on another node, giving it the hostname of the
server.
#ib_rdma_lat hostname-of-server
local address: LID 0x0d QPN 0x18004a PSN 0xca58c4 RKey 0xda002824 VAddr 0x00000000509001
remote address: LID 0x02 QPN 0x7c004a PSN 0x4b4eba RKey 0x82002466 VAddr 0x00000000509001
Latency typical: 1.15193 usec
Latency best : 1.13094 usec
Latency worst : 5.48519 usec
You can test the bandwidth of the link using the ib_rdma_bw command.
Start the server on a node:
#ib_rdma_bw
and then start a client on another node, giving it the hostname of the
server.
#ib_rdma_bw hostname-of-server
855: | port=18515 | ib_port=1 | size=65536 | tx_depth=100 | iters=1000 | duplex=0 | cma=0 |
855: Local address: LID 0x0d, QPN 0x1c004a, PSN 0xbf60dd RKey 0xde002824 VAddr 0x002aea4092b000
855: Remote address: LID 0x02, QPN 0x004a, PSN 0xaad03c, RKey 0x86002466 VAddr 0x002b8a4e191000
855: Bandwidth peak (#0 to #955): 1486.85 MB/sec
855: Bandwidth average: 1486.47 MB/sec
855: Service Demand peak (#0 to #955): 1970 cycles/KB
855: Service Demand Avg : 1971 cycles/KB
The perftest package contains a number of other similar benchmarking
programs to test various aspects of your network.
5. IP over infiniband (IPoIB)
The OFED stack allows you to run TCP/IP over your infiniband network.
This is useful for running non-infiniband aware applications across
your network. Several native infiniband applications also use IPoIB
for host resolution (eg Lustre and SDP).
5.1. List the network devices
Check that the IPoIB module is loaded.
#modprobe ib_ipoib
You will now have an "ib" network interface for each of your
infiniband card ports.
#ifconfig -a
ib0 Link encap:UNSPEC HWaddr 80-06-00-48-FE-80-00-00-00-00-00-00-00-00-00-00
BROADCAST MULTICAST MTU:2044 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:256
RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
ib1 Link encap:UNSPEC HWaddr 80-06-00-49-FE-80-00-00-00-00-00-00-00-00-00-00
BROADCAST MULTICAST MTU:2044 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:256
RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
5.2. IP Configuration
You can now configure the ib network devices using
/etc/network/interfaces.
auto ib0
iface ib0 inet static
address 172.31.128.50
netmask 255.255.240.0
broadcast 172.31.143.255
Bring the network device up, as normal.
ifup ib0
You can test IPoIB functionality using the usual IP networking tools
(eg ping, netperf etc).
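The broadcast address in the stanza above follows from the address and netmask: each octet is the address octet ORed with the inverted netmask octet. The sketch below does that arithmetic; the function name broadcast_addr is made up for illustration.

```shell
#!/bin/sh
# Sketch: compute the IPv4 broadcast address from an address and netmask.
broadcast_addr() {
    # $1 = address, $2 = netmask, both in dotted-quad form
    old_ifs=$IFS; IFS=.
    set -- $1 $2          # now $1..$4 = address octets, $5..$8 = mask octets
    IFS=$old_ifs
    # broadcast octet = address octet OR (255 - mask octet)
    echo "$(( $1 | (255 - $5) )).$(( $2 | (255 - $6) )).$(( $3 | (255 - $7) )).$(( $4 | (255 - $8) ))"
}
```

"broadcast_addr 172.31.128.50 255.255.240.0" prints 172.31.143.255, matching the broadcast line in the example configuration.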
5.3. Connected vs Unconnected Mode
IPoIB can run over two infiniband transports, Unreliable Datagram (UD)
mode or Connected mode (CM). The differences between these two modes
are described in:
RFC4392 - IP over InfiniBand (IPoIB) Architecture
RFC4391 - Transmission of IP over InfiniBand (IPoIB) (UD mode)
RFC4755 - IP over InfiniBand: Connected Mode
ADDME: Pro/cons of these two methods?
You can switch between these two modes at runtime with:
echo datagram > /sys/class/net/ibX/mode
echo connected > /sys/class/net/ibX/mode
The default is datagram (UD) mode. If you wish to use CM then you can
add a script to /etc/network/if-up.d to automatically set CM mode on
your interfaces when they are configured.
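A hook script for this might look like the sketch below. It assumes the Debian ifupdown behaviour of exporting the interface name in the IFACE variable when hook scripts run; the function name set_ipoib_mode and the optional sysfs root argument (there only to make the function easy to try out) are made up for this example.

```shell
#!/bin/sh
# Sketch of a hook script: switch IPoIB interfaces to connected mode.
set_ipoib_mode() {
    # $1 = interface, $2 = mode (datagram or connected),
    # $3 = sysfs root (default /sys; overridable for testing)
    iface=$1; mode=$2; sysfs=${3:-/sys}
    case $iface in
        ib[0-9]*) ;;          # looks like an IPoIB interface
        *) return 0 ;;        # anything else: nothing to do
    esac
    mode_file=$sysfs/class/net/$iface/mode
    [ -w "$mode_file" ] || return 0
    echo "$mode" > "$mode_file"
}

# ifupdown sets IFACE when it runs hook scripts; do nothing otherwise.
if [ -n "${IFACE:-}" ]; then
    set_ipoib_mode "$IFACE" connected
fi
```

Remember to make the script executable. You can confirm the result with "cat /sys/class/net/ib0/mode" after bringing the interface up.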
5.4. TCP tuning
In order to obtain maximum IPoIB throughput you may need to tweak the
MTU and various kernel TCP buffer and window settings. See the
details in the ipoib_release_notes.txt document in the ofed-docs
package.
6. MPI
ADDME: How to run a test MPI application
7. SDP
Sockets Direct Protocol (SDP) is an infiniband network protocol which
provides an RDMA accelerated alternative to TCP. OFED provides an
LD_PRELOADable library (libsdp.so), which will allow code which has
been written to use TCP to use the more efficient SDP protocol
instead. The use of an LD_PRELOADable library means that the program
does not need to be recompiled.
7.1. Configuration
SDP uses IPoIB for address resolution, so you must configure IPoIB
before using SDP.
You should also ensure the ib_sdp kernel module is loaded.
modprobe ib_sdp
You can use libsdp in two ways; you can either manually LD_PRELOAD the
library whilst invoking your application, or create a config file
which specifies which applications will use SDP.
To manually LD_PRELOAD a library, simply set the LD_PRELOAD variable
before invoking your application.
LD_PRELOAD=libsdp.so ./path/to/your/application ...
If you wish to choose which programs will use SDP you can edit
/etc/sdp.conf and specify which programs, ports and addresses are
eligible for use. See the comments in the config file for the syntax.
7.2. Example Using SDP with Netpipe
The following example shows how to use libsdp to make the TCP
benchmarking application, netpipe, use SDP rather than TCP. NodeA is
the server and NodeB is the client. IPoIB is configured on both nodes,
and NodeA's IPoIB address is 10.0.0.1
Install netpipe on both nodes.
aptitude install netpipe-tcp
First, run the netpipe benchmark over TCP in order to obtain a
baseline number.
nodeA# NPtcp
nodeB# NPtcp -h 10.0.0.1
Send and receive buffers are 16384 and 87380 bytes
(A bug in Linux doubles the requested buffer sizes)
Now starting the main loop
0: 1 bytes 2778 times --> 0.22 Mbps in 34.04 usec
1: 2 bytes 2937 times --> 0.45 Mbps in 33.65 usec
2: 3 bytes 2971 times --> 0.69 Mbps in 33.41 usec
121: 8388605 bytes 3 times --> 2951.89 Mbps in 21680.99 usec
122: 8388608 bytes 3 times --> 3008.08 Mbps in 21276.00 usec
123: 8388611 bytes 3 times --> 2941.76 Mbps in 21755.66 usec
Now repeat the test, but force netpipe to use SDP rather than TCP.
nodeA# LD_PRELOAD=libsdp.so NPtcp
nodeB# LD_PRELOAD=libsdp.so NPtcp -h 10.0.0.1
Send and receive buffers are 16384 and 87380 bytes
(A bug in Linux doubles the requested buffer sizes)
Now starting the main loop
0: 1 bytes 9765 times --> 1.45 Mbps in 5.28 usec
1: 2 bytes 18946 times --> 2.80 Mbps in 5.46 usec
2: 3 bytes 18323 times --> 4.06 Mbps in 5.63 usec
121: 8388605 bytes 5 times --> 7665.51 Mbps in 8349.08 usec
122: 8388608 bytes 5 times --> 7668.62 Mbps in 8345.70 usec
123: 8388611 bytes 5 times --> 7629.04 Mbps in 8389.00 usec
You should see a significant increase in performance when using SDP.
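To compare the two runs without reading the whole table, you can extract the peak throughput from each. The sketch below assumes the NPtcp output layout shown above; the function name peak_mbps is invented for this example.

```shell
#!/bin/sh
# Sketch: print the highest Mbps figure found in NPtcp output.
peak_mbps() {
    # Reads NPtcp output on stdin; result lines look like
    #   122: 8388608 bytes 3 times --> 3008.08 Mbps in 21276.00 usec
    awk '$6 == "-->" && $8 == "Mbps" { v = $7 + 0; if (v > max) max = v }
         END { printf "%.2f\n", max }'
}
```

For the runs above this gives 3008.08 for TCP and 7668.62 for SDP, which is roughly a 2.5 times improvement.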
8. SRP
SRP (SCSI RDMA Protocol, also called SCSI Remote Protocol) is a
protocol that allows the use of SCSI devices across infiniband. If you
have infiniband storage, then you can access the devices via SRP.
8.1. Configuration
Ensure that your infiniband storage is presented to the host in
question. Check your storage controller documentation. Ensure that
the ib_srp kernel module is loaded and that the srptools package is
installed.
modprobe ib_srp
8.2. SRP daemon configuration
srp_daemon is responsible for discovering and connecting to SRP
targets. The default configuration shipped with srp_daemon is to
ignore all presented devices; this is a failsafe to prevent devices
from being mounted by accident on the wrong hosts.
The srp_daemon config file /etc/srp_daemon.conf has a simple syntax,
and is described in the srp_daemon(1) manpage. Each line in this file
is a rule which either allows or disallows connection, according to
the first character in the line (a or d respectively) and the ID of
the storage device.
8.2.1. Determine the IDs of presented devices
You can determine the IDs of SRP devices presented to your hosts by
running the ibsrpdm -c command.
# ibsrpdm -c
id_ext=50001ff10005052a,ioc_guid=50001ff10005052a,dgid=fe8000000000000050001ff10005052a,pkey=ffff,service_id=2a050500f11f0050
8.2.2. Configure srp_daemon to connect to the devices
Once we have the IDs of the devices, we can add them to
/etc/srp_daemon.conf. You can also specify other SRP related options
for the target, such as max_cmd_per_lun and max_sect. These are
storage specific; check your vendor documentation for recommended
values.
# This rule allows connection to our target
a id_ext=50001ff10005052a,ioc_guid=50001ff10005052a,max_cmd_per_lun=32,max_sect=65535
# This rule disallows everything else
d
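The allow rules can be derived from the ibsrpdm output instead of being copied by hand. This is a sketch that assumes each output line starts with the id_ext and ioc_guid fields, as in the example above; the function name allow_rules is made up, and it does not add storage specific options such as max_cmd_per_lun.

```shell
#!/bin/sh
# Sketch: turn "ibsrpdm -c" output into allow rules for srp_daemon.conf.
allow_rules() {
    # Reads ibsrpdm -c output on stdin; keeps only id_ext and ioc_guid.
    sed -n 's/^\(id_ext=[0-9a-fA-F]*,ioc_guid=[0-9a-fA-F]*\),.*$/a \1/p'
}
```

"ibsrpdm -c | allow_rules" prints one "a ..." line per presented target. Review the list before adding it to /etc/srp_daemon.conf, and keep the final d rule so that everything else stays disallowed.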
Restart srp_daemon and the storage target should now become
visible; check the kernel log to see if the disk has been detected.
/etc/init.d/srptools restart
In the example kernel log output the disk has been discovered as scsi
device sdb.
scsi 3:0:0:1: Direct-Access IBM DCS9900 5.03 PQ: 0 ANSI: 5
sd 3:0:0:1: [sdb] 1953458176 4096-byte hardware sectors (8001365 MB)
sd 3:0:0:1: [sdb] Write Protect is off
sd 3:0:0:1: [sdb] Mode Sense: 97 00 10 08
sd 3:0:0:1: [sdb] Write cache: disabled, read cache: enabled, supports DPO and FUA
sd 3:0:0:1: [sdb] 1953458176 4096-byte hardware sectors (8001365 MB)
sd 3:0:0:1: [sdb] Write Protect is off
sd 3:0:0:1: [sdb] Mode Sense: 97 00 10 08
sd 3:0:0:1: [sdb] Write cache: disabled, read cache: enabled, supports DPO and FUA
sdb:<6>scsi4 : SRP.T10:50001FF10005052A
unknown partition table
sd 3:0:0:1: [sdb] Attached SCSI disk
sd 3:0:0:1: Attached scsi generic sg5 type 0
8.3. Multipathing, LVM and formatting.
The newly detected SRP device can be treated as any other scsi device.
If you have multiple infiniband adapters you can use multipath-tools
on top of the SRP devices to protect against a network failure. If
you are not using multipathed IO you can simply format the device as
normal.
9. Building Lustre against OFED
Lustre is a scalable cluster filesystem popular on high performance
compute clusters. See http://www.lustre.org for more information.
Lustre can use infiniband as one of its network transports in order to
increase performance. This section describes how to compile Lustre
against the OFED infiniband stack.
9.1. Check Compatibility
9.2. Install the OFED packages
9.3. Install a set of kernel modules
9.4. Configure lustre
10. Network Troubleshooting
Diags: ibdiagnet -r
Applications
Lustre over IB.
Example MPI application.
openmpi-dev
SDP
11. Further Information