[Pkg-ofed-commits] r487 - /trunk/ofed-docs/trunk/DEBIAN-HOWTO/

gmpc-guest at alioth.debian.org gmpc-guest at alioth.debian.org
Tue Oct 13 14:22:05 UTC 2009


Author: gmpc-guest
Date: Tue Oct 13 14:22:04 2009
New Revision: 487

URL: http://svn.debian.org/wsvn/pkg-ofed/?sc=1&rev=487
Log:
Add latest howto 

Added:
    trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-12.html
Modified:
    trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-1.html
    trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-10.html
    trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-11.html
    trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-2.html
    trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-3.html
    trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-4.html
    trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-5.html
    trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-6.html
    trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-7.html
    trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-8.html
    trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-9.html
    trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto.html
    trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto.txt

Modified: trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-1.html
URL: http://svn.debian.org/wsvn/pkg-ofed/trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-1.html?rev=487&op=diff
==============================================================================
--- trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-1.html (original)
+++ trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-1.html Tue Oct 13 14:22:04 2009
@@ -1,7 +1,7 @@
 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
 <HTML>
 <HEAD>
- <META NAME="GENERATOR" CONTENT="LinuxDoc-Tools 0.9.50">
+ <META NAME="GENERATOR" CONTENT="LinuxDoc-Tools 0.9.65">
  <TITLE>Infiniband HOWTO: Introduction</TITLE>
  <LINK HREF="infiniband-howto-2.html" REL=next>
 

Modified: trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-10.html
URL: http://svn.debian.org/wsvn/pkg-ofed/trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-10.html?rev=487&op=diff
==============================================================================
--- trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-10.html (original)
+++ trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-10.html Tue Oct 13 14:22:04 2009
@@ -1,8 +1,8 @@
 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
 <HTML>
 <HEAD>
- <META NAME="GENERATOR" CONTENT="LinuxDoc-Tools 0.9.50">
- <TITLE>Infiniband HOWTO: Network Troubleshooting</TITLE>
+ <META NAME="GENERATOR" CONTENT="LinuxDoc-Tools 0.9.65">
+ <TITLE>Infiniband HOWTO: Troubleshooting</TITLE>
  <LINK HREF="infiniband-howto-11.html" REL=next>
  <LINK HREF="infiniband-howto-9.html" REL=previous>
  <LINK HREF="infiniband-howto.html#toc10" REL=contents>
@@ -12,18 +12,56 @@
 <A HREF="infiniband-howto-9.html">Previous</A>
 <A HREF="infiniband-howto.html#toc10">Contents</A>
 <HR>
-<H2><A NAME="s10">10.</A> <A HREF="infiniband-howto.html#toc10">Network Troubleshooting</A></H2>
+<H2><A NAME="s10">10.</A> <A HREF="infiniband-howto.html#toc10">Troubleshooting</A></H2>
 
-
-<H2><A NAME="ss10.1">10.1</A> <A HREF="infiniband-howto.html#toc10.1">ibdiagnet</A>
+<P>This section covers general troubleshooting and commonly reported problems.</P>
+<H2><A NAME="ss10.1">10.1</A> <A HREF="infiniband-howto.html#toc10.1">General fabric troubleshooting</A>
 </H2>
 
-<P>The ibdiagnet program can be used to troubleshoot potential issues with your infiniband fabric.</P>
-<P>
+<P>The ibdiagnet program can be used to troubleshoot potential issues with your infiniband fabric.
 <BLOCKQUOTE><CODE>
 ibdiagnet -r
 </CODE></BLOCKQUOTE>
 </P>
+
+<H2><A NAME="ss10.2">10.2</A> <A HREF="infiniband-howto.html#toc10.2">ib_query_gid() failed errors on mlx4 platforms</A>
+</H2>
+
+<P>ibstat or opensm hangs and the following kernel messages are printed:</P>
+<P>
+<BLOCKQUOTE><CODE>
+<PRE>
+kernel: [   78.170077] ib0: ib_query_gid() failed
+kernel: [   89.272789] ib0: ib_query_port failed
+</PRE>
+</CODE></BLOCKQUOTE>
+</P>
+<P>Fix: Load the mlx4_core module with the msi_x=0 option.</P>
+<P>
+<BLOCKQUOTE><CODE>
+<PRE>
+cat > /etc/modprobe.d/mlx4_core &lt;&lt;EOF
+options mlx4_core msi_x=0
+EOF
+
+update-initramfs -u
+</PRE>
+</CODE></BLOCKQUOTE>
+</P>
+
+<H2><A NAME="ss10.3">10.3</A> <A HREF="infiniband-howto.html#toc10.3">Missing XRC support</A>
+</H2>
+
+<P>If you see error messages about missing XRC support, your kernel modules and userspace libraries are mismatched.
+<BLOCKQUOTE><CODE>
+<PRE>
+mlx4: There is a mismatch between the kernel and the userspace  
+libraries: Kernel does not support XRC. Exiting.
+</PRE>
+</CODE></BLOCKQUOTE>
+
+Fix: Make sure that you build and install the OFED kernel modules as described in section X.</P>
+
 
 <HR>
 <A HREF="infiniband-howto-11.html">Next</A>
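
The msi_x=0 fix from section 10.2 can also be applied idempotently, so that re-running it does not duplicate the option line. The following is only a sketch: the function name write_mlx4_option and its directory argument are assumptions made for illustration and are not part of the HOWTO. The file name mlx4_core follows the HOWTO; newer modprobe versions only read files ending in .conf, so the name may need adjusting on later releases.

```shell
#!/bin/sh
# Hypothetical helper (not part of the HOWTO): write the mlx4_core
# option line into the given modprobe configuration directory.
write_mlx4_option() {
    conf_dir="$1"
    conf_file="$conf_dir/mlx4_core"   # file name as used in the HOWTO
    line="options mlx4_core msi_x=0"

    mkdir -p "$conf_dir" || return 1

    # Append the line only if an identical line is not already present.
    if ! grep -qxF "$line" "$conf_file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$conf_file"
    fi
}

# Example call on a real host (run as root), followed by the
# initramfs update described in the HOWTO:
#   write_mlx4_option /etc/modprobe.d && update-initramfs -u
```

Taking the directory as an argument keeps the function easy to try against a scratch directory before touching /etc/modprobe.d.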

Modified: trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-11.html
URL: http://svn.debian.org/wsvn/pkg-ofed/trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-11.html?rev=487&op=diff
==============================================================================
--- trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-11.html (original)
+++ trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-11.html Tue Oct 13 14:22:04 2009
@@ -1,42 +1,33 @@
 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
 <HTML>
 <HEAD>
- <META NAME="GENERATOR" CONTENT="LinuxDoc-Tools 0.9.50">
- <TITLE>Infiniband HOWTO: Further Information</TITLE>
+ <META NAME="GENERATOR" CONTENT="LinuxDoc-Tools 0.9.65">
+ <TITLE>Infiniband HOWTO: Tips and Tricks</TITLE>
+ <LINK HREF="infiniband-howto-12.html" REL=next>
  <LINK HREF="infiniband-howto-10.html" REL=previous>
  <LINK HREF="infiniband-howto.html#toc11" REL=contents>
 </HEAD>
 <BODY>
-Next
+<A HREF="infiniband-howto-12.html">Next</A>
 <A HREF="infiniband-howto-10.html">Previous</A>
 <A HREF="infiniband-howto.html#toc11">Contents</A>
 <HR>
-<H2><A NAME="s11">11.</A> <A HREF="infiniband-howto.html#toc11">Further Information</A></H2>
+<H2><A NAME="s11">11.</A> <A HREF="infiniband-howto.html#toc11">Tips and Tricks</A></H2>
 
-<P>Extensive documentation on the OFED software is present in the ofed-docs package.</P>
-<P>The openfabrics alliance webpage can be found here:</P>
-<P>
-<A HREF="http://www.openfabrics.org/">http://www.openfabrics.org/</A></P>
+<P>This section details an assortment of miscellaneous tips.</P>
+<H2><A NAME="ss11.1">11.1</A> <A HREF="infiniband-howto.html#toc11.1">Descriptive node names</A>
+</H2>
 
-<P>The following mailing lists are also useful:</P>
-<P>
-<A HREF="http://lists.alioth.debian.org/mailman/listinfo/pkg-ofed-devel">http://lists.alioth.debian.org/mailman/listinfo/pkg-ofed-devel</A>:
-pkg-ofed-devel: Discussion of debian specific problem or issues.</P>
+<P>You can give your hosts descriptive names by echoing text to the following file:
+<BLOCKQUOTE><CODE>
+<PRE>
+echo `uname -n` > /sys/class/infiniband/&lt;driver&gt;/node_desc
+</PRE>
+</CODE></BLOCKQUOTE>
+</P>
 
-<P>
-<A HREF="http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general">http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general</A>:
-ofa-general: General discussion of the OFED software.</P>
-<P>Books:
-<PRE>
-Infiniband Network Architecture
-by MindShare, Inc.; Tom Shanley
-Publisher: Addison-Wesley Professional
-Pub Date: October 31, 2002
-Print ISBN-10: 0-321-11765-4
-</PRE>
-</P>
 <HR>
-Next
+<A HREF="infiniband-howto-12.html">Next</A>
 <A HREF="infiniband-howto-10.html">Previous</A>
 <A HREF="infiniband-howto.html#toc11">Contents</A>
 </BODY>
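
The node_desc tip from section 11.1 has to be repeated for each device directory when a host has more than one adapter. The following is a minimal sketch of doing that in a loop; the function name set_node_desc and its two arguments are assumptions made for illustration, not something the HOWTO defines.

```shell
#!/bin/sh
# Hypothetical helper (not part of the HOWTO): write a description to the
# node_desc file of every device directory found under the given root.
set_node_desc() {
    sysfs_root="$1"
    desc="$2"

    for f in "$sysfs_root"/*/node_desc; do
        # Skip the unexpanded pattern when no device directory matches.
        [ -e "$f" ] || continue
        printf '%s\n' "$desc" > "$f"
    done
}

# Example call on a real host (run as root):
#   set_node_desc /sys/class/infiniband "$(uname -n)"
```

Passing the sysfs root as an argument lets the loop be exercised against a scratch directory first.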

Added: trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-12.html
URL: http://svn.debian.org/wsvn/pkg-ofed/trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-12.html?rev=487&op=file
==============================================================================
--- trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-12.html (added)
+++ trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-12.html Tue Oct 13 14:22:04 2009
@@ -1,0 +1,43 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
+<HTML>
+<HEAD>
+ <META NAME="GENERATOR" CONTENT="LinuxDoc-Tools 0.9.65">
+ <TITLE>Infiniband HOWTO: Further Information</TITLE>
+ <LINK HREF="infiniband-howto-11.html" REL=previous>
+ <LINK HREF="infiniband-howto.html#toc12" REL=contents>
+</HEAD>
+<BODY>
+Next
+<A HREF="infiniband-howto-11.html">Previous</A>
+<A HREF="infiniband-howto.html#toc12">Contents</A>
+<HR>
+<H2><A NAME="s12">12.</A> <A HREF="infiniband-howto.html#toc12">Further Information</A></H2>
+
+<P>Extensive documentation on the OFED software is present in the ofed-docs package.</P>
+<P>The openfabrics alliance webpage can be found here:</P>
+<P>
+<A HREF="http://www.openfabrics.org/">http://www.openfabrics.org/</A></P>
+
+<P>The following mailing lists are also useful:</P>
+<P>
+<A HREF="http://lists.alioth.debian.org/mailman/listinfo/pkg-ofed-devel">http://lists.alioth.debian.org/mailman/listinfo/pkg-ofed-devel</A>:
+pkg-ofed-devel: Discussion of Debian-specific problems or issues.</P>
+
+<P>
+<A HREF="http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general">http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general</A>:
+ofa-general: General discussion of the OFED software.</P>
+<P>Books:
+<PRE>
+Infiniband Network Architecture
+by MindShare, Inc.; Tom Shanley
+Publisher: Addison-Wesley Professional
+Pub Date: October 31, 2002
+Print ISBN-10: 0-321-11765-4
+</PRE>
+</P>
+<HR>
+Next
+<A HREF="infiniband-howto-11.html">Previous</A>
+<A HREF="infiniband-howto.html#toc12">Contents</A>
+</BODY>
+</HTML>

Modified: trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-2.html
URL: http://svn.debian.org/wsvn/pkg-ofed/trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-2.html?rev=487&op=diff
==============================================================================
--- trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-2.html (original)
+++ trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-2.html Tue Oct 13 14:22:04 2009
@@ -1,7 +1,7 @@
 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
 <HTML>
 <HEAD>
- <META NAME="GENERATOR" CONTENT="LinuxDoc-Tools 0.9.50">
+ <META NAME="GENERATOR" CONTENT="LinuxDoc-Tools 0.9.65">
  <TITLE>Infiniband HOWTO: Installing the OFED Software</TITLE>
  <LINK HREF="infiniband-howto-3.html" REL=next>
  <LINK HREF="infiniband-howto-1.html" REL=previous>

Modified: trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-3.html
URL: http://svn.debian.org/wsvn/pkg-ofed/trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-3.html?rev=487&op=diff
==============================================================================
--- trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-3.html (original)
+++ trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-3.html Tue Oct 13 14:22:04 2009
@@ -1,7 +1,7 @@
 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
 <HTML>
 <HEAD>
- <META NAME="GENERATOR" CONTENT="LinuxDoc-Tools 0.9.50">
+ <META NAME="GENERATOR" CONTENT="LinuxDoc-Tools 0.9.65">
  <TITLE>Infiniband HOWTO: Install the kernel modules</TITLE>
  <LINK HREF="infiniband-howto-4.html" REL=next>
  <LINK HREF="infiniband-howto-2.html" REL=previous>

Modified: trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-4.html
URL: http://svn.debian.org/wsvn/pkg-ofed/trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-4.html?rev=487&op=diff
==============================================================================
--- trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-4.html (original)
+++ trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-4.html Tue Oct 13 14:22:04 2009
@@ -1,7 +1,7 @@
 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
 <HTML>
 <HEAD>
- <META NAME="GENERATOR" CONTENT="LinuxDoc-Tools 0.9.50">
+ <META NAME="GENERATOR" CONTENT="LinuxDoc-Tools 0.9.65">
  <TITLE>Infiniband HOWTO: Setting up a basic infiniband network </TITLE>
  <LINK HREF="infiniband-howto-5.html" REL=next>
  <LINK HREF="infiniband-howto-3.html" REL=previous>

Modified: trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-5.html
URL: http://svn.debian.org/wsvn/pkg-ofed/trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-5.html?rev=487&op=diff
==============================================================================
--- trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-5.html (original)
+++ trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-5.html Tue Oct 13 14:22:04 2009
@@ -1,7 +1,7 @@
 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
 <HTML>
 <HEAD>
- <META NAME="GENERATOR" CONTENT="LinuxDoc-Tools 0.9.50">
+ <META NAME="GENERATOR" CONTENT="LinuxDoc-Tools 0.9.65">
  <TITLE>Infiniband HOWTO: IP over Infiniband (IPoIB)</TITLE>
  <LINK HREF="infiniband-howto-6.html" REL=next>
  <LINK HREF="infiniband-howto-4.html" REL=previous>
@@ -101,7 +101,7 @@
 <P>In order to obtain maximum IPoIB throughput you may need to tweak the MTU and various kernel TCP buffer and window settings. 
 See the details in the ipoib_release_notes.txt document in the ofed-docs package.</P>
 
-<H2><A NAME="ss5.5">5.5</A> <A HREF="infiniband-howto.html#toc5.5">ARP and dual ported cards.</A>
+<H2><A NAME="ss5.5">5.5</A> <A HREF="infiniband-howto.html#toc5.5">ARP and dual ported cards</A>
 </H2>
 
 <P>If you have a dual ported card with both ports on the same IB subnet, but different IP subnets, you

Modified: trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-6.html
URL: http://svn.debian.org/wsvn/pkg-ofed/trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-6.html?rev=487&op=diff
==============================================================================
--- trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-6.html (original)
+++ trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-6.html Tue Oct 13 14:22:04 2009
@@ -1,7 +1,7 @@
 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
 <HTML>
 <HEAD>
- <META NAME="GENERATOR" CONTENT="LinuxDoc-Tools 0.9.50">
+ <META NAME="GENERATOR" CONTENT="LinuxDoc-Tools 0.9.65">
  <TITLE>Infiniband HOWTO: OpenMPI</TITLE>
  <LINK HREF="infiniband-howto-7.html" REL=next>
  <LINK HREF="infiniband-howto-5.html" REL=previous>
@@ -74,7 +74,7 @@
 <P>OpenMPI uses ssh to spawn jobs on remote hosts. You should configure a public/private keypair to ensure that  you 
 can ssh between hosts without entering a password. You should also ensure that your login process is silent.</P>
 
-<H2><A NAME="ss6.6">6.6</A> <A HREF="infiniband-howto.html#toc6.6">Run the MPI PingPong benchmark.</A>
+<H2><A NAME="ss6.6">6.6</A> <A HREF="infiniband-howto.html#toc6.6">Run the MPI PingPong benchmark</A>
 </H2>
 
 

Modified: trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-7.html
URL: http://svn.debian.org/wsvn/pkg-ofed/trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-7.html?rev=487&op=diff
==============================================================================
--- trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-7.html (original)
+++ trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-7.html Tue Oct 13 14:22:04 2009
@@ -1,7 +1,7 @@
 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
 <HTML>
 <HEAD>
- <META NAME="GENERATOR" CONTENT="LinuxDoc-Tools 0.9.50">
+ <META NAME="GENERATOR" CONTENT="LinuxDoc-Tools 0.9.65">
  <TITLE>Infiniband HOWTO: SDP</TITLE>
  <LINK HREF="infiniband-howto-8.html" REL=next>
  <LINK HREF="infiniband-howto-6.html" REL=previous>

Modified: trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-8.html
URL: http://svn.debian.org/wsvn/pkg-ofed/trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-8.html?rev=487&op=diff
==============================================================================
--- trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-8.html (original)
+++ trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-8.html Tue Oct 13 14:22:04 2009
@@ -1,7 +1,7 @@
 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
 <HTML>
 <HEAD>
- <META NAME="GENERATOR" CONTENT="LinuxDoc-Tools 0.9.50">
+ <META NAME="GENERATOR" CONTENT="LinuxDoc-Tools 0.9.65">
  <TITLE>Infiniband HOWTO: SRP</TITLE>
  <LINK HREF="infiniband-howto-9.html" REL=next>
  <LINK HREF="infiniband-howto-7.html" REL=previous>

Modified: trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-9.html
URL: http://svn.debian.org/wsvn/pkg-ofed/trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-9.html?rev=487&op=diff
==============================================================================
--- trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-9.html (original)
+++ trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto-9.html Tue Oct 13 14:22:04 2009
@@ -1,7 +1,7 @@
 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
 <HTML>
 <HEAD>
- <META NAME="GENERATOR" CONTENT="LinuxDoc-Tools 0.9.50">
+ <META NAME="GENERATOR" CONTENT="LinuxDoc-Tools 0.9.65">
  <TITLE>Infiniband HOWTO: Building Lustre against OFED</TITLE>
  <LINK HREF="infiniband-howto-10.html" REL=next>
  <LINK HREF="infiniband-howto-8.html" REL=previous>
@@ -31,7 +31,7 @@
 It is required for the next step.</P>
 
 
-<H2><A NAME="ss9.3">9.3</A> <A HREF="infiniband-howto.html#toc9.3">Build OFED modules for the lustre patched kernel.</A>
+<H2><A NAME="ss9.3">9.3</A> <A HREF="infiniband-howto.html#toc9.3">Build OFED modules for the lustre patched kernel</A>
 </H2>
 
 <P>Build OFED modules against the newly built lustre patched kernel.</P>

Modified: trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto.html
URL: http://svn.debian.org/wsvn/pkg-ofed/trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto.html?rev=487&op=diff
==============================================================================
--- trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto.html (original)
+++ trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto.html Tue Oct 13 14:22:04 2009
@@ -1,7 +1,7 @@
 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
 <HTML>
 <HEAD>
- <META NAME="GENERATOR" CONTENT="LinuxDoc-Tools 0.9.50">
+ <META NAME="GENERATOR" CONTENT="LinuxDoc-Tools 0.9.65">
  <TITLE>Infiniband HOWTO</TITLE>
  <LINK HREF="infiniband-howto-1.html" REL=next>
 
@@ -15,6 +15,9 @@
 <H1>Infiniband HOWTO</H1>
 
 <H2>Guy Coates </H2>
+<HR>
+<EM>This document describes how to install and configure the OFED infiniband software on Debian.</EM>
+<HR>
 <P>
 <H2><A NAME="toc1">1.</A> <A HREF="infiniband-howto-1.html">Introduction</A></H2>
 
@@ -57,7 +60,7 @@
 <LI><A NAME="toc5.2">5.2</A> <A HREF="infiniband-howto-5.html#ss5.2">IP Configuration</A>
 <LI><A NAME="toc5.3">5.3</A> <A HREF="infiniband-howto-5.html#ss5.3">Connected vs Unconnected Mode</A>
 <LI><A NAME="toc5.4">5.4</A> <A HREF="infiniband-howto-5.html#ss5.4">TCP tuning</A>
-<LI><A NAME="toc5.5">5.5</A> <A HREF="infiniband-howto-5.html#ss5.5">ARP and dual ported cards.</A>
+<LI><A NAME="toc5.5">5.5</A> <A HREF="infiniband-howto-5.html#ss5.5">ARP and dual ported cards</A>
 </UL>
 <P>
 <H2><A NAME="toc6">6.</A> <A HREF="infiniband-howto-6.html">OpenMPI</A></H2>
@@ -68,7 +71,7 @@
 <LI><A NAME="toc6.3">6.3</A> <A HREF="infiniband-howto-6.html#ss6.3">Check permissions and limits</A>
 <LI><A NAME="toc6.4">6.4</A> <A HREF="infiniband-howto-6.html#ss6.4">Install the mpi test programs</A>
 <LI><A NAME="toc6.5">6.5</A> <A HREF="infiniband-howto-6.html#ss6.5">Configure Host Access</A>
-<LI><A NAME="toc6.6">6.6</A> <A HREF="infiniband-howto-6.html#ss6.6">Run the MPI PingPong benchmark.</A>
+<LI><A NAME="toc6.6">6.6</A> <A HREF="infiniband-howto-6.html#ss6.6">Run the MPI PingPong benchmark</A>
 </UL>
 <P>
 <H2><A NAME="toc7">7.</A> <A HREF="infiniband-howto-7.html">SDP</A></H2>
@@ -91,17 +94,25 @@
 <UL>
 <LI><A NAME="toc9.1">9.1</A> <A HREF="infiniband-howto-9.html#ss9.1">Check Compatibility</A>
 <LI><A NAME="toc9.2">9.2</A> <A HREF="infiniband-howto-9.html#ss9.2">Build a lustre patched kernel</A>
-<LI><A NAME="toc9.3">9.3</A> <A HREF="infiniband-howto-9.html#ss9.3">Build OFED modules for the lustre patched kernel.</A>
+<LI><A NAME="toc9.3">9.3</A> <A HREF="infiniband-howto-9.html#ss9.3">Build OFED modules for the lustre patched kernel</A>
 <LI><A NAME="toc9.4">9.4</A> <A HREF="infiniband-howto-9.html#ss9.4">Configure lustre</A>
 </UL>
 <P>
-<H2><A NAME="toc10">10.</A> <A HREF="infiniband-howto-10.html">Network Troubleshooting</A></H2>
+<H2><A NAME="toc10">10.</A> <A HREF="infiniband-howto-10.html">Troubleshooting</A></H2>
 
 <UL>
-<LI><A NAME="toc10.1">10.1</A> <A HREF="infiniband-howto-10.html#ss10.1">ibdiagnet</A>
+<LI><A NAME="toc10.1">10.1</A> <A HREF="infiniband-howto-10.html#ss10.1">General fabric troubleshooting</A>
+<LI><A NAME="toc10.2">10.2</A> <A HREF="infiniband-howto-10.html#ss10.2">ib_query_gid() failed errors on mlx4 platforms</A>
+<LI><A NAME="toc10.3">10.3</A> <A HREF="infiniband-howto-10.html#ss10.3">Missing XRC support</A>
 </UL>
 <P>
-<H2><A NAME="toc11">11.</A> <A HREF="infiniband-howto-11.html">Further Information</A></H2>
+<H2><A NAME="toc11">11.</A> <A HREF="infiniband-howto-11.html">Tips and Tricks</A></H2>
+
+<UL>
+<LI><A NAME="toc11.1">11.1</A> <A HREF="infiniband-howto-11.html#ss11.1">Descriptive node names</A>
+</UL>
+<P>
+<H2><A NAME="toc12">12.</A> <A HREF="infiniband-howto-12.html">Further Information</A></H2>
 
 <HR>
 <A HREF="infiniband-howto-1.html">Next</A>

Modified: trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto.txt
URL: http://svn.debian.org/wsvn/pkg-ofed/trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto.txt?rev=487&op=diff
==============================================================================
--- trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto.txt (original)
+++ trunk/ofed-docs/trunk/DEBIAN-HOWTO/infiniband-howto.txt Tue Oct 13 14:22:04 2009
@@ -1,7 +1,10 @@
-  Infiniband Howto
+  Infiniband HOWTO
   Guy Coates
 
-  ____________________________________________________________
+
+  This document describes how to install and configure the OFED infini-
+  band software on Debian.
+  ______________________________________________________________________
 
   Table of Contents
 
@@ -38,15 +41,15 @@
      5.2 IP Configuration
      5.3 Connected vs Unconnected Mode
      5.4 TCP tuning
-     5.5 ARP and dual ported cards.
+     5.5 ARP and dual ported cards
 
   6. OpenMPI
      6.1 Configure IPoIB
      6.2 Load the modules
      6.3 Check permissions and limits
      6.4 Install the mpi test programs
-     6.5 Configure Hosts
-     6.6 Run the MPI PingPong benchmark.
+     6.5 Configure Host Access
+     6.6 Run the MPI PingPong benchmark
 
   7. SDP
      7.1 Configuration
@@ -62,13 +65,18 @@
   9. Building Lustre against OFED
      9.1 Check Compatibility
      9.2 Build a lustre patched kernel
-     9.3 Build OFED modules for the lustre patched kernel.
+     9.3 Build OFED modules for the lustre patched kernel
      9.4 Configure lustre
 
-  10. Network Troubleshooting
-     10.1 ibdiagnet
-
-  11. Further Information
+  10. Troubleshooting
+     10.1 General fabric troubleshooting
+     10.2 ib_query_gid() failed errors on mlx4 platforms
+     10.3 Missing XRC support
+
+  11. Tips and Tricks
+     11.1 Descriptive node names
+
+  12. Further Information
 
 
   ______________________________________________________________________
@@ -132,12 +140,11 @@
 
   If you wish to build the OFED packages from the alioth svn repository,
   use the following procedure.
-
   2.2.1.  Install the prerequisites development packages
 
 
 
-  aptitude install svn-buildpackage build-essential devscripts
+       aptitude install svn-buildpackage build-essential devscripts
 
 
 
@@ -159,10 +166,19 @@
 
 
 
-  Populate the tarballs with the *.orig.tar.gz files available form the
-  "upstream source" release on
-  https://alioth.debian.org/frs/?group_id=100311
-  <https://alioth.debian.org/frs/?group_id=100311>
+  Original source tarballs can be downloaded from the repository:
+
+
+         apt-get source libibverbs
+
+
+
+  Alternatively, you can grab the source code directly from upstream.
+
+  http://www.openfabrics.org/downloads/OFED/
+
+  Upstream source is distributed via SRPMS; you can use alien to convert
+  them into tarballs.
 
   2.2.4.  Build the packages.
 
@@ -191,29 +207,35 @@
   build order is:
 
 
-
-   libibcm
-   libibcommon
-   libibumad
-   libibmad
-   libnes
-   libsdp
-   dapl
-   opensm
-   infiniband-diags
-   ibutils
-   mstflint
-   perftest
-   qlvnictools
-   qpert
-   rds-tools
-   sdpnetstat
-   srptools
-   tvflash
-   ibsim
-   ofed-docs
-   ofa_kernel
-   ofed
+        libibverbs
+        libnes
+        libcxgb3
+        libipathverbs
+        libmlx4
+        libmthca
+        librdmacm
+        libibcm
+        libibcommon
+        libibumad
+        libibmad
+        libsdp
+        dapl
+        opensm
+        infiniband-diags
+        ibutils
+        mstflint
+        perftest
+        qlvnictools
+        qperf
+        rds-tools
+        sdpnetstat
+        srptools
+        tvflash
+        ibsim
+        mpitests
+        ofed-docs
+        ofa_kernel
+        ofed
 
 
 
@@ -241,6 +263,8 @@
   set of modules rather than relying on the modules shipped with the
   kernel.
 
+
+
   3.1.  Building new kernel modules
 
   You can build new kernel modules using module-assistant.
@@ -253,17 +277,27 @@
   Ensure you have the ofa-kernel-source package installed, and then run:
 
 
-
-   module-assistant prepare
-   module-assistant clean ofa-kernel
-   module-assistant build ofa-kernel
-
-
-
-  This will create a deb which you can then install. As the deb contains
-  replacements for existing kernel modules you will need to either manu-
-  ally remove any infiniband modules which have already been loaded, or
-  reboot the machine, before you can use the new modules.
+        module-assistant prepare
+        module-assistant clean ofa-kernel
+        module-assistant build ofa-kernel
+
+
+
+  This procedure will create an ofa-kernel-modules deb in /usr/src. You
+  can then install the deb using dpkg or by running:
+
+
+        module-assistant install ofa-kernel
+
+
+
+  The deb can also be copied to your other infiniband hosts and
+  installed using dpkg.
+
+  As the deb contains replacements for existing kernel modules you will
+  need to either manually remove any infiniband modules which have
+  already been loaded, or reboot the machine, before you can use the new
+  modules.
 
   The new kernel modules will be installed into /usr/lib/<kernel-
   version>/updates. They will not overwrite the original kernel modules,
@@ -282,14 +316,16 @@
 
 
 
-  Note that if you wish to rebuild the kernel modules (eg for a new
-  kernel version) then you must issue the module-assistant clean command
-  before trying a new build.
+  Note that if you wish to rebuild the kernel modules for any reason
+  (e.g. for a new kernel version or to continue an interrupted build),
+  you must issue the "module-assistant clean" command before trying a
+  new build.
 
   4.  Setting up a basic infiniband network
 
   This section describes how to set up a basic infiniband network and
   test its functionality.
+
 
   4.1.  Upgrade your Infiniband card and switch firmware
 
@@ -355,9 +391,10 @@
   You can find the port GUIDs of your cards with the ibstat -p command:
 
 
-       # ibstat -p
-       0x0002c9030002fb05
-       0x0002c9030002fb06
+
+  # ibstat -p
+  0x0002c9030002fb05
+  0x0002c9030002fb06
 
 
 
@@ -398,32 +435,32 @@
 
 
 
-       # ibstat
-       CA 'mlx4_0'
-               CA type: MT25418
-               Number of ports: 2
-               Firmware version: 2.3.0
-               Hardware version: a0
-               Node GUID: 0x0002c9030002fb04
-               System image GUID: 0x0002c9030002fb07
-               Port 1:
-                       State: Active
-                       Physical state: LinkUp
-                       Rate: 20
-                       Base lid: 2
-                       LMC: 0
-                       SM lid: 1
-                       Capability mask: 0x02510868
-                       Port GUID: 0x0002c9030002fb05
-               Port 2:
-                       State: Down
-                       Physical state: Polling
-                       Rate: 10
-                       Base lid: 0
-                       LMC: 0
-                       SM lid: 0
-                       Capability mask: 0x02510868
-                       Port GUID: 0x0002c9030002fb06
+  # ibstat
+  CA 'mlx4_0'
+          CA type: MT25418
+          Number of ports: 2
+          Firmware version: 2.3.0
+          Hardware version: a0
+          Node GUID: 0x0002c9030002fb04
+          System image GUID: 0x0002c9030002fb07
+          Port 1:
+                  State: Active
+                  Physical state: LinkUp
+                  Rate: 20
+                  Base lid: 2
+                  LMC: 0
+                  SM lid: 1
+                  Capability mask: 0x02510868
+                  Port GUID: 0x0002c9030002fb05
+          Port 2:
+                  State: Down
+                  Physical state: Polling
+                  Rate: 10
+                  Base lid: 0
+                  LMC: 0
+                  SM lid: 0
+                  Capability mask: 0x02510868
+                  Port GUID: 0x0002c9030002fb06
 
 
 
@@ -447,6 +484,7 @@
        Ca      : 0x0002c9030002fc10 ports 2 "MT25408 ConnectX Mellanox Technologies"
 
 
+
   ibswitches will display all of the switches in the network.
 
 
@@ -459,32 +497,33 @@
   network.
 
 
-       #iblinkinfo.pl
-       Switch 0x0008f104004121fa ISR9024D-M Voltaire:
-             1    1[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>       2    1[  ] "MT25408 ConnectX Mellanox Technologies" (  )
-             1    2[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>      13    1[  ] "MT25408 ConnectX Mellanox Technologies" (  )
-             1    3[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>       4    1[  ] "MT25408 ConnectX Mellanox Technologies" (  )
-             1    4[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>      26    1[  ] "MT25408 ConnectX Mellanox Technologies" (  )
-             1    5[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>      27    1[  ] "MT25408 ConnectX Mellanox Technologies" (  )
-             1    6[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>      24    1[  ] "MT25408 ConnectX Mellanox Technologies" (  )
-             1    7[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>      28    1[  ] "MT25408 ConnectX Mellanox Technologies" (  )
-             1    8[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>      25    1[  ] "MT25408 ConnectX Mellanox Technologies" (  )
-             1    9[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>      31    1[  ] "MT25408 ConnectX Mellanox Technologies" (  )
-             1   10[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>      32    1[  ] "MT25408 ConnectX Mellanox Technologies" (  )
-             1   11[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>      33    1[  ] "MT25408 ConnectX Mellanox Technologies" (  )
-             1   12[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>      29    1[  ] "MT25408 ConnectX Mellanox Technologies" (  )
-             1   13[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>      30    1[  ] "MT25408 ConnectX Mellanox Technologies" (  )
-                 14[  ]  ==( 4X 2.5 Gbps   Down /  Polling)==>             [  ] "" (  )
-             1   15[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>       3    1[  ] "Voltaire HCA400Ex-D" (  )
-             1   16[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>      10    1[  ] "Voltaire HCA400Ex-D" (  )
-                 17[  ]  ==( 4X 2.5 Gbps   Down /  Polling)==>             [  ] "" (  )
-                 18[  ]  ==( 4X 2.5 Gbps   Down /  Polling)==>             [  ] "" (  )
-             1   19[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>       7    2[  ] "Voltaire HCA400Ex-D" (  )
-             1   20[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>       6    2[  ] "Voltaire HCA400Ex-D" (  )
-             1   21[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>       5    2[  ] "Voltaire HCA400Ex-D" (  )
-             1   22[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>      21    1[  ] "Voltaire HCA400Ex-D" (  )
-             1   23[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>       9    2[  ] "Voltaire HCA400Ex-D" (  )
-             1   24[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>       8    1[  ] "Voltaire HCA400Ex-D" (  )
+
+  #iblinkinfo.pl
+  Switch 0x0008f104004121fa ISR9024D-M Voltaire:
+        1    1[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>       2    1[  ] "MT25408 ConnectX Mellanox Technologies" (  )
+        1    2[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>      13    1[  ] "MT25408 ConnectX Mellanox Technologies" (  )
+        1    3[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>       4    1[  ] "MT25408 ConnectX Mellanox Technologies" (  )
+        1    4[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>      26    1[  ] "MT25408 ConnectX Mellanox Technologies" (  )
+        1    5[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>      27    1[  ] "MT25408 ConnectX Mellanox Technologies" (  )
+        1    6[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>      24    1[  ] "MT25408 ConnectX Mellanox Technologies" (  )
+        1    7[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>      28    1[  ] "MT25408 ConnectX Mellanox Technologies" (  )
+        1    8[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>      25    1[  ] "MT25408 ConnectX Mellanox Technologies" (  )
+        1    9[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>      31    1[  ] "MT25408 ConnectX Mellanox Technologies" (  )
+        1   10[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>      32    1[  ] "MT25408 ConnectX Mellanox Technologies" (  )
+        1   11[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>      33    1[  ] "MT25408 ConnectX Mellanox Technologies" (  )
+        1   12[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>      29    1[  ] "MT25408 ConnectX Mellanox Technologies" (  )
+        1   13[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>      30    1[  ] "MT25408 ConnectX Mellanox Technologies" (  )
+            14[  ]  ==( 4X 2.5 Gbps   Down /  Polling)==>             [  ] "" (  )
+        1   15[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>       3    1[  ] "Voltaire HCA400Ex-D" (  )
+        1   16[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>      10    1[  ] "Voltaire HCA400Ex-D" (  )
+            17[  ]  ==( 4X 2.5 Gbps   Down /  Polling)==>             [  ] "" (  )
+            18[  ]  ==( 4X 2.5 Gbps   Down /  Polling)==>             [  ] "" (  )
+        1   19[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>       7    2[  ] "Voltaire HCA400Ex-D" (  )
+        1   20[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>       6    2[  ] "Voltaire HCA400Ex-D" (  )
+        1   21[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>       5    2[  ] "Voltaire HCA400Ex-D" (  )
+        1   22[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>      21    1[  ] "Voltaire HCA400Ex-D" (  )
+        1   23[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>       9    2[  ] "Voltaire HCA400Ex-D" (  )
+        1   24[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>       8    1[  ] "Voltaire HCA400Ex-D" (  )
 
 
 
@@ -523,12 +562,12 @@
   server.
 
 
-       #ib_rdma_lat  hostname-of-server
-          local address: LID 0x0d QPN 0x18004a PSN 0xca58c4 RKey 0xda002824 VAddr 0x00000000509001
-         remote address: LID 0x02 QPN 0x7c004a PSN 0x4b4eba RKey 0x82002466 VAddr 0x00000000509001
-       Latency typical: 1.15193 usec
-       Latency best   : 1.13094 usec
-       Latency worst  : 5.48519 usec
+  #ib_rdma_lat  hostname-of-server
+     local address: LID 0x0d QPN 0x18004a PSN 0xca58c4 RKey 0xda002824 VAddr 0x00000000509001
+    remote address: LID 0x02 QPN 0x7c004a PSN 0x4b4eba RKey 0x82002466 VAddr 0x00000000509001
+  Latency typical: 1.15193 usec
+  Latency best   : 1.13094 usec
+  Latency worst  : 5.48519 usec
 
 
 
@@ -572,27 +611,28 @@
        #modprobe ib_ipoib
 
 
-  You will now have an "ib" network interface for each of your
-  infiniband cards.
-
-
-       #ifconfig -a
-
-       <snip>
-       ib0       Link encap:UNSPEC  HWaddr 80-06-00-48-FE-80-00-00-00-00-00-00-00-00-00-00
-                 BROADCAST MULTICAST  MTU:2044  Metric:1
-                 RX packets:0 errors:0 dropped:0 overruns:0 frame:0
-                 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
-                 collisions:0 txqueuelen:256
-                 RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
-
-       ib1       Link encap:UNSPEC  HWaddr 80-06-00-49-FE-80-00-00-00-00-00-00-00-00-00-00
-                 BROADCAST MULTICAST  MTU:2044  Metric:1
-                 RX packets:0 errors:0 dropped:0 overruns:0 frame:0
-                 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
-                 collisions:0 txqueuelen:256
-                 RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
-       <snip>
+  You will now have an "ib" network interface for each of your
+  infiniband cards.
+
+
+
+  #ifconfig -a
+
+  <snip>
+  ib0       Link encap:UNSPEC  HWaddr 80-06-00-48-FE-80-00-00-00-00-00-00-00-00-00-00
+            BROADCAST MULTICAST  MTU:2044  Metric:1
+            RX packets:0 errors:0 dropped:0 overruns:0 frame:0
+            TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
+            collisions:0 txqueuelen:256
+            RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
+
+  ib1       Link encap:UNSPEC  HWaddr 80-06-00-49-FE-80-00-00-00-00-00-00-00-00-00-00
+            BROADCAST MULTICAST  MTU:2044  Metric:1
+            RX packets:0 errors:0 dropped:0 overruns:0 frame:0
+            TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
+            collisions:0 txqueuelen:256
+            RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
+  <snip>
 
 
 
@@ -648,10 +688,10 @@
   details in the ipoib_release_notes.txt document in the ofed-docs
   package.
 
-  5.5.  ARP and dual ported cards.
-
-  If you have a dual ported card with both ports on the same IB subnet
-  but a different IP subnet, you will need to tweak the ARP settings for
+  5.5.  ARP and dual ported cards
+
+  If you have a dual ported card with both ports on the same IB subnet,
+  but different IP subnets, you will need to tweak the ARP settings for
   the IPoIB interfaces. See ipoib_release_notes.txt in the ofed-docs
   package for a full discussion of this issue.
 
@@ -702,7 +742,7 @@
   OpenMPI will need to pin memory. Edit /etc/security/limits.conf and
   add the line:
 
-       * hard memlock unlimited
+  * hard memlock unlimited
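After logging in again, the new limit can be confirmed from the shell. The following is a minimal sketch, not part of the HOWTO: the helper name memlock_ok is hypothetical, and it only classifies the value printed by the bash builtin "ulimit -H -l" (the hard locked-memory limit).

```shell
#!/bin/bash
# Hypothetical helper: succeeds only when the given limit is "unlimited".
memlock_ok() {
    [ "$1" = "unlimited" ]
}

# ulimit -H -l prints the hard limit on locked memory for this shell,
# either a size in kbytes or the word "unlimited".
current=$(ulimit -H -l)
if memlock_ok "$current"; then
    echo "hard memlock limit: unlimited"
else
    echo "hard memlock limit is $current; check /etc/security/limits.conf and log in again"
fi
```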
 
 
   6.4.  Install the mpi test programs
@@ -712,36 +752,30 @@
        aptitude install mpitests
 
 
-  6.5.  Configure Hosts
+  6.5.  Configure Host Access
 
   OpenMPI uses ssh to spawn jobs on remote hosts. You should configure a
   public/private keypair to ensure that you can ssh between hosts
   without entering a password. You should also ensure that your login
   process is silent.
 
-  Choose two hosts on which to test the program and put their hostnames
-  into a file called hostfile:
-
-
-
-        hostA slots=1
-        hostB slots=1
-
-
-
-  6.6.  Run the MPI PingPong benchmark.
+  6.6.  Run the MPI PingPong benchmark
 
   We will use the MPI PingPong benchmark for our testing. By default,
   openmpi should use infiniband networks in preference to any tcp
-  networks it finds. However, we will force mpi to be extra-chatty
-  during the test to ensure that we are really using the infiniband
-  interfaces.
-
-  (ADDME: Is there a better way to confirm which networks openmpi is
-  using?)
-
-
-  mpirun --mca btl_openib_verbose 1 --mca btl ^tcp -n 2 -hostfile /path/to/hostfile IMB-MPI1 PingPong
+  networks it finds. However, we will force mpi to ignore tcp networks
+  to ensure that it is using the infiniband network.
+
+
+  #!/bin/bash
+  #Infiniband MPI test program
+  #Edit the hosts below to match your test hosts
+  cat > /tmp/hostfile.$$.mpi <<EOF
+  hostA slots=1
+  hostB slots=1
+  EOF
+
+  mpirun --mca btl_openib_verbose 1 --mca btl ^tcp -n 2 -hostfile /tmp/hostfile.$$.mpi IMB-MPI1 PingPong
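The script above leaves its hostfile behind in /tmp. A variant that creates the file with mktemp and removes it on exit could look like the sketch below. The helper write_hostfile is hypothetical (it is not part of Open MPI), and hostA and hostB remain placeholders for your own test hosts.

```shell
#!/bin/bash
# Sketch: build an Open MPI hostfile with one slot per host.
# write_hostfile is a hypothetical helper, not an Open MPI tool.
write_hostfile() {
    local file=$1; shift
    local host
    for host in "$@"; do
        echo "$host slots=1"
    done > "$file"
}

hostfile=$(mktemp /tmp/hostfile.XXXXXX)   # unique file, created safely
trap 'rm -f "$hostfile"' EXIT             # remove it when the script exits

write_hostfile "$hostfile" hostA hostB    # placeholder host names
cat "$hostfile"

# Then run the benchmark as in the script above:
# mpirun --mca btl_openib_verbose 1 --mca btl ^tcp -n 2 -hostfile "$hostfile" IMB-MPI1 PingPong
```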
 
 
 
@@ -798,7 +832,19 @@
   are connected via eth0).
 
 
-   mpirun --mca btl ^openib --mca btl_tcp_if_include eth0 --hostfile hostfile -n 2 IMB-MPI1 -benchmark PingPong
+  #!/bin/bash
+  #TCP MPI test program
+  #Edit the hosts below to match your test hosts
+  cat > /tmp/hostfile.$$.mpi <<EOF
+  hostA slots=1
+  hostB slots=1
+  EOF
+  mpirun --mca btl ^openib --mca btl_tcp_if_include eth0 --hostfile /tmp/hostfile.$$.mpi -n 2 IMB-MPI1 -benchmark PingPong
+
+
+
+  You should notice significantly higher latencies than for the
+  infiniband test.
 
 
 
@@ -815,6 +861,7 @@
 
   SDP uses IPoIB for address resolution, so you must configure IPoIB
   before using SDP.
+
   You should also ensure the ib_sdp kernel module is installed.
 
   modprobe ib_sdp
@@ -867,23 +914,22 @@
        123: 8388611 bytes      3 times -->   2941.76 Mbps in   21755.66 usec
 
 
-
   Now repeat the test, but force netpipe to use SDP rather than TCP.
 
 
 
-  nodeA# LD_PRELOAD=libsdp.so NPtcp
-  nodeB# LD_PRELOAD=libsdp.so  NPtcp -h 10.0.0.1
-  Send and receive buffers are 16384 and 87380 bytes
-  (A bug in Linux doubles the requested buffer sizes)
-  Now starting the main loop
-    0:       1 bytes   9765 times -->      1.45 Mbps in       5.28 usec
-    1:       2 bytes  18946 times -->      2.80 Mbps in       5.46 usec
-    2:       3 bytes  18323 times -->      4.06 Mbps in       5.63 usec
-  <snip>
-  121: 8388605 bytes      5 times -->   7665.51 Mbps in    8349.08 usec
-  122: 8388608 bytes      5 times -->   7668.62 Mbps in    8345.70 usec
-  123: 8388611 bytes      5 times -->   7629.04 Mbps in    8389.00 usec
+       nodeA# LD_PRELOAD=libsdp.so NPtcp
+       nodeB# LD_PRELOAD=libsdp.so  NPtcp -h 10.0.0.1
+       Send and receive buffers are 16384 and 87380 bytes
+       (A bug in Linux doubles the requested buffer sizes)
+       Now starting the main loop
+         0:       1 bytes   9765 times -->      1.45 Mbps in       5.28 usec
+         1:       2 bytes  18946 times -->      2.80 Mbps in       5.46 usec
+         2:       3 bytes  18323 times -->      4.06 Mbps in       5.63 usec
+       <snip>
+       121: 8388605 bytes      5 times -->   7665.51 Mbps in    8349.08 usec
+       122: 8388608 bytes      5 times -->   7668.62 Mbps in    8345.70 usec
+       123: 8388611 bytes      5 times -->   7629.04 Mbps in    8389.00 usec
 
 
 
@@ -1002,7 +1048,7 @@
   wiki. Once you have built the kernel, keep the configured source tree.
   It is required for the next step.
 
-  9.3.  Build OFED modules for the lustre patched kernel.
+  9.3.  Build OFED modules for the lustre patched kernel
 
   Build OFED modules against the newly built lustre patched kernel.
 
@@ -1030,18 +1076,70 @@
 
 
 
-  10.  Network Troubleshooting
-
-  10.1.  ibdiagnet
+  10.  Troubleshooting
+
+  This section covers general troubleshooting and commonly reported
+  problems.
+
+  10.1.  General fabric troubleshooting
 
   The ibdiagnet program can be used to troubleshoot potential issues
   with your infiniband fabric.
 
-
        ibdiagnet -r
 
 
-  11.  Further Information
+  10.2.  ib_query_gid() failed errors on mlx4 platforms
+
+  ibstat or opensm hangs and the following kernel messages are printed:
+
+
+
+       kernel: [   78.170077] ib0: ib_query_gid() failed
+       kernel: [   89.272789] ib0: ib_query_port failed
+
+
+
+  Fix: Load the mlx4_core module with the msi_x=0 option.
+
+
+       cat > /etc/modprobe.d/mlx4_core <<EOF
+       options mlx4_core msi_x=0
+       EOF
+
+       update-initramfs -u
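Once the module has been reloaded (or the host rebooted), it can help to check which value the running module actually picked up. The sketch below assumes that mlx4_core exports msi_x read-only under /sys/module/mlx4_core/parameters/, which may not hold for every kernel build. The helper param_value and its optional directory argument are hypothetical and exist only for illustration.

```shell
#!/bin/bash
# Sketch: print a module parameter of the loaded mlx4_core module.
# Assumption: parameters are exported under /sys/module/mlx4_core/parameters/.
# param_value is a hypothetical helper; its second argument overrides the directory.
param_value() {
    local file=${2:-/sys/module/mlx4_core/parameters}/$1
    if [ -r "$file" ]; then
        cat "$file"
    else
        echo "unavailable"
    fi
}

echo "mlx4_core msi_x: $(param_value msi_x)"
```

A value of 0 would indicate that the option from /etc/modprobe.d/mlx4_core is in effect.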
+
+
+
+  10.3.  Missing XRC support
+
+  If you see error messages pertaining to missing support for XRC, it
+  means you have mismatched kernel modules and userspace libraries.
+
+
+       mlx4: There is a mismatch between the kernel and the userspace
+       libraries: Kernel does not support XRC. Exiting.
+
+
+
+  Fix: Make sure that you build and install the OFED kernel modules as
+  described in section X.
+
+  11.  Tips and Tricks
+
+  This section details an assortment of miscellaneous tips.
+
+  11.1.  Descriptive node names
+
+  You can give your hosts descriptive names by echoing text to the
+  following file:
+
+
+       echo `uname -n` > /sys/class/infiniband/<driver>/node_desc
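On hosts with more than one adapter, the same write can be repeated for every device directory. This is a minimal sketch under the assumption that each device appears as a directory under /sys/class/infiniband/. The function set_node_desc and its optional base-directory argument are hypothetical, and writing to sysfs requires root.

```shell
#!/bin/bash
# Sketch: write the host name into node_desc for every device found.
# set_node_desc is a hypothetical helper; the optional argument overrides
# the sysfs base directory (useful for trying it out on a scratch directory).
set_node_desc() {
    local base=${1:-/sys/class/infiniband}
    local f
    for f in "$base"/*/node_desc; do
        [ -w "$f" ] || continue   # no device present, or not writable (needs root)
        uname -n > "$f"
    done
}

set_node_desc
```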
+
+
+
+  12.  Further Information
 
   Extensive documentation on the OFED software is present in the ofed-
   docs package.
@@ -1061,8 +1159,6 @@
   general: General discussion of the OFED software.
 
   Books:
-
-
 
   Infiniband Network Architecture
   by MindShare, Inc.; Tom Shanley



