[Debian-ha-maintainers] Bug#571134: redhat-cluster-suite: Soft lockup in send_dlm on shutdown

Russell Howe rhowe at bmtmarinerisk.com
Tue Feb 23 18:44:54 UTC 2010


Package: redhat-cluster-suite
Version: 2.20081102-1+lenny1
Severity: normal


When I shut down my cluster nodes, they always lock up with the
following trace:

Stopping HP System Management Homepage...Stopping libvirt management 
daemon: libvirtd.
Stopping multipath daemon: multipathd.
Stopping Munin-Node: done.
Stopping nagios-nrpe: nagios-nrpe.
Stopping NFS common utilities: statd.
Stopping Network UPS Tools: upsmon.
Stopping internet superserver: inetd.
Stopping network management services: snmpd snmptrapd.
Shutting down Xen domains:[done].
Stopping XEN control daemon: xend.
Stopping NTP server: ntpd.
Saving the system clock.
Stopping NFS kernel daemon: mountd nfsd[691120.755497] nfsd: last server has exited
[691120.755536] nfsd: unexporting all filesystems
.
Unexporting directories for NFS kernel daemon....
Stopping DNS forwarder and DHCP server: dnsmasq.
Stopping enhanced syslogd: rsyslogd.
Stopping Bacula File daemon: bacula-fd.
Stopping cluster service manager:
Waiting for services to stop: done.
clurgmgrd is stopped.
done.
Stopping cluster manager
 Stopping Quorum Disk daemon: done
 Leaving fence domain: done
 Stopping daemons: gfs_controld dlm_controld fenced[691123.417539] dlm: closing connection to node 3
[691123.417628] dlm: closing connection to node 2
[691123.417699] dlm: closing connection to node 1
[691123.417751] dlm: closing connection to node 4
openais[4333]: cman killed by node 1 because we were killed by cman_tool or other application

 groupd
 Leaving the cluster:cman_tool: Cannot open connection to cman, is it running ?
 done
 Stopping cluster configuration system: done
 Unmounting config filesystem: done
[691186.940246] BUG: soft lockup - CPU#0 stuck for 61s! [dlm_send:4371]
[691186.940246] Modules linked in: xt_tcpudp xt_physdev iptable_filter 
ip_tables x_tables bridge netloop nfsd lockd nfs_acl auth_rpcgss sunrpc 
exportfs mptctl mptbase sg ipmi_devintf sctp libcrc32c ipv6 dlm configfs 
ext3 jbd mbcache loop ipmi_si psmouse container hpilo ipmi_msghandler 
pcspkr i5000_edac rng_core serio_raw button edac_core shpchp pci_hotplug 
evdev xfs dm_mirror dm_log dm_snapshot dm_round_robin dm_emc 
dm_multipath dm_mod ide_cd_mod cdrom ata_generic libata dock sd_mod 
usbhid hid ff_memless piix bnx2 ide_pci_generic ide_core ehci_hcd 
uhci_hcd e1000e qla2xxx firmware_class cciss scsi_transport_fc scsi_tgt 
scsi_mod thermal processor fan thermal_sys [last unloaded: gfs2]
[691186.940246] CPU 0:
[691186.940246] Modules linked in: xt_tcpudp xt_physdev iptable_filter 
ip_tables x_tables bridge netloop nfsd lockd nfs_acl auth_rpcgss sunrpc 
exportfs mptctl mptbase sg ipmi_devintf sctp libcrc32c ipv6 dlm configfs 
ext3 jbd mbcache loop ipmi_si psmouse container hpilo ipmi_msghandler 
pcspkr i5000_edac rng_core serio_raw button edac_core shpchp pci_hotplug 
evdev xfs dm_mirror dm_log dm_snapshot dm_round_robin dm_emc 
dm_multipath dm_mod ide_cd_mod cdrom ata_generic libata dock sd_mod 
usbhid hid ff_memless piix bnx2 ide_pci_generic ide_core ehci_hcd 
uhci_hcd e1000e qla2xxx firmware_class cciss scsi_transport_fc scsi_tgt 
scsi_mod thermal processor fan thermal_sys [last unloaded: gfs2]
[691186.940246] Pid: 4371, comm: dlm_send Not tainted 2.6.26-2-xen-amd64 #1
[691186.940246] RIP: e030:[<ffffffffa02dc50f>]  [<ffffffffa02dc50f>] :dlm:tcp_connect_to_sock+0x0/0x1de
[691186.940246] RSP: e02b:ffff8800efe0fe68  EFLAGS: 00000296
[691186.940246] RAX: 00000000ffffffff RBX: ffff8800f15a6ba0 RCX: 0000000000000000
[691186.940246] RDX: 0000000000005f5f RSI: 0000000000000000 RDI: ffff8800f15a6b00
[691186.940246] RBP: ffff8800f15a6b00 R08: ffff8800f0a5b0c8 R09: ffff8800f0a5b0c0
[691186.940246] R10: ffff8800f0a5b0c0 R11: ffff8800f0a5b0c8 R12: ffff8800f15a6ba0
[691186.940246] R13: ffffffffa02dcac7 R14: ffffffff8057d1c0 R15: 0000000000000000
[691186.940246] FS:  00007fcb1fa26770(0000) GS:ffffffff8053a000(0000) knlGS:0000000000000000
[691186.940246] CS:  e033 DS: 0000 ES: 0000
[691186.940246] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[691186.940246] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[691186.940246]
[691186.940246] Call Trace:
[691186.940246]  [<ffffffffa02dcaf1>] ? :dlm:process_send_sockets+0x2a/0x1d4
[691186.940246]  [<ffffffffa02dcac7>] ? :dlm:process_send_sockets+0x0/0x1d4
[691186.940246]  [<ffffffff8023c160>] ? run_workqueue+0xbe/0x189
[691186.940246]  [<ffffffff8023cb49>] ? worker_thread+0xd5/0xe0
[691186.940246]  [<ffffffff8023f4d5>] ? autoremove_wake_function+0x0/0x2e
[691186.940246]  [<ffffffff8023ca74>] ? worker_thread+0x0/0xe0
[691186.940246]  [<ffffffff8023f3a7>] ? kthread+0x47/0x74
[691186.940246]  [<ffffffff8022816c>] ? schedule_tail+0x27/0x5c
[691186.940246]  [<ffffffff8020be18>] ? child_rip+0xa/0x12
[691186.940246]  [<ffffffff8023f360>] ? kthread+0x0/0x74
[691186.940246]  [<ffffffff8020be0e>] ? child_rip+0x0/0x12
[691186.940246] 
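
In case it helps anyone checking whether their own nodes hit the same
lockup, here is a rough sketch of how the "BUG: soft lockup" header could
be picked out of a captured console log. The regular expression is only
modelled on the single header line shown above, and the function name
find_soft_lockups is made up for this illustration; other kernel versions
may word the message differently.

```python
import re

# Pattern modelled on the one header line in the trace above, e.g.
# "[691186.940246] BUG: soft lockup - CPU#0 stuck for 61s! [dlm_send:4371]"
# (assumption: other kernels may format this message differently).
SOFT_LOCKUP = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+BUG: soft lockup - CPU#(?P<cpu>\d+) "
    r"stuck for (?P<secs>\d+)s! \[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)


def find_soft_lockups(log_text):
    """Return one dict per soft-lockup header found in log_text."""
    hits = []
    for m in SOFT_LOCKUP.finditer(log_text):
        hits.append({
            "timestamp": float(m.group("ts")),  # kernel uptime in seconds
            "cpu": int(m.group("cpu")),
            "seconds": int(m.group("secs")),    # how long the CPU was stuck
            "comm": m.group("comm"),            # name of the stuck task
            "pid": int(m.group("pid")),
        })
    return hits


if __name__ == "__main__":
    sample = (
        "[691186.940246] BUG: soft lockup - CPU#0 stuck for 61s! "
        "[dlm_send:4371]"
    )
    for hit in find_soft_lockups(sample):
        print(hit["comm"], hit["pid"], hit["seconds"])
```

Feeding it a saved serial console capture and filtering on comm ==
"dlm_send" should be enough to tell whether a node stalled in the same
worker thread as in this report.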


-- System Information:
Debian Release: 5.0.4
  APT prefers stable
  APT policy: (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.26-2-xen-amd64 (SMP w/8 CPU cores)
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages redhat-cluster-suite depends on:
ii  clvm                 2.02.39-7           Cluster LVM Daemon for lvm2
ii  cman                 2.20081102-1+lenny1 Red Hat cluster suite - cluster ma
ii  gfs-tools            2.20081102-1+lenny1 Red Hat cluster suite - global fil
ii  gfs2-tools           2.20081102-1+lenny1 Red Hat cluster suite - global fil
ii  rgmanager            2.20081102-1+lenny1 Red Hat cluster suite - clustered 

redhat-cluster-suite recommends no packages.

redhat-cluster-suite suggests no packages.

-- no debconf information




