Bug#539154: mdadm monitor spins with start-failed raid0

Jeff DeFouw jeffd at i2k.com
Mon Jun 28 06:53:46 UTC 2010


I figured out what's causing this.  I sent the following message to the 
linux-raid mailing list.  The included patch may not be complete, because 
it may cause an undesired change of behavior.

You can reproduce the problem by creating a raid0 with two disks, 
stopping it, and then assembling it with "--run" and only one disk.  
It will try to start but fail.  In this start-failed state, it will be 
processed by mdadm --monitor --scan in a way that causes it to spin 
forever.  If that's not enough, I have a script that reproduces the 
problem using loopback devices.

----- Forwarded message from Jeff DeFouw <jeffd at i2k.com> -----

Date: Mon, 28 Jun 2010 02:34:33 -0400
From: Jeff DeFouw <jeffd at i2k.com>
To: linux-raid at vger.kernel.org
Subject: mdadm monitor spins with start-failed raid0

mdadm --monitor --scan (--oneshot) spins indefinitely without sleeping 
when an "inactive" start-failed raid0 or linear array is found in 
/proc/mdstat.  By "start-failed" I mean something attempts to 
(automatically) assemble and start the array, but the array fails to 
start.  In my case, an old raid0 is missing a disk.  The mdstat parser 
assumes all entries have a personality string, but "inactive" arrays 
don't.

md0 : inactive sda3[0]
      2915712 blocks

The first disk (sda3[0] in this case) is copied as the level string.  
Because that bogus level is neither "raid0" nor "linear", the array gets 
into the statelist, where the statelist loop immediately rejects it.  The 
rejection occurs without marking the mdstat entry as used, so the array 
is seen as a new entry again, the sleep/break is skipped, a new duplicate 
state is added to the statelist, and the loop starts again immediately.
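
To make the parsing mistake concrete, here is a minimal, self-contained 
sketch. It is not the mdadm source: struct ent and parse_status() are 
hypothetical, cut-down stand-ins for struct mdstat_ent and the word loop 
in mdstat_read(), and the "patched" flag is only there to compare the old 
and new conditions side by side.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for mdadm's struct mdstat_ent. */
struct ent {
	int active;      /* -1 unknown, 0 inactive, 1 active */
	char level[32];  /* empty string means no level was recorded */
};

/*
 * Walk the words that follow "mdX :" on an mdstat line.
 * patched == 0: level is taken when active >= 0 (old condition).
 * patched != 0: level is taken only when active > 0, and "inactive"
 *               switches straight to the device list (new condition).
 */
void parse_status(const char *line, int patched, struct ent *e)
{
	char buf[128];
	char *w;
	int in_devs = 0;

	e->active = -1;
	e->level[0] = '\0';
	snprintf(buf, sizeof(buf), "%s", line);

	for (w = strtok(buf, " "); w != NULL; w = strtok(NULL, " ")) {
		if (in_devs)
			continue;	/* device words are ignored in this sketch */
		if (strcmp(w, "active") == 0) {
			e->active = 1;
		} else if (strcmp(w, "inactive") == 0) {
			e->active = 0;
			if (patched)
				in_devs = 1;	/* no personality follows "inactive" */
		} else if ((patched ? e->active > 0 : e->active >= 0) &&
			   e->level[0] == '\0' &&
			   w[0] != '(' /*readonly*/) {
			snprintf(e->level, sizeof(e->level), "%s", w);
		}
	}
}
```

Feeding it "inactive sda3[0]" with patched == 0 leaves "sda3[0]" in the 
level field; with patched != 0 the level stays empty, while a line such 
as "active raid1 sdb1[1] sda1[0]" still yields "raid1".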

Fixing the parser is simple, but fixing it leads to Monitor ignoring ALL 
inactive arrays discovered by mdstat.  This is because the mdstat loop 
requires a level string.  If Monitor should process mdstat-discovered 
start-failed arrays (as it currently does), then either the level will 
have to be checked using GET_ARRAY_INFO, or raid0/linear arrays will 
have to be rejected later.
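
The side effect can be sketched as a single predicate. This is only an 
approximation of the test the mdstat loop applies to each entry, written 
from the behavior described above; monitor_would_add() is a hypothetical 
name, not a function in mdadm.

```c
#include <stddef.h>
#include <string.h>

/*
 * Assumed shape of the mdstat-loop filter: an entry is added to the
 * statelist only if it has a level string and that level is neither
 * "raid0" nor "linear".
 */
int monitor_would_add(const char *level)
{
	return level != NULL &&
	       strcmp(level, "raid0") != 0 &&
	       strcmp(level, "linear") != 0;
}
```

With the unfixed parser an inactive array arrives with a level such as 
"sda3[0]" and passes the filter.  With the fixed parser the level is NULL, 
so every inactive array is skipped, whatever its real personality is.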

This patch only shows how to fix the parser.

---
 mdstat.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/mdstat.c b/mdstat.c
index 4a9f370..fdca877 100644
--- a/mdstat.c
+++ b/mdstat.c
@@ -168,9 +168,10 @@ struct mdstat_ent *mdstat_read(int hold, int start)
 			char *eq;
 			if (strcmp(w, "active")==0)
 				ent->active = 1;
-			else if (strcmp(w, "inactive")==0)
+			else if (strcmp(w, "inactive")==0) {
 				ent->active = 0;
-			else if (ent->active >=0 &&
+				in_devs = 1;
+			} else if (ent->active > 0 &&
 				 ent->level == NULL &&
 				 w[0] != '(' /*readonly*/) {
 				ent->level = strdup(w);
-- 
1.7.1

-- 
Jeff DeFouw <jeffd at i2k.com>

----- End forwarded message -----

-- 
Jeff DeFouw <jeffd at i2k.com>




