[Git][qa/jenkins.debian.net][master] 2 commits: djm: improve UX when rebooting a node fails

Holger Levsen (@holger) gitlab at salsa.debian.org
Wed Jul 12 13:08:16 BST 2023



Holger Levsen pushed to branch master at Debian QA / jenkins.debian.net


Commits:
1289b81d by Holger Levsen at 2023-07-12T11:43:40+02:00
djm: improve UX when rebooting a node fails

Signed-off-by: Holger Levsen <holger at layer-acht.org>

- - - - -
8d82b529 by Holger Levsen at 2023-07-12T14:07:16+02:00
reproducible system health: ignore less than 10 unkillable zombies.

this just happens (and could be mitigated with more isolation I guess)
but is also harmless and can only be fixed by rebooting the node in question.

Signed-off-by: Holger Levsen <holger at layer-acht.org>

- - - - -


4 changed files:

- bin/djm
- bin/reproducible_maintenance.sh
- bin/reproducible_system_health.sh
- logparse/reproducible.rules


Changes:

=====================================
bin/djm
=====================================
@@ -552,7 +552,7 @@ djm_do() {
 		# action
 		#
 		case $ACTION in
-			reboot)	( ssh $NODE "sudo reboot || ( echo press enter ; read a ) " || true ) & sleep 1
+			reboot)	( ssh $NODE "sudo reboot" || xterm -T "$SHORTNODE / $ACTION failed" -class deploy-jenkins -bg $BG -fa 'DejaVuSansMono' -fs 10 -e "echo -e 'ssh to $NODE failed, thus rebooting failed.\n\npress enter to continue' ; read a " )
 				run_xterm2wait4node_comeback
 				;;
 			powercycle)	case $SHORTNODE in
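
The reboot change above follows a "run the command, and only on failure tell the operator" pattern. Below is a minimal sketch of that pattern, not the djm implementation: `reboot_node` and `report_failure` are hypothetical names, and `report_failure` only prints the message where djm opens an xterm and waits for enter.

```shell
#!/bin/sh
# Hypothetical sketch, not taken from bin/djm.

# report_failure NODE: stands in for the xterm window and only prints the message.
report_failure() {
	echo "ssh to $1 failed, thus rebooting failed."
}

# reboot_node NODE [CMD...]: runs CMD (default: ssh NODE "sudo reboot") and
# calls report_failure only when that command exits non-zero.
reboot_node() {
	node=$1
	shift
	if [ $# -eq 0 ] ; then
		set -- ssh "$node" "sudo reboot"
	fi
	"$@" || report_failure "$node"
}
```

Passing an explicit command, for example `reboot_node examplenode false`, exercises the failure path without contacting any host.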


=====================================
bin/reproducible_maintenance.sh
=====================================
@@ -743,7 +743,11 @@ for i in $PBUIDS ; do
 	done
 done
 if [ -n "$PSCALL" ] ; then
-	echo -e "Warning: processes found which should not be there and which could not be killed. Please fix manually:"
+	if [ $(ps -F -p "$PSCALL" | wc -l) -lt 10 ] ; then
+		echo "Info: ignoring less than ten processes found which should not be there and which could not be killed, because those are probably just a few harmless zombies, which can only be removed by rebooting...."
+	else
+		 echo "Warning: found more than ten processes which should not be there and which could not be killed. Please investigate and reboot or ignore them...:"
+	fi
 	ps -F -p "$PSCALL"
 	echo
 fi
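
A note on the check above: `ps -F -p` normally prints a column header in addition to one line per process, so `wc -l` yields the process count plus one, and `-lt 10` would then hold for at most eight listed processes. The sketch below counts only the process rows and classifies the result. It is an illustration under that assumption, not the script's code: `count_listed_processes` and `classify_leftovers` are hypothetical helpers, and the default limit of 10 is taken from the commit subject.

```shell
#!/bin/sh
# Hypothetical helpers, not part of reproducible_maintenance.sh.

# count_listed_processes: reads ps-style output on stdin and prints the
# number of process rows, excluding the single header line.
count_listed_processes() {
	# tail -n +2 skips the first line (the ps column header);
	# tr strips the padding some wc implementations add.
	tail -n +2 | wc -l | tr -d ' '
}

# classify_leftovers COUNT [LIMIT]: prints "info" for fewer than LIMIT
# processes and "warning" otherwise. LIMIT defaults to 10 (assumption).
classify_leftovers() {
	count=$1
	limit=${2:-10}
	if [ "$count" -lt "$limit" ] ; then
		echo info
	else
		echo warning
	fi
}
```

With these, the decision could be written as `classify_leftovers "$(ps -F -p "$PSCALL" | count_listed_processes)"`, which treats exactly ten processes as a warning.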


=====================================
bin/reproducible_system_health.sh
=====================================
@@ -178,7 +178,7 @@ for JOB_NAME in $(ls -1d reproducible_* | sort ) ; do
 			small_note "session failed for user jenkins"
 		elif $(grep -q "etckeeper.service loaded failed" $LOG) ; then
 			small_note "etckeeper.service problem, manual intervention required"
-		elif $(grep -E -q "^Warning: processes found which should not be there and which could not be killed." $LOG) ; then
+		elif $(grep -E -q "^Warning: found more than ten processes which should not be there" $LOG) ; then
 			small_note "unkillable unwanted processes"
 		elif $(grep -q "failed failed pbuilder_build" $LOG) ; then
 			small_note "pbuilder build scope failed"


=====================================
logparse/reproducible.rules
=====================================
@@ -7,7 +7,7 @@ warning /Warning: .+ contains invalid yaml, please fix./
 warning /Warning: lock .+ still exists, exiting./
 warning /^Warning: failed to end schroot session:/
 warning /Warning: Tried, but failed to delete these/
-warning /Warning: processes found which should not be there/
+warning /Warning: found more than ten processes which should not be there/
 warning /Warning: found reproducible_build.sh processes which have pid 1 as parent.+/
 warning /Warning: Found files with bad permissions.+/
 warning /Warning: .+ could not be fully removed.+/
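
The pattern changes in reproducible_system_health.sh and reproducible.rules mean that only the new "more than ten" warning is flagged, while the new "Info:" line and the old warning text no longer match. A small sketch to check this against sample lines follows. `is_flagged` is a hypothetical helper, and it reuses the anchored `grep -E` expression from the health script.

```shell
#!/bin/sh
# Hypothetical check, not part of the repository.

# is_flagged LINE: returns 0 if LINE matches the updated warning pattern,
# non-zero otherwise.
is_flagged() {
	printf '%s\n' "$1" | grep -E -q "^Warning: found more than ten processes which should not be there"
}
```

For example, `is_flagged "Info: ignoring less than ten processes found which should not be there"` returns non-zero, so such a log line would not produce the "unkillable unwanted processes" note.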



View it on GitLab: https://salsa.debian.org/qa/jenkins.debian.net/-/compare/3138d6fa96a043ba05bff226bf0a5993f5001d18...8d82b52996729618201d86b4e601b895a58e4661


