<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
</head>
<body bgcolor="#FFFFFF" text="#000000">
tags -1 +patch<br>
thanks<br>
<br>
I believe I have tracked down the problem. The changes introduced
in fec9e97c51b3a8ff226a4b3b2b0563a4a680ac68 cause a file object to
be manipulated on both sides of a thread boundary, which seems
dangerous. The attached patch restructures things to avoid that
problem, and a few other implementation issues. The commit message
has more details about the rationale.<br>
<br>
I've also attached a script I used to help reproduce the issue. It
doesn't do so reliably, and it's not totally robust itself, but I
found it more reliable than trying to time timeout or a ^C right.<br>
<br>
fec9d97 was meant to improve portability. It's possible that my
patch has its own portability problems: it opens a FIFO in
nonblocking mode, and fifo(7) makes it sound like this behavior is
undefined by POSIX. We might want to check with some BSD folks to
see what they do in that case. But even if the patch doesn't work
as intended outside Linux, hopefully fixing that is a matter of
checking for more error codes or something like that, and won't
require as much restructuring as fec9d97 did.<br>
<br>
I should note that the cleanup that we do after receiving a signal
is a little incomplete in the current state, too. What happens now
is: when we get a signal, we raise a SystemExit exception in the
main thread. That bubbles all the way back up to
diffoscope.main.main, which cleans up all the temporary files and
directories in a finally block, and then the Python process exits.
However, subprocesses and all of our threads (which are marked as
daemon threads) are still running, and have all the file descriptors
open. So the files still "exist" in some sense and are taking up
space, even if they're no longer addressable. We're also still
spending plenty of CPU doing the real diff work. Eventually the
processes will all go away because they finished their work
naturally, or because they encounter an error because their working
directories have disappeared out from under them.<br>
<br>
It would be nice to move to a system where, when we get a signal, we
send it to all our children, then wait for them to finish before
doing our own cleanup. But that's a bigger job and less urgent, so
I didn't go that far here.<br>
<br>
<pre class="moz-signature" cols="72">--
Brett Smith</pre>
</body>
</html>