Another patch to fix the -lx timeout code. Some actions spawn
sub-processes after bjam forks a new process (for example, after
g++ is forked by bjam, g++ then forks sub-processes like cc1plus).
The timeout code would kill the g++ process, but might not kill
the sub-processes spawned by g++.

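I fixed this problem by making the process that bjam forks (g++) a
session leader by calling setsid() before calling exec.  The setsid
call also makes that process the leader of a new process group whose
id equals the g++ process id, and processes spawned by g++ join that
group by default.  Passing the negated pid to kill, as in
kill(-pid, SIGKILL), then signals every process in the group, so
killing g++ also kills its sub-processes (unless one of them has
moved itself into a different process group).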
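
The approach can be sketched in isolation roughly as follows. This is
a minimal illustration, not the bjam code itself: the helper names
spawn_in_new_session and kill_group_and_reap are hypothetical, and the
sketch assumes the child has had time to call setsid() before the
parent signals the group.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] in a new session.
 * Returns the child's pid in the parent, or -1 if fork fails. */
static pid_t spawn_in_new_session(char *const argv[])
{
    pid_t pid = fork();
    if (pid == 0) {
        /* Child: become session leader and process group leader,
         * so the new process group id equals this process's pid. */
        setsid();
        execvp(argv[0], argv);
        _exit(127); /* only reached if exec fails */
    }
    return pid;
}

/* Hypothetical helper: SIGKILL every process in the group led by pid,
 * then reap the leader. Returns the signal that terminated the
 * leader, or -1 on error or if it was not killed by a signal.
 * Assumes the child has already called setsid(); before that point
 * no process group with this id exists and kill(-pid, ...) fails. */
static int kill_group_and_reap(pid_t pid)
{
    int status = 0;
    if (kill(-pid, SIGKILL) != 0)
        return -1;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

For example, spawning sh -c "sleep 30 & wait" this way and then calling
kill_group_and_reap on the returned pid should terminate both the
shell and the background sleep, since the sleep inherits the shell's
process group.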

One last comment on how long a process can run before it is actually
killed.  With -lx, the worst-case elapsed time is close to 2x seconds.
The reason is that timeouts are only checked between calls to select: a
process might be one second away from being killed when select is called
and, if there is no other I/O or signal activity, select will wait x
seconds before timing out and killing any expired processes.  So if you
say -lx and monitor a build known to have lengthy processes, you may see
a process run for up to 2x seconds before it is killed.
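
The arithmetic behind that bound can be checked with a small model.
This is only a sketch under stated assumptions: kill_time is a
hypothetical function, time is counted in whole seconds, the timeout
is positive, and select is assumed to sleep for the full timeout
because no I/O or signal arrives.

```c
/* Hypothetical model of the timeout check.
 * timeout: the x given with -lx, in seconds (assumed > 0).
 * first_check_elapsed: how long the process has already run when
 * the timeout is first checked.
 * Returns the elapsed time at which the process is killed. */
static long kill_time(long timeout, long first_check_elapsed)
{
    long elapsed = first_check_elapsed;
    /* The process is killed once timeout <= elapsed. */
    while (elapsed < timeout) {
        /* Not expired yet: select sleeps a full timeout before the
         * next check because nothing else wakes it up. */
        elapsed += timeout;
    }
    return elapsed;
}
```

With a timeout of 10, a process that has run for 9 seconds at the
first check is killed at 19 seconds in this model, while one that has
already run for 10 seconds is killed immediately.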



[SVN r39467]
K. Noel Belcourt 2007-09-21 22:38:17 +00:00
parent fcd347ebde
commit 27447ae6fa


@@ -211,6 +211,17 @@ execcmd(
     else
         dup2(err[1], STDERR_FILENO);
+    /* Make this process a session leader
+     * so that when we kill it, all child
+     * processes of this process are terminated
+     * as well.
+     *
+     * we use kill(-pid, SIGKILL) to kill the
+     * session leader and all children of this
+     * session.
+     */
+    setsid();
     execvp( argv[0], argv );
     _exit(127);
 }
@@ -367,7 +378,7 @@ void populate_file_descriptors(int *fmax, fd_set *fds)
     struct tms buf;
     clock_t current = times(&buf);
     if (globs.timeout <= (current-cmdtab[i].start_time)/tps) {
-        kill(cmdtab[i].pid, SIGKILL);
+        kill(-cmdtab[i].pid, SIGKILL);
         cmdtab[i].exit_reason = EXIT_TIMEOUT;
     }
 }
@@ -408,16 +419,6 @@ execwait()
         /* select will wait until io on a descriptor or a signal */
         ret = select(fd_max+1, &fds, 0, 0, &tv);
-        if (0 == ret) {
-            /* select timed out, all processes have expired, kill them */
-            for (i=0; i<globs.jobs; ++i) {
-                cmdtab[i].start_time = 0;
-            }
-            /* select will wait until io on a descriptor or a signal */
-            populate_file_descriptors(&fd_max, &fds);
-            ret = select(fd_max+1, &fds, 0, 0, 0);
-        }
     }
     else {
         /* select will wait until io on a descriptor or a signal */