* Synchronization problem with posix_spawn @ 2020-07-29 20:17 Ken Brown 2020-07-29 23:12 ` Ken Brown 0 siblings, 1 reply; 11+ messages in thread From: Ken Brown @ 2020-07-29 20:17 UTC (permalink / raw) To: cygwin [-- Attachment #1: Type: text/plain, Size: 1179 bytes --] posix_spawn(p) returns before the spawned process is fully up and running. As a result, the spawned process can fail to receive signals. The attached test case illustrates the problem. It spawns a sleep process and then tries to kill it. On exit, the sleep process is still running. The following excerpts from the strace output show the issue: The SIGTERM signal is sent after the main program has forked a subprocess (and posix_spawnp has returned), but before the forked subprocess has exec'd the sleep process: 559 32069 [main] spawn_test 4125 vfork: stub called 257 48437 [main] spawn_test 4125 dofork: 4126 = fork() 754 9511 [main] spawn_test 4126 dofork: 0 = fork() 66 48503 [main] spawn_test 4125 kill0: kill (4126, 15) 44 9555 [main] spawn_test 4126 find_exec: find_exec (/usr/bin/sleep) 42 10835 [main] spawn_test 4126 spawnve: spawnve (/usr/bin/sleep, sleep, 0x8000281A0) 45 3149 [main] sleep 4126 child_info::ready: signalled 0x164 that I was ready 6475 21055 [main] spawn_test 4126! child_info::sync: pid 45028, WFMO returned 0, exit_code 0x103, res 1 --- Process 45028 (pid: 4126) thread 41444 created Ken [-- Attachment #2: spawn_test.c --] [-- Type: text/plain, Size: 469 bytes --] #include <stdlib.h> #include <stdio.h> #include <string.h> #include <spawn.h> #include <signal.h> extern char **environ; pid_t run_sleep () { pid_t pid; char *argv[] = { "sleep", "3600", NULL }; int err = posix_spawnp (&pid, "/usr/bin/sleep", NULL, NULL, argv, environ); if (err == 0) return pid; else { printf ("posix_spawnp: %s\n", strerror (err)); exit (1); } } int main () { pid_t pid = run_sleep (); kill (pid, SIGTERM); } ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Synchronization problem with posix_spawn 2020-07-29 20:17 Synchronization problem with posix_spawn Ken Brown @ 2020-07-29 23:12 ` Ken Brown 2020-07-30 11:59 ` Corinna Vinschen 0 siblings, 1 reply; 11+ messages in thread From: Ken Brown @ 2020-07-29 23:12 UTC (permalink / raw) To: cygwin On 7/29/2020 4:17 PM, Ken Brown via Cygwin wrote: > posix_spawn(p) returns before the spawned process is fully up and running. As a > result, the spawned process can fail to receive signals. The attached test case > illustrates the problem. It spawns a sleep process and then tries to kill it. > On exit, the sleep process is still running. > > The following excerpts from the strace output show the issue: The SIGTERM signal > is sent after the main program has forked a subprocess (and posix_spawnp has > returned), but before the forked subprocess has exec'd the sleep process: > > 559 32069 [main] spawn_test 4125 vfork: stub called > 257 48437 [main] spawn_test 4125 dofork: 4126 = fork() > 754 9511 [main] spawn_test 4126 dofork: 0 = fork() > 66 48503 [main] spawn_test 4125 kill0: kill (4126, 15) > 44 9555 [main] spawn_test 4126 find_exec: find_exec (/usr/bin/sleep) > 42 10835 [main] spawn_test 4126 spawnve: spawnve (/usr/bin/sleep, sleep, > 0x8000281A0) > 45 3149 [main] sleep 4126 child_info::ready: signalled 0x164 that I was > ready > 6475 21055 [main] spawn_test 4126! child_info::sync: pid 45028, WFMO > returned 0, exit_code 0x103, res 1 > --- Process 45028 (pid: 4126) thread 41444 created I just took a look at the source, and I see that posix_spawn was taken from FreeBSD. Does FreeBSD have the same problem? Should applications just be aware of this issue and insert a sleep after posix_spawn before sending signals? Ken ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Synchronization problem with posix_spawn 2020-07-29 23:12 ` Ken Brown @ 2020-07-30 11:59 ` Corinna Vinschen 2020-07-30 17:17 ` Corinna Vinschen 0 siblings, 1 reply; 11+ messages in thread From: Corinna Vinschen @ 2020-07-30 11:59 UTC (permalink / raw) To: cygwin On Jul 29 19:12, Ken Brown via Cygwin wrote: > On 7/29/2020 4:17 PM, Ken Brown via Cygwin wrote: > > posix_spawn(p) returns before the spawned process is fully up and > > running. [...] > I just took a look at the source, and I see that posix_spawn was taken from > FreeBSD. Does FreeBSD have the same problem? Should applications just be > aware of this issue and insert a sleep after posix_spawn before sending > signals? Actually, this is a Cygwin problem. I just had a look into the newlib implementation myself, and it turns out that the code, in particular the do_posix_spawn() function, is BSD specific. It relies on the BSD implementation of vfork(2). Cygwin's vfork(2) on the other hand, is NOT following the historic idea of the BSD vfork(2), rather it's equivalent to fork(2). This is POSIX compliant, but certainly the reliance of the BSD vfork behaviour makes do_posix_spawn() as it's implemented right now, not overly functional for Cygwin. IOW, we need a Cygwin-specific do_posix_spawn() using fork(2) in conjunction with some synchronization the BSD function gets "for free" by using its specific vfork(2). Corinna -- Corinna Vinschen Cygwin Maintainer ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Synchronization problem with posix_spawn 2020-07-30 11:59 ` Corinna Vinschen @ 2020-07-30 17:17 ` Corinna Vinschen 2020-07-30 23:04 ` Ken Brown 0 siblings, 1 reply; 11+ messages in thread From: Corinna Vinschen @ 2020-07-30 17:17 UTC (permalink / raw) To: Ken Brown; +Cc: cygwin Hi Ken, On Jul 30 13:59, Corinna Vinschen wrote: > On Jul 29 19:12, Ken Brown via Cygwin wrote: > > On 7/29/2020 4:17 PM, Ken Brown via Cygwin wrote: > > > posix_spawn(p) returns before the spawned process is fully up and > > > running. [...] > > I just took a look at the source, and I see that posix_spawn was taken from > > FreeBSD. Does FreeBSD have the same problem? Should applications just be > > aware of this issue and insert a sleep after posix_spawn before sending > > signals? > > Actually, this is a Cygwin problem. I just had a look into the > newlib implementation myself, and it turns out that the code, > in particular the do_posix_spawn() function, is BSD specific. It > relies on the BSD implementation of vfork(2). Cygwin's vfork(2) > on the other hand, is NOT following the historic idea of the > BSD vfork(2), rather it's equivalent to fork(2). This is > POSIX compliant, but certainly the reliance of the BSD vfork > behaviour makes do_posix_spawn() as it's implemented right now, > not overly functional for Cygwin. > > IOW, we need a Cygwin-specific do_posix_spawn() using fork(2) > in conjunction with some synchronization the BSD function > gets "for free" by using its specific vfork(2). Below is a POC implementation for a Cygwin-specific do_posix_spawn(). If this does the trick (at least your testcase works in my testing), then I'm planning to move the function over to the winsup/cygwin dir so it can be streamlined further. Can you give it a try? Thanks, Corinna diff --git a/newlib/libc/posix/posix_spawn.c b/newlib/libc/posix/posix_spawn.c index 19c5cd0fe986..3adbac29d7fd 100644 --- a/newlib/libc/posix/posix_spawn.c +++ b/newlib/libc/posix/posix_spawn.c @@ -254,6 +254,82 @@ process_file_actions(const posix_spawn_file_actions_t fa) return (0); } +#ifdef __CYGWIN__ +#include <windows.h> +#include <sys/mman.h> +#include <process.h> + +typedef struct +{ + pid_t pid; + int error; +} child_info; + +static int +do_posix_spawn(pid_t *pid, const char *path, + const posix_spawn_file_actions_t *fa, + const posix_spawnattr_t *sa, + char * const argv[], char * const envp[], int use_env_path) +{ + pid_t p; + int ret = 0; + child_info *ci; + + ci = (child_info *) mmap(NULL, sizeof(child_info), + PROT_READ | PROT_WRITE, + MAP_SHARED | MAP_ANONYMOUS, -1, 0); + if (ci == MAP_FAILED) + return (errno); + + /* Cygwin's vfork does not follow BSD vfork semantics. Rather it's + equivalent to fork. While that's POSIX compliant, it means the + below FreeBSD implementation relying on BSD vfork semantics + doesn't work as expected on Cygwin. */ + p = fork(); + switch (p) { + case -1: + ret = errno; + break; + case 0: + if (sa != NULL) { + ci->error = process_spawnattr(*sa); + if (ci->error) + _exit(127); + } + if (fa != NULL) { + ci->error = process_file_actions(*fa); + if (ci->error) + _exit(127); + } + if (use_env_path) + p = spawnvpe(_P_NOWAITO, path, + (const char * const *) argv, + (const char * const *) + (envp != NULL ? envp : *p_environ)); + else + p = spawnve(_P_NOWAITO, path, + (const char * const *) argv, + (const char * const *) + (envp != NULL ? envp : *p_environ)); + if (p < 0) { + ci->error = errno; + _exit(127); + } + ci->pid = p; + _exit (0); + default: + if (waitpid(p, NULL, 0) < 0) + ret = errno; + else if (ci->error) + ret = ci->error; + else if (pid != NULL) + *pid = ci->pid; + break; + } + munmap(ci, sizeof(child_info)); + return ret; +} +#else static int do_posix_spawn(pid_t *pid, const char *path, const posix_spawn_file_actions_t *fa, @@ -292,6 +368,7 @@ do_posix_spawn(pid_t *pid, const char *path, return (error); } } +#endif int posix_spawn (pid_t *pid, -- Corinna Vinschen Cygwin Maintainer ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Synchronization problem with posix_spawn 2020-07-30 17:17 ` Corinna Vinschen @ 2020-07-30 23:04 ` Ken Brown 2020-07-31 8:10 ` Corinna Vinschen 0 siblings, 1 reply; 11+ messages in thread From: Ken Brown @ 2020-07-30 23:04 UTC (permalink / raw) To: cygwin [-- Attachment #1: Type: text/plain, Size: 2220 bytes --] Hi Corinna, On 7/30/2020 1:17 PM, Corinna Vinschen wrote: > Hi Ken, > > On Jul 30 13:59, Corinna Vinschen wrote: >> On Jul 29 19:12, Ken Brown via Cygwin wrote: >>> On 7/29/2020 4:17 PM, Ken Brown via Cygwin wrote: >>>> posix_spawn(p) returns before the spawned process is fully up and >>>> running. [...] >>> I just took a look at the source, and I see that posix_spawn was taken from >>> FreeBSD. Does FreeBSD have the same problem? Should applications just be >>> aware of this issue and insert a sleep after posix_spawn before sending >>> signals? >> >> Actually, this is a Cygwin problem. I just had a look into the >> newlib implementation myself, and it turns out that the code, >> in particular the do_posix_spawn() function, is BSD specific. It >> relies on the BSD implementation of vfork(2). Cygwin's vfork(2) >> on the other hand, is NOT following the historic idea of the >> BSD vfork(2), rather it's equivalent to fork(2). This is >> POSIX compliant, but certainly the reliance of the BSD vfork >> behaviour makes do_posix_spawn() as it's implemented right now, >> not overly functional for Cygwin. >> >> IOW, we need a Cygwin-specific do_posix_spawn() using fork(2) >> in conjunction with some synchronization the BSD function >> gets "for free" by using its specific vfork(2). > > Below is a POC implementation for a Cygwin-specific do_posix_spawn(). > If this does the trick (at least your testcase works in my testing), > then I'm planning to move the function over to the winsup/cygwin dir > so it can be streamlined further. > > Can you give it a try? It looks like something further is needed: 'wait' doesn't seem to recognize the spawned process. I'm attaching a second test case. This spawns a "sleep 10" process and then waits for termination. If I run it on Cygwin, it returns immediately like this: $ ./spawn_test2 wait: No child processes sleep 10 exited normally with status 0 and the sleep process is still running. If I run it on Linux, I see $ ./spawn_test2 sleep 10 exited normally with status 0 if I let the sleep process run to completion, or $ ./spawn_test2 sleep 10 exited due to signal 15 if I run 'pkill sleep' in another terminal. Ken [-- Attachment #2: spawn_test2.c --] [-- Type: text/plain, Size: 762 bytes --] #include <stdlib.h> #include <stdio.h> #include <string.h> #include <spawn.h> #include <signal.h> #include <sys/wait.h> extern char **environ; pid_t run_sleep () { pid_t pid; char *argv[] = { "sleep", "10", NULL }; int err = posix_spawnp (&pid, "/bin/sleep", NULL, NULL, argv, environ); if (err == 0) return pid; else { printf ("posix_spawnp: %s\n", strerror (err)); exit (1); } } int main () { int status; pid_t pid = run_sleep (); if (wait (&status) < 0) perror ("wait"); if (WIFEXITED (status)) printf ("sleep 10 exited normally with status %d\n", WEXITSTATUS (status)); else if (WIFSIGNALED (status)) printf ("sleep 10 exited due to signal %d\n", WTERMSIG (status)); else printf ("????\n"); } ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Synchronization problem with posix_spawn 2020-07-30 23:04 ` Ken Brown @ 2020-07-31 8:10 ` Corinna Vinschen 2020-08-03 9:10 ` Peter Dons Tychsen 0 siblings, 1 reply; 11+ messages in thread From: Corinna Vinschen @ 2020-07-31 8:10 UTC (permalink / raw) To: cygwin On Jul 30 19:04, Ken Brown via Cygwin wrote: > Hi Corinna, > > On 7/30/2020 1:17 PM, Corinna Vinschen wrote: > > Hi Ken, > > > > On Jul 30 13:59, Corinna Vinschen wrote: > > > On Jul 29 19:12, Ken Brown via Cygwin wrote: > > > > On 7/29/2020 4:17 PM, Ken Brown via Cygwin wrote: > > > > > posix_spawn(p) returns before the spawned process is fully up and > > > > > running. [...] > > > > I just took a look at the source, and I see that posix_spawn was taken from > > > > FreeBSD. Does FreeBSD have the same problem? Should applications just be > > > > aware of this issue and insert a sleep after posix_spawn before sending > > > > signals? > > > > > > Actually, this is a Cygwin problem. I just had a look into the > > > newlib implementation myself, and it turns out that the code, > > > in particular the do_posix_spawn() function, is BSD specific. It > > > relies on the BSD implementation of vfork(2). Cygwin's vfork(2) > > > on the other hand, is NOT following the historic idea of the > > > BSD vfork(2), rather it's equivalent to fork(2). This is > > > POSIX compliant, but certainly the reliance of the BSD vfork > > > behaviour makes do_posix_spawn() as it's implemented right now, > > > not overly functional for Cygwin. > > > > > > IOW, we need a Cygwin-specific do_posix_spawn() using fork(2) > > > in conjunction with some synchronization the BSD function > > > gets "for free" by using its specific vfork(2). > > > > Below is a POC implementation for a Cygwin-specific do_posix_spawn(). > > If this does the trick (at least your testcase works in my testing), > > then I'm planning to move the function over to the winsup/cygwin dir > > so it can be streamlined further. > > > > Can you give it a try? > > It looks like something further is needed: 'wait' doesn't seem to recognize > the spawned process. Oh well. I did a quick test with your new testcase (thanks for that!) and it seems to be a bit more complicated than I anticipated yesterday. The parent-child relationship between the processes is broken. I have to think a while about this problem, stay tuned. Corinna -- Corinna Vinschen Cygwin Maintainer ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Synchronization problem with posix_spawn 2020-07-31 8:10 ` Corinna Vinschen @ 2020-08-03 9:10 ` Peter Dons Tychsen 2020-08-03 10:50 ` Corinna Vinschen 0 siblings, 1 reply; 11+ messages in thread From: Peter Dons Tychsen @ 2020-08-03 9:10 UTC (permalink / raw) To: cygwin Hi all, On Fri, 2020-07-31 at 10:10 +0200, Corinna Vinschen wrote: > Oh well. I did a quick test with your new testcase (thanks for > that!) > and it seems to be a bit more complicated than I anticipated > yesterday. > The parent-child relationship between the processes is broken. I > have > to think a while about this problem, stay tuned. I also have seen this problem. I propose a different solution however. Why no get rid of the call to fork() all together. One of the things bogging down performance on larger setups is the calls to fork() are is inherently slow and should be avoided at all costs. Instead why don't we just call spawn(vpe) instead, which is quite stable and fast. This would give a _huge_ boost to e.g. larger build jobs. Make supports posix_spawn, but it does not help on cygwin at it just calls back into fork(). The current implementation is probably something that makes sense on BSD/linux as fork() is lightning fast there. Just an idea... /pedro ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Synchronization problem with posix_spawn 2020-08-03 9:10 ` Peter Dons Tychsen @ 2020-08-03 10:50 ` Corinna Vinschen 2020-08-20 5:40 ` Peter Dons Tychsen 0 siblings, 1 reply; 11+ messages in thread From: Corinna Vinschen @ 2020-08-03 10:50 UTC (permalink / raw) To: cygwin On Aug 3 11:10, Peter Dons Tychsen via Cygwin wrote: > Hi all, > > On Fri, 2020-07-31 at 10:10 +0200, Corinna Vinschen wrote: > > Oh well. I did a quick test with your new testcase (thanks for > > that!) > > and it seems to be a bit more complicated than I anticipated > > yesterday. > > The parent-child relationship between the processes is broken. I > > have > > to think a while about this problem, stay tuned. > > I also have seen this problem. I propose a different solution however. > Why no get rid of the call to fork() all together. One of the things > bogging down performance on larger setups is the calls to fork() are is > inherently slow and should be avoided at all costs. Instead why don't > we just call spawn(vpe) instead, which is quite stable and fast. > > This would give a _huge_ boost to e.g. larger build jobs. Make supports > posix_spawn, but it does not help on cygwin at it just calls back into > fork(). spawn alone doesn't cut it, due to the requirement to support the additional file actions and spawn atributes POSIX defines. This would require a revamp of Cygwin's spawn functionality, which is already quite complicated. So this is something I'm only willing to do in homeopathic doses. Corinna -- Corinna Vinschen Cygwin Maintainer ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Synchronization problem with posix_spawn 2020-08-03 10:50 ` Corinna Vinschen @ 2020-08-20 5:40 ` Peter Dons Tychsen 2020-08-20 12:50 ` Corinna Vinschen 0 siblings, 1 reply; 11+ messages in thread From: Peter Dons Tychsen @ 2020-08-20 5:40 UTC (permalink / raw) To: cygwin Hi Corinna, > spawn alone doesn't cut it, due to the requirement to support the > additional file actions and spawn atributes POSIX defines. This > would require a revamp of Cygwin's spawn functionality, which is > already quite complicated. So this is something I'm only willing > to do in homeopathic doses. OK. I can see that most of the processing of the options are done seperately. We could to start with create a shortcut that calls spawn[vpe]() instead if there are no options specified? That would speed up most cases. Then we could add the extra cases later if needed. Would that be an acceptable patch? The reason i am pushing for this is that allot of developers i interact with get the feeling cygwin is slow. When i look into their cases it is almost always because they use programs that end up calling fork() allot (large shell scripts, makefiles, etc.). Almost all modern versions of these programs have support for calling posix_spawn() instead when creating processes, which would (if we changed it) speed up these cases dramatically if it avoided fork(). It would simply make cygwin more appealing to this group of users. Thanks, /pedro ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Synchronization problem with posix_spawn 2020-08-20 5:40 ` Peter Dons Tychsen @ 2020-08-20 12:50 ` Corinna Vinschen 2020-08-21 7:58 ` Peter Dons Tychsen 0 siblings, 1 reply; 11+ messages in thread From: Corinna Vinschen @ 2020-08-20 12:50 UTC (permalink / raw) To: Peter Dons Tychsen; +Cc: cygwin On Aug 20 07:40, Peter Dons Tychsen via Cygwin wrote: > Hi Corinna, > > > spawn alone doesn't cut it, due to the requirement to support the > > additional file actions and spawn atributes POSIX defines. This > > would require a revamp of Cygwin's spawn functionality, which is > > already quite complicated. So this is something I'm only willing > > to do in homeopathic doses. > > OK. I can see that most of the processing of the options are done > seperately. We could to start with create a shortcut that calls > spawn[vpe]() instead if there are no options specified? That would > speed up most cases. Then we could add the extra cases later if needed. > > Would that be an acceptable patch? No, it won't work as expected, as you can see from the discussion in this thread. Some internal work would be required. > The reason i am pushing for this is that allot of developers i interact > with get the feeling cygwin is slow. Well, Cygwin *is* slow, given the fork/exec implementation. We asked Microsoft a lot if there's any chance to create a fork-alike working in the Windows subsystem, but that ship has sailed. Corinna -- Corinna Vinschen Cygwin Maintainer ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Synchronization problem with posix_spawn 2020-08-20 12:50 ` Corinna Vinschen @ 2020-08-21 7:58 ` Peter Dons Tychsen 0 siblings, 0 replies; 11+ messages in thread From: Peter Dons Tychsen @ 2020-08-21 7:58 UTC (permalink / raw) To: cygwin Hi Corinna, On Thu, 2020-08-20 at 14:50 +0200, Corinna Vinschen wrote: > No, it won't work as expected, as you can see from the discussion in > > this thread. Some internal work would be required. OK. So even with no file actions and spawn atributes, it still would break things. What kind of things would not work? Could you share a few thoughts on what would be needed? Just to get an idea of the workload it would imply. Thanks, /pedro ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2020-08-21 7:58 UTC | newest] Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-07-29 20:17 Synchronization problem with posix_spawn Ken Brown 2020-07-29 23:12 ` Ken Brown 2020-07-30 11:59 ` Corinna Vinschen 2020-07-30 17:17 ` Corinna Vinschen 2020-07-30 23:04 ` Ken Brown 2020-07-31 8:10 ` Corinna Vinschen 2020-08-03 9:10 ` Peter Dons Tychsen 2020-08-03 10:50 ` Corinna Vinschen 2020-08-20 5:40 ` Peter Dons Tychsen 2020-08-20 12:50 ` Corinna Vinschen 2020-08-21 7:58 ` Peter Dons Tychsen
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).