public inbox for cygwin@cygwin.com
* Bogus exit code 127 from a child process
@ 2024-03-17  8:14 Alexey Izbyshev
  2024-03-17  8:44 ` Takashi Yano
  0 siblings, 1 reply; 14+ messages in thread
From: Alexey Izbyshev @ 2024-03-17  8:14 UTC (permalink / raw)
  To: cygwin

Hello,

I've been getting occasional "Error 127" from make -jN on seemingly 
random jobs. After reducing the set of jobs and eventually eliminating 
make, I've arrived at this one-liner:

bash -c 'true & true & wait -n || echo 1: $? && wait -n || echo 2: $?'

When run repeatedly, the second "wait -n" often reports 127.

I've reproduced this in the following environments:

* Cygwin 3.5.1, Windows 10 22H2 x64
* Cygwin 3.4.6, Windows 10 22H2 x64 and Windows 7 x64

I couldn't reproduce it in Cygwin 3.3.6 (WOW64) on Windows 7 x64.

Thanks,
Alexey



* Re: Bogus exit code 127 from a child process
  2024-03-17  8:14 Bogus exit code 127 from a child process Alexey Izbyshev
@ 2024-03-17  8:44 ` Takashi Yano
  2024-03-17  9:01   ` Alexey Izbyshev
  0 siblings, 1 reply; 14+ messages in thread
From: Takashi Yano @ 2024-03-17  8:44 UTC (permalink / raw)
  To: cygwin; +Cc: Alexey Izbyshev

On Sun, 17 Mar 2024 11:14:16 +0300
Alexey Izbyshev wrote:
> Hello,
> 
> I've been getting occasional "Error 127" from make -jN on seemingly 
> random jobs. After reducing the set of jobs and eventually eliminating 
> make, I've arrived to this one-liner:
> 
> bash -c 'true & true & wait -n || echo 1: $? && wait -n || echo 2: $?'
> 
> When run repeatedly, the second "wait -n" often reports 127.
> 
> I've reproduced this in the following environments:
> 
> * Cygwin 3.5.1, Windows 10 22H2 x64
> * Cygwin 3.4.6, Windows 10 22H2 x64 and Windows 7 x64
> 
> I couldn't reproduce it in Cygwin 3.3.6 (WOW64) on Windows 7 x64.

Could you please try the latest Cygwin 3.6.0 (TEST)?

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>


* Re: Bogus exit code 127 from a child process
  2024-03-17  8:44 ` Takashi Yano
@ 2024-03-17  9:01   ` Alexey Izbyshev
  2024-03-17  9:27     ` Takashi Yano
  0 siblings, 1 reply; 14+ messages in thread
From: Alexey Izbyshev @ 2024-03-17  9:01 UTC (permalink / raw)
  To: Takashi Yano; +Cc: cygwin

On 2024-03-17 11:44, Takashi Yano wrote:
> On Sun, 17 Mar 2024 11:14:16 +0300
> Alexey Izbyshev wrote:
>> Hello,
>> 
>> I've been getting occasional "Error 127" from make -jN on seemingly
>> random jobs. After reducing the set of jobs and eventually eliminating
>> make, I've arrived to this one-liner:
>> 
>> bash -c 'true & true & wait -n || echo 1: $? && wait -n || echo 2: $?'
>> 
>> When run repeatedly, the second "wait -n" often reports 127.
>> 
>> I've reproduced this in the following environments:
>> 
>> * Cygwin 3.5.1, Windows 10 22H2 x64
>> * Cygwin 3.4.6, Windows 10 22H2 x64 and Windows 7 x64
>> 
>> I couldn't reproduce it in Cygwin 3.3.6 (WOW64) on Windows 7 x64.
> 
> Could you please try latest cygwin 3.6.0 (TEST) ?

Tested with 3.6.0-0.82.gfc691d0246b9 on Windows 10 22H2 x64, the problem 
still occurs.

Thanks,
Alexey


* Re: Bogus exit code 127 from a child process
  2024-03-17  9:01   ` Alexey Izbyshev
@ 2024-03-17  9:27     ` Takashi Yano
  2024-03-17 10:03       ` Alexey Izbyshev
  0 siblings, 1 reply; 14+ messages in thread
From: Takashi Yano @ 2024-03-17  9:27 UTC (permalink / raw)
  To: cygwin; +Cc: Alexey Izbyshev

On Sun, 17 Mar 2024 12:01:55 +0300
Alexey Izbyshev wrote:
> On 2024-03-17 11:44, Takashi Yano wrote:
> > On Sun, 17 Mar 2024 11:14:16 +0300
> > Alexey Izbyshev wrote:
> >> Hello,
> >> 
> >> I've been getting occasional "Error 127" from make -jN on seemingly
> >> random jobs. After reducing the set of jobs and eventually eliminating
> >> make, I've arrived to this one-liner:
> >> 
> >> bash -c 'true & true & wait -n || echo 1: $? && wait -n || echo 2: $?'
> >> 
> >> When run repeatedly, the second "wait -n" often reports 127.
> >> 
> >> I've reproduced this in the following environments:
> >> 
> >> * Cygwin 3.5.1, Windows 10 22H2 x64
> >> * Cygwin 3.4.6, Windows 10 22H2 x64 and Windows 7 x64
> >> 
> >> I couldn't reproduce it in Cygwin 3.3.6 (WOW64) on Windows 7 x64.
> > 
> > Could you please try latest cygwin 3.6.0 (TEST) ?
> 
> Tested with 3.6.0-0.82.gfc691d0246b9 on Windows 10 22H2 x64, the problem 
> still occurs.

In my environment, a one-hour trial does not reproduce the issue.
Could you please let us know your environment, i.e. CPU, amount of 
memory, and so on?

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>


* Re: Bogus exit code 127 from a child process
  2024-03-17  9:27     ` Takashi Yano
@ 2024-03-17 10:03       ` Alexey Izbyshev
  2024-03-17 10:21         ` Takashi Yano
  0 siblings, 1 reply; 14+ messages in thread
From: Alexey Izbyshev @ 2024-03-17 10:03 UTC (permalink / raw)
  To: Takashi Yano; +Cc: cygwin

On 2024-03-17 12:27, Takashi Yano wrote:
> On Sun, 17 Mar 2024 12:01:55 +0300
> Alexey Izbyshev wrote:
>> On 2024-03-17 11:44, Takashi Yano wrote:
>> > On Sun, 17 Mar 2024 11:14:16 +0300
>> > Alexey Izbyshev wrote:
>> >> Hello,
>> >>
>> >> I've been getting occasional "Error 127" from make -jN on seemingly
>> >> random jobs. After reducing the set of jobs and eventually eliminating
>> >> make, I've arrived to this one-liner:
>> >>
>> >> bash -c 'true & true & wait -n || echo 1: $? && wait -n || echo 2: $?'
>> >>
>> >> When run repeatedly, the second "wait -n" often reports 127.
>> >>
>> >> I've reproduced this in the following environments:
>> >>
>> >> * Cygwin 3.5.1, Windows 10 22H2 x64
>> >> * Cygwin 3.4.6, Windows 10 22H2 x64 and Windows 7 x64
>> >>
>> >> I couldn't reproduce it in Cygwin 3.3.6 (WOW64) on Windows 7 x64.
>> >
>> > Could you please try latest cygwin 3.6.0 (TEST) ?
>> 
>> Tested with 3.6.0-0.82.gfc691d0246b9 on Windows 10 22H2 x64, the 
>> problem
>> still occurs.
> 
> In my environment, a one-hour trial does not reproduce the issue.
> Could you please let us know your environment, i.e. CPU, amount of
> memory, and so on?

It's been reproduced in a variety of environments:

* Windows 10 22H2 x64, Intel Core i7 11700, 32 GB RAM
* Windows 10 22H2 x64, Intel Core i7 9700, 32 GB RAM
* Windows 10 22H2 x64, Intel Core i7 6700, 32 GB RAM
* Windows 7 SP1 x64, Intel Core i7 6700, 32 GB RAM

I'm surprised that you're not hitting it very quickly. The following 
loop usually fails after a few iterations (rarely a hundred or so) in my 
tests:

while bash -c 'true & true & wait -n || { echo 1: $?; exit 1; } && wait -n || { echo 2: $?; exit 1; }'; do echo $((i++)); done

Thanks,
Alexey


* Re: Bogus exit code 127 from a child process
  2024-03-17 10:03       ` Alexey Izbyshev
@ 2024-03-17 10:21         ` Takashi Yano
  2024-03-17 12:03           ` Takashi Yano
  0 siblings, 1 reply; 14+ messages in thread
From: Takashi Yano @ 2024-03-17 10:21 UTC (permalink / raw)
  To: cygwin; +Cc: Alexey Izbyshev

On Sun, 17 Mar 2024 13:03:40 +0300
Alexey Izbyshev wrote:
> On 2024-03-17 12:27, Takashi Yano wrote:
> > On Sun, 17 Mar 2024 12:01:55 +0300
> > Alexey Izbyshev wrote:
> >> On 2024-03-17 11:44, Takashi Yano wrote:
> >> > On Sun, 17 Mar 2024 11:14:16 +0300
> >> > Alexey Izbyshev wrote:
> >> >> Hello,
> >> >>
> >> >> I've been getting occasional "Error 127" from make -jN on seemingly
> >> >> random jobs. After reducing the set of jobs and eventually eliminating
> >> >> make, I've arrived to this one-liner:
> >> >>
> >> >> bash -c 'true & true & wait -n || echo 1: $? && wait -n || echo 2: $?'
> >> >>
> >> >> When run repeatedly, the second "wait -n" often reports 127.
> >> >>
> >> >> I've reproduced this in the following environments:
> >> >>
> >> >> * Cygwin 3.5.1, Windows 10 22H2 x64
> >> >> * Cygwin 3.4.6, Windows 10 22H2 x64 and Windows 7 x64
> >> >>
> >> >> I couldn't reproduce it in Cygwin 3.3.6 (WOW64) on Windows 7 x64.
> >> >
> >> > Could you please try latest cygwin 3.6.0 (TEST) ?
> >> 
> >> Tested with 3.6.0-0.82.gfc691d0246b9 on Windows 10 22H2 x64, the 
> >> problem
> >> still occurs.
> > 
> > In my environment, a one-hour trial does not reproduce the issue.
> > Could you please let us know your environment, i.e. CPU, amount of
> > memory, and so on?
> 
> It's been reproduced in a variety of environments:
> 
> * Windows 10 22H2 x64, Intel Core i7 11700, 32 GB RAM
> * Windows 10 22H2 x64, Intel Core i7 9700, 32 GB RAM
> * Windows 10 22H2 x64, Intel Core i7 6700, 32 GB RAM
> * Windows 7 SP1 x64, Intel Core i7 6700, 32 GB RAM
> 
> I'm surprised that you're not hitting it very quickly. The following 
> loop usually fails after a few iterations (rarely a hundred or so) in my 
> tests:
> 
> while bash -c 'true & true & wait -n || { echo 1: $?; exit 1; } && wait -n || { echo 2: $?; exit 1; }'; do echo $((i++)); done

Thanks. My main PC has now run the above test for more than 15000
iterations without a failure. So I tried another PC, whose CPU is a
Core i5 540M, and could reproduce the issue about once every few
hundred iterations.

I also tried running sleep 0.1 instead of true; with that, the issue
happens about once every few tens of iterations.

I'll look into this problem. Thanks for the report.

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>


* Re: Bogus exit code 127 from a child process
  2024-03-17 10:21         ` Takashi Yano
@ 2024-03-17 12:03           ` Takashi Yano
  2024-03-17 12:15             ` Takashi Yano
  0 siblings, 1 reply; 14+ messages in thread
From: Takashi Yano @ 2024-03-17 12:03 UTC (permalink / raw)
  To: cygwin

On Sun, 17 Mar 2024 19:21:16 +0900
Takashi Yano wrote:
> On Sun, 17 Mar 2024 13:03:40 +0300
> Alexey Izbyshev wrote:
> > On 2024-03-17 12:27, Takashi Yano wrote:
> > > On Sun, 17 Mar 2024 12:01:55 +0300
> > > Alexey Izbyshev wrote:
> > >> On 2024-03-17 11:44, Takashi Yano wrote:
> > >> > On Sun, 17 Mar 2024 11:14:16 +0300
> > >> > Alexey Izbyshev wrote:
> > >> >> Hello,
> > >> >>
> > >> >> I've been getting occasional "Error 127" from make -jN on seemingly
> > >> >> random jobs. After reducing the set of jobs and eventually eliminating
> > >> >> make, I've arrived to this one-liner:
> > >> >>
> > >> >> bash -c 'true & true & wait -n || echo 1: $? && wait -n || echo 2: $?'
> > >> >>
> > >> >> When run repeatedly, the second "wait -n" often reports 127.
> > >> >>
> > >> >> I've reproduced this in the following environments:
> > >> >>
> > >> >> * Cygwin 3.5.1, Windows 10 22H2 x64
> > >> >> * Cygwin 3.4.6, Windows 10 22H2 x64 and Windows 7 x64
> > >> >>
> > >> >> I couldn't reproduce it in Cygwin 3.3.6 (WOW64) on Windows 7 x64.
> > >> >
> > >> > Could you please try latest cygwin 3.6.0 (TEST) ?
> > >> 
> > >> Tested with 3.6.0-0.82.gfc691d0246b9 on Windows 10 22H2 x64, the 
> > >> problem
> > >> still occurs.
> > > 
> > > In my environment, a one-hour trial does not reproduce the issue.
> > > Could you please let us know your environment, i.e. CPU, amount of
> > > memory, and so on?
> > 
> > It's been reproduced in a variety of environments:
> > 
> > * Windows 10 22H2 x64, Intel Core i7 11700, 32 GB RAM
> > * Windows 10 22H2 x64, Intel Core i7 9700, 32 GB RAM
> > * Windows 10 22H2 x64, Intel Core i7 6700, 32 GB RAM
> > * Windows 7 SP1 x64, Intel Core i7 6700, 32 GB RAM
> > 
> > I'm surprised that you're not hitting it very quickly. The following 
> > loop usually fails after a few iterations (rarely a hundred or so) in my 
> > tests:
> > 
> > while bash -c 'true & true & wait -n || { echo 1: $?; exit 1; } && wait -n || { echo 2: $?; exit 1; }'; do echo $((i++)); done
> 
> Thanks. My main PC still runs the above test for more than 15000 counts.
> So, I tried another PC which CPU is Core i5 540M and could reproduce
> the issue about 1 time per a few hundreds count.
> 
> I also tried to run sleep 0.1 instead of true, then, the issue happens
> 1 time per a few decades counts.
> 
> I'll look into this problem. Thanks for the report.

In my environment, the issue is reproducible even with cygwin 3.3.6
(32bit, i.e. WOW64) and bash 4.4.12(3)-release (i686-pc-cygwin).

What are the versions of bash on each system?

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>


* Re: Bogus exit code 127 from a child process
  2024-03-17 12:03           ` Takashi Yano
@ 2024-03-17 12:15             ` Takashi Yano
  2024-03-17 12:35               ` Takashi Yano
  0 siblings, 1 reply; 14+ messages in thread
From: Takashi Yano @ 2024-03-17 12:15 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 2951 bytes --]

On Sun, 17 Mar 2024 21:03:58 +0900
Takashi Yano wrote:
> On Sun, 17 Mar 2024 19:21:16 +0900
> Takashi Yano wrote:
> > On Sun, 17 Mar 2024 13:03:40 +0300
> > Alexey Izbyshev wrote:
> > > On 2024-03-17 12:27, Takashi Yano wrote:
> > > > On Sun, 17 Mar 2024 12:01:55 +0300
> > > > Alexey Izbyshev wrote:
> > > >> On 2024-03-17 11:44, Takashi Yano wrote:
> > > >> > On Sun, 17 Mar 2024 11:14:16 +0300
> > > >> > Alexey Izbyshev wrote:
> > > >> >> Hello,
> > > >> >>
> > > >> >> I've been getting occasional "Error 127" from make -jN on seemingly
> > > >> >> random jobs. After reducing the set of jobs and eventually eliminating
> > > >> >> make, I've arrived to this one-liner:
> > > >> >>
> > > >> >> bash -c 'true & true & wait -n || echo 1: $? && wait -n || echo 2: $?'
> > > >> >>
> > > >> >> When run repeatedly, the second "wait -n" often reports 127.
> > > >> >>
> > > >> >> I've reproduced this in the following environments:
> > > >> >>
> > > >> >> * Cygwin 3.5.1, Windows 10 22H2 x64
> > > >> >> * Cygwin 3.4.6, Windows 10 22H2 x64 and Windows 7 x64
> > > >> >>
> > > >> >> I couldn't reproduce it in Cygwin 3.3.6 (WOW64) on Windows 7 x64.
> > > >> >
> > > >> > Could you please try latest cygwin 3.6.0 (TEST) ?
> > > >> 
> > > >> Tested with 3.6.0-0.82.gfc691d0246b9 on Windows 10 22H2 x64, the 
> > > >> problem
> > > >> still occurs.
> > > > 
> > > > In my environment, a one-hour trial does not reproduce the issue.
> > > > Could you please let us know your environment, i.e. CPU, amount of
> > > > memory, and so on?
> > > 
> > > It's been reproduced in a variety of environments:
> > > 
> > > * Windows 10 22H2 x64, Intel Core i7 11700, 32 GB RAM
> > > * Windows 10 22H2 x64, Intel Core i7 9700, 32 GB RAM
> > > * Windows 10 22H2 x64, Intel Core i7 6700, 32 GB RAM
> > > * Windows 7 SP1 x64, Intel Core i7 6700, 32 GB RAM
> > > 
> > > I'm surprised that you're not hitting it very quickly. The following 
> > > loop usually fails after a few iterations (rarely a hundred or so) in my 
> > > tests:
> > > 
> > > while bash -c 'true & true & wait -n || { echo 1: $?; exit 1; } && wait -n || { echo 2: $?; exit 1; }'; do echo $((i++)); done
> > 
> > Thanks. My main PC still runs the above test for more than 15000 counts.
> > So, I tried another PC which CPU is Core i5 540M and could reproduce
> > the issue about 1 time per a few hundreds count.
> > 
> > I also tried to run sleep 0.1 instead of true, then, the issue happens
> > 1 time per a few decades counts.
> > 
> > I'll look into this problem. Thanks for the report.
> 
> In my environment, the issue is reproducible even with cygwin 3.3.6
> (32bit, i.e. WOW64) and bash 4.4.12(3)-release (i686-pc-cygwin).
> 
> What are the versions of bash in each systems?

Also, the attached simple test case in C, which does much the same
thing as the bash command, could not reproduce the problem.

Can you reproduce the issue with the attached STC?

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

[-- Attachment #2: repr.c --]
[-- Type: text/x-csrc, Size: 490 bytes --]

#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
	int status;
	int num = 2;	/* number of children spawned per iteration */
	if (argc > 1) num = atoi(argv[1]);
	while (1) {
		/* Start num children, each running "sleep 0.1". */
		for (int i=0; i<num; i++)
			if (fork() == 0) {
				execl("/usr/bin/sleep", "sleep", "0.1", (char *)NULL);
				_exit(127);	/* reached only if execl() fails */
			}
		/* Reap every child; stop at the first nonzero exit status. */
		for (int i=0; i<num; i++) {
			waitpid(-1, &status, 0);
			if (WEXITSTATUS(status)) {
				printf("%d: %d\n", num, WEXITSTATUS(status));
				return 1;
			}
		}
	}
	return 0;
}


* Re: Bogus exit code 127 from a child process
  2024-03-17 12:15             ` Takashi Yano
@ 2024-03-17 12:35               ` Takashi Yano
  2024-03-17 12:50                 ` Dimitry Andric
  0 siblings, 1 reply; 14+ messages in thread
From: Takashi Yano @ 2024-03-17 12:35 UTC (permalink / raw)
  To: cygwin

On Sun, 17 Mar 2024 21:15:17 +0900
Takashi Yano wrote:
> On Sun, 17 Mar 2024 21:03:58 +0900
> Takashi Yano wrote:
> > On Sun, 17 Mar 2024 19:21:16 +0900
> > Takashi Yano wrote:
> > > On Sun, 17 Mar 2024 13:03:40 +0300
> > > Alexey Izbyshev wrote:
> > > > On 2024-03-17 12:27, Takashi Yano wrote:
> > > > > On Sun, 17 Mar 2024 12:01:55 +0300
> > > > > Alexey Izbyshev wrote:
> > > > >> On 2024-03-17 11:44, Takashi Yano wrote:
> > > > >> > On Sun, 17 Mar 2024 11:14:16 +0300
> > > > >> > Alexey Izbyshev wrote:
> > > > >> >> Hello,
> > > > >> >>
> > > > >> >> I've been getting occasional "Error 127" from make -jN on seemingly
> > > > >> >> random jobs. After reducing the set of jobs and eventually eliminating
> > > > >> >> make, I've arrived to this one-liner:
> > > > >> >>
> > > > >> >> bash -c 'true & true & wait -n || echo 1: $? && wait -n || echo 2: $?'
> > > > >> >>
> > > > >> >> When run repeatedly, the second "wait -n" often reports 127.
> > > > >> >>
> > > > >> >> I've reproduced this in the following environments:
> > > > >> >>
> > > > >> >> * Cygwin 3.5.1, Windows 10 22H2 x64
> > > > >> >> * Cygwin 3.4.6, Windows 10 22H2 x64 and Windows 7 x64
> > > > >> >>
> > > > >> >> I couldn't reproduce it in Cygwin 3.3.6 (WOW64) on Windows 7 x64.
> > > > >> >
> > > > >> > Could you please try latest cygwin 3.6.0 (TEST) ?
> > > > >> 
> > > > >> Tested with 3.6.0-0.82.gfc691d0246b9 on Windows 10 22H2 x64, the 
> > > > >> problem
> > > > >> still occurs.
> > > > > 
> > > > > In my environment, a one-hour trial does not reproduce the issue.
> > > > > Could you please let us know your environment, i.e. CPU, amount of
> > > > > memory, and so on?
> > > > 
> > > > It's been reproduced in a variety of environments:
> > > > 
> > > > * Windows 10 22H2 x64, Intel Core i7 11700, 32 GB RAM
> > > > * Windows 10 22H2 x64, Intel Core i7 9700, 32 GB RAM
> > > > * Windows 10 22H2 x64, Intel Core i7 6700, 32 GB RAM
> > > > * Windows 7 SP1 x64, Intel Core i7 6700, 32 GB RAM
> > > > 
> > > > I'm surprised that you're not hitting it very quickly. The following 
> > > > loop usually fails after a few iterations (rarely a hundred or so) in my 
> > > > tests:
> > > > 
> > > > while bash -c 'true & true & wait -n || { echo 1: $?; exit 1; } && wait -n || { echo 2: $?; exit 1; }'; do echo $((i++)); done
> > > 
> > > Thanks. My main PC still runs the above test for more than 15000 counts.
> > > So, I tried another PC which CPU is Core i5 540M and could reproduce
> > > the issue about 1 time per a few hundreds count.
> > > 
> > > I also tried to run sleep 0.1 instead of true, then, the issue happens
> > > 1 time per a few decades counts.
> > > 
> > > I'll look into this problem. Thanks for the report.
> > 
> > In my environment, the issue is reproducible even with cygwin 3.3.6
> > (32bit, i.e. WOW64) and bash 4.4.12(3)-release (i686-pc-cygwin).
> > 
> > What are the versions of bash in each systems?
> 
> And, the attached simple test case in C which does very similar things
> with bash command could not reproduce the problem.
> 
> Can you reproduce the issue with attached STC?

I also ran your test case:
while bash -c 'true & true & wait -n || { echo 1: $?; exit 1; } && wait -n || { echo 2: $?; exit 1; }'; do echo $((i++)); done
on Linux (Debian 12.5), and the issue reproduced!

It seems that this is a bug in upstream bash.

Eric, Corinna, any idea?

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>


* Re: Bogus exit code 127 from a child process
  2024-03-17 12:35               ` Takashi Yano
@ 2024-03-17 12:50                 ` Dimitry Andric
  2024-03-17 13:10                   ` Dimitry Andric
  0 siblings, 1 reply; 14+ messages in thread
From: Dimitry Andric @ 2024-03-17 12:50 UTC (permalink / raw)
  To: Takashi Yano; +Cc: cygwin

On 17 Mar 2024, at 13:35, Takashi Yano via Cygwin <cygwin@cygwin.com> wrote:
...
> 
> I also test your test case:
> while bash -c 'true & true & wait -n || { echo 1: $?; exit 1; } && wait -n || { echo 2: $?; exit 1; }'; do echo $((i++)); done
> in Linux (Debian 12.5), and the issue reproduced!

Yeah, same here with bash 5.1.16(1)-release on Ubuntu 22.04. It errors out with 127 after ~50-200 loops.

-Dimitry



* Re: Bogus exit code 127 from a child process
  2024-03-17 12:50                 ` Dimitry Andric
@ 2024-03-17 13:10                   ` Dimitry Andric
  2024-03-18  3:09                     ` Takashi Yano
  0 siblings, 1 reply; 14+ messages in thread
From: Dimitry Andric @ 2024-03-17 13:10 UTC (permalink / raw)
  To: Takashi Yano; +Cc: cygwin

On 17 Mar 2024, at 13:50, Dimitry Andric <dimitry@unified-streaming.com> wrote:
> 
> On 17 Mar 2024, at 13:35, Takashi Yano via Cygwin <cygwin@cygwin.com> wrote:
> ...
>> 
>> I also test your test case:
>> while bash -c 'true & true & wait -n || { echo 1: $?; exit 1; } && wait -n || { echo 2: $?; exit 1; }'; do echo $((i++)); done
>> in Linux (Debian 12.5), and the issue reproduced!
> 
> Yeah, same here with bash 5.1.16(1)-release on Ubuntu 22.04. It errors out with 127 after ~50-200 loops.

Having built bash master (bash-5.2-27-gf3b6bd19) here, it consistently gives 127 in this area:

https://git.savannah.gnu.org/cgit/bash.git/tree/builtins/wait.def#n227

   211  #if defined (JOB_CONTROL)
   212    if (nflag)
   213      {
   214        if (list)
   215          {
   216            opt = set_waitlist (list);
   217            if (opt == 0)
   218              WAIT_RETURN (127);
   219            wflags |= JWAIT_WAITING;
   220          }
   221
   222        status = wait_for_any_job (wflags, &pstat);
   223        if (vname && status >= 0)
   224          builtin_bind_var_to_int (vname, pstat.pid, bindflags);
   225
   226        if (status < 0)
=> 227          status = 127;
   228        if (list)
   229          unset_waitlist ();
   230        WAIT_RETURN (status);
   231      }
   232  #endif

So for some reason, wait_for_any_job() returns a negative value in this particular situation.

-Dimitry



* Re: Bogus exit code 127 from a child process
  2024-03-17 13:10                   ` Dimitry Andric
@ 2024-03-18  3:09                     ` Takashi Yano
  2024-03-18  4:58                       ` Takashi Yano
  0 siblings, 1 reply; 14+ messages in thread
From: Takashi Yano @ 2024-03-18  3:09 UTC (permalink / raw)
  To: cygwin

On Sun, 17 Mar 2024 14:10:55 +0100
Dimitry Andric wrote:
> On 17 Mar 2024, at 13:50, Dimitry Andric <dimitry@unified-streaming.com> wrote:
> > 
> > On 17 Mar 2024, at 13:35, Takashi Yano via Cygwin <cygwin@cygwin.com> wrote:
> > ...
> >> 
> >> I also test your test case:
> >> while bash -c 'true & true & wait -n || { echo 1: $?; exit 1; } && wait -n || { echo 2: $?; exit 1; }'; do echo $((i++)); done
> >> in Linux (Debian 12.5), and the issue reproduced!
> > 
> > Yeah, same here with bash 5.1.16(1)-release on Ubuntu 22.04. It errors out with 127 after ~50-200 loops.
> 
> Having built bash master (bash-5.2-27-gf3b6bd19) here, it consistently gives 127 in this area:
> 
> https://git.savannah.gnu.org/cgit/bash.git/tree/builtins/wait.def#n227
> 
>    211  #if defined (JOB_CONTROL)
>    212    if (nflag)
>    213      {
>    214        if (list)
>    215          {
>    216            opt = set_waitlist (list);
>    217            if (opt == 0)
>    218              WAIT_RETURN (127);
>    219            wflags |= JWAIT_WAITING;
>    220          }
>    221
>    222        status = wait_for_any_job (wflags, &pstat);
>    223        if (vname && status >= 0)
>    224          builtin_bind_var_to_int (vname, pstat.pid, bindflags);
>    225
>    226        if (status < 0)
> => 227          status = 127;
>    228        if (list)
>    229          unset_waitlist ();
>    230        WAIT_RETURN (status);
>    231      }
>    232  #endif
> 
> So for some reason, wait_for_any_job() returns a negative value in this particular situation.

Line 218 also looks suspicious.

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>


* Re: Bogus exit code 127 from a child process
  2024-03-18  3:09                     ` Takashi Yano
@ 2024-03-18  4:58                       ` Takashi Yano
  2024-03-18 11:08                         ` Alexey Izbyshev
  0 siblings, 1 reply; 14+ messages in thread
From: Takashi Yano @ 2024-03-18  4:58 UTC (permalink / raw)
  To: cygwin

On Mon, 18 Mar 2024 12:09:06 +0900
Takashi Yano wrote:
> On Sun, 17 Mar 2024 14:10:55 +0100
> Dimitry Andric wrote:
> > On 17 Mar 2024, at 13:50, Dimitry Andric <dimitry@unified-streaming.com> wrote:
> > > 
> > > On 17 Mar 2024, at 13:35, Takashi Yano via Cygwin <cygwin@cygwin.com> wrote:
> > > ...
> > >> 
> > >> I also test your test case:
> > >> while bash -c 'true & true & wait -n || { echo 1: $?; exit 1; } && wait -n || { echo 2: $?; exit 1; }'; do echo $((i++)); done
> > >> in Linux (Debian 12.5), and the issue reproduced!
> > > 
> > > Yeah, same here with bash 5.1.16(1)-release on Ubuntu 22.04. It errors out with 127 after ~50-200 loops.
> > 
> > Having built bash master (bash-5.2-27-gf3b6bd19) here, it consistently gives 127 in this area:
> > 
> > https://git.savannah.gnu.org/cgit/bash.git/tree/builtins/wait.def#n227
> > 
> >    211  #if defined (JOB_CONTROL)
> >    212    if (nflag)
> >    213      {
> >    214        if (list)
> >    215          {
> >    216            opt = set_waitlist (list);
> >    217            if (opt == 0)
> >    218              WAIT_RETURN (127);
> >    219            wflags |= JWAIT_WAITING;
> >    220          }
> >    221
> >    222        status = wait_for_any_job (wflags, &pstat);
> >    223        if (vname && status >= 0)
> >    224          builtin_bind_var_to_int (vname, pstat.pid, bindflags);
> >    225
> >    226        if (status < 0)
> > => 227          status = 127;
> >    228        if (list)
> >    229          unset_waitlist ();
> >    230        WAIT_RETURN (status);
> >    231      }
> >    232  #endif
> > 
> > So for some reason, wait_for_any_job() returns a negative value in this particular situation.
> 
> Line 218 looks also suspicious.

Probably, this is not a bug. man bash says:
              If  the  -n option is supplied, wait waits for a single job from
              the list of ids or, if no ids are supplied, any job, to complete
              and returns its exit status.  If none of the supplied  arguments
              is a child of the shell, or if no arguments are supplied and the
              shell  has no unwaited‐for children, the exit status is 127.
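
The last sentence of that excerpt can be checked on its own. A minimal
sketch, assuming a bash new enough to support "wait -n" (the variable
name rc is only for illustration): a shell that never started a
background job has no unwaited-for children, so "wait -n" should report
127 without waiting for anything.

```shell
# Run "wait -n" in a fresh bash that has no background children.
# Per the man page text quoted above, the exit status is 127.
rc=0
bash -c 'wait -n' || rc=$?
echo "wait -n with no children: $rc"
```

No child process failed here, which is why a 127 from "wait -n" can
look like a bogus exit code from a child.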

If the background process has already exited before 'wait -n' is called,
'wait -n' returns 127. This is very different from the wait() system call,
which must be called for every child process, since otherwise a zombie
remains.

In the shell, it is not necessary to call the wait command for background
jobs, so the exit status of a background job that has already exited is
no longer held.

So the actual bug is in the test case.
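
One way to write such a test without depending on 'wait -n' is to
remember each PID from $! and wait for it explicitly. This is only a
sketch under that assumption, not something taken from the original
reproducer; the names pid1, pid2, status1 and status2 are illustrative,
and 'false' is used for the second job so that the two statuses differ.
The shell remembers the status of a background job whose PID was
obtained through $!, so 'wait PID' still reports it after the job has
exited.

```shell
# Start two background jobs and record their PIDs.
true & pid1=$!
false & pid2=$!

# Let both jobs finish before waiting, which is the situation
# in which 'wait -n' may report 127.
sleep 1

# Waiting on an explicit PID returns that job's own exit status.
status1=0; wait "$pid1" || status1=$?
status2=0; wait "$pid2" || status2=$?
echo "1: $status1"
echo "2: $status2"
```

This should print "1: 0" and "2: 1" however quickly the two jobs
finish.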

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>


* Re: Bogus exit code 127 from a child process
  2024-03-18  4:58                       ` Takashi Yano
@ 2024-03-18 11:08                         ` Alexey Izbyshev
  0 siblings, 0 replies; 14+ messages in thread
From: Alexey Izbyshev @ 2024-03-18 11:08 UTC (permalink / raw)
  To: takashi.yano; +Cc: dimitry, cygwin

On 2024-03-18 07:58, Takashi Yano wrote:
> On Mon, 18 Mar 2024 12:09:06 +0900
> Takashi Yano wrote:
>> On Sun, 17 Mar 2024 14:10:55 +0100
>> Dimitry Andric wrote:
>> > On 17 Mar 2024, at 13:50, Dimitry Andric <dimitry@unified-streaming.com> wrote:
>> > >
>> > > On 17 Mar 2024, at 13:35, Takashi Yano via Cygwin <cygwin@cygwin.com> wrote:
>> > > ...
>> > >>
>> > >> I also test your test case:
>> > >> while bash -c 'true & true & wait -n || { echo 1: $?; exit 1; } && wait -n || { echo 2: $?; exit 1; }'; do echo $((i++)); done
>> > >> in Linux (Debian 12.5), and the issue reproduced!
>> > >
>> > > Yeah, same here with bash 5.1.16(1)-release on Ubuntu 22.04. It errors out with 127 after ~50-200 loops.
>> >
>> > Having built bash master (bash-5.2-27-gf3b6bd19) here, it consistently gives 127 in this area:
>> >
>> > https://git.savannah.gnu.org/cgit/bash.git/tree/builtins/wait.def#n227
>> >
>> >    211  #if defined (JOB_CONTROL)
>> >    212    if (nflag)
>> >    213      {
>> >    214        if (list)
>> >    215          {
>> >    216            opt = set_waitlist (list);
>> >    217            if (opt == 0)
>> >    218              WAIT_RETURN (127);
>> >    219            wflags |= JWAIT_WAITING;
>> >    220          }
>> >    221
>> >    222        status = wait_for_any_job (wflags, &pstat);
>> >    223        if (vname && status >= 0)
>> >    224          builtin_bind_var_to_int (vname, pstat.pid, bindflags);
>> >    225
>> >    226        if (status < 0)
>> > => 227          status = 127;
>> >    228        if (list)
>> >    229          unset_waitlist ();
>> >    230        WAIT_RETURN (status);
>> >    231      }
>> >    232  #endif
>> >
>> > So for some reason, wait_for_any_job() returns a negative value in this particular situation.
>> 
>> Line 218 looks also suspicious.
> 
> Probably, this is not a bug. man bash says:
>               If  the  -n option is supplied, wait waits for a single 
> job from
>               the list of ids or, if no ids are supplied, any job, to 
> complete
>               and returns its exit status.  If none of the supplied  
> arguments
>               is a child of the shell, or if no arguments are supplied 
> and the
>               shell  has no unwaited‐for children, the exit status is 
> 127.
> 
> If the background process exited before calling 'wait -n', it returns 
> 127.
> This is very different from wait() system call, which is necessary for
> any background jobs, otherwise zombie remains.
> 
> In the shell, it is not necessary to call wait command for background 
> jobs,
> therefore exit status of the background job which already exited is not 
> held
> anymore.
> 
> So, actual bug is in the test case.

I missed the subthread starting from your bash version request due to 
not being CCed, so replying via the mail archive link.

I'm sorry for wasting your time with the bad test case. I should have 
tested on Linux first myself. Thank you, Takashi and Dimitry.

The original problem with make that I was reproducing doesn't involve 
"wait -n" or any bash background jobs though, so this puts me back to 
the point before (I thought) I eliminated make. I'll try my older 
reproducers with new Cygwin versions, and will probably look at make 
source code (since it's starting to look like it might be a bug in make, 
not in Cygwin) before posting further.

Thanks,
Alexey


