public inbox for cygwin@cygwin.com
* fork() + file descriptor bug in 1.7.27(0.271/5/3) 2013-12-09 11:54
@ 2014-01-13 16:31 tednolan
  2014-01-14 10:18 ` Corinna Vinschen
                   ` (3 more replies)
  0 siblings, 4 replies; 22+ messages in thread
From: tednolan @ 2014-01-13 16:31 UTC (permalink / raw)
  To: cygwin; +Cc: tednolan

Hello,

I'm running:

CYGWIN_NT-6.1 prog5 1.7.27(0.271/5/3) 2013-12-09 11:54 x86_64 Cygwin
gcc (GCC) 4.8.2

on a 64 bit Win7 system.

I have just run into an odd bug, which I have boiled down into the program
below (which started as a mod to tiff2ps).

If you compile this program:

=========================CUT HERE=============
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>


int main(int argc, char **argv)
{

	FILE *fp;
	char buf[4096];
	char infile[4096];
	char outfile[4096];
	int i = 0;
	int running_children = 0;
	int child_limit = 20;
	int wait_status;

	if (argc == 1) {
		fp = stdin;
	} else if (argc == 2) {

		fp = fopen(argv[1], "r");
		if (!fp) {
			fprintf(stderr, "Can't open input list %s\n", argv[1]);
			exit(1);
		}

	} else {
		fprintf(stderr, "Usage: multi_tiff2ps [spec_file]\n");
		exit(1);
	}

	while( fgets(buf, sizeof(buf), fp) ) {
		++i;

		if(sscanf(buf, "%s %s", infile, outfile) != 2) {
			fprintf(stderr, "Malformed spec line %d (%s)\n",
				i, buf);
			continue;
		}

		//fprintf(stderr, "(%s) (%s) %d %ld\n", infile,
		//	outfile, i, ftell(fp));

		fprintf(stderr, "Running %d\n", running_children);

		if (running_children >= child_limit) {
			fprintf(stderr, "Initial wait\n");
			wait(&wait_status);
			--running_children;
		}

		switch (fork()) {
			
			/* error */
			case -1:
				fprintf(stderr,
					"Can't fork new tiff2ps process!\n");
				exit(1);
				break;

			/* child */
			case 0:
				fprintf(stderr, "child\n"); fflush(stderr);
				exit(0);
				break;

			/* parent */
			default:
				++running_children;
				break;
		}
	}

	for(i = 0; i < running_children; i++) {
		fprintf(stderr, "Final wait\n");
		wait(&wait_status);
	}


	exit(0);

}

=========================End code=============

and run it with this data:

00.tif  00.eps
01.tif  01.eps
02.tif  02.eps

It will run forever.

However, if you uncomment the fprintf with the ftell(), it runs as
expected.

It works correctly on linux.

Anyone seen this before?

Is there a fix?

Thanks!

Ted Nolan

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: fork() + file descriptor bug in 1.7.27(0.271/5/3) 2013-12-09 11:54
  2014-01-13 16:31 fork() + file descriptor bug in 1.7.27(0.271/5/3) 2013-12-09 11:54 tednolan
@ 2014-01-14 10:18 ` Corinna Vinschen
  2014-01-14 15:54 ` Eric Blake
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 22+ messages in thread
From: Corinna Vinschen @ 2014-01-14 10:18 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 1600 bytes --]

On Jan 13 11:06, tednolan@bellsouth.net wrote:
> Hello,
> 
> I'm running:
> 
> CYGWIN_NT-6.1 prog5 1.7.27(0.271/5/3) 2013-12-09 11:54 x86_64 Cygwin
> gcc (GCC) 4.8.2
> 
> on a 64 bit Win7 system.
> 
> I have just run into an odd bug, which I have boiled down into the program
> below (which started as a mod to tiff2ps).
> 
> If you compile this program:
> 
> =========================CUT HERE=============
> [...]
> =========================End code=============
> 
> and run it with this data:
> 
> 00.tif  00.eps
> 01.tif  01.eps
> 02.tif  02.eps
> 
> It will run forever.
> 
> However, if you uncomment the fprintf with the ftell(), it runs as
> expected.

Alternatively it runs as expected when dropping the fork() call.  I'm
totally baffled, especially because this is such a blatant misbehaviour
that it should have been found much earlier.  I ran your testcase under
Cygwin versions from 2010, and the problem showed up, too, so this
problem is not exactly new.

I tried to debug it a bit already and for some reason the *Windows* call
reading the input file always starts reading at position 0 again.

Still to debug:

- Why on earth does the OS "forget" the current file position, just
  because the process calls fork?

- Why on earth does the OS not forget the current file position, just
  because the process calls ftell before fork?

Thanks for the report and the testcase.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: fork() + file descriptor bug in 1.7.27(0.271/5/3) 2013-12-09 11:54
  2014-01-13 16:31 fork() + file descriptor bug in 1.7.27(0.271/5/3) 2013-12-09 11:54 tednolan
  2014-01-14 10:18 ` Corinna Vinschen
@ 2014-01-14 15:54 ` Eric Blake
  2014-01-14 16:15   ` tednolan
  2014-01-15 16:40 ` Tom Honermann
  2014-01-17 20:06 ` Eric Blake
  3 siblings, 1 reply; 22+ messages in thread
From: Eric Blake @ 2014-01-14 15:54 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 3518 bytes --]

On 01/13/2014 09:06 AM, tednolan@bellsouth.net wrote:
> Hello,
> 
> I'm running:
> 
> CYGWIN_NT-6.1 prog5 1.7.27(0.271/5/3) 2013-12-09 11:54 x86_64 Cygwin
> gcc (GCC) 4.8.2
> 
> on a 64 bit Win7 system.
> 
> I have just run into an odd bug, which I have boiled down into the program
> below (which started as a mod to tiff2ps).

Your program may be violating POSIX, which would trigger undefined behavior.

Quoting POSIX:
http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_05

For a handle to become the active handle, the application shall ensure
that the actions below are performed between the last use of the handle
(the current active handle) and the first use of the second handle (the
future active handle). The second handle then becomes the active handle.
All activity by the application affecting the file offset on the first
handle shall be suspended until it again becomes the active file handle.
(If a stream function has as an underlying function one that affects the
file offset, the stream function shall be considered to affect the file
offset.)

The handles need not be in the same process for these rules to apply.

Note that after a fork(), two handles exist where one existed before.
The application shall ensure that, if both handles can ever be accessed,
they are both in a state where the other could become the active handle
first. The application shall prepare for a fork() exactly as if it were
a change of active handle. (If the only action performed by one of the
processes is one of the exec functions or _exit() (not exit()), the
handle is never accessed in that process.)

For the first handle, the first applicable condition below applies.
After the actions required below are taken, if the handle is still open,
the application can close it.

    If it is a file descriptor, no action is required.

    If the only further action to be performed on any handle to this
open file descriptor is to close it, no action need be taken.

    If it is a stream which is unbuffered, no action need be taken.

    If it is a stream which is line buffered, and the last byte written
to the stream was a <newline> (that is, as if a:

        putc('\n')

    was the most recent operation on that stream), no action need be taken.

    If it is a stream which is open for writing or appending (but not
also open for reading), the application shall either perform an
fflush(), or the stream shall be closed.

    If the stream is open for reading and it is at the end of the file
(feof() is true), no action need be taken.

    If the stream is open with a mode that allows reading and the
underlying open file description refers to a device that is capable of
seeking, the application shall either perform an fflush(), or the stream
shall be closed.

For the second handle:

    If any previous active handle has been used by a function that
explicitly changed the file offset, except as required above for the
first handle, the application shall perform an lseek() or fseek() (as
appropriate to the type of handle) to an appropriate location.

If the active handle ceases to be accessible before the requirements on
the first handle, above, have been met, the state of the open file
description becomes undefined. This might occur during functions such as
a fork() or _exit().
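
One way to read the rules quoted above, for a program that reads a seekable
file through stdio and forks inside the read loop, is sketched below.  It is
only an illustration, not a tested fix for the testcase: count_lines_forking()
is a made-up name, and it relies on the POSIX (not ISO C) definition of
fflush() on an input stream, which sets the offset of the underlying file
description to the stream's position.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): read `path` line by line and
 * fork one child per line, preparing the stream before every fork().
 * Returns the number of lines seen, or -1 on error.
 */
int count_lines_forking(const char *path)
{
	char buf[4096];
	int lines = 0;
	FILE *fp = fopen(path, "r");

	if (!fp)
		return -1;

	/* Keep pending output from being duplicated by the children. */
	fflush(stdout);
	fflush(stderr);

	while (fgets(buf, sizeof(buf), fp)) {
		pid_t pid;

		++lines;

		/* Seekable read stream: sync the descriptor offset with the
		 * stream position before fork() duplicates the handle. */
		fflush(fp);

		pid = fork();
		if (pid == -1) {
			fclose(fp);
			return -1;
		}
		if (pid == 0)
			exit(0);	/* child: exit() may still touch fp */

		waitpid(pid, NULL, 0);
	}

	fclose(fp);
	return lines;
}
```

Called on a three-line spec file, this should return 3 on an implementation
that follows the quoted text, whether or not the child's exit() touches fp.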

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: fork() + file descriptor bug in 1.7.27(0.271/5/3) 2013-12-09 11:54
  2014-01-14 15:54 ` Eric Blake
@ 2014-01-14 16:15   ` tednolan
  2014-01-15  4:53     ` Lord Laraby
  2014-01-17 20:38     ` Eric Blake
  0 siblings, 2 replies; 22+ messages in thread
From: tednolan @ 2014-01-14 16:15 UTC (permalink / raw)
  To: cygwin

In message <52D55D96.8070407@redhat.com>you write:
>
>Your program may be violating POSIX, which would trigger undefined behavior.
>
>Quoting POSIX:
> pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_05
>

[long quote elided]

Yikes!  That's pretty impenetrable.  And if it says what I think it says,
it seems to violate the way I've understood Unix fork() and how fds 
(and stdio buffers) are inherited since forever.

However..

Do I understand that to say that if the first thing my child does is

	fclose(fp);

everything should be hunky-dory?

Because I just tried that, and it's still not.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: fork() + file descriptor bug in 1.7.27(0.271/5/3) 2013-12-09 11:54
  2014-01-14 16:15   ` tednolan
@ 2014-01-15  4:53     ` Lord Laraby
  2014-01-15  5:40       ` Lord Laraby
                         ` (2 more replies)
  2014-01-17 20:38     ` Eric Blake
  1 sibling, 3 replies; 22+ messages in thread
From: Lord Laraby @ 2014-01-15  4:53 UTC (permalink / raw)
  To: Cygwin Mailing List

My two cents say, since the child is not referencing 'fp' at all,
there is no violation of the POSIX semantics in this situation. It
actually does seem, however, that the fork is closing, or at least
forgetting the stdio file position of, fp when it forks. A possible
memory corruption during fork from which fgets can not recover?

On Tue, Jan 14, 2014 at 10:50 AM,  <tednolan@bellsouth.net> wrote:
> In message <52D55D96.8070407@redhat.com>you write:
>>
>>Your program may be violating POSIX, which would trigger undefined behavior.
>>
>>Quoting POSIX:
>> pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_05
>>
>
> [long quote elided]
>
> Yikes!  That's pretty impenetrable.  And if it says what I think it says,
> it seems to violate the way I've understood Unix fork() and how fds
> (and stdio buffers) are inherited since forever.
>
> However..
>
> Do I understand that to say that if the first thing my child does is
>
>         fclose(fp);
>
> everything should be hunky-dory?
>
> Because I just tried that, and it's still not.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: fork() + file descriptor bug in 1.7.27(0.271/5/3) 2013-12-09 11:54
  2014-01-15  4:53     ` Lord Laraby
@ 2014-01-15  5:40       ` Lord Laraby
  2014-01-15  7:46       ` Peter Rosin
  2014-01-17 20:11       ` Eric Blake
  2 siblings, 0 replies; 22+ messages in thread
From: Lord Laraby @ 2014-01-15  5:40 UTC (permalink / raw)
  To: Cygwin Mailing List

Please forgive my TOFU above. This gmail client really has no manners at all.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: fork() + file descriptor bug in 1.7.27(0.271/5/3) 2013-12-09 11:54
  2014-01-15  4:53     ` Lord Laraby
  2014-01-15  5:40       ` Lord Laraby
@ 2014-01-15  7:46       ` Peter Rosin
  2014-01-15 15:53         ` tednolan
  2014-01-17 20:11       ` Eric Blake
  2 siblings, 1 reply; 22+ messages in thread
From: Peter Rosin @ 2014-01-15  7:46 UTC (permalink / raw)
  To: cygwin

On 2014-01-15 05:53, Lord Laraby wrote:
> On Tue, Jan 14, 2014 at 10:50 AM, Ted Nolan wrote:
>> In message <52D55D96.8070407@redhat.com> you write:
>>>
>>> Your program may be violating POSIX, which would trigger undefined behavior.
>>>
>>> Quoting POSIX:
>>> pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_05
>>>
>>
>> [long quote elided]
>>
>> Yikes!  That's pretty impenetrable.  And if it says what I think it says,
>> it seems to violate the way I've understood Unix fork() and how fds
>> (and stdio buffers) are inherited since forever.
>>
>> However..
>>
>> Do I understand that to say that if the first thing my child does is
>>
>>         fclose(fp);
>>
>> everything should be hunky-dory?
>>
>> Because I just tried that, and it's still not.
> 
> My two cents say, since the child is not referencing 'fp' at all,
> there is no violation of the POSIX semantics in this situation. It
> actually does seem, however, that the fork is closing, or at least
> forgetting the stdio file position of, fp when it forks. A possible
> memory corruption during fork from which fgets can not recover?

Let me requote one little bit quoted by Eric:

	                           (If the only action performed by one of the
	processes is one of the exec functions or _exit() (not exit()), the
	handle is never accessed in that process.)

Ted is using exit() in the children, not _exit(), and the above indicates
that exit() in fact "accesses the handle". My guess would be that fclose(3)
also "accesses the handle".

But, reading about _exit(), it seems that handle accesses are implementation
defined, so I'm not sure it will help in all situations.

Cheers,
Peter



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: fork() + file descriptor bug in 1.7.27(0.271/5/3) 2013-12-09 11:54
  2014-01-15  7:46       ` Peter Rosin
@ 2014-01-15 15:53         ` tednolan
  2014-01-15 16:33           ` Corinna Vinschen
  2014-01-17 20:13           ` Eric Blake
  0 siblings, 2 replies; 22+ messages in thread
From: tednolan @ 2014-01-15 15:53 UTC (permalink / raw)
  To: cygwin

In message <52D63CE2.9060308@lysator.liu.se>you write:
>On 2014-01-15 05:53, Lord Laraby wrote:
>> On Tue, Jan 14, 2014 at 10:50 AM, Ted Nolan wrote:
>>> In message <52D55D96.8070407@redhat.com> you write:
>>>>
>>>> Your program may be violating POSIX, which would trigger undefined behavior.
>>>>
>>>> Quoting POSIX:
>>>> pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_05
>>>>
>>>
>>> [long quote elided]
>>>
>>> Yikes!  That's pretty impenetrable.  And if it says what I think it says,
>>> it seems to violate the way I've understood Unix fork() and how fds
>>> (and stdio buffers) are inherited since forever.
>>>
>>> However..
>>>
>>> Do I understand that to say that if the first thing my child does is
>>>
>>>         fclose(fp);
>>>
>>> everything should be hunky-dory?
>>>
>>> Because I just tried that, and it's still not.
>> 
>> My two cents say, since the child is not referencing 'fp' at all,
>> there is no violation of the POSIX semantics in this situation. It
>> actually does seem, however, that the fork is closing, or at least
>> forgetting the stdio file position of, fp when it forks. A possible
>> memory corruption during fork from which fgets can not recover?
>
>Let me requote one little bit quoted by Eric:
>
>	                           (If the only action performed by one of the
>	processes is one of the exec functions or _exit() (not exit()), the
>	handle is never accessed in that process.)
>
>Ted is using exit() in the children, not _exit(), and the above indicates
>that exit() in fact "accesses the handle". My guess would be that fclose(3)
>also "accesses the handle".
>
>But, reading about _exit(), it seems that handle accesses are implementation
>defined, so I'm not sure it will help in all situations.
>
>Cheers,
>Peter

Well, all I can say in this instance, is that arguably conforming to
the bare letter of the standard (if that's in fact what is happening)
is not "the right thing".  People certainly don't expect that stdio
file pointers that exist at fork() time and which are never "used" by a
child will be reset in the parent.  I mean, if they can't even be fclose()-ed
to take them out of the picture, what chance have you got? -:)

FWIW, FreeBSD, Linux and Solaris all compile and run the test program
with the behavior I expect..



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: fork() + file descriptor bug in 1.7.27(0.271/5/3) 2013-12-09 11:54
  2014-01-15 15:53         ` tednolan
@ 2014-01-15 16:33           ` Corinna Vinschen
  2014-01-16  5:08             ` tednolan
  2014-01-17 20:13           ` Eric Blake
  1 sibling, 1 reply; 22+ messages in thread
From: Corinna Vinschen @ 2014-01-15 16:33 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 2908 bytes --]

On Jan 15 10:28, tednolan@bellsouth.net wrote:
> In message <52D63CE2.9060308@lysator.liu.se>you write:
> >On 2014-01-15 05:53, Lord Laraby wrote:
> >> On Tue, Jan 14, 2014 at 10:50 AM, Ted Nolan wrote:
> >>> In message <52D55D96.8070407@redhat.com> you write:
> >>>>
> >>>> Your program may be violating POSIX, which would trigger undefined behavior.
> >>>>
> >>>> Quoting POSIX:
> >>>> pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_05
> >>>>
> >>>
> >>> [long quote elided]
> >>>
> >>> Yikes!  That's pretty impenetrable.  And if it says what I think it says,
> >>> it seems to violate the way I've understood Unix fork() and how fds
> >>> (and stdio buffers) are inherited since forever.
> >>>
> >>> However..
> >>>
> >>> Do I understand that to say that if the first thing my child does is
> >>>
> >>>         fclose(fp);
> >>>
> >>> everything should be hunky-dory?
> >>>
> >>> Because I just tried that, and it's still not.
> >> 
> >> My two cents say, since the child is not referencing 'fp' at all,
> >> there is no violation of the POSIX semantics in this situation. It
> >> actually does seem, however, that the fork is closing, or at least
> >> forgetting the stdio file position of, fp when it forks. A possible
> >> memory corruption during fork from which fgets can not recover?
> >
> >Let me requote one little bit quoted by Eric:
> >
> >	                           (If the only action performed by one of the
> >	processes is one of the exec functions or _exit() (not exit()), the
> >	handle is never accessed in that process.)
> >
> >Ted is using exit() in the children, not _exit(), and the above indicates
> >that exit() in fact "accesses the handle". My guess would be that fclose(3)
> >also "accesses the handle".
> >
> >But, reading about _exit(), it seems that handle accesses are implementation
> >defined, so I'm not sure it will help in all situations.
> >
> >Cheers,
> >Peter
> 
> Well, all I can say in this instance, is that arguably conforming to
> the bare letter of the standard (if that's in fact what is happening)
> is not "the right thing".  People certainly don't expect that stdio
> file pointers that exist at fork() time and which are never "used" by a
> child will be reset in the parent.  I mean, if they can't even be fclose()-ed
> to take them out of the picture, what chance have you got? -:)
> 
> FWIW, FreeBSD, Linux and Solaris all compile and run the test program
> with the behavior I expect..

Just for completeness:  I can test on Linux, but not on FreeBSD and
Solaris.  Does the testcase also work as expected on both of them,
after you added fclose to the child?  On Linux it does.


Thanks,
Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: fork() + file descriptor bug in 1.7.27(0.271/5/3) 2013-12-09 11:54
  2014-01-13 16:31 fork() + file descriptor bug in 1.7.27(0.271/5/3) 2013-12-09 11:54 tednolan
  2014-01-14 10:18 ` Corinna Vinschen
  2014-01-14 15:54 ` Eric Blake
@ 2014-01-15 16:40 ` Tom Honermann
  2014-01-15 16:50   ` Corinna Vinschen
  2014-01-17 20:06 ` Eric Blake
  3 siblings, 1 reply; 22+ messages in thread
From: Tom Honermann @ 2014-01-15 16:40 UTC (permalink / raw)
  To: cygwin

On 01/13/2014 11:06 AM, tednolan@bellsouth.net wrote:
...
> 		switch (fork()) {
> 			
> 			/* error */
> 			case -1:
...
> 			/* child */
> 			case 0:
> 				fprintf(stderr, "child\n"); fflush(stderr);
> 				exit(0);
> 				break;

The above code is incorrect.  It is always wrong to call exit() from a 
forked process that has not yet called one of the exec() family of 
functions.  _exit() should be called instead.  Best case, calling exit() 
will result in double flushing of any stream buffers held by the parent 
at the time fork() is called (since the buffers will (eventually) be 
flushed by the parent as well as the child (unless at least one of the 
processes aborts or exits with _exit()).
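
The double flushing described above can be made visible with a small
experiment.  This is an illustrative sketch, not code from the testcase:
size_after_fork() and its flag are invented names, and it assumes an
ordinary regular file that stdio buffers fully.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical demo: leave 5 unwritten bytes in a stdio buffer, fork, and
 * let the child leave through exit() or _exit().
 * Returns the final size of the file, or -1 on error.
 */
long size_after_fork(const char *path, int child_uses_exit)
{
	struct stat st;
	pid_t pid;
	FILE *fp = fopen(path, "w");

	if (!fp)
		return -1;

	/* Keep unrelated pending output out of the experiment. */
	fflush(stdout);
	fflush(stderr);

	fputs("hello", fp);	/* stays in fp's buffer, not yet written */

	pid = fork();
	if (pid == -1) {
		fclose(fp);
		return -1;
	}
	if (pid == 0) {
		if (child_uses_exit)
			exit(0);	/* flushes the child's copy of the buffer */
		_exit(0);		/* leaves stdio buffers alone */
	}

	waitpid(pid, NULL, 0);
	fclose(fp);			/* parent flushes its own copy */

	if (stat(path, &st) != 0)
		return -1;
	return (long)st.st_size;
}
```

With the child using _exit() the file should end up 5 bytes long.  With
exit() both processes write their copy of the buffer through the shared
file offset, so 10 bytes would be expected.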

Tom.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: fork() + file descriptor bug in 1.7.27(0.271/5/3) 2013-12-09 11:54
  2014-01-15 16:40 ` Tom Honermann
@ 2014-01-15 16:50   ` Corinna Vinschen
  0 siblings, 0 replies; 22+ messages in thread
From: Corinna Vinschen @ 2014-01-15 16:50 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 1289 bytes --]

On Jan 15 11:39, Tom Honermann wrote:
> On 01/13/2014 11:06 AM, tednolan@bellsouth.net wrote:
> ...
> >		switch (fork()) {
> >			
> >			/* error */
> >			case -1:
> ...
> >			/* child */
> >			case 0:
> >				fprintf(stderr, "child\n"); fflush(stderr);
> >				exit(0);
> >				break;
> 
> The above code is incorrect.  It is always wrong to call exit() from
> a forked process that has not yet called one of the exec() family of
> functions.  _exit() should be called instead.  Best case, calling
> exit() will result in double flushing of any stream buffers held by
> the parent at the time fork() is called (since the buffers will
> (eventually) be flushed by the parent as well as the child (unless
> at least one of the processes aborts or exits with _exit()).

Still, SUSv4 says:

  The exit() function shall then flush all open streams with unwritten
  buffered data and close all open streams. Finally, the process shall
  be terminated [...]
 
Note that exit only flushes streams with *unwritten* data, but not
streams with *unread* data.  So this testcase is still valid and
should work.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: fork() + file descriptor bug in 1.7.27(0.271/5/3) 2013-12-09 11:54
  2014-01-15 16:33           ` Corinna Vinschen
@ 2014-01-16  5:08             ` tednolan
  2014-01-16  8:50               ` Corinna Vinschen
  0 siblings, 1 reply; 22+ messages in thread
From: tednolan @ 2014-01-16  5:08 UTC (permalink / raw)
  To: cygwin

In message <20140115163354.GA30234@calimero.vinschen.de>you write:
>On Jan 15 10:28, tednolan@bellsouth.net wrote:
>> FWIW, FreeBSD, Linux and Solaris all compile and run the test program
>> with the behavior I expect..
>
>Just for completeness:  I can test on Linux, but not on FreeBSD and
>Solaris.  Does the testcase also work as expected on both of them,
>after you added fclose to the child?  On Linux it does.
>
>
>Thanks,
>Corinna
>

Well, it appears I spoke too soon about Solaris.  I saw that it terminated
rather than running forever, and assumed it was working correctly.
That turns out not to be the case: For 3 lines in the input file, it somehow
gets up to 8 processes before terminating.

Here's what I can say per OS:

FreeBSD 4.9
FreeBSD 8.1
FreeBSD 9.1

	Original test case works.
	Test case with fclose() works
	Test case with _exit() instead of exit() works

Solaris 9:

	Original test case fails (but terminates)
	Test case with fclose() fails
	Test case with _exit() instead of exit() works

Cygwin:
	Original test case fails (never terminates)
	Test case with fclose() fails
	Test case with _exit() instead of exit() works

Gentoo Linux:
	Original test case works
	Test case with fclose() -- don't have access right now
	Test case with _exit() instead of exit() -- don't have access right now

So, as per other posters, exit() is wrong and should be _exit().  I accept
that, and will fix it, but it still seems to be that the Linux and FreeBSD
behavior is better here.  If the spec allows "spooky action at a distance",
that's not the same as encouraging it..
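
For reference, the _exit() variant of the loop might look roughly like the
sketch below.  It is a cut-down illustration rather than the real
multi_tiff2ps code: run_spec() is a made-up name, the child does no work,
the child limit is left out, and the %4095s field widths are an addition
to keep sscanf() inside the buffers.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork one child per well-formed spec line; each
 * child leaves with _exit() so it never touches the inherited streams.
 * Returns the number of children forked, or -1 on error.
 */
int run_spec(const char *path)
{
	char buf[4096], infile[4096], outfile[4096];
	int forked = 0;
	FILE *fp = fopen(path, "r");

	if (!fp)
		return -1;

	/* Keep pending output from being inherited half-written. */
	fflush(stdout);
	fflush(stderr);

	while (fgets(buf, sizeof(buf), fp)) {
		if (sscanf(buf, "%4095s %4095s", infile, outfile) != 2)
			continue;	/* malformed line, skip it */

		switch (fork()) {
		case -1:
			fclose(fp);
			return -1;
		case 0:
			/* child: the real work would go here */
			_exit(0);
		default:
			++forked;
			break;
		}
	}

	fclose(fp);

	/* Reap every child before returning. */
	while (wait(NULL) > 0)
		;

	return forked;
}
```

For the three-line data file from the original report this should fork
exactly three children.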


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: fork() + file descriptor bug in 1.7.27(0.271/5/3) 2013-12-09 11:54
  2014-01-16  5:08             ` tednolan
@ 2014-01-16  8:50               ` Corinna Vinschen
  2014-01-17  5:19                 ` tednolan
  0 siblings, 1 reply; 22+ messages in thread
From: Corinna Vinschen @ 2014-01-16  8:50 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 2419 bytes --]

On Jan 15 23:42, tednolan@bellsouth.net wrote:
> >> FWIW, FreeBSD, Linux and Solaris all compile and run the test program
> >> with the behavior I expect..
> >
> >Just for completeness:  I can test on Linux, but not on FreeBSD and
> >Solaris.  Does the testcase also work as expected on both of them,
> >after you added fclose to the child?  On Linux it does.
> >
> >
> >Thanks,
> >Corinna
> >
> 
> Well, it appears I spoke too soon about Solaris.  I saw that it terminated
> rather than running forever, and assumed it was working correctly.
> That turns out not to be the case: For 3 lines in the input file, it somehow
> gets up to 8 processes before terminating.
> 
> Here's what I can say per OS:
> 
> FreeBSD 4.9
> FreeBSD 8.1
> FreeBSD 9.1
> 
> 	Original test case works.
> 	Test case with fclose() works
> 	Test case with _exit() instead of exit() works
> 
> Solaris 9:
> 
> 	Original test case fails (but terminates)
> 	Test case with fclose() fails
> 	Test case with _exit() instead of exit() works
> 
> Cygwin:
> 	Original test case fails (never terminates)
> 	Test case with fclose() fails
> 	Test case with _exit() instead of exit() works
> 
> Gentoo Linux:
> 	Original test case works
> 	Test case with fclose() -- don't have access right now
> 	Test case with _exit() instead of exit() -- don't have access right now

Can you change your testcase another bit, please?  Enable your
`ftell' printf, but rather than printing the result of ftell,
print the result of lseek:

  fprintf(stderr, "(%s) (%s) %d %ld\n", infile,
        outfile, i, lseek(fileno(fp), 0, SEEK_CUR));

I would be curious what happens on Solaris here.
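
To spell out why the two numbers can differ: ftell reports the position
of the stdio stream, while lseek reports the offset of the descriptor
underneath, which is usually further along because stdio reads ahead into
its buffer.  A minimal sketch, separate from the testcase, with
positions_after_one_line() as a made-up name:

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read one line from `path`, then report the
 * stream's logical position (ftell) and the descriptor's offset (lseek).
 * Returns 0 on success, -1 on error.
 */
int positions_after_one_line(const char *path, long *stream_pos, long *fd_pos)
{
	char buf[4096];
	FILE *fp = fopen(path, "r");

	if (!fp)
		return -1;
	if (!fgets(buf, sizeof(buf), fp)) {
		fclose(fp);
		return -1;
	}

	/* Query the descriptor first, so ftell() cannot influence the reading. */
	*fd_pos = (long)lseek(fileno(fp), 0, SEEK_CUR);
	*stream_pos = ftell(fp);

	fclose(fp);
	return 0;
}
```

On the three-line data file the stream position after the first line
should be 15, while the descriptor offset is typically larger because
the rest of the small file is already sitting in the stdio buffer.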

> So, as per other posters, exit() is wrong and should be _exit().  I accept
> that, and will fix it, but it still seems to be that the Linux and FreeBSD
> behavior is better here.  If the spec allows "spooky action at a distance",
> that's not the same as encouraging it..

Well, not quite.  In theory, Linux is our role model for this kind of
behaviour, so I would opt for changing that to follow Linux.  It's an
easy patch, but it's a bit dangerous, because the code in question is
shared with newlib, so a change affects a lot of other targets as well.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat


* Re: fork() + file descriptor bug in 1.7.27(0.271/5/3) 2013-12-09 11:54
  2014-01-16  8:50               ` Corinna Vinschen
@ 2014-01-17  5:19                 ` tednolan
  2014-01-17  9:28                   ` Corinna Vinschen
  0 siblings, 1 reply; 22+ messages in thread
From: tednolan @ 2014-01-17  5:19 UTC (permalink / raw)
  To: cygwin

In message <20140116085026.GA26205@calimero.vinschen.de>you write:
>
>Can you change your testcase another bit, please?  Enable your
>`ftell' printf, but rather than printing the result of ftell,
>print the result of lseek:
>
>  fprintf(stderr, "(%s) (%s) %d %ld\n", infile,
>        outfile, i, (long) lseek(fileno(fp), 0, SEEK_CUR));
>
>I would be curious what happens on Solaris here.
>

OK,

I took the original test case and made your lseek change.  Here are
the Solaris & FreeBSD results.

Here is Solaris 9:

=====================SOLARIS========================
Script started on Thu Jan 16 23:47:20 2014
solabel10% ./a.out < test_data
(00.tif) (00.eps) 1 45
Running 0
child
(01.tif) (01.eps) 2 15
Running 1
child
(02.tif) (02.eps) 3 0
Running 2
child
(00.tif) (00.eps) 4 45
Running 3
child
(01.tif) (01.eps) 5 15
Running 4
child
(02.tif) (02.eps) 6 0
Running 5
child
(00.tif) (00.eps) 7 45
Running 6
(01.tif) (01.eps) 8 45
Running 7
(02.tif) (02.eps) 9 45
Running 8
Final wait
Final wait
Final wait
Final wait
Final wait
Final wait
Final wait
child
Final wait
child
child
Final wait
solabel10% exit
solabel10% 
script done on Thu Jan 16 23:47:29 2014
=====================END SOLARIS========================

FreeBSD 4.9:

=====================FREEBSD 4.9========================
loft% ./a.out < test_data
(00.tif) (00.eps) 1 45
Running 0
(01.tif) (01.eps) 2 45
child
Running 1
(02.tif) (02.eps) 3 45
child
Running 2
Final wait
child
Final wait
Final wait
=====================END FREEBSD 4.9========================

FreeBSD 9.1:

=====================FREEBSD 9.1========================
(00.tif) (00.eps) 1 45
Running 0
(01.tif) (01.eps) 2 45
Running 1
(02.tif) (02.eps) 3 45
Running 2
Final wait
child
child
Final wait
Final wait
child
brookside% 
=====================END FREEBSD 9.1========================

FreeBSD 8.1:

=====================FREEBSD 8.1========================
%./a.out < test_data
(00.tif) (00.eps) 1 45
Running 0
(01.tif) (01.eps) 2 45
child
Running 1
(02.tif) (02.eps) 3 45
child
Running 2
Final wait
child
Final wait
Final wait
=====================END FREEBSD 8.1========================

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple


* Re: fork() + file descriptor bug in 1.7.27(0.271/5/3) 2013-12-09 11:54
  2014-01-17  5:19                 ` tednolan
@ 2014-01-17  9:28                   ` Corinna Vinschen
  0 siblings, 0 replies; 22+ messages in thread
From: Corinna Vinschen @ 2014-01-17  9:28 UTC (permalink / raw)
  To: cygwin

On Jan 16 23:53, tednolan@bellsouth.net wrote:
> In message <20140116085026.GA26205@calimero.vinschen.de>you write:
> >
> >Can you change your testcase another bit, please?  Enable your
> >`ftell' printf, but rather than printing the result of ftell,
> >print the result of lseek:
> >
> >  fprintf(stderr, "(%s) (%s) %d %ld\n", infile,
> >        outfile, i, (long) lseek(fileno(fp), 0, SEEK_CUR));
> >
> >I would be curious what happens on Solaris here.
> >
> 
> OK,
> 
> I took the original test case and made your lseek change.  Here are
> the Solaris & FreeBSD results.
> 
> Here is Solaris 9:
> 
> =====================SOLARIS========================
> Script started on Thu Jan 16 23:47:20 2014
> solabel10% ./a.out < test_data
> (00.tif) (00.eps) 1 45
> Running 0
> child
> (01.tif) (01.eps) 2 15
> Running 1
> child
> (02.tif) (02.eps) 3 0
> Running 2

Thanks!  That's exactly what happens on Cygwin as well.

I'm about to check in code to newlib that allows choosing between
Solaris/POSIX semantics and BSD/Linux semantics(*) at build time.  Since
Linux is our role model, we're going to switch to BSD/Linux semantics
for the next Cygwin release as well.


Thanks,
Corinna

(*) More precisely: BSD/Glibc semantics.
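
The lseek numbers in the traces above differ from what ftell would report
because stdio reads ahead: ftell gives the stream's logical position, while
lseek on fileno(fp) gives the offset of the underlying descriptor.  A minimal
sketch of that difference, not taken from the thread; the helper name
stream_vs_fd_position and its out-parameters are hypothetical, and the exact
descriptor offset depends on the C library's buffer size:

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read one line from `path`, then report both the
   stream position (ftell) and the descriptor offset (lseek).
   Returns 0 on success, -1 on error. */
int stream_vs_fd_position(const char *path, long *stream_pos, off_t *fd_pos)
{
	char buf[4096];
	FILE *fp = fopen(path, "r");

	if (!fp)
		return -1;
	if (!fgets(buf, sizeof(buf), fp)) {
		fclose(fp);
		return -1;
	}

	/* Position of the next byte fgets() would hand back. */
	*stream_pos = ftell(fp);
	/* Position of the next byte read() would fetch from the kernel;
	   usually further along, because stdio fills its buffer ahead. */
	*fd_pos = lseek(fileno(fp), 0, SEEK_CUR);

	fclose(fp);
	return 0;
}
```

For a small input file the descriptor offset is typically the file size
after the first fgets, since the whole file fits in one stdio buffer.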

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat


* Re: fork() + file descriptor bug in 1.7.27(0.271/5/3) 2013-12-09 11:54
  2014-01-13 16:31 fork() + file descriptor bug in 1.7.27(0.271/5/3) 2013-12-09 11:54 tednolan
                   ` (2 preceding siblings ...)
  2014-01-15 16:40 ` Tom Honermann
@ 2014-01-17 20:06 ` Eric Blake
  3 siblings, 0 replies; 22+ messages in thread
From: Eric Blake @ 2014-01-17 20:06 UTC (permalink / raw)
  To: cygwin

On 01/13/2014 09:06 AM, tednolan@bellsouth.net wrote:

> 	while( fgets(buf, sizeof(buf), fp) ) {
> 		++i;
> 
> 		if(sscanf(buf, "%s %s", infile, outfile) != 2) {

> 
> 		switch (fork()) {
> 			

> 			case 0:
> 				fprintf(stderr, "child\n"); fflush(stderr);
> 				exit(0);

Your program violates POSIX, and triggers undefined behavior.  Add an
fflush(NULL) prior to the fork(), and that should avoid the infloop.

====
http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_05

> Note that after a fork(), two handles exist where one existed before.
> The application shall ensure that, if both handles can ever be accessed,
> they are both in a state where the other could become the active handle
> first. The application shall prepare for a fork() exactly as if it were
> a change of active handle. (If the only action performed by one of the
> processes is one of the exec functions or _exit() (not exit()), the
> handle is never accessed in that process.)

Let's label our two handles: Handle 1 is the parent's handle, as well as
the lone handle that existed pre-fork.  Handle 2 is the child's handle.

>
> For the first handle, the first applicable condition below applies.
> After the actions required below are taken, if the handle is still open,
> the application can close it.
>
>     If it is a file descriptor, no action is required.

But the handle is a stream, not a file descriptor, so this is not met.

>
>     If the only further action to be performed on any handle to this
>     open file descriptor is to close it, no action need be taken.

The code isn't calling close(fileno(fp)), so this is not met.

>
>     If it is a stream which is unbuffered, no action need be taken.
>

fp is buffered, so this is not met.

>     If it is a stream which is line buffered, and the last byte
>     written to the stream was a <newline> (that is, as if a:
>
>         putc('\n')
>
>     was the most recent operation on that stream), no action need be
>     taken.

fp is not line buffered, so this is not met.

>
>     If it is a stream which is open for writing or appending (but not
>     also open for reading), the application shall either perform an
>     fflush(), or the stream shall be closed.

fp is not open for writing, so this is not met.

>
>     If the stream is open for reading and it is at the end of the file
>     (feof() is true), no action need be taken.

fp is not at EOF, so this is not met.

>
>     If the stream is open with a mode that allows reading and the
>     underlying open file description refers to a device that is capable of
>     seeking, the application shall either perform an fflush(), or the stream
>     shall be closed.

The application did not call fflush(fp) (or fflush(NULL)), and the
stream was not closed prior to fork, so this is not met.

>
> For the second handle:
>
>     If any previous active handle has been used by a function that
>     explicitly changed the file offset, except as required above for the
>     first handle, the application shall perform an lseek() or fseek() (as
>     appropriate to the type of handle) to an appropriate location.

The previous active handle was used to change offset (by fgets), but
none of the requirements on the first handle were met, and we fail to
fseek() on the second handle.  Therefore, the fact that exit() calls
fflush() and changes the offset of the fd, leading to an infloop in the
parent, is a result of the bug in the program violating the POSIX
constraints on active handle manipulation.

>
> If the active handle ceases to be accessible before the requirements
> on the first handle, above, have been met, the state of the open file
> description becomes undefined. This might occur during functions such as
> a fork() or _exit().

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org



* Re: fork() + file descriptor bug in 1.7.27(0.271/5/3) 2013-12-09 11:54
  2014-01-15  4:53     ` Lord Laraby
  2014-01-15  5:40       ` Lord Laraby
  2014-01-15  7:46       ` Peter Rosin
@ 2014-01-17 20:11       ` Eric Blake
  2 siblings, 0 replies; 22+ messages in thread
From: Eric Blake @ 2014-01-17 20:11 UTC (permalink / raw)
  To: cygwin

On 01/14/2014 09:53 PM, Lord Laraby wrote:
> My two cents say, since the child is not referencing 'fp' at all,
> there is no violation of the POSIX semantics in this situation.

But the child _IS_ referencing 'fp', via the implicit fflush(NULL) done
by exit().  If you use _exit() to bypass that implicit fflush, things
are a lot nicer.
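
A minimal sketch of the _exit() variant, not taken from the thread: the
parent reads a spec file line by line and forks one child per line, and each
child leaves through _exit() so that its inherited copy of fp is never
flushed.  The helper name fork_per_line_underscore_exit and the 1000-line
cap are assumptions made for illustration only.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork one child per input line; each child leaves
   via _exit().  Returns the number of lines the parent read, -1 on error. */
int fork_per_line_underscore_exit(const char *path)
{
	char buf[4096];
	int lines = 0;
	FILE *fp = fopen(path, "r");

	if (!fp)
		return -1;

	/* The cap only guards against a runaway loop in this sketch. */
	while (lines < 1000 && fgets(buf, sizeof(buf), fp)) {
		pid_t pid;

		++lines;

		pid = fork();
		if (pid == -1) {
			fclose(fp);
			return -1;
		}
		if (pid == 0) {
			/* _exit() skips the stdio cleanup that exit() performs,
			   so the child never touches the shared file offset. */
			_exit(0);
		}
		if (waitpid(pid, NULL, 0) == -1) {
			fclose(fp);
			return -1;
		}
	}

	fclose(fp);
	return lines;
}
```

With a three-line input file the parent should read three lines, which
matches the "_exit() instead of exit() works" rows reported for every OS
earlier in the thread.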

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org



* Re: fork() + file descriptor bug in 1.7.27(0.271/5/3) 2013-12-09 11:54
  2014-01-15 15:53         ` tednolan
  2014-01-15 16:33           ` Corinna Vinschen
@ 2014-01-17 20:13           ` Eric Blake
  1 sibling, 0 replies; 22+ messages in thread
From: Eric Blake @ 2014-01-17 20:13 UTC (permalink / raw)
  To: cygwin

On 01/15/2014 08:28 AM, tednolan@bellsouth.net wrote:

> Well, all I can say in this instance, is that arguably conforming to
> the bare letter of the standard (if that's in fact what is happening)
> is not "the right thing".  People certainly don't expect that stdio
> file pointers that exist at fork() time and which are never "used" by a
> child will be reset in the parent.  I mean, if they can't even be fclose()-ed
> to take them out of the picture, what chance have you got? -:)
> 
> FWIW, FreeBSD, Linux and Solaris all compile and run the test program
> with the behavior I expect.

Rather, FreeBSD and Linux share the intuitive behavior of not resetting
the underlying fd position, but violate POSIX in the process, while
Solaris DOES reset the fd and so exposes the same undefined behavior in
the sample program.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org



* Re: fork() + file descriptor bug in 1.7.27(0.271/5/3) 2013-12-09 11:54
  2014-01-14 16:15   ` tednolan
  2014-01-15  4:53     ` Lord Laraby
@ 2014-01-17 20:38     ` Eric Blake
  2014-01-17 21:02       ` tednolan
  2014-01-17 22:12       ` Eric Blake
  1 sibling, 2 replies; 22+ messages in thread
From: Eric Blake @ 2014-01-17 20:38 UTC (permalink / raw)
  To: cygwin

On 01/14/2014 08:50 AM, tednolan@bellsouth.net wrote:
> In message <52D55D96.8070407@redhat.com>you write:
>>
>> Your program may be violating POSIX, which would trigger undefined behavior.
>>
>> Quoting POSIX:
>> pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_05
>>
> 
> [long quote elided]
> 
> Yikes!  That's pretty impenetrable.  And if it says what I think it says,
> it seems to violate the way I've understood Unix fork() and how fds 
> (and stdio buffers) are inherited since forever.

It says that, intuitively,

putc('a', stdout);
fork();
exit(0);

may or may not print 'a' twice, because both parent and child hold a
copy of the same unflushed stream buffer when they call exit() (which
implies an fflush()).  The fix is to:

putc('a', stdout);
fflush(NULL);
fork();
exit(0);

and now you are guaranteed 'a' is only printed once.

You have the same problem, but in the read direction.  You have both
parent and child with a stream buffer that hasn't yet been flushed back
to the fd underlying the stream.  If you add an fflush(NULL) before the
fork(), your bug should go away (if it doesn't, then _that's_ a cygwin
bug).  Yes, it's annoying that BSD and glibc don't comply with POSIX
behavior, and thereby mask the effects of your bug.  And we are patching
cygwin to mirror glibc's bug, so that your bug will also be masked on
cygwin.  But that's no excuse to not fix your program to not trigger the
bug in the first place.

> 
> However..
> 
> Do I understand that to say that if the first thing my child does is
> 
> 	fclose(fp);
> 
> everything should be hunky-dory?

No.  You have to fix things _in the parent, before the fork()_ for
everything to be hunky-dory.  The easiest way to do that is to
fflush(NULL) before fork()ing.
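
The write-direction case above (putc, fflush(NULL), fork, exit) can be
sketched as a small self-checking helper.  This is only an illustration, not
code from the thread: it writes to a file instead of stdout so the result
can be measured, and the name bytes_after_flush_and_fork is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: write one 'a' to a buffered stream on `path`,
   flush, fork, and let both processes run their stdio cleanup.
   Returns the resulting file size in bytes, or -1 on error. */
long bytes_after_flush_and_fork(const char *path)
{
	long size;
	pid_t pid;
	FILE *fp = fopen(path, "w");

	if (!fp)
		return -1;

	putc('a', fp);		/* sits in the stdio buffer */
	fflush(NULL);		/* nothing is pending when we fork */

	pid = fork();
	if (pid == -1) {
		fclose(fp);
		return -1;
	}
	if (pid == 0)
		exit(0);	/* child: exit() flushes, but the buffer is empty */

	if (waitpid(pid, NULL, 0) == -1) {
		fclose(fp);
		return -1;
	}
	fclose(fp);		/* parent: nothing left to write either */

	/* Measure what actually reached the file. */
	fp = fopen(path, "r");
	if (!fp)
		return -1;
	if (fseek(fp, 0, SEEK_END) != 0) {
		fclose(fp);
		return -1;
	}
	size = ftell(fp);
	fclose(fp);
	return size;
}
```

The helper returns 1 because the byte is written exactly once.  Dropping the
fflush(NULL) line leaves an unflushed copy of the buffer in both processes,
and the file may then end up with two bytes.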

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org



* Re: fork() + file descriptor bug in 1.7.27(0.271/5/3) 2013-12-09 11:54
  2014-01-17 20:38     ` Eric Blake
@ 2014-01-17 21:02       ` tednolan
  2014-01-17 22:12       ` Eric Blake
  1 sibling, 0 replies; 22+ messages in thread
From: tednolan @ 2014-01-17 21:02 UTC (permalink / raw)
  To: cygwin

In message <52D98E1D.8010907@redhat.com>you write:
>
>No.  You have to fix things _in the parent, before the fork()_ for
>everything to be hunky-dory.  The easiest way to do that is to
>fflush(NULL) before fork()ing.
>

You learn something new every day.

Usually just after you needed to know it.


* Re: fork() + file descriptor bug in 1.7.27(0.271/5/3) 2013-12-09 11:54
  2014-01-17 20:38     ` Eric Blake
  2014-01-17 21:02       ` tednolan
@ 2014-01-17 22:12       ` Eric Blake
  2014-01-17 23:25         ` Lord Laraby
  1 sibling, 1 reply; 22+ messages in thread
From: Eric Blake @ 2014-01-17 22:12 UTC (permalink / raw)
  To: cygwin

On 01/17/2014 01:10 PM, Eric Blake wrote:
>> However..
>>
>> Do I understand that to say that if the first thing my child does is
>>
>> 	fclose(fp);
>>
>> everything should be hunky-dory?
> 
> No.  You have to fix things _in the parent, before the fork()_ for
> everything to be hunky-dory.  The easiest way to do that is to
> fflush(NULL) before fork()ing.

The exception to needing to fflush() before forking is when the child
will call exec*() or _exit(), as those paths discard any partially-read
buffers without reflecting them back to the underlying fd, and thus
don't interfere with the parent's notion of where to continue reading
from the fd.

And as it is, all this discussion about fflush(NULL) before fork()
depends on the POSIX folks fixing this bug which I just filed:
http://austingroupbugs.net/view.php?id=816

Without that fix in POSIX, you could argue that fflush(NULL) won't do
anything to input streams, and that you would have to explicitly call
fflush(stream) for every input stream, for cases where you know your
child is going to exit() rather than _exit() or exec*().
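
A minimal sketch of the conservative approach described above, for a child
that will call exit(): flush the input stream explicitly, then flush
everything else, before each fork().  It is an illustration under stated
assumptions rather than the fix applied in the thread; the helper name
fork_per_line_flushing_input and the 1000-line cap are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork one child per input line, preparing the
   streams before each fork().  Returns the number of lines the parent
   read, or -1 on error. */
int fork_per_line_flushing_input(const char *path)
{
	char buf[4096];
	int lines = 0;
	FILE *fp = fopen(path, "r");

	if (!fp)
		return -1;

	/* The cap only guards against a runaway loop in this sketch. */
	while (lines < 1000 && fgets(buf, sizeof(buf), fp)) {
		pid_t pid;

		++lines;

		/* Sync the descriptor offset with the stream position, so the
		   child inherits no read-ahead data to "give back" on exit(). */
		fflush(fp);
		/* Flush pending output on all other streams as well. */
		fflush(NULL);

		pid = fork();
		if (pid == -1) {
			fclose(fp);
			return -1;
		}
		if (pid == 0)
			exit(0);	/* exit(), not _exit(): stdio cleanup runs */

		if (waitpid(pid, NULL, 0) == -1) {
			fclose(fp);
			return -1;
		}
	}

	fclose(fp);
	return lines;
}
```

The explicit fflush(fp) is there because, as noted above, it is arguable
whether fflush(NULL) alone covers input streams.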

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org



* Re: fork() + file descriptor bug in 1.7.27(0.271/5/3) 2013-12-09 11:54
  2014-01-17 22:12       ` Eric Blake
@ 2014-01-17 23:25         ` Lord Laraby
  0 siblings, 0 replies; 22+ messages in thread
From: Lord Laraby @ 2014-01-17 23:25 UTC (permalink / raw)
  To: Cygwin Mailing List

Well, that is some interesting stuff. So, in POSIX, the child gets the
same FD in the same place, and it is actually a second reference to the
same entry in the kernel's open file table that the parent uses (via the
FD) to determine the offset, flags, etc.
That would explain why the child calling exit() flushes the parent's
input file as well as its own. They are the same file as far as the
kernel maintains it. As well, the buffering on the file determines how
much of the file the parent loses when the child flushes the buffer.
For a standard 4K buffer, if the input file is less than a full
buffer, the parent would see an EOF even though it had last read one
or more short lines of input. Now, if the input file had been
unbuffered at the kernel level, this would not cause the problem we
see. Perhaps we'd lose the next character read and buffered.
I must remember to call _exit() when I use fork.
The above makes sense when you consider stdout and stderr, for they
keep the parent and child from clobbering each other's output. I'm not
so sure it's useful on input files so much, though. JMHO.
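
The shared file-table entry described above can be observed directly with
plain descriptors, without stdio.  A minimal sketch, not from the thread:
the child moves the offset with lseek() and the parent sees the new value.
The helper name offset_seen_by_parent_after_child_seek and the temporary
file path are hypothetical.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the offset the parent observes after the
   child has moved it on the inherited descriptor, or -1 on error. */
off_t offset_seen_by_parent_after_child_seek(void)
{
	char path[] = "/tmp/shared_offset_XXXXXX";
	off_t seen;
	pid_t pid;
	int fd = mkstemp(path);

	if (fd == -1)
		return -1;
	unlink(path);	/* the open descriptor keeps the file alive */

	if (write(fd, "0123456789", 10) != 10 ||
	    lseek(fd, 0, SEEK_SET) != 0) {
		close(fd);
		return -1;
	}

	pid = fork();
	if (pid == -1) {
		close(fd);
		return -1;
	}
	if (pid == 0) {
		/* Child: same open file description, so this moves the
		   parent's offset too. */
		lseek(fd, 7, SEEK_SET);
		_exit(0);
	}

	if (waitpid(pid, NULL, 0) == -1) {
		close(fd);
		return -1;
	}

	/* Parent never seeks itself, yet the offset is now 7. */
	seen = lseek(fd, 0, SEEK_CUR);
	close(fd);
	return seen;
}
```

This is the same mechanism by which the child's exit() flush moves the
parent's read position in the original test case.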

LL



Thread overview: 22+ messages
2014-01-13 16:31 fork() + file descriptor bug in 1.7.27(0.271/5/3) 2013-12-09 11:54 tednolan
2014-01-14 10:18 ` Corinna Vinschen
2014-01-14 15:54 ` Eric Blake
2014-01-14 16:15   ` tednolan
2014-01-15  4:53     ` Lord Laraby
2014-01-15  5:40       ` Lord Laraby
2014-01-15  7:46       ` Peter Rosin
2014-01-15 15:53         ` tednolan
2014-01-15 16:33           ` Corinna Vinschen
2014-01-16  5:08             ` tednolan
2014-01-16  8:50               ` Corinna Vinschen
2014-01-17  5:19                 ` tednolan
2014-01-17  9:28                   ` Corinna Vinschen
2014-01-17 20:13           ` Eric Blake
2014-01-17 20:11       ` Eric Blake
2014-01-17 20:38     ` Eric Blake
2014-01-17 21:02       ` tednolan
2014-01-17 22:12       ` Eric Blake
2014-01-17 23:25         ` Lord Laraby
2014-01-15 16:40 ` Tom Honermann
2014-01-15 16:50   ` Corinna Vinschen
2014-01-17 20:06 ` Eric Blake
