public inbox for libc-alpha@sourceware.org
* setrlimit change to prlimit change in behavior?
@ 2017-10-18 13:46 Mark Wielaard
  2017-10-18 14:13 ` Andreas Schwab
  0 siblings, 1 reply; 12+ messages in thread
From: Mark Wielaard @ 2017-10-18 13:46 UTC (permalink / raw)
  To: libc-alpha; +Cc: Michael Kerrisk (man-pages)

Hi,

I observed a, probably silly, change in behavior while running the
valgrind testsuite on a RHEL7 setup using linux 3.10 and glibc 2.17 and
a Fedora27 (beta) setup using linux 4.13 and glibc 2.26.

The valgrind testsuite has a testcase that checks that
 setrlimit (RLIMIT_NOFILE, NULL)
returns failure and sets errno to EFAULT.

Which it does on RHEL7, but on Fedora27 this silently returns success.

I suspect that this was caused by commit 695d7d138 "Assume prlimit64 is
available" which turns setrlimit (RLIMIT_NOFILE, NULL) into prlimit (0,
RLIMIT_NOFILE, NULL, NULL).

The man page http://man7.org/linux/man-pages/man2/prlimit.2.html
doesn't really make clear what happens if both old_limit and new_limit
are NULL. But apparently the kernel interprets that as a NOP. Or maybe
a check to see if you would have permission to get/set the limit for
the given pid. But that is zero here, which means, your own process.
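
The both-NULL case can be probed without going through glibc. The
following is a minimal sketch, assuming a Linux kernel that provides the
prlimit64 system call; the function name raw_prlimit_both_null is made
up for illustration. Consistent with the behaviour observed above, it is
expected to report success when both pointers are NULL.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/resource.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: invoke the raw prlimit64 system call on the
   calling process (pid 0) with both limit pointers NULL.
   Returns 0 on success, or the errno value on failure. */
int raw_prlimit_both_null(void)
{
    long ret = syscall(SYS_prlimit64, 0L, (long) RLIMIT_NOFILE, NULL, NULL);
    return ret == 0 ? 0 : errno;
}
```

Calling the system call through syscall() keeps the libc wrapper out of
the picture, so the result reflects only what the kernel does.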

There is probably no other code than the valgrind testsuite that
depends on this particular behavior of getting an EFAULT for a NULL
argument to setrlimit. And we could easily change the testcase. But
maybe someone sees a real issue in this change of behavior?

Cheers,

Mark


* Re: setrlimit change to prlimit change in behavior?
  2017-10-18 13:46 setrlimit change to prlimit change in behavior? Mark Wielaard
@ 2017-10-18 14:13 ` Andreas Schwab
  2017-10-18 14:46   ` Mark Wielaard
  0 siblings, 1 reply; 12+ messages in thread
From: Andreas Schwab @ 2017-10-18 14:13 UTC (permalink / raw)
  To: Mark Wielaard; +Cc: libc-alpha, Michael Kerrisk (man-pages)

On Oct 18 2017, Mark Wielaard <mark@klomp.org> wrote:

> The valgrind testsuite has a testcase that checks that
>  setrlimit (RLIMIT_NOFILE, NULL)
> returns failure and sets errno to EFAULT.

In general, you cannot count on EFAULT.  The call has undefined
behaviour, thus unbounded effect (a crash would be ok too).  If you want
to check the effect of a syscall you have to call it directly, not via a
libc function.
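
A minimal sketch of such a direct call, assuming an architecture whose
kernel headers still define SYS_setrlimit (some newer ports only provide
prlimit64); the helper name raw_setrlimit_null_errno is hypothetical:

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/resource.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: call the setrlimit system call directly with a
   NULL limit pointer, bypassing the libc wrapper.
   Returns 0 if the call succeeded, otherwise the errno value.
   Returns ENOSYS if this architecture has no setrlimit syscall number. */
int raw_setrlimit_null_errno(void)
{
#ifdef SYS_setrlimit
    long ret = syscall(SYS_setrlimit, (long) RLIMIT_NOFILE, NULL);
    return ret == 0 ? 0 : errno;
#else
    return ENOSYS;
#endif
}
```

Where the system call exists, the kernel has to copy the limit from user
space, so a NULL pointer is expected to yield EFAULT here regardless of
what the libc wrapper does with the same arguments.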

> The man page http://man7.org/linux/man-pages/man2/prlimit.2.html
> doesn't really make clear what happens if both old_limit and new_limit
> are NULL. But apparently the kernel interprets that as a NOP.

Similar to sigaction, if an operand is NULL then the respective
operation is not performed, which implies that if both are NULL there is
nothing to do.
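
The sigaction analogy can be sketched as follows. This is an
illustration only; the helper names are made up, and SIGUSR1 is an
arbitrary choice of a valid signal number.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* NULL new action: the disposition is not changed, only queried. */
int sigaction_query_only(struct sigaction *old)
{
    return sigaction(SIGUSR1, NULL, old);
}

/* Both operands NULL: nothing is installed and nothing is stored,
   so the call only validates the signal number. */
int sigaction_both_null(void)
{
    return sigaction(SIGUSR1, NULL, NULL);
}
```

Both helpers return 0 for a valid signal number, since a NULL operand
simply skips the corresponding half of the operation.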

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


* Re: setrlimit change to prlimit change in behavior?
  2017-10-18 14:13 ` Andreas Schwab
@ 2017-10-18 14:46   ` Mark Wielaard
  2017-10-18 15:04     ` Andreas Schwab
  0 siblings, 1 reply; 12+ messages in thread
From: Mark Wielaard @ 2017-10-18 14:46 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: libc-alpha, Michael Kerrisk (man-pages)

On Wed, 2017-10-18 at 16:13 +0200, Andreas Schwab wrote:
> On Oct 18 2017, Mark Wielaard <mark@klomp.org> wrote:
> 
> > The valgrind testsuite has a testcase that checks that
> >  setrlimit (RLIMIT_NOFILE, NULL)
> > returns failure and sets errno to EFAULT.
> 
> In general, you cannot count on EFAULT.  The call has undefined
> behaviour, thus unbounded effect (a crash would be ok too).

Interesting. I didn't know undefined behavior extended to the Posix
functions. So does this hold for all functions defined by Posix that
have pointer arguments that are outside the accessible address space?
It seems a bit harsh for NULL arguments and at least the setrlimit man
page sounds like it is also describing the glibc implementation, not
just the direct linux setrlimit system call.

>   If you want
> to check the effect of a syscall you have to call it directly, not
> via a libc function.

Sure, the valgrind test actually is about the underlying system calls.
We will probably change the specific test to directly call the system
calls. But the observed behavior for the glibc setrlimit is what
changed in this case. It is not only the errno value that changed, but
also that the call now succeeds, while before it explicitly failed. But
I assume your argument is that it is undefined behavior so even though
the documentation said it would fail that was just by accident?

> > The man page http://man7.org/linux/man-pages/man2/prlimit.2.html
> > doesn't really make clear what happens if both old_limit and
> > new_limit
> > are NULL. But apparently the kernel interprets that as a NOP.
> 
> Similar to sigaction, if an operand is NULL then the respective
> operation is not performed, which implies that if both are NULL there
> is nothing to do.

OK, it would be nice to get that explicitly documented. Especially if
this implies none of the access/permission checks are done.

For valgrind the question is whether or not we want to warn about this.
If passing NULL might be explicitly done then maybe the user doesn't
want to be warned about the fact that they provided a NULL pointer to
setrlimit (or two NULL pointers to prlimit).

Cheers,

Mark


* Re: setrlimit change to prlimit change in behavior?
  2017-10-18 14:46   ` Mark Wielaard
@ 2017-10-18 15:04     ` Andreas Schwab
  2017-10-18 17:44       ` Carlos O'Donell
  0 siblings, 1 reply; 12+ messages in thread
From: Andreas Schwab @ 2017-10-18 15:04 UTC (permalink / raw)
  To: Mark Wielaard; +Cc: libc-alpha, Michael Kerrisk (man-pages)

On Oct 18 2017, Mark Wielaard <mark@klomp.org> wrote:

> On Wed, 2017-10-18 at 16:13 +0200, Andreas Schwab wrote:
>> On Oct 18 2017, Mark Wielaard <mark@klomp.org> wrote:
>> 
>> > The valgrind testsuite has a testcase that checks that
>> >  setrlimit (RLIMIT_NOFILE, NULL)
>> > returns failure and sets errno to EFAULT.
>> 
>> In general, you cannot count on EFAULT.  The call has undefined
>> behaviour, thus unbounded effect (a crash would be ok too).
>
> Interesting. I didn't know undefined behavior extended to the Posix
> functions. So does this hold for all functions defined by Posix that
> have pointer arguments that are outside the accessible address space?

If there is no special case for NULL then it is not allowed.

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


* Re: setrlimit change to prlimit change in behavior?
  2017-10-18 15:04     ` Andreas Schwab
@ 2017-10-18 17:44       ` Carlos O'Donell
  2017-10-18 19:06         ` Mark Wielaard
  0 siblings, 1 reply; 12+ messages in thread
From: Carlos O'Donell @ 2017-10-18 17:44 UTC (permalink / raw)
  To: Andreas Schwab, Mark Wielaard; +Cc: libc-alpha, Michael Kerrisk (man-pages)

On 10/18/2017 08:04 AM, Andreas Schwab wrote:
> On Oct 18 2017, Mark Wielaard <mark@klomp.org> wrote:
> 
>> On Wed, 2017-10-18 at 16:13 +0200, Andreas Schwab wrote:
>>> On Oct 18 2017, Mark Wielaard <mark@klomp.org> wrote:
>>>
>>>> The valgrind testsuite has a testcase that checks that
>>>>  setrlimit (RLIMIT_NOFILE, NULL)
>>>> returns failure and sets errno to EFAULT.
>>>
>>> In general, you cannot count on EFAULT.  The call has undefined
>>> behaviour, thus unbounded effect (a crash would be ok too).
>>
>> Interesting. I didn't know undefined behavior extended to the Posix
>> functions. So does this hold for all functions defined by Posix that
>> have pointer arguments that are outside the accessible address space?
> 
> If there is no special case for NULL then it is not allowed.

Agreed.

And testing for NULL and returning EFAULT is not something we actually
want to do, please see:
https://sourceware.org/glibc/wiki/Style_and_Conventions#Error_Handling

-- 
Cheers,
Carlos.


* Re: setrlimit change to prlimit change in behavior?
  2017-10-18 17:44       ` Carlos O'Donell
@ 2017-10-18 19:06         ` Mark Wielaard
  2017-10-18 20:10           ` Carlos O'Donell
  0 siblings, 1 reply; 12+ messages in thread
From: Mark Wielaard @ 2017-10-18 19:06 UTC (permalink / raw)
  To: Carlos O'Donell, Andreas Schwab
  Cc: libc-alpha, Michael Kerrisk (man-pages)

On Wed, 2017-10-18 at 10:44 -0700, Carlos O'Donell wrote:
> On 10/18/2017 08:04 AM, Andreas Schwab wrote:
> > If there is no special case for NULL then it is not allowed.
> 
> Agreed.
> 
> And testing for NULL and returning EFAULT is not something we actually
> want to do, please see:
> https://sourceware.org/glibc/wiki/Style_and_Conventions#Error_Handling

Thanks for that reference. Of course for my use case (syscall argument
sanity checking) I shouldn't rely on the glibc behavior. So we can
certainly change our testcase to not go through glibc and use direct
syscalls.

But if I were to rely on the glibc behavior what exactly defines the
contract? Naively I used the man page for setrlimit.
http://man7.org/linux/man-pages/man2/setrlimit.2.html
Which at first glance seems to define the contract for glibc (and the
kernel combined). And it does explicitly say:

ERRORS          
       EFAULT A pointer argument points to a location outside the
	      accessible address space.

I cannot immediately deduce from this document which errors define the
contract of the linux kernel interface and which of glibc.

Thanks,

Mark


* Re: setrlimit change to prlimit change in behavior?
  2017-10-18 19:06         ` Mark Wielaard
@ 2017-10-18 20:10           ` Carlos O'Donell
  2017-10-18 20:12             ` Adhemerval Zanella
  2017-10-20 15:56             ` Mark Wielaard
  0 siblings, 2 replies; 12+ messages in thread
From: Carlos O'Donell @ 2017-10-18 20:10 UTC (permalink / raw)
  To: Mark Wielaard, Andreas Schwab; +Cc: libc-alpha, Michael Kerrisk (man-pages)

On 10/18/2017 12:06 PM, Mark Wielaard wrote:
> On Wed, 2017-10-18 at 10:44 -0700, Carlos O'Donell wrote:
>> On 10/18/2017 08:04 AM, Andreas Schwab wrote:
>>> If there is no special case for NULL then it is not allowed.
>>
>> Agreed.
>>
>> And testing for NULL and returning EFAULT is not something we actually
>> want to do, please see:
>> https://sourceware.org/glibc/wiki/Style_and_Conventions#Error_Handling
> 
> Thanks for that reference. Of course for my use case (syscall argument
> sanity checking) I shouldn't rely on the glibc behavior. So we can
> certainly change our testcase to not go through glibc and use direct
> syscalls.
> 
> But if I were to rely on the glibc behavior what exactly defines the
> contract? Naively I used the man page for setrlimit.

The linux man pages are not the canonical documentation for the interface.
I suggest sending them a patch to adjust their documentation to prevent
further confusion.

The canonical documentation is the glibc manual e.g. info setrlimit,
which says only EPERM is a documented return, and NULL is not discussed
as part of the contract:
https://www.gnu.org/software/libc/manual/html_node/Limits-on-Resources.html#index-setrlimit

Likewise POSIX Issue 7:
http://pubs.opengroup.org/onlinepubs/9699919799/functions/setrlimit.html
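
For contrast, a conforming use always passes a pointer to a valid
struct rlimit. A minimal sketch, with a made-up helper name, that reads
the current RLIMIT_NOFILE limits and applies the same values again:

```c
#include <sys/resource.h>

/* Hypothetical helper: read the current RLIMIT_NOFILE limits and write
   the same values back. Both calls receive a pointer to a valid object,
   which is what the documented contract assumes.
   Returns 0 on success, -1 on failure (errno set by the failing call). */
int reapply_nofile_limit(void)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_NOFILE, &lim) != 0)
        return -1;

    /* Re-applying unchanged limits does not raise the hard limit,
       so no extra privilege should normally be required. */
    return setrlimit(RLIMIT_NOFILE, &lim);
}
```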

> http://man7.org/linux/man-pages/man2/setrlimit.2.html
> Which at first glance seems to define the contract for glibc (and the
> kernel combined). And it does explicitly say:
> 
> ERRORS          
>        EFAULT A pointer argument points to a location outside the
> 	      accessible address space.

This is a case of the linux man pages project *overspecifying* the
interface, and doing so is a bad idea. We take great pains in glibc, and
in the upstream standards bodies, to avoid doing this where it would limit
the implementation. Consider that a requirement for the above would mean
that pointers into setrlimit would have to be tested forever! That would
slow things down needlessly for conforming applications.

> I cannot immediately deduce from this document which errors define the
> contract of the linux kernel interface and which of glibc.

The linux man pages have two kinds of documentation

- linux kernel interface documentation (see futex for an example)

- API documentation (see getrlimit/setrlimit, prlimit for an example)

And it does a very good job of explaining which you are looking at.

In this case you are looking at linux man pages API documentation for
interfaces which are part of a published standard. Linux-isms and
glibc-isms have leaked into that documentation, usually in the form of
overspecifications, and we should work to remove them.

At this moment you *cannot* know what the linux kernel interface is
for setrlimit/prlimit without going to the linux kernel sources and
getting confirmation from core kernel developers to document that
interface. Such interface documentation may not exist. For futex it
didn't, and it took tons of work for Michael and Torvald to write it
so that glibc could have something rock-solid to depend upon for
concurrency uses.

Lastly, note that nothing prevents interfaces from returning additional
error codes not specified in the standard, and in those cases your
application should just abort. In glibc we do this for futex syscalls
if the error code is outside of the contract provided by the kernel,
thus we fail early and catastrophically if someone breaks futex with
new undocumented error codes (see futex-internal.h (futex_fatal_error)).
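
The fail-early idea can be sketched roughly as below. This is not the
glibc futex_fatal_error code; the function names and the idea of passing
an explicit list of allowed error codes are assumptions made for
illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return 1 if err is one of the error codes the caller considers part
   of the documented contract, 0 otherwise. */
int error_in_contract(int err, const int *allowed, size_t count)
{
    for (size_t i = 0; i < count; i++)
        if (allowed[i] == err)
            return 1;
    return 0;
}

/* Abort the process if err falls outside the documented contract, so
   that an undocumented error code is noticed immediately instead of
   being silently mishandled. */
void check_error_or_abort(const char *what, int err,
                          const int *allowed, size_t count)
{
    if (!error_in_contract(err, allowed, count)) {
        fprintf(stderr, "%s: unexpected error %d (%s)\n",
                what, err, strerror(err));
        abort();
    }
}
```

A caller would build the allowed list from whatever documentation it
treats as the contract, for example EINVAL and EPERM for setrlimit, and
pass errno to check_error_or_abort after a failed call.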

I hope that answers your question and gives you some insight into what
is going on and why. I also hope it spurs you into action to help
Michael Kerrisk make the linux man pages a better project by
contributing fixes.

-- 
Cheers,
Carlos.


* Re: setrlimit change to prlimit change in behavior?
  2017-10-18 20:10           ` Carlos O'Donell
@ 2017-10-18 20:12             ` Adhemerval Zanella
  2017-10-20 15:56             ` Mark Wielaard
  1 sibling, 0 replies; 12+ messages in thread
From: Adhemerval Zanella @ 2017-10-18 20:12 UTC (permalink / raw)
  To: libc-alpha



On 18/10/2017 18:10, Carlos O'Donell wrote:
> On 10/18/2017 12:06 PM, Mark Wielaard wrote:
>> On Wed, 2017-10-18 at 10:44 -0700, Carlos O'Donell wrote:
>>> On 10/18/2017 08:04 AM, Andreas Schwab wrote:
>>>> If there is no special case for NULL then it is not allowed.
>>>
>>> Agreed.
>>>
>>> And testing for NULL and returning EFAULT is not something we actually
>>> want to do, please see:
>>> https://sourceware.org/glibc/wiki/Style_and_Conventions#Error_Handling
>>
>> Thanks for that reference. Of course for my use case (syscall argument
>> sanity checking) I shouldn't rely on the glibc behavior. So we can
>> certainly change our testcase to not go through glibc and use direct
>> syscalls.
>>
>> But if I were to rely on the glibc behavior what exactly defines the
>> contract? Naively I used the man page for setrlimit.
> 
> The linux man pages are not the canonical documentation for the interface.
> I suggest sending them a patch to adjust their documentation to prevent
> further confusion.
> 
> The canonical documentation is the glibc manual e.g. info setrlimit,
> which says only EPERM is a documented return, and NULL is not discussed
> as part of the contract:
> https://www.gnu.org/software/libc/manual/html_node/Limits-on-Resources.html#index-setrlimit
> 
> Likewise POSIX Issue 7:
> http://pubs.opengroup.org/onlinepubs/9699919799/functions/setrlimit.html
> 
>> http://man7.org/linux/man-pages/man2/setrlimit.2.html
>> Which at first glance seems to define the contract for glibc (and the
>> kernel combined). And it does explicitly say:
>>
>> ERRORS          
>>        EFAULT A pointer argument points to a location outside the
>> 	      accessible address space.
> 
> This is a case of the linux man pages project *overspecifying* the
> interface and doing so is a bad idea. We take great pains in glibc, and
> the upstream standards bodies to avoid doing this where it would limit
> the implementation. Consider that a requirement for the above would mean
> that pointers into setrlimit would have to be tested forever! That would
> slow things down needlessly for conforming applications.
> 
>> I cannot immediately deduce from this document which errors define the
>> contract of the linux kernel interface and which of glibc.
> 
> The linux man pages have two kinds of documentation
> 
> - linux kernel interface documentation (see futex for an example)
> 
> - API documentation (see getrlimit/setrlimit, prlimit for an example)
> 
> And it does a very good job of explaining which you are looking at.
> 
> In this case you are looking at linux man pages API documentation for
> interfaces which are part of a published standard, though linux-isms
> and glibc-isms have leaked into the documentation usually in the form of 
> overspecifications, we should work to remove them.
> 
> At this moment you *cannot* know what the linux kernel interface is
> for setprlimit without going to the linux kernel sources, and getting 
> confirmation from core kernel developers to document that interface.
> Such an interface documentation may not exist and for futex it didn't
> and took tons of work for Michael and Torvald to write one such that
> glibc could have something rock-solid to depend upon for concurrency
> uses.
> 
> Lastly, note that nothing prevents interfaces from returning additional
> error codes not specified in the standard, and in those cases your
> application should just abort. In glibc we do this for futex syscalls
> if the error code is outside of the contract provided by the kernel,
> thus we fail early and catastrophically if someone breaks futex with
> new undocumented error codes (see futex-internal.h (futex_fatal_error)).
> 
> I hope that answers your question and give you some insight into what
> is going on and why. I also hope it spurs you into action to help
> Michael Kerrisk make the linux man pages a better project by
> contributing fixes.


I recall a very similar discussion some time ago that led to an LTP fix [1].
Like valgrind, they were also testing for this specific undefined glibc
behaviour.

[1] https://github.com/linux-test-project/ltp/commit/259db6fed55f88ab32a0875e66803eee44d298be


* Re: setrlimit change to prlimit change in behavior?
  2017-10-18 20:10           ` Carlos O'Donell
  2017-10-18 20:12             ` Adhemerval Zanella
@ 2017-10-20 15:56             ` Mark Wielaard
  2017-10-20 16:10               ` Carlos O'Donell
  1 sibling, 1 reply; 12+ messages in thread
From: Mark Wielaard @ 2017-10-20 15:56 UTC (permalink / raw)
  To: Carlos O'Donell, Andreas Schwab
  Cc: libc-alpha, Michael Kerrisk (man-pages)

On Wed, 2017-10-18 at 13:10 -0700, Carlos O'Donell wrote:
> On 10/18/2017 12:06 PM, Mark Wielaard wrote:
> The linux man pages are not the canonical documentation for the
> interface.
> I suggest sending them a patch to adjust their documentation to
> prevent further confusion.

What would you suggest the patch say?

> At this moment you *cannot* know what the linux kernel interface is
> for setprlimit without going to the linux kernel sources, and
> getting 
> confirmation from core kernel developers to document that interface.

And if I was interested in the glibc provided prlimit function?
Where should I look for documentation/specification?

> > I cannot immediately deduce from this document which errors define the
> > contract of the linux kernel interface and which of glibc.
> 
> The linux man pages have two kinds of documentation
> 
> - linux kernel interface documentation (see futex for an example)
> 
> - API documentation (see getrlimit/setrlimit, prlimit for an example)
> 
> And it does a very good job of explaining which you are looking at.

How does it make that distinction?

I note that there are section 2 and section 3 man pages which seem to
officially refer to system calls vs library calls. But in practice it
seems the section 2 man pages describe the combined linux/glibc
contract, while the section 3 man pages (if they exist) describe the
standardized/posix variants (and so don't document the GNU extensions,
which you have to lookup in section 2).

> I hope that answers your question and give you some insight into what
> is going on and why. I also hope it spurs you into action to help
> Michael Kerrisk make the linux man pages a better project by
> contributing fixes.

It answers my question partly :) Yes, for my valgrind work I would like
documentation of "pure" bare linux kernel syscalls so that we can make
valgrind memcheck better at checking the syscall contract. But as
"normal" GNU/Linux hacker I find the combined explanation much more
useful. Who in their right mind is going to write anything against bare
bones linux kernel syscalls? It is the glibc wrappers that make it a
useful interface, so it makes sense to document them in combination
IMHO (so I believe the section 2 man pages are much more useful than
the section 3 man pages).

Thanks,

Mark


* Re: setrlimit change to prlimit change in behavior?
  2017-10-20 15:56             ` Mark Wielaard
@ 2017-10-20 16:10               ` Carlos O'Donell
  2017-10-20 16:43                 ` Mark Wielaard
  0 siblings, 1 reply; 12+ messages in thread
From: Carlos O'Donell @ 2017-10-20 16:10 UTC (permalink / raw)
  To: Mark Wielaard, Andreas Schwab; +Cc: libc-alpha, Michael Kerrisk (man-pages)

On 10/20/2017 08:56 AM, Mark Wielaard wrote:
> On Wed, 2017-10-18 at 13:10 -0700, Carlos O'Donell wrote:
>> On 10/18/2017 12:06 PM, Mark Wielaard wrote:
>> The linux man pages are not the canonical documentation for the
>> interface.
>> I suggest sending them a patch to adjust their documentation to
>> prevent further confusion.
> 
> What would you suggest the patch say?

Remove the EINVAL error specification.

>> At this moment you *cannot* know what the linux kernel interface is
>> for setprlimit without going to the linux kernel sources, and
>> getting 
>> confirmation from core kernel developers to document that interface.
> 
> And if I was interested in the glibc provided prlimit function?
> Where should I look for documentation/specification?

info setrlimit

>>> I cannot immediately deduce from this document which errors define the
>>> contract of the linux kernel interface and which of glibc.
>>
>> The linux man pages have two kinds of documentation
>>
>> - linux kernel interface documentation (see futex for an example)
>>
>> - API documentation (see getrlimit/setrlimit, prlimit for an example)
>>
>> And it does a very good job of explaining which you are looking at.
> 
> How does it make that distinction?

"Note: There is no glibc wrapper for this system call; see NOTES."

In which case it is documenting the *raw* kernel API.

> It answers my question partly :) Yes, for my valgrind work I would like
> documentation of "pure" bare linux kernel syscalls so that we can make
> valgrind memcheck better at checking the syscall contract. But as
> "normal" GNU/Linux hacker I find the combined explanation much more
> useful. Who in their right mind is going to write anything against bare
> bones linux kernel syscalls? It is the glibc wrappers that make it a
> useful interface, so it makes sense to document them in combination
> IMHO (so I believe the section 2 man pages are much more useful than
> the section 3 man pages).

You document the API.

The problem is that the people writing and submitting changes to the
linux man pages are *not* the authors of the API, and they document
what they see experimentally.

To give you a data point. When Amazon developed their AWS APIs they
chose to *consciously* and *randomly* rotate the order of the labelled
and accepted arguments to their network APIs to catch people who were
expecting and hard coding the arguments by number instead of by label.

Here we have people experimenting, seeing some error return codes,
and assuming they belong in the contract of the API when they don't.

I *wish* I could do something to break those assumptions, but we don't
have any mechanism to do that.

-- 
Cheers,
Carlos.


* Re: setrlimit change to prlimit change in behavior?
  2017-10-20 16:10               ` Carlos O'Donell
@ 2017-10-20 16:43                 ` Mark Wielaard
  2017-10-20 16:58                   ` Carlos O'Donell
  0 siblings, 1 reply; 12+ messages in thread
From: Mark Wielaard @ 2017-10-20 16:43 UTC (permalink / raw)
  To: Carlos O'Donell, Andreas Schwab
  Cc: libc-alpha, Michael Kerrisk (man-pages)

On Fri, 2017-10-20 at 09:10 -0700, Carlos O'Donell wrote:
> On 10/20/2017 08:56 AM, Mark Wielaard wrote:
> > On Wed, 2017-10-18 at 13:10 -0700, Carlos O'Donell wrote:
> > > On 10/18/2017 12:06 PM, Mark Wielaard wrote:
> > > The linux man pages are not the canonical documentation for the
> > > interface.
> > > I suggest sending them a patch to adjust their documentation to
> > > prevent further confusion.
> > 
> > What would you suggest the patch say?
> 
> Remove the EINVAL error specification.

Hmmm. That doesn't seem correct. EINVAL might be returned both if you
interpret it as pure syscall and as glibc library call. Although if you
meant EFAULT then it depends on whether you believe it is documenting
the glibc functions or the system calls. I believe if you are
documenting the system calls then the intention is that they return
EFAULT instead of producing undefined behavior.

> > And if I was interested in the glibc provided prlimit function?
> > Where should I look for documentation/specification?
> 
> info setrlimit

That doesn't provide any documentation of the glibc provided prlimit
function.

> > > The linux man pages have two kinds of documentation
> > > 
> > > - linux kernel interface documentation (see futex for an example)
> > > 
> > > - API documentation (see getrlimit/setrlimit, prlimit for an
> > > example)
> > > 
> > > And it does a very good job of explaining which you are looking
> > > at.
> > 
> > How does it make that distinction?
> 
> "Note: There is no glibc wrapper for this system call; see NOTES."
> 
> In which case it is documenting the *raw* kernel API.

OK. And in all other cases it documents the glibc function?

> You document the API.
> 
> The problem is that the people writing and submitting changes to the
> linux man pages are *not* the authors of the API, and they document
> what they see experimentally.
> [...]
> Here we have people experimenting, seeing some error return codes,
> and assuming they belong in the contract of the API when they don't.
> 
> I *wish* I could do something to break those assumptions, but we don't
> have any mechanism to do that.

Doesn't the glibc project document its own API? That is really what I
am after. glibc provides a prlimit function. How was that specified,
where is it documented? How and where should I send updates to that
documentation?

Thanks,

Mark


* Re: setrlimit change to prlimit change in behavior?
  2017-10-20 16:43                 ` Mark Wielaard
@ 2017-10-20 16:58                   ` Carlos O'Donell
  0 siblings, 0 replies; 12+ messages in thread
From: Carlos O'Donell @ 2017-10-20 16:58 UTC (permalink / raw)
  To: Mark Wielaard, Andreas Schwab; +Cc: libc-alpha, Michael Kerrisk (man-pages)

On 10/20/2017 09:42 AM, Mark Wielaard wrote:
> On Fri, 2017-10-20 at 09:10 -0700, Carlos O'Donell wrote:
>> On 10/20/2017 08:56 AM, Mark Wielaard wrote:
>>> On Wed, 2017-10-18 at 13:10 -0700, Carlos O'Donell wrote:
>>>> On 10/18/2017 12:06 PM, Mark Wielaard wrote:
>>>> The linux man pages are not the canonical documentation for the
>>>> interface.
>>>> I suggest sending them a patch to adjust their documentation to
>>>> prevent further confusion.
>>>
>>> What would you suggest the patch say?
>>
>> Remove the EINVAL error specification.
> 
> Hmmm. That doesn't seem correct. EINVAL might be returned both if you
> interpret it as pure syscall and as glibc library call. Although if you
> meant EFAULT then it depends on whether you believe it is documenting
> the glibc functions or the system calls. I believe if you are
> documenting the system calls then the intention is that they return
> EFAULT instead of producing undefined behavior.

Sorry, my mistake, I did mean EFAULT.

>>> And if I was interested in the glibc provided prlimit function?
>>> Where should I look for documentation/specification?
>>
>> info setrlimit
> 
> That doesn't provide any documentation of the glibc provided prlimit
> function.

Again, sorry I misread your question as just asking about the general
rlimit functions like setrlimit and getrlimit.

There is no glibc documentation for prlimit, it needs to be written
by someone. I admit that glibc as a community had a bad habit of 
adding extensions without documenting them. That is a very bad habit
we are breaking. I don't think we've accepted anything without
documentation recently, and if we have, please tell me and I'll
correct that. However, historical missing documentation is something
we're working on.

>>>> The linux man pages have two kinds of documentation
>>>>
>>>> - linux kernel interface documentation (see futex for an example)
>>>>
>>>> - API documentation (see getrlimit/setrlimit, prlimit for an
>>>> example)
>>>>
>>>> And it does a very good job of explaining which you are looking
>>>> at.
>>>
>>> How does it make that distinction?
>>
>> "Note: There is no glibc wrapper for this system call; see NOTES."
>>
>> In which case it is documenting the *raw* kernel API.
> 
> OK. And in all other cases it documents the glibc function?

AFAIK yes.

>> You document the API.
>>
>> The problem is that the people writing and submitting changes to the
>> linux man pages are *not* the authors of the API, and they document
>> what they see experimentally.
>> [...]
>> Here we have people experimenting, seeing some error return codes,
>> and assuming they belong in the contract of the API when they don't.
>>
>> I *wish* I could do something to break those assumptions, but we don't
>> have any mechanism to do that.
> 
> Doesn't the glibc project document its own API? That is really what I
> am after. glibc provides a prlimit function. How was that specified,
> where is it documented? How and where should I send updates to that
> documentation?

The glibc project *does* document its own APIs; the problem is that
historically this was not enshrined well in the community practice.
It is now. And we have a backlog of APIs to document.

You need to add documentation for the prlimit function to
manual/resources.texi and post the patch to libc-alpha. I'll be
happy to review.

-- 
Cheers,
Carlos.

