Bug 29863 - Segmentation fault in memcmp-sse2.S if memory contents can concurrently change

public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed

* Bug 29863 - Segmentation fault in memcmp-sse2.S if memory contents can concurrently change
@ 2022-12-13 18:20 Narayanan Iyer
  2022-12-13 18:31 ` Andrew Pinski
                   ` (2 more replies)
  0 siblings, 3 replies; 28+ messages in thread
From: Narayanan Iyer @ 2022-12-13 18:20 UTC (permalink / raw)
  To: libc-alpha; +Cc: Narayanan Iyer

Hi,

I had created a glibc bug report at https://sourceware.org/bugzilla/show_bug.cgi?id=29863

And have included the title of that report in the email subject as well.

It has been a week since I created the issue and I did not hear anything back. I believe this is a regression in glibc 2.36 and have provided enough information in the bug report (along with the commit that caused the regression).

Given the upcoming glibc 2.37 release, I was asked (by Carlos O'Donell <carlos@redhat.com>) to raise it on this mailing list to get the attention of the glibc developers.

I am hoping one of you could take a quick look at the bug report and see if it is worth fixing it in time for glibc 2.37.

Thanks,
Narayanan.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Bug 29863 - Segmentation fault in memcmp-sse2.S if memory contents can concurrently change
  2022-12-13 18:20 Bug 29863 - Segmentation fault in memcmp-sse2.S if memory contents can concurrently change Narayanan Iyer
@ 2022-12-13 18:31 ` Andrew Pinski
  2022-12-13 18:39   ` Narayanan Iyer
  2022-12-13 18:39 ` Cristian Rodríguez
  2022-12-13 19:08 ` Noah Goldstein
  2 siblings, 1 reply; 28+ messages in thread
From: Andrew Pinski @ 2022-12-13 18:31 UTC (permalink / raw)
  To: Narayanan Iyer; +Cc: libc-alpha

On Tue, Dec 13, 2022 at 10:21 AM Narayanan Iyer via Libc-alpha
<libc-alpha@sourceware.org> wrote:
>
> Hi,
>
> I had created a glibc bug report at https://sourceware.org/bugzilla/show_bug.cgi?id=29863
>
> And have included the title of that report in the email subject as well.
>
> It has been a week since I created the issue and I did not hear anything back. I believe this is a regression in glibc 2.36 and have provided enough information in the bug report (along with the commit that caused the regression).

I don't see how this is a valid well defined testcase. Changing memory
without barriers in both threads is not well defined. memcmp is not
guaranteed to be atomic or have any atomicity either.
So you just happen to work before and now your undefined code does not work.

Thanks,
Andrew Pinski

>
> Given the upcoming glibc 2.37 release, I was asked (by Carlos O'Donell <carlos@redhat.com>) to raise it on this mailing list to get the attention of the glibc developers.
>
> I am hoping one of you could take a quick look at the bug report and see if it is worth fixing it in time for glibc 2.37.
>
> Thanks,
> Narayanan.
>
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: Bug 29863 - Segmentation fault in memcmp-sse2.S if memory contents can concurrently change
  2022-12-13 18:31 ` Andrew Pinski
@ 2022-12-13 18:39   ` Narayanan Iyer
  0 siblings, 0 replies; 28+ messages in thread
From: Narayanan Iyer @ 2022-12-13 18:39 UTC (permalink / raw)
  To: 'Andrew Pinski'; +Cc: libc-alpha

As I also mentioned at https://sourceware.org/bugzilla/show_bug.cgi?id=29863#c3, it might not be a well defined test case. But memcmp() is never expected to SIG-11 in any circumstance as long as the buffers it is passed in are accessible from start to finish.

In the same bug report, I had also provided a timeline of the instructions that I believe got executed in the crash case and pointed to the failing assembly instruction. The test case was just a easy way for you to see the problem yourself. Even if you don't accept the test case, can you take a look at the rest of my bug report to see if you agree with the finding.

Thanks,
Narayanan.

-----Original Message-----
From: Andrew Pinski [mailto:pinskia@gmail.com] 
Sent: Tuesday, December 13, 2022 1:32 PM
To: Narayanan Iyer <nars@yottadb.com>
Cc: libc-alpha@sourceware.org
Subject: Re: Bug 29863 - Segmentation fault in memcmp-sse2.S if memory contents can concurrently change

On Tue, Dec 13, 2022 at 10:21 AM Narayanan Iyer via Libc-alpha
<libc-alpha@sourceware.org> wrote:
>
> Hi,
>
> I had created a glibc bug report at https://sourceware.org/bugzilla/show_bug.cgi?id=29863
>
> And have included the title of that report in the email subject as well.
>
> It has been a week since I created the issue and I did not hear anything back. I believe this is a regression in glibc 2.36 and have provided enough information in the bug report (along with the commit that caused the regression).

I don't see how this is a valid well defined testcase. Changing memory
without barriers in both threads is not well defined. memcmp is not
guaranteed to be atomic or have any atomicity either.
So you just happen to work before and now your undefined code does not work.

Thanks,
Andrew Pinski

>
> Given the upcoming glibc 2.37 release, I was asked (by Carlos O'Donell <carlos@redhat.com>) to raise it on this mailing list to get the attention of the glibc developers.
>
> I am hoping one of you could take a quick look at the bug report and see if it is worth fixing it in time for glibc 2.37.
>
> Thanks,
> Narayanan.
>
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Bug 29863 - Segmentation fault in memcmp-sse2.S if memory contents can concurrently change
  2022-12-13 18:20 Bug 29863 - Segmentation fault in memcmp-sse2.S if memory contents can concurrently change Narayanan Iyer
  2022-12-13 18:31 ` Andrew Pinski
@ 2022-12-13 18:39 ` Cristian Rodríguez
  2022-12-13 19:08 ` Noah Goldstein
  2 siblings, 0 replies; 28+ messages in thread
From: Cristian Rodríguez @ 2022-12-13 18:39 UTC (permalink / raw)
  To: Narayanan Iyer; +Cc: libc-alpha

On Tue, Dec 13, 2022 at 3:20 PM Narayanan Iyer via Libc-alpha
<libc-alpha@sourceware.org> wrote:
>
> Hi,
>
> I had created a glibc bug report at https://sourceware.org/bugzilla/show_bug.cgi?id=29863
>

It might as well make  a shelter.. from pigs on the wing... That's UB
all the way.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Bug 29863 - Segmentation fault in memcmp-sse2.S if memory contents can concurrently change
  2022-12-13 18:20 Bug 29863 - Segmentation fault in memcmp-sse2.S if memory contents can concurrently change Narayanan Iyer
  2022-12-13 18:31 ` Andrew Pinski
  2022-12-13 18:39 ` Cristian Rodríguez
@ 2022-12-13 19:08 ` Noah Goldstein
  2022-12-13 19:13   ` Narayanan Iyer
  2022-12-13 21:20   ` Bug 29863 - Segmentation fault in memcmp-sse2.S if memory contents can concurrently change Florian Weimer
  2 siblings, 2 replies; 28+ messages in thread
From: Noah Goldstein @ 2022-12-13 19:08 UTC (permalink / raw)
  To: Narayanan Iyer; +Cc: libc-alpha

On Tue, Dec 13, 2022 at 10:20 AM Narayanan Iyer via Libc-alpha
<libc-alpha@sourceware.org> wrote:
>
> Hi,
>
> I had created a glibc bug report at https://sourceware.org/bugzilla/show_bug.cgi?id=29863
>
> And have included the title of that report in the email subject as well.
>
> It has been a week since I created the issue and I did not hear anything back. I believe this is a regression in glibc 2.36 and have provided enough information in the bug report (along with the commit that caused the regression).
>
> Given the upcoming glibc 2.37 release, I was asked (by Carlos O'Donell <carlos@redhat.com>) to raise it on this mailing list to get the attention of the glibc developers.
>
> I am hoping one of you could take a quick look at the bug report and see if it is worth fixing it in time for glibc 2.37.
>
> Thanks,
> Narayanan.
>
>

The fix:
https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/x86_64/multiarch/memcmp-sse2.S;h=afd450d0206d6633da9fbc4607a7fa6aeb4e137c;hb=HEAD#l46
```
-#   define SIZE_OFFSET (CHAR_PER_VEC * 2)
+#   define SIZE_OFFSET 0
```

Is this something we have to support? I believe other functions /
implementations
of memcmp will suffer from a similar bug.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: Bug 29863 - Segmentation fault in memcmp-sse2.S if memory contents can concurrently change
  2022-12-13 19:08 ` Noah Goldstein
@ 2022-12-13 19:13   ` Narayanan Iyer
  2022-12-13 19:25     ` Noah Goldstein
  2022-12-13 21:20   ` Bug 29863 - Segmentation fault in memcmp-sse2.S if memory contents can concurrently change Florian Weimer
  1 sibling, 1 reply; 28+ messages in thread
From: Narayanan Iyer @ 2022-12-13 19:13 UTC (permalink / raw)
  To: 'Noah Goldstein'; +Cc: libc-alpha

Noah,

Thank you for acknowledging that it is a bug.

In case it helps, I found in our testing cluster, that a system which uses sysdeps/x86_64/multiarch/memcmp-sse2.S has the bug whereas a system that uses sysdeps/x86_64/multiarch/memcmp-avx2-movbe.S does not have the bug.

So it is possible, the issue is only in the sse2 version.

Thanks,
Narayanan.

-----Original Message-----
From: Noah Goldstein [mailto:goldstein.w.n@gmail.com] 
Sent: Tuesday, December 13, 2022 2:08 PM
To: Narayanan Iyer <nars@yottadb.com>
Cc: libc-alpha@sourceware.org
Subject: Re: Bug 29863 - Segmentation fault in memcmp-sse2.S if memory contents can concurrently change

On Tue, Dec 13, 2022 at 10:20 AM Narayanan Iyer via Libc-alpha
<libc-alpha@sourceware.org> wrote:
>
> Hi,
>
> I had created a glibc bug report at https://sourceware.org/bugzilla/show_bug.cgi?id=29863
>
> And have included the title of that report in the email subject as well.
>
> It has been a week since I created the issue and I did not hear anything back. I believe this is a regression in glibc 2.36 and have provided enough information in the bug report (along with the commit that caused the regression).
>
> Given the upcoming glibc 2.37 release, I was asked (by Carlos O'Donell <carlos@redhat.com>) to raise it on this mailing list to get the attention of the glibc developers.
>
> I am hoping one of you could take a quick look at the bug report and see if it is worth fixing it in time for glibc 2.37.
>
> Thanks,
> Narayanan.
>
>

The fix:
https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/x86_64/multiarch/memcmp-sse2.S;h=afd450d0206d6633da9fbc4607a7fa6aeb4e137c;hb=HEAD#l46
```
-#   define SIZE_OFFSET (CHAR_PER_VEC * 2)
+#   define SIZE_OFFSET 0
```

Is this something we have to support? I believe other functions /
implementations
of memcmp will suffer from a similar bug.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Bug 29863 - Segmentation fault in memcmp-sse2.S if memory contents can concurrently change
  2022-12-13 19:13   ` Narayanan Iyer
@ 2022-12-13 19:25     ` Noah Goldstein
  2022-12-13 20:56       ` Zack Weinberg
  2022-12-13 22:52       ` Bug 29863 - Segmentation fault vs invalid results, memory models, and control/data dependencies Carlos O'Donell
  0 siblings, 2 replies; 28+ messages in thread
From: Noah Goldstein @ 2022-12-13 19:25 UTC (permalink / raw)
  To: Narayanan Iyer; +Cc: libc-alpha

On Tue, Dec 13, 2022 at 11:13 AM Narayanan Iyer <nars@yottadb.com> wrote:
>
> Noah,
>
> Thank you for acknowledging that it is a bug.
I'm not sure this is a bug, I'm just saying to avoid this behavior you can
make that change.

I think we all agree supporting a correct non-atomic memcmp when the values
in memory can change during execution is not reasonable.
I think SIG11 is not a great outcome (of the many possible behaviors when its
incorrect), but am not sure it's worth fixing.

>
> In case it helps, I found in our testing cluster, that a system which uses sysdeps/x86_64/multiarch/memcmp-sse2.S has the bug whereas a system that uses sysdeps/x86_64/multiarch/memcmp-avx2-movbe.S does not have the bug.
I wouldn't count on any of the other implementations being hardened to this.


>
> So it is possible, the issue is only in the sse2 version.

I did a quick check of avx2/evex and don't see the same pattern but I think
it might be present in the tail of the loops (end of much larger copies so
probably comes up less).

>
> Thanks,
> Narayanan.
>
> -----Original Message-----
> From: Noah Goldstein [mailto:goldstein.w.n@gmail.com]
> Sent: Tuesday, December 13, 2022 2:08 PM
> To: Narayanan Iyer <nars@yottadb.com>
> Cc: libc-alpha@sourceware.org
> Subject: Re: Bug 29863 - Segmentation fault in memcmp-sse2.S if memory contents can concurrently change
>
> On Tue, Dec 13, 2022 at 10:20 AM Narayanan Iyer via Libc-alpha
> <libc-alpha@sourceware.org> wrote:
> >
> > Hi,
> >
> > I had created a glibc bug report at https://sourceware.org/bugzilla/show_bug.cgi?id=29863
> >
> > And have included the title of that report in the email subject as well.
> >
> > It has been a week since I created the issue and I did not hear anything back. I believe this is a regression in glibc 2.36 and have provided enough information in the bug report (along with the commit that caused the regression).
> >
> > Given the upcoming glibc 2.37 release, I was asked (by Carlos O'Donell <carlos@redhat.com>) to raise it on this mailing list to get the attention of the glibc developers.
> >
> > I am hoping one of you could take a quick look at the bug report and see if it is worth fixing it in time for glibc 2.37.
> >
> > Thanks,
> > Narayanan.
> >
> >
>
> The fix:
> https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/x86_64/multiarch/memcmp-sse2.S;h=afd450d0206d6633da9fbc4607a7fa6aeb4e137c;hb=HEAD#l46
> ```
> -#   define SIZE_OFFSET (CHAR_PER_VEC * 2)
> +#   define SIZE_OFFSET 0
> ```
>
> Is this something we have to support? I believe other functions /
> implementations
> of memcmp will suffer from a similar bug.
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Bug 29863 - Segmentation fault in memcmp-sse2.S if memory contents can concurrently change
  2022-12-13 19:25     ` Noah Goldstein
@ 2022-12-13 20:56       ` Zack Weinberg
  2022-12-13 23:29         ` Carlos O'Donell
  2022-12-13 22:52       ` Bug 29863 - Segmentation fault vs invalid results, memory models, and control/data dependencies Carlos O'Donell
  1 sibling, 1 reply; 28+ messages in thread
From: Zack Weinberg @ 2022-12-13 20:56 UTC (permalink / raw)
  To: GNU libc development

On Tue, Dec 13, 2022, at 2:25 PM, Noah Goldstein via Libc-alpha wrote:
> On Tue, Dec 13, 2022 at 11:13 AM Narayanan Iyer <nars@yottadb.com> wrote:
>> Thank you for acknowledging that it is a bug.
> I'm not sure this is a bug, I'm just saying to avoid this behavior you can
> make that change.
>
> I think we all agree supporting a correct non-atomic memcmp when the values
> in memory can change during execution is not reasonable.
> I think SIG11 is not a great outcome (of the many possible behaviors when its
> incorrect), but am not sure it's worth fixing.

I think it would be reasonable for glibc to make the following weaker guarantee:
for any call `memcmp(a, b, n)`, if the data pointed to by `a` and/or `b` is being
concurrently modified, the return value is unspecified but *not* indeterminate.
Also, memcmp will never access memory outside the bounds [a, a+n) and [b, b+n),
no matter what.

This would, I believe, be sufficient to prevent a crash under the conditions
described by Narayanan Iyer.

zw


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Bug 29863 - Segmentation fault in memcmp-sse2.S if memory contents can concurrently change
  2022-12-13 20:56       ` Zack Weinberg
@ 2022-12-13 23:29         ` Carlos O'Donell
  2022-12-14  2:28           ` Zack Weinberg
  0 siblings, 1 reply; 28+ messages in thread
From: Carlos O'Donell @ 2022-12-13 23:29 UTC (permalink / raw)
  To: Zack Weinberg, GNU libc development

On 12/13/22 15:56, Zack Weinberg via Libc-alpha wrote:
> On Tue, Dec 13, 2022, at 2:25 PM, Noah Goldstein via Libc-alpha wrote:
>> On Tue, Dec 13, 2022 at 11:13 AM Narayanan Iyer <nars@yottadb.com> wrote:
>>> Thank you for acknowledging that it is a bug.
>> I'm not sure this is a bug, I'm just saying to avoid this behavior you can
>> make that change.
>>
>> I think we all agree supporting a correct non-atomic memcmp when the values
>> in memory can change during execution is not reasonable.
>> I think SIG11 is not a great outcome (of the many possible behaviors when its
>> incorrect), but am not sure it's worth fixing.
> 
> I think it would be reasonable for glibc to make the following weaker guarantee:
> for any call `memcmp(a, b, n)`, if the data pointed to by `a` and/or `b` is being
> concurrently modified, the return value is unspecified but *not* indeterminate.
> Also, memcmp will never access memory outside the bounds [a, a+n) and [b, b+n),
> no matter what.

I disagree strongly.

These are advanced lockless techniques.

They should be hidden behind new APIs that provide the required guarantees.
 
> This would, I believe, be sufficient to prevent a crash under the conditions
> described by Narayanan Iyer.

I would want to see compiler and langauge authors agree to such changes.

-- 
Cheers,
Carlos.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Bug 29863 - Segmentation fault in memcmp-sse2.S if memory contents can concurrently change
  2022-12-13 23:29         ` Carlos O'Donell
@ 2022-12-14  2:28           ` Zack Weinberg
  2022-12-14  4:16             ` Carlos O'Donell
  0 siblings, 1 reply; 28+ messages in thread
From: Zack Weinberg @ 2022-12-14  2:28 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: GNU libc development

Carlos O'Donell <carlos@redhat.com> writes:

> On 12/13/22 15:56, Zack Weinberg via Libc-alpha wrote:
>> I think it would be reasonable for glibc to make the following weaker guarantee:
>> for any call `memcmp(a, b, n)`, if the data pointed to by `a` and/or `b` is being
>> concurrently modified, the return value is unspecified but *not* indeterminate.
>> Also, memcmp will never access memory outside the bounds [a, a+n) and [b, b+n),
>> no matter what.
>
> I disagree strongly.

I’m really surprised to hear you say that.  To me this is a natural
guarantee for memcmp — in fact, for *all* of the mem* functions — to
make, to the point where my reaction was *of course* this is our bug!

> These are advanced lockless techniques.
> They should be hidden behind new APIs that provide the required guarantees.

That the application was doing “advanced lockless techniques” is, to me,
not relevant in the slightest.  The important thing to me is that the
memory regions `memcmp` is allowed to access are wholly specified by the
mathematical values of its three arguments, and *not* by the data
pointed to by the first two arguments.  Nothing any other thread does
can change the fact that memcmp has no business touching bytes at
addresses outside [a, a+n) and [b, b+n).

Why do you think it is important for the C library to have latitude to
break that aspect of the mem* functions’ API contract?  Even if only
under exotic circumstances?

zw

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Bug 29863 - Segmentation fault in memcmp-sse2.S if memory contents can concurrently change
  2022-12-14  2:28           ` Zack Weinberg
@ 2022-12-14  4:16             ` Carlos O'Donell
  2022-12-14 14:16               ` Zack Weinberg
  2022-12-29 19:32               ` “Undefined behavior” considered harmful (was Re: Bug 29863 - Segmentation fault in memcmp-sse2.S…) Zack Weinberg
  0 siblings, 2 replies; 28+ messages in thread
From: Carlos O'Donell @ 2022-12-14  4:16 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: GNU libc development

On 12/13/22 21:28, Zack Weinberg wrote:
> Carlos O'Donell <carlos@redhat.com> writes:
> 
>> On 12/13/22 15:56, Zack Weinberg via Libc-alpha wrote:
>>> I think it would be reasonable for glibc to make the following weaker guarantee:
>>> for any call `memcmp(a, b, n)`, if the data pointed to by `a` and/or `b` is being
>>> concurrently modified, the return value is unspecified but *not* indeterminate.
>>> Also, memcmp will never access memory outside the bounds [a, a+n) and [b, b+n),
>>> no matter what.
>>
>> I disagree strongly.
> 
> I’m really surprised to hear you say that.  To me this is a natural
> guarantee for memcmp — in fact, for *all* of the mem* functions — to
> make, to the point where my reaction was *of course* this is our bug!

Please let me expand on my answer.

We are talking about the C language, and when you write "unspecified" in that context
it means the language *does* have something to say about the behaviour but does not pick
one or other of the available behaviours. This is not the case, the language very clearly
says this is undefined behaviour, so it says nothing about what should happen.

My understanding was that you were trying to ascribe more determinism to the operation
of memcpy under the presence of data races than could be granted by UB.

I strongly disagree to ascribing more determinism than UB.

Could you expand on why you think this is a "natural" guarantee and from what that
derives from? Is it that you view the input domain to the function as the "natural"
bytes upon which the function is allowed to operate?

>> These are advanced lockless techniques.
>> They should be hidden behind new APIs that provide the required guarantees.
> 
> That the application was doing “advanced lockless techniques” is, to me,

If the application does not follow the language requirements then it is UB.

There are some advanced lockless techniques that do not follow the C memory model.

My point is that such techniques and the requirements should be well understood
and implemented by APIs that provide the required guarantees, not by existing C
string and memory APIs.

> not relevant in the slightest.  The important thing to me is that the
> memory regions `memcmp` is allowed to access are wholly specified by the
> mathematical values of its three arguments, and *not* by the data
> pointed to by the first two arguments.  Nothing any other thread does
> can change the fact that memcmp has no business touching bytes at
> addresses outside [a, a+n) and [b, b+n).

I strongly disagree (though the quiet theoretician in me agrees with you).

The standards are in no way prescriptive in saying that memcmp shall not read or
write to memory outside of the input domain.

> Why do you think it is important for the C library to have latitude to
> break that aspect of the mem* functions’ API contract?  Even if only
> under exotic circumstances?

Why do you think an application programmer has the right to ignore the requirements
of the language and expect the runtime to operate as intended?

Why would we as library authors enter into an API contract that is *stronger* than
the language guarantees?

I am empathetic to the yottadb developers, but we need new APIs for these requirements.

I will still review Noah's patch here:
https://sourceware.org/pipermail/libc-alpha/2022-December/144058.html

I do not have a sustained objection to modifying things to "just work" (tm) because
with UB you could do anything, and it doesn't impact the normal use cases.

That is to say for a specific patch, and a specific change, I can agree, but not
to the larger generalizations about memcmp.

-- 
Cheers,
Carlos.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Bug 29863 - Segmentation fault in memcmp-sse2.S if memory contents can concurrently change
  2022-12-14  4:16             ` Carlos O'Donell
@ 2022-12-14 14:16               ` Zack Weinberg
  2022-12-14 17:36                 ` Paolo Bonzini
  2022-12-29 19:32               ` “Undefined behavior” considered harmful (was Re: Bug 29863 - Segmentation fault in memcmp-sse2.S…) Zack Weinberg
  1 sibling, 1 reply; 28+ messages in thread
From: Zack Weinberg @ 2022-12-14 14:16 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: GNU libc development

I'll respond to this at more length later, but I want to point out that this statement ...

On Tue, Dec 13, 2022, at 11:16 PM, Carlos O'Donell wrote:
> The standards are in no way prescriptive in saying that memcmp shall not read or
> write to memory outside of the input domain.

... is (as I read it) contradicted by 7.1.4p5 (N1570) "A library function shall not directly or indirectly access objects accessible by threads other than the current thread unless the objects are accessed directly or indirectly via the function's arguments."  There is more wiggle room in this wording than I'd ideally like, but since memcmp has no way of knowing whether any particular piece of data outside the ranges supplied as arguments is "accessible by threads other than the current thread", it needs to be conservative and not touch any of it.

zw

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Bug 29863 - Segmentation fault in memcmp-sse2.S if memory contents can concurrently change
  2022-12-14 14:16               ` Zack Weinberg
@ 2022-12-14 17:36                 ` Paolo Bonzini
  2022-12-29  7:09                   ` Zack Weinberg
  0 siblings, 1 reply; 28+ messages in thread
From: Paolo Bonzini @ 2022-12-14 17:36 UTC (permalink / raw)
  To: Zack Weinberg, Carlos O'Donell; +Cc: GNU libc development

On 12/14/22 15:16, Zack Weinberg via Libc-alpha wrote:
>> The standards are in no way prescriptive in saying that memcmp
>> shall not read or write to memory outside of the input domain.
>
> ... is (as I read it) contradicted by 7.1.4p5 (N1570) "A library
> function shall not directly or indirectly access objects accessible
> by threads other than the current thread unless the objects are
> accessed directly or indirectly via the function's arguments."  There
> is more wiggle room in this wording than I'd ideally like, but since
> memcmp has no way of knowing whether any particular piece of data
> outside the ranges supplied as arguments is "accessible by threads
> other than the current thread", it needs to be conservative and not
> touch any of it.

I don't think this applies here for two reasons though:

1) a SIGSEGV would always be acceptable (that's not a valid object), and 
so would an infinite loop

2) I think that the as-if rule would even allow reads to objects 
accessible by other threads, if they don't affect the result (so they 
are not observable by the calling thread) *and* are not observable by 
those other threads either.  The classic case here is strlen() doing 
aligned word (or larger) reads, even though those reads might trespass 
the NUL terminator.

Promising any kind of behavior when a data race happens involving the 
str*/mem* functions is harder than it seems.  As soon as the functions 
read a byte more than once, the view of memory that they operate on does 
not even obey causality.

The following code is admittedly a bit contrived but shows the pitfalls 
of reading more than once from the same location:

   while (*(u32*)s == *(u32*)d) {
     n-=4, s+=4, d+=4;
     if (n < 4) goto short;
   }

   // we know the loop will stop don't we?
   while (*s==*d) s++, d++;
   return *s < *d ? -1 : 1;

short:
   while (n && *s == *d) s++, d++, n--;
   return n ? (*s < *d ? -1 : 1) : 0;

... and you could access beyond the [a,a+n) [b,b+n) range if a 
concurrent mutator causes the second while loop to go off the cliff.

There are also compiler issues: it's also hard to ensure that the 
compiler won't decide to read twice for very down-to-Earth instruction 
selection reasons, for example by using a register-and-memory ALU 
operation.  Many mem*/str* routines do not have a single memory write, 
so the compiler has a *lot* of leeway to reorder and rematerialize 
memory accesses.  For example you could have something like this in a 
memmem():

   c = *p;
   ...
   p += table[c];

If the compiler changes the second statement to "p += table1[*p]", for 
example to avoid a spill, concurrent mutation can result in 
out-of-bounds accesses or other kinds of UB.  The standard certainly 
didn't require that str*/mem* be written in assembly, or that it does 
all of it's accesses as atomic loads or perhaps (argh!) volatile.

Paolo

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Bug 29863 - Segmentation fault in memcmp-sse2.S if memory contents can concurrently change
  2022-12-14 17:36                 ` Paolo Bonzini
@ 2022-12-29  7:09                   ` Zack Weinberg
  0 siblings, 0 replies; 28+ messages in thread
From: Zack Weinberg @ 2022-12-29  7:09 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Carlos O'Donell, GNU libc development

On Wed, 14 Dec 2022 12:36:44 -0500, Paolo Bonzini wrote: 
> On 12/14/22 15:16, Zack Weinberg via Libc-alpha wrote:
> > 7.1.4p5 (N1570) "A library
> > function shall not directly or indirectly access objects accessible
> > by threads other than the current thread unless the objects are
> > accessed directly or indirectly via the function's arguments."
>
> I don't think this applies here for two reasons though:
> 
> 1) a SIGSEGV would always be acceptable (that's not a valid object),
> and so would an infinite loop

An infinite loop would indeed be acceptable, but I don’t see any
justification for a SIGSEGV as long as the _page tables_ are not being
changed concurrently from another thread.  To put it another way,
given a call `memcmp(a, b, n)`, as long as the address ranges
`[a, a+n)` and `[b, b+n)` remain _readable_ for the duration of the
call, no, SIGSEGV is not an acceptable outcome in my book, no matter
how unstable the contents of those address ranges are.

 > 2) I think that the as-if rule would even allow reads to objects
> accessible by other threads, if they don't affect the result (so they
> are not observable by the calling thread) *and* are not observable by
> those other threads either.  The classic case here is strlen() doing
> aligned word (or larger) reads, even though those reads might trespass
> the NUL terminator.

I’m not entirely comfortable with that logic, because those reads are
forbidden by the _abstract_ machine’s memory model, in which reading
even a single byte beyond the bounds of an explicitly declared array
or malloc() block is UB.  Very few concrete machines can enforce that
rule precisely; the only ones I can think of are the valgrind VM and
_maybe_ CHERI.  But that hasn’t so far stopped the compiler people
from implementing optimizations that assume the concrete machine
_does_ enforce that rule precisely.

> The following code is admittedly a bit contrived but shows the
> pitfalls of reading more than once from the same location:
> 
>   while (*(u32*)s == *(u32*)d) {
>     n-=4, s+=4, d+=4;
>     if (n < 4) goto short;
>   }
> 
>   // we know the loop will stop don't we?
>   while (*s==*d) s++, d++;
>   return *s < *d ? -1 : 1;
> 
> short:
>   while (n && *s == *d) s++, d++, n--;
>   return n ? (*s < *d ? -1 : 1) : 0;

In my book, this falls clearly into “don’t do that then” territory.
It is our responsibility as library implementors to code memcmp so
that it _does not_ access beyond the [a,a+n) [b,b+n) range, _even if_
there is a concurrent mutator.

> There are also compiler issues: it's also hard to ensure that the
> compiler won't decide to read twice for very down-to-Earth instruction
> selection reasons, for example by using a register-and-memory ALU
> operation.  Many mem*/str* routines do not have a single memory write,
> so the compiler has a *lot* of leeway to reorder and rematerialize
> memory accesses.

Again, “don’t do that then”―not “don’t read twice,” but “don’t elide
bounds checks, specifically, based on the assumption that two reads
from the same location will return the same value.”

> For example you could have something like this in a
> memmem():
> 
>   c = *p;
>   ...
>   p += table[c];
> 
> If the compiler changes the second statement to "p += table1[*p]", for
> example to avoid a spill,

that should be _perfectly fine_, because the next line of code is
something like

  if (p > limit) break;

and not

  c = *p;

If the compiler deletes the bounds check, I’m prepared to argue both
that that’s an incorrect optimization, and that if the standard says
it’s fine then the _standard_ is wrong.

zw

^ permalink raw reply	[flat|nested] 28+ messages in thread

* “Undefined behavior” considered harmful (was Re: Bug 29863 - Segmentation fault in memcmp-sse2.S…)
  2022-12-14  4:16             ` Carlos O'Donell
  2022-12-14 14:16               ` Zack Weinberg
@ 2022-12-29 19:32               ` Zack Weinberg
  2022-12-29 22:20                 ` Andreas Schwab
  2022-12-30 15:09                 ` Florian Weimer
  1 sibling, 2 replies; 28+ messages in thread
From: Zack Weinberg @ 2022-12-29 19:32 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: GNU libc development

The original issue with memcmp() seems to have been resolved, but I’d
like to start a broader discussion about the C library’s
responsibilities in the face of a program that’s, to some degree,
incorrect.

On Tue, 13 Dec 2022 23:16:19 -0500, Carlos O'Donell wrote:
> We are talking about the C language, and when you write
> "unspecified" in that context it means the language *does* have
> something to say about the behaviour but does not pick one or other
> of the available behaviours. This is not the case, the language very
> clearly says this is undefined behaviour, so it says nothing about
> what should happen.
> 
> My understanding was that you were trying to ascribe more
> determinism to the operation of memcpy under the presence of data
> races than could be granted by UB.
> 
> I strongly disagree to ascribing more determinism than UB.

Everything I’m about to say stems from the following three premises:

1. The C standard uses “undefined behavior” far more liberally than it
   ought to.  In many cases of existing UB the committee could define
   the behavior (possibly as implementation-defined or unspecified)
   without any actual negative consequences.  It seems the committee
   *is* moving in this direction as of C2x, for instance by dropping
   the allowances for non-twos-complement signed arithmetic, but they
   could and should go a lot farther down that road.

2. The remaining cases of UB are those where we still want to say that
   the program is *incorrect* if it does these things, but we don’t
   want to require the compiler to diagnose the incorrectness (usually
   because detection would be intractable).  *Even in these cases*,
   the current concept of “undefined behavior,” licensing the
   implementation to do *anything*, is troublesome.  The standard
   should replace the concept entirely, with something analogous to,
   say, the ARM ARM’s concept of “constrained unpredictable” behavior:
   A program that does X is incorrect, and doing X will have
   unpredictable consequences, BUT there are concrete limits on what
   those consequences can be.

3. Implementations can and should work in advance of the C committee
   on restricting the consequences of UB as outlined in (1) and (2).
   That is, the present state of the C standard should not stop _us_
   from specifying the behavior of GNU libc in cases where clause 7
   leaves behavior “undefined”.

The earlier thread provides a concrete example of a type 2 change that
I think is desirable: instead of 5.1.2.4p25 simply saying that if a
data race occurs, the behavior is undefined, it should say that if a
data race occurs, all calculations data- or control-dependent on the
conflicting expressions produce unpredictable results, where
“unpredictable result” is approximately the same thing as the current
“unspecified value” but might wind up needing to have a slightly
different definition.  [Notice that this is already a significant
constraint on what actually happens, since unpredictability can no
longer propagate backwards in time.]

Then, also, 7.1.4 should say that if the set of memory locations that
are potentially accessed by a library function can be specified
without reference to the contents of any of those memory locations
(true for memXXX, not for strXXX) then, a data race on the contents of
any of those memory locations cannot expand the set.

> Could you expand on why you think this is a "natural" guarantee and
> from what that derives from?

I am envisioning (pointer, length) 2-tuples as object capabilities.
By calling memcmp(a, b, n), the caller grants memcmp access to the
address ranges [a, a+n) and [b, b+n), but not a single byte more.
Some concrete machines (e.g. the valgrind and ASAN VMs, and to a
lesser extent CHERI) can actually enforce the limits of that grant.
Even if the hardware cannot enforce the limits, the callee should
honor them.

The contents, and the stability of the contents, of those address
ranges *cannot* affect the limits of the grant, because they appear
nowhere in the expressions that define the limits.

> If the application does not follow the language requirements then it
> is UB.

As described above, arguments from what is or is not currently UB have
no weight from my perspective.

> Why would we as library authors enter into an API contract that is
> *stronger* than the language guarantees?

Because we believe that the language guarantees are too weak and need
to be strengthened, and the standard committees will want to see that
strengthening play out as “existing practice” before they make any
changes.

zw

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: “Undefined behavior” considered harmful (was Re: Bug 29863 - Segmentation fault in memcmp-sse2.S…)
  2022-12-29 19:32               ` “Undefined behavior” considered harmful (was Re: Bug 29863 - Segmentation fault in memcmp-sse2.S…) Zack Weinberg
@ 2022-12-29 22:20                 ` Andreas Schwab
  2022-12-30 13:28                   ` Florian Weimer
  2022-12-30 15:09                 ` Florian Weimer
  1 sibling, 1 reply; 28+ messages in thread
From: Andreas Schwab @ 2022-12-29 22:20 UTC (permalink / raw)
  To: Zack Weinberg via Libc-alpha; +Cc: Carlos O'Donell, Zack Weinberg

On Dez 29 2022, Zack Weinberg via Libc-alpha wrote:

> 1. The C standard uses “undefined behavior” far more liberally than it
>    ought to.  In many cases of existing UB the committee could define
>    the behavior (possibly as implementation-defined or unspecified)
>    without any actual negative consequences.  It seems the committee
>    *is* moving in this direction as of C2x, for instance by dropping
>    the allowances for non-twos-complement signed arithmetic, but they
>    could and should go a lot farther down that road.

That's not removing an undefined behavior, it's removing an
implementation choice.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: “Undefined behavior” considered harmful (was Re: Bug 29863 - Segmentation fault in memcmp-sse2.S…)
  2022-12-29 22:20                 ` Andreas Schwab
@ 2022-12-30 13:28                   ` Florian Weimer
  0 siblings, 0 replies; 28+ messages in thread
From: Florian Weimer @ 2022-12-30 13:28 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Zack Weinberg via Libc-alpha, Carlos O'Donell, Zack Weinberg

* Andreas Schwab:

> On Dez 29 2022, Zack Weinberg via Libc-alpha wrote:
>
>> 1. The C standard uses “undefined behavior” far more liberally than it
>>    ought to.  In many cases of existing UB the committee could define
>>    the behavior (possibly as implementation-defined or unspecified)
>>    without any actual negative consequences.  It seems the committee
>>    *is* moving in this direction as of C2x, for instance by dropping
>>    the allowances for non-twos-complement signed arithmetic, but they
>>    could and should go a lot farther down that road.
>
> That's not removing an undefined behavior, it's removing an
> implementation choice.

Right, arithmetic is still not twos-complement, it's only about
representation.  Same with C++.

(I associate twos-complement signed *arithmetic* with the obvious
(unsigned) overflow behavior.)

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: “Undefined behavior” considered harmful (was Re: Bug 29863 - Segmentation fault in memcmp-sse2.S…)
  2022-12-29 19:32               ` “Undefined behavior” considered harmful (was Re: Bug 29863 - Segmentation fault in memcmp-sse2.S…) Zack Weinberg
  2022-12-29 22:20                 ` Andreas Schwab
@ 2022-12-30 15:09                 ` Florian Weimer
  1 sibling, 0 replies; 28+ messages in thread
From: Florian Weimer @ 2022-12-30 15:09 UTC (permalink / raw)
  To: Zack Weinberg via Libc-alpha; +Cc: Carlos O'Donell, Zack Weinberg

* Zack Weinberg via Libc-alpha:

> The original issue with memcmp() seems to have been resolved, but I’d
> like to start a broader discussion about the C library’s
> responsibilities in the face of a program that’s, to some degree,
> incorrect.
>
> On Tue, 13 Dec 2022 23:16:19 -0500, Carlos O'Donell wrote:
>> We are talking about the C language, and when you write
>> "unspecified" in that context it means the language *does* have
>> something to say about the behaviour but does not pick one or other
>> of the available behaviours. This is not the case, the language very
>> clearly says this is undefined behaviour, so it says nothing about
>> what should happen.
>> 
>> My understanding was that you were trying to ascribe more
>> determinism to the operation of memcpy under the presence of data
>> races than could be granted by UB.
>> 
>> I strongly disagree to ascribing more determinism than UB.
>
> Everything I’m about to say stems from the following three premises:
>
> 1. The C standard uses “undefined behavior” far more liberally than it
>    ought to.  In many cases of existing UB the committee could define
>    the behavior (possibly as implementation-defined or unspecified)
>    without any actual negative consequences.  It seems the committee
>    *is* moving in this direction as of C2x, for instance by dropping
>    the allowances for non-twos-complement signed arithmetic, but they
>    could and should go a lot farther down that road.

I'm not sure if this is a priority.  It's a lot of work, and most
aspects are probably not technical in the end.  For example, Peter
Sewell's group did a lot of work on the (non-concurrent) C memory
model, both investigative and descriptive, and it's been hard to
integrate this into the standard, as far as I know.

From a licensing perspective, I would very much prefer if anything
that approaches an executable specification were outside the ISO
framework.  It would also enable exploring different directions based
on interests (some might prefer semantics more along the lines of
-fwrapv -fno-strict-aliasing, for example).

> 2. The remaining cases of UB are those where we still want to say that
>    the program is *incorrect* if it does these things, but we don’t
>    want to require the compiler to diagnose the incorrectness (usually
>    because detection would be intractable).  *Even in these cases*,
>    the current concept of “undefined behavior,” licensing the
>    implementation to do *anything*, is troublesome.  The standard
>    should replace the concept entirely, with something analogous to,
>    say, the ARM ARM’s concept of “constrained unpredictable” behavior:

Hmm.  Ada RM? Bounded errors?  Maybe these concepts are quite common?

> 3. Implementations can and should work in advance of the C committee
>    on restricting the consequences of UB as outlined in (1) and (2).
>    That is, the present state of the C standard should not stop _us_
>    from specifying the behavior of GNU libc in cases where clause 7
>    leaves behavior “undefined”.

That seems reasonable.  Yet in many cases, we still struggle to find
consensus among our smaller group.

> The earlier thread provides a concrete example of a type 2 change that
> I think is desirable: instead of 5.1.2.4p25 simply saying that if a
> data race occurs, the behavior is undefined, it should say that if a
> data race occurs, all calculations data- or control-dependent on the
> conflicting expressions produce unpredictable results, where
> “unpredictable result” is approximately the same thing as the current
> “unspecified value” but might wind up needing to have a slightly
> different definition.  [Notice that this is already a significant
> constraint on what actually happens, since unpredictability can no
> longer propagate backwards in time.]

I think this would have serious implications for any optimizations
involving multiple memory accesses.  With undefined data races and no
relevant writes from the current thread, you can assume that the
memory is stable.

I doubt there's a good alternative right now to the present situation.

>> Could you expand on why you think this is a "natural" guarantee and
>> from what that derives from?
>
> I am envisioning (pointer, length) 2-tuples as object capabilities.
> By calling memcmp(a, b, n), the caller grants memcmp access to the
> address ranges [a, a+n) and [b, b+n), but not a single byte more.
> Some concrete machines (e.g. the valgrind and ASAN VMs, and to a
> lesser extent CHERI) can actually enforce the limits of that grant.
> Even if the hardware cannot enforce the limits, the callee should
> honor them.
>
> The contents, and the stability of the contents, of those address
> ranges *cannot* affect the limits of the grant, because they appear
> nowhere in the expressions that define the limits.

But that's not how memcmp is implemented in practice.  Our
implementation assumes that it can access memory outside these bounds,
but that is not observable because we only do so when we are sure it
does not cause a page fault.  But there are other ways memory accesses
can be observable.

It's also proven difficult to find consensus whether string functions
may continue reading beyond bytes that determine the function's
result.

> Because we believe that the language guarantees are too weak and need
> to be strengthened, and the standard committees will want to see that
> strengthening play out as “existing practice” before they make any
> changes.

Or maybe they standardize that's incompatible with what we do, and
they feel free to do so because it was previously undefined, so it's
technically not a language change.

If we go in that direction, we should accept future, persistent
incompatibilities, and get out of the business to present a
standards-compliant view to the programmer even when the underlying
infrastructure isn't.  See __ASSUME_SYSVIPC_BROKEN_MODE_T for an
example what we've been doing so far.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Bug 29863 - Segmentation fault vs invalid results, memory models, and control/data dependencies.
  2022-12-13 19:25     ` Noah Goldstein
  2022-12-13 20:56       ` Zack Weinberg
@ 2022-12-13 22:52       ` Carlos O'Donell
  2022-12-14 12:03         ` Florian Weimer
  1 sibling, 1 reply; 28+ messages in thread
From: Carlos O'Donell @ 2022-12-13 22:52 UTC (permalink / raw)
  To: Noah Goldstein, Narayanan Iyer; +Cc: libc-alpha

On 12/13/22 14:25, Noah Goldstein via Libc-alpha wrote:
> I think we all agree supporting a correct non-atomic memcmp when the
> values in memory can change during execution is not reasonable. I
> think SIG11 is not a great outcome (of the many possible behaviors
> when its incorrect), but am not sure it's worth fixing.

The issue here is the ability to optimistically carry out an operation, and then
eventually determine if the result could have been consistent, particularly in
the case of a software transaction as implemented in OCC.

My opinion is that this breaks down to a discussion of valid memory models
that can be used by an application. Today the C memory models and the POSIX
Memory Synchronization language do not allow data races. However, looking at the
Linux kernel for reference we see idiomatic usage for control and data dependencies.
The Linux Kernel strategies for this do not easily apply to userspace.

In the case of bug 29863 the memcmp has a dependency on the data in the shared
memory segment, and the argument being made in the bug report is that, regardless
of the ordering, the dependencies in the code should not produce a segfault. The logic
of the assembly would have to assume any observable ordering of writes to the bytes
and take steps to ensure that the results are correct. This is pessimistic for the
performance of a program that avoids data races, but optimistic for a program that
wants to have low-overhead execution without atomics. It also begs the question:
Does this have to extend to *all* glibc routines? Clearly it can't, because once
the routine has a compiler involved we are enforcing the compiler's knowledge
of UB and memory models.

Compiler optimizations will make this more and more difficult to do correctly.
Consider the case where there is no library memcmp and the compiler inlines the
memcmp and assumes the memory model is not violated.

The most recent update of this discussion was at LPC 2022 for the kernel:
Status Report: Broken Dependency Orderings in the Linux Kernel - Marco Elver, Paul Heidekrüger
https://www.youtube.com/watch?v=ad-75tViBm4

A claim that "it's been working for 20 years" is true, but likewise "compilers
and libraries have gotten better in the last 20 years" is also true. 

Toolchain authors have begun to formalize memory models, and these models do
not support operations like OCC if they have data races.

In summary:

- We should pursue performance given a well formed understanding of the memory
  models glibc supports e.g. ISO C11, and POSIX Issue 7.

- Authors using OCC should explain their memory model in detail, and may need to
  raise the issue with the broader toolchain authors to gain consensus for the
  design constraints imposed by OCC on the implementation language being
  compiled e.g. compiler picks a special memcpy, or does other operations.

-- 
Cheers,
Carlos.

Notes:
15.3.2: Address- and Data-Dependency Difficulties
https://mirrors.edge.kernel.org/pub/linux/kernel/people/paulmck/perfbook/perfbook-e2.pdf

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Bug 29863 - Segmentation fault vs invalid results, memory models,  and control/data dependencies.
  2022-12-13 22:52       ` Bug 29863 - Segmentation fault vs invalid results, memory models, and control/data dependencies Carlos O'Donell
@ 2022-12-14 12:03         ` Florian Weimer
  0 siblings, 0 replies; 28+ messages in thread
From: Florian Weimer @ 2022-12-14 12:03 UTC (permalink / raw)
  To: Carlos O'Donell via Libc-alpha
  Cc: Noah Goldstein, Narayanan Iyer, Carlos O'Donell

* Carlos O'Donell via Libc-alpha:

> - We should pursue performance given a well formed understanding of the memory
>   models glibc supports e.g. ISO C11, and POSIX Issue 7.

But the C memory model is known to be broken.  For example, it does not
rule out the existence of out-of-thin-air values, which makes relaxed MO
pretty useless.  I don't think we can use it as an argument to reject
requests from application developers.

> - Authors using OCC should explain their memory model in detail, and
>   may need to raise the issue with the broader toolchain authors to gain
>   consensus for the design constraints imposed by OCC on the
>   implementation language being compiled e.g. compiler picks a special
>   memcpy, or does other operations.

I think their requirements for memcmp are pretty clear.  We don't need a
fullly fleshed out memory model for that.

Thanks,
Florian

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Bug 29863 - Segmentation fault in memcmp-sse2.S if memory contents can concurrently change
  2022-12-13 19:08 ` Noah Goldstein
  2022-12-13 19:13   ` Narayanan Iyer
@ 2022-12-13 21:20   ` Florian Weimer
  2022-12-13 22:59     ` Noah Goldstein
  1 sibling, 1 reply; 28+ messages in thread
From: Florian Weimer @ 2022-12-13 21:20 UTC (permalink / raw)
  To: Noah Goldstein via Libc-alpha; +Cc: Narayanan Iyer, Noah Goldstein

* Noah Goldstein via Libc-alpha:

> Is this something we have to support? I believe other functions /
> implementations of memcmp will suffer from a similar bug.

Of course the crash is by no means deterministic, so I'm not sure how
useful it is to detect application bugs.  Maybe papering over the
application bug is the right approach here.

On the other hand, I really don't see how such a racing memcmp call
could deliver any useful information whatsoever.  The result will always
be arbitrary in practice.  So I hope such application bugs are really
rare.

> The fix:
> https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/x86_64/multiarch/memcmp-sse2.S;h=afd450d0206d6633da9fbc4607a7fa6aeb4e137c;hb=HEAD#l46
> ```
> -#   define SIZE_OFFSET (CHAR_PER_VEC * 2)
> +#   define SIZE_OFFSET 0
> ```

How costly is this change?  I would have thought about ANDing the offset
so that it is always in range (but maybe it will stil result in a page
crossing, I don't really know how this works).

Thanks,
Florian

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Bug 29863 - Segmentation fault in memcmp-sse2.S if memory contents can concurrently change
  2022-12-13 21:20   ` Bug 29863 - Segmentation fault in memcmp-sse2.S if memory contents can concurrently change Florian Weimer
@ 2022-12-13 22:59     ` Noah Goldstein
  2022-12-14 12:06       ` Florian Weimer
  0 siblings, 1 reply; 28+ messages in thread
From: Noah Goldstein @ 2022-12-13 22:59 UTC (permalink / raw)
  To: Florian Weimer; +Cc: Noah Goldstein via Libc-alpha, Narayanan Iyer

On Tue, Dec 13, 2022 at 1:20 PM Florian Weimer <fweimer@redhat.com> wrote:
>
> * Noah Goldstein via Libc-alpha:
>
> > Is this something we have to support? I believe other functions /
> > implementations of memcmp will suffer from a similar bug.
>
> Of course the crash is by no means deterministic, so I'm not sure how
> useful it is to detect application bugs.  Maybe papering over the
> application bug is the right approach here.
>
> On the other hand, I really don't see how such a racing memcmp call
> could deliver any useful information whatsoever.  The result will always
> be arbitrary in practice.  So I hope such application bugs are really
> rare.

The usecase in the bugzilla is optimistic reads in concurrent databases.

>
> > The fix:
> > https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/x86_64/multiarch/memcmp-sse2.S;h=afd450d0206d6633da9fbc4607a7fa6aeb4e137c;hb=HEAD#l46
> > ```
> > -#   define SIZE_OFFSET (CHAR_PER_VEC * 2)
> > +#   define SIZE_OFFSET 0
> > ```
>
> How costly is this change?  I would have thought about ANDing the offset
> so that it is always in range (but maybe it will stil result in a page
> crossing, I don't really know how this works).

Its a bit expensive b.c it misaligned a lot of hot paths.

A better fix (untested) is probably:
```
@@ -308,7 +308,7 @@ L(ret_nonzero_vec_end_0):
  setg %dl
  leal -1(%rdx, %rdx), %eax
 #  else
- addl %edx, %eax
+ addq %rdx, %rax
  movzbl (VEC_SIZE * -1 + SIZE_OFFSET)(%rsi, %rax), %ecx
  movzbl (VEC_SIZE * -1 + SIZE_OFFSET)(%rdi, %rax), %eax
  subl %ecx, %eax

```
The issue is the 32-bit negative number becomes a very large
unsigned 64-bit number. If we just use 64-bit addition the
issue goes away (I think at least).

I'm okay with this change going in (assuming it works).

This fix can be a "happy accident" that this unsupported
usage no longer causes a sig-11, but I think it's a mistake to
explicitly support this use case as anything other than UB.

>
> Thanks,
> Florian
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Bug 29863 - Segmentation fault in memcmp-sse2.S if memory contents can concurrently change
  2022-12-13 22:59     ` Noah Goldstein
@ 2022-12-14 12:06       ` Florian Weimer
  0 siblings, 0 replies; 28+ messages in thread
From: Florian Weimer @ 2022-12-14 12:06 UTC (permalink / raw)
  To: Noah Goldstein; +Cc: Noah Goldstein via Libc-alpha, Narayanan Iyer

* Noah Goldstein:

> On Tue, Dec 13, 2022 at 1:20 PM Florian Weimer <fweimer@redhat.com> wrote:
>>
>> * Noah Goldstein via Libc-alpha:
>>
>> > Is this something we have to support? I believe other functions /
>> > implementations of memcmp will suffer from a similar bug.
>>
>> Of course the crash is by no means deterministic, so I'm not sure how
>> useful it is to detect application bugs.  Maybe papering over the
>> application bug is the right approach here.
>>
>> On the other hand, I really don't see how such a racing memcmp call
>> could deliver any useful information whatsoever.  The result will always
>> be arbitrary in practice.  So I hope such application bugs are really
>> rare.
>
> The usecase in the bugzilla is optimistic reads in concurrent
> databases.

So what happens is that whatever the result is, it is still validated
with some separate mechanism (e.g., a seqlock or some other form of TM)?
That does not seem unreasonable at all, so I'd be in favor of supporting
this, given that the cost is so low.

Thanks,
Florian


^ permalink raw reply	[flat|nested] 28+ messages in thread

[parent not found: <PAWPR08MB89825887E12FF900540365F483E09@PAWPR08MB8982.eurprd08.prod.outlook.com>]

[parent not found: <PAWPR08MB898260DA844D695EA70ED3E483E09@PAWPR08MB8982.eurprd08.prod.outlook.com>]

* Re: Bug 29863 - Segmentation fault in memcmp-sse2.S if memory contents can concurrently change
       [not found] ` <PAWPR08MB898260DA844D695EA70ED3E483E09@PAWPR08MB8982.eurprd08.prod.outlook.com>
@ 2022-12-14 21:56   ` Wilco Dijkstra
  2022-12-29  7:21     ` Zack Weinberg
  0 siblings, 1 reply; 28+ messages in thread
From: Wilco Dijkstra @ 2022-12-14 21:56 UTC (permalink / raw)
  To: zack, Carlos O'Donell; +Cc: 'GNU C Library'

Hi,

>On Tue, Dec 13, 2022, at 11:16 PM, Carlos O'Donell wrote:
>> The standards are in no way prescriptive in saying that memcmp shall not read or
>> write to memory outside of the input domain.
>
> ... is (as I read it) contradicted by 7.1.4p5 (N1570) "A library function shall not directly or
> indirectly access objects accessible by threads other than the current thread unless the
> objects are accessed directly or indirectly via the function's arguments."  There is more
> wiggle room in this wording than I'd ideally like, but since memcmp has no way of knowing
> whether any particular piece of data outside the ranges supplied as arguments is "accessible
> by threads other than the current thread", it needs to be conservative and not touch any of it.

I'd expect that mem* functions will never read outside their bounds since the bounds are
explicitly defined by the arguments, not by the data. So that should be easy to guarantee.

For the str* functions it may be harder since the data itself defines when to stop reading.
So if an implementation uses multiple accesses to the same address, you could potentially
mistake the end of a string (eg. first one detects a special case, while the 2nd then verifies it).

Still, I wouldn't expect totally random memory accesses even in this case - you would read
beyond the end of a string if the string end is changed concurrently.

Note there is also a security risk here - an attacker could cause an out of bounds access
to extract secret data that otherwise wouldn't be accessible. Eg. functions in the Linux
kernel accessing user data should be resilient to concurrent modifications of the data.

Finally it's worth mentioning that nscd does the exact same thing: it uses memcmp and
non-atomic accesses on shared data that is being modified by other threads. It looks
totally broken, especially with weaker memory ordering, however this kind of insanity
may actually be a common design pattern...

Cheers,
Wilco

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Bug 29863 - Segmentation fault in memcmp-sse2.S if memory contents can concurrently change
  2022-12-14 21:56   ` Wilco Dijkstra
@ 2022-12-29  7:21     ` Zack Weinberg
  2022-12-29 20:02       ` Alejandro Colomar
  0 siblings, 1 reply; 28+ messages in thread
From: Zack Weinberg @ 2022-12-29  7:21 UTC (permalink / raw)
  To: Wilco Dijkstra; +Cc: Carlos O'Donell, 'GNU C Library'

On Wed, 14 Dec 2022 16:56:28 -0500, Wilco Dijkstra wrote:
> I'd expect that mem* functions will never read outside their bounds
> since the bounds are explicitly defined by the arguments, not by the
> data. So that should be easy to guarantee.

I concur.

> For the str* functions it may be harder since the data itself
> defines when to stop reading.  So if an implementation uses multiple
> accesses to the same address, you could potentially mistake the end
> of a string (eg. first one detects a special case, while the 2nd
> then verifies it).

I also concur here.
 
> Still, I wouldn't expect totally random memory accesses even in this
> case - you would read beyond the end of a string if the string end
> is changed concurrently.

We may run into a problem where it’s difficult to _state_ the limits
of the misbehavior, just because the C standard doesn’t itself try to
put limits on misbehavior in the face of an incorrect program, so we
don’t have any language for it (which I would argue is a bug in the
standard, see the detailed reply to Carlos that I’ll be writing, er,
tomorrow).

Still, taking strcmp(a, b) for example, and assuming WLOG a flat
address space in which a < b, it should be possible to guarantee

 - no accesses to any byte in the range [0, a) ever
 - if an oracle for strlen(), capable of executing in zero cycles,
   would return the same value for strlen(a) throughout the execution
   of strcmp(), then no accesses to any byte in the range
   [a+strlen(a), b)
 - if an oracle for strlen(), capable of executing in zero cycles,
   would return the same value for strlen(b) throughout the execution
   of strcmp(), then no accesses to any byte in the range
   [b+strlen(b), ADDR_MAX)
 - however, if the oracle strlen() values _do_ change during the
   execution of strcmp(), then accesses to bytes in the latter two
   ranges are possible
 - a SIGSEGV is permissible if and only if there was at least one
   point during execution at which a call to the oracle strlen() would
   have triggered a SIGSEGV

Ne?

> Finally it's worth mentioning that nscd does the exact same thing:
> it uses memcmp and non-atomic accesses on shared data that is being
> modified by other threads. It looks totally broken, especially with
> weaker memory ordering, however this kind of insanity may actually
> be a common design pattern...

I don’t want to hold up nscd as an example of quality design or
implementation, but yeah, I share your concern re “may actually be a
common design pattern”…

zw

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Bug 29863 - Segmentation fault in memcmp-sse2.S if memory contents can concurrently change
  2022-12-29  7:21     ` Zack Weinberg
@ 2022-12-29 20:02       ` Alejandro Colomar
  2022-12-30 18:02         ` Joseph Myers
  0 siblings, 1 reply; 28+ messages in thread
From: Alejandro Colomar @ 2022-12-29 20:02 UTC (permalink / raw)
  To: Zack Weinberg, Wilco Dijkstra
  Cc: Carlos O'Donell, 'GNU C Library'

Hi Zack,

On 12/29/22 08:21, Zack Weinberg via Libc-alpha wrote:
> On Wed, 14 Dec 2022 16:56:28 -0500, Wilco Dijkstra wrote:
>> I'd expect that mem* functions will never read outside their bounds
>> since the bounds are explicitly defined by the arguments, not by the
>> data. So that should be easy to guarantee.
> 
> I concur.
> 
>> For the str* functions it may be harder since the data itself
>> defines when to stop reading.  So if an implementation uses multiple
>> accesses to the same address, you could potentially mistake the end
>> of a string (eg. first one detects a special case, while the 2nd
>> then verifies it).
> 
> I also concur here.
>   
>> Still, I wouldn't expect totally random memory accesses even in this
>> case - you would read beyond the end of a string if the string end
>> is changed concurrently.
> 
> We may run into a problem where it’s difficult to _state_ the limits
> of the misbehavior, just because the C standard doesn’t itself try to
> put limits on misbehavior in the face of an incorrect program, so we
> don’t have any language for it (which I would argue is a bug in the
> standard, see the detailed reply to Carlos that I’ll be writing, er,
> tomorrow).

The standard already makes some kind of guarantee, when it differences between 
bound UB and critical UB.  The problem is that once you've met bound UB, it's 
hard not to convert it to critical UB in the following lines of code.  Even a 
compiler assumption that would otherwise be fine might result in critical UB.

> 
> Still, taking strcmp(a, b) for example, and assuming WLOG a flat
> address space in which a < b, it should be possible to guarantee
> 
>   - no accesses to any byte in the range [0, a) ever
>   - if an oracle for strlen(), capable of executing in zero cycles,
>     would return the same value for strlen(a) throughout the execution
>     of strcmp(), then no accesses to any byte in the range
>     [a+strlen(a), b)
>   - if an oracle for strlen(), capable of executing in zero cycles,
>     would return the same value for strlen(b) throughout the execution
>     of strcmp(), then no accesses to any byte in the range
>     [b+strlen(b), ADDR_MAX)
>   - however, if the oracle strlen() values _do_ change during the
>     execution of strcmp(), then accesses to bytes in the latter two
>     ranges are possible
>   - a SIGSEGV is permissible if and only if there was at least one
>     point during execution at which a call to the oracle strlen() would
>     have triggered a SIGSEGV

Are you meaning this would be an invalid implementation of strcmp(3)?:

int
strcmp(const char *s1, const char *s2)
{
	for (; *s1 != '\0' || *s1 != '\0'; s1++, s2++) {
		if (*s1 < *s2)
			return -1;
		if (*s1 > *s2)
			return 1;
	}
}

Okay, probably it's not the fastest one, but it's simple.  This one would 
SIGSEGV in the following case:

Another thread might insert a NUL at the beginning of each string (after the 
loop has passed over it), and in the next cycle remove the 
previously-terminating NUL from the strings.  The loop would then run forever, 
until a crash.

Cheers,

Alex

> 
> Ne?
> 
>> Finally it's worth mentioning that nscd does the exact same thing:
>> it uses memcmp and non-atomic accesses on shared data that is being
>> modified by other threads. It looks totally broken, especially with
>> weaker memory ordering, however this kind of insanity may actually
>> be a common design pattern...
> 
> I don’t want to hold up nscd as an example of quality design or
> implementation, but yeah, I share your concern re “may actually be a
> common design pattern”…
> 
> zw

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Bug 29863 - Segmentation fault in memcmp-sse2.S if memory contents can concurrently change
  2022-12-29 20:02       ` Alejandro Colomar
@ 2022-12-30 18:02         ` Joseph Myers
  2023-03-20 15:40           ` Zack Weinberg
  0 siblings, 1 reply; 28+ messages in thread
From: Joseph Myers @ 2022-12-30 18:02 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: Zack Weinberg, Wilco Dijkstra, Carlos O'Donell,
	'GNU C Library'

On Thu, 29 Dec 2022, Alejandro Colomar via Libc-alpha wrote:

> Okay, probably it's not the fastest one, but it's simple.  This one would
> SIGSEGV in the following case:
> 
> Another thread might insert a NUL at the beginning of each string (after the
> loop has passed over it), and in the next cycle remove the
> previously-terminating NUL from the strings.  The loop would then run forever,
> until a crash.

I also think it should be OK for strcmp to SEGV if a NUL terminator byte 
in either string at the time strlen is called, or at any time during its 
execution, ceases at any point during the execution of strlen to be a NUL 
byte (even if there is an earlier or later NUL already present at the time 
the terminator byte is changed).  (There is a reasonable case for avoiding 
a SEGV when the contents of the strings change during execution, as long 
as any byte that is the NUL terminator byte at any point during execution 
of the call never ceases to be a NUL byte during execution of that call - 
an earlier NUL might be added, however.)

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Bug 29863 - Segmentation fault in memcmp-sse2.S if memory contents can concurrently change
  2022-12-30 18:02         ` Joseph Myers
@ 2023-03-20 15:40           ` Zack Weinberg
  0 siblings, 0 replies; 28+ messages in thread
From: Zack Weinberg @ 2023-03-20 15:40 UTC (permalink / raw)
  To: Joseph Myers, 'Alejandro Colomar (man-pages)'
  Cc: Wilco Dijkstra, Carlos O'Donell, GNU libc development

On Fri, Dec 30, 2022, at 1:02 PM, Joseph Myers wrote:
> I also think it should be OK for strcmp to SEGV if a NUL terminator
> byte in either string at the time strlen is called, or at any time
> during its execution, ceases at any point during the execution of
> strlen to be a NUL byte (even if there is an earlier or later NUL
> already present at the time the terminator byte is changed).  (There
> is a reasonable case for avoiding a SEGV when the contents of the
> strings change during execution, as long as any byte that is the NUL
> terminator byte at any point during execution of the call never ceases
> to be a NUL byte during execution of that call - an earlier NUL might
> be added, however.)

Coming back to this ages later, I'm not sure how this differs from what
I said about the oracle for strlen... Imagine we have a hardware
debugger that can freeze execution of the entire computer at the moment
the call instruction for strcmp is invoked, perform strlen() on both
strings passed to strcmp, and then resume execution.  Call the string
lengths measured by this hypothetical debugger the "original" string
lengths.  What I was trying to communicate is that any concurrent
modification to the strings that *changes either string's length from
its original* should still permit constrained-unpredictable behavior by
the implementation of strcmp. It seems to me that this is the same as
talking about insertion or deletion of NULs within the bounds of the
original string lengths.

zw

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2023-03-20 15:41 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-12-13 18:20 Bug 29863 - Segmentation fault in memcmp-sse2.S if memory contents can concurrently change Narayanan Iyer
2022-12-13 18:31 ` Andrew Pinski
2022-12-13 18:39   ` Narayanan Iyer
2022-12-13 18:39 ` Cristian Rodríguez
2022-12-13 19:08 ` Noah Goldstein
2022-12-13 19:13   ` Narayanan Iyer
2022-12-13 19:25     ` Noah Goldstein
2022-12-13 20:56       ` Zack Weinberg
2022-12-13 23:29         ` Carlos O'Donell
2022-12-14  2:28           ` Zack Weinberg
2022-12-14  4:16             ` Carlos O'Donell
2022-12-14 14:16               ` Zack Weinberg
2022-12-14 17:36                 ` Paolo Bonzini
2022-12-29  7:09                   ` Zack Weinberg
2022-12-29 19:32               ` “Undefined behavior” considered harmful (was Re: Bug 29863 - Segmentation fault in memcmp-sse2.S…) Zack Weinberg
2022-12-29 22:20                 ` Andreas Schwab
2022-12-30 13:28                   ` Florian Weimer
2022-12-30 15:09                 ` Florian Weimer
2022-12-13 22:52       ` Bug 29863 - Segmentation fault vs invalid results, memory models, and control/data dependencies Carlos O'Donell
2022-12-14 12:03         ` Florian Weimer
2022-12-13 21:20   ` Bug 29863 - Segmentation fault in memcmp-sse2.S if memory contents can concurrently change Florian Weimer
2022-12-13 22:59     ` Noah Goldstein
2022-12-14 12:06       ` Florian Weimer
     [not found] <PAWPR08MB89825887E12FF900540365F483E09@PAWPR08MB8982.eurprd08.prod.outlook.com>
     [not found] ` <PAWPR08MB898260DA844D695EA70ED3E483E09@PAWPR08MB8982.eurprd08.prod.outlook.com>
2022-12-14 21:56   ` Wilco Dijkstra
2022-12-29  7:21     ` Zack Weinberg
2022-12-29 20:02       ` Alejandro Colomar
2022-12-30 18:02         ` Joseph Myers
2023-03-20 15:40           ` Zack Weinberg

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).