public inbox for libc-alpha@sourceware.org
From: Zack Weinberg <zack@owlfolio.org>
To: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Cc: Carlos O'Donell <carlos@redhat.com>,
	'GNU C Library' <libc-alpha@sourceware.org>
Subject: Re: Bug 29863 - Segmentation fault in memcmp-sse2.S if memory contents can concurrently change
Date: Thu, 29 Dec 2022 02:21:45 -0500
Message-ID: <ypikr0wioet2.wl-zack@owlfolio.org>
In-Reply-To: <PAWPR08MB8982F613727B0BC4942F146F83E09@PAWPR08MB8982.eurprd08.prod.outlook.com>

On Wed, 14 Dec 2022 16:56:28 -0500, Wilco Dijkstra wrote:
> I'd expect that mem* functions will never read outside their bounds
> since the bounds are explicitly defined by the arguments, not by the
> data. So that should be easy to guarantee.

I concur.

> For the str* functions it may be harder since the data itself
> defines when to stop reading.  So if an implementation uses multiple
> accesses to the same address, you could potentially mistake the end
> of a string (eg. first one detects a special case, while the 2nd
> then verifies it).

I also concur here.
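
To make the double-access hazard concrete, here is a minimal sketch of a
byte-at-a-time comparison that loads each byte exactly once and makes every
later decision from the local copy. The name strcmp_single_read is
hypothetical and this is not how glibc implements strcmp; the volatile casts
are only an assumption-laden way to discourage the compiler from re-reading
memory. It does not make a concurrent modification well-defined C, it only
illustrates why a single read cannot disagree with itself about where the
string ends.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not glibc code: compare two strings while
   reading every byte exactly once. */
static int strcmp_single_read(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    for (size_t i = 0;; i++) {
        /* One load per byte; the volatile cast asks the compiler not to
           reload the same address for the second test below. */
        unsigned char ca = ((const volatile unsigned char *)a)[i];
        unsigned char cb = ((const volatile unsigned char *)b)[i];
        if (ca != cb)
            return (ca > cb) - (ca < cb);
        /* The end-of-string test uses the copy already loaded, so the
           "detect" and "verify" steps cannot see different values. */
        if (ca == '\0')
            return 0;
    }
}
```

A vectorized implementation that inspects the same word twice has no such
property unless it is written to keep the first loaded value.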
 
> Still, I wouldn't expect totally random memory accesses even in this
> case - you would read beyond the end of a string if the string end
> is changed concurrently.

We may run into a problem where it’s difficult to _state_ the limits
of the misbehavior, just because the C standard doesn’t itself try to
put limits on misbehavior in the face of an incorrect program, so we
don’t have any language for it (which I would argue is a bug in the
standard, see the detailed reply to Carlos that I’ll be writing, er,
tomorrow).

Still, taking strcmp(a, b) for example, and assuming WLOG a flat
address space in which a < b, it should be possible to guarantee

 - no accesses to any byte in the range [0, a) ever
 - if an oracle for strlen(), capable of executing in zero cycles,
   would return the same value for strlen(a) throughout the execution
   of strcmp(), then no accesses to any byte in the range
   (a+strlen(a), b), i.e. nothing past a's terminating NUL
 - if an oracle for strlen(), capable of executing in zero cycles,
   would return the same value for strlen(b) throughout the execution
   of strcmp(), then no accesses to any byte in the range
   (b+strlen(b), ADDR_MAX], i.e. nothing past b's terminating NUL
 - however, if the oracle strlen() values _do_ change during the
   execution of strcmp(), then accesses to bytes in the latter two
   ranges are possible
 - a SIGSEGV is permissible if and only if there was at least one
   point during execution at which a call to the oracle strlen() would
   have triggered a SIGSEGV
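
As a rough illustration of the stable-oracle case, the permitted accesses can
be written as a predicate over addresses. This is a sketch under stated
assumptions: access_permitted is a hypothetical name, len_a and len_b stand
for the oracle's (unchanging) strlen values, the two strings do not overlap,
and the terminating NUL at a+strlen(a) and b+strlen(b) is treated as readable
because strcmp has to load it to find the end of the string.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the guarantee for strcmp(a, b) when the oracle
   strlen() values never change during the call. */
static bool access_permitted(uintptr_t addr,
                             uintptr_t a, size_t len_a,
                             uintptr_t b, size_t len_b)
{
    assert(a < b);  /* WLOG, as in the text: flat address space, a < b */

    /* Bytes of a, including its terminating NUL. */
    bool in_a = addr >= a && addr - a <= len_a;
    /* Bytes of b, including its terminating NUL. */
    bool in_b = addr >= b && addr - b <= len_b;

    /* Everything else (below a, between the strings, above b) is
       off limits in the stable case. */
    return in_a || in_b;
}
```

If the oracle values do change mid-call, this predicate no longer bounds the
accesses, which is exactly the case the fourth item leaves open.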

Ne?

> Finally it's worth mentioning that nscd does the exact same thing:
> it uses memcmp and non-atomic accesses on shared data that is being
> modified by other threads. It looks totally broken, especially with
> weaker memory ordering, however this kind of insanity may actually
> be a common design pattern...

I don’t want to hold up nscd as an example of quality design or
implementation, but yeah, I share your concern re “may actually be a
common design pattern”…

zw
