public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Michael Matz <matz@suse.de>
To: Martin Uecker <muecker@gwdg.de>
Cc: Jakub Jelinek <jakub@redhat.com>, gcc-patches@gcc.gnu.org
Subject: Re: _BitInt vs. _Atomic
Date: Tue, 1 Aug 2023 15:54:14 +0000 (UTC)	[thread overview]
Message-ID: <alpine.LSU.2.20.2308011438290.25429@wotan.suse.de> (raw)
In-Reply-To: <716bb9346a0fbd9580b64f24ecd1d557802729f8.camel@gwdg.de>

[-- Attachment #1: Type: text/plain, Size: 7398 bytes --]

Hello,

On Mon, 31 Jul 2023, Martin Uecker wrote:

> >  Say you have a loop like so:
> > 
> > _Atomic T obj;
> > ...
> > T expected1, expected2, newval;
> > newval = ...;
> > expected1 = ...;
> > do {
> >   expected2 = expected1;
> >   if (atomic_compare_exchange_weak(&obj, &expected2, newval);
> >     break;
> >   expected1 = expected2;
> > } while (1);
> > 
> > As written this looks of course stupid, and you may say "don't do that", 
> > but internally the copies might result from temporaries (compiler 
> > generated or wrapper function arguments, or suchlike). 
> >  Now, while 
> > expected2 will contain the copied padding bits after the cmpxchg the 
> > copies to and from expected1 will possibly destroy them.  Either way I 
> > don't see why the above loop should be out-of-spec, so I can write it and 
> > expect it to proceed eventually (certainly when the _strong variant is 
> > used).  Any argument that would declare the above loop out-of-spec I would 
> > consider a defect in the spec.
> 
> It is "out-of-spec" for C in the sense that it can not be
> expected work with the semantics as specified in the C standard.

(I call that a defect.  See below)

> In practice, what the semantics specified using memcpy/memcmp
> allow one to do is to also apply atomic operations on non-atomic 
> types.  This is not guaranteed to work by the C standard, but
> in practice  people often have to do this.  For example, nobody
> is going to copy a 256 GB numerical array with non-atomic types
> into another data structure with atomic versions of the same
> type just so that you can apply atomic operations on it.
> So one simply does an unsafe cast and hopes the compiler does
> not break this.
> 
> If the non-atomic struct now has non-zero values in the padding, 
> and the compiler would clear those automatically for "expected", 
> you would create the problem of an infinite loop (this time 
> for real).

Only because cmpxchg is defined in terms of memcpy/memcmp.  If it were 
defined in terms of the == operator (obviously applied recursively 
member-wise for structs) and simple-assignment that wouldn't be a problem.  
In addition that would get rid of all discussion of what happens or 
doesn't happen with padding.  Introducing reliance on padding bits (which 
IMHO goes against some fundamental ideas of the C standard) has 
far-reaching consequences, see below.  The current definition of the 
atomic_cmpxchg is also inconsistent with the rest of the standard:

We have:

  ... (C is non-atomic variant of A) ...
  _Bool atomic_compare_exchange_strong(volatile A *object,
                                       C *expected, C desired);
  ... (is equivalent to atomic variant of:) 
  if (memcmp(object, expected, sizeof (*object)) == 0)
    { memcpy(object, &desired, sizeof (*object)); return true; }
  else
    { memcpy(expected, object, sizeof (*object)); return false; }

But we also have:

  The size, representation, and alignment of an atomic type need not be 
  the same as those of the corresponding unqualified type.

  (with later text only suggesting that at least for atomic integer 
  types these please be the same.  But here we aren't talking about
  integer types even.)

So, already the 'memcmp(object, expected, sizeof (*object)' may be 
undefined.  sizeof(*object) need not be the same as sizeof(*expected).
In particular the memcpy in the else branch might clobber memory outside 
*expected.

That alone should be sufficient to show that defining this all in terms of 
memcpy/memcmp is a bad idea.  But it also has other 
consequences: you can't copy (simple-assign) or compare (== operator) 
atomic values anymore reliably and expect the atomic_cmpxchg to work.  My 
example from earlier shows that you can't copy them, a similar one can be 
constructed for breaking ==.

But it goes further: you can also construct an example that shows an 
internal inconsistency just with using atomic_cmpxchg (of course, assume 
all this to run without concurrent accesses to the respective objects):

  _Atomic T obj;
  ...
  T expected, newval;
  expected = ...;
  newval = expected + 1;         // just to make it different
  atomic_store (&obj, expected);
  if (atomic_cmpxchg_strong (&obj, &expected, newval)) {
    /* Now we have: obj == newval.
       Do we also have memcmp(&obj,&newval)==0? */
    if (!atomic_cmpxchg_strong (&obj, &newval, expected)) {
      /* No, we can't rely on that!  */
      error("what's going on?");
    }
  } else {
    /* May happen, padding of expected may not be the same
       as in obj, even after atomic_store.  */
    error("WTH? a compare after a store doesn't even work?");
  }

So, even though cmpxchg is defined in terms of memcpy/memcmp, we still 
can't rely on anything after it succeeded (or failed).  Simply because the 
by-value passing of the 'desired' argument will have unknown padding 
(within the implementation of cmpxchg) that isn't necessarily the same as 
the newval object.

Now, about your suggestion of clearing or ignoring the padding bits at 
specific points:  Clearly with current standard language (being explicit 
about memcpy/memcmp and also within the text refering to 'contents of the 
memory pointed to by ...', unlike earlier versions that at least still 
talked about 'value of') padding bits cannot be ignored at the compare 
within cmpxchg.  Neither can anything be cleared from within cmpxchg (and 
that's even ignoring that *expected and *object might have completely 
different representations, as per above).

So, one idea was:

> A compiler could, for example, always clear the padding when
> initializing or storing atomic values.

But that doesn't help the memcmp: even if *object has 
cleared padding, *expected might not have (it's not of atomic type).  You 
explicitely ruled out ignoring the padding on *expected for the memcmp due 
to the above large-array example.  To that end you suggested:

> It might also clear the padding of the initial "expected", when it is 
> initialized or stored to.

But that amounts to magic, I don't see a way to do that: as far as the 
compiler is concerned it's just a random object of an arbitrary 
(non-atomic!) type coming from somewhere (that it's an argument to cmpxchg 
eventually might not be visible).  It would have to clear _all_ padding 
for all objects always, just because of the chance that it may eventually 
be passed to atomic_cmpxchg.  That quite clearly can't be the intention.

To be honest the whole passage that uses memcmp/memcpy within the 
definition of atomic_compare_exchange since C11 reads like a try to 
explain the semantics in simple terms, but failing to account for padding.  
Initially that wasn't such a problem because the normative text still 
said

   Atomically, compares the value pointed to by object for equality 
   with that in expected, and if true, replaces the value pointed to by 
   object with desired, and if false, updates the value in expected
   with the value pointed to by object.

Note: "value" not "value representation" or "memory pointed to".  It seems 
eventually people noticed an inconsistency between the memcpy/memcpy Note 
and the above text, and overcorrected this to also talk about "memory 
pointed to".  But that then elevates padding to something observable, and 
I think that is wrong and was done without sufficiently catering for the 
consequences.


Ciao,
Michael.

  reply	other threads:[~2023-08-01 15:54 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-27 17:16 [PATCH 5/5] testsuite part 2 for _BitInt support [PR102989] Jakub Jelinek
2023-07-27 19:06 ` Joseph Myers
2023-07-28 11:43   ` _BitInt vs. _Atomic Jakub Jelinek
2023-07-28 14:03     ` Martin Uecker
2023-07-28 14:06       ` Martin Uecker
2023-07-28 14:26       ` Jakub Jelinek
2023-07-28 14:53         ` Martin Uecker
2023-07-28 15:10           ` Jakub Jelinek
2023-07-28 16:01             ` Martin Uecker
2023-07-31 14:33               ` Michael Matz
2023-07-31 15:37                 ` Martin Uecker
2023-08-01 15:54                   ` Michael Matz [this message]
2023-08-01 16:54                     ` Joseph Myers
2023-08-02 12:25                       ` Michael Matz
2023-08-01 19:03                     ` Martin Uecker
2023-07-28 18:32     ` Joseph Myers

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.LSU.2.20.2308011438290.25429@wotan.suse.de \
    --to=matz@suse.de \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jakub@redhat.com \
    --cc=muecker@gwdg.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).