[PATCH v2] detect out-of-bounds stores by atomic functions [PR102453]

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: Martin Sebor <msebor@gmail.com>
To: Richard Biener <richard.guenther@gmail.com>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>, Jan Hubicka <hubicka@ucw.cz>
Subject: [PATCH v2] detect out-of-bounds stores by atomic functions [PR102453]
Date: Tue, 12 Oct 2021 13:44:18 -0600	[thread overview]
Message-ID: <923ccb56-fd9c-96ea-6f9d-619668fd789b@gmail.com> (raw)
In-Reply-To: <CAFiYyc0Buc+SkQq3ZK2bdxVSjhveH1xe=SxU-zk5HJSy5SQ3tg@mail.gmail.com>

On 10/12/21 12:52 AM, Richard Biener wrote:
> On Mon, Oct 11, 2021 at 11:25 PM Martin Sebor <msebor@gmail.com> wrote:
>>
>> The attached change extends GCC's warnings for out-of-bounds
>> stores to cover atomic (and __sync) built-ins.
>>
>> Rather than hardcoding the properties of these built-ins just
>> for the sake of the out-of-bounds detection, on the assumption
>> that it might be useful for future optimizations as well, I took
>> the approach of extending class attr_fnspec to express their
>> special property that they encode the size of the access in their
>> name.
>>
>> I also took the liberty of making attr_fnspec assignable (something
>> the rest of my patch relies on), and updating some comments for
>> the characters the class uses to encode function properties, based
>> on my understanding of their purpose.
>>
>> Tested on x86_64-linux.
> 
> Hmm, so you place 'A' at an odd place (where the return value is specified),
> but you do not actually specify the behavior on the return value.  Shoudln't
> 
> +     'A'        specifies that the function atomically accesses a constant
> +               1 << N bytes where N is indicated by character 3+2i
> 
> maybe read
> 
>      'A'     specifies that the function returns the memory pointed to
> by argument
>               one of size 1 << N bytes where N is indicated by
> character 3 +2i accessed atomically
> 
> ?

I didn't think the return value would be interesting because in
general (parallel accesses) it's not related (in an observable
way) to the value of the dereferenced operand.  Not all
the built-ins also return a value (e.g., atomic_store), and
whether or not one does return the argument would need to be
encoded somehow because it cannot be determined from the return
type (__atomic_compare_exchange and __atomic_test_and_set return
bool that's not necessarily the value of the operand).  Also,
since the functions return the operand value either before or
after the update, we'd need another letter to describe that.
(This alone could be dealt with simply by using 'A' and 'a',
but that's not enough for the other cases.)

So with all these possibilities I don't think encoding
the return value at this point is worthwhile.  If/when this
enhancement turns out to be used for optimization and we think
encoding the return value would be helpful, I'd say let's
revisit it then.  The accessor APIs should make it a fairly
straightforward exercise.

> I also wonder if it's necessary to constrain this to 'atomic' accesses
> for the purpose of the patch and whether that detail could be omitted to
> eventually make more use of it?

I pondered the same question but I couldn't think of any other
built-ins with similar semantics (read-write-modify, return
a result either pre- or post-modification), so I opted for
simplicity.  I am open to generalizing it if/when there is
a function I could test it with, although I'm not sure
the current encoding scheme has enough letters and letter
positions to describe the effects in their full generality.

> 
> Likewise
> 
> +     '0'...'9'  specifies the size of value written/read is given either
> +               by the specified argument, or for atomic functions, by
> +               2 ^ N where N is the constant value denoted by the character
> 
> should mention (excluding '0') for the argument position.

Sure, I'll update the comment if you think this change is worth
pursuing.

> 
>     /* length of the fn spec string.  */
> -  const unsigned len;
> +  unsigned len;
> 
> why that?

The const member is what prevents the struct from being assignable,
which is what the rest of the patch depends on.

> 
> +  /* Return true of the function is an __atomic or __sync built-in.  */
> 
> you didn't specify that for 'A' ...
> 
> +  bool
> +  atomic_p () const
> +  {
> +    return str[0] == 'A';
> +  }
> 
> +attr_fnspec
> +atomic_builtin_fnspec (tree callee)
> +{
> +  switch (DECL_FUNCTION_CODE (callee))
> +    {
> +#define BUILTIN_ACCESS_SIZE_FNSPEC(N, lgsz)            \
> +      BUILT_IN_ATOMIC_LOAD_ ## N:                      \
> +       return "Ap" "R" lgsz;
> 
> note that doing this for atomics makes those no longer a compiler barrier
> for (aliased) loads and stores which means they are no longer a reliable
> way to implement locks.  That's a reason why I never pushed a
> PTA/alias patch I have to open-code this.
> 
> Thus, do we really want to do this?

That's my question to you :) If you don't think this attr_fnspec
extension would be useful for optimization I'll drop this part
of the patch and open-code it separately only for the out-of-bounds
diagnostics.  I'd started out that way but the fnspec class made
the code cleaner and if it could be used elsewhere so much
the better.  Please let me know.

Martin

next prev parent reply	other threads:[~2021-10-12 19:44 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-11 21:25 [PATCH] " Martin Sebor
2021-10-12  6:52 ` Richard Biener
2021-10-12 19:44   ` Martin Sebor [this message]
2021-10-13  8:15     ` [PATCH v2] " Richard Biener
2021-10-24 23:40       ` [PATCH v3] " Martin Sebor
2021-10-25  2:21         ` Jeff Law
2021-10-26  8:23           ` Richard Biener

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=923ccb56-fd9c-96ea-6f9d-619668fd789b@gmail.com \
    --to=msebor@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=hubicka@ucw.cz \
    --cc=richard.guenther@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).