public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* [RFC] restricting aliasing by standard containers (PR 98465)
@ 2021-01-07 21:41 Martin Sebor
  2021-01-08  7:51 ` Richard Biener
  0 siblings, 1 reply; 4+ messages in thread
From: Martin Sebor @ 2021-01-07 21:41 UTC (permalink / raw)
  To: gcc mailing list, Richard Biener

The test case in PR 98465 brings to light a problem we've discussed
before (e.g., PR 93971) where a standard container (std::string in
this case but the problem applies to any class that owns and manages
allocated memory) might trigger warnings for unreachable code.
The code is not eliminated due to a missing aliasing constraint:
because GCC doesn't know that the member pointer to the memory
managed by the container cannot alias other objects, it emits code
that can never be executed in a valid program and that's prone to
causing false positives.

To illustrate, at the moment it's impossible to fold away the assert
below because there's no way to determine in the middle end that
String::s cannot point to a:

   extern char array[];

   class String {
     char *s;
   public:
      String (const char *p): s (strdup (p)) { }
      String (const String &str): s (strdup (str.s)) { }
      ~String () { free (s); }

      void f () { assert (s != array); }
   };

The constraint is obvious to a human reader (String::s is private
and nothing sets it to point to array) but there's no way for GCC
to infer it from the code alone (at least not in general): there
could be member or friend functions defined in other translation
units that violate this assumption.

One way to solve the problem is to explicitly declare that
String::s, in fact, doesn't point to any such objects and that it
only ever points to allocated memory.  My idea for doing that is
to extend attribute malloc to (or add a new attribute for) pointer
variables to imply that the pointer only points to allocated memory.

However, besides pointing to allocated memory, std::string can also
point to its own internal buffer, so the extended malloc attribute
couldn't be used there by itself.  I think this could be solved by
also either extending the may_alias attribute or adding a new
"alias" (or some such) attribute to denote that a pointer variable
may point to an object or subobject.

Putting the two together, to eliminate the assert, std::string would
be annotated like so:

   class string {
     char *s __attribute__ ((malloc, may_alias (buf)));
     char buf[8];
   public:
      string (): s (buf) { }
      string (const char *p): s (strdup (p)) { }
      string (const string &str): s (strdup (str.s)) { }
      ~string () { if (s != buf) free (s); }

      void f () { assert (s != array); }
   };

The may_alias association with members is relative to the this pointer
(i.e., as if by may_alias (this->buf), as opposed to being taken as
may_alias (String::buf) and meaning that s might be equal to any other
String::s with a different this.  To help avoid mistakes, setting s
in violation of the constraints would trigger warnings.

If this sounds reasonable I'm prepared to prototype it, either for
GCC 11 if it's in scope to solve the PR and there's still time, or
(I suspect more likely) for GCC 12.

Richard, what are your thoughts/concerns?

Martin

PS An alternate solution might be to provide a late-evaluated built-in,
something like

   <tri-state> __builtin_decl (T *ptr)

that would return a <yes> answer if ptr could be determined to point
to a declared object or subobject, a <no> if not (e.g., it points to
allocated storage), and a <don't know> if it couldn't be determined.
  The built-in would then be used in code to eliminate infeasible
paths.  For example, a built-in like that could be used to eliminate
the assert in string::f():

   void string::f ()
   {
     if (<yes> == __builtin_decl_p (s) && s != buf)
       __builtin_unreachable ();

     assert (s != array);
   }

A built-in might be more flexible but would also be harder to use
(and likely more error-prone).

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [RFC] restricting aliasing by standard containers (PR 98465)
  2021-01-07 21:41 [RFC] restricting aliasing by standard containers (PR 98465) Martin Sebor
@ 2021-01-08  7:51 ` Richard Biener
  2021-01-08 20:16   ` Martin Sebor
  0 siblings, 1 reply; 4+ messages in thread
From: Richard Biener @ 2021-01-08  7:51 UTC (permalink / raw)
  To: Martin Sebor; +Cc: gcc mailing list

On Thu, Jan 7, 2021 at 10:41 PM Martin Sebor <msebor@gmail.com> wrote:
>
> The test case in PR 98465 brings to light a problem we've discussed
> before (e.g., PR 93971) where a standard container (std::string in
> this case but the problem applies to any class that owns and manages
> allocated memory) might trigger warnings for unreachable code.
> The code is not eliminated due to a missing aliasing constraint:
> because GCC doesn't know that the member pointer to the memory
> managed by the container cannot alias other objects, it emits code
> that can never be executed in a valid program and that's prone to
> causing false positives.
>
> To illustrate, at the moment it's impossible to fold away the assert
> below because there's no way to determine in the middle end that
> String::s cannot point to a:
>
>    extern char array[];
>
>    class String {
>      char *s;
>    public:
>       String (const char *p): s (strdup (p)) { }
>       String (const String &str): s (strdup (str.s)) { }
>       ~String () { free (s); }
>
>       void f () { assert (s != array); }
>    };
>
> The constraint is obvious to a human reader (String::s is private
> and nothing sets it to point to array) but there's no way for GCC
> to infer it from the code alone (at least not in general): there
> could be member or friend functions defined in other translation
> units that violate this assumption.
>
> One way to solve the problem is to explicitly declare that
> String::s, in fact, doesn't point to any such objects and that it
> only ever points to allocated memory.  My idea for doing that is
> to extend attribute malloc to (or add a new attribute for) pointer
> variables to imply that the pointer only points to allocated memory.
>
> However, besides pointing to allocated memory, std::string can also
> point to its own internal buffer, so the extended malloc attribute
> couldn't be used there by itself.  I think this could be solved by
> also either extending the may_alias attribute or adding a new
> "alias" (or some such) attribute to denote that a pointer variable
> may point to an object or subobject.
>
> Putting the two together, to eliminate the assert, std::string would
> be annotated like so:
>
>    class string {
>      char *s __attribute__ ((malloc, may_alias (buf)));
>      char buf[8];
>    public:
>       string (): s (buf) { }
>       string (const char *p): s (strdup (p)) { }
>       string (const string &str): s (strdup (str.s)) { }
>       ~string () { if (s != buf) free (s); }
>
>       void f () { assert (s != array); }
>    };
>
> The may_alias association with members is relative to the this pointer
> (i.e., as if by may_alias (this->buf), as opposed to being taken as
> may_alias (String::buf) and meaning that s might be equal to any other
> String::s with a different this.  To help avoid mistakes, setting s
> in violation of the constraints would trigger warnings.
>
> If this sounds reasonable I'm prepared to prototype it, either for
> GCC 11 if it's in scope to solve the PR and there's still time, or
> (I suspect more likely) for GCC 12.
>
> Richard, what are your thoughts/concerns?

I'm not sure it's feasible to make use of this attribute.  First
there's the malloc part which has difficult semantics (similar
to restrict) when generating PTA constraints.  We might see

 _1 = str.s;
 _2 = str.s;

but are of course required to associate the same allocated
dummy object with both pointers (as opposed to when we'd
see two malloc calls).  What would possibly work is to
have the object keyed on the field decl, but then for

 _1 = p_to_str_4(D);
 _2 = _1 + offsetof-s;
 _3 = *_2;

we have to somehow conservatively arrive at the same object.
I don't see how that can work out.

All the same applies to the may_alias part but I guess when the
malloc part falls apart that's not of much interest.

So I'm concerned about correctness - I'm sure you can hack
sth together to get some testcases optimized.  But I'm not sure
you can make it correct in all cases (within the current PTA
framework).

Richard.

> Martin
>
> PS An alternate solution might be to provide a late-evaluated built-in,
> something like
>
>    <tri-state> __builtin_decl (T *ptr)
>
> that would return a <yes> answer if ptr could be determined to point
> to a declared object or subobject, a <no> if not (e.g., it points to
> allocated storage), and a <don't know> if it couldn't be determined.
>   The built-in would then be used in code to eliminate infeasible
> paths.  For example, a built-in like that could be used to eliminate
> the assert in string::f():
>
>    void string::f ()
>    {
>      if (<yes> == __builtin_decl_p (s) && s != buf)
>        __builtin_unreachable ();
>
>      assert (s != array);
>    }
>
> A built-in might be more flexible but would also be harder to use
> (and likely more error-prone).

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [RFC] restricting aliasing by standard containers (PR 98465)
  2021-01-08  7:51 ` Richard Biener
@ 2021-01-08 20:16   ` Martin Sebor
  2021-01-11  7:43     ` Richard Biener
  0 siblings, 1 reply; 4+ messages in thread
From: Martin Sebor @ 2021-01-08 20:16 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc mailing list

On 1/8/21 12:51 AM, Richard Biener wrote:
> On Thu, Jan 7, 2021 at 10:41 PM Martin Sebor <msebor@gmail.com> wrote:
>>
>> The test case in PR 98465 brings to light a problem we've discussed
>> before (e.g., PR 93971) where a standard container (std::string in
>> this case but the problem applies to any class that owns and manages
>> allocated memory) might trigger warnings for unreachable code.
>> The code is not eliminated due to a missing aliasing constraint:
>> because GCC doesn't know that the member pointer to the memory
>> managed by the container cannot alias other objects, it emits code
>> that can never be executed in a valid program and that's prone to
>> causing false positives.
>>
>> To illustrate, at the moment it's impossible to fold away the assert
>> below because there's no way to determine in the middle end that
>> String::s cannot point to a:
>>
>>     extern char array[];
>>
>>     class String {
>>       char *s;
>>     public:
>>        String (const char *p): s (strdup (p)) { }
>>        String (const String &str): s (strdup (str.s)) { }
>>        ~String () { free (s); }
>>
>>        void f () { assert (s != array); }
>>     };
>>
>> The constraint is obvious to a human reader (String::s is private
>> and nothing sets it to point to array) but there's no way for GCC
>> to infer it from the code alone (at least not in general): there
>> could be member or friend functions defined in other translation
>> units that violate this assumption.
>>
>> One way to solve the problem is to explicitly declare that
>> String::s, in fact, doesn't point to any such objects and that it
>> only ever points to allocated memory.  My idea for doing that is
>> to extend attribute malloc to (or add a new attribute for) pointer
>> variables to imply that the pointer only points to allocated memory.
>>
>> However, besides pointing to allocated memory, std::string can also
>> point to its own internal buffer, so the extended malloc attribute
>> couldn't be used there by itself.  I think this could be solved by
>> also either extending the may_alias attribute or adding a new
>> "alias" (or some such) attribute to denote that a pointer variable
>> may point to an object or subobject.
>>
>> Putting the two together, to eliminate the assert, std::string would
>> be annotated like so:
>>
>>     class string {
>>       char *s __attribute__ ((malloc, may_alias (buf)));
>>       char buf[8];
>>     public:
>>        string (): s (buf) { }
>>        string (const char *p): s (strdup (p)) { }
>>        string (const string &str): s (strdup (str.s)) { }
>>        ~string () { if (s != buf) free (s); }
>>
>>        void f () { assert (s != array); }
>>     };
>>
>> The may_alias association with members is relative to the this pointer
>> (i.e., as if by may_alias (this->buf), as opposed to being taken as
>> may_alias (String::buf) and meaning that s might be equal to any other
>> String::s with a different this.  To help avoid mistakes, setting s
>> in violation of the constraints would trigger warnings.
>>
>> If this sounds reasonable I'm prepared to prototype it, either for
>> GCC 11 if it's in scope to solve the PR and there's still time, or
>> (I suspect more likely) for GCC 12.
>>
>> Richard, what are your thoughts/concerns?
> 
> I'm not sure it's feasible to make use of this attribute.  First
> there's the malloc part which has difficult semantics (similar
> to restrict) when generating PTA constraints.  We might see
> 
>   _1 = str.s;
>   _2 = str.s;
> 
> but are of course required to associate the same allocated
> dummy object with both pointers (as opposed to when we'd
> see two malloc calls).  What would possibly work is to
> have the object keyed on the field decl, but then for
> 
>   _1 = p_to_str_4(D);
>   _2 = _1 + offsetof-s;
>   _3 = *_2;
> 
> we have to somehow conservatively arrive at the same object.
> I don't see how that can work out.
> 
> All the same applies to the may_alias part but I guess when the
> malloc part falls apart that's not of much interest.
> 
> So I'm concerned about correctness - I'm sure you can hack
> sth together to get some testcases optimized.  But I'm not sure
> you can make it correct in all cases (within the current PTA
> framework).

Thanks for the feedback.

Absent some source level annotation I can't think of a good way
to avoid these false positives.  Do you have any other ideas?

If not, would you be opposed to introducing these attributes to
suppress warnings (at least at first)?  Besides avoiding the false
positives, implementing just that part might also be a good proof
of concept for the aliasing solution (or a confirmation of your
intuition).

Martin

> 
> Richard.
> 
>> Martin
>>
>> PS An alternate solution might be to provide a late-evaluated built-in,
>> something like
>>
>>     <tri-state> __builtin_decl (T *ptr)
>>
>> that would return a <yes> answer if ptr could be determined to point
>> to a declared object or subobject, a <no> if not (e.g., it points to
>> allocated storage), and a <don't know> if it couldn't be determined.
>>    The built-in would then be used in code to eliminate infeasible
>> paths.  For example, a built-in like that could be used to eliminate
>> the assert in string::f():
>>
>>     void string::f ()
>>     {
>>       if (<yes> == __builtin_decl_p (s) && s != buf)
>>         __builtin_unreachable ();
>>
>>       assert (s != array);
>>     }
>>
>> A built-in might be more flexible but would also be harder to use
>> (and likely more error-prone).


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [RFC] restricting aliasing by standard containers (PR 98465)
  2021-01-08 20:16   ` Martin Sebor
@ 2021-01-11  7:43     ` Richard Biener
  0 siblings, 0 replies; 4+ messages in thread
From: Richard Biener @ 2021-01-11  7:43 UTC (permalink / raw)
  To: Martin Sebor; +Cc: gcc mailing list

On Fri, Jan 8, 2021 at 9:16 PM Martin Sebor <msebor@gmail.com> wrote:
>
> On 1/8/21 12:51 AM, Richard Biener wrote:
> > On Thu, Jan 7, 2021 at 10:41 PM Martin Sebor <msebor@gmail.com> wrote:
> >>
> >> The test case in PR 98465 brings to light a problem we've discussed
> >> before (e.g., PR 93971) where a standard container (std::string in
> >> this case but the problem applies to any class that owns and manages
> >> allocated memory) might trigger warnings for unreachable code.
> >> The code is not eliminated due to a missing aliasing constraint:
> >> because GCC doesn't know that the member pointer to the memory
> >> managed by the container cannot alias other objects, it emits code
> >> that can never be executed in a valid program and that's prone to
> >> causing false positives.
> >>
> >> To illustrate, at the moment it's impossible to fold away the assert
> >> below because there's no way to determine in the middle end that
> >> String::s cannot point to a:
> >>
> >>     extern char array[];
> >>
> >>     class String {
> >>       char *s;
> >>     public:
> >>        String (const char *p): s (strdup (p)) { }
> >>        String (const String &str): s (strdup (str.s)) { }
> >>        ~String () { free (s); }
> >>
> >>        void f () { assert (s != array); }
> >>     };
> >>
> >> The constraint is obvious to a human reader (String::s is private
> >> and nothing sets it to point to array) but there's no way for GCC
> >> to infer it from the code alone (at least not in general): there
> >> could be member or friend functions defined in other translation
> >> units that violate this assumption.
> >>
> >> One way to solve the problem is to explicitly declare that
> >> String::s, in fact, doesn't point to any such objects and that it
> >> only ever points to allocated memory.  My idea for doing that is
> >> to extend attribute malloc to (or add a new attribute for) pointer
> >> variables to imply that the pointer only points to allocated memory.
> >>
> >> However, besides pointing to allocated memory, std::string can also
> >> point to its own internal buffer, so the extended malloc attribute
> >> couldn't be used there by itself.  I think this could be solved by
> >> also either extending the may_alias attribute or adding a new
> >> "alias" (or some such) attribute to denote that a pointer variable
> >> may point to an object or subobject.
> >>
> >> Putting the two together, to eliminate the assert, std::string would
> >> be annotated like so:
> >>
> >>     class string {
> >>       char *s __attribute__ ((malloc, may_alias (buf)));
> >>       char buf[8];
> >>     public:
> >>        string (): s (buf) { }
> >>        string (const char *p): s (strdup (p)) { }
> >>        string (const string &str): s (strdup (str.s)) { }
> >>        ~string () { if (s != buf) free (s); }
> >>
> >>        void f () { assert (s != array); }
> >>     };
> >>
> >> The may_alias association with members is relative to the this pointer
> >> (i.e., as if by may_alias (this->buf), as opposed to being taken as
> >> may_alias (String::buf) and meaning that s might be equal to any other
> >> String::s with a different this.  To help avoid mistakes, setting s
> >> in violation of the constraints would trigger warnings.
> >>
> >> If this sounds reasonable I'm prepared to prototype it, either for
> >> GCC 11 if it's in scope to solve the PR and there's still time, or
> >> (I suspect more likely) for GCC 12.
> >>
> >> Richard, what are your thoughts/concerns?
> >
> > I'm not sure it's feasible to make use of this attribute.  First
> > there's the malloc part which has difficult semantics (similar
> > to restrict) when generating PTA constraints.  We might see
> >
> >   _1 = str.s;
> >   _2 = str.s;
> >
> > but are of course required to associate the same allocated
> > dummy object with both pointers (as opposed to when we'd
> > see two malloc calls).  What would possibly work is to
> > have the object keyed on the field decl, but then for
> >
> >   _1 = p_to_str_4(D);
> >   _2 = _1 + offsetof-s;
> >   _3 = *_2;
> >
> > we have to somehow conservatively arrive at the same object.
> > I don't see how that can work out.
> >
> > All the same applies to the may_alias part but I guess when the
> > malloc part falls apart that's not of much interest.
> >
> > So I'm concerned about correctness - I'm sure you can hack
> > sth together to get some testcases optimized.  But I'm not sure
> > you can make it correct in all cases (within the current PTA
> > framework).
>
> Thanks for the feedback.
>
> Absent some source level annotation I can't think of a good way
> to avoid these false positives.  Do you have any other ideas?
>
> If not, would you be opposed to introducing these attributes to
> suppress warnings (at least at first)?  Besides avoiding the false
> positives, implementing just that part might also be a good proof
> of concept for the aliasing solution (or a confirmation of your
> intuition).

I don't think "overloading" malloc and alias attributes in the
suggested way is good.  Thinking of a general way to communicate
restrictions of what a pointer/field points to would be nice
though, with all of the above caveats in mind.

Richard.

>
> Martin
>
> >
> > Richard.
> >
> >> Martin
> >>
> >> PS An alternate solution might be to provide a late-evaluated built-in,
> >> something like
> >>
> >>     <tri-state> __builtin_decl (T *ptr)
> >>
> >> that would return a <yes> answer if ptr could be determined to point
> >> to a declared object or subobject, a <no> if not (e.g., it points to
> >> allocated storage), and a <don't know> if it couldn't be determined.
> >>    The built-in would then be used in code to eliminate infeasible
> >> paths.  For example, a built-in like that could be used to eliminate
> >> the assert in string::f():
> >>
> >>     void string::f ()
> >>     {
> >>       if (<yes> == __builtin_decl_p (s) && s != buf)
> >>         __builtin_unreachable ();
> >>
> >>       assert (s != array);
> >>     }
> >>
> >> A built-in might be more flexible but would also be harder to use
> >> (and likely more error-prone).
>

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2021-01-11  7:43 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-07 21:41 [RFC] restricting aliasing by standard containers (PR 98465) Martin Sebor
2021-01-08  7:51 ` Richard Biener
2021-01-08 20:16   ` Martin Sebor
2021-01-11  7:43     ` Richard Biener

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).