Using source-level annotations to help GCC detect buffer overflows

public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed

* Using source-level annotations to help GCC detect buffer overflows
@ 2021-06-28 19:06 Martin Sebor
  2021-06-29 12:27 ` David Brown
  0 siblings, 1 reply; 6+ messages in thread
From: Martin Sebor @ 2021-06-28 19:06 UTC (permalink / raw)
  To: gcc mailing list

I wrote an article for the Red Hat Developer blog about how
to annotate code to get the most out of GCC's access checking
warnings like -Warray-bounds, -Wformat-overflow, and
-Wstringop-overflow.  The article published last week:

https://developers.redhat.com/articles/2021/06/25/use-source-level-annotations-help-gcc-detect-buffer-overflows

Martin

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Using source-level annotations to help GCC detect buffer overflows
  2021-06-28 19:06 Using source-level annotations to help GCC detect buffer overflows Martin Sebor
@ 2021-06-29 12:27 ` David Brown
  2021-06-29 15:50   ` Martin Sebor
  0 siblings, 1 reply; 6+ messages in thread
From: David Brown @ 2021-06-29 12:27 UTC (permalink / raw)
  To: Martin Sebor, gcc mailing list

On 28/06/2021 21:06, Martin Sebor via Gcc wrote:
> I wrote an article for the Red Hat Developer blog about how
> to annotate code to get the most out of GCC's access checking
> warnings like -Warray-bounds, -Wformat-overflow, and
> -Wstringop-overflow.  The article published last week:
> 
> https://developers.redhat.com/articles/2021/06/25/use-source-level-annotations-help-gcc-detect-buffer-overflows
> 

Thanks for that write-up - and of course thank you to whoever
implemented these attributes!

The caveat that the access attributes are lost when a function is
inlined is an important one.  As a user who appreciates all the checks I
can get, it is disappointing - but I assume there are good reasons for
that limitation.  I can merely hope that will change in future gcc versions.

I believe it would make sense to add this information to the gcc manual
page for common function attributes.  There are quite a number of
attributes that are useful for static checking, such as
"warn_unused_result" and "nonnull".  Are these also dropped if the
function is inlined?

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Using source-level annotations to help GCC detect buffer overflows
  2021-06-29 12:27 ` David Brown
@ 2021-06-29 15:50   ` Martin Sebor
  2021-06-29 18:31     ` David Brown
  0 siblings, 1 reply; 6+ messages in thread
From: Martin Sebor @ 2021-06-29 15:50 UTC (permalink / raw)
  To: David Brown, gcc mailing list

On 6/29/21 6:27 AM, David Brown wrote:
> On 28/06/2021 21:06, Martin Sebor via Gcc wrote:
>> I wrote an article for the Red Hat Developer blog about how
>> to annotate code to get the most out of GCC's access checking
>> warnings like -Warray-bounds, -Wformat-overflow, and
>> -Wstringop-overflow.  The article published last week:
>>
>> https://developers.redhat.com/articles/2021/06/25/use-source-level-annotations-help-gcc-detect-buffer-overflows
>>
> 
> Thanks for that write-up - and of course thank you to whoever
> implemented these attributes!
> 
> The caveat that the access attributes are lost when a function is
> inlined is an important one.  As a user who appreciates all the checks I
> can get, it is disappointing - but I assume there are good reasons for
> that limitation.  I can merely hope that will change in future gcc versions.

There's nothing the attribute could obviously attach to after a call
has been inlined.  An extreme example is a function whose argument
isn't used:

   __attribute__ ((access (write_only, 1, 2))) void
   f (char *p, int n) { }

(The function might have a body in the original source that could
be eliminated from the IL based on the values of other arguments.)

Calls to it that are not inlined will be checked but those that are
won't be.  This could be improved by doing the checking also before
inlining but at a cost of some false positives for code that's later
determined to be unreachable.  I don't have a sense of how bad it
might be so it's something to try.  This class of false positives
could also be dealt with by queuing up the warnings (e.g., by
emitting them into the IL via __builtin_warning) and issuing them
only if they survive dead code elimination.  This is something I'd
like to try to tackle for GCC 12.

> 
> I believe it would make sense to add this information to the gcc manual
> page for common function attributes.  There are quite a number of
> attributes that are useful for static checking, such as
> "warn_unused_result" and "nonnull".  Are these also dropped if the
> function is inlined?

I agree the documentation could stand to be made clearer on this
point.  In general, I think it would be helpful to give users
more guidance about what to expect from attributes as well as
warnings: which ones are purely lexical and which ones flow-
sensitive and so inherently susceptible to false positives and
negatives, and to what extent.

Whether an attribute has an effect depends on the compilation stage
where it's handled.  warn_unused_result is handled very early (well
before inlining) so it always has the expected effect.  Attribute
nonnull is handled both early (to catch the simple cases) and also
later, after inlining, to benefit from some flow analysis, so its
effect is lost if the function it attaches to is inlined.  Attribute
access is handled very late and so it suffers from this problem
even more.

The new attribute malloc (to associate allocators with deallocators)
is also handled very late but it deals with the same problem by
disabling inlining.  This was done to avoid false positives, not
to prevent false negatives, but it works for both.  Disabling
inlining is obviously suboptimal and wouldn't be appropriate for
simple accessor functions but for memory allocation it seems like
an acceptable tradeoff.

The inlining problem is not unique to attributes that affect
warnings.  It impacts all function (and function type) attributes,
including those that affect optimization.  Those that specifically
change optimization options disable inlining to avoid meaningless
mismatches between options controlling the codegen of the caller
and those intended to control the codegen for the callee.

Martin

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Using source-level annotations to help GCC detect buffer overflows
  2021-06-29 15:50   ` Martin Sebor
@ 2021-06-29 18:31     ` David Brown
  2021-06-30 17:12       ` Martin Sebor
  0 siblings, 1 reply; 6+ messages in thread
From: David Brown @ 2021-06-29 18:31 UTC (permalink / raw)
  To: Martin Sebor, gcc mailing list

On 29/06/2021 17:50, Martin Sebor wrote:
> On 6/29/21 6:27 AM, David Brown wrote:
>> On 28/06/2021 21:06, Martin Sebor via Gcc wrote:
>>> I wrote an article for the Red Hat Developer blog about how
>>> to annotate code to get the most out of GCC's access checking
>>> warnings like -Warray-bounds, -Wformat-overflow, and
>>> -Wstringop-overflow.  The article published last week:
>>>
>>> https://developers.redhat.com/articles/2021/06/25/use-source-level-annotations-help-gcc-detect-buffer-overflows
>>>
>>>
>>
>> Thanks for that write-up - and of course thank you to whoever
>> implemented these attributes!
>>
>> The caveat that the access attributes are lost when a function is
>> inlined is an important one.  As a user who appreciates all the checks I
>> can get, it is disappointing - but I assume there are good reasons for
>> that limitation.  I can merely hope that will change in future gcc
>> versions.
> 
> There's nothing the attribute could obviously attach to after a call
> has been inlined.  An extreme example is a function whose argument
> isn't used:
> 
>   __attribute__ ((access (write_only, 1, 2))) void
>   f (char *p, int n) { }
> 
> (The function might have a body in the original source that could
> be eliminated from the IL based on the values of other arguments.)

Could these attributes not be attached to the arguments when the
function is called, or the parameters when the function is expanded?
After all, in cases such as the "access" attribute it is not the
function as such that has the access hints, it is the parameters of the
function.

(I'm talking here based on absolutely no knowledge of how this is
implemented, but it's always possible that a different view, unbiased by
knowing the facts, can inspire new ideas.)

> 
> Calls to it that are not inlined will be checked but those that are
> won't be.  This could be improved by doing the checking also before
> inlining but at a cost of some false positives for code that's later
> determined to be unreachable.  I don't have a sense of how bad it
> might be so it's something to try.  This class of false positives
> could also be dealt with by queuing up the warnings (e.g., by
> emitting them into the IL via __builtin_warning) and issuing them
> only if they survive dead code elimination.  This is something I'd
> like to try to tackle for GCC 12.
> 

I fully appreciate that some checks can be easier earlier in the
process, others later.  It might even be helpful to do similar checks at
more than one stage, and combine the results.

>>
>> I believe it would make sense to add this information to the gcc manual
>> page for common function attributes.  There are quite a number of
>> attributes that are useful for static checking, such as
>> "warn_unused_result" and "nonnull".  Are these also dropped if the
>> function is inlined?
> 
> I agree the documentation could stand to be made clearer on this
> point.  In general, I think it would be helpful to give users
> more guidance about what to expect from attributes as well as
> warnings: which ones are purely lexical and which ones flow-
> sensitive and so inherently susceptible to false positives and
> negatives, and to what extent.

It could be difficult to quantify that kind of thing, but sometimes
guidance could be useful.  (There is already such information for some
warning flags, especially those that support multiple levels.)

Certainly since first reading about the "access" attributes, I have been
considering adding them to my current project.  I have also been mulling
around in my head possibilities of making variadic templates in C++ that
add access attributes in the right places for some kinds of pointers -
but now that I know the attributes will get dropped for inline
functions, and such templates would involve inline functions, there is
little point.  (Maybe I will still figure a neat way to do this for
external functions - it just won't be useful in as many places.)

> 
> Whether an attribute has an effect depends on the compilation stage
> where it's handled.  warn_unused_result is handled very early (well
> before inlining) so it always has the expected effect.  Attribute
> nonnull is handled both early (to catch the simple cases) and also
> later, after inlining, to benefit from some flow analysis, so its
> effect is lost if the function it attaches to is inlined.  Attribute
> access is handled very late and so it suffers from this problem
> even more.
> 

I suppose some attributes are not needed for inline functions, since the
compiler has the full function definition and can figure some things out
itself.  That would apply to "pure" and "const" functions, I expect.
And if you want a parameter to be non-null, it's possible to do a check
inside the function:

extern void __attribute__((error("Nonnull check failed")))
		nonnull_check_failed(void);

#define nonnull(x) \
    do { \
        if (__builtin_constant_p(!(x))) { \
            if (!(x)) nonnull_check_failed(); \
        } \
        if (!(x)) __builtin_unreachable(); \
    } while (0)

inline int foo(int *p) {
	nonnull(p);
	(*p)++;
	return *p;
}

(The "__builtin_unreachable()" line could also be a call to an error
handler, or missing entirely, according to need.)

If you try to call "foo(0)" and the compiler can see at compile time
that the parameter is null, you'll get a compile-time error.  I've used
that kind of check in my code, but it's a little uglier than
__attribute__((nonnull)) !

> The new attribute malloc (to associate allocators with deallocators)
> is also handled very late but it deals with the same problem by
> disabling inlining.  This was done to avoid false positives, not
> to prevent false negatives, but it works for both.  Disabling
> inlining is obviously suboptimal and wouldn't be appropriate for
> simple accessor functions but for memory allocation it seems like
> an acceptable tradeoff.

I've sometimes had allocator functions that are so simple that inlining
is appropriate (on dedicated embedded systems it is not unusual to need
to allocate some memory early on in the code, but never need to free it,
leading to minimal allocation functions).  But the cost of a function
call would not be noticeable.

> 
> The inlining problem is not unique to attributes that affect
> warnings.  It impacts all function (and function type) attributes,
> including those that affect optimization.  Those that specifically
> change optimization options disable inlining to avoid meaningless
> mismatches between options controlling the codegen of the caller
> and those intended to control the codegen for the callee.
> 

It's obvious that some attributes don't play well with inlining, such as
"section" (unless inlined in a function with the same section
attribute), but it looks like there is a lot of detail that is missing
from the manual pages here.  And some of these effects are
counter-intuitive and unhelpful.

For example, it is very occasionally useful to arithmetic operations on
signed types with wrapping semantics rather than the usual "overflow
doesn't happen" semantics that lets gcc produce more efficient code.  A
neat and convenient way to write that in C++ would be to make an enum
class for wrapping ints:

enum class WInt : int {};

__attribute__((optimize("-fwrapv")))
WInt operator + (WInt x, WInt y) {
    return (WInt) ((int) x + (int) y);
}

__attribute__((optimize("-fwrapv")))
WInt operator - (WInt x, WInt y) {
    return (WInt) ((int) x - (int) y);
}

// etc.

Simple, clear, safe (you can't mix "int" and "WInt" by mistake) and
efficient - one might think.  But it turns out this is not the case -
using these operators from a function that does not also have "-fwrapv"
in effect will lead to function calls.

I'm glad I found out now, and not in a situation where inlining was
important.  But I think it would be a good thing to mention this in the
documentation.  (It would be even better to remove the restriction on
inlining, but I expect that will take more time!)

mvh.,

David

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Using source-level annotations to help GCC detect buffer overflows
  2021-06-29 18:31     ` David Brown
@ 2021-06-30 17:12       ` Martin Sebor
  2021-07-01  8:18         ` David Brown
  0 siblings, 1 reply; 6+ messages in thread
From: Martin Sebor @ 2021-06-30 17:12 UTC (permalink / raw)
  To: David Brown, gcc mailing list

On 6/29/21 12:31 PM, David Brown wrote:
> On 29/06/2021 17:50, Martin Sebor wrote:
>> On 6/29/21 6:27 AM, David Brown wrote:
>>> On 28/06/2021 21:06, Martin Sebor via Gcc wrote:
>>>> I wrote an article for the Red Hat Developer blog about how
>>>> to annotate code to get the most out of GCC's access checking
>>>> warnings like -Warray-bounds, -Wformat-overflow, and
>>>> -Wstringop-overflow.  The article published last week:
>>>>
>>>> https://developers.redhat.com/articles/2021/06/25/use-source-level-annotations-help-gcc-detect-buffer-overflows
>>>>
>>>>
>>>
>>> Thanks for that write-up - and of course thank you to whoever
>>> implemented these attributes!
>>>
>>> The caveat that the access attributes are lost when a function is
>>> inlined is an important one.  As a user who appreciates all the checks I
>>> can get, it is disappointing - but I assume there are good reasons for
>>> that limitation.  I can merely hope that will change in future gcc
>>> versions.
>>
>> There's nothing the attribute could obviously attach to after a call
>> has been inlined.  An extreme example is a function whose argument
>> isn't used:
>>
>>    __attribute__ ((access (write_only, 1, 2))) void
>>    f (char *p, int n) { }
>>
>> (The function might have a body in the original source that could
>> be eliminated from the IL based on the values of other arguments.)
> 
> Could these attributes not be attached to the arguments when the
> function is called, or the parameters when the function is expanded?
> After all, in cases such as the "access" attribute it is not the
> function as such that has the access hints, it is the parameters of the
> function.
> 
> (I'm talking here based on absolutely no knowledge of how this is
> implemented, but it's always possible that a different view, unbiased by
> knowing the facts, can inspire new ideas.)

Attaching these attributes to function parameters is an interesting
idea that might be worth exploring.  We've talked about letting
attribute access apply to variables for other reasons (detecting
attempts to modify immutable objects, as I mention in the article).
so your suggestion would be in line with that.  Associating two
variables that aren't parameters might be tricky.

> 
>>
>> Calls to it that are not inlined will be checked but those that are
>> won't be.  This could be improved by doing the checking also before
>> inlining but at a cost of some false positives for code that's later
>> determined to be unreachable.  I don't have a sense of how bad it
>> might be so it's something to try.  This class of false positives
>> could also be dealt with by queuing up the warnings (e.g., by
>> emitting them into the IL via __builtin_warning) and issuing them
>> only if they survive dead code elimination.  This is something I'd
>> like to try to tackle for GCC 12.
>>
> 
> I fully appreciate that some checks can be easier earlier in the
> process, others later.  It might even be helpful to do similar checks at
> more than one stage, and combine the results.
> 
>>>
>>> I believe it would make sense to add this information to the gcc manual
>>> page for common function attributes.  There are quite a number of
>>> attributes that are useful for static checking, such as
>>> "warn_unused_result" and "nonnull".  Are these also dropped if the
>>> function is inlined?
>>
>> I agree the documentation could stand to be made clearer on this
>> point.  In general, I think it would be helpful to give users
>> more guidance about what to expect from attributes as well as
>> warnings: which ones are purely lexical and which ones flow-
>> sensitive and so inherently susceptible to false positives and
>> negatives, and to what extent.
> 
> It could be difficult to quantify that kind of thing, but sometimes
> guidance could be useful.  (There is already such information for some
> warning flags, especially those that support multiple levels.)
> 
> Certainly since first reading about the "access" attributes, I have been
> considering adding them to my current project.  I have also been mulling
> around in my head possibilities of making variadic templates in C++ that
> add access attributes in the right places for some kinds of pointers -
> but now that I know the attributes will get dropped for inline
> functions, and such templates would involve inline functions, there is
> little point.  (Maybe I will still figure a neat way to do this for
> external functions - it just won't be useful in as many places.)

Unfortunately, with extensive inlining and templates, C++ support
for these attributes is less robust than it ideally would be.
Improving it is on my to do list.

>> Whether an attribute has an effect depends on the compilation stage
>> where it's handled.  warn_unused_result is handled very early (well
>> before inlining) so it always has the expected effect.  Attribute
>> nonnull is handled both early (to catch the simple cases) and also
>> later, after inlining, to benefit from some flow analysis, so its
>> effect is lost if the function it attaches to is inlined.  Attribute
>> access is handled very late and so it suffers from this problem
>> even more.
>>
> 
> I suppose some attributes are not needed for inline functions, since the
> compiler has the full function definition and can figure some things out
> itself.  That would apply to "pure" and "const" functions, I expect.

I was going to agree, but then I tested it and found out that const
(and most likely pure) do actually make a difference on inline
functions.  For example in the test case below the inequality is
folded to false.

int f (int);

__attribute__ ((const))
int g (int i) { return f (i); }

void h (int i)
{
   if (g (i) != g (i))
     __builtin_abort ();
}

On the other hand, the equality below is not folded unless f() is
also declared with attribute malloc:

void* f (void);

static int a[1];

__attribute__ ((malloc))
void* g (void) { return f (); }

void h (void)
{
   if (g () == a)
     __builtin_abort ();
}

With heavy inlining (e.g., with LTO) whether a function attribute
will have an effect or not in a given caller is anyone's guess :(

> And if you want a parameter to be non-null, it's possible to do a check
> inside the function:
> 
> extern void __attribute__((error("Nonnull check failed")))
> 		nonnull_check_failed(void);
> 
> #define nonnull(x) \
>      do { \
>          if (__builtin_constant_p(!(x))) { \
>              if (!(x)) nonnull_check_failed(); \
>          } \
>          if (!(x)) __builtin_unreachable(); \
>      } while (0)
> 
> 
> inline int foo(int *p) {
> 	nonnull(p);
> 	(*p)++;
> 	return *p;
> }
> 
> 
> (The "__builtin_unreachable()" line could also be a call to an error
> handler, or missing entirely, according to need.)
> 
> 
> If you try to call "foo(0)" and the compiler can see at compile time
> that the parameter is null, you'll get a compile-time error.  I've used
> that kind of check in my code, but it's a little uglier than
> __attribute__((nonnull)) !

This is a possible workaround.  GCC should be able to do the same
thing for you (that's the __builtin_warning idea).

> 
> 
> 
>> The new attribute malloc (to associate allocators with deallocators)
>> is also handled very late but it deals with the same problem by
>> disabling inlining.  This was done to avoid false positives, not
>> to prevent false negatives, but it works for both.  Disabling
>> inlining is obviously suboptimal and wouldn't be appropriate for
>> simple accessor functions but for memory allocation it seems like
>> an acceptable tradeoff.
> 
> I've sometimes had allocator functions that are so simple that inlining
> is appropriate (on dedicated embedded systems it is not unusual to need
> to allocate some memory early on in the code, but never need to free it,
> leading to minimal allocation functions).  But the cost of a function
> call would not be noticeable.
> 
>>
>> The inlining problem is not unique to attributes that affect
>> warnings.  It impacts all function (and function type) attributes,
>> including those that affect optimization.  Those that specifically
>> change optimization options disable inlining to avoid meaningless
>> mismatches between options controlling the codegen of the caller
>> and those intended to control the codegen for the callee.
>>
> 
> It's obvious that some attributes don't play well with inlining, such as
> "section" (unless inlined in a function with the same section
> attribute), but it looks like there is a lot of detail that is missing
> from the manual pages here.  And some of these effects are
> counter-intuitive and unhelpful.

Agreed.  I hope my articles will be helpful here (it would be nice
if GCC had a blog where they could be posted too).  The manual
should be updated to mention these basic limitations; the challenge
is the inconsistency between how they're applied.

> 
> For example, it is very occasionally useful to arithmetic operations on
> signed types with wrapping semantics rather than the usual "overflow
> doesn't happen" semantics that lets gcc produce more efficient code.  A
> neat and convenient way to write that in C++ would be to make an enum
> class for wrapping ints:
> 
> enum class WInt : int {};
> 
> __attribute__((optimize("-fwrapv")))
> WInt operator + (WInt x, WInt y) {
>      return (WInt) ((int) x + (int) y);
> }
> 
> __attribute__((optimize("-fwrapv")))
> WInt operator - (WInt x, WInt y) {
>      return (WInt) ((int) x - (int) y);
> }
> 
> // etc.
> 
> 
> Simple, clear, safe (you can't mix "int" and "WInt" by mistake) and
> efficient - one might think.  But it turns out this is not the case -
> using these operators from a function that does not also have "-fwrapv"
> in effect will lead to function calls.
> 
> I'm glad I found out now, and not in a situation where inlining was
> important.  But I think it would be a good thing to mention this in the
> documentation.  (It would be even better to remove the restriction on
> inlining, but I expect that will take more time!)

I agree (on both counts).  Raising bugs (ideally with test cases
showing counterintuitive effects) would help.

Martin

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Using source-level annotations to help GCC detect buffer overflows
  2021-06-30 17:12       ` Martin Sebor
@ 2021-07-01  8:18         ` David Brown
  0 siblings, 0 replies; 6+ messages in thread
From: David Brown @ 2021-07-01  8:18 UTC (permalink / raw)
  To: Martin Sebor, gcc mailing list


Thanks for the reply here.  I've snipped a bit to save space.

On 30/06/2021 19:12, Martin Sebor wrote:
> On 6/29/21 12:31 PM, David Brown wrote:
>> On 29/06/2021 17:50, Martin Sebor wrote:
>>> On 6/29/21 6:27 AM, David Brown wrote:
>>>> On 28/06/2021 21:06, Martin Sebor via Gcc wrote:
>>>>> I wrote an article for the Red Hat Developer blog about how
>>>>> to annotate code to get the most out of GCC's access checking
>>>>> warnings like -Warray-bounds, -Wformat-overflow, and
>>>>> -Wstringop-overflow.  The article published last week:
>>>>>
>>>>> https://developers.redhat.com/articles/2021/06/25/use-source-level-annotations-help-gcc-detect-buffer-overflows 
>>>>>
>>
>> Could these attributes not be attached to the arguments when the
>> function is called, or the parameters when the function is expanded?
>> After all, in cases such as the "access" attribute it is not the
>> function as such that has the access hints, it is the parameters of the
>> function.
>>
>> (I'm talking here based on absolutely no knowledge of how this is
>> implemented, but it's always possible that a different view, unbiased by
>> knowing the facts, can inspire new ideas.)
> 
> Attaching these attributes to function parameters is an interesting
> idea that might be worth exploring.  We've talked about letting
> attribute access apply to variables for other reasons (detecting
> attempts to modify immutable objects, as I mention in the article).
> so your suggestion would be in line with that.  Associating two
> variables that aren't parameters might be tricky.
> 

It has always seemed to me that some of these attributes are about the 
parameters, rather than the functions.  It would make more sense when 
using them if the attributes worked as qualifiers for the parameter 
(vaguely like "restrict") and were written before the parameter itself, 
rather than as a function attribute with a parameter index.  Of course, 
that gets messy when you have an attribute that ties two parameters 
together (readonly with a size, for example).

>>
>> Certainly since first reading about the "access" attributes, I have been
>> considering adding them to my current project.  I have also been mulling
>> around in my head possibilities of making variadic templates in C++ that
>> add access attributes in the right places for some kinds of pointers -
>> but now that I know the attributes will get dropped for inline
>> functions, and such templates would involve inline functions, there is
>> little point.  (Maybe I will still figure a neat way to do this for
>> external functions - it just won't be useful in as many places.)
> 
> Unfortunately, with extensive inlining and templates, C++ support
> for these attributes is less robust than it ideally would be.
> Improving it is on my to do list.
> 

I think with modern coding styles and compiler usage, you have to assume 
that /everything/ is likely to get inlined somewhere.  Lots of C++ 
coding is done with header-only libraries, full of templates and inline 
functions.  Even variables can be templates or inline variables with 
current standards.  And no matter how the code is written, once you use 
LTO then pretty much anything can be inlined back and forth, or 
partially inlined, or partially outlined (is that the right term?), or 
cloned.  The concept of "function" in C or C++ that corresponds to an 
externally visible assembly label, a sequence of assembly instructions 
following a strict ABI and a "return", is fast disappearing.

I don't foresee you or the other gcc developers getting bored anytime soon!

>>> Whether an attribute has an effect depends on the compilation stage
>>> where it's handled.  warn_unused_result is handled very early (well
>>> before inlining) so it always has the expected effect.  Attribute
>>> nonnull is handled both early (to catch the simple cases) and also
>>> later, after inlining, to benefit from some flow analysis, so its
>>> effect is lost if the function it attaches to is inlined.  Attribute
>>> access is handled very late and so it suffers from this problem
>>> even more.
>>>
>>
>> I suppose some attributes are not needed for inline functions, since the
>> compiler has the full function definition and can figure some things out
>> itself.  That would apply to "pure" and "const" functions, I expect.
> 
> I was going to agree, but then I tested it and found out that const
> (and most likely pure) do actually make a difference on inline
> functions.  For example in the test case below the inequality is
> folded to false.
> 
> int f (int);
> 
> __attribute__ ((const))
> int g (int i) { return f (i); }
> 
> void h (int i)
> {
>    if (g (i) != g (i))
>      __builtin_abort ();
> }
> 
> On the other hand, the equality below is not folded unless f() is
> also declared with attribute malloc:
> 
> void* f (void);
> 
> static int a[1];
> 
> __attribute__ ((malloc))
> void* g (void) { return f (); }
> 
> void h (void)
> {
>    if (g () == a)
>      __builtin_abort ();
> }
> 
> With heavy inlining (e.g., with LTO) whether a function attribute
> will have an effect or not in a given caller is anyone's guess :(
> 
>> And if you want a parameter to be non-null, it's possible to do a check
>> inside the function:
>>
>> extern void __attribute__((error("Nonnull check failed")))
>>         nonnull_check_failed(void);
>>
>> #define nonnull(x) \
>>      do { \
>>          if (__builtin_constant_p(!(x))) { \
>>              if (!(x)) nonnull_check_failed(); \
>>          } \
>>          if (!(x)) __builtin_unreachable(); \
>>      } while (0)
>>
>>
>> inline int foo(int *p) {
>>     nonnull(p);
>>     (*p)++;
>>     return *p;
>> }
>>
>>
>> (The "__builtin_unreachable()" line could also be a call to an error
>> handler, or missing entirely, according to need.)
>>
>>
>> If you try to call "foo(0)" and the compiler can see at compile time
>> that the parameter is null, you'll get a compile-time error.  I've used
>> that kind of check in my code, but it's a little uglier than
>> __attribute__((nonnull)) !
> 
> This is a possible workaround.  GCC should be able to do the same
> thing for you (that's the __builtin_warning idea).
> 

__builtin_warning is, AFAIK, not documented in the gcc user manual.  But 
if there were builtins to give a warning or an error that could be used 
instead of my usual method of calling a non-existent function with 
"error" attribute, that would be neater.

(I sometimes use macros like this to give the compiler extra information 
that it can use for optimisation or other checks.  I have exactly the 
same macro, with the name "assume", that is useful in some code as a 
check, an optimisation hint, and documentation to the programmer.)

>>
>>
>>
>>> The new attribute malloc (to associate allocators with deallocators)
>>> is also handled very late but it deals with the same problem by
>>> disabling inlining.  This was done to avoid false positives, not
>>> to prevent false negatives, but it works for both.  Disabling
>>> inlining is obviously suboptimal and wouldn't be appropriate for
>>> simple accessor functions but for memory allocation it seems like
>>> an acceptable tradeoff.
>>
>> I've sometimes had allocator functions that are so simple that inlining
>> is appropriate (on dedicated embedded systems it is not unusual to need
>> to allocate some memory early on in the code, but never need to free it,
>> leading to minimal allocation functions).  But the cost of a function
>> call would not be noticeable.
>>
>>>
>>> The inlining problem is not unique to attributes that affect
>>> warnings.  It impacts all function (and function type) attributes,
>>> including those that affect optimization.  Those that specifically
>>> change optimization options disable inlining to avoid meaningless
>>> mismatches between options controlling the codegen of the caller
>>> and those intended to control the codegen for the callee.
>>>
>>
>> It's obvious that some attributes don't play well with inlining, such as
>> "section" (unless inlined in a function with the same section
>> attribute), but it looks like there is a lot of detail that is missing
>> from the manual pages here.  And some of these effects are
>> counter-intuitive and unhelpful.
> 
> Agreed.  I hope my articles will be helpful here (it would be nice
> if GCC had a blog where they could be posted too).  

Yes, that would be good.  I can't speak for anyone else, but I find 
articles like yours helpful.

> The manual
> should be updated to mention these basic limitations; the challenge
> is the inconsistency between how they're applied.
> 

I would say it is better to note in the manual that particular 
attributes may be applied inconsistently, that let users assume that 
they are always applied.  People can then be explicit in their code if 
they need consistency, such using  the "noinline" attribute to ensure 
that a function is not inlined and its other attributes are applied.

>>
>> For example, it is very occasionally useful to arithmetic operations on
>> signed types with wrapping semantics rather than the usual "overflow
>> doesn't happen" semantics that lets gcc produce more efficient code.  A
>> neat and convenient way to write that in C++ would be to make an enum
>> class for wrapping ints:
>>
>> enum class WInt : int {};
>>
>> __attribute__((optimize("-fwrapv")))
>> WInt operator + (WInt x, WInt y) {
>>      return (WInt) ((int) x + (int) y);
>> }
>>
>> __attribute__((optimize("-fwrapv")))
>> WInt operator - (WInt x, WInt y) {
>>      return (WInt) ((int) x - (int) y);
>> }
>>
>> // etc.
>>
>>
>> Simple, clear, safe (you can't mix "int" and "WInt" by mistake) and
>> efficient - one might think.  But it turns out this is not the case -
>> using these operators from a function that does not also have "-fwrapv"
>> in effect will lead to function calls.
>>
>> I'm glad I found out now, and not in a situation where inlining was
>> important.  But I think it would be a good thing to mention this in the
>> documentation.  (It would be even better to remove the restriction on
>> inlining, but I expect that will take more time!)
> 
> I agree (on both counts).  Raising bugs (ideally with test cases
> showing counterintuitive effects) would help.
> 
> Martin

I've filed this as a bug at 
<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101279>.

David



^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-07-01  8:18 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-28 19:06 Using source-level annotations to help GCC detect buffer overflows Martin Sebor
2021-06-29 12:27 ` David Brown
2021-06-29 15:50   ` Martin Sebor
2021-06-29 18:31     ` David Brown
2021-06-30 17:12       ` Martin Sebor
2021-07-01  8:18         ` David Brown

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).