public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* -fanalyzer: Questions on C vs CPP + use of GCC attr's like malloc()/access()
@ 2022-11-24  1:49 Gavin Ray
  2022-11-25 17:33 ` David Malcolm
  0 siblings, 1 reply; 7+ messages in thread
From: Gavin Ray @ 2022-11-24  1:49 UTC (permalink / raw)
  To: gcc

[-- Attachment #1: Type: text/plain, Size: 1005 bytes --]

Hey all, just a few questions about the fantastic GCC Static Analyzer:

- It's stated that support for C++ vs C is very limited. Does this apply if
  you're writing C++ that is very similar-looking to C and uses few of C++'s
  advanced features?

- I noticed that in C++, the "gnu::malloc" attributes don't seem to report
  "warning: leak of 'xxx_alloc()' [CWE-401] [-Wanalyzer-malloc-leak]", is
this
  normal?

- Is it worthwhile to spend time annotating your method signatures with
  attributes like "malloc" and "access"? Do these aid the -fanalyzer passes?

For instance:

[[gnu::malloc]] [[gnu::malloc(HeapPage_free, 1)]] [[gnu::returns_nonnull]]
struct HeapPage* HeapPage_alloc();

[[gnu::access(read_write, 1, 3)]]
struct RecordID
HeapPage_insert_record(struct HeapPage* page, const char* record,
                       uint32_t record_length);

That's quite a significant bit of annotation-noise tacked on to the
function, so
I wanted to be sure it's worth the investment!

Thank you =)
Gavin Ray

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: -fanalyzer: Questions on C vs CPP + use of GCC attr's like malloc()/access()
  2022-11-24  1:49 -fanalyzer: Questions on C vs CPP + use of GCC attr's like malloc()/access() Gavin Ray
@ 2022-11-25 17:33 ` David Malcolm
  2022-11-25 18:55   ` Gavin Ray
  0 siblings, 1 reply; 7+ messages in thread
From: David Malcolm @ 2022-11-25 17:33 UTC (permalink / raw)
  To: Gavin Ray, gcc

On Wed, 2022-11-23 at 20:49 -0500, Gavin Ray via Gcc wrote:
> Hey all, just a few questions about the fantastic GCC Static
> Analyzer:

Hi!

> 
> - It's stated that support for C++ vs C is very limited. Does this
> apply if
>   you're writing C++ that is very similar-looking to C and uses few
> of C++'s
>   advanced features?

Unfortunately not: even fairly simple-looking C++ code can generate
extra CFG edges relating to exception-handling, which -fanalyzer
currently doesn't understand at all, making the output essentially
useless.  And that's just one issue...

Strictly speaking I have added some *very* minimal regression tests in
C++ to our test suite, but on anything beyond the most trivial example
it's likely to run into a known issue.  I'm tracking some of these
known issues here:
  https://gcc.gnu.org/bugzilla/showdependencytree.cgi?id=97110
but the C++ support is currently so minimal that it's probably not yet
worth filing extra bugs against that tracker :-/

I'm hoping to spend a good chunk of the GCC 14 development cycle
working on adding C++ support, with the aim of being able to analyze
GCC itself ("eating my own dog food"), so I hope this situation will
improve greatly then.

> 
> - I noticed that in C++, the "gnu::malloc" attributes don't seem to
> report
>   "warning: leak of 'xxx_alloc()' [CWE-401] [-Wanalyzer-malloc-
> leak]", is
> this
>   normal?

In theory they should; but you could be running into issues with the
analyzer not fully understanding the control flow graph.

> 
> - Is it worthwhile to spend time annotating your method signatures
> with
>   attributes like "malloc" and "access"? Do these aid the -fanalyzer
> passes?

The analyzer makes use of the "malloc", "nonnull" and "const" function
attributes.

It does make use of the "access" attribute, but only within -Wanalyzer-
tainted-size, for the case where the size param of the access is
attacker-controlled (and the taint checker is currently off by default,
even with -fanalyzer).

But like I said, don't expect these to work on C++ code yet.

> 
> For instance:
> 
> [[gnu::malloc]] [[gnu::malloc(HeapPage_free, 1)]]

IIRC, the first [[gnu::malloc]] here is redundant, as it's implied by
[[gnu::malloc(HeapPage_free, 1)]].

> [[gnu::returns_nonnull]]
> struct HeapPage* HeapPage_alloc();
> 
> [[gnu::access(read_write, 1, 3)]]
> struct RecordID
> HeapPage_insert_record(struct HeapPage* page, const char* record,
>                        uint32_t record_length);
> 
> That's quite a significant bit of annotation-noise tacked on to the
> function, so
> I wanted to be sure it's worth the investment!

Maybe in GCC 14 onwards, but it definitely won't be worth it at the
moment.

Hope this is helpful
Dave

> 
> Thank you =)
> Gavin Ray
> 


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: -fanalyzer: Questions on C vs CPP + use of GCC attr's like malloc()/access()
  2022-11-25 17:33 ` David Malcolm
@ 2022-11-25 18:55   ` Gavin Ray
  2022-11-26 14:09     ` Jonathan Wakely
  0 siblings, 1 reply; 7+ messages in thread
From: Gavin Ray @ 2022-11-25 18:55 UTC (permalink / raw)
  To: David Malcolm; +Cc: gcc

> Unfortunately not: even fairly simple-looking C++ code can generate extra CFG
edges relating to exception-handling ... making the output essentially useless.

Ahh, I had figured the answer might be something like this. Thanks for
confirming. At least from the small codebase I have, swapping from C -> C++ when
compiling preserves all -fanalyzer warnings except for the ones from custom
malloc() attributes, so that's nice.

I won't count on the analyzer being able to do a stellar job though, so I'll
take what I can get =)

> I'm hoping to spend a good chunk of the GCC 14 development cycle working on
> adding C++ support

That's awesome! I hope you'll write some on these changes again -- I really
enjoyed reading your article on the state of Static Analysis in GCC 12 and your
LPC presentation.

I believe that's where I learned about the SARIF exporter, which with the VS
Code extension is just beyond cool!

> The analyzer makes use of the "malloc", "nonnull" and "const" function
attributes. It does make use of the "access" attribute, but only within
-Wanalyzer- tainted-size, for the case where the size param of the access is
attacker-controlled.

Ahh okay, thanks. I currently don't run the taint analysis, as this app is a
database a-la Postgres/SQLite built as a learning exercise/hobby so it's not so
much of a concern.

> the first [[gnu::malloc]] here is redundant, as it's implied by
> [[gnu::malloc(HeapPage_free, 1)]].

Good to know, ty -- it may be useful to modify the 'attribute' docs for these,
because currently it shows combined usage and phrases it as:

"Independently, the form of the attribute with one or two arguments associates
deallocator as a suitable deallocation function..."

__attribute__ ((malloc, malloc (fclose, 1))) FILE* fdopen (int, const char*);

On a related note, the "fd" analyzers warn: "fd may not be valid" but don't
clarify what "valid" means.

I found the DOT diagrams in the analyzer source and it turns out that the key is
to check "if (fd >= 0)". Maybe this could be added to the "fd" analyzer
warnings/info, too?

====================

Anyways, sorry to drone on. I think that the GCC Static Analyzer and the LLVM
Dataflow Analysis Framework are vital for the the evolution of safety and
developer experience in authoring C/C++

Big thanks to all involved and it's exciting to watch the shortlog & follow the
progress from the sidelines =)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: -fanalyzer: Questions on C vs CPP + use of GCC attr's like malloc()/access()
  2022-11-25 18:55   ` Gavin Ray
@ 2022-11-26 14:09     ` Jonathan Wakely
  2022-11-26 15:48       ` Gavin Ray
  0 siblings, 1 reply; 7+ messages in thread
From: Jonathan Wakely @ 2022-11-26 14:09 UTC (permalink / raw)
  To: Gavin Ray; +Cc: David Malcolm, gcc

[-- Attachment #1: Type: text/plain, Size: 1049 bytes --]

On Fri, 25 Nov 2022, 18:55 Gavin Ray via Gcc, <gcc@gcc.gnu.org> wrote:

>
> On a related note, the "fd" analyzers warn: "fd may not be valid" but don't
> clarify what "valid" means.
>

A valid file descriptor is one that was returned by the C library and
refers to an open file. That's not something GCC defines.


> I found the DOT diagrams in the analyzer source and it turns out that the
> key is
> to check "if (fd >= 0)".


The point is that the OS functions that return a new file descriptor return
a negative value on error, so all valid file descriptors are non-negative.
But not all non-negative integers are valid file descriptors.

You should check for errors when calling open, dup2, socket etc. so you
know whether it succeeded.


Maybe this could be added to the "fd" analyzer
> warnings/info, too?
>

I don't think that's a good idea unless word carefully, it's not as simple
as "test if it's non-negative". You should check for errors when using OS
APIs, but that's always true, it's not difficult to the analyzer output.



>
>
>
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: -fanalyzer: Questions on C vs CPP + use of GCC attr's like malloc()/access()
  2022-11-26 14:09     ` Jonathan Wakely
@ 2022-11-26 15:48       ` Gavin Ray
  2022-11-26 17:47         ` Jonathan Wakely
  0 siblings, 1 reply; 7+ messages in thread
From: Gavin Ray @ 2022-11-26 15:48 UTC (permalink / raw)
  To: Jonathan Wakely; +Cc: David Malcolm, gcc

I was using if (fd != -1) and was still getting the warning which confused me
My suggestion was maybe to add the exact condition the fd analyzer is
looking for to the warning so that folks know how to fix it/trigger
its 'true' branch.

e.g. instead of:
"fd may not be valid"

Something like this, or thereabouts:
"fd may not be valid (expecting fd >= 0)"

On Sat, Nov 26, 2022 at 9:09 AM Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
>
>
>
> On Fri, 25 Nov 2022, 18:55 Gavin Ray via Gcc, <gcc@gcc.gnu.org> wrote:
>>
>>
>> On a related note, the "fd" analyzers warn: "fd may not be valid" but don't
>> clarify what "valid" means.
>
>
> A valid file descriptor is one that was returned by the C library and refers to an open file. That's not something GCC defines.
>
>>
>> I found the DOT diagrams in the analyzer source and it turns out that the key is
>> to check "if (fd >= 0)".
>
>
> The point is that the OS functions that return a new file descriptor return a negative value on error, so all valid file descriptors are non-negative. But not all non-negative integers are valid file descriptors.
>
> You should check for errors when calling open, dup2, socket etc. so you know whether it succeeded.
>
>
>> Maybe this could be added to the "fd" analyzer
>> warnings/info, too?
>
>
> I don't think that's a good idea unless word carefully, it's not as simple as "test if it's non-negative". You should check for errors when using OS APIs, but that's always true, it's not difficult to the analyzer output.
>
>
>>
>>
>>
>>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: -fanalyzer: Questions on C vs CPP + use of GCC attr's like malloc()/access()
  2022-11-26 15:48       ` Gavin Ray
@ 2022-11-26 17:47         ` Jonathan Wakely
  2022-11-26 18:51           ` David Malcolm
  0 siblings, 1 reply; 7+ messages in thread
From: Jonathan Wakely @ 2022-11-26 17:47 UTC (permalink / raw)
  To: Gavin Ray; +Cc: David Malcolm, gcc

[-- Attachment #1: Type: text/plain, Size: 670 bytes --]

On Sat, 26 Nov 2022, 15:48 Gavin Ray, <ray.gavin97@gmail.com> wrote:

> I was using if (fd != -1) and was still getting the warning which confused
> me
> My suggestion was maybe to add the exact condition the fd analyzer is
> looking for to the warning so that folks know how to fix it/trigger
> its 'true' branch.
>
> e.g. instead of:
> "fd may not be valid"
>
> Something like this, or thereabouts:
> "fd may not be valid (expecting fd >= 0)"
>

That seems like an analyzer bug, checking for -1 should be ok. POSIX is
clear that open and socket return -1 on error. I didn't check all the other
functions that return new file descriptors, but I think they're the same.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: -fanalyzer: Questions on C vs CPP + use of GCC attr's like malloc()/access()
  2022-11-26 17:47         ` Jonathan Wakely
@ 2022-11-26 18:51           ` David Malcolm
  0 siblings, 0 replies; 7+ messages in thread
From: David Malcolm @ 2022-11-26 18:51 UTC (permalink / raw)
  To: Jonathan Wakely, Gavin Ray; +Cc: gcc

On Sat, 2022-11-26 at 17:47 +0000, Jonathan Wakely wrote:
> On Sat, 26 Nov 2022, 15:48 Gavin Ray, <ray.gavin97@gmail.com> wrote:
> 
> > I was using if (fd != -1) and was still getting the warning which
> > confused
> > me
> > My suggestion was maybe to add the exact condition the fd analyzer
> > is
> > looking for to the warning so that folks know how to fix it/trigger
> > its 'true' branch.
> > 
> > e.g. instead of:
> > "fd may not be valid"
> > 
> > Something like this, or thereabouts:
> > "fd may not be valid (expecting fd >= 0)"
> > 
> 
> That seems like an analyzer bug, checking for -1 should be ok. POSIX
> is
> clear that open and socket return -1 on error. I didn't check all the
> other
> functions that return new file descriptors, but I think they're the
> same.

Looking at sm-fd.cc: fd_state_machine::on_condition it looks like the
analyzer handles both checks against -1 (eq/ne) *and* checks for >= 0,
so maybe the .dot file needs updating.

Gavin, if you have a reproducer in C [1] where the wording of the
message seems "off", please can you file it in bugzilla and I'll take a
look next week.  The fd state machine is new in GCC 13, so it's
interesting to get this into the hands of end-users.

Thanks
Dave

[1] Sadly, C++ in -fanalyzer is sufficiently broken right now that it's
not worth filing extra bugs on top of the ones I already know about,
alas.


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2022-11-26 18:51 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-24  1:49 -fanalyzer: Questions on C vs CPP + use of GCC attr's like malloc()/access() Gavin Ray
2022-11-25 17:33 ` David Malcolm
2022-11-25 18:55   ` Gavin Ray
2022-11-26 14:09     ` Jonathan Wakely
2022-11-26 15:48       ` Gavin Ray
2022-11-26 17:47         ` Jonathan Wakely
2022-11-26 18:51           ` David Malcolm

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).