public inbox for gcc-bugs@sourceware.org help / color / mirror / Atom feed
From: "dmalcolm at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org> To: gcc-bugs@gcc.gnu.org Subject: [Bug analyzer/106626] New: Improvements to wording of -Wanalyzer-out-of-bounds Date: Mon, 15 Aug 2022 13:08:23 +0000 [thread overview] Message-ID: <bug-106626-4@http.gcc.gnu.org/bugzilla/> (raw) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106626 Bug ID: 106626 Summary: Improvements to wording of -Wanalyzer-out-of-bounds Product: gcc Version: 13.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: analyzer Assignee: dmalcolm at gcc dot gnu.org Reporter: dmalcolm at gcc dot gnu.org CC: tlange at gcc dot gnu.org Target Milestone: --- During the patch review of -Wanalyzer-out-of-bounds we decided to focus on getting the feature implemented in trunk first, and defer coming up with the precise wording, to avoid holding up the feature. I'm filing this bug to track coming up with the precise wording for -Wanalyzer-out-of-bounds. Current status quo: Given e.g.: int arr[10]; int test (void) { return arr[10]; } https://godbolt.org/z/EPrqGTj8s we report: <source>: In function 'test': <source>:6:15: warning: buffer overread [CWE-126] [-Wanalyzer-out-of-bounds] 6 | return arr[10]; | ~~~^~~~ event 1 | | 1 | int arr[10]; | | ^~~ | | | | | (1) capacity is 40 bytes | +--> 'test': events 2-3 | | 4 | test (void) | | ^~~~ | | | | | (2) entry to 'test' | 5 | { | 6 | return arr[10]; | | ~~~~~~~ | | | | | (3) out-of-bounds read from byte 40 till byte 43 but 'arr' ends at byte 40 | <source>:6:15: note: write is 4 bytes past the end of 'arr' 6 | return arr[10]; | ~~~^~~~ The note erroneously says "write" due to a copy&paste which I plan to fix shortly. Goals: I'd like the diagnostics to (somehow) convey the following information to the user: * what was the expression (if available) responsible for the bad access, since we can't always underline exactly the problematic subexpression in a compound expression (or macros could be involved, obscuring the user's view of what the analyzer is "seeing"). e.g. in the above example 'arr[10]' * direction of the access: read vs write? e.g. in the above example: read * boundary being violated: is the access before or after the buffer? e.g. in the above example: after the buffer ,* location: where is the invalid access? heap vs stack vs elsewhere (since this can affect the impact of a vulnerability): e.g. in the above example: stack * magnitude: how far beyond the boundary is the invalid access (consider e.g. the cases of immediately beyond ("off-by-one"), vs near (a few bytes or elements), vs far); e.g. in the above example: 0-3 bytes beyond the boundary, 0 elements beyond when expressed as array index * data size: how much data beyond the boundary is accessed; e.g. in the above example 4 bytes, or 1 element. Doing so is likely to avoid a combinatorial explosion (due to the need for i18n), so to tame this, some of this may need to be split between different parts of the diagnostic (the initial warning, events in the warning's diagnostic path, and any notes after the warning). The above list is taken in part from the the Bug Framework's deprecated Buffer Overflow (BOF) Class: https://samate.nist.gov/BF/Old/BOFClass.html I'm not sure why the Bug Framework deprecated "BOF" (presumably in favor of the "BF Memory Model", https://samate.nist.gov/BF/Classes/MEM/MEMModel.html and its "Memory Use Bugs (MUS) Class": https://samate.nist.gov/BF/Classes/MEM/MUS.html ), but the BOF attributes seem to me to be pertinent information for the user, and good things to think about in test cases. Currently -Wanalyzer-out-of-bounds only warns when the size and offset of the access are constant (not symbolic), and the capacity of the underlying region is constant (not symbolic). I'd like to eventually generalize that (see PR 106625), so ideally whatever scheme we come up with should support that. Ideally these should be reported in terms of the user's source code. In the above example, the messages talk about bytes, but we should probably *also* talk about array indices (e.g. that indices 0 through 9 are valid, and 10 is one beyond). The wording should also be clear about inclusive vs exclusive ranges - I feel 'arr' ends at byte 40" is unclear. Perhaps something like: 'arr' has 10 elements, so valid indices are '[0]' to '[9]' (bytes 0-39) or somesuch. I'm not sure what a correct solution here is, but am filing this now to try to capture the things we might try to design for. Tim: I'm CCing you in case you want to work on this; otherwise I'd like to have a go at this in a few weeks.
next reply other threads:[~2022-08-15 13:08 UTC|newest] Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top 2022-08-15 13:08 dmalcolm at gcc dot gnu.org [this message] 2022-08-15 18:48 ` [Bug analyzer/106626] " cvs-commit at gcc dot gnu.org 2022-12-01 2:31 ` cvs-commit at gcc dot gnu.org 2022-12-01 2:31 ` cvs-commit at gcc dot gnu.org 2022-12-01 2:31 ` cvs-commit at gcc dot gnu.org 2022-12-01 2:31 ` cvs-commit at gcc dot gnu.org 2022-12-01 2:31 ` cvs-commit at gcc dot gnu.org 2023-04-07 12:36 ` dmalcolm at gcc dot gnu.org 2023-06-22 2:06 ` cvs-commit at gcc dot gnu.org
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=bug-106626-4@http.gcc.gnu.org/bugzilla/ \ --to=gcc-bugzilla@gcc.gnu.org \ --cc=gcc-bugs@gcc.gnu.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).