From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
 id 953223857825; Thu, 17 Mar 2022 12:47:54 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 953223857825
From: "dmalcolm at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug middle-end/104854] -Wstringop-overread should not warn for
 strnlen, strndup and strncmp
Date: Thu, 17 Mar 2022 12:47:54 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: middle-end
X-Bugzilla-Version: 11.2.1
X-Bugzilla-Keywords: diagnostic
X-Bugzilla-Severity: normal
X-Bugzilla-Who: dmalcolm at gcc dot gnu.org
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: siddhesh at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 11.3
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-104854-4-BqqWggHCGP@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-104854-4@http.gcc.gnu.org/bugzilla/>
References: <bug-104854-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: gcc-bugs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-bugs mailing list <gcc-bugs.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-bugs>,
 <mailto:gcc-bugs-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-bugs/>
List-Post: <mailto:gcc-bugs@gcc.gnu.org>
List-Help: <mailto:gcc-bugs-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-bugs>,
 <mailto:gcc-bugs-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Thu, 17 Mar 2022 12:47:54 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3D104854
--- Comment #9 from David Malcolm <dmalcolm at gcc dot gnu.org> ---
(In reply to Siddhesh Poyarekar from comment #8)
> (In reply to Martin Sebor from comment #7)
> > Moving warnings into the analyzer and scaling it up to be able to run by
> > default, during development, sounds like a good long-term plan.  Until =
that
>=20
> That's not quite what I'm suggesting here.  I'm not a 100% convinced that
> those are the right heuristics at all; the size argument for strnlen,
> strndup and strncmp does not intend to describe the size of the passed
> strings.  It is only recommended security practice that the *n* variant
> functions be used instead of their unconstrained relatives to mitigate
> overflows.  In fact in more common cases the size argument (especially in
> case of strnlen and strncmp) may describe a completely different buffer or
> some other application-specific property.
>=20
> This is different from the -Wformat-overflow, where there is a clear
> relationship between buffer, the format string and the string representat=
ion
> of input numbers and we're only tweaking is the optimism level of the
> warnings.  So it is not just a question of levels of verosity/paranoia.
>=20
> In that context, using size to describe the underlying buffer of the sour=
ce
> only makes sense only for a subset of uses, making this heuristic quite
> noisy.  So what I'm actually saying is: the heuristic is too noisy but if=
 we
> insist on keeping it, it makes sense as an analyzer warning where the user
> *chooses* to look for pessimistic scenarios and is more tolerant of noisy
> heuristics.

Right now -fanalyzer enables all of the various -Wanalyzer-* warnings by
default [1], and in theory all of them only emit a diagnostic for the case =
when
the analyzer "thinks" there's a definite problem.  There may be bugs in the
analyzer, of course.  I'm a bit wary of the above sentence, as it suggests =
that
the analyzer should be the place to put noisy diagnostics.

Looking at the GCC UX guidelines:
  https://gcc.gnu.org/onlinedocs/gccint/Guidelines-for-Diagnostics.html
note "The user=E2=80=99s attention is important": if a diagnostic is too no=
isy, the
user will either turn it off, or stop paying attention.

The distinction I've been making for -fanalyzer is that -fanalyzer enables a
more expensive (path-based) analysis of the user's code, and will slow down=
 the
user's compile-time, with various warnings tied to that, i.e. I've been
messaging it primarily as a compile-time tradeoff for extra warnings that
otherwise would be too slow to implement, rather than a signal:noise ratio
tradeoff.  -fanalyzer can generate false positives, but I've been trying to
drive that down via bugfixes (it's also relatively new code)

[1] apart from -Wanalyzer-too-complex, but that's more of an implementation
detail.

>=20
> > happens, rather than gratuitously removing warnings that we've added ov=
er
> > the years, just because they fall short of the ideal 100% efficacy (as =
has
> > been known and documented), making them easier to control seems like a
> > better approach.
>=20
> It's not just a matter of efficacy here IMO.  The heuristic for strnlen,
> strncmp and strndup overreads is too loose for it to be taken seriously.=