From: "jason at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug c++/103534] [12 regression] Spurious -Wstringop-overflow warning with std::string concatencation
Date: Fri, 10 Dec 2021 16:29:57 +0000
X-Bugzilla-Product: gcc
X-Bugzilla-Component: c++
X-Bugzilla-Version: 12.0
X-Bugzilla-Keywords: alias, diagnostic, missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P1
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 12.0

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103534

Jason Merrill changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P1
                 CC|                            |aldyh at gcc dot gnu.org,
                   |                            |jason at gcc dot gnu.org

--- Comment #5 from Jason Merrill ---
The dataflow analysis seems to be:

We set the length of one string to 0, and the other string to 16.
Then we store a char to the string buffer, which the compiler thinks could
possibly have clobbered the length we previously set to 0, so we reload it.
And we add the two together.
So now we have a combined length about which we think we know nothing.

We should really somehow tell the compiler that stores to the string char
buffer can't alias other non-char objects. And maybe in general we could do
branch prediction based on assuming that char stores don't clobber values we
knew before? But let's put that missed-optimization issue in a separate PR.

So, let's steer away from that problem by making the second string unknown:

#include <string>
std::string foo(std::string x)
{
  return std::string("1234567890123456") + x;
}

I get the same surprising warning with this testcase.

Now we have an unknown total length. We compare this length to the size of
the local buffer, which partitions the range at 16. On the path where the sum
of the lengths is <= 16, we conclude that the length of string A must either
be 0 or a number so large that adding 16 to it causes it to wrap around to
[0,16] (because integer overflow in unsigned arithmetic is defined), which
branch prediction thinks is just as likely as 0.

So then along that branch we try to append this impossibly large hypothetical
string to this string we do know the length of, and we get this warning.

So, the warning seems to be that if we were to call _M_append with a
ridiculously large __n argument, we would get undefined behavior; in other
words, if x happened to be the longest possible string. It seems that we
check for unreasonable length arguments in the char* append functions, but
not in the string append function. Changing them to do that check silences
the warning.

I'll attach a patch in a moment.
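
To make the wraparound step above concrete, here is a minimal sketch. It is not the patch and not libstdc++ code: sum_fits and check_append_length are hypothetical names made up for illustration, and the guard only approximates the kind of length check described for the char* append functions. It shows that, in size_t arithmetic, "len + 16 <= 16" holds for len == 0 and also for the 16 largest size_t values, and that an explicit length check rejects the huge case up front.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <stdexcept>

// Hypothetical helper: true when a + b, computed in size_t, is <= cap.
// Unsigned arithmetic wraps modulo 2^N, so this can hold for a huge `a`.
constexpr bool sum_fits(std::size_t a, std::size_t b, std::size_t cap) {
    return static_cast<std::size_t>(a + b) <= cap;
}

// Hypothetical guard (an assumption, not the real library check):
// refuse to append n characters to a string of length `current`
// if the result would exceed max_size. Assumes current <= max_size.
inline void check_append_length(std::size_t current, std::size_t n,
                                std::size_t max_size) {
    if (n > max_size - current)
        throw std::length_error("append: length too large");
}

// Exercises the two cases the range analysis cannot tell apart.
inline void demo() {
    const std::size_t big = std::numeric_limits<std::size_t>::max();
    assert(sum_fits(0, 16, 16));         // the realistic case: empty string
    assert(sum_fits(big - 15, 16, 16));  // wraps to 0
    assert(sum_fits(big, 16, 16));       // wraps to 15
    assert(!sum_fits(big - 16, 16, 16)); // sum is exactly big, no wrap
    assert(!sum_fits(1, 16, 16));        // 17 > 16
}
```

With a guard like this in front of the append, the path with the impossibly large length ends in a throw before any copy is attempted, which is presumably why adding the check makes the warning go away.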