From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: by sourceware.org (Postfix, from userid 48) id 469443948A58; Thu, 28 Apr 2022 17:51:32 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 469443948A58 From: "cvs-commit at gcc dot gnu.org" To: gcc-bugs@gcc.gnu.org Subject: [Bug analyzer/105285] False positive with -Wanalyzer-null-dereference in git.git's reftable/reader.c Date: Thu, 28 Apr 2022 17:51:32 +0000 X-Bugzilla-Reason: CC X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: gcc X-Bugzilla-Component: analyzer X-Bugzilla-Version: 12.0 X-Bugzilla-Keywords: X-Bugzilla-Severity: normal X-Bugzilla-Who: cvs-commit at gcc dot gnu.org X-Bugzilla-Status: ASSIGNED X-Bugzilla-Resolution: X-Bugzilla-Priority: P3 X-Bugzilla-Assigned-To: dmalcolm at gcc dot gnu.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: gcc-bugs@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-bugs mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Apr 2022 17:51:32 -0000 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3D105285 --- Comment #10 from CVS Commits --- The master branch has been updated by David Malcolm : https://gcc.gnu.org/g:00c4405cd7f6a144d0a439e4d848d246920e6ff3 commit r13-7-g00c4405cd7f6a144d0a439e4d848d246920e6ff3 Author: David Malcolm Date: Thu Apr 28 13:49:59 2022 -0400 analyzer: handle repeated accesses after init of unknown size [PR105285] PR analyzer/105285 reports a false positive from -Wanalyzer-null-dereference on git.git's reftable/reader.c. A reduced version of the problem can be seen in test_1a of gcc.dg/analyzer/symbolic-12.c in the following: void test_1a (void *p, unsigned next_off) { struct st_1 *r =3D p; external_fn(); if (next_off >=3D r->size) return; if (next_off >=3D r->size) /* We should have already returned if this is the case. */ __analyzer_dump_path (); /* { dg-bogus "path" } */ } where the analyzer erroneously considers this path, where (next_off >=3D r->size) is both false and then true: symbolic-12.c: In function =C3=A2test_1a=C3=A2: symbolic-12.c:22:5: note: path 22 | __analyzer_dump_path (); /* { dg-bogus "path" } */ | ^~~~~~~~~~~~~~~~~~~~~~~ =C3=A2test_1a=C3=A2: events 1-5 | | 17 | if (next_off >=3D r->size) | | ^ | | | | | (1) following =C3=A2false=C3=A2 branch... |...... | 20 | if (next_off >=3D r->size) | | ~ ~~~~~~~ | | | | | | | (2) ...to here | | (3) following =C3=A2true=C3=A2 branch... | 21 | /* We should have already returned if this is the case= .=20 */ | 22 | __analyzer_dump_path (); /* { dg-bogus "path" } */ | | ~~~~~~~~~~~~~~~~~~~~~~~ | | | | | (4) ...to here | | (5) here | The root cause is that, at the call to the external function, the analyzer considers the cluster for *p to have been touched, binding it to a conjured_svalue, but because p is void * no particular size is known for the write, and so the cluster is bound using a symbolic key covering the base region. Later, the accesses to r->size are handled by binding_cluster::get_any_binding, but binding_cluster::get_binding fails to find a match for the concrete field lookup, due to the key for the binding being symbolic, and reaching this code: 1522 /* If this cluster has been touched by a symbolic write, then the content 1523 of any subregion not currently specifically bound is "UNKNOWN"= .=20 */ 1524 if (m_touched) 1525 { 1526 region_model_manager *rmm_mgr =3D mgr->get_svalue_manager (); 1527 return rmm_mgr->get_or_create_unknown_svalue (reg->get_type (= )); 1528 } Hence each access to r->size is an unknown svalue, and thus the condition (next_off >=3D r->size) isn't tracked, leading to the path wi= th contradictory conditions being treated as satisfiable. In the original reproducer in git's reftable/reader.c, the call to the external fn is: reftable_record_type(rec) which is considered to possibly write to *rec, which is *tab, where tab is the void * argument to reftable_reader_seek_void, and thus after the call to reftable_record_type some arbitrary amount of *rec could have been written to. This patch fixes things by detecting the "this cluster has been 'filled' with a conjured value of unknown size" case, and handling get_any_binding on it by returning a sub_svalue of the conjured_svalue, so that repeated accesses to r->size give the same symbolic value, so that the constraint manager rejects the bogus execution path, fixing the false positive. gcc/analyzer/ChangeLog: PR analyzer/105285 * store.cc (binding_cluster::get_any_binding): Handle accessing sub_svalues of clusters where the base region has a symbolic binding. gcc/testsuite/ChangeLog: PR analyzer/105285 * gcc.dg/analyzer/symbolic-12.c: New test. Signed-off-by: David Malcolm =