From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 48)
        id D6F4A3858C60; Thu, 2 Feb 2023 12:56:31 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org D6F4A3858C60
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
        s=default; t=1675342591;
        bh=Gvc1dF9s40+6+Fpy0aA1ErPrFa4WoANdCOBbMzviKoI=;
        h=From:To:Subject:Date:In-Reply-To:References:From;
        b=Fz50EluNNRC99AG/USQS9xQebdHmnr8lzpHIRfZ+fszVv+gVNwCv6Jtvh2fg2xkju
         sB2p9m3sMaPn5AW+zJ1TlPN7I7JhPQuOvXn3F1PXS9SlEuN7gKk3CalhBjVkEmCxnT
         +Ufko8XOHOVjOVncAns0owo2+boWleLtaFj1+M1k=
From: "cvs-commit at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug debug/106746] [13 Regression] '-fcompare-debug' failure (length) with -O2 -fsched2-use-superblocks since r13-2041-g6624ad73064de241
Date: Thu, 02 Feb 2023 12:56:25 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: debug
X-Bugzilla-Version: 13.0
X-Bugzilla-Keywords: compare-debug-failure
X-Bugzilla-Severity: normal
X-Bugzilla-Who: cvs-commit at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P1
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 13.0
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106746

--- Comment #27 from CVS Commits ---
The master branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:465a9c51e7d5bafa7a81195b5af20f2a54f22210

commit r13-5652-g465a9c51e7d5bafa7a81195b5af20f2a54f22210
Author: Jakub Jelinek
Date:   Thu Feb 2 13:52:45 2023 +0100

sched-deps, cselib: Fix up some -fcompare-debug issues and regressions [PR108463]

On Sat, Jan 14, 2023 at 08:26:00AM -0300,
Alexandre Oliva via Gcc-patches wrote:
> The testcase used to get scheduled differently depending on the
> presence of debug insns with MEMs.  It's not clear to me why those
> MEMs affected scheduling, but the cselib pre-canonicalization of the
> MEM address is not used at all when analyzing debug insns, so the
> memory allocation and lookup are pure waste.  Somehow, avoiding that
> waste fixes the problem, or makes it go latent.

Unfortunately, this patch breaks the following testcase.
The code in sched_analyze_2 did 2 things:
1) cselib_lookup_from_insn
2) shallow_copy_rtx + cselib_subst_to_values_from_insn
Now, 1) is a precondition of 2): we can only subst the VALUEs if we have
actually looked the address up.  But as can be seen on that testcase,
we are relying on at least 1) being done, because we subst the values
later on even on DEBUG_INSNs and actually use those when needed.
cselib_subst_to_values_from_insn mostly just replaces stuff in the
returned rtx, except for:
      /* This used to happen for autoincrements, but we deal with them
         properly now.  Remove the if stmt for the next release.  */
      if (! e)
        {
          /* Assign a value that doesn't match any other.  */
          e = new_cselib_val (next_uid, GET_MODE (x), x);
        }
which has been like that since 2011.  I hope it is never reachable and
that in stage1 we should replace it with gcc_assert or just remove it
(then it will segfault on the following
      return e->val_rtx;
).

So, I (as done in the patch below) reinstalled 1) but not 2) for
DEBUG_INSNs.

This fixed the new testcase, but broke the PR106746 testcases again.
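
The split between steps 1) and 2) above can be pictured with a small toy model. This is only a sketch under stated assumptions: `ValueTable`, `lookup`, `subst` and `analyze_mem` are hypothetical names invented for illustration, addresses are plain strings, and nothing here is GCC's real cselib API. It shows only that the substitution depends on an earlier lookup, and that a debug insn can do the lookup while leaving its address unsubstituted.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical toy model; none of these names are GCC's real API.
struct ValueTable
{
  std::map<std::string, int> values;  // address expression -> VALUE id
  int next_uid = 1;

  // Step 1: look the address up, creating a VALUE on first sight.
  int lookup (const std::string &addr)
  {
    std::map<std::string, int>::iterator it = values.find (addr);
    if (it != values.end ())
      return it->second;
    int id = next_uid++;
    values[addr] = id;
    return id;
  }

  // Step 2: substitute the address by its VALUE.  Fails (returns false)
  // if step 1 was never done for this address.
  bool subst (const std::string &addr, std::string &out) const
  {
    std::map<std::string, int>::const_iterator it = values.find (addr);
    if (it == values.end ())
      return false;
    out = "VALUE#" + std::to_string (it->second);
    return true;
  }
};

// Analyze one MEM address.  Debug insns get the lookup only, so a
// later substitution still finds the VALUE; other insns get both steps.
std::string analyze_mem (ValueTable &t, const std::string &addr,
                         bool is_debug_insn)
{
  t.lookup (addr);
  if (is_debug_insn)
    return addr;
  std::string out;
  bool ok = t.subst (addr, out);
  assert (ok);  // guaranteed by the lookup just above
  return out;
}
```

In this model, skipping `lookup` for debug insns would make any later `subst` on the same address fail, which is the toy analogue of the breakage described above.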

I've spent a day debugging that and found the problem is that, as documented
in a large comment in cselib.cc above the n_useless_values variable definition,
we spend quite a lot of effort on making sure that VALUEs created on
DEBUG_INSNs don't affect the cselib decisions for non-DEBUG_INSNs such as
pruning of useless values etc., but if a VALUE created that way is then
looked up/needed from non-DEBUG_INSNs, we promote it to non-debug.

The reason for the -fcompare-debug failure is that there is one large DEBUG_INSN
with 16 MEMs in it, mostly with addresses that so far didn't appear in the
IL otherwise.  Later on, we see an instruction storing into a MEM destination
and invalidate that MEM.  Unfortunately, there is a bug caused by the
introduction of SP_DERIVED_VALUE_P where alias.cc isn't able to disambiguate
MEMs with sp + optional offset in the address vs. MEMs with the address being a
VALUE having SP_DERIVED_VALUE_P + constant (or the SP_DERIVED_VALUE_P
itself), which ought to be possible when REG_VALUES (REGNO (stack_pointer_rtx))
has a SP_DERIVED_VALUE_P + constant location.  Not sure
if I should try to fix that in stage4 or defer for stage1.
Anyway, the cselib_invalidate_mem call because of this invalidates basically
all MEMs, with the exception of 5 which have MEM_EXPRs that guarantee
non-aliasing with the sp based store.

Unfortunately, n_useless_values, which in my understanding should always be
the same between -g and -g0 compilations, diverges: it has 3 more useless
values for -g.

Now, these were initially VALUEs created for DEBUG_INSN lookups.
As I said, cselib.cc has code to promote such VALUEs (well, their location elements)
to non-debug if they are looked up from non-DEBUG_INSNs.  The problem is that
when looking up some completely unrelated MEM from a non-DEBUG_INSN we run into
a hash collision and so call cselib_hasher::equal to check if the unrelated
MEM is equal to the one from the DEBUG_INSN only element.
The equal static member function calls rtx_equal_for_cselib_1 and if that
returns true, promotes the location to non-DEBUG, otherwise returns false.
So far so good.  But rtx_equal_for_cselib_1 internally performs various
other cselib lookups, all done with the non-DEBUG_INSN cselib_current_insn, so
they all promote to non-debug.  And that is wrong, because if it was a -g0
compilation, such a hashtable entry wouldn't be there at all (or would be
but wouldn't contain that locs element), so with -g0 we wouldn't call
that rtx_equal_for_cselib_1 at all.  So, I think we need to pretend that such
a lookup, which only happens with -g and not -g0, actually comes from
some DEBUG_INSN (note, the lookups rtx_equal_for_cselib_1 does are always
with create = 0).  The cselib.cc part of the patch does that.

BTW, I'm not really sure how the num_mems cap in:
          if (num_mems < param_max_cselib_memory_locations
              && ! canon_anti_dependence (x, false, mem_rtx,
                                          GET_MODE (mem_rtx), mem_addr))
            {
              has_mem = true;
              num_mems++;
              p = &(*p)->next;
              continue;
            }
can actually work correctly for -fcompare-debug; I'd think
we would need to differentiate between num_debug_mems and num_mems
depending on whether setting_insn is a non-NULL DEBUG_INSN or not.  That
was one of my suspicions on this testcase, but the number of MEMs was
small enough for the param in either case (especially because of the
above-mentioned missed non-aliasings).  But as implemented, I think if
we have tons of non-aliased MEMs from DEBUG_INSN setting_insns, we could
unchain lots more non-DEBUG MEMs with -g than with -g0.

2023-02-02  Jakub Jelinek

        PR debug/106746
        PR rtl-optimization/108463
        PR target/108484
        * cselib.cc (cselib_current_insn): Move declaration earlier.
        (cselib_hasher::equal): For debug only locs, temporarily override
        cselib_current_insn to their l->setting_insn for the
        rtx_equal_for_cselib_1 call, so that unsuccessful comparisons don't
        promote some debug locs.
        * sched-deps.cc (sched_analyze_2): For MEMs in DEBUG_INSNs
        when using cselib call cselib_lookup_from_insn on the address but
        don't substitute it.

        * gcc.dg/pr108463.c: New test.
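
The cselib_hasher::equal change described in the ChangeLog above can be illustrated with a toy save/override/restore model. This is a sketch only, not the committed code: `Insn`, `Loc`, `current_insn`, `nested_lookup`, `compare_loc` and `promoted_by_failed_compare` are hypothetical stand-ins invented here, and "promotion" is reduced to clearing a flag. It shows why attributing the nested lookups to the candidate's DEBUG_INSN keeps a failed comparison from promoting unrelated debug-only locations.

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins for illustration only; these are not GCC's real types.
struct Insn { bool is_debug; };

struct Loc
{
  const Insn *setting_insn;  // insn whose lookup created this location
  bool debug_only;           // cleared once a non-debug lookup uses it
};

// Models cselib_current_insn: the insn on whose behalf lookups run.
const Insn *current_insn = nullptr;

// Models a nested lookup made during a comparison: using a debug-only
// location on behalf of a non-debug insn promotes it.
void nested_lookup (Loc &l)
{
  if (l.debug_only && current_insn && !current_insn->is_debug)
    l.debug_only = false;
}

// Compare against CANDIDATE, touching the NESTED locations on the way.
// With OVERRIDE_INSN set and a debug-only candidate, the nested lookups
// are attributed to the candidate's setting insn instead.
bool compare_loc (Loc &candidate, std::vector<Loc *> &nested,
                  bool match, bool override_insn)
{
  const Insn *saved = current_insn;
  if (override_insn && candidate.debug_only)
    current_insn = candidate.setting_insn;
  for (Loc *l : nested)
    nested_lookup (*l);
  current_insn = saved;           // always restore before returning
  if (match)
    nested_lookup (candidate);    // a successful match still promotes
  return match;
}

// Count nested locations promoted by one unsuccessful comparison that
// was triggered from a non-debug insn.
int promoted_by_failed_compare (bool override_insn)
{
  Insn debug_insn = { true }, real_insn = { false };
  Loc candidate = { &debug_insn, true };
  Loc a = { &debug_insn, true }, b = { &debug_insn, true };
  std::vector<Loc *> nested = { &a, &b };
  current_insn = &real_insn;
  compare_loc (candidate, nested, /*match=*/false, override_insn);
  current_insn = nullptr;
  return (a.debug_only ? 0 : 1) + (b.debug_only ? 0 : 1);
}
```

Calling `promoted_by_failed_compare (false)` models the old behaviour, where both nested locations get promoted by a comparison that would never have happened with -g0; `promoted_by_failed_compare (true)` models the override, where neither does.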