From: "rguenther at suse dot de"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/113787] [12/13/14 Regression] Wrong code at -O with ipa-modref on aarch64
Date: Wed, 14 Feb 2024 08:19:31 +0000
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords: wrong-code
X-Bugzilla-Severity: normal
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 12.4

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113787

--- Comment #16 from rguenther at suse dot de ---
On Tue, 13 Feb 2024, hubicka at ucw dot cz wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113787
>
> --- Comment #15 from Jan Hubicka ---
> >
> > IVOPTs does the above but it does it (or should) as
> >
> >   offset = (uintptr)&base2 - (uintptr)&base1;
> >   val = *((T *)((uintptr)base1 + i + offset))
> >
> > which is OK for points-to as no POINTER_PLUS_EXPR is involved so the
> > resulting pointer points to both base1 and base2 (which isn't optimal
> > but correct).
> >
> > If we somehow get back a POINTER_PLUS that's where things go wrong.
> >
> > Doing the above in C code would be valid input so we have to treat
> > it correctly (OK, the standard only allows back-and-forth
> > pointer-to-integer casts w/o any adjustment, but of course we relax
> > this).
>
> OK. Modref tracks base pointer for accesses and tries to prove that
> they are function parameters.  This should imitate ivopts:
> void
> __attribute__ ((noinline))
> set(int *a, unsigned long off)
> {
>   *(int *)((unsigned long)a + off) = 1;
> }
> int
> test ()
> {
>   int a;
>   int b = 0;
>   set (&a, (unsigned long)&b - (unsigned long)&a);
>   return b;
> }
>
> Here set gets the following gimple at modref2 time:
> __attribute__((noinline))
> void set (int * a, long unsigned int off)
> {
>   long unsigned int a.0_1;
>   long unsigned int _2;
>   int * _3;
>
>   [local count: 1073741824]:
>   a.0_1 = (long unsigned int) a_4(D);
>   _2 = a.0_1 + off_5(D);
>   _3 = (int *) _2;
>   *_3 = 1;
>   return;
>
> }
>
> This is not pattern matched so modref does not think the access has a as
> a base:
>
>   stores:
>       Base 0: alias set 1
>         Ref 0: alias set 1
>           Every access
>
> While for:
>
> void
> __attribute__ ((noinline))
> set(int *a, unsigned long off)
> {
>   *(a+off/sizeof(int))=1;
> }
>
> we produce:
>
> __attribute__((noinline))
> void set (int * a, long unsigned int off)
> {
>   sizetype _1;
>   int * _2;
>
>   [local count: 1073741824]:
>   _1 = off_3(D) & 18446744073709551612;
>   _2 = a_4(D) + _1;
>   *_2 = 1;
>   return;
> }
>
> And this is understood:
>
>   stores:
>       Base 0: alias set 1
>         Ref 0: alias set 1
>           access: Parm 0
>
> If we consider it correct to optimize out the conversion from and to
> pointer type, then I suppose any addition of pointer and integer which
> we do not see means that we need to give up on tracking base completely.
>
> I guess PTA gets around by tracking points-to set also for non-pointer
> types and consequently it also gives up on any such addition.
>
> But what we really get from relaxing this?
> >
> > IVOPTs then in putting all of the stuff into 'offset' gets at
> > trying a TARGET_MEM_REF based on a NULL base but that's invalid.
> > We then resort to a LEA (ADDR_EXPR of TARGET_MEM_REF) to compute
> > the address which gets us into some fishy argument that it's
> > not valid to decompose ADDR_EXPR of TARGET_MEM_REF to
> > POINTER_PLUS of the TARGET_MEM_REF base and the offset.  But
> > that's how it is (points-to treats (address of) TARGET_MEM_REF
> > as pointing to anything ...).
> >
> > > A quick fix would be to run IPA modref before ivopts, but I do not see how such
> > > transformation can work with rest of alias analysis (PTA etc)
> >
> > It does.  Somewhere IPA modref interprets things wrongly, I didn't figure
> > out here though.
>
>
> I guess PTA gets around by tracking points-to set also for non-pointer
> types and consequently it also gives up on any such addition.

It does.  But note it does _not_ for POINTER_PLUS where it treats
the offset operand as non-pointer.

> I think it is ipa-prop.c::unadjusted_ptr_and_unit_offset. It accepts
> pointer_plus expression, but does not look through POINTER_PLUS.
> We can restrict it further, but tracking base pointer is quite useful,
> so it would be nice to not give up completely.

It looks like that function might treat that
ADDR_EXPR <TARGET_MEM_REF <...>> as integer_zerop base.  It does

      if (TREE_CODE (op) == ADDR_EXPR)
        {
          poly_int64 extra_offset = 0;
          tree base = get_addr_base_and_unit_offset (TREE_OPERAND (op, 0),
                                                     &offset);
          if (!base)
            {
              base = get_base_address (TREE_OPERAND (op, 0));
              if (TREE_CODE (base) != MEM_REF)
                break;
              offset_known = false;
            }
          else
            {
              if (TREE_CODE (base) != MEM_REF)
                break;

with a variable offset we fall to the TREE_CODE (base) != MEM_REF
and will have offset_known == true.
Not sure what it does with the result though (it's not the address of
a decl).

This function seems to oddly special-case != MEM_REF ... (maybe it wants
to handle DECL_P () as finishing?)

Note get_addr_base_and_unit_offset will return NULL for
a TARGET_MEM_REF <&decl, ..., offset> but TARGET_MEM_REF
itself if the base isn't an ADDR_EXPR, irrespective of whether
the offset within it is constant or not.

Not sure if the above is a problem, but it seems the only caller
will just call points_to_local_or_readonly_memory_p on the
ADDR_EXPR where refs_local_or_readonly_memory_p via
points_to_local_or_readonly_memory_p will eventually do

  /* See if memory location is clearly invalid.  */
  if (integer_zerop (t))
    return flag_delete_null_pointer_checks;

and that might be a problem.  As said, we rely on
ADDR_EXPR <TARGET_MEM_REF <...>> to be an address computation
that's not subject to strict interpretation to allow IVOPTs
doing this kind of optimization w/o introducing some kind of
INTEGER_LEA <...>.  I know that's a bit awkward but we should
make sure this is honored by IPA as well.

I'd say

diff --git a/gcc/ipa-fnsummary.cc b/gcc/ipa-fnsummary.cc
index 74c9b4e1d1e..45a770cf940 100644
--- a/gcc/ipa-fnsummary.cc
+++ b/gcc/ipa-fnsummary.cc
@@ -2642,7 +2642,8 @@ points_to_local_or_readonly_memory_p (tree t)
        return true;
       return !ptr_deref_may_alias_global_p (t, false);
     }
-  if (TREE_CODE (t) == ADDR_EXPR)
+  if (TREE_CODE (t) == ADDR_EXPR
+      && TREE_CODE (TREE_OPERAND (t, 0)) != TARGET_MEM_REF)
     return refs_local_or_readonly_memory_p (TREE_OPERAND (t, 0));
   return false;
 }

might eventually work?  Alternatively a bit less aggressive like
the following.

diff --git a/gcc/ipa-fnsummary.cc b/gcc/ipa-fnsummary.cc
index 74c9b4e1d1e..7c79adf6440 100644
--- a/gcc/ipa-fnsummary.cc
+++ b/gcc/ipa-fnsummary.cc
@@ -2642,7 +2642,9 @@ points_to_local_or_readonly_memory_p (tree t)
        return true;
       return !ptr_deref_may_alias_global_p (t, false);
     }
-  if (TREE_CODE (t) == ADDR_EXPR)
+  if (TREE_CODE (t) == ADDR_EXPR
+      && (TREE_CODE (TREE_OPERAND (t, 0)) != TARGET_MEM_REF
+         || TREE_CODE (TREE_OPERAND (TREE_OPERAND (t, 0), 0)) != INTEGER_CST))
     return refs_local_or_readonly_memory_p (TREE_OPERAND (t, 0));
   return false;
 }

A "nicer" solution might be to add an informational operand
to TARGET_MEM_REF, representing the base pointer to be used
for alias/points-to purposes.  But if that's not invariant
it might keep some otherwise unnecessary definition stmts
live.
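
For anyone wanting to try the integer-arithmetic form quoted from
comment #15 as a standalone file, a minimal sketch follows.  It is
not part of either patch.  The names set_via_integer, roundtrip and
check_roundtrip are made up for illustration, uintptr_t is used
instead of unsigned long, and 'a' is initialized only to keep
warnings quiet.  It relies on the relaxed treatment of
pointer/integer round trips discussed above, which ISO C itself
does not guarantee.

```c
#include <assert.h>
#include <stdint.h>

/* Store through an address computed purely in integer arithmetic,
   so no POINTER_PLUS_EXPR ties the access to the parameter 'a'.  */
__attribute__ ((noinline)) void
set_via_integer (int *a, uintptr_t off)
{
  *(int *) ((uintptr_t) a + off) = 1;
}

/* Pass &a together with the integer distance from a to b.  The
   callee's store lands in b, so the caller must reload b.  */
int
roundtrip (void)
{
  int a = 0;
  int b = 0;
  set_via_integer (&a, (uintptr_t) &b - (uintptr_t) &a);
  return b;
}

/* Hypothetical self-check: aborts if the store to b was lost.  */
void
check_roundtrip (void)
{
  assert (roundtrip () == 1);
}
```

Going by the modref dump quoted above ("Every access"), this
hand-written form is handled conservatively, so roundtrip () is
expected to return 1.  Calling check_roundtrip () from main at -O
would be one way to confirm that on a given target.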