From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 1A64D3858D20 for ; Mon, 14 Feb 2022 16:35:22 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 1A64D3858D20 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id B09D313D5; Mon, 14 Feb 2022 08:35:21 -0800 (PST) Received: from localhost (unknown [10.32.98.88]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 263543F70D; Mon, 14 Feb 2022 08:35:21 -0800 (PST) From: Richard Sandiford To: Iain Sandoe Mail-Followup-To: Iain Sandoe , GCC Patches , Vmakarov@redhat.com, richard.sandiford@arm.com Cc: GCC Patches , Vmakarov@redhat.com Subject: Re: [pushed] LRA, rs6000, Darwin: Amend lo_sum use for forced constants [PR104117]. References: <20220211235904.25765-1-iain@sandoe.co.uk> Date: Mon, 14 Feb 2022 16:35:19 +0000 In-Reply-To: (Iain Sandoe's message of "Mon, 14 Feb 2022 16:21:01 +0000") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-12.4 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_STATUS, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 14 Feb 2022 16:35:24 -0000 Iain Sandoe writes: > Hi Richard, > > (hopefully, my take won=E2=80=99t cloud the issue =E2=80=A6.) > >> On 14 Feb 2022, at 16:00, Richard Sandiford via Gcc-patches wrote: >>=20 >> Hi Vlad, >>=20 >> Vladimir Makarov via Gcc-patches writes: >>> On 2022-02-14 04:44, Richard Sandiford via Gcc-patches wrote: >>>> Iain Sandoe via Gcc-patches writes: >>>>> Two issues resulted in this PR, which manifests when we force a const= ant into >>>>> memory in LRA (in PIC code on Darwin). The presence of such forced c= onstants >>>>> is quite dependent on other RTL optimisations, and it is easy for the= issue to >>>>> become latent for a specific case. >>>>>=20 >>>>> First, in the Darwin-specific rs6000 backend code, we were not being = careful >>>>> enough in rejecting invalid symbolic addresses. Specifically, when g= enerating >>>>> PIC code, we require a SYMBOL_REF to be wrapped in an UNSPEC_MACHOPIC= _OFFSET. >>>>>=20 >>>>> Second, LRA was attempting to load a register using an invalid lo_sum= address. >>>>>=20 >>>>> The LRA changes are approved in the PR by Vladimir, and the RS6000 ch= anges are >>>>> Darwin-specific (although, of course, any observations are welcome). >>>>>=20 >>>>> Tested on several lo_sum targets and x86_64 all languages except as n= oted: >>>>> powerpc64-linux (m32/m64) -D >>>>> powerpc64le-linux -D >>>>> powerpc64-aix -Ada -Go -D >>>>> aarch64-linux -Ada -D >>>>> x86_64-linux all langs -D >>>>> powerpc-darwin9 (master and 11.2) -D -Go. >>>>>=20 >>>>> pushed to master, thanks, >>>>> Iain >>>>>=20 >>>>> Signed-off-by: Iain Sandoe >>>>> Co-authored-by: Vladimir Makarov >>>>>=20 >>>>> PR target/104117 >>>>>=20 >>>>> gcc/ChangeLog: >>>>>=20 >>>>> * config/rs6000/rs6000.cc (darwin_rs6000_legitimate_lo_sum_const_p): >>>>> Check for UNSPEC_MACHOPIC_OFFSET wrappers on symbolic addresses when >>>>> emitting PIC code. >>>>> (legitimate_lo_sum_address_p): Likewise. >>>>> * lra-constraints.cc (process_address_1): Do not attempt to emit a r= eg >>>>> load from an invalid lo_sum address. >>>>> --- >>>>> gcc/config/rs6000/rs6000.cc | 38 +++++++++++++++++++++++++++++++++++= -- >>>>> gcc/lra-constraints.cc | 17 ++--------------- >>>>> 2 files changed, 38 insertions(+), 17 deletions(-) >>>>>=20 >>>>> [=E2=80=A6] >>>>> diff --git a/gcc/lra-constraints.cc b/gcc/lra-constraints.cc >>>>> index fdff9e0720a..c700c3f4578 100644 >>>>> --- a/gcc/lra-constraints.cc >>>>> +++ b/gcc/lra-constraints.cc >>>>> @@ -3625,21 +3625,8 @@ process_address_1 (int nop, bool check_only_p, >>>>> *ad.inner =3D gen_rtx_LO_SUM (Pmode, new_reg, addr); >>>>> if (!valid_address_p (op, &ad, cn)) >>>>> { >>>>> - /* Try to put lo_sum into register. */ >>>>> - insn =3D emit_insn (gen_rtx_SET >>>>> - (new_reg, >>>>> - gen_rtx_LO_SUM (Pmode, new_reg, addr))); >>>>> - code =3D recog_memoized (insn); >>>>> - if (code >=3D 0) >>>>> - { >>>>> - *ad.inner =3D new_reg; >>>>> - if (!valid_address_p (op, &ad, cn)) >>>>> - { >>>>> - *ad.inner =3D addr; >>>>> - code =3D -1; >>>>> - } >>>>> - } >>>>> - >>>>> + *ad.inner =3D addr; /* Punt. */ >>>>> + code =3D -1; >>>>> } >>>>> } >>>>> if (code < 0) >>>> Could you go into more details about this? Why is it OK to continue >>>> to try: >>>>=20 >>>> (lo_sum new_reg addr) >>>>=20 >>>> directly as an address (the context at the top of the hunk), but not t= ry >>>> moving the lo_sum into a register? They should be semantically equiva= lent, >>>> so it seems that if one is wrong, the other would be too. >>>>=20 >>> Hi, Richard. Change LRA is mine and I approved it for Iain's patch. >>>=20 >>> I think there is no need for this code and it is misleading. If=20 >>> 'mem[low_sum]' does not work, I don't think that 'reg=3Dlow_sum;mem[reg= ]'=20 >>> will help for any existing target. As machine-dependent code for any=20 >>> target most probably (for ppc64 darwin it is exactly the case) checks=20 >>> address only in memory, it can wrongly accept wrong address by reloadin= g=20 >>> it into reg and use it in memory. So these are my arguments for the=20 >>> remove this code from process_address_1. >>=20 >> I'm probably making too much of this, but: >>=20 >> I think the code is potentially useful in that existing targets do forbid >> forbid lo_sum addresses in certain contexts (due to limited offset range) >> while still wanting lo_sum to be used to be load the address. If we >> handle the high/lo_sum split in generic code then we have more chance >> of being able to optimise things. So it feels like this is setting an >> unfortunate precedent. >>=20 >> I still don't understand what went wrong before though (the PR trail >> was a bit too long to process :-)). Is there a case where >> (lo_sum (high X) X) !=3D X? If so, that seems like a target bug to me. > > If X is an invalid address (in this case for PIC code a SYMBOL_REF is not > valid without an UNSPEC wrapper) - then (high X) and (lo_sum (high X) X)= =20 > are also invalid. AFAICT the target never (this late in the process) gets > the opportunity to apply a transform to validate the address. > >> Or does the target accept (set R1 (lo_sum R2 X)) for an X that cannot >> be split into a HIGH/LO_SUM pair? I'd argue that's a target bug too. > > once again if X is plain invalid - loading it into a register or splittin= g it > does not solve that - we=E2=80=99d need to call the target=E2=80=99s addr= ess legitimizer. > > Of course, if we conclude that it is a target bug, I=E2=80=99ll try to fi= x it up . But if the target treats: (set R2 (high X)) (set R1 (lo_sum R2 X)) as an invalid way of setting R1 to X (e.g. because X is plain invalid), then recog should fail on at least one instruction, preferably both. So yeah, this does feel like a target bug to me. We shouldn't assume that lo_sum/high splits will only be generated by the target. Target-independent code can generate them =E2=80=9Cspontaneous= ly=E2=80=9D, provided that it validates the result. (Combine does this too, for example.) It looks like the LRA code might have been missing a HIGH in the fallback case, but it doesn't sound like that was the bug that was being fixed. Thanks, Richard