From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <richard.sandiford@arm.com>
Received: from foss.arm.com (foss.arm.com [217.140.110.172])
 by sourceware.org (Postfix) with ESMTP id 1A64D3858D20
 for <gcc-patches@gcc.gnu.org>; Mon, 14 Feb 2022 16:35:22 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 1A64D3858D20
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14])
 by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id B09D313D5;
 Mon, 14 Feb 2022 08:35:21 -0800 (PST)
Received: from localhost (unknown [10.32.98.88])
 by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 263543F70D;
 Mon, 14 Feb 2022 08:35:21 -0800 (PST)
From: Richard Sandiford <richard.sandiford@arm.com>
To: Iain Sandoe <iain@sandoe.co.uk>
Mail-Followup-To: Iain Sandoe <iain@sandoe.co.uk>,
 GCC Patches <gcc-patches@gcc.gnu.org>, Vmakarov@redhat.com,
 richard.sandiford@arm.com
Cc: GCC Patches <gcc-patches@gcc.gnu.org>,  Vmakarov@redhat.com
Subject: Re: [pushed] LRA, rs6000,
 Darwin: Amend lo_sum use for forced constants [PR104117].
References: <20220211235904.25765-1-iain@sandoe.co.uk>
 <mptv8xhye7g.fsf@arm.com>
 <dc92ba2a-e5b3-bc16-ee1a-cfc0860dfbf4@redhat.com>
 <mpt7d9xwi86.fsf@arm.com>
 <BDE7613E-4570-48A9-802A-9D99BD948872@sandoe.co.uk>
Date: Mon, 14 Feb 2022 16:35:19 +0000
In-Reply-To: <BDE7613E-4570-48A9-802A-9D99BD948872@sandoe.co.uk> (Iain
 Sandoe's message of "Mon, 14 Feb 2022 16:21:01 +0000")
Message-ID: <mptfsolv22g.fsf@arm.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-Spam-Status: No, score=-12.4 required=5.0 tests=BAYES_00, GIT_PATCH_0,
 KAM_DMARC_STATUS, SPF_HELO_NONE, SPF_PASS, TXREP,
 T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.4
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Mon, 14 Feb 2022 16:35:24 -0000

Iain Sandoe <iain@sandoe.co.uk> writes:
> Hi Richard,
>
> (hopefully, my take won=E2=80=99t cloud the issue =E2=80=A6.)
>
>> On 14 Feb 2022, at 16:00, Richard Sandiford via Gcc-patches <gcc-patches=
@gcc.gnu.org> wrote:
>>=20
>> Hi Vlad,
>>=20
>> Vladimir Makarov via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
>>> On 2022-02-14 04:44, Richard Sandiford via Gcc-patches wrote:
>>>> Iain Sandoe via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
>>>>> Two issues resulted in this PR, which manifests when we force a const=
ant into
>>>>> memory in LRA (in PIC code on Darwin).  The presence of such forced c=
onstants
>>>>> is quite dependent on other RTL optimisations, and it is easy for the=
 issue to
>>>>> become latent for a specific case.
>>>>>=20
>>>>> First, in the Darwin-specific rs6000 backend code, we were not being =
careful
>>>>> enough in rejecting invalid symbolic addresses.  Specifically, when g=
enerating
>>>>> PIC code, we require a SYMBOL_REF to be wrapped in an UNSPEC_MACHOPIC=
_OFFSET.
>>>>>=20
>>>>> Second, LRA was attempting to load a register using an invalid lo_sum=
 address.
>>>>>=20
>>>>> The LRA changes are approved in the PR by Vladimir, and the RS6000 ch=
anges are
>>>>> Darwin-specific (although, of course, any observations are welcome).
>>>>>=20
>>>>> Tested on several lo_sum targets and x86_64 all languages except as n=
oted:
>>>>> powerpc64-linux (m32/m64) -D
>>>>> powerpc64le-linux  -D
>>>>> powerpc64-aix -Ada -Go -D
>>>>> aarch64-linux -Ada -D
>>>>> x86_64-linux all langs -D
>>>>> powerpc-darwin9 (master and 11.2) -D -Go.
>>>>>=20
>>>>> pushed to master, thanks,
>>>>> Iain
>>>>>=20
>>>>> Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
>>>>> Co-authored-by: Vladimir Makarov <vmakarov@redhat.com>
>>>>>=20
>>>>> 	PR target/104117
>>>>>=20
>>>>> gcc/ChangeLog:
>>>>>=20
>>>>> 	* config/rs6000/rs6000.cc (darwin_rs6000_legitimate_lo_sum_const_p):
>>>>> 	Check for UNSPEC_MACHOPIC_OFFSET wrappers on symbolic addresses when
>>>>> 	emitting PIC code.
>>>>> 	(legitimate_lo_sum_address_p): Likewise.
>>>>> 	* lra-constraints.cc (process_address_1): Do not attempt to emit a r=
eg
>>>>> 	load from an invalid lo_sum address.
>>>>> ---
>>>>>  gcc/config/rs6000/rs6000.cc | 38 +++++++++++++++++++++++++++++++++++=
--
>>>>>  gcc/lra-constraints.cc      | 17 ++---------------
>>>>>  2 files changed, 38 insertions(+), 17 deletions(-)
>>>>>=20
>>>>> [=E2=80=A6]
>>>>> diff --git a/gcc/lra-constraints.cc b/gcc/lra-constraints.cc
>>>>> index fdff9e0720a..c700c3f4578 100644
>>>>> --- a/gcc/lra-constraints.cc
>>>>> +++ b/gcc/lra-constraints.cc
>>>>> @@ -3625,21 +3625,8 @@ process_address_1 (int nop, bool check_only_p,
>>>>>  		  *ad.inner =3D gen_rtx_LO_SUM (Pmode, new_reg, addr);
>>>>>  		  if (!valid_address_p (op, &ad, cn))
>>>>>  		    {
>>>>> -		      /* Try to put lo_sum into register.  */
>>>>> -		      insn =3D emit_insn (gen_rtx_SET
>>>>> -					(new_reg,
>>>>> -					 gen_rtx_LO_SUM (Pmode, new_reg, addr)));
>>>>> -		      code =3D recog_memoized (insn);
>>>>> -		      if (code >=3D 0)
>>>>> -			{
>>>>> -			  *ad.inner =3D new_reg;
>>>>> -			  if (!valid_address_p (op, &ad, cn))
>>>>> -			    {
>>>>> -			      *ad.inner =3D addr;
>>>>> -			      code =3D -1;
>>>>> -			    }
>>>>> -			}
>>>>> -
>>>>> +		      *ad.inner =3D addr; /* Punt.  */
>>>>> +		      code =3D -1;
>>>>>  		    }
>>>>>  		}
>>>>>  	      if (code < 0)
>>>> Could you go into more details about this?  Why is it OK to continue
>>>> to try:
>>>>=20
>>>>   (lo_sum new_reg addr)
>>>>=20
>>>> directly as an address (the context at the top of the hunk), but not t=
ry
>>>> moving the lo_sum into a register?  They should be semantically equiva=
lent,
>>>> so it seems that if one is wrong, the other would be too.
>>>>=20
>>> Hi, Richard.  Change LRA is mine and I approved it for Iain's patch.
>>>=20
>>> I think there is no need for this code and it is misleading.  If=20
>>> 'mem[low_sum]' does not work, I don't think that 'reg=3Dlow_sum;mem[reg=
]'=20
>>> will help for any existing target.  As machine-dependent code for any=20
>>> target most probably (for ppc64 darwin it is exactly the case) checks=20
>>> address only in memory, it can wrongly accept wrong address by reloadin=
g=20
>>> it into reg and use it in memory. So these are my arguments for the=20
>>> remove this code from process_address_1.
>>=20
>> I'm probably making too much of this, but:
>>=20
>> I think the code is potentially useful in that existing targets do forbid
>> forbid lo_sum addresses in certain contexts (due to limited offset range)
>> while still wanting lo_sum to be used to be load the address.  If we
>> handle the high/lo_sum split in generic code then we have more chance
>> of being able to optimise things.  So it feels like this is setting an
>> unfortunate precedent.
>>=20
>> I still don't understand what went wrong before though (the PR trail
>> was a bit too long to process :-)).  Is there a case where
>> (lo_sum (high X) X) !=3D X?  If so, that seems like a target bug to me.
>
> If X is an invalid address (in this case for PIC code a SYMBOL_REF is not
> valid without an UNSPEC wrapper) - then (high X) and (lo_sum (high X) X)=
=20
> are also invalid.  AFAICT the target never (this late in the process) gets
> the opportunity to apply a transform to validate the address.
>
>> Or does the target accept (set R1 (lo_sum R2 X)) for an X that cannot
>> be split into a HIGH/LO_SUM pair?  I'd argue that's a target bug too.
>
> once again if X is plain invalid - loading it into a register or splittin=
g it
> does not solve that - we=E2=80=99d need to call the target=E2=80=99s addr=
ess legitimizer.
>
> Of course, if we conclude that it is a target bug, I=E2=80=99ll try to fi=
x it up .

But if the target treats:

   (set R2 (high X))
   (set R1 (lo_sum R2 X))

as an invalid way of setting R1 to X (e.g. because X is plain invalid),
then recog should fail on at least one instruction, preferably both.
So yeah, this does feel like a target bug to me.

We shouldn't assume that lo_sum/high splits will only be generated by
the target.  Target-independent code can generate them =E2=80=9Cspontaneous=
ly=E2=80=9D,
provided that it validates the result.  (Combine does this too,
for example.)

It looks like the LRA code might have been missing a HIGH in the
fallback case, but it doesn't sound like that was the bug that was
being fixed.

Thanks,
Richard