From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <richard.sandiford@arm.com>
Received: from foss.arm.com (foss.arm.com [217.140.110.172])
 by sourceware.org (Postfix) with ESMTP id C040F38618BB
 for <gcc-patches@gcc.gnu.org>; Tue, 11 May 2021 10:59:26 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org C040F38618BB
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14])
 by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 57D88169E;
 Tue, 11 May 2021 03:59:26 -0700 (PDT)
Received: from localhost (e121540-lin.manchester.arm.com [10.32.98.126])
 by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 6EF5C3F719;
 Tue, 11 May 2021 03:59:25 -0700 (PDT)
From: Richard Sandiford <richard.sandiford@arm.com>
To: Jakub Jelinek <jakub@redhat.com>
Mail-Followup-To: Jakub Jelinek <jakub@redhat.com>,
 Hongtao Liu <crazylht@gmail.com>, Eric Botcazou <ebotcazou@libertysurf.fr>,
 Jeff Law <jeffreyalaw@gmail.com>, gcc-patches@gcc.gnu.org,
 richard.sandiford@arm.com
Cc: Hongtao Liu <crazylht@gmail.com>, Eric Botcazou <ebotcazou@libertysurf.fr>,
 Jeff Law <jeffreyalaw@gmail.com>, gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] regcprop: Fix another cprop_hardreg bug [PR100342]
References: <CAMZc-by1DCvkq-TwSssx6LDUKbTrBNC6VPCU-amzwQ5GPzousw@mail.gmail.com>
 <mpta6t67dkj.fsf@arm.com>
 <CAMZc-byONEr3Kpj7ThTERK6aVKyg8_uPwQ0zigYGpA_nGgVkNg@mail.gmail.com>
 <mpt35yy7b74.fsf@arm.com>
 <CAMZc-bwJwbnWafSQsWXoSR7-0PysyfXNTTGqajpUmgoREgvVUQ@mail.gmail.com>
 <mpteeih3xuw.fsf@arm.com> <20210119144514.GA4020736@tucnak>
 <mpt35yw3o1y.fsf@arm.com> <20210505174446.GU1179226@tucnak>
Date: Tue, 11 May 2021 11:59:24 +0100
In-Reply-To: <20210505174446.GU1179226@tucnak> (Jakub Jelinek's message of
 "Wed, 5 May 2021 19:44:46 +0200")
Message-ID: <mptcztx359v.fsf@arm.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-Spam-Status: No, score=-6.6 required=5.0 tests=BAYES_00, KAM_DMARC_STATUS,
 SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Tue, 11 May 2021 10:59:29 -0000

Jakub Jelinek <jakub@redhat.com> writes:
> On Tue, Jan 19, 2021 at 04:10:33PM +0000, Richard Sandiford via Gcc-patch=
es wrote:
>> Ah, ok, thanks for the extra context.
>>=20
>> So AIUI the problem when recording xmm2<-di isn't just:
>>=20
>>  [A] partial_subreg_p (vd->e[sr].mode, GET_MODE (src))
>>=20
>> but also that:
>>=20
>>  [B] partial_subreg_p (vd->e[sr].mode, vd->e[vd->e[sr].oldest_regno].mod=
e)
>>=20
>> For example, all registers in this sequence can be part of the same chai=
n:
>>=20
>>     (set (reg:HI R1) (reg:HI R0))
>>     (set (reg:SI R2) (reg:SI R1)) // [A]
>>     (set (reg:DI R3) (reg:DI R2)) // [A]
>>     (set (reg:SI R4) (reg:SI R[0-3]))
>>     (set (reg:HI R5) (reg:HI R[0-4]))
>>=20
>> But:
>>=20
>>     (set (reg:SI R1) (reg:SI R0))
>>     (set (reg:HI R2) (reg:HI R1))
>>     (set (reg:SI R3) (reg:SI R2)) // [A] && [B]
>>=20
>> is problematic because it dips below the precision of the oldest regno
>> and then increases again.
>>=20
>> When this happens, I guess we have two choices:
>>=20
>> (1) what the patch does: treat R3 as the start of a new chain.
>> (2) pretend that the copy occured in vd->e[sr].mode instead
>>     (i.e. copy vd->e[sr].mode to vd->e[dr].mode)
>>=20
>> I guess (2) would need to be subject to REG_CAN_CHANGE_MODE_P.
>> Maybe the optimisation provided by (2) compared to (1) isn't common
>> enough to be worth the complication.
>>=20
>> I think we should test [B] as well as [A] though.  The pass is set
>> up to do some quite elaborate mode changes and I think rejecting
>> [A] on its own would make some of the other code redundant.
>> It also feels like it should be a seperate =E2=80=9Cif=E2=80=9D or =E2=
=80=9Celse if=E2=80=9D,
>> with its own comment.
>
> Unfortunately, we now have a testcase that shows that testing also [B]
> is a problem (unfortunately now latent on the trunk, only reproduces
> on 10 and 11 branches).

This whole area feels way more complicated than it ought to be :-/

> The comment in the patch tries to list just the interesting instructions,
> we have a 64-bit value, copy low 8 bit of those to another register,
> copy full 64 bits to another register and then clobber the original regis=
ter.
> Before that (set (reg:DI r14) (const_int ...)) we have a chain
> DI r14, QI si, DI bp , that instruction drops the DI r14 from that chain,=
 so
> we have QI si, DI bp , si being the oldest_regno.
> Next DI si is copied into DI dx.  Only the low 8 bits of that are defined,
> the rest is unspecified, but we would add DI dx into that same chain at t=
he
> end, so QI si, DI bp, DI dx [*].  Next si is overwritten, so the chain is
> DI bp, DI dx.  And then we see (set (reg:DI dx) (reg:DI bp)) and remove it
> as redundant, because we think bp and dx are already equivalent, when in
> reality that is true only for the lowpart 8 bits.
> I believe the [*] marked step above is where the bug is.
>
> The committed regcprop.c (copy_value) change (but only committed to
> trunk/11, not to 10) added
>   else if (partial_subreg_p (vd->e[sr].mode, GET_MODE (src))
>            && partial_subreg_p (vd->e[sr].mode,
>                                 vd->e[vd->e[sr].oldest_regno].mode))
>     return;
> and while the first partial_subreg_p call returns true, the second one
> doesn't; before the (set (reg:DI r14) (const_int ...)) insn it would be
> true and we'd return, but as that reg got clobbered, si became the oldest
> regno in the chain and so vd->e[vd->e[sr].oldest_regno].mode is QImode
> and vd->e[sr].mode is QImode too, so the second partial_subreg_p is false.
> But as the testcase shows, what is the oldest_regno in the chain is
> something that changes over time, so relying on it for anything is
> problematic, something could have a different oldest_regno and later
> on get a different oldest_regno (perhaps with different mode) because
> the oldest_regno got overwritten and it can change both ways.
>
> I wrote the following patch (originally against 10 branch because that is
> where Uros has been debugging it) and bootstrapped/regtested it on 11
> branch successfully.
> It effectively implements your (2) above; I'm not sure if
> REG_CAN_CHANGE_MODE_P is needed there, because it is already tested in
> find_oldest_value_reg -> maybe_mode_change -> mode_change_ok.

The REG_CAN_CHANGE_MODE_P test would in this case be for
vd->e[dr].mode =E2=86=92 vd->e[sr].mode, rather than oldest_regno's mode.
I'm just worried that:

   (set (reg:HI R1) (reg:HI R0))
   (set (reg:SI R2) (reg:SI R1))

isn't equivalent to:

   (set (reg:HI R1) (reg:HI R0))
   (set (reg:HI R2) (reg:HI R1))

if REG_CAN_CHANGE_MODE_P is false for either the R2 or R1 change.
If we pretend that it is when building the chain then there's a
risk of GIGO when using it in find_oldest_value_reg.

(Although in this case SI and HI are both valid for R1,
REG_CAN_CHANGE_MODE_P might still be false if the HI bits are
not in the low 16 bits of the SI.  That's unlikely in this case,
but a similar thing can happen for vector modes or multi-register modes.)

I'm not saying the patch is wrong.  I just wanted to clarify
why I thought the check might be needed.

Thanks,
Richard