From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <richard.sandiford@arm.com>
Received: from foss.arm.com (foss.arm.com [217.140.110.172])
 by sourceware.org (Postfix) with ESMTP id 8C089396BC04
 for <gcc-patches@gcc.gnu.org>; Fri,  6 Aug 2021 12:14:22 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 8C089396BC04
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14])
 by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id D8E3A11FB;
 Fri,  6 Aug 2021 05:14:21 -0700 (PDT)
Received: from localhost (e121540-lin.manchester.arm.com [10.32.98.126])
 by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 624363F40C;
 Fri,  6 Aug 2021 05:14:21 -0700 (PDT)
From: Richard Sandiford <richard.sandiford@arm.com>
To: Robin Dapp via Gcc-patches <gcc-patches@gcc.gnu.org>
Mail-Followup-To: Robin Dapp via Gcc-patches <gcc-patches@gcc.gnu.org>,
 Robin Dapp <rdapp@linux.ibm.com>, richard.sandiford@arm.com
Subject: Re: [PATCH 4/7] ifcvt/optabs: Allow using a CC comparison for
 emit_conditional_move.
References: <20210625160905.23786-1-rdapp@linux.ibm.com>
 <20210625160905.23786-5-rdapp@linux.ibm.com> <mptv95b5mdn.fsf@arm.com>
 <06f8a450-4da1-b7e3-4b5b-e7085cfa822a@linux.ibm.com>
 <mpt1r7kna6q.fsf@arm.com>
 <284552e9-d93f-dc80-b355-e1e6788d14dd@linux.ibm.com>
Date: Fri, 06 Aug 2021 13:14:20 +0100
In-Reply-To: <284552e9-d93f-dc80-b355-e1e6788d14dd@linux.ibm.com> (Robin Dapp
 via Gcc-patches's message of "Tue, 27 Jul 2021 22:49:44 +0200")
Message-ID: <mptv94izs6b.fsf@arm.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-Spam-Status: No, score=-6.4 required=5.0 tests=BAYES_00, KAM_DMARC_STATUS,
 KAM_SHORT, SPF_HELO_NONE, SPF_PASS,
 TXREP autolearn=ham autolearn_force=no version=3.4.4
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Fri, 06 Aug 2021 12:14:24 -0000

Sorry for the slow reply.

Robin Dapp via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
>> Hmm, OK.  Doesn't expanding both versions up-front create the same kind =
of
>> problem that the patch is fixing, in that we expand (and therefore cost)
>> both the reversed and unreversed comparison?  Also=E2=80=A6
>>=20
> [..]
>>=20
>> =E2=80=A6for min/max, I would have expected this swap to create the cano=
nical
>> operand order for the min and max too.  What causes it to be rejected?
>>=20
>
> We should not be expanding two comparisons but only emit (and cost) the=20
> reversed comparison if expanding the non-reversed one failed.

The (potential) problem is that prepare_cmp_insn can itself emit
instructions.  With the current code we rewind any prepare_cmp_insn
that isn't needed, whereas with the new code we might keep both.

This also means that prepare_cmp_insn calls need to stay inside the:

  saved_pending_stack_adjust save;
  save_pending_stack_adjust (&save);
  last =3D get_last_insn ();
  do_pending_stack_adjust ();

  =E2=80=A6

  delete_insns_since (last);
  restore_pending_stack_adjust (&save);

block.

> Regarding the reversal, I checked again - the commit introducing the=20
> op2/op3 swap is g:deed3da9af697ecf073aea855ecce2d22d85ef71, the=20
> corresponding test case is gcc.target/i386/pr70465-2.c.  It inlines one=20
> long double ternary operation into another, probably causing  not for=20
> multiple sets, mind you.  The situation doesn't occur with double.

OK, so going back to that revision and using the original SciMark test
case, we first try:

  (lt (reg/v:DF 272 [ ab ])
      (reg/v:DF 271 [ t ]))
  (reg/v:SI 227 [ jp ])
  (subreg:SI (reg:DI 346 [ ivtmp.59 ]) 0)

but i386 doesn't provide a native cbranchdf4 for lt and so the
prepare_cmp_insn fails.  Interesting that we use cbranch<mode>4
as the test for what conditional moves should accept, but I guess
that isn't something to change now.

So the key piece of information that I didn't realise before is
that it was the prepare_cmp_insn that failed, not the mov<mode>cc
expander.  I think we can accomodate that in the new scheme
by doing:

  if (rev_comparison && COMPARISON_P (rev_comparison))
    prepare_cmp_insn (XEXP (rev_comparison, 0), XEXP (rev_comparison, 1),
		      GET_CODE (rev_comparison), NULL_RTX,
		      unsignedp, OPTAB_WIDEN, &rev_comparison, &cmode);

first and then making:

  if (comparison && COMPARISON_P (comparison))
    prepare_cmp_insn (XEXP (comparison, 0), XEXP (comparison, 1),
		      GET_CODE (comparison), NULL_RTX,
		      unsignedp, OPTAB_WIDEN, &comparison, &cmode);

conditional on !rev_comparison.  But maybe the above makes
that moot.

>>> +
>>> +  rtx rev_comparison =3D NULL_RTX;
>>>     bool swapped =3D false;
>>> -  if (swap_commutative_operands_p (op2, op3)
>>> -      && ((reversed =3D reversed_comparison_code_parts (code, op0, op1=
, NULL))
>>> -          !=3D UNKNOWN))
>>> +
>>> +  code =3D unsignedp ? unsigned_condition (code) : code;
>>> +  comparison =3D simplify_gen_relational (code, VOIDmode, cmode, op0, =
op1);
>>> +
>>> +  if ((reversed =3D reversed_comparison_code_parts (code, op0, op1, NU=
LL))
>>> +      !=3D UNKNOWN)
>>>       {
>>> -      std::swap (op2, op3);
>>> -      code =3D reversed;
>>> -      swapped =3D true;
>>> +      reversed =3D unsignedp ? unsigned_condition (reversed) : reverse=
d;
>>=20
>> When is this needed?  I'd have expected the reversed from of an unsigned
>> code to be naturally unsigned.
>
> This was also introduced by the commit above, probably just repeating=20
> what was done for the non-reversed comparison.

Yeah, but in the original code, the first reverse_comparison_code_parts
happens outside the loop, before the first unsigned_condition (which
happens inside the loop).  In the new code, the unsigned_condition
happens first, before we try reversing it.

IMO the new order makes more sense than the old one.  But it means that
reversed_comparison_code_parts always sees a comparison of the right
signedness, so we shouldn't need to adjust the result.

Thanks,
Richard