From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 23D5F38323C2 for ; Wed, 5 Jun 2024 09:21:53 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 23D5F38323C2 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 23D5F38323C2 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1717579315; cv=none; b=YhrUYW49Ic2Cv32DuR7ytEgZwDBXHyP7fSdMgeu1U5rt86Io3kcHv6Ibm2HQxoSVMnY/iinbA2+zA3CUy5I10/qpiDfeaQmxE/sU8XDgyOCWhM+BnmALiduZD3tU19SXET66tbSjwkO1WLYn3wo0quFqCT6kHqyL2AUwtyAcfRE= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1717579315; c=relaxed/simple; bh=ijaAAjXE8xrfyKNnkEeqCMn8o/RR5mhK+VLtDNfmCzA=; h=From:To:Subject:Date:Message-ID:MIME-Version; b=xxey0t3+wOQSf770lxzmKd4Q6F79hjBw0WIDn4Ln0xt/6vLD1JvJtvC1Kpvh8mWXM64PVdFJ5B3bftMf+mUhHj//3HlRlbuhoL6cS7HMjDptMk+oi0rZNHA2dbP8Q49zcbEdtF+p/KsaGwF8YBWJR2zgTsEC62exj//jg9U5Jr4= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 00895DA7; Wed, 5 Jun 2024 02:22:17 -0700 (PDT) Received: from localhost (e121540-lin.manchester.arm.com [10.32.110.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id EA1C33F792; Wed, 5 Jun 2024 02:21:51 -0700 (PDT) From: Richard Sandiford To: Hongyu Wang Mail-Followup-To: Hongyu Wang ,gcc-patches@gcc.gnu.org, ubizjak@gmail.com, hongtao.liu@intel.com, richard.sandiford@arm.com Cc: gcc-patches@gcc.gnu.org, ubizjak@gmail.com, hongtao.liu@intel.com Subject: Re: [PATCH 2/3] [APX CCMP] Adjust startegy for selecting ccmp candidates References: <20240515082054.3934069-1-hongyu.wang@intel.com> <20240515082054.3934069-3-hongyu.wang@intel.com> Date: Wed, 05 Jun 2024 10:21:50 +0100 In-Reply-To: (Hongyu Wang's message of "Wed, 15 May 2024 16:25:27 +0800") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-18.7 required=5.0 tests=BAYES_00,BODY_8BITS,GIT_PATCH_0,KAM_DMARC_NONE,KAM_DMARC_STATUS,KAM_LAZY_DOMAIN_SECURITY,SPF_HELO_NONE,SPF_NONE,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Hongyu Wang writes: > CC'd Richard for ccmp part as previously it is added only for aarch64. > The original logic will not interrupted since if > aarch64_gen_ccmp_first succeeded, aarch64_gen_ccmp_next will also > success, the cmp/fcmp and ccmp/fccmp supports all GPI/GPF, and the > prepare_operand will fixup the input that cmp supports but ccmp not, > so ret/ret2 will all be valid when comparing cost. > Thanks in advance. Sorry for the slow review. > Hongyu Wang =E4=BA=8E2024=E5=B9=B45=E6=9C=8815=E6= =97=A5=E5=91=A8=E4=B8=89 16:22=E5=86=99=E9=81=93=EF=BC=9A >> >> For general ccmp scenario, the tree sequence is like >> >> _1 =3D (a < b) >> _2 =3D (c < d) >> _3 =3D _1 & _2 >> >> current ccmp expanding will try to swap compare order for _1 and _2, >> compare the cost/cost2 between compare _1 and _2 first, then return the >> sequence with lower cost. >> >> For x86 ccmp, we don't support FP compare as ccmp operand, but we >> support fp comi + int ccmp sequence. With current cost comparison >> model, the fp comi + int ccmp can never be generated since it doesn't >> check whether expand_ccmp_next returns available result and the rtl >> cost for the empty ccmp sequence is always smaller. >> >> Check the expand_ccmp_next result ret and ret2, returns the valid one >> before cost comparison. >> >> gcc/ChangeLog: >> >> * ccmp.cc (expand_ccmp_expr_1): Check ret and ret2 of >> expand_ccmp_next, returns the valid one first before >> comparing cost. >> --- >> gcc/ccmp.cc | 12 +++++++++++- >> 1 file changed, 11 insertions(+), 1 deletion(-) >> >> diff --git a/gcc/ccmp.cc b/gcc/ccmp.cc >> index 7cb525addf4..4b424220068 100644 >> --- a/gcc/ccmp.cc >> +++ b/gcc/ccmp.cc >> @@ -247,7 +247,17 @@ expand_ccmp_expr_1 (gimple *g, rtx_insn **prep_seq,= rtx_insn **gen_seq) >> cost2 =3D seq_cost (prep_seq_2, speed_p); >> cost2 +=3D seq_cost (gen_seq_2, speed_p); >> } >> - if (cost2 < cost1) >> + >> + /* For x86 target the ccmp does not support fp operands, but >> + have fcomi insn that can produce eflags and then do int >> + ccmp. So if one of the op is fp compare, ret1 or ret2 can >> + fail, and the cost of the corresponding empty seq will >> + always be smaller, then the NULL sequence will be returned. >> + Add check for ret and ret2, returns the available one if >> + the other is NULL. */ I think the more fundamental point is that the cost of a failed expansion isn't meaningful. So how about: /* It's possible that one expansion succeeds and the other fails. For example, x86 has int ccmp but not fp ccmp, and so a combined fp and int comparison must be ordered such that the fp comparison happens first. The costs are not meaningful for failed expansions. */ >> + if ((!ret && ret2) >> + || (!(ret && !ret2) >> + && cost2 < cost1)) I think this simplifies to: if (ret2 && (!ret1 || cost2 < cost1)) OK with those changes, thanks. Richard >> { >> *prep_seq =3D prep_seq_2; >> *gen_seq =3D gen_seq_2; >> -- >> 2.31.1 >>