From: Hongyu Wang
Date: Wed, 15 May 2024 16:25:27 +0800
Subject: Re: [PATCH 2/3] [APX CCMP] Adjust startegy for selecting ccmp candidates
To: Richard Sandiford
Cc: gcc-patches@gcc.gnu.org, ubizjak@gmail.com, hongtao.liu@intel.com
References: <20240515082054.3934069-1-hongyu.wang@intel.com> <20240515082054.3934069-3-hongyu.wang@intel.com>
In-Reply-To: <20240515082054.3934069-3-hongyu.wang@intel.com>

CC'd Richard for the ccmp part, as it was previously added only for
aarch64.

The original logic will not be affected: if aarch64_gen_ccmp_first
succeeds, aarch64_gen_ccmp_next will also succeed, because cmp/fcmp and
ccmp/fccmp support all GPI/GPF, and prepare_operand fixes up any input
that cmp supports but ccmp does not. So ret and ret2 will both be valid
when the costs are compared.

Thanks in advance.

Hongyu Wang wrote on Wed, 15 May 2024 at 16:22:
>
> For general ccmp scenario, the tree sequence is like
>
> _1 = (a < b)
> _2 = (c < d)
> _3 = _1 & _2
>
> current ccmp expanding will try to swap compare order for _1 and _2,
> compare the cost/cost2 between compare _1 and _2 first, then return the
> sequence with lower cost.
>
> For x86 ccmp, we don't support FP compare as ccmp operand, but we
> support fp comi + int ccmp sequence. With current cost comparison
> model, the fp comi + int ccmp can never be generated since it doesn't
> check whether expand_ccmp_next returns available result and the rtl
> cost for the empty ccmp sequence is always smaller.
>
> Check the expand_ccmp_next result ret and ret2, returns the valid one
> before cost comparison.
>
> gcc/ChangeLog:
>
>         * ccmp.cc (expand_ccmp_expr_1): Check ret and ret2 of
>         expand_ccmp_next, returns the valid one first before
>         comparing cost.
> ---
>  gcc/ccmp.cc | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/ccmp.cc b/gcc/ccmp.cc
> index 7cb525addf4..4b424220068 100644
> --- a/gcc/ccmp.cc
> +++ b/gcc/ccmp.cc
> @@ -247,7 +247,17 @@ expand_ccmp_expr_1 (gimple *g, rtx_insn **prep_seq, rtx_insn **gen_seq)
>            cost2 = seq_cost (prep_seq_2, speed_p);
>            cost2 += seq_cost (gen_seq_2, speed_p);
>          }
> -      if (cost2 < cost1)
> +
> +      /* For x86 target the ccmp does not support fp operands, but
> +         have fcomi insn that can produce eflags and then do int
> +         ccmp. So if one of the op is fp compare, ret1 or ret2 can
> +         fail, and the cost of the corresponding empty seq will
> +         always be smaller, then the NULL sequence will be returned.
> +         Add check for ret and ret2, returns the available one if
> +         the other is NULL.  */
> +      if ((!ret && ret2)
> +          || (!(ret && !ret2)
> +              && cost2 < cost1))
>          {
>            *prep_seq = prep_seq_2;
>            *gen_seq = gen_seq_2;
> --
> 2.31.1
>
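
For readers following the thread, the selection rule in the quoted hunk
can be illustrated outside of GCC. The snippet below is only a sketch
under stated assumptions, not the code in ccmp.cc: Candidate, its
fields, and use_second are hypothetical names, where "valid" stands in
for a non-NULL ret/ret2 from expand_ccmp_next and "cost" for the summed
seq_cost of the prep and gen sequences.

```cpp
#include <cassert>

// Hypothetical model of one expansion attempt (not a GCC type).
// valid: the attempt produced a usable sequence (non-NULL ret / ret2).
// cost:  summed cost of its sequences; a failed attempt yields an
//        empty sequence, modelled here with cost 0.
struct Candidate {
  bool valid;
  int cost;
};

// Returns true when the second (swapped-order) candidate should be used.
// Logically equivalent to the condition in the quoted hunk:
//   (!ret && ret2) || (!(ret && !ret2) && cost2 < cost1)
bool use_second(const Candidate &first, const Candidate &second) {
  if (!first.valid && second.valid)
    return true;                     // only the second attempt succeeded
  if (first.valid && !second.valid)
    return false;                    // only the first attempt succeeded
  return second.cost < first.cost;   // both alike: fall back to cost
}

// Small self-check covering the case the patch is meant to fix.
void sketch_self_check() {
  // First order is valid, the swapped order failed with an empty
  // sequence. A plain "cost2 < cost1" test would pick the failed one.
  assert(!use_second({true, 10}, {false, 0}));
  // Mirror case: only the swapped order is valid.
  assert(use_second({false, 0}, {true, 10}));
}
```

When both attempts succeed, which is the aarch64 situation described at
the top of this message, the function reduces to the plain cost
comparison, so the previous behaviour is unchanged in that case.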