From: "Li, Pan2"
To: "juzhe.zhong@rivai.ai", gcc-patches
CC: "Wang, Yanzhang", kito.cheng
Subject: RE: [PATCH v2] RISC-V: Refine the code gen for ceil auto vectorization.
Date: Fri, 22 Sep 2023 12:30:05 +0000
References: <20230922111925.2033728-1-pan2.li@intel.com>, <20230922121635.2203266-1-pan2.li@intel.com>

Committed, thanks Juzhe.

Pan

From: juzhe.zhong@rivai.ai
Sent: Friday, September 22, 2023 8:19 PM
To: Li, Pan2; gcc-patches
Cc: Li, Pan2; Wang, Yanzhang; kito.cheng
Subject: Re: [PATCH v2] RISC-V: Refine the code gen for ceil auto vectorization.

LGTM.

________________________________
juzhe.zhong@rivai.ai

From: pan2.li
Date: 2023-09-22 20:16
To: gcc-patches
CC: juzhe.zhong; pan2.li; yanzhang.wang; kito.cheng
Subject: [PATCH v2] RISC-V: Refine the code gen for ceil auto vectorization.

From: Pan Li

We already vectorize the ceil code below.

void test_ceil (float *out, float *in, int count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = __builtin_ceilf (in[i]);
}

Before this patch:
vfmv.v.f     v4,fa0     // can be removed
vfabs.v      v0,v1
vmv1r.v      v2,v1      // can be removed
vmflt.vv     v0,v0,v4   // can be refined to vmflt.vf
vfcvt.x.f.v  v3,v1,v0.t
vfcvt.f.x.v  v2,v3,v0.t
vfsgnj.vv    v2,v2,v1

After this patch:
vfabs.v      v1,v2
vmflt.vf     v0,v1,fa5
vfcvt.x.f.v  v3,v2,v0.t
vfcvt.f.x.v  v1,v3,v0.t
vfsgnj.vv    v1,v1,v2

We can generate better code, including the items below.

* Remove vfmv.v.f.
* Take vmflt.vf instead of vmflt.vv.
* Remove vmv1r.v.

gcc/ChangeLog:

	* config/riscv/riscv-v.cc (expand_vec_float_cmp_mask): Refactor.
	(emit_vec_float_cmp_mask): Rename.
	(expand_vec_copysign): Ditto.
	(emit_vec_copysign): Ditto.
	(emit_vec_abs): New function impl.
	(emit_vec_cvt_x_f): Ditto.
	(emit_vec_cvt_f_x): Ditto.
	(expand_vec_ceil): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/unop/math-ceil-0.c: Adjust body check.
	* gcc.target/riscv/rvv/autovec/unop/math-ceil-1.c: Ditto.
	* gcc.target/riscv/rvv/autovec/unop/math-ceil-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/unop/math-ceil-3.c: Ditto.

Signed-off-by: Pan Li
---
 gcc/config/riscv/riscv-v.cc                   | 81 ++++++++++++-------
 .../riscv/rvv/autovec/unop/math-ceil-0.c      |  5 +-
 .../riscv/rvv/autovec/unop/math-ceil-1.c      |  5 +-
 .../riscv/rvv/autovec/unop/math-ceil-2.c      |  5 +-
 .../riscv/rvv/autovec/unop/math-ceil-3.c      |  5 +-
 5 files changed, 54 insertions(+), 47 deletions(-)

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 4d0e1d8d1a9..251d827d973 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -3557,36 +3557,27 @@ gen_ceil_const_fp (machine_mode inner_mode)
 }
 
 static rtx
-expand_vec_float_cmp_mask (rtx fp_vector, rtx_code code, rtx fp_scalar,
-                           machine_mode vec_fp_mode)
+emit_vec_float_cmp_mask (rtx fp_vector, rtx_code code, rtx fp_scalar,
+                         machine_mode vec_fp_mode)
 {
-  /* Step-1: Get the abs float value for mask generation.  */
-  rtx tmp = gen_reg_rtx (vec_fp_mode);
-  rtx abs_ops[] = {tmp, fp_vector};
-  insn_code icode = code_for_pred (ABS, vec_fp_mode);
-  emit_vlmax_insn (icode, UNARY_OP, abs_ops);
-
-  /* Step-2: Prepare the scalar float compare register.  */
+  /* Step-1: Prepare the scalar float compare register.  */
   rtx fp_reg = gen_reg_rtx (GET_MODE_INNER (vec_fp_mode));
   emit_insn (gen_move_insn (fp_reg, fp_scalar));
 
-  /* Step-3: Prepare the vector float compare register.  */
-  rtx vec_dup = gen_reg_rtx (vec_fp_mode);
-  icode = code_for_pred_broadcast (vec_fp_mode);
-  rtx vfmv_ops[] = {vec_dup, fp_reg};
-  emit_vlmax_insn (icode, UNARY_OP, vfmv_ops);
-
-  /* Step-4: Generate the mask.  */
+  /* Step-2: Generate the mask.  */
   machine_mode mask_mode = get_mask_mode (vec_fp_mode);
   rtx mask = gen_reg_rtx (mask_mode);
-  expand_vec_cmp (mask, code, tmp, vec_dup);
+  rtx cmp = gen_rtx_fmt_ee (code, mask_mode, fp_vector, fp_reg);
+  rtx cmp_ops[] = {mask, cmp, fp_vector, fp_reg};
+  insn_code icode = code_for_pred_cmp_scalar (vec_fp_mode);
+  emit_vlmax_insn (icode, COMPARE_OP, cmp_ops);
 
   return mask;
 }
 
 static void
-expand_vec_copysign (rtx op_dest, rtx op_src_0, rtx op_src_1,
-                     machine_mode vec_mode)
+emit_vec_copysign (rtx op_dest, rtx op_src_0, rtx op_src_1,
+                   machine_mode vec_mode)
 {
   rtx sgnj_ops[] = {op_dest, op_src_0, op_src_1};
   insn_code icode = code_for_pred (UNSPEC_VCOPYSIGN, vec_mode);
@@ -3594,30 +3585,58 @@ expand_vec_copysign (rtx op_dest, rtx op_src_0, rtx op_src_1,
   emit_vlmax_insn (icode, BINARY_OP, sgnj_ops);
 }
 
+static void
+emit_vec_abs (rtx op_dest, rtx op_src, machine_mode vec_mode)
+{
+  rtx abs_ops[] = {op_dest, op_src};
+  insn_code icode = code_for_pred (ABS, vec_mode);
+
+  emit_vlmax_insn (icode, UNARY_OP, abs_ops);
+}
+
+static void
+emit_vec_cvt_x_f (rtx op_dest, rtx op_src, rtx mask,
+                  insn_type type, machine_mode vec_mode)
+{
+  rtx cvt_x_ops[] = {op_dest, mask, op_dest, op_src};
+  insn_code icode = code_for_pred_fcvt_x_f (UNSPEC_VFCVT, vec_mode);
+
+  emit_vlmax_insn (icode, type, cvt_x_ops);
+}
+
+static void
+emit_vec_cvt_f_x (rtx op_dest, rtx op_src, rtx mask,
+                  insn_type type, machine_mode vec_mode)
+{
+  rtx cvt_fp_ops[] = {op_dest, mask, op_dest, op_src};
+  insn_code icode = code_for_pred (FLOAT, vec_mode);
+
+  emit_vlmax_insn (icode, type, cvt_fp_ops);
+}
+
 void
 expand_vec_ceil (rtx op_0, rtx op_1, machine_mode vec_fp_mode,
                  machine_mode vec_int_mode)
 {
-  /* Step-1: Generate the mask on const fp.  */
+  /* Step-1: Get the abs float value for mask generation.  */
+  emit_vec_abs (op_0, op_1, vec_fp_mode);
+
+  /* Step-2: Generate the mask on const fp.  */
   rtx const_fp = gen_ceil_const_fp (GET_MODE_INNER (vec_fp_mode));
-  rtx mask = expand_vec_float_cmp_mask (op_1, LT, const_fp, vec_fp_mode);
+  rtx mask = emit_vec_float_cmp_mask (op_0, LT, const_fp, vec_fp_mode);
 
-  /* Step-2: Convert to integer on mask, with rounding up (aka ceil).  */
+  /* Step-3: Convert to integer on mask, with rounding up (aka ceil).  */
   rtx tmp = gen_reg_rtx (vec_int_mode);
-  rtx cvt_x_ops[] = {tmp, mask, tmp, op_1};
-  insn_code icode = code_for_pred_fcvt_x_f (UNSPEC_VFCVT, vec_fp_mode);
-  emit_vlmax_insn (icode, UNARY_OP_TAMU_FRM_RUP, cvt_x_ops);
+  emit_vec_cvt_x_f (tmp, op_1, mask, UNARY_OP_TAMU_FRM_RUP, vec_fp_mode);
 
-  /* Step-3: Convert to floating-point on mask for the final result.
+  /* Step-4: Convert to floating-point on mask for the final result.
      To avoid unnecessary frm register access, we use RUP here and it will
      never do the rounding up because the tmp rtx comes from the float
      to int conversion.  */
-  rtx cvt_fp_ops[] = {op_0, mask, op_1, tmp};
-  icode = code_for_pred (FLOAT, vec_fp_mode);
-  emit_vlmax_insn (icode, UNARY_OP_TAMU_FRM_RUP, cvt_fp_ops);
+  emit_vec_cvt_f_x (op_0, tmp, mask, UNARY_OP_TAMU_FRM_RUP, vec_fp_mode);
 
-  /* Step-4: Retrieve the sign bit.  */
-  expand_vec_copysign (op_0, op_0, op_1, vec_fp_mode);
+  /* Step-5: Retrieve the sign bit.  */
+  emit_vec_copysign (op_0, op_0, op_1, vec_fp_mode);
 }
 
 } // namespace riscv_vector
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-0.c
index 0959afd57d6..1c53d9b67d3 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-0.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-0.c
@@ -12,11 +12,8 @@
 ** ...
 ** vsetvli\s+[atx][0-9]+,\s*zero,\s*e16,\s*m1,\s*ta,\s*mu
 ** vfabs\.v\s+v[0-9]+,\s*v[0-9]+
-** ...
-** vmflt\.vv\s+v0,\s*v[0-9]+,\s*v[0-9]+
-** ...
+** vmflt\.vf\s+v0,\s*v[0-9]+,\s*[fa]+[0-9]+
 ** vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+,\s*v0\.t
-** ...
 ** vfcvt\.f\.x\.v\s+v[0-9]+,\s*v[0-9]+,\s*v0\.t
 ** vfsgnj\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+
 ** ...
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-1.c
index 142705b7eed..a6d0ac3fc83 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-1.c
@@ -12,11 +12,8 @@
 ** ...
 ** vsetvli\s+[atx][0-9]+,\s*zero,\s*e32,\s*m1,\s*ta,\s*mu
 ** vfabs\.v\s+v[0-9]+,\s*v[0-9]+
-** ...
-** vmflt\.vv\s+v0,\s*v[0-9]+,\s*v[0-9]+
-** ...
+** vmflt\.vf\s+v0,\s*v[0-9]+,\s*[fa]+[0-9]+
 ** vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+,\s*v0\.t
-** ...
 ** vfcvt\.f\.x\.v\s+v[0-9]+,\s*v[0-9]+,\s*v0\.t
 ** vfsgnj\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+
 ** ...
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-2.c
index d232e36e1db..d196fc678c4 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-2.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-2.c
@@ -12,11 +12,8 @@
 ** ...
 ** vsetvli\s+[atx][0-9]+,\s*zero,\s*e64,\s*m1,\s*ta,\s*mu
 ** vfabs\.v\s+v[0-9]+,\s*v[0-9]+
-** ...
-** vmflt\.vv\s+v0,\s*v[0-9]+,\s*v[0-9]+
-** ...
+** vmflt\.vf\s+v0,\s*v[0-9]+,\s*[fa]+[0-9]+
 ** vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+,\s*v0\.t
-** ...
 ** vfcvt\.f\.x\.v\s+v[0-9]+,\s*v[0-9]+,\s*v0\.t
 ** vfsgnj\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+
 ** ...
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-3.c
index 82e4f89a82a..cd3df49de6d 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-3.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-3.c
@@ -12,11 +12,8 @@
 ** ...
 ** vsetvli\s+[atx][0-9]+,\s*zero,\s*e32,\s*m1,\s*ta,\s*mu
 ** vfabs\.v\s+v[0-9]+,\s*v[0-9]+
-** ...
-** vmflt\.vv\s+v0,\s*v[0-9]+,\s*v[0-9]+
-** ...
+** vmflt\.vf\s+v0,\s*v[0-9]+,\s*[fa]+[0-9]+
 ** vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+,\s*v0\.t
-** ...
 ** vfcvt\.f\.x\.v\s+v[0-9]+,\s*v[0-9]+,\s*v0\.t
 ** vfsgnj\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+
 ** ...
-- 
2.34.1
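
For readers following the five steps in expand_vec_ceil, here is a rough scalar model in C of what one vector lane computes after this patch. It is only a sketch and not part of the patch. The names ceil_lane and model_test_ceil are made up for illustration, and the 2^23 threshold is an assumption for single precision (the point at which every float is already an integer); the patch itself takes its constant from gen_ceil_const_fp, whose value is not shown in this thread.

```c
#include <math.h>

/* Hypothetical scalar model of one lane of the vectorized ceil.
   The threshold is an assumption, see the note above.  */
static float
ceil_lane (float x)
{
  const float limit = 8388608.0f;	/* 2^23, assumed threshold.  */

  /* Step-1: vfabs.v, the destination first holds |x|.  */
  float result = fabsf (x);

  /* Step-2: vmflt.vf, the lane is active only if |x| < limit.
     The comparison is false for NaN, so NaN lanes stay inactive.  */
  if (result < limit)
    {
      /* Step-3: vfcvt.x.f.v with round-up.  Modeled here as a
	 truncation followed by a fix-up, which is safe because
	 |x| < 2^23 fits in an int.  */
      int i = (int) x;
      if ((float) i < x)
	i += 1;

      /* Step-4: vfcvt.f.x.v, exact because i is a small integer.  */
      result = (float) i;
    }

  /* Step-5: vfsgnj.vv, restore the sign of the input.  Inactive
     lanes still hold |x| here, so they get x back unchanged.  */
  return copysignf (result, x);
}

/* Loop shape matching test_ceil from the commit message.  */
static void
model_test_ceil (float *out, const float *in, int count)
{
  for (int i = 0; i < count; i++)
    out[i] = ceil_lane (in[i]);
}
```

Calling model_test_ceil on a small array should agree with ceilf element by element, including for -0.5f, where the copysign step is what turns 0.0 into -0.0. The model also shows why the patch can write the abs value straight into op_0: lanes that fail the compare keep |x| under the mask-undisturbed policy, and the final copysign turns that back into x.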