From: "Li, Pan2"
To: "juzhe.zhong@rivai.ai", gcc-patches
CC: "Wang, Yanzhang", kito.cheng
Subject: RE: RE: [PATCH v1] RISC-V: Support ceil and ceilf auto-vectorization
Date: Wed, 20 Sep 2023 03:42:19 +0000
References: <20230920023059.1728132-1-pan2.li@intel.com>, <8B36A59D2874CD87+2023092010351706302342@rivai.ai>

Thanks Juzhe, let me check and keep you posted.

Pan

From: juzhe.zhong@rivai.ai
Sent: Wednesday, September 20, 2023 11:37 AM
To: Li, Pan2; gcc-patches
Cc: Wang, Yanzhang; kito.cheng
Subject: Re: RE: [PATCH v1] RISC-V: Support ceil and ceilf auto-vectorization

I just checked the LLVM implementation. This is their code for rounding auto-vectorization.
They handle CEIL/FLOOR/FROUND/FROUNDEVEN/FROUND TO ZERO with the same handling:

  switch (Op.getOpcode()) {
  default:
    llvm_unreachable("Unexpected opcode");
  case ISD::FCEIL:
  case ISD::VP_FCEIL:
  case ISD::FFLOOR:
  case ISD::VP_FFLOOR:
  case ISD::FROUND:
  case ISD::FROUNDEVEN:
  case ISD::VP_FROUND:
  case ISD::VP_FROUNDEVEN:
  case ISD::VP_FROUNDTOZERO: {
    RISCVFPRndMode::RoundingMode FRM = matchRoundingOp(Op.getOpcode());
    assert(FRM != RISCVFPRndMode::Invalid);
    Truncated = DAG.getNode(RISCVISD::VFCVT_RM_X_F_VL, DL, IntVT, Src, Mask,
                            DAG.getTargetConstant(FRM, DL, XLenVT), VL);
    break;
  }
  case ISD::FTRUNC:
    Truncated = DAG.getNode(RISCVISD::VFCVT_RTZ_X_F_VL, DL, IntVT, Src,
                            Mask, VL);
    break;
  case ISD::VP_FRINT:
    Truncated = DAG.getNode(RISCVISD::VFCVT_X_F_VL, DL, IntVT, Src, Mask, VL);
    break;
  case ISD::VP_FNEARBYINT:
    Truncated = DAG.getNode(RISCVISD::VFROUND_NOEXCEPT_VL, DL, ContainerVT, Src,
                            Mask, VL);
    break;
  }

  // VFROUND_NOEXCEPT_VL includes SINT_TO_FP_VL.
  if (Op.getOpcode() != ISD::VP_FNEARBYINT)
    Truncated = DAG.getNode(RISCVISD::SINT_TO_FP_VL, DL, ContainerVT,
                            Truncated, Mask, VL);

  // Restore the original sign so that -0.0 is preserved.
  Truncated = DAG.getNode(RISCVISD::FCOPYSIGN_VL, DL, ContainerVT, Truncated,
                          Src, Src, Mask, VL);

I think you could just copy the LLVM implementation and translate it into GCC code.
It's simple: create a function called 'expand_rounding'.
The LLVM code is very easy to read, so I believe you could leverage the LLVM implementation quickly.
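As a rough scalar illustration of the sequence in the LLVM code above (convert to integer with a chosen rounding mode, convert back, then restore the sign), here is a minimal C sketch. All function names are hypothetical. The round-up conversion is emulated with truncation plus an adjustment, because portable C has no float-to-int conversion that takes an explicit rounding mode. The sketch assumes IEEE-754 binary32 floats and inputs that are finite and whose ceiling fits in a long.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return MAG with the sign bit of SGN, the scalar
   analogue of the final copysign step.  Assumes IEEE-754 binary32.  */
float
copy_sign_bits (float mag, float sgn)
{
  uint32_t m, s;
  memcpy (&m, &mag, sizeof m);
  memcpy (&s, &sgn, sizeof s);
  m = (m & 0x7fffffffu) | (s & 0x80000000u);
  memcpy (&mag, &m, sizeof mag);
  return mag;
}

/* Hypothetical helper: nonzero if the sign bit of X is set.  */
int
has_sign_bit (float x)
{
  uint32_t bits;
  memcpy (&bits, &x, sizeof bits);
  return (bits >> 31) != 0;
}

/* Scalar model of "float -> int rounding up, int -> float, copysign".
   Only meaningful for finite X whose ceiling fits in a long.  */
float
ceil_via_convert (float x)
{
  long as_int = (long) x;       /* C truncates toward zero.  */
  if ((float) as_int < x)       /* Emulate round-up for positive fractions.  */
    as_int += 1;
  float back = (float) as_int;  /* The value is already integral.  */
  return copy_sign_bits (back, x);
}
```

In this model, ceil_via_convert (-0.25f) compares equal to zero but keeps its sign bit, which is the case the final copysign step exists for. Inputs outside the integer range, infinities and NaN are not handled by the sketch.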
________________________________
juzhe.zhong@rivai.ai

From: Li, Pan2
Date: 2023-09-20 10:44
To: juzhe.zhong@rivai.ai; gcc-patches
CC: Wang, Yanzhang; kito.cheng
Subject: RE: [PATCH v1] RISC-V: Support ceil and ceilf auto-vectorization

> It should be "V_VLSF" instead of "VF" so that you could also support VLS CEIL.

This is in preparation and will be appended to V2 instead of going into another patch.

> a[i] = cond[i] ? CEIL (b[i]): c[i];

Sure

Pan

From: juzhe.zhong@rivai.ai
Sent: Wednesday, September 20, 2023 10:35 AM
To: Li, Pan2; gcc-patches
Cc: Li, Pan2; Wang, Yanzhang; kito.cheng
Subject: Re: [PATCH v1] RISC-V: Support ceil and ceilf auto-vectorization

+;; -------------------------------------------------------------------------
+;; ---- [FP] Math.h.
+;; -------------------------------------------------------------------------
+;; Includes:
+;; - ceil/ceilf
+;; -------------------------------------------------------------------------
+(define_expand "ceil2"
+  [(match_operand:VF 0 "register_operand")
+   (match_operand:VF 1 "register_operand")]
+  "TARGET_VECTOR"
+  {
+    rtx tmp = gen_reg_rtx (mode);
+    rtx ops_1[] = {tmp, operands[1]};
+    insn_code icode = code_for_pred_fcvt_x_f (UNSPEC_VFCVT, mode);
+
+    /* vfcvt.x.f with rounding up (aka ceil).  */
+    riscv_vector::emit_vlmax_insn (icode, riscv_vector::UNARY_OP_FRM_RUP, ops_1);
+
+    rtx ops_2[] = {operands[0], tmp};
+    icode = code_for_pred (FLOAT, mode);
+
+    /* vfcvt.f.x for the final result.  To avoid unnecessary frm register
+       access, we use RUP here and it will never do the rounding up because
+       the tmp rtx comes from the float to int conversion.  */
+    riscv_vector::emit_vlmax_insn (icode, riscv_vector::UNARY_OP_FRM_RUP, ops_2);
+
+    DONE;
+  }
+)

It should be "V_VLSF" instead of "VF" so that you could also support VLS CEIL.

Besides, I want to see the following case:

a[i] = cond[i] ? CEIL (b[i]): c[i];

Ideally, we should be able to combine vfcvt + vmerge into vfcvt with mask.
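To make the "vfcvt + vmerge versus vfcvt with mask" remark concrete, here is a minimal scalar C sketch of the two forms of the conditional loop. It is only an illustration under stated assumptions: all function names are hypothetical, ceil_small is a stand-in for the vectorized conversion that is valid only for small finite inputs, and the second form models a masked operation that leaves inactive lanes undisturbed.

```c
#include <stddef.h>

/* Hypothetical stand-in for the vectorized ceil; valid for small finite X.  */
float
ceil_small (float x)
{
  long t = (long) x;
  if ((float) t < x)
    t += 1;
  return (float) t;
}

/* Form 1: convert every lane, then select (unmasked convert + merge).  */
void
cond_ceil_merge (float *out, const int *cond, const float *b,
                 const float *c, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      float converted = ceil_small (b[i]);
      out[i] = cond[i] ? converted : c[i];
    }
}

/* Form 2: start from C and convert only the active lanes (masked convert
   whose inactive lanes keep their previous value).  */
void
cond_ceil_masked (float *out, const int *cond, const float *b,
                  const float *c, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      out[i] = c[i];
      if (cond[i])
        out[i] = ceil_small (b[i]);
    }
}

/* Return 1 if both forms agree on a small fixed sample, else 0.  */
int
cond_ceil_forms_agree (void)
{
  const int cond[4] = {1, 0, 1, 0};
  const float b[4] = {1.5f, 2.5f, -1.5f, 7.0f};
  const float c[4] = {10.0f, 20.0f, 30.0f, 40.0f};
  float merged[4], masked[4];

  cond_ceil_merge (merged, cond, b, c, 4);
  cond_ceil_masked (masked, cond, b, c, 4);
  for (size_t i = 0; i < 4; i++)
    if (merged[i] != masked[i])
      return 0;
  return 1;
}
```

Both forms produce the same values. The second one skips the conversion on inactive lanes, which is the saving the combined masked instruction would give.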
________________________________
juzhe.zhong@rivai.ai

From: pan2.li
Date: 2023-09-20 10:30
To: gcc-patches
CC: juzhe.zhong; pan2.li; yanzhang.wang; kito.cheng
Subject: [PATCH v1] RISC-V: Support ceil and ceilf auto-vectorization

From: Pan Li

This patch would like to support auto-vectorization for both ceil and
ceilf from math.h. It depends on the -ffast-math option.

When we call ceil/ceilf, as in v2 = ceil (v1), we convert it into the
insns below (following the implementation in LLVM).

* vfcvt.x.f v3, v1, RUP
* vfcvt.f.x v2, v3

The conditional auto-vectorization for ceil/ceilf is also supported
and covered by test cases.

Before this patch:
math-ceil-1.c:21:1: missed: couldn't vectorize loop
  ...
.L3:
  flw     fa0,0(s0)
  addi    s0,s0,4
  addi    s1,s1,4
  call    ceilf
  fsw     fa0,-4(s1)
  bne     s0,s2,.L3

After this patch:
  ...
  fsrmi   3
.L4:
  vsetvli a5,a2,e32,m1,ta,ma
  vle32.v v1,0(a1)
  vsetvli a3,zero,e32,m1,ta,ma
  slli    a4,a5,2
  vfcvt.x.f.v     v1,v1
  sub     a2,a2,a5
  vfcvt.f.x.v     v1,v1
  vsetvli zero,a5,e32,m1,ta,ma
  vse32.v v1,0(a0)
  add     a1,a1,a4
  add     a0,a0,a4
  bne     a2,zero,.L4
.L14:
  fsrm    a6
  ret

Please note that VLS mode is not involved in this patch and will be
taken care of in the following patches soon.

gcc/ChangeLog:

	* config/riscv/autovec.md (ceil2): New pattern.
	* config/riscv/riscv-protos.h (enum insn_flags): New enum type.
	(enum insn_type): Ditto.
	* config/riscv/riscv-v.cc: Handle rounding up.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/math-ceil-1.c: New test.
	* gcc.target/riscv/rvv/autovec/math-ceil-2.c: New test.
	* gcc.target/riscv/rvv/autovec/math-ceil-3.c: New test.
	* gcc.target/riscv/rvv/autovec/math-ceil-4.c: New test.
	* gcc.target/riscv/rvv/autovec/math-ceil-run-1.c: New test.
	* gcc.target/riscv/rvv/autovec/math-ceil-run-2.c: New test.
	* gcc.target/riscv/rvv/autovec/test-math.h: New test.

Signed-off-by: Pan Li
---
 gcc/config/riscv/autovec.md                  | 30 +++++++++++++
 gcc/config/riscv/riscv-protos.h              |  4 ++
 gcc/config/riscv/riscv-v.cc                  |  2 +
 .../riscv/rvv/autovec/math-ceil-1.c          | 21 +++++++++
 .../riscv/rvv/autovec/math-ceil-2.c          | 21 +++++++++
 .../riscv/rvv/autovec/math-ceil-3.c          | 24 ++++++++++
 .../riscv/rvv/autovec/math-ceil-4.c          | 24 ++++++++++
 .../riscv/rvv/autovec/math-ceil-run-1.c      | 24 ++++++++++
 .../riscv/rvv/autovec/math-ceil-run-2.c      | 24 ++++++++++
 .../gcc.target/riscv/rvv/autovec/test-math.h | 45 +++++++++++++++++++
 10 files changed, 219 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-run-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-run-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/test-math.h

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index 493d5745485..ea508d81047 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -2374,3 +2374,33 @@ (define_expand "avg3_ceil"
   riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_OP, ops3);
   DONE;
 })
+
+;; -------------------------------------------------------------------------
+;; ---- [FP] Math.h.
+;; -------------------------------------------------------------------------
+;; Includes:
+;; - ceil/ceilf
+;; -------------------------------------------------------------------------
+(define_expand "ceil2"
+  [(match_operand:VF 0 "register_operand")
+   (match_operand:VF 1 "register_operand")]
+  "TARGET_VECTOR"
+  {
+    rtx tmp = gen_reg_rtx (mode);
+    rtx ops_1[] = {tmp, operands[1]};
+    insn_code icode = code_for_pred_fcvt_x_f (UNSPEC_VFCVT, mode);
+
+    /* vfcvt.x.f with rounding up (aka ceil).  */
+    riscv_vector::emit_vlmax_insn (icode, riscv_vector::UNARY_OP_FRM_RUP, ops_1);
+
+    rtx ops_2[] = {operands[0], tmp};
+    icode = code_for_pred (FLOAT, mode);
+
+    /* vfcvt.f.x for the final result.  To avoid unnecessary frm register
+       access, we use RUP here and it will never do the rounding up because
+       the tmp rtx comes from the float to int conversion.  */
+    riscv_vector::emit_vlmax_insn (icode, riscv_vector::UNARY_OP_FRM_RUP, ops_2);
+
+    DONE;
+  }
+)
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 5a2d218d67b..833f1efbaf4 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -250,6 +250,9 @@ enum insn_flags : unsigned int
   /* flags for the floating-point rounding mode.  */
   /* Means INSN has FRM operand and the value is FRM_DYN.  */
   FRM_DYN_P = 1 << 15,
+
+  /* Means INSN has FRM operand and the value is FRM_RUP.  */
+  FRM_RUP_P = 1 << 16,
 };
 
 enum insn_type : unsigned int
@@ -290,6 +293,7 @@ enum insn_type : unsigned int
   UNARY_OP_TAMA = __MASK_OP_TAMA | UNARY_OP_P,
   UNARY_OP_TAMU = __MASK_OP_TAMU | UNARY_OP_P,
   UNARY_OP_FRM_DYN = UNARY_OP | FRM_DYN_P,
+  UNARY_OP_FRM_RUP = UNARY_OP | FRM_RUP_P,
 
   /* Binary operator.  */
   BINARY_OP = __NORMAL_OP | BINARY_OP_P,
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index a9287e5d671..4192f988648 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -323,6 +323,8 @@ public:
     /* Add rounding mode operand.  */
     if (m_insn_flags & FRM_DYN_P)
       add_rounding_mode_operand (FRM_DYN);
+    if (m_insn_flags & FRM_RUP_P)
+      add_rounding_mode_operand (FRM_RUP);
 
     gcc_assert (insn_data[(int) icode].n_operands == m_opno);
     expand (icode, any_mem_p);
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-1.c
new file mode 100644
index 00000000000..8f0f09609eb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-1.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "test-math.h"
+
+/*
+** test_float_ceilf:
+**   frrm\s+[atx][0-9]+
+**   ...
+**   fsrmi\s+3
+**   ...
+**   vsetvli\s+[atx][0-9]+,\s*zero,\s*e32,\s*m1,\s*ta,\s*ma
+**   vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+
+**   ...
+**   vfcvt\.f\.x\.v\s+v[0-9]+,\s*v[0-9]+
+**   ...
+**   fsrm\s+[atx][0-9]+
+**   ...
+*/
+TEST_CEIL(float, ceilf)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-2.c
new file mode 100644
index 00000000000..73395d30d7a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-2.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "test-math.h"
+
+/*
+** test_double_ceil:
+**   frrm\s+[atx][0-9]+
+**   ...
+**   fsrmi\s+3
+**   ...
+**   vsetvli\s+[atx][0-9]+,\s*zero,\s*e64,\s*m1,\s*ta,\s*ma
+**   vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+
+**   ...
+**   vfcvt\.f\.x\.v\s+v[0-9]+,\s*v[0-9]+
+**   ...
+**   fsrm\s+[atx][0-9]+
+**   ...
+*/
+TEST_CEIL(double, ceil)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-3.c
new file mode 100644
index 00000000000..eb0f3a3db78
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-3.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "test-math.h"
+
+/*
+** test_float_ceilf:
+**   frrm\s+[atx][0-9]+
+**   ...
+**   fsrmi\s+3
+**   ...
+**   vsetvli\s+[atx][0-9]+,\s*zero,\s*e32,\s*m1,\s*ta,\s*ma
+**   ...
+**   vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+
+**   ...
+**   vfcvt\.f\.x\.v\s+v[0-9]+,\s*v[0-9]+
+**   ...
+**   vmerge\.vvm\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+,\s*v0
+**   ...
+**   fsrm\s+[atx][0-9]+
+**   ...
+*/
+TEST_COND_CEIL(float, ceilf)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-4.c
new file mode 100644
index 00000000000..b9a3c8ebf84
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-4.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "test-math.h"
+
+/*
+** test_double_ceil:
+**   frrm\s+[atx][0-9]+
+**   ...
+**   fsrmi\s+3
+**   ...
+**   vsetvli\s+[atx][0-9]+,\s*zero,\s*e64,\s*m1,\s*ta,\s*ma
+**   ...
+**   vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+
+**   ...
+**   vfcvt\.f\.x\.v\s+v[0-9]+,\s*v[0-9]+
+**   ...
+**   vmerge\.vvm\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+,\s*v0
+**   ...
+**   fsrm\s+[atx][0-9]+
+**   ...
+*/
+TEST_COND_CEIL(double, ceil)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-run-1.c
new file mode 100644
index 00000000000..014c4c3ac0a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-run-1.c
@@ -0,0 +1,24 @@
+/* { dg-do run { target { riscv_vector && riscv_zvfh_hw } } } */
+/* { dg-additional-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -lm" } */
+#include "test-math.h"
+
+#define ARRAY_SIZE 128
+
+float in[ARRAY_SIZE];
+float out[ARRAY_SIZE];
+float ref[ARRAY_SIZE];
+
+// Test function declaration
+TEST_CEIL(float, ceilf)
+TEST_INIT(float)
+TEST_ASSERT(float)
+
+int
+main ()
+{
+  test_float_init (in, ref, ARRAY_SIZE);
+  test_float_ceilf (out, in, ARRAY_SIZE);
+  test_float_assert (out, ref, ARRAY_SIZE);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-run-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-run-2.c
new file mode 100644
index 00000000000..ae361e11144
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-run-2.c
@@ -0,0 +1,24 @@
+/* { dg-do run { target { riscv_vector && riscv_zvfh_hw } } } */
+/* { dg-additional-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -lm" } */
+#include "test-math.h"
+
+#define ARRAY_SIZE 128
+
+double in[ARRAY_SIZE];
+double out[ARRAY_SIZE];
+double ref[ARRAY_SIZE];
+
+// Test function declaration
+TEST_CEIL(double, ceil)
+TEST_INIT(double)
+TEST_ASSERT(double)
+
+int
+main ()
+{
+  test_double_init (in, ref, ARRAY_SIZE);
+  test_double_ceil (out, in, ARRAY_SIZE);
+  test_double_assert (out, ref, ARRAY_SIZE);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/test-math.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/test-math.h
new file mode 100644
index 00000000000..57dd5e0e460
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/test-math.h
@@ -0,0 +1,45 @@
+#include <math.h>
+
+#define TEST_CEIL(TYPE, CALL) \
+  void test_##TYPE##_##CALL (TYPE *out, TYPE *in, unsigned count) \
+  { \
+    for (unsigned i = 0; i < count; i++) \
+      out[i] = CALL (in[i]); \
+  }
+
+#define TEST_COND_CEIL(TYPE, CALL) \
+  void test_##TYPE##_##CALL (TYPE *out, int *cond, TYPE *in, unsigned count) \
+  { \
+    for (unsigned i = 0; i < count; i++) \
+      out[i] = cond[i] ? CALL (in[i]) : in[i]; \
+  }
+
+#define TEST_INIT(TYPE) \
+  void test_##TYPE##_init (TYPE *in, TYPE *ref, unsigned size) \
+  { \
+    for (unsigned i = 0; i < size; i++) \
+      { \
+        TYPE tmp = (TYPE)i; \
+ \
+        if (i % 2 == 0) \
+          { \
+            in[i] = 1.5f + (TYPE)i; \
+            ref[i] = (TYPE)(i + 2); \
+          } \
+        else \
+          { \
+            in[i] = (TYPE)i; \
+            ref[i] = (TYPE)i; \
+          } \
+      } \
+  }
+
+#define TEST_ASSERT(TYPE) \
+  void test_##TYPE##_assert (TYPE *out, TYPE *ref, unsigned size) \
+  { \
+    for (unsigned i = 0; i < size; i++) \
+      { \
+        if (out[i] != ref[i]) \
+          __builtin_abort (); \
+      } \
+  }
--
2.34.1
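
The flag handling in the riscv-protos.h and riscv-v.cc hunks can be read as a bit test that picks a rounding-mode operand. A minimal standalone C sketch of that idea follows. The FRM_DYN_P and FRM_RUP_P bit positions are taken from the patch and the frm encodings are those of the RISC-V F extension (RUP is 3, matching the "fsrmi 3" in the generated code), while SKETCH_UNARY_OP and rounding_mode_for are hypothetical names that do not exist in GCC.

```c
/* frm field encodings as defined by the RISC-V F extension.  */
enum frm_value
{
  FRM_RNE = 0,
  FRM_RTZ = 1,
  FRM_RDN = 2,
  FRM_RUP = 3,
  FRM_RMM = 4,
  FRM_DYN = 7
};

/* Flag bits with the positions used in the patch.  */
enum
{
  FRM_DYN_P = 1 << 15,
  FRM_RUP_P = 1 << 16
};

/* Hypothetical stand-in for the real UNARY_OP value, which is built
   from other flag bits not shown in the patch.  */
enum
{
  SKETCH_UNARY_OP = 1 << 0
};

/* Hypothetical helper: return the rounding-mode operand implied by
   FLAGS, or -1 when no FRM flag is set.  */
int
rounding_mode_for (unsigned int flags)
{
  if (flags & FRM_DYN_P)
    return FRM_DYN;
  if (flags & FRM_RUP_P)
    return FRM_RUP;
  return -1;
}
```

Note that the sketch returns on the first matching flag, whereas the patch uses two independent if statements and therefore relies on callers never setting both FRM_DYN_P and FRM_RUP_P for one instruction.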