From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Li, Pan2"
To: "juzhe.zhong@rivai.ai", gcc-patches
CC: Robin Dapp, jeffreyalaw, "Wang, Yanzhang", kito.cheng
Subject: RE: [PATCH v2] RISC-V: Bugfix for RVV integer reduction in ZVE32/64.
Date: Fri, 16 Jun 2023 08:16:38 +0000
References: <20230616072834.3754201-1-pan2.li@intel.com>, <20230616080932.4190921-1-pan2.li@intel.com>
Content-Type: multipart/alternative; boundary="_000_MW5PR11MB5908D2B33D780404A6F66926A958AMW5PR11MB5908namp_"
MIME-Version: 1.0

--_000_MW5PR11MB5908D2B33D780404A6F66926A958AMW5PR11MB5908namp_
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

Thanks Juzhe for reviewing, will take care of the FP and widen part soon.

Pan

From: juzhe.zhong@rivai.ai
Sent: Friday, June 16, 2023 4:11 PM
To: Li, Pan2; gcc-patches
Cc: Robin Dapp; jeffreyalaw; Li, Pan2; Wang, Yanzhang; kito.cheng
Subject: Re: [PATCH v2] RISC-V: Bugfix for RVV integer reduction in ZVE32/64.

LGTM. Thanks for fixing this bug.
Let's wait for Jeff's final approval.

Thanks.
________________________________
juzhe.zhong@rivai.ai

From: pan2.li
Date: 2023-06-16 16:09
To: gcc-patches
CC: juzhe.zhong; rdapp.gcc; jeffreyalaw; pan2.li; yanzhang.wang; kito.cheng
Subject: [PATCH v2] RISC-V: Bugfix for RVV integer reduction in ZVE32/64.
From: Pan Li

The rvv integer reduction has 3 different patterns for zve128+, zve64
and zve32. They take the same iterator with different attributes.
However, we need the generated function code_for_reduc (code, mode1, mode2).
The implementation of code_for_reduc may look like below.

code_for_reduc (code, mode1, mode2)
{
  if (code == max && mode1 == VNx1QI && mode2 == VNx1QI)
    return CODE_FOR_pred_reduc_maxvnx1qivnx16qi; // ZVE128+

  if (code == max && mode1 == VNx1QI && mode2 == VNx1QI)
    return CODE_FOR_pred_reduc_maxvnx1qivnx8qi; // ZVE64

  if (code == max && mode1 == VNx1QI && mode2 == VNx1QI)
    return CODE_FOR_pred_reduc_maxvnx1qivnx4qi; // ZVE32
}

Thus there will be a problem here. For example, for zve32 we will have
code_for_reduc (max, VNx1QI, VNx1QI), which returns the code for
ZVE128+ instead of the code for ZVE32, because all three conditions are
identical and the first one always matches.

This patch will merge the 3 patterns into one pattern per element type,
and pass the modes of both the input_vector and the ret_vector to
code_for_reduc. For example, ZVE32 will be
code_for_reduc (max, VNx1QI, VNx4QI), so the correct code for ZVE32 is
returned as expected.

Please note both GCC 13 and 14 are impacted by this issue.

Signed-off-by: Pan Li
Co-Authored by: Juzhe-Zhong

PR 110265

gcc/ChangeLog:

	PR target/110265
	* config/riscv/riscv-vector-builtins-bases.cc: Add ret_mode for
	integer reduction expand.
	* config/riscv/vector-iterators.md: Add VQI, VHI, VSI and VDI,
	and the LMUL1 attr respectively.
	* config/riscv/vector.md.
	(@pred_reduc_): Removed.
	(@pred_reduc_): Likewise.
	(@pred_reduc_): Likewise.
	(@pred_reduc_): New pattern.
	(@pred_reduc_): Likewise.
	(@pred_reduc_): Likewise.
	(@pred_reduc_): Likewise.

gcc/testsuite/ChangeLog:

	PR target/110265
	* gcc.target/riscv/rvv/base/pr110265-1.c: New test.
	* gcc.target/riscv/rvv/base/pr110265-1.h: New test.
	* gcc.target/riscv/rvv/base/pr110265-2.c: New test.
	* gcc.target/riscv/rvv/base/pr110265-2.h: New test.
	* gcc.target/riscv/rvv/base/pr110265-3.c: New test.
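
For illustration only, the lookup ambiguity described in the commit message can be modelled with a small standalone C++ sketch. Nothing here is GCC code: `Code`, `Mode`, `Pattern`, `kPatterns`, `lookup_by_vector_mode` and `lookup_by_both_modes` are hypothetical names, and the table simply mirrors the three entries of the pseudo-code above. The real code_for_pred_reduc is generated from the machine description and may look quite different.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for GCC's rtx codes and machine modes.
enum class Code { Max };
enum class Mode { VNx1QI, VNx4QI, VNx8QI, VNx16QI };

struct Pattern {
  Code code;
  Mode vector_mode;  // mode of the vector being reduced
  Mode lmul1_mode;   // mode of the LMUL = 1 result operand
  std::string insn;  // pattern name the lookup should return
};

// One entry per pre-patch pattern, in the order used in the commit message.
static const std::vector<Pattern> kPatterns = {
    {Code::Max, Mode::VNx1QI, Mode::VNx16QI, "pred_reduc_maxvnx1qivnx16qi"},  // ZVE128+
    {Code::Max, Mode::VNx1QI, Mode::VNx8QI, "pred_reduc_maxvnx1qivnx8qi"},    // ZVE64
    {Code::Max, Mode::VNx1QI, Mode::VNx4QI, "pred_reduc_maxvnx1qivnx4qi"},    // ZVE32
};

// Pre-patch style: the key ignores the LMUL = 1 mode, so the first entry
// with a matching code and vector mode always wins.
std::string lookup_by_vector_mode(Code code, Mode vector_mode) {
  for (const Pattern &p : kPatterns)
    if (p.code == code && p.vector_mode == vector_mode)
      return p.insn;
  return "";
}

// Post-patch style: the LMUL = 1 mode is part of the key, so each target
// resolves to its own entry. Returns an empty string when nothing matches.
std::string lookup_by_both_modes(Code code, Mode vector_mode, Mode lmul1_mode) {
  for (const Pattern &p : kPatterns)
    if (p.code == code && p.vector_mode == vector_mode &&
        p.lmul1_mode == lmul1_mode)
      return p.insn;
  return "";
}
```

In this model, lookup_by_vector_mode (Code::Max, Mode::VNx1QI) yields the ZVE128+ name whatever the target is, while lookup_by_both_modes (Code::Max, Mode::VNx1QI, Mode::VNx4QI) yields the ZVE32 name.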
---
 .../riscv/riscv-vector-builtins-bases.cc      |  13 +-
 gcc/config/riscv/vector-iterators.md          |  61 +++++
 gcc/config/riscv/vector.md                    | 208 +++++++++++++-----
 .../gcc.target/riscv/rvv/base/pr110265-1.c    |  13 ++
 .../gcc.target/riscv/rvv/base/pr110265-1.h    |  65 ++++++
 .../gcc.target/riscv/rvv/base/pr110265-2.c    |  14 ++
 .../gcc.target/riscv/rvv/base/pr110265-2.h    |  57 +++++
 .../gcc.target/riscv/rvv/base/pr110265-3.c    |  14 ++
 8 files changed, 385 insertions(+), 60 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.h
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.h
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-3.c

diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index 87a684dd127..53bd0ed2534 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -1396,8 +1396,17 @@ public:
   rtx expand (function_expander &e) const override
   {
-    return e.use_exact_insn (
-      code_for_pred_reduc (CODE, e.vector_mode (), e.vector_mode ()));
+    machine_mode mode = e.vector_mode ();
+    machine_mode ret_mode = e.ret_mode ();
+
+    /* TODO: we will use ret_mode after all types of PR110265 are addressed.  */
+    if ((GET_MODE_CLASS (MODE) == MODE_VECTOR_FLOAT)
+        || GET_MODE_INNER (mode) != GET_MODE_INNER (ret_mode))
+      return e.use_exact_insn (
+        code_for_pred_reduc (CODE, e.vector_mode (), e.vector_mode ()));
+    else
+      return e.use_exact_insn (
+        code_for_pred_reduc (CODE, e.vector_mode (), e.ret_mode ()));
   }
 };
diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
index 8c71c9e22cc..e2c8ade98eb 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -929,6 +929,67 @@ (define_mode_iterator V64T [
   (VNx2x64QI "TARGET_MIN_VLEN >= 128")
 ])
+(define_mode_iterator VQI [
+  (VNx1QI "TARGET_MIN_VLEN < 128")
+  VNx2QI
+  VNx4QI
+  VNx8QI
+  VNx16QI
+  VNx32QI
+  (VNx64QI "TARGET_MIN_VLEN > 32")
+  (VNx128QI "TARGET_MIN_VLEN >= 128")
+])
+
+(define_mode_iterator VHI [
+  (VNx1HI "TARGET_MIN_VLEN < 128")
+  VNx2HI
+  VNx4HI
+  VNx8HI
+  VNx16HI
+  (VNx32HI "TARGET_MIN_VLEN > 32")
+  (VNx64HI "TARGET_MIN_VLEN >= 128")
+])
+
+(define_mode_iterator VSI [
+  (VNx1SI "TARGET_MIN_VLEN < 128")
+  VNx2SI
+  VNx4SI
+  VNx8SI
+  (VNx16SI "TARGET_MIN_VLEN > 32")
+  (VNx32SI "TARGET_MIN_VLEN >= 128")
+])
+
+(define_mode_iterator VDI [
+  (VNx1DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN < 128")
+  (VNx2DI "TARGET_VECTOR_ELEN_64")
+  (VNx4DI "TARGET_VECTOR_ELEN_64")
+  (VNx8DI "TARGET_VECTOR_ELEN_64")
+  (VNx16DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128")
+])
+
+(define_mode_iterator VQI_LMUL1 [
+  (VNx16QI "TARGET_MIN_VLEN >= 128")
+  (VNx8QI "TARGET_MIN_VLEN == 64")
+  (VNx4QI "TARGET_MIN_VLEN == 32")
+])
+
+(define_mode_iterator VHI_LMUL1 [
+  (VNx8HI "TARGET_MIN_VLEN >= 128")
+  (VNx4HI "TARGET_MIN_VLEN == 64")
+  (VNx2HI "TARGET_MIN_VLEN == 32")
+])
+
+(define_mode_iterator VSI_LMUL1 [
+  (VNx4SI "TARGET_MIN_VLEN >= 128")
+  (VNx2SI "TARGET_MIN_VLEN == 64")
+  (VNx1SI "TARGET_MIN_VLEN == 32")
+])
+
+(define_mode_iterator VDI_LMUL1 [
+  (VNx2DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128")
+  (VNx1DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN == 64")
+])
+
 (define_mode_attr VLMULX2 [
   (VNx1QI "VNx2QI") (VNx2QI "VNx4QI") (VNx4QI "VNx8QI") (VNx8QI "VNx16QI") (VNx16QI "VNx32QI") (VNx32QI "VNx64QI") (VNx64QI "VNx128QI")
   (VNx1HI "VNx2HI") (VNx2HI "VNx4HI") (VNx4HI "VNx8HI") (VNx8HI "VNx16HI") (VNx16HI "VNx32HI") (VNx32HI "VNx64HI")
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index 1d1847bd85a..d396e278503 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -7244,76 +7244,168 @@ (define_insn "@pred_rod_trunc"
 ;; -------------------------------------------------------------------------------
 ;; For reduction operations, we should have seperate patterns for
-;; TARGET_MIN_VLEN == 32 and TARGET_MIN_VLEN > 32.
+;; different types. For each type, we will cover MIN_VLEN == 32, MIN_VLEN == 64
+;; and the MIN_VLEN >= 128 from the well defined iterators.
 ;; Since reduction need LMUL = 1 scalar operand as the input operand
 ;; and they are different.
 ;; For example, The LMUL = 1 corresponding mode of VNx16QImode is VNx4QImode
 ;; for -march=rv*zve32* wheras VNx8QImode for -march=rv*zve64*
-(define_insn "@pred_reduc_>"
-  [(set (match_operand: 0 "register_operand" "=vr, vr")
-        (unspec:
-          [(unspec:
-            [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1")
-             (match_operand 5 "vector_length_operand" " rK, rK")
-             (match_operand 6 "const_int_operand" " i, i")
-             (match_operand 7 "const_int_operand" " i, i")
+
+;; Integer Reduction for QI
+(define_insn "@pred_reduc_"
+  [
+    (set
+      (match_operand:VQI_LMUL1 0 "register_operand" "=vr, vr")
+      (unspec:VQI_LMUL1
+        [
+          (unspec:
+            [
+              (match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1")
+              (match_operand 5 "vector_length_operand" " rK, rK")
+              (match_operand 6 "const_int_operand" " i, i")
+              (match_operand 7 "const_int_operand" " i, i")
               (reg:SI VL_REGNUM)
-             (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
-           (any_reduc:VI
-             (vec_duplicate:VI
-               (vec_select:
-                 (match_operand: 4 "register_operand" " vr, vr")
-                 (parallel [(const_int 0)])))
-             (match_operand:VI 3 "register_operand" " vr, vr"))
-           (match_operand: 2 "vector_merge_operand" " vu, 0")] UNSPEC_REDUC))]
-  "TARGET_VECTOR && TARGET_MIN_VLEN >= 128"
+              (reg:SI VTYPE_REGNUM)
+            ] UNSPEC_VPREDICATE
+          )
+          (any_reduc:VQI
+            (vec_duplicate:VQI
+              (vec_select:
+                (match_operand:VQI_LMUL1 4 "register_operand" " vr, vr")
+                (parallel [(const_int 0)])
+              )
+            )
+            (match_operand:VQI 3 "register_operand" " vr, vr")
+          )
+          (match_operand:VQI_LMUL1 2 "vector_merge_operand" " vu, 0")
+        ] UNSPEC_REDUC
+      )
+    )
+  ]
+  "TARGET_VECTOR"
   "vred.vs\t%0,%3,%4%p1"
-  [(set_attr "type" "vired")
-   (set_attr "mode" "")])
+  [
+    (set_attr "type" "vired")
+    (set_attr "mode" "")
+  ]
+)
-(define_insn "@pred_reduc_"
-  [(set (match_operand: 0 "register_operand" "=vr, vr")
-        (unspec:
-          [(unspec:
-            [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1")
-             (match_operand 5 "vector_length_operand" " rK, rK")
-             (match_operand 6 "const_int_operand" " i, i")
-             (match_operand 7 "const_int_operand" " i, i")
+;; Integer Reduction for HI
+(define_insn "@pred_reduc_"
+  [
+    (set
+      (match_operand:VHI_LMUL1 0 "register_operand" "=vr, vr")
+      (unspec:VHI_LMUL1
+        [
+          (unspec:
+            [
+              (match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1")
+              (match_operand 5 "vector_length_operand" " rK, rK")
+              (match_operand 6 "const_int_operand" " i, i")
+              (match_operand 7 "const_int_operand" " i, i")
               (reg:SI VL_REGNUM)
-             (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
-           (any_reduc:VI_ZVE64
-             (vec_duplicate:VI_ZVE64
-               (vec_select:
-                 (match_operand: 4 "register_operand" " vr, vr")
-                 (parallel [(const_int 0)])))
-             (match_operand:VI_ZVE64 3 "register_operand" " vr, vr"))
-           (match_operand: 2 "vector_merge_operand" " vu, 0")] UNSPEC_REDUC))]
-  "TARGET_VECTOR && TARGET_MIN_VLEN == 64"
+              (reg:SI VTYPE_REGNUM)
+            ] UNSPEC_VPREDICATE
+          )
+          (any_reduc:VHI
+            (vec_duplicate:VHI
+              (vec_select:
+                (match_operand:VHI_LMUL1 4 "register_operand" " vr, vr")
+                (parallel [(const_int 0)])
+              )
+            )
+            (match_operand:VHI 3 "register_operand" " vr, vr")
+          )
+          (match_operand:VHI_LMUL1 2 "vector_merge_operand" " vu, 0")
+        ] UNSPEC_REDUC
+      )
+    )
+  ]
+  "TARGET_VECTOR"
   "vred.vs\t%0,%3,%4%p1"
-  [(set_attr "type" "vired")
-   (set_attr "mode" "")])
+  [
+    (set_attr "type" "vired")
+    (set_attr "mode" "")
+  ]
+)
-(define_insn "@pred_reduc_"
-  [(set (match_operand: 0 "register_operand" "=vd, vd, vr, vr")
-        (unspec:
-          [(unspec:
-            [(match_operand: 1 "vector_mask_operand" " vm, vm,Wc1,Wc1")
-             (match_operand 5 "vector_length_operand" " rK, rK, rK, rK")
-             (match_operand 6 "const_int_operand" " i, i, i, i")
-             (match_operand 7 "const_int_operand" " i, i, i, i")
+;; Integer Reduction for SI
+(define_insn "@pred_reduc_"
+  [
+    (set
+      (match_operand:VSI_LMUL1 0 "register_operand" "=vr, vr")
+      (unspec:VSI_LMUL1
+        [
+          (unspec:
+            [
+              (match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1")
+              (match_operand 5 "vector_length_operand" " rK, rK")
+              (match_operand 6 "const_int_operand" " i, i")
+              (match_operand 7 "const_int_operand" " i, i")
               (reg:SI VL_REGNUM)
-             (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
-           (any_reduc:VI_ZVE32
-             (vec_duplicate:VI_ZVE32
-               (vec_select:
-                 (match_operand: 4 "register_operand" " vr, vr, vr, vr")
-                 (parallel [(const_int 0)])))
-             (match_operand:VI_ZVE32 3 "register_operand" " vr, vr, vr, vr"))
-           (match_operand: 2 "vector_merge_operand" " vu, 0, vu, 0")] UNSPEC_REDUC))]
-  "TARGET_VECTOR && TARGET_MIN_VLEN == 32"
+              (reg:SI VTYPE_REGNUM)
+            ] UNSPEC_VPREDICATE
+          )
+          (any_reduc:VSI
+            (vec_duplicate:VSI
+              (vec_select:
+                (match_operand:VSI_LMUL1 4 "register_operand" " vr, vr")
+                (parallel [(const_int 0)])
+              )
+            )
+            (match_operand:VSI 3 "register_operand" " vr, vr")
+          )
+          (match_operand:VSI_LMUL1 2 "vector_merge_operand" " vu, 0")
+        ] UNSPEC_REDUC
+      )
+    )
+  ]
+  "TARGET_VECTOR"
   "vred.vs\t%0,%3,%4%p1"
-  [(set_attr "type" "vired")
-   (set_attr "mode" "")])
+  [
+    (set_attr "type" "vired")
+    (set_attr "mode" "")
+  ]
+)
+
+;; Integer Reduction for DI
+(define_insn "@pred_reduc_"
+  [
+    (set
+      (match_operand:VDI_LMUL1 0 "register_operand" "=vr, vr")
+      (unspec:VDI_LMUL1
+        [
+          (unspec:
+            [
+              (match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1")
+              (match_operand 5 "vector_length_operand" " rK, rK")
+              (match_operand 6 "const_int_operand" " i, i")
+              (match_operand 7 "const_int_operand" " i, i")
+              (reg:SI VL_REGNUM)
+              (reg:SI VTYPE_REGNUM)
+            ] UNSPEC_VPREDICATE
+          )
+          (any_reduc:VDI
+            (vec_duplicate:VDI
+              (vec_select:
+                (match_operand:VDI_LMUL1 4 "register_operand" " vr, vr")
+                (parallel [(const_int 0)])
+              )
+            )
+            (match_operand:VDI 3 "register_operand" " vr, vr")
+          )
+          (match_operand:VDI_LMUL1 2 "vector_merge_operand" " vu, 0")
+        ] UNSPEC_REDUC
+      )
+    )
+  ]
+  "TARGET_VECTOR"
+  "vred.vs\t%0,%3,%4%p1"
+  [
+    (set_attr "type" "vired")
+    (set_attr "mode" "")
+  ]
+)
 (define_insn "@pred_widen_reduc_plus"
   [(set (match_operand: 0 "register_operand" "=&vr, &vr")
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.c
new file mode 100644
index 00000000000..2e4aeb5b90b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gc_zve32f -mabi=ilp32f -O3 -Wno-psabi" } */
+
+#include "pr110265-1.h"
+
+/* { dg-final { scan-assembler-times {vredand\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredmax\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredmaxu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredmin\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredminu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredsum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vredxor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.h b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.h
new file mode 100644
index 00000000000..ade44cc27ea
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.h
@@ -0,0 +1,65 @@
+#include "riscv_vector.h"
+
+vint8m1_t test_vredand_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredand_vs_i8mf4_i8m1(vector, scalar, vl);
+}
+
+vuint32m1_t test_vredand_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) {
+  return __riscv_vredand_vs_u32m8_u32m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredmax_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredmax_vs_i8mf4_i8m1(vector, scalar, vl);
+}
+
+vint32m1_t test_vredmax_vs_i32m8_i32m1(vint32m8_t vector, vint32m1_t scalar, size_t vl) {
+  return __riscv_vredmax_vs_i32m8_i32m1(vector, scalar, vl);
+}
+
+vuint8m1_t test_vredmaxu_vs_u8mf4_u8m1(vuint8mf4_t vector, vuint8m1_t scalar, size_t vl) {
+  return __riscv_vredmaxu_vs_u8mf4_u8m1(vector, scalar, vl);
+}
+
+vuint32m1_t test_vredmaxu_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) {
+  return __riscv_vredmaxu_vs_u32m8_u32m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredmin_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredmin_vs_i8mf4_i8m1(vector, scalar, vl);
+}
+
+vint32m1_t test_vredmin_vs_i32m8_i32m1(vint32m8_t vector, vint32m1_t scalar, size_t vl) {
+  return __riscv_vredmin_vs_i32m8_i32m1(vector, scalar, vl);
+}
+
+vuint8m1_t test_vredminu_vs_u8mf4_u8m1(vuint8mf4_t vector, vuint8m1_t scalar, size_t vl) {
+  return __riscv_vredminu_vs_u8mf4_u8m1(vector, scalar, vl);
+}
+
+vuint32m1_t test_vredminu_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) {
+  return __riscv_vredminu_vs_u32m8_u32m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredor_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredor_vs_i8mf4_i8m1(vector, scalar, vl);
+}
+
+vuint32m1_t test_vredor_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) {
+  return __riscv_vredor_vs_u32m8_u32m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredsum_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredsum_vs_i8mf4_i8m1(vector, scalar, vl);
+}
+
+vuint32m1_t test_vredsum_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) {
+  return __riscv_vredsum_vs_u32m8_u32m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredxor_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredxor_vs_i8mf4_i8m1(vector, scalar, vl);
+}
+
+vuint32m1_t test_vredxor_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) {
+  return __riscv_vredxor_vs_u32m8_u32m1(vector, scalar, vl);
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.c
new file mode 100644
index 00000000000..7454c1cc918
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gc_zve64d -mabi=ilp32d -O3 -Wno-psabi" } */
+
+#include "pr110265-1.h"
+#include "pr110265-2.h"
+
+/* { dg-final { scan-assembler-times {vredand\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredmax\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 3 } } */
+/* { dg-final { scan-assembler-times {vredmaxu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredmin\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 3 } } */
+/* { dg-final { scan-assembler-times {vredminu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredsum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredxor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.h b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.h
new file mode 100644
index 00000000000..6a7e14e51f8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.h
@@ -0,0 +1,57 @@
+#include "riscv_vector.h"
+
+vint8m1_t test_vredand_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredand_vs_i8mf8_i8m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredmax_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredmax_vs_i8mf8_i8m1(vector, scalar, vl);
+}
+
+vuint8m1_t test_vredmaxu_vs_u8mf8_u8m1(vuint8mf8_t vector, vuint8m1_t scalar, size_t vl) {
+  return __riscv_vredmaxu_vs_u8mf8_u8m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredmin_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredmin_vs_i8mf8_i8m1(vector, scalar, vl);
+}
+
+vuint8m1_t test_vredminu_vs_u8mf8_u8m1(vuint8mf8_t vector, vuint8m1_t scalar, size_t vl) {
+  return __riscv_vredminu_vs_u8mf8_u8m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredor_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredor_vs_i8mf8_i8m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredsum_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredsum_vs_i8mf8_i8m1(vector, scalar, vl);
+}
+
+vint8m1_t test_vredxor_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) {
+  return __riscv_vredxor_vs_i8mf8_i8m1(vector, scalar, vl);
+}
+
+vuint64m1_t test_vredand_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) {
+  return __riscv_vredand_vs_u64m8_u64m1(vector, scalar, vl);
+}
+
+vuint64m1_t test_vredmaxu_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) {
+  return __riscv_vredmaxu_vs_u64m8_u64m1(vector, scalar, vl);
+}
+
+vuint64m1_t test_vredminu_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) {
+  return __riscv_vredminu_vs_u64m8_u64m1(vector, scalar, vl);
+}
+
+vuint64m1_t test_vredor_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) {
+  return __riscv_vredor_vs_u64m8_u64m1(vector, scalar, vl);
+}
+
+vuint64m1_t test_vredsum_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) {
+  return __riscv_vredsum_vs_u64m8_u64m1(vector, scalar, vl);
+}
+
+vuint64m1_t test_vredxor_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) {
+  return __riscv_vredxor_vs_u64m8_u64m1(vector, scalar, vl);
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-3.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-3.c
new file mode 100644
index 00000000000..0ed1fbae35a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-3.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gc_zve64f -mabi=ilp32f -O3 -Wno-psabi" } */
+
+#include "pr110265-1.h"
+#include "pr110265-2.h"
+
+/* { dg-final { scan-assembler-times {vredand\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredmax\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 3 } } */
+/* { dg-final { scan-assembler-times {vredmaxu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredmin\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 3 } } */
+/* { dg-final { scan-assembler-times {vredminu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredsum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vredxor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
--
2.34.1

--_000_MW5PR11MB5908D2B33D780404A6F66926A958AMW5PR11MB5908namp_--