From: Tamar Christina
To: Richard Biener
CC: "gcc-patches@gcc.gnu.org", nd, "jeffreyalaw@gmail.com"
Subject: RE: [PATCH 1/4]middle-end Support not decomposing specific divisions during vectorization.
Date: Tue, 8 Nov 2022 17:36:11 +0000

Ping.

> -----Original Message-----
> From: Tamar Christina
> Sent: Monday, October 31, 2022 11:35 AM
> To: Richard Biener
> Cc: gcc-patches@gcc.gnu.org; nd; jeffreyalaw@gmail.com
> Subject: RE: [PATCH 1/4]middle-end Support not decomposing specific
> divisions during vectorization.
>
> > >
> > The type of the expression should be available via the mode and the
> > signedness, no?  So maybe to avoid having both RTX and TREE on the
> > target hook pass it a wide_int instead for the divisor?
> >
>
> Done.
>
> Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu
> and no issues.
>
> Ok for master?
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
> 	* expmed.h (expand_divmod): Pass tree operands down in addition
> 	to RTX.
> 	* expmed.cc (expand_divmod): Likewise.
> 	* explow.cc (round_push, align_dynamic_address): Likewise.
> 	* expr.cc (force_operand, expand_expr_divmod): Likewise.
> 	* optabs.cc (expand_doubleword_mod, expand_doubleword_divmod):
> 	Likewise.
> 	* target.h: Include tree-core.
> 	* target.def (can_special_div_by_const): New.
> 	* targhooks.cc (default_can_special_div_by_const): New.
> 	* targhooks.h (default_can_special_div_by_const): New.
> 	* tree-vect-generic.cc (expand_vector_operation): Use it.
> 	* doc/tm.texi.in: Document it.
> 	* doc/tm.texi: Regenerate.
> 	* tree-vect-patterns.cc (vect_recog_divmod_pattern): Check for
> 	support.
> 	* tree-vect-stmts.cc (vectorizable_operation): Likewise.
>
> gcc/testsuite/ChangeLog:
>
> 	* gcc.dg/vect/vect-div-bitmask-1.c: New test.
> 	* gcc.dg/vect/vect-div-bitmask-2.c: New test.
> 	* gcc.dg/vect/vect-div-bitmask-3.c: New test.
> 	* gcc.dg/vect/vect-div-bitmask.h: New file.
>
> --- inline copy of patch ---
>
> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> index 92bda1a7e14a3c9ea63e151e4a49a818bf4d1bdb..a29f5c39be3f0927f8ef6e094c7a712c0604fb77 100644
> --- a/gcc/doc/tm.texi
> +++ b/gcc/doc/tm.texi
> @@ -6112,6 +6112,22 @@ instruction pattern.  There is no need for the hook
>  to handle these two implementation approaches itself.
>  @end deftypefn
>
> +@deftypefn {Target Hook} bool TARGET_VECTORIZE_CAN_SPECIAL_DIV_BY_CONST (enum @var{tree_code}, tree @var{vectype}, wide_int @var{constant}, rtx *@var{output}, rtx @var{in0}, rtx @var{in1})
> +This hook is used to test whether the target has a special method of
> +division of vectors of type @var{vectype} using the value @var{constant},
> +and producing a vector of type @var{vectype}.  The division
> +will then not be decomposed by the vectorizer and kept as a div.
> +
> +When the hook is being used to test whether the target supports a special
> +divide, @var{in0}, @var{in1}, and @var{output} are all null.  When the hook
> +is being used to emit a division, @var{in0} and @var{in1} are the source
> +vectors of type @var{vectype} and @var{output} is the destination vector of
> +type @var{vectype}.
> +
> +Return true if the operation is possible, emitting instructions for it
> +if rtxes are provided and updating @var{output}.
> +@end deftypefn
> +
>  @deftypefn {Target Hook} tree TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION (unsigned @var{code}, tree @var{vec_type_out}, tree @var{vec_type_in})
>  This hook should return the decl of a function that implements the
>  vectorized variant of the function with the @code{combined_fn} code
> diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
> index 112462310b134705d860153294287cfd7d4af81d..d5a745a02acdf051ea1da1b04076d058c24ce093 100644
> --- a/gcc/doc/tm.texi.in
> +++ b/gcc/doc/tm.texi.in
> @@ -4164,6 +4164,8 @@ address; but often a machine-dependent strategy can generate better code.
>
>  @hook TARGET_VECTORIZE_VEC_PERM_CONST
>
> +@hook TARGET_VECTORIZE_CAN_SPECIAL_DIV_BY_CONST
> +
>  @hook TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION
>
>  @hook TARGET_VECTORIZE_BUILTIN_MD_VECTORIZED_FUNCTION
> diff --git a/gcc/explow.cc b/gcc/explow.cc
> index ddb4d6ae3600542f8d2bb5617cdd3933a9fae6c0..568e0eb1a158c696458ae678f5e346bf34ba0036 100644
> --- a/gcc/explow.cc
> +++ b/gcc/explow.cc
> @@ -1037,7 +1037,7 @@ round_push (rtx size)
>       TRUNC_DIV_EXPR.  */
>    size = expand_binop (Pmode, add_optab, size, alignm1_rtx,
>                         NULL_RTX, 1, OPTAB_LIB_WIDEN);
> -  size = expand_divmod (0, TRUNC_DIV_EXPR, Pmode, size, align_rtx,
> +  size = expand_divmod (0, TRUNC_DIV_EXPR, Pmode, NULL, NULL, size, align_rtx,
>                          NULL_RTX, 1);
>    size = expand_mult (Pmode, size, align_rtx, NULL_RTX, 1);
>
> @@ -1203,7 +1203,7 @@ align_dynamic_address (rtx target, unsigned required_align)
>                           gen_int_mode (required_align / BITS_PER_UNIT - 1,
>                                         Pmode),
>                           NULL_RTX, 1, OPTAB_LIB_WIDEN);
> -  target = expand_divmod (0, TRUNC_DIV_EXPR, Pmode, target,
> +  target = expand_divmod (0, TRUNC_DIV_EXPR, Pmode, NULL, NULL, target,
>                            gen_int_mode (required_align / BITS_PER_UNIT,
>                                          Pmode),
>                            NULL_RTX, 1);
> diff --git a/gcc/expmed.h b/gcc/expmed.h
> index 0b2538c4c6bd51dfdc772ef70bdf631c0bed8717..0db2986f11ff4a4b10b59501c6f33cb3595659b5 100644
> --- a/gcc/expmed.h
> +++ b/gcc/expmed.h
> @@ -708,8 +708,9 @@ extern rtx expand_variable_shift (enum tree_code, machine_mode,
>  extern rtx expand_shift (enum tree_code, machine_mode, rtx, poly_int64, rtx,
>                           int);
>  #ifdef GCC_OPTABS_H
> -extern rtx expand_divmod (int, enum tree_code, machine_mode, rtx, rtx,
> -                          rtx, int, enum optab_methods = OPTAB_LIB_WIDEN);
> +extern rtx expand_divmod (int, enum tree_code, machine_mode, tree, tree,
> +                          rtx, rtx, rtx, int,
> +                          enum optab_methods = OPTAB_LIB_WIDEN);
>  #endif
>  #endif
>
> diff --git a/gcc/expmed.cc b/gcc/expmed.cc
> index 8d7418be418406e72a895ecddf2dc7fdb950c76c..bab020c07222afa38305ef8d7333f271b1965b78 100644
> --- a/gcc/expmed.cc
> +++ b/gcc/expmed.cc
> @@ -4222,8 +4222,8 @@ expand_sdiv_pow2 (scalar_int_mode mode, rtx op0, HOST_WIDE_INT d)
>
>  rtx
>  expand_divmod (int rem_flag, enum tree_code code, machine_mode mode,
> -               rtx op0, rtx op1, rtx target, int unsignedp,
> -               enum optab_methods methods)
> +               tree treeop0, tree treeop1, rtx op0, rtx op1, rtx target,
> +               int unsignedp, enum optab_methods methods)
>  {
>    machine_mode compute_mode;
>    rtx tquotient;
> @@ -4375,6 +4375,17 @@ expand_divmod (int rem_flag, enum tree_code code, machine_mode mode,
>
>    last_div_const = ! rem_flag && op1_is_constant ? INTVAL (op1) : 0;
>
> +  /* Check if the target has specific expansions for the division.  */
> +  tree cst;
> +  if (treeop0
> +      && treeop1
> +      && (cst = uniform_integer_cst_p (treeop1))
> +      && targetm.vectorize.can_special_div_by_const (code, TREE_TYPE (treeop0),
> +                                                     wi::to_wide (cst),
> +                                                     &target, op0, op1))
> +    return target;
> +
> +
>    /* Now convert to the best mode to use.  */
>    if (compute_mode != mode)
>      {
> @@ -4618,8 +4629,8 @@ expand_divmod (int rem_flag, enum tree_code code, machine_mode mode,
>                              || (optab_handler (sdivmod_optab, int_mode)
>                                  != CODE_FOR_nothing)))
>                        quotient = expand_divmod (0, TRUNC_DIV_EXPR,
> -                                                int_mode, op0,
> -                                                gen_int_mode (abs_d,
> +                                                int_mode, treeop0, treeop1,
> +                                                op0, gen_int_mode (abs_d,
>                                                                int_mode),
>                                                  NULL_RTX, 0);
>                      else
> @@ -4808,8 +4819,8 @@ expand_divmod (int rem_flag, enum tree_code code, machine_mode mode,
>                                      size - 1, NULL_RTX, 0);
>                  t3 = force_operand (gen_rtx_MINUS (int_mode, t1, nsign),
>                                      NULL_RTX);
> -                t4 = expand_divmod (0, TRUNC_DIV_EXPR, int_mode, t3, op1,
> -                                    NULL_RTX, 0);
> +                t4 = expand_divmod (0, TRUNC_DIV_EXPR, int_mode, treeop0,
> +                                    treeop1, t3, op1, NULL_RTX, 0);
>                  if (t4)
>                    {
>                      rtx t5;
> diff --git a/gcc/expr.cc b/gcc/expr.cc
> index 80bb1b8a4c5b8350fb1b8f57a99fd52e5882fcb6..b786f1d75e25f3410c0640cd96a8abc055fa34d9 100644
> --- a/gcc/expr.cc
> +++ b/gcc/expr.cc
> @@ -8028,16 +8028,17 @@ force_operand (rtx value, rtx target)
>            return expand_divmod (0,
>                                  FLOAT_MODE_P (GET_MODE (value))
>                                  ? RDIV_EXPR : TRUNC_DIV_EXPR,
> -                                GET_MODE (value), op1, op2, target, 0);
> +                                GET_MODE (value), NULL, NULL, op1, op2,
> +                                target, 0);
>          case MOD:
> -          return expand_divmod (1, TRUNC_MOD_EXPR, GET_MODE (value), op1, op2,
> -                                target, 0);
> +          return expand_divmod (1, TRUNC_MOD_EXPR, GET_MODE (value), NULL, NULL,
> +                                op1, op2, target, 0);
>          case UDIV:
> -          return expand_divmod (0, TRUNC_DIV_EXPR, GET_MODE (value), op1, op2,
> -                                target, 1);
> +          return expand_divmod (0, TRUNC_DIV_EXPR, GET_MODE (value), NULL, NULL,
> +                                op1, op2, target, 1);
>          case UMOD:
> -          return expand_divmod (1, TRUNC_MOD_EXPR, GET_MODE (value), op1, op2,
> -                                target, 1);
> +          return expand_divmod (1, TRUNC_MOD_EXPR, GET_MODE (value), NULL, NULL,
> +                                op1, op2, target, 1);
>          case ASHIFTRT:
>            return expand_simple_binop (GET_MODE (value), code, op1, op2,
>                                        target, 0, OPTAB_LIB_WIDEN);
> @@ -8990,11 +8991,13 @@ expand_expr_divmod (tree_code code, machine_mode mode, tree treeop0,
>        bool speed_p = optimize_insn_for_speed_p ();
>        do_pending_stack_adjust ();
>        start_sequence ();
> -      rtx uns_ret = expand_divmod (mod_p, code, mode, op0, op1, target, 1);
> +      rtx uns_ret = expand_divmod (mod_p, code, mode, treeop0, treeop1,
> +                                   op0, op1, target, 1);
>        rtx_insn *uns_insns = get_insns ();
>        end_sequence ();
>        start_sequence ();
> -      rtx sgn_ret = expand_divmod (mod_p, code, mode, op0, op1, target, 0);
> +      rtx sgn_ret = expand_divmod (mod_p, code, mode, treeop0, treeop1,
> +                                   op0, op1, target, 0);
>        rtx_insn *sgn_insns = get_insns ();
>        end_sequence ();
>        unsigned uns_cost = seq_cost (uns_insns, speed_p);
> @@ -9016,7 +9019,8 @@ expand_expr_divmod (tree_code code, machine_mode mode, tree treeop0,
>        emit_insn (sgn_insns);
>        return sgn_ret;
>      }
> -  return expand_divmod (mod_p, code, mode, op0, op1, target, unsignedp);
> +  return expand_divmod (mod_p, code, mode, treeop0, treeop1,
> +                        op0, op1, target, unsignedp);
>  }
>
>  rtx
> diff --git a/gcc/optabs.cc b/gcc/optabs.cc
> index 165f8d1fa22432b96967c69a58dbb7b4bf18120d..cff37ccb0dfc3dd79b97d0abfd872f340855dc96 100644
> --- a/gcc/optabs.cc
> +++ b/gcc/optabs.cc
> @@ -1104,8 +1104,9 @@ expand_doubleword_mod (machine_mode mode, rtx op0, rtx op1, bool unsignedp)
>                  return NULL_RTX;
>              }
>          }
> -      rtx remainder = expand_divmod (1, TRUNC_MOD_EXPR, word_mode, sum,
> -                                     gen_int_mode (INTVAL (op1), word_mode),
> +      rtx remainder = expand_divmod (1, TRUNC_MOD_EXPR, word_mode, NULL, NULL,
> +                                     sum, gen_int_mode (INTVAL (op1),
> +                                                        word_mode),
>                                       NULL_RTX, 1, OPTAB_DIRECT);
>        if (remainder == NULL_RTX)
>          return NULL_RTX;
> @@ -1208,8 +1209,8 @@ expand_doubleword_divmod (machine_mode mode, rtx op0, rtx op1, rtx *rem,
>
>    if (op11 != const1_rtx)
>      {
> -      rtx rem2 = expand_divmod (1, TRUNC_MOD_EXPR, mode, quot1, op11,
> -                                NULL_RTX, unsignedp, OPTAB_DIRECT);
> +      rtx rem2 = expand_divmod (1, TRUNC_MOD_EXPR, mode, NULL, NULL, quot1,
> +                                op11, NULL_RTX, unsignedp, OPTAB_DIRECT);
>        if (rem2 == NULL_RTX)
>          return NULL_RTX;
>
> @@ -1223,8 +1224,8 @@ expand_doubleword_divmod (machine_mode mode, rtx op0, rtx op1, rtx *rem,
>        if (rem2 == NULL_RTX)
>          return NULL_RTX;
>
> -      rtx quot2 = expand_divmod (0, TRUNC_DIV_EXPR, mode, quot1, op11,
> -                                 NULL_RTX, unsignedp, OPTAB_DIRECT);
> +      rtx quot2 = expand_divmod (0, TRUNC_DIV_EXPR, mode, NULL, NULL, quot1,
> +                                 op11, NULL_RTX, unsignedp, OPTAB_DIRECT);
>        if (quot2 == NULL_RTX)
>          return NULL_RTX;
>
> diff --git a/gcc/target.def b/gcc/target.def
> index 2a7fa68f83dd15dcdd2c332e8431e6142ec7d305..f491e2233cf18760631f148dacf18d0e0b133e4c 100644
> --- a/gcc/target.def
> +++ b/gcc/target.def
> @@ -1902,6 +1902,25 @@ implementation approaches itself.",
>          const vec_perm_indices &sel),
>   NULL)
>
> +DEFHOOK
> +(can_special_div_by_const,
> + "This hook is used to test whether the target has a special method of\n\
> +division of vectors of type @var{vectype} using the value @var{constant},\n\
> +and producing a vector of type @var{vectype}.  The division\n\
> +will then not be decomposed by the vectorizer and kept as a div.\n\
> +\n\
> +When the hook is being used to test whether the target supports a special\n\
> +divide, @var{in0}, @var{in1}, and @var{output} are all null.  When the hook\n\
> +is being used to emit a division, @var{in0} and @var{in1} are the source\n\
> +vectors of type @var{vectype} and @var{output} is the destination vector of\n\
> +type @var{vectype}.\n\
> +\n\
> +Return true if the operation is possible, emitting instructions for it\n\
> +if rtxes are provided and updating @var{output}.",
> + bool, (enum tree_code, tree vectype, wide_int constant, rtx *output,
> +        rtx in0, rtx in1),
> + default_can_special_div_by_const)
> +
>  /* Return true if the target supports misaligned store/load of a
>     specific factor denoted in the third parameter.  The last parameter
>     is true if the access is defined in a packed struct.  */
> diff --git a/gcc/target.h b/gcc/target.h
> index d6fa6931499d15edff3e5af3e429540d001c7058..c836036ac7fa7910d62bd3da56f39c061f68b665 100644
> --- a/gcc/target.h
> +++ b/gcc/target.h
> @@ -51,6 +51,7 @@
>  #include "insn-codes.h"
>  #include "tm.h"
>  #include "hard-reg-set.h"
> +#include "tree-core.h"
>
>  #if CHECKING_P
>
> diff --git a/gcc/targhooks.h b/gcc/targhooks.h
> index ecce55ebe797cedc940620e8d89816973a045d49..c8df2af02b9d8c41d953b7887dd980b1a7c5cf1c 100644
> --- a/gcc/targhooks.h
> +++ b/gcc/targhooks.h
> @@ -207,6 +207,8 @@ extern void default_addr_space_diagnose_usage (addr_space_t, location_t);
>  extern rtx default_addr_space_convert (rtx, tree, tree);
>  extern unsigned int default_case_values_threshold (void);
>  extern bool default_have_conditional_execution (void);
> +extern bool default_can_special_div_by_const (enum tree_code, tree, wide_int,
> +                                              rtx *, rtx, rtx);
>
>  extern bool default_libc_has_function (enum function_class, tree);
>  extern bool default_libc_has_fast_function (int fcode);
> diff --git a/gcc/targhooks.cc b/gcc/targhooks.cc
> index b15ae19bcb60c59ae8112e67b5f06a241a9bdbf1..f941b1c218d3c4de8b7f780b69fe04593ae3419e 100644
> --- a/gcc/targhooks.cc
> +++ b/gcc/targhooks.cc
> @@ -1807,6 +1807,14 @@ default_have_conditional_execution (void)
>    return HAVE_conditional_execution;
>  }
>
> +/* Default that no division by constant operations are special.  */
> +bool
> +default_can_special_div_by_const (enum tree_code, tree, wide_int, rtx *, rtx,
> +                                  rtx)
> +{
> +  return false;
> +}
> +
>  /* By default we assume that c99 functions are present at the runtime,
>     but sincos is not.  */
>  bool
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-div-bitmask-1.c b/gcc/testsuite/gcc.dg/vect/vect-div-bitmask-1.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..472cd710534bc8aa9b1b4916f3d7b4d5b64a19b9
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/vect-div-bitmask-1.c
> @@ -0,0 +1,25 @@
> +/* { dg-require-effective-target vect_int } */
> +
> +#include <stdint.h>
> +#include "tree-vect.h"
> +
> +#define N 50
> +#define TYPE uint8_t
> +
> +__attribute__((noipa, noinline, optimize("O1")))
> +void fun1(TYPE* restrict pixel, TYPE level, int n)
> +{
> +  for (int i = 0; i < n; i+=1)
> +    pixel[i] = (pixel[i] * level) / 0xff;
> +}
> +
> +__attribute__((noipa, noinline, optimize("O3")))
> +void fun2(TYPE* restrict pixel, TYPE level, int n)
> +{
> +  for (int i = 0; i < n; i+=1)
> +    pixel[i] = (pixel[i] * level) / 0xff;
> +}
> +
> +#include "vect-div-bitmask.h"
> +
> +/* { dg-final { scan-tree-dump-not "vect_recog_divmod_pattern: detected" "vect" { target aarch64*-*-* } } } */
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-div-bitmask-2.c b/gcc/testsuite/gcc.dg/vect/vect-div-bitmask-2.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..e904a71885b2e8487593a2cd3db75b3e4112e2cc
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/vect-div-bitmask-2.c
> @@ -0,0 +1,25 @@
> +/* { dg-require-effective-target vect_int } */
> +
> +#include <stdint.h>
> +#include "tree-vect.h"
> +
> +#define N 50
> +#define TYPE uint16_t
> +
> +__attribute__((noipa, noinline, optimize("O1")))
> +void fun1(TYPE* restrict pixel, TYPE level, int n)
> +{
> +  for (int i = 0; i < n; i+=1)
> +    pixel[i] = (pixel[i] * level) / 0xffffU;
> +}
> +
> +__attribute__((noipa, noinline, optimize("O3")))
> +void fun2(TYPE* restrict pixel, TYPE level, int n)
> +{
> +  for (int i = 0; i < n; i+=1)
> +    pixel[i] = (pixel[i] * level) / 0xffffU;
> +}
> +
> +#include "vect-div-bitmask.h"
> +
> +/* { dg-final { scan-tree-dump-not "vect_recog_divmod_pattern: detected" "vect" { target aarch64*-*-* } } } */
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-div-bitmask-3.c b/gcc/testsuite/gcc.dg/vect/vect-div-bitmask-3.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..a1418ebbf5ea8731ed4e3e720157701d9d1cf852
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/vect-div-bitmask-3.c
> @@ -0,0 +1,26 @@
> +/* { dg-require-effective-target vect_int } */
> +/* { dg-additional-options "-fno-vect-cost-model" { target aarch64*-*-* } } */
> +
> +#include <stdint.h>
> +#include "tree-vect.h"
> +
> +#define N 50
> +#define TYPE uint32_t
> +
> +__attribute__((noipa, noinline, optimize("O1")))
> +void fun1(TYPE* restrict pixel, TYPE level, int n)
> +{
> +  for (int i = 0; i < n; i+=1)
> +    pixel[i] = (pixel[i] * (uint64_t)level) / 0xffffffffUL;
> +}
> +
> +__attribute__((noipa, noinline, optimize("O3")))
> +void fun2(TYPE* restrict pixel, TYPE level, int n)
> +{
> +  for (int i = 0; i < n; i+=1)
> +    pixel[i] = (pixel[i] * (uint64_t)level) / 0xffffffffUL;
> +}
> +
> +#include "vect-div-bitmask.h"
> +
> +/* { dg-final { scan-tree-dump-not "vect_recog_divmod_pattern: detected" "vect" { target aarch64*-*-* } } } */
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-div-bitmask.h b/gcc/testsuite/gcc.dg/vect/vect-div-bitmask.h
> new file mode 100644
> index 0000000000000000000000000000000000000000..29a16739aa4b706616367bfd1832f28ebd07993e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/vect-div-bitmask.h
> @@ -0,0 +1,43 @@
> +#include <stdio.h>
> +
> +#ifndef N
> +#define N 65
> +#endif
> +
> +#ifndef TYPE
> +#define TYPE uint32_t
> +#endif
> +
> +#ifndef DEBUG
> +#define DEBUG 0
> +#endif
> +
> +#define BASE ((TYPE) -1 < 0 ? -126 : 4)
> +
> +int main ()
> +{
> +  TYPE a[N];
> +  TYPE b[N];
> +
> +  for (int i = 0; i < N; ++i)
> +    {
> +      a[i] = BASE + i * 13;
> +      b[i] = BASE + i * 13;
> +      if (DEBUG)
> +        printf ("%d: 0x%x\n", i, a[i]);
> +    }
> +
> +  fun1 (a, N / 2, N);
> +  fun2 (b, N / 2, N);
> +
> +  for (int i = 0; i < N; ++i)
> +    {
> +      if (DEBUG)
> +        printf ("%d = 0x%x == 0x%x\n", i, a[i], b[i]);
> +
> +      if (a[i] != b[i])
> +        __builtin_abort ();
> +    }
> +  return 0;
> +}
> +
> diff --git a/gcc/tree-vect-generic.cc b/gcc/tree-vect-generic.cc
> index 350129555a0c71c0896c4f1003163f3b3557c11b..6ad6372c55eef94a742a8fa35e79d66aa24e2f3b 100644
> --- a/gcc/tree-vect-generic.cc
> +++ b/gcc/tree-vect-generic.cc
> @@ -1237,6 +1237,17 @@ expand_vector_operation (gimple_stmt_iterator *gsi, tree type, tree compute_type
>            tree rhs2 = gimple_assign_rhs2 (assign);
>            tree ret;
>
> +          /* Check if the target was going to handle it through the special
> +             division callback hook.  */
> +          tree cst = uniform_integer_cst_p (rhs2);
> +          if (cst &&
> +              targetm.vectorize.can_special_div_by_const (code, type,
> +                                                          wi::to_wide (cst),
> +                                                          NULL,
> +                                                          NULL_RTX, NULL_RTX))
> +            return NULL_TREE;
> +
> +
>            if (!optimize
>                || !VECTOR_INTEGER_TYPE_P (type)
>                || TREE_CODE (rhs2) != VECTOR_CST
> diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
> index 09574bb1a2696b3438a4ce9f09f74b42e784aca0..e91bcef56fff931a7a7ba534a0affd56e7314370 100644
> --- a/gcc/tree-vect-patterns.cc
> +++ b/gcc/tree-vect-patterns.cc
> @@ -3432,7 +3432,7 @@ vect_recog_divmod_pattern (vec_info *vinfo,
>    gimple *pattern_stmt, *def_stmt;
>    enum tree_code rhs_code;
>    optab optab;
> -  tree q;
> +  tree q, cst;
>    int dummy_int, prec;
>
>    if (!is_gimple_assign (last_stmt))
> @@ -3596,6 +3596,14 @@ vect_recog_divmod_pattern (vec_info *vinfo,
>
>        return pattern_stmt;
>      }
> +  else if ((cst = uniform_integer_cst_p (oprnd1))
> +           && targetm.vectorize.can_special_div_by_const (rhs_code, vectype,
> +                                                          wi::to_wide (cst),
> +                                                          NULL, NULL_RTX,
> +                                                          NULL_RTX))
> +    {
> +      return NULL;
> +    }
>
>    if (prec > HOST_BITS_PER_WIDE_INT
>        || integer_zerop (oprnd1))
> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> index c9dab217f059f17e91e9a7582523e627d7a45b66..1399c22ba0df75f582887d7e83b67e3ea53d25f4 100644
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -6260,6 +6260,14 @@ vectorizable_operation (vec_info *vinfo,
>          }
>        target_support_p = (optab_handler (optab, vec_mode)
>                            != CODE_FOR_nothing);
> +      tree cst;
> +      if (!target_support_p
> +          && (cst = uniform_integer_cst_p (op1)))
> +        target_support_p
> +          = targetm.vectorize.can_special_div_by_const (code, vectype,
> +                                                        wi::to_wide (cst),
> +                                                        NULL, NULL_RTX,
> +                                                        NULL_RTX);
>      }
>
>    bool using_emulated_vectors_p = vect_emulated_vector_p (vectype);
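
As a reading aid for the new tests, the stand-alone C sketch below illustrates why a target might prefer to keep a division by 0xff whole instead of having it decomposed. It is not part of the patch, and the patch does not say which sequence any target would emit. The shift-and-add identity used here is a well-known one chosen purely as an assumption for illustration, and the names scale_ref, scale_special and count_mismatches are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Reference computation: the same expression as fun1/fun2 in
   vect-div-bitmask-1.c, using an ordinary truncating division.  */
uint8_t scale_ref (uint8_t pixel, uint8_t level)
{
  return (uint8_t) ((pixel * level) / 0xff);
}

/* Illustrative replacement (an assumption, not taken from the patch):
   for 0 <= x <= 0xffff, x / 0xff == (x + ((x + 257) >> 8)) >> 8,
   provided the arithmetic is done in a type wider than 16 bits.  */
uint8_t scale_special (uint8_t pixel, uint8_t level)
{
  uint32_t x = (uint32_t) pixel * level;
  return (uint8_t) ((x + ((x + 257) >> 8)) >> 8);
}

/* Compare both versions over every pair of 8-bit inputs and return
   the number of pairs on which they disagree.  */
unsigned count_mismatches (void)
{
  unsigned bad = 0;
  for (unsigned p = 0; p < 256; ++p)
    for (unsigned l = 0; l < 256; ++l)
      if (scale_ref ((uint8_t) p, (uint8_t) l)
          != scale_special ((uint8_t) p, (uint8_t) l))
        ++bad;
  return bad;
}
```

Calling count_mismatches () checks all 65536 input pairs, which mirrors what the fun1/fun2 comparison in vect-div-bitmask.h does on a smaller sample: whatever sequence a target's hook emits has to agree with the plain division.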