From: Tamar Christina
To: Richard Sandiford, Richard Biener
CC: "gcc-patches@gcc.gnu.org", nd
Subject: RE: [PATCH 1/2]middle-end Support optimized division by pow2 bitmask
Date: Wed, 22 Jun 2022 00:34:53 +0000
References: <2p382n54-427o-8q82-6o45-p2nn6869opr5@fhfr.qr>
List-Id: Gcc-patches mailing list

> -----Original Message-----
> From: Tamar Christina
> Sent: Tuesday, June 14, 2022 4:58 PM
> To: Richard Sandiford; Richard Biener
> Cc: gcc-patches@gcc.gnu.org; nd
> Subject: RE: [PATCH 1/2]middle-end Support optimized division by pow2
> bitmask
>
>
>
> > -----Original Message-----
> > From: Richard Sandiford
> > Sent: Tuesday, June 14, 2022 2:43 PM
> > To: Richard Biener
> > Cc: Tamar Christina; gcc-patches@gcc.gnu.org; nd
> > Subject: Re: [PATCH 1/2]middle-end Support optimized division by pow2
> > bitmask
> >
> > Richard Biener writes:
> > > On Mon, 13 Jun 2022, Tamar Christina wrote:
> > >
> > >> > -----Original Message-----
> > >> > From: Richard Biener
> > >> > Sent: Monday, June 13, 2022 12:48 PM
> > >> > To: Tamar Christina
> > >> > Cc: gcc-patches@gcc.gnu.org; nd; Richard Sandiford
> > >> > Subject: RE: [PATCH 1/2]middle-end Support optimized division by
> > >> > pow2 bitmask
> > >> >
> > >> > On Mon, 13 Jun 2022, Tamar Christina wrote:
> > >> >
> > >> > > > -----Original Message-----
> > >> > > > From: Richard Biener
> > >> > > > Sent: Monday, June 13, 2022 10:39 AM
> > >> > > > To: Tamar Christina
> > >> > > > Cc: gcc-patches@gcc.gnu.org; nd; Richard Sandiford
> > >> > > > Subject: Re: [PATCH 1/2]middle-end Support optimized division
> > >> > > > by pow2 bitmask
> > >> > > >
> > >> > > > On Mon, 13 Jun 2022, Richard Biener wrote:
> > >> > > >
> > >> > > > > On Thu, 9 Jun 2022, Tamar Christina wrote:
> > >> > > > >
> > >> > > > > > Hi All,
> > >> > > > > >
> > >> > > > > > In plenty of image and video processing code it's common
> > >> > > > > > to modify pixel values by a widening operation and then
> > >> > > > > > scale them back into range by dividing by 255.
> > >> > > > > >
> > >> > > > > > This patch adds an optab to allow us to emit an optimized
> > >> > > > > > sequence when doing an unsigned division that is
> > >> > > > > > equivalent to:
> > >> > > > > >
> > >> > > > > > x = y / (2 ^ (bitsize (y)/2)-1)
> > >> > > > > >
> > >> > > > > > Bootstrapped Regtested on aarch64-none-linux-gnu,
> > >> > > > > > x86_64-pc-linux-gnu and no issues.
> > >> > > > > >
> > >> > > > > > Ok for master?
> > >> > > > >
> > >> > > > > Looking at 2/2 it seems that this is the wrong way to
> > >> > > > > attack the problem.  The ISA doesn't have such an
> > >> > > > > instruction so adding an optab looks premature.  I suppose
> > >> > > > > that there's no unsigned vector integer division and thus
> > >> > > > > we open-code that in a different way?  Isn't the correct
> > >> > > > > thing then to fixup that open-coding if it is more
> > >> > > > > efficient?
> > >> > > > >
> > >> > > >
> > >> > > The problem is that even if you fixup the open-coding it would
> > >> > > need to be something target specific? The sequence of
> > >> > > instructions we generate doesn't have a GIMPLE representation.
> > >> > > So whatever is generated I'd have to fixup in RTL then.
> > >> >
> > >> > What's the operation that doesn't have a GIMPLE representation?
> > >>
> > >> For NEON we use two operations:
> > >> 1. Add High narrowing lowpart, essentially doing
> > >>    (a +w b) >>.n bitsize(a)/2
> > >>    Where the + widens and the >> narrows.  So you give it two
> > >>    shorts, get a byte
> > >> 2. Add widening add of lowpart so basically lowpart (a +w b)
> > >>
> > >> For SVE2 we use a different sequence, we use two back-to-back
> > >> sequences of:
> > >> 1. Add narrow high part (bottom).  In SVE the Top and Bottom
> > >>    instructions select Even and odd elements of the vector rather
> > >>    than "top half" and "bottom half".
> > >>
> > >>    So this instruction does: Add each vector element of the first
> > >>    source vector to the corresponding vector element of the second
> > >>    source vector, and place the most significant half of the result
> > >>    in the even-numbered half-width destination elements, while
> > >>    setting the odd-numbered elements to zero.
> > >>
> > >> So there's an explicit permute in there. The instructions are
> > >> sufficiently different that there wouldn't be a single GIMPLE
> > >> representation.
> > >
> > > I see.
> > > Are these also useful to express scalar integer division?
> > >
> > > I'll defer to others to ack the special udiv_pow2_bitmask optab or
> > > suggest some piecemeal things other targets might be able to do as
> > > well.  It does look very special.  I'd also bikeshed it to
> > > udiv_pow2m1 since 'bitmask' is less obvious than 2^n-1 (assuming I
> > > interpreted 'bitmask' correctly ;)).  It seems to be even less
> > > general since it is a unary op and the actual divisor is
> > > constrained by the mode itself?
> >
> > Yeah, those were my concerns as well.  For n-bit numbers, the same
> > kind of arithmetic transformation can be used for any 2^m-1 for m in
> > [n/2, n), so from a target-independent point of view, m==n/2 isn't
> > particularly special.
> > Hard-coding one value of m would make sense if there was an underlying
> > instruction that did exactly this, but like you say, there isn't.
> >
> > Would a compromise be to define an optab for ADDHN and then add a
> > vector pattern for this division that (at least initially) prefers
> > ADDHN over the current approach whenever ADDHN is available?  We could
> > then adapt the conditions on the pattern if other targets also provide
> > ADDHN but don't want this transform.  (I think the other instructions
> > in the pattern already have optabs.)
> >
> > That still leaves open the question about what to do about SVE2, but
> > the underlying problem there is that the vectoriser doesn't know about
> > the B/T layout.
>
> Wouldn't it be better to just generalize the optab and to pass on the mask?
> I'd prefer to do that than teach the vectorizer about ADDHN (which can't be
> easily done now) let alone teaching it about B/T.  It also seems somewhat
> unnecessary to diverge the implementation here in the mid-end. After all,
> you can generate better SSE code here as well, so focusing on generating ISA
> specific code from here for each ISA seems like the wrong approach to me.

Ping, is there any consensus here?

Thanks,
Tamar

>
> Thanks,
> Tamar
>
> > 
> > Thanks,
> > Richard
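
For anyone following the thread, the scalar arithmetic behind the division discussed above can be sketched in C. This is only an illustration and not code from the patch: the helper names `udiv_pow2m1` and `count_mismatches` are hypothetical, and the formula (add 2^m + 1, shift right by m, add the result back, shift right by m again) is an assumption about one well-known way to divide by 2^m-1. It is written here for n = 16 and m in [n/2, n), the range mentioned in the thread. The 32-bit intermediates stand in for the widening adds.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the patch): compute x / (2^m - 1) for a
   16-bit x using only adds and shifts.  Intermediates are 32 bits wide
   so the additions cannot wrap.  Assumes m is in [8, 16), i.e. [n/2, n)
   for n = 16.  */
static uint32_t
udiv_pow2m1 (uint16_t x, unsigned m)
{
  assert (m >= 8 && m < 16);
  uint32_t wide = x;
  /* First estimate of the quotient, rounded up by the 2^m + 1 bias.  */
  uint32_t t = (wide + (1u << m) + 1u) >> m;
  /* Adding the estimate back and shifting again gives the quotient.  */
  return (wide + t) >> m;
}

/* Compare the shortcut with ordinary division for every 16-bit input
   and every supported m; returns the number of mismatches found.  */
static unsigned
count_mismatches (void)
{
  unsigned bad = 0;
  for (unsigned m = 8; m < 16; m++)
    for (uint32_t x = 0; x <= UINT16_MAX; x++)
      if (udiv_pow2m1 ((uint16_t) x, m) != x / ((1u << m) - 1u))
        bad++;
  return bad;
}
```

With m = 8 this is the divide-by-255 case for 16-bit inputs, for example `udiv_pow2m1 (65025, 8)` for the product 255 * 255. Calling `count_mismatches ()` checks the shortcut exhaustively against ordinary division, which is a cheap way to gain confidence before relying on the identity for other values of m.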