From: Andrew Stubbs
Date: Thu, 29 Sep 2022 10:50:09 +0100
Subject: Re: [PATCH] vect: while_ult for integer mask
To: Richard Biener, Richard Sandiford
CC: "gcc-patches@gcc.gnu.org"
References: <87180de9-d0d4-b92f-405f-100aca3d5cf8@codesourcery.com>

On 29/09/2022 08:52, Richard Biener wrote:
> On Wed, Sep 28, 2022 at 5:06 PM Andrew Stubbs wrote:
>>
>> This patch is a prerequisite for some amdgcn patches I'm working on to
>> support shorter vector lengths (having fixed 64 lanes tends to miss
>> optimizations, and masking is not supported everywhere yet).
>>
>> The problem is that, unlike AArch64, I'm not using different mask modes
>> for different sized vectors, so all loops end up using the while_ultsidi
>> pattern, regardless of vector length. In theory I could use SImode for
>> V32, HImode for V16, etc., but there's no mode to fit V4 or V2 so
>> something else is needed. Moving to using vector masks in the backend
>> is not a natural fit for GCN, and would be a huge task in any case.
>>
>> This patch adds an additional length operand so that we can distinguish
>> the different uses in the back end and don't end up with more lanes
>> enabled than there ought to be.
>>
>> I've made the extra operand conditional on the mode so that I don't have
>> to modify the AArch64 backend; that uses while_ family of
>> operators in a lot of places and uses iterators, so it would end up
>> touching a lot of code just to add an inactive operand, plus I don't
>> have a way to test it properly. I've confirmed that AArch64 builds and
>> expands while_ult correctly in a simple example.
>>
>> OK for mainline?
>
> Hmm, but you could introduce BI4mode and BI2mode for V4 and V2, no?
> Not sure if it is possible to have two partial integer modes and use those.

When we first tried to do this port we tried to use V64BImode for masks
and got into a horrible mess. DImode works much better. In any case, at
this point retrofitting new mask types into the back end would be a big
job.

We also have the problem that the mask register is actually two 32-bit
registers so if you try to use smaller modes the compiler ends up
leaving the high part undefined and bad things happen.
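
As a rough illustration of what the extra length operand buys with an
integer mask, here is a plain C model. It is only a sketch, not GCC
internals and not the patch itself: the function name while_ult_mask and
the parameter nlanes are hypothetical, and it assumes lane i is active
while start + i < end.

```c
#include <stdint.h>

/* Hypothetical model of a WHILE_ULT result held in a 64-bit integer mask.
   Bit i is set while start + i < end (unsigned), stopping at the first
   lane that fails.  Only the low NLANES bits can ever be set, so every
   bit above the notional vector length is a well-defined zero.  */
uint64_t
while_ult_mask (uint64_t start, uint64_t end, unsigned nlanes)
{
  uint64_t mask = 0;

  for (unsigned i = 0; i < nlanes && i < 64; i++)
    {
      if (!(start + i < end))
        break;
      mask |= (uint64_t) 1 << i;
    }

  return mask;
}
```

In this model while_ult_mask (0, 100, 4) gives 0xf, whereas the same
bounds with nlanes set to 64 give all 64 bits set, which is more lanes
than a V4 loop ought to enable.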

Basically, regardless of the notional size of the vector, the mask
really is 64-bit, and the high bits really do have to be well defined
(to zero).

The problem is simply that while_ult has lost information in the
lowering and expanding process. The size of the vector was clear in
gimple, but lost in RTL.

Andrew