From: Andrew Stubbs
To: Richard Biener
CC: GCC Development
Subject: Re: Libgcc divide vectorization question
Date: Wed, 22 Mar 2023 11:02:44 +0000
Message-ID: <7f227691-7c3d-599b-ed24-6fde2ce3c11d@codesourcery.com>

On 22/03/2023 10:09, Richard Biener wrote:
> On Tue, Mar 21, 2023 at 6:00 PM Andrew Stubbs wrote:
>>
>> Hi all,
>>
>> I want to be able to vectorize divide operators (softfp and integer),
>> but amdgcn only has hardware instructions suitable for -ffast-math.
>>
>> We have recently implemented vector versions of all the libm functions,
>> but the libgcc functions aren't builtins and therefore don't use those
>> hooks.
>>
>> What's the best way to achieve this? Add a new __builtin_div (and
>> __builtin_mod) that tree-vectorize can find, perhaps? Or something else?
>
> What do you want to do? Vectorize the out-of-line libgcc copy? Or
> emit inline vectorized code for int/softfp operations? In the latter
> case just emit the code from the pattern expanders?

I'd like to investigate having vectorized versions of the libgcc instruction functions, like we do for libm.

The inline code expansion is certainly an option, but I think there's quite a lot of code in those routines. I know how to do that option at least (except, maybe not the errno handling without making assumptions about the C runtime).

Basically, the -ffast-math instructions will always be the fastest way, but the goal is that the default optimization shouldn't just disable vectorization entirely for any loop that has a divide in it.

Andrew