From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
References: <7f227691-7c3d-599b-ed24-6fde2ce3c11d@codesourcery.com>
In-Reply-To: <7f227691-7c3d-599b-ed24-6fde2ce3c11d@codesourcery.com>
From: Richard Biener
Date: Wed, 22 Mar 2023 14:56:43 +0100
Subject: Re: Libgcc divide vectorization question
To: Andrew Stubbs
Cc: GCC Development
Content-Type: text/plain; charset="UTF-8"

On Wed, Mar 22, 2023 at 12:02 PM Andrew Stubbs wrote:
>
> On 22/03/2023 10:09, Richard Biener wrote:
> > On Tue, Mar 21, 2023 at 6:00 PM Andrew Stubbs wrote:
> >>
> >> Hi all,
> >>
> >> I want to be able to vectorize divide operators (softfp and integer),
> >> but amdgcn only has hardware instructions suitable for -ffast-math.
> >>
> >> We have recently implemented vector versions of all the libm functions,
> >> but the libgcc functions aren't builtins and therefore don't use those
> >> hooks.
> >>
> >> What's the best way to achieve this? Add a new __builtin_div (and
> >> __builtin_mod) that tree-vectorize can find, perhaps? Or something else?
> >
> > What do you want to do? Vectorize the out-of-line libgcc copy? Or
> > emit inline vectorized code for int/softfp operations? In the latter
> > case just emit the code from the pattern expanders?
>
> I'd like to investigate having vectorized versions of the libgcc
> instruction functions, like we do for libm.
>
> The inline code expansion is certainly an option, but I think there's
> quite a lot of code in those routines. I know how to do that option at
> least (except, maybe not the errno handling without making assumptions
> about the C runtime).
>
> Basically, the -ffast-math instructions will always be the fastest way,
> but the goal is that the default optimization shouldn't just disable
> vectorization entirely for any loop that has a divide in it.

We try to express division as multiplication, but yes, I think there's
currently no way to tell the vectorizer that vectorized division is
available as a libcall (nor for any other arithmetic operator that is not
a call in the first place).

> Andrew
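
To illustrate "division as multiplication", here is a minimal sketch of
the best-known case: unsigned 32-bit division by a compile-time constant,
rewritten as a widening multiply plus a shift. The function names
(div3_mul, div3_array) and the choice of divisor 3 are assumptions made
for this example; none of this is code from GCC or libgcc.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: computes x / 3 without a divide instruction.
   0xAAAAAAAB equals (2^33 + 1) / 3, so the 64-bit product shifted
   right by 33 is floor(x / 3) for every 32-bit unsigned x. */
static inline uint32_t div3_mul(uint32_t x)
{
    return (uint32_t)(((uint64_t)x * 0xAAAAAAABu) >> 33);
}

/* Hypothetical loop: the body contains only a multiply and a shift,
   so no divide operation stands in the way of vectorizing it. */
void div3_array(uint32_t *dst, const uint32_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = div3_mul(src[i]);
}
```

This only covers a divisor known at compile time. A variable divisor, or
a softfp divide, has no such rewrite, which is the case the thread is
about.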