From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <crazylht@gmail.com>
Received: from mail-vs1-xe2a.google.com (mail-vs1-xe2a.google.com
 [IPv6:2607:f8b0:4864:20::e2a])
 by sourceware.org (Postfix) with ESMTPS id 800BF3855026
 for <gcc-patches@gcc.gnu.org>; Wed, 28 Jul 2021 01:26:31 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 800BF3855026
Received: by mail-vs1-xe2a.google.com with SMTP id o8so523158vss.4
 for <gcc-patches@gcc.gnu.org>; Tue, 27 Jul 2021 18:26:31 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:mime-version:references:in-reply-to:from:date
 :message-id:subject:to:cc;
 bh=Ta0G64Bej6dMlswly++GDMcOwSAlTXbzBwm7tC9dNDs=;
 b=Kr0VbhQ/2MyzCnOWMyaUy3OdXVSNnYUlrtJ+ioAtp4knW9MBGNX3nIeQopi176eKic
 wAUGa3Zsy5M6bIHb3M10VHA+BgLZSev7k/vIWnYLubossr+xihvJWv+4DUVRcmdwpJA/
 1UagNzHGAOEmVgvUHRmWU9i0DsbGtXrdm9CGUifcZ9D5PY0rrrImBg7Ju5GgYTxAAiy/
 WZoJdX4IKymAt88ARFYQ39aAr7hFVZSRHdwBen8ZgLXP90cItkudBCqmrlxTyTEn+vSb
 9FE41NeYyfo8yCDJasVupiGi5JePL3rzFOVnww+D3vxHB10cSeIo8Ybb2fsm1ldhtgQP
 BRCA==
X-Gm-Message-State: AOAM530EAXTtvHnuaLe88xeDcAJCPd3Q48PoEPqleTD6Y/uNyoeuUIQD
 6Aph7lDQHZqaxL8gh9GMPNlINiVUSIi4K45Cbhc=
X-Google-Smtp-Source: ABdhPJxVz8ZAxBsi9nhgZQ1ZxmpYS+tyzhjgHVoE0NLiUOgLr54gDVDklrh7x6YLnwsqN2QBpswbREiHKaX5/hCdSYY=
X-Received: by 2002:a67:2d8b:: with SMTP id t133mr19007733vst.45.1627435591039; 
 Tue, 27 Jul 2021 18:26:31 -0700 (PDT)
MIME-Version: 1.0
References: <20210727081138.GG2380545@tucnak>
 <CAMZc-bwTdtKK-jDKUt_-v826N2aMScraYrtr7tBfPe=dDSNSXw@mail.gmail.com>
 <20210727103919.GH2380545@tucnak>
In-Reply-To: <20210727103919.GH2380545@tucnak>
From: Hongtao Liu <crazylht@gmail.com>
Date: Wed, 28 Jul 2021 09:31:59 +0800
Message-ID: <CAMZc-bzNLnKa_xx7xpkg1hyw6Z3yHs5T125cf-ksdLqEKgZZAA@mail.gmail.com>
Subject: Re: [PATCH] i386: Improve AVX2 expansion of vector >> vector DImode
 arithm. shifts [PR101611]
To: Jakub Jelinek <jakub@redhat.com>
Cc: Uros Bizjak <ubizjak@gmail.com>, gcc-patches@gcc.gnu.org
Content-Type: text/plain; charset="UTF-8"
X-Spam-Status: No, score=-4.3 required=5.0 tests=BAYES_00, DKIM_SIGNED,
 DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, RCVD_IN_DNSWL_NONE,
 SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Wed, 28 Jul 2021 01:26:32 -0000

On Tue, Jul 27, 2021 at 6:39 PM Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Tue, Jul 27, 2021 at 06:33:24PM +0800, Hongtao Liu wrote:
> > > AVX2 introduced vector >> vector shifts, but unfortunately for V{2,4}DImode
> > > it supports only logical and not arithmetic shifts; only AVX512F for
> > > V8DImode or AVX512VL for V{2,4}DImode fixed that omission.
> > > Earlier in the GCC 12 cycle I committed vector >> scalar arithmetic shift
> > > emulation using various sequences; this patch handles the vector >> vector
> > > case.  No need to adjust costs; the previous cost adjustment actually
> > > covers even the vector by vector shifts.
> > > The patch emits the right arithmetic V{2,4}DImode shifts using 2 logical right
> > > V{2,4}DImode shifts (one of the original operand, one of the sign mask
> > > constant, each by the vector shift count), xor and subtraction.  On each
> > > element, (long long) x >> y is done as
> > > (((unsigned long long) x >> y) ^ (0x8000000000000000ULL >> y))
> > > - (0x8000000000000000ULL >> y)
> > I'm wondering whether the transformation would still be proper when y > 64.
> > I guess since it's UB, the compiler can do anything.
>
> The patch is changing optabs, not something from target builtins where the
> intrinsics might make it well defined.
> In the optabs, out-of-bounds shifts (including y == 64) are UB - i386.h
> doesn't define SHIFT_COUNTS_TRUNCATED.
Thanks for the explanation, patch LGTM.
>
>         Jakub
>
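For reference, the per-element identity quoted above can be checked with a
small scalar sketch in C.  This is only an illustration of the arithmetic, not
the code the patch emits; the function names are made up for this example, and
it assumes 0 <= y <= 63, since larger counts are UB for the optab anyway.  The
final conversion back to int64_t relies on the modular behaviour GCC documents
for out-of-range unsigned-to-signed conversions.

```c
#include <assert.h>
#include <stdint.h>

/* Emulated arithmetic right shift of one 64-bit element, using only
   logical shifts, xor and subtraction.  Assumes 0 <= y <= 63.  */
static int64_t sra_emulated(int64_t x, unsigned y)
{
    const uint64_t sign = UINT64_C(0x8000000000000000);
    uint64_t m = sign >> y;          /* sign mask moved to bit 63 - y */
    uint64_t u = (uint64_t) x >> y;  /* logical shift of the operand */
    uint64_t r = (u ^ m) - m;        /* sign-extend from bit 63 - y */
    return (int64_t) r;              /* modular conversion (GCC behaviour) */
}

/* Reference that avoids signed >>: for negative inputs, shift the
   complement logically and complement the result back.  */
static int64_t sra_reference(int64_t x, unsigned y)
{
    uint64_t u = (uint64_t) x;
    return (int64_t) (x < 0 ? ~(~u >> y) : u >> y);
}

/* Hypothetical self-check helper: compares both versions for every
   valid shift count on a few sample values.  */
static void sra_self_check(void)
{
    static const int64_t samples[] = { 0, 1, -1, 5, -8, INT64_MAX, INT64_MIN };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        for (unsigned y = 0; y < 64; y++)
            assert(sra_emulated(samples[i], y) == sra_reference(samples[i], y));
}
```

The xor/subtract pair is the usual sign-extension trick: after the logical
shift the original sign bit sits at bit 63 - y, and (u ^ m) - m extends it
into the upper y bits.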


-- 
BR,
Hongtao