From mboxrd@z Thu Jan  1 00:00:00 1970
MIME-Version: 1.0
References: <020201d87420$1e944a80$5bbcdf80$@nextmovesoftware.com>
In-Reply-To: <020201d87420$1e944a80$5bbcdf80$@nextmovesoftware.com>
From: Richard Biener <richard.guenther@gmail.com>
Date: Thu, 2 Jun 2022 10:48:13 +0200
Message-ID: <CAFiYyc0azu2aSV-Lbqv86Ap5X+BL3kBpraRXyRbcDnxxgAO35w@mail.gmail.com>
Subject: Re: [PATCH] Fold truncations of left shifts in match.pd
To: Roger Sayle <roger@nextmovesoftware.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>
Content-Type: text/plain; charset="UTF-8"
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Thu, 02 Jun 2022 08:48:28 -0000

On Mon, May 30, 2022 at 2:24 PM Roger Sayle <roger@nextmovesoftware.com> wrote:
>
>
> Whilst investigating PR 55278, I noticed that the tree-ssa optimizers
> aren't eliminating the promotions of shifts to "int" as inserted by the
> c-family front-ends, instead leaving this simplification to the RTL
> optimizers.  This patch allows match.pd to do this itself earlier,
> narrowing (T)(X << C) to (T)X << C when the constant C is known to be
> valid for the (narrower) type T.
>
> Hence for this simple test case:
> short foo(short x) { return x << 5; }
>
> the .optimized dump currently looks like:
>
> short int foo (short int x)
> {
>   int _1;
>   int _2;
>   short int _4;
>
>   <bb 2> [local count: 1073741824]:
>   _1 = (int) x_3(D);
>   _2 = _1 << 5;
>   _4 = (short int) _2;
>   return _4;
> }
>
> but with this patch, now becomes:
>
> short int foo (short int x)
> {
>   short int _2;
>
>   <bb 2> [local count: 1073741824]:
>   _2 = x_1(D) << 5;
>   return _2;
> }
>
> This is always reasonable as RTL expansion knows how to use
> widening optabs if it makes sense at the RTL level to perform
> this shift in a wider mode.
>
> Of course, there's often a catch.  The above simplification not only
> reduces the number of statements in gimple, but also allows further
> optimizations, for example the recognition of rotate idioms
> and bswap16.  Alas, optimizing things earlier than anticipated
> requires several testsuite changes [though all these tests have
> been confirmed to generate identical assembly code on x86_64].
> The only significant change is that the vectorization pass previously
> wouldn't vectorize rotations if the backend doesn't explicitly provide
> an optab for them.  This is curious as if the rotate is expressed as
> ior(lshift,rshift) it will vectorize, and likewise RTL expansion will
> generate the iorv(lshiftv,rshiftv) sequence if required for a vector
> mode rotation.  Hence this patch includes a tweak to the optabs
> test in tree-vect-stmts.cc's vectorizable_shift to better reflect
> the functionality supported by RTL expansion.
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32},
> with no new failures.  Ok for mainline?
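
For readers following along: the narrowing above rests on truncation
commuting with a left shift, as long as the shift count is valid for the
narrower type.  Below is a minimal C sketch of that property, not the
match.pd pattern itself.  The helper names are hypothetical, and unsigned
16/32-bit types stand in for T and the promoted type so that the arithmetic
is well-defined in plain C.

```c
#include <assert.h>
#include <stdint.h>

/* Wide form: shift in 32 bits, then truncate to 16 bits,
   mirroring (T)(X << C).  Hypothetical helper name.  */
static uint16_t shift_then_truncate(uint32_t x, unsigned c)
{
    return (uint16_t)(x << c);
}

/* Narrow form: truncate first, then shift modulo 2^16,
   mirroring (T)X << C.  The masks model 16-bit arithmetic,
   because C itself promotes 16-bit operands to int.  */
static uint16_t truncate_then_shift(uint32_t x, unsigned c)
{
    uint32_t narrow = x & 0xFFFFu;
    return (uint16_t)((narrow << c) & 0xFFFFu);
}

/* Count disagreements over a sample of 32-bit inputs and every
   shift count that is valid for the 16-bit type (0..15).  */
static unsigned count_mismatches(void)
{
    unsigned bad = 0;
    for (uint32_t i = 0; i < 0x40000u; i += 7) {
        uint32_t x = i * 2654435761u;   /* scramble the high bits */
        for (unsigned c = 0; c < 16; c++)
            if (shift_then_truncate(x, c) != truncate_then_shift(x, c))
                bad++;
    }
    return bad;
}

/* Convenience wrapper for use from a test driver.  */
static void check_narrowing(void)
{
    assert(count_mismatches() == 0);
}
```

Calling check_narrowing() from a test driver exercises the sampled inputs;
the two forms agree because bits above bit 15 of the input can never reach
the low 16 bits of a left-shifted result.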

Can the lshift pattern be merged with the preceding one?  It looks
awfully similar.  Possibly even do

  (if (wi::ltu_p (....))
   (lshift (convert @1) (convert @2))
   { build_zero_cst (type); })

for when the truncation leaves us with zero?
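
To make the zero case concrete, here is a small C sketch of the two
branches.  It assumes the elided condition compares the shift count against
the precision of the narrower type (16 bits here); that reading, the helper
names and the fixed-width types are all assumptions for illustration only.

```c
#include <stdint.h>

/* Reference: shift in 32 bits, then truncate to 16 bits.
   Valid for shift counts 0..31.  */
static uint16_t wide_reference(uint16_t x, unsigned c)
{
    return (uint16_t)((uint32_t)x << c);
}

/* Folded form (hypothetical): narrow shift when the count fits the
   16-bit type, constant zero otherwise.  */
static uint16_t folded_shift(uint16_t x, unsigned c)
{
    if (c < 16)   /* assumed stand-in for the wi::ltu_p test */
        return (uint16_t)(((uint32_t)x << c) & 0xFFFFu);
    return 0;     /* stand-in for build_zero_cst (type) */
}

/* Exhaustively compare both forms; returns the mismatch count.  */
static unsigned folded_mismatches(void)
{
    unsigned bad = 0;
    for (uint32_t x = 0; x <= 0xFFFFu; x++)
        for (unsigned c = 0; c < 32; c++)
            if (wide_reference((uint16_t)x, c)
                != folded_shift((uint16_t)x, c))
                bad++;
    return bad;
}
```

For counts of 16 and above, every bit of the 16-bit input has been shifted
past bit 15, so the truncated value can only be zero.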

+  /* RTL expansion knows how to expand rotates using shift/or.  */
+  if (icode == CODE_FOR_nothing
+      && (code == LROTATE_EXPR || code == RROTATE_EXPR)
+      && optab_handler (ior_optab, vec_mode) != CODE_FOR_nothing
+      && optab_handler (ashl_optab, vec_mode) != CODE_FOR_nothing)
+    icode = (int) optab_handler (lshr_optab, vec_mode);

but we then get the vector costing wrong.  Also note that vector lowering
will figure the rotate is not supported and do its own "lowering" using
IOR.  Also it seems that it only handles the case of vector by scalar (aka
uniform vector) rotates, and otherwise expands to scalar operations.

That said, the appropriate way to deal with this is in tree-vect-patterns.cc,
where there already is vect_recog_rotate_pattern, which should detect this
so that the above hunk isn't necessary.  Perhaps the pattern recognition
routine needs improving instead?
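
For reference, the shift/IOR form of a uniform (vector by scalar) rotate
looks roughly like the following in C.  This is only a sketch of the
identity being discussed, with a hypothetical function name and a plain
array standing in for a vector; it is not what vect_recog_rotate_pattern
or vector lowering actually emit.

```c
#include <stddef.h>
#include <stdint.h>

/* Rotate every 16-bit lane left by the same count r, written as
   ior(lshift, rshift).  The count is reduced modulo 16 and the
   right-shift amount is masked so that r == 0 stays well-defined.  */
static void rotl16_uniform(uint16_t *v, size_t n, unsigned r)
{
    r &= 15u;
    for (size_t i = 0; i < n; i++) {
        uint32_t x = v[i];
        uint32_t left = (x << r) & 0xFFFFu;    /* lshift part */
        uint32_t right = x >> ((16u - r) & 15u); /* rshift part */
        v[i] = (uint16_t)(left | right);
    }
}
```

Because every lane uses the same count, a loop of this shape needs only
vector shifts by a scalar plus a vector IOR.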

Thanks,
Richard.


>
> 2022-05-30  Roger Sayle  <roger@nextmovesoftware.com>
>
> gcc/ChangeLog
>         * match.pd (convert (lshift @1 INTEGER_CST@2)): Narrow integer
>         left shifts by a constant when the result is truncated, and the
>         shift constant is well-defined for the narrower mode.
>         * tree-vect-stmts.cc (vectorizable_shift): Rotations by
>         constants are vectorizable, if the backend supports logical
>         shifts and IOR logical operations in the required vector mode.
>
> gcc/testsuite/ChangeLog
>         * gcc.dg/fold-convlshift-4.c: New test case.
>         * gcc.dg/optimize-bswaphi-1.c: Update found bswap count.
>         * gcc.dg/tree-ssa/pr61839_3.c: Shift is now optimized before VRP.
>         * gcc.dg/vect/vect-over-widen-1-big-array.c: Remove obsolete tests.
>         * gcc.dg/vect/vect-over-widen-1.c: Likewise.
>         * gcc.dg/vect/vect-over-widen-3-big-array.c: Likewise.
>         * gcc.dg/vect/vect-over-widen-3.c: Likewise.
>         * gcc.dg/vect/vect-over-widen-4-big-array.c: Likewise.
>         * gcc.dg/vect/vect-over-widen-4.c: Likewise.
>
>
> Thanks in advance,
> Roger
> --
>