public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/99398] New: Miss to optimize vector permutation fed by CTOR and CTOR/CST
@ 2021-03-05  6:53 linkw at gcc dot gnu.org
  2021-03-05  6:56 ` [Bug tree-optimization/99398] " linkw at gcc dot gnu.org
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: linkw at gcc dot gnu.org @ 2021-03-05  6:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99398

            Bug ID: 99398
           Summary: Miss to optimize vector permutation fed by CTOR and
                    CTOR/CST
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: linkw at gcc dot gnu.org
  Target Milestone: ---

#include "altivec.h"

vector long long foo(long long a, long long b) {
  vector long long v1 = {a, 0};
  vector long long v2 = {b, 0};
  vector unsigned char vc = {0,1,2,3,4,5,6,7, 16,17,18,19,20,21,22,23};
  vector long long vres = (vector long long)vec_perm ((vector unsigned char)v1,
                                                      (vector unsigned char)v2, vc);
  return vres;
}

With gcc -Ofast -mcpu=power9, it generates the following (the asm is from a big-endian build):

        mtvsrdd 32,3,9
        mtvsrdd 33,4,9
        lxv 34,0(10)
        vperm 2,0,1,2
        blr

But it can be optimized into:

        mtvsrdd 34,3,4
        blr

The GIMPLE in the optimized dump looks like:

__vector long foo (long long int a, long long int b)
{
  __vector long vres;
  __vector long v2;
  __vector long v1;
  __vector unsigned char _5;
  __vector unsigned char _6;
  __vector unsigned char _7;

  <bb 2> [local count: 1073741824]:
  v1_2 = {a_1(D), 0};
  v2_4 = {b_3(D), 0};
  _5 = VIEW_CONVERT_EXPR<__vector unsigned char>(v1_2);
  _6 = VIEW_CONVERT_EXPR<__vector unsigned char>(v2_4);
  _7 = VEC_PERM_EXPR <_5, _6, { 0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23 }>;
  vres_8 = VIEW_CONVERT_EXPR<__vector long>(_7);
  return vres_8;

}

But it could look like:

__vector long foo (long long int a, long long int b)
{
  vector(2) long long int _10;

  <bb 2> [local count: 1073741824]:
  _10 = {a_1(D), b_3(D)};
  return _10;

}
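
The claimed equivalence can be checked without a Power target. The following is a minimal sketch, assuming GCC's generic vector extensions and `__builtin_shuffle` as a stand-in for `vec_perm`; the type names `v2di` and `v16qi` and the helper names `perm_bytes`, `direct`, and `same_result` are made up for illustration and do not appear in the report. With generic vectors, element 0 sits at the lowest address, so the byte selector picks the first element of each input on either endianness.

```c
/* Vector types via GCC's generic vector extension (names are illustrative). */
typedef long long v2di __attribute__((vector_size(16)));
typedef unsigned char v16qi __attribute__((vector_size(16)));

/* Byte-level permute of two constructors, mirroring the reported testcase.
   __builtin_shuffle is used here in place of the altivec vec_perm built-in. */
static v2di perm_bytes(long long a, long long b) {
  v2di v1 = {a, 0};
  v2di v2 = {b, 0};
  v16qi sel = {0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23};
  v16qi r = __builtin_shuffle((v16qi)v1, (v16qi)v2, sel);
  return (v2di)r;
}

/* The single constructor the permute should fold to. */
static v2di direct(long long a, long long b) {
  v2di r = {a, b};
  return r;
}

/* Returns 1 when both forms produce the same two elements. */
static int same_result(long long a, long long b) {
  v2di x = perm_bytes(a, b);
  v2di y = direct(a, b);
  return x[0] == y[0] && x[1] == y[1];
}
```

Comparing the assembly of `perm_bytes` and `direct` at -O2 is one way to see whether a given compiler version performs the fold.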


* [Bug tree-optimization/99398] Miss to optimize vector permutation fed by CTOR and CTOR/CST
  2021-03-05  6:53 [Bug tree-optimization/99398] New: Miss to optimize vector permutation fed by CTOR and CTOR/CST linkw at gcc dot gnu.org
@ 2021-03-05  6:56 ` linkw at gcc dot gnu.org
  2021-03-08  3:20 ` linkw at gcc dot gnu.org
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: linkw at gcc dot gnu.org @ 2021-03-05  6:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99398

Kewen Lin <linkw at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2021-03-05

--- Comment #1 from Kewen Lin <linkw at gcc dot gnu.org> ---
I have a patch that teaches simplify_permutation in tree-ssa-forwprop.c to look
through VIEW_CONVERT_EXPR.


* [Bug tree-optimization/99398] Miss to optimize vector permutation fed by CTOR and CTOR/CST
  2021-03-05  6:53 [Bug tree-optimization/99398] New: Miss to optimize vector permutation fed by CTOR and CTOR/CST linkw at gcc dot gnu.org
  2021-03-05  6:56 ` [Bug tree-optimization/99398] " linkw at gcc dot gnu.org
@ 2021-03-08  3:20 ` linkw at gcc dot gnu.org
  2021-05-28  6:13 ` cvs-commit at gcc dot gnu.org
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: linkw at gcc dot gnu.org @ 2021-03-08  3:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99398

--- Comment #2 from Kewen Lin <linkw at gcc dot gnu.org> ---
Created attachment 50329
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50329&action=edit
tested patch


* [Bug tree-optimization/99398] Miss to optimize vector permutation fed by CTOR and CTOR/CST
  2021-03-05  6:53 [Bug tree-optimization/99398] New: Miss to optimize vector permutation fed by CTOR and CTOR/CST linkw at gcc dot gnu.org
  2021-03-05  6:56 ` [Bug tree-optimization/99398] " linkw at gcc dot gnu.org
  2021-03-08  3:20 ` linkw at gcc dot gnu.org
@ 2021-05-28  6:13 ` cvs-commit at gcc dot gnu.org
  2021-05-28  6:14 ` linkw at gcc dot gnu.org
  2021-09-17  6:38 ` pinskia at gcc dot gnu.org
  4 siblings, 0 replies; 6+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-05-28  6:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99398

--- Comment #3 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Kewen Lin <linkw@gcc.gnu.org>:

https://gcc.gnu.org/g:4a9f2306cb39a3cf265eeb6f8f3a8bbaf230c4c8

commit r12-1103-g4a9f2306cb39a3cf265eeb6f8f3a8bbaf230c4c8
Author: Kewen Lin <linkw@linux.ibm.com>
Date:   Fri May 28 01:11:45 2021 -0500

    forwprop: Enhance vec perm fed by CTOR and CTOR/CST [PR99398]

    VEC_PERM_EXPR requires the number of MASK elements to be the
    same as the number of elements in operands V0 and V1.  In
    some cases, such as the Power altivec built-in function vec_perm,
    VIEW_CONVERT_EXPR has to be used to satisfy this requirement,
    but it can block simplifications that do not look through the
    conversion.

    When the permuted operands of a vector permutation are two
    CTORs of the same type, or one CTOR and one VECTOR_CST, this
    patch enhances forwprop to look through the intermediate
    VIEW_CONVERT_EXPRs and simplify further where possible.

    Bootstrapped/regtested on powerpc64le-linux-gnu P9,
    powerpc64-linux-gnu P8, x86_64-redhat-linux and
    aarch64-linux-gnu.

    gcc/ChangeLog:

            PR tree-optimization/99398
            * tree-ssa-forwprop.c (simplify_permutation): Optimize some cases
            where the fed operands are CTOR/CST and propagated through
            VIEW_CONVERT_EXPR.  Call vec_perm_indices::new_shrunk_vector.
            * vec-perm-indices.c (vec_perm_indices::new_shrunk_vector): New
            function.
            * vec-perm-indices.h (vec_perm_indices::new_shrunk_vector): New
            declare.

    gcc/testsuite/ChangeLog:

            PR tree-optimization/99398
            * gcc.target/powerpc/vec-perm-ctor-run.c: New test.
            * gcc.target/powerpc/vec-perm-ctor.c: New test.
            * gcc.target/powerpc/vec-perm-ctor.h: New test.
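
The index-shrinking step named in the ChangeLog can be illustrated with a small standalone sketch. This is not the GCC implementation of vec_perm_indices::new_shrunk_vector; the function `shrink_selector` and its signature are hypothetical, and the sketch only assumes the general idea: a selector over narrow elements can be rewritten over wider elements when every group of `factor` indices is consecutive and starts on a multiple of `factor`.

```c
#include <stddef.h>

/* Hypothetical helper, not GCC code: try to rewrite a selector over n narrow
   elements as a selector over n / factor wide elements.  Writes the wide
   indices to out and returns 1 on success, or returns 0 if any group is not
   an aligned, consecutive run. */
static int shrink_selector(const unsigned *sel, size_t n, unsigned factor,
                           unsigned *out) {
  if (factor == 0 || n % factor != 0)
    return 0;
  for (size_t i = 0; i < n; i += factor) {
    unsigned first = sel[i];
    /* The group must start at the beginning of a wide element. */
    if (first % factor != 0)
      return 0;
    /* The rest of the group must follow consecutively. */
    for (unsigned j = 1; j < factor; j++)
      if (sel[i + j] != first + j)
        return 0;
    out[i / factor] = first / factor;
  }
  return 1;
}
```

For the selector in the report, { 0..7, 16..23 } with a factor of 8, the sketch yields the two-element selector { 0, 2 }: element 0 of the first operand and element 0 of the second, which is what allows the two constructors to be merged into {a, b}.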


* [Bug tree-optimization/99398] Miss to optimize vector permutation fed by CTOR and CTOR/CST
  2021-03-05  6:53 [Bug tree-optimization/99398] New: Miss to optimize vector permutation fed by CTOR and CTOR/CST linkw at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2021-05-28  6:13 ` cvs-commit at gcc dot gnu.org
@ 2021-05-28  6:14 ` linkw at gcc dot gnu.org
  2021-09-17  6:38 ` pinskia at gcc dot gnu.org
  4 siblings, 0 replies; 6+ messages in thread
From: linkw at gcc dot gnu.org @ 2021-05-28  6:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99398

Kewen Lin <linkw at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #4 from Kewen Lin <linkw at gcc dot gnu.org> ---
Fixed.


* [Bug tree-optimization/99398] Miss to optimize vector permutation fed by CTOR and CTOR/CST
  2021-03-05  6:53 [Bug tree-optimization/99398] New: Miss to optimize vector permutation fed by CTOR and CTOR/CST linkw at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2021-05-28  6:14 ` linkw at gcc dot gnu.org
@ 2021-09-17  6:38 ` pinskia at gcc dot gnu.org
  4 siblings, 0 replies; 6+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-09-17  6:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99398

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |12.0
