public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/112579] New: bb vectorizer failed to reduction sum += inv >> {1,2,3,4,5,6,7,8}
@ 2023-11-17  2:57 liuhongt at gcc dot gnu.org
  2023-11-17  5:41 ` [Bug tree-optimization/112579] " liuhongt at gcc dot gnu.org
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: liuhongt at gcc dot gnu.org @ 2023-11-17  2:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112579

            Bug ID: 112579
           Summary: bb vectorizer failed to reduction sum += inv >>
                    {1,2,3,4,5,6,7,8}
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: liuhongt at gcc dot gnu.org
            Blocks: 112325
  Target Milestone: ---

This is from PR112325

unsigned
foo (unsigned * restrict s, unsigned qh, unsigned * restrict qs) {
  unsigned int sumi = 0;

  sumi += (qh >> 0);
  sumi += (qh >> 1);
  sumi += (qh >> 2);
  sumi += (qh >> 3);
  sumi += (qh >> 4);
  sumi += (qh >> 5);
  sumi += (qh >> 6);
  sumi += (qh >> 7);
  sumi += (qh >> 8);
  sumi += (qh >> 9);
  sumi += (qh >> 10);
  sumi += (qh >> 11);
  sumi += (qh >> 12);
  sumi += (qh >> 13);
  sumi += (qh >> 14);
  sumi += (qh >> 15);
  return sumi;
}

test2.c:28:8: note:   nunits = 8
test2.c:28:8: missed:   Build SLP failed: unrolling required in basic block SLP
test2.c:28:8: note:   Build SLP for _5 = qh_16(D) >> 5;
test2.c:28:8: note:   get vectype for scalar type (group size 14): unsigned int
test2.c:28:8: note:   vectype: vector(8) unsigned int
test2.c:28:8: note:   nunits = 8
test2.c:28:8: missed:   Build SLP failed: unrolling required in basic block SLP
test2.c:28:8: note:   Build SLP for _4 = qh_16(D) >> 4;
test2.c:28:8: note:   get vectype for scalar type (group size 14): unsigned int
test2.c:28:8: note:   vectype: vector(8) unsigned int
test2.c:28:8: note:   nunits = 8
test2.c:28:8: missed:   Build SLP failed: unrolling required in basic block SLP
test2.c:28:8: note:   Build SLP for _3 = qh_16(D) >> 3;
test2.c:28:8: note:   get vectype for scalar type (group size 14): unsigned int
test2.c:28:8: note:   vectype: vector(8) unsigned int
test2.c:28:8: note:   nunits = 8
test2.c:28:8: missed:   Build SLP failed: unrolling required in basic block SLP
test2.c:28:8: note:   Build SLP for _1 = qh_16(D) >> 1;
test2.c:28:8: note:   get vectype for scalar type (group size 14): unsigned int
test2.c:28:8: note:   vectype: vector(8) unsigned int
test2.c:28:8: note:   nunits = 8
test2.c:28:8: missed:   Build SLP failed: unrolling required in basic block SLP
test2.c:28:8: note:   SLP discovery for node 0x6415a60 failed
test2.c:28:8: note:   SLP discovery failed

Rewriting it as a loop:

unsigned
foo1 (unsigned * restrict s, unsigned qh, unsigned * restrict qs) {
  unsigned int sumi = 0;

  for (int i = 0; i != 16; i++)
    sumi += qh >> i;
  return sumi;
}

With this form, the loop vectorizer successfully vectorizes it.
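
For reference, a minimal sketch that checks the unrolled and loop forms compute the same value, so the only difference is which vectorizer (basic-block SLP or loop) sees the code. The names foo_unrolled, foo_loop and forms_agree are hypothetical and not from the report; the unused pointer parameters are dropped for brevity.

```c
/* Hypothetical names; the s/qs pointer parameters from the report are
   unused there and omitted here.  */
unsigned
foo_unrolled (unsigned qh)
{
  unsigned sumi = 0;

  sumi += (qh >> 0);
  sumi += (qh >> 1);
  sumi += (qh >> 2);
  sumi += (qh >> 3);
  sumi += (qh >> 4);
  sumi += (qh >> 5);
  sumi += (qh >> 6);
  sumi += (qh >> 7);
  sumi += (qh >> 8);
  sumi += (qh >> 9);
  sumi += (qh >> 10);
  sumi += (qh >> 11);
  sumi += (qh >> 12);
  sumi += (qh >> 13);
  sumi += (qh >> 14);
  sumi += (qh >> 15);
  return sumi;
}

unsigned
foo_loop (unsigned qh)
{
  unsigned sumi = 0;

  for (int i = 0; i != 16; i++)
    sumi += qh >> i;
  return sumi;
}

/* Returns 1 if both forms agree on a few sample inputs, 0 otherwise.  */
int
forms_agree (void)
{
  static const unsigned samples[]
    = { 0u, 1u, 0xffffu, 0x12345678u, 0xffffffffu };

  for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
    if (foo_unrolled (samples[i]) != foo_loop (samples[i]))
      return 0;
  return 1;
}
```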


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112325
[Bug 112325] Missed vectorization of reduction after unrolling

* [Bug tree-optimization/112579] bb vectorizer failed to reduction sum += inv >> {1,2,3,4,5,6,7,8}
From: liuhongt at gcc dot gnu.org @ 2023-11-17  5:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112579

--- Comment #1 from liuhongt at gcc dot gnu.org ---
test.c:28:8: note:   vect_is_simple_use: operand qh_16(D) >> 1, type of def: internal
test.c:28:8: note:   vect_is_simple_use: operand qh_16(D), type of def: external
test.c:28:8: note:   vect_is_simple_use: operand qh_16(D) >> 3, type of def: internal
test.c:28:8: note:   vect_is_simple_use: operand qh_16(D) >> 4, type of def: internal
test.c:28:8: note:   vect_is_simple_use: operand qh_16(D) >> 5, type of def: internal
test.c:28:8: note:   vect_is_simple_use: operand qh_16(D) >> 6, type of def: internal
test.c:28:8: note:   vect_is_simple_use: operand qh_16(D) >> 7, type of def: internal
test.c:28:8: note:   vect_is_simple_use: operand qh_16(D) >> 8, type of def: internal
test.c:28:8: note:   vect_is_simple_use: operand qh_16(D) >> 9, type of def: internal
test.c:28:8: note:   vect_is_simple_use: operand qh_16(D) >> 10, type of def: internal
test.c:28:8: note:   vect_is_simple_use: operand qh_16(D) >> 11, type of def: internal
test.c:28:8: note:   vect_is_simple_use: operand qh_16(D) >> 12, type of def: internal
test.c:28:8: note:   vect_is_simple_use: operand qh_16(D) >> 13, type of def: internal
test.c:28:8: note:   vect_is_simple_use: operand qh_16(D) >> 14, type of def: internal
test.c:28:8: note:   vect_is_simple_use: operand qh_16(D) >> 15, type of def: internal
test.c:28:8: note:   === vect_analyze_slp ===
test.c:28:8: note:   Starting SLP discovery for
test.c:28:8: note:     _15 = qh_16(D) >> 15;
test.c:28:8: note:     _14 = qh_16(D) >> 14;
test.c:28:8: note:     _13 = qh_16(D) >> 13;
test.c:28:8: note:     _12 = qh_16(D) >> 12;
test.c:28:8: note:     _11 = qh_16(D) >> 11;
test.c:28:8: note:     _10 = qh_16(D) >> 10;
test.c:28:8: note:     _9 = qh_16(D) >> 9;
test.c:28:8: note:     _8 = qh_16(D) >> 8;
test.c:28:8: note:     _7 = qh_16(D) >> 7;
test.c:28:8: note:     _6 = qh_16(D) >> 6;
test.c:28:8: note:     _5 = qh_16(D) >> 5;
test.c:28:8: note:     _4 = qh_16(D) >> 4;
test.c:28:8: note:     _3 = qh_16(D) >> 3;
test.c:28:8: note:     _1 = qh_16(D) >> 1;
test.c:28:8: note:     _2 = qh_16(D) >> 2;
test.c:28:8: note:   starting SLP discovery for node 0x5d4e740
test.c:28:8: note:   Build SLP for _15 = qh_16(D) >> 15;
test.c:28:8: note:   get vectype for scalar type (group size 14): unsigned int
test.c:28:8: note:   vectype: vector(1) unsigned int
test.c:28:8: note:   nunits = 1
test.c:28:8: missed:   Build SLP failed: op not supported by target.
test.c:28:8: note:   SLP discovery for node 0x5d4e740 failed
test.c:28:8: note:   SLP discovery failed

* [Bug tree-optimization/112579] bb vectorizer failed to reduction sum += inv >> {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15}
From: liuhongt at gcc dot gnu.org @ 2023-11-17  5:43 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112579

--- Comment #2 from liuhongt at gcc dot gnu.org ---
It gets vectorized after changing the source code to the following (the first
addend is now qh >> 16 instead of qh >> 0):

unsigned
foo (unsigned * restrict s, unsigned qh, unsigned * restrict qs) {
  unsigned int sumi = 0;

  sumi += (qh >> 16);
  sumi += (qh >> 1);
  sumi += (qh >> 2);
  sumi += (qh >> 3);
  sumi += (qh >> 4);
  sumi += (qh >> 5);
  sumi += (qh >> 6);
  sumi += (qh >> 7);
  sumi += (qh >> 8);
  sumi += (qh >> 9);
  sumi += (qh >> 10);
  sumi += (qh >> 11);
  sumi += (qh >> 12);
  sumi += (qh >> 13);
  sumi += (qh >> 14);
  sumi += (qh >> 15);
  return sumi;
}
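
Note that this variant is a diagnostic experiment rather than an equivalent rewrite: replacing qh >> 0 with qh >> 16 changes the result, but it makes every addend a shift statement. A small sketch making the difference concrete; sum_shifts is a hypothetical helper, not from the report, and it assumes a 32-bit unsigned so that a shift by 16 is well defined.

```c
/* Hypothetical helper: computes (qh >> first) + sum of qh >> i for
   i = 1..15.  first == 0 matches the original test case, first == 16
   matches the variant above.  Assumes unsigned is at least 32 bits.  */
unsigned
sum_shifts (unsigned qh, unsigned first)
{
  unsigned sumi = qh >> first;

  for (unsigned i = 1; i != 16; i++)
    sumi += qh >> i;
  return sumi;
}
```

For example, with qh == 1 the original form yields 1 while the >> 16 variant yields 0, so the variant is only useful for probing what SLP discovery accepts.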

* [Bug tree-optimization/112579] bb vectorizer failed to reduction sum += inv >> {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15}
From: liuhongt at gcc dot gnu.org @ 2023-11-17  6:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112579

--- Comment #3 from liuhongt at gcc dot gnu.org ---
(In reply to liuhongt from comment #1)

If the compiler were smart, it would take
> test.c:28:8: note:   vect_is_simple_use: operand qh_16(D), type of def:
> external
into the group as qh_16 >> 0.

Otherwise, it should normally split the group into sizes 4 + 4 + 3 and
vectorize the two groups of size 4.
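
To illustrate the first idea, here is a hand-written sketch of the result one would hope for when the plain qh operand is treated as lane 0 with a shift count of 0. It uses the GCC generic vector extension and is not vectorizer output; foo_vec, foo_scalar and the v8su typedef are hypothetical names, and unsigned is assumed to be 4 bytes so that a 32-byte vector holds 8 lanes.

```c
/* GCC generic vector extension: 8 lanes of unsigned int, assuming
   sizeof (unsigned) == 4.  */
typedef unsigned v8su __attribute__ ((vector_size (32)));

/* Scalar reference, same computation as the test case.  */
unsigned
foo_scalar (unsigned qh)
{
  unsigned sumi = 0;

  for (int i = 0; i != 16; i++)
    sumi += qh >> i;
  return sumi;
}

unsigned
foo_vec (unsigned qh)
{
  /* Broadcast the invariant operand to every lane.  */
  v8su q = { qh, qh, qh, qh, qh, qh, qh, qh };
  /* Lane 0 of LO is the "qh >> 0" addend, so all 16 lanes are shifts.  */
  const v8su lo = { 0, 1, 2, 3, 4, 5, 6, 7 };
  const v8su hi = { 8, 9, 10, 11, 12, 13, 14, 15 };

  /* Two element-wise vector shifts and one vector add.  */
  v8su acc = (q >> lo) + (q >> hi);

  /* Horizontal reduction of the 8 partial sums.  */
  unsigned sumi = 0;
  for (int i = 0; i < 8; i++)
    sumi += acc[i];
  return sumi;
}
```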

* [Bug tree-optimization/112579] bb vectorizer failed to reduction sum += inv >> {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15}
From: liuhongt at gcc dot gnu.org @ 2023-11-17  6:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112579

--- Comment #4 from liuhongt at gcc dot gnu.org ---
> Otherwise, it should normally split the group into sizes 4 + 4 + 3 and
> vectorize the two groups of size 4.

  /* Try to break the group up into pieces.  */
  if (kind == slp_inst_kind_store

Currently, we only split the group for stores.

* [Bug tree-optimization/112579] bb vectorizer failed to reduction sum += inv >> {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15}
From: pinskia at gcc dot gnu.org @ 2023-11-17  6:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112579

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://gcc.gnu.org/bugzill
                   |                            |a/show_bug.cgi?id=106343

--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Basically PR 106343 but for >>

* [Bug tree-optimization/112579] bb vectorizer failed to reduction sum += inv >> {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15}
From: pinskia at gcc dot gnu.org @ 2023-11-17  6:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112579

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |DUPLICATE
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #6 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #5)
> Basically PR 106343 but for >>

Actually >> was mentioned in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106343#c5 .

So it is a dup.

*** This bug has been marked as a duplicate of bug 106343 ***
