public inbox for gcc-bugs@sourceware.org
* [Bug c/111820] New: GCC: 14: hangs with a simple while loop
@ 2023-10-15  2:02 141242068 at smail dot nju.edu.cn
  2023-10-15  2:07 ` [Bug tree-optimization/111820] " pinskia at gcc dot gnu.org
                   ` (16 more replies)
  0 siblings, 17 replies; 18+ messages in thread
From: 141242068 at smail dot nju.edu.cn @ 2023-10-15  2:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111820

            Bug ID: 111820
           Summary: GCC: 14: hangs with a simple while loop
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: 141242068 at smail dot nju.edu.cn
  Target Milestone: ---

Compiler Explorer: https://godbolt.org/z/ezdG5GGd8

When compiling the program below with `-O3 -fno-tree-vrp`, GCC takes up to
46 seconds to finish:
```
int r;
int r_0;

void f (void)
{
  int n = 0;
  while (-- n)
    {
      r_0 += r ;
      r  += r;
      r  += r ;
      r  += r ;
      r  >= r ;
      r  += r ;
    }
}
```
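
As a reading aid (an inference from the test case, not something stated in the report): the four `r += r` statements double `r` four times per iteration, and `r >= r;` is an expression statement with no effect, so each iteration multiplies `r` by 16. That makes `r` a multiplicative induction variable, which would match the `vect_step_op_mul` handling discussed later in the thread. The helper name `one_iteration` is hypothetical, and unsigned arithmetic is used here only to keep the wraparound well defined.

```c
#include <stdint.h>

/* One trip through the loop body of the test case, applied to r only.
   Unsigned 32-bit arithmetic stands in for the wrapping behaviour. */
static uint32_t one_iteration(uint32_t r)
{
    r += r;          /* r * 2  */
    r += r;          /* r * 4  */
    r += r;          /* r * 8  */
    /* `r >= r;` in the test case computes a value and discards it. */
    r += r;          /* r * 16 */
    return r;
}
```

Under that reading, `one_iteration(3)` yields 48, that is, the step of the induction is 16.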


* [Bug tree-optimization/111820] GCC: 14: hangs with a simple while loop
  2023-10-15  2:02 [Bug c/111820] New: GCC: 14: hangs with a simple while loop 141242068 at smail dot nju.edu.cn
@ 2023-10-15  2:07 ` pinskia at gcc dot gnu.org
  2023-10-15  2:09 ` [Bug tree-optimization/111820] [13/14 Regression] Compiler time hog in the vectorizer with `-O3 -fno-tree-vrp` pinskia at gcc dot gnu.org
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-10-15  2:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111820

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Looks to be the vectorizer:
#0  0x00000000012d1fe5 in wide_int_storage::operator= (x=..., this=<optimized out>)
    at /home/apinski/src/upstream-gcc-git/gcc/gcc/wide-int.h:1221
#1  generic_wide_int<wide_int_storage>::operator= (this=<optimized out>)
    at /home/apinski/src/upstream-gcc-git/gcc/gcc/wide-int.h:775
#2  vect_peel_nonlinear_iv_init (stmts=0x7fffffffd1c0, init_expr=0x7ffff79dbbd0,
    skip_niters=<optimized out>, step_expr=0x7ffff79bad38, induction_type=<optimized out>)
    at /home/apinski/src/upstream-gcc-git/gcc/gcc/tree-vect-loop.cc:9138
#3  0x00000000012f5322 in vect_update_ivs_after_vectorizer (loop_vinfo=0x3181a90,
    niters=0x7ffff79e60c0, update_e=0x7ffff79d5ba0)
    at /home/apinski/src/upstream-gcc-git/gcc/gcc/tree-vect-loop-manip.cc:2028
#4  0x00000000012fec74 in vect_do_peeling (loop_vinfo=loop_vinfo@entry=0x3181a90,
    niters=<optimized out>, niters@entry=0x7ffff79bad50, nitersm1=nitersm1@entry=0x7ffff79e0b40,
    niters_vector=niters_vector@entry=0x7fffffffd700, step_vector=step_vector@entry=0x7fffffffd708,
    niters_vector_mult_vf_var=niters_vector_mult_vf_var@entry=0x7fffffffd710, th=<optimized out>,
    check_profitability=<optimized out>, niters_no_overflow=<optimized out>, advance=<optimized out>)
    at /home/apinski/src/upstream-gcc-git/gcc/gcc/tree-vect-loop-manip.cc:3370
#5  0x00000000012f01a9 in vect_transform_loop (loop_vinfo=loop_vinfo@entry=0x3181a90,
    loop_vectorized_call=loop_vectorized_call@entry=0x0)
    at /home/apinski/src/upstream-gcc-git/gcc/gcc/tree-vect-loop.cc:11386
#6  0x0000000001331bec in vect_transform_loops (simduid_to_vf_htab=<optimized out>,
    loop=0x7ffff78054b0, loop_vectorized_call=0x0, fun=<optimized out>)
    at /home/apinski/src/upstream-gcc-git/gcc/gcc/tree-vectorizer.cc:1004
#7  0x00000000013321ed in try_vectorize_loop_1 (fun=0x7ffff79d6000, loop_dist_alias_call=0x0,
    loop_vectorized_call=0x0, loop=0x7ffff78054b0, num_vectorized_loops=0x7fffffffdacc,
    simduid_to_vf_htab=@0x7fffffffdad0: 0x0)
    at /home/apinski/src/upstream-gcc-git/gcc/gcc/tree-vectorizer.cc:1150
#8  try_vectorize_loop (simduid_to_vf_htab=@0x7fffffffdad0: 0x0, num_vectorized_loops=0x7fffffffdacc,
    loop=0x7ffff78054b0, fun=0x7ffff79d6000)
    at /home/apinski/src/upstream-gcc-git/gcc/gcc/tree-vectorizer.cc:1180
#9  0x0000000001332845 in (anonymous namespace)::pass_vectorize::execute (this=<optimized out>,
    fun=0x7ffff79d6000)
    at /home/apinski/src/upstream-gcc-git/gcc/gcc/tree-vectorizer.cc:1296


* [Bug tree-optimization/111820] [13/14 Regression] Compiler time hog in the vectorizer with `-O3 -fno-tree-vrp`
  2023-10-15  2:02 [Bug c/111820] New: GCC: 14: hangs with a simple while loop 141242068 at smail dot nju.edu.cn
  2023-10-15  2:07 ` [Bug tree-optimization/111820] " pinskia at gcc dot gnu.org
@ 2023-10-15  2:09 ` pinskia at gcc dot gnu.org
  2023-10-16  7:14 ` rguenth at gcc dot gnu.org
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-10-15  2:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111820

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |needs-bisection
      Known to work|                            |11.1.0, 12.1.0, 12.3.0,
                   |                            |9.1.0
             Status|UNCONFIRMED                 |NEW
   Target Milestone|---                         |13.3
   Last reconfirmed|                            |2023-10-15
      Known to fail|                            |13.1.0, 14.0
            Summary|GCC: 14: hangs with a       |[13/14 Regression] Compiler
                   |simple while loop           |time hog in the vectorizer
                   |                            |with `-O3 -fno-tree-vrp`
     Ever confirmed|0                           |1

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Confirmed.


* [Bug tree-optimization/111820] [13/14 Regression] Compiler time hog in the vectorizer with `-O3 -fno-tree-vrp`
  2023-10-15  2:02 [Bug c/111820] New: GCC: 14: hangs with a simple while loop 141242068 at smail dot nju.edu.cn
  2023-10-15  2:07 ` [Bug tree-optimization/111820] " pinskia at gcc dot gnu.org
  2023-10-15  2:09 ` [Bug tree-optimization/111820] [13/14 Regression] Compiler time hog in the vectorizer with `-O3 -fno-tree-vrp` pinskia at gcc dot gnu.org
@ 2023-10-16  7:14 ` rguenth at gcc dot gnu.org
  2023-10-16  7:22 ` crazylht at gmail dot com
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-10-16  7:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111820

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2
                 CC|                            |crazylht at gmail dot com

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
        for (unsigned i = 0; i != skipn - 1; i++)
          begin = wi::mul (begin, wi::to_wide (step_expr));

(gdb) p skipn
$5 = 4294967292

niters is 4294967292 in vect_update_ivs_after_vectorizer.  Maybe the loop
should terminate when begin is zero.  But I wonder why we pass in 'niters'
and then name it 'skip_niters' ...
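
To illustrate the "terminate when begin is zero" idea, here is a minimal sketch in plain C. It is not the GCC code: it assumes 32-bit wrapping arithmetic in place of `wide_int`, and the function name `peel_mul_early_exit` and the `iters` out-parameter are hypothetical.

```c
#include <stdint.h>

/* Multiply `begin` by `step` up to (skipn - 1) times modulo 2^32, but stop
   as soon as the running product has wrapped to zero, since further
   multiplications cannot change it.  *iters receives the number of
   multiplications actually performed. */
static uint32_t peel_mul_early_exit(uint32_t begin, uint32_t step,
                                    uint32_t skipn, uint64_t *iters)
{
    uint64_t count = 0;
    for (uint32_t i = 0; i != skipn - 1; i++) {
        if (begin == 0)
            break;                                   /* nothing left to compute */
        begin = (uint32_t)((uint64_t)begin * step);  /* wrap to 32 bits */
        count++;
    }
    *iters = count;
    return begin;
}
```

With `begin = 1`, `step = 16` and `skipn = 4294967292`, this sketch stops after 8 multiplications, because 16^8 = 2^32 wraps to zero. It does not help when the product never reaches zero.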

CCing author for fixing.


* [Bug tree-optimization/111820] [13/14 Regression] Compiler time hog in the vectorizer with `-O3 -fno-tree-vrp`
  2023-10-15  2:02 [Bug c/111820] New: GCC: 14: hangs with a simple while loop 141242068 at smail dot nju.edu.cn
                   ` (2 preceding siblings ...)
  2023-10-16  7:14 ` rguenth at gcc dot gnu.org
@ 2023-10-16  7:22 ` crazylht at gmail dot com
  2023-10-16  7:43 ` crazylht at gmail dot com
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: crazylht at gmail dot com @ 2023-10-16  7:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111820

--- Comment #4 from Hongtao.liu <crazylht at gmail dot com> ---
> niters is 4294967292 in vect_update_ivs_after_vectorizer.  Maybe the loop
> should terminate when begin is zero.  But I wonder why we pass in 'niters'
> and then name it 'skip_niters' ...
>

It's coming from here

  niters_skip = LOOP_VINFO_MASK_SKIP_NITERS (loop_vinfo);
  /* If we are using the loop mask to "peel" for alignment then we need
     to adjust the start value here.  */
  if (niters_skip != NULL_TREE)
    init_expr = vect_peel_nonlinear_iv_init (&stmts, init_expr, niters_skip,
                                             step_expr, induction_type);


* [Bug tree-optimization/111820] [13/14 Regression] Compiler time hog in the vectorizer with `-O3 -fno-tree-vrp`
  2023-10-15  2:02 [Bug c/111820] New: GCC: 14: hangs with a simple while loop 141242068 at smail dot nju.edu.cn
                   ` (3 preceding siblings ...)
  2023-10-16  7:22 ` crazylht at gmail dot com
@ 2023-10-16  7:43 ` crazylht at gmail dot com
  2023-10-16  8:46 ` rguenther at suse dot de
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: crazylht at gmail dot com @ 2023-10-16  7:43 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111820

--- Comment #5 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Richard Biener from comment #3)
>         for (unsigned i = 0; i != skipn - 1; i++)
>           begin = wi::mul (begin, wi::to_wide (step_expr));
> 
> (gdb) p skipn
> $5 = 4294967292
> 
> niters is 4294967292 in vect_update_ivs_after_vectorizer.  Maybe the loop
> should terminate when begin is zero.  But I wonder why we pass in 'niters'
Here, it wants to calculate begin * pow (step_expr, skipn); yes, we can just skip
the loop when begin is 0.
We can also optimize the loop into a shift when step_expr is a power of 2.
But for other cases, the loop is still needed.
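
A rough sketch of the shift idea, assuming 32-bit wrapping arithmetic rather than GCC's `wide_int`. The helpers `exact_log2_u32` and `mul_pow_u32` are hypothetical names for illustration only; they are not the functions used in the vectorizer.

```c
#include <stdint.h>

/* Return log2 (x) if x is a power of two, otherwise -1. */
static int exact_log2_u32(uint32_t x)
{
    if (x == 0 || (x & (x - 1)) != 0)
        return -1;
    int n = 0;
    while ((x >>= 1) != 0)
        n++;
    return n;
}

/* Compute begin * pow (step, skipn) modulo 2^32. */
static uint32_t mul_pow_u32(uint32_t begin, uint32_t step, uint32_t skipn)
{
    int log2_step = exact_log2_u32(step);
    if (log2_step >= 0) {
        /* Power-of-two step: one shift replaces skipn multiplications. */
        uint64_t shift = (uint64_t)log2_step * skipn;
        return shift >= 32 ? 0 : begin << shift;
    }
    /* Other steps: the loop is still needed in this sketch. */
    for (uint32_t i = 0; i < skipn; i++)
        begin = (uint32_t)((uint64_t)begin * step);
    return begin;
}
```

For a power-of-two step the cost no longer depends on skipn, so a skip count such as 4294967292 is handled in constant time; the fallback loop remains linear in skipn.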


* [Bug tree-optimization/111820] [13/14 Regression] Compiler time hog in the vectorizer with `-O3 -fno-tree-vrp`
  2023-10-15  2:02 [Bug c/111820] New: GCC: 14: hangs with a simple while loop 141242068 at smail dot nju.edu.cn
                   ` (4 preceding siblings ...)
  2023-10-16  7:43 ` crazylht at gmail dot com
@ 2023-10-16  8:46 ` rguenther at suse dot de
  2023-10-17  6:56 ` crazylht at gmail dot com
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: rguenther at suse dot de @ 2023-10-16  8:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111820

--- Comment #6 from rguenther at suse dot de <rguenther at suse dot de> ---
On Mon, 16 Oct 2023, crazylht at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111820
> 
> --- Comment #5 from Hongtao.liu <crazylht at gmail dot com> ---
> (In reply to Richard Biener from comment #3)
> >         for (unsigned i = 0; i != skipn - 1; i++)
> >           begin = wi::mul (begin, wi::to_wide (step_expr));
> > 
> > (gdb) p skipn
> > $5 = 4294967292
> > 
> > niters is 4294967292 in vect_update_ivs_after_vectorizer.  Maybe the loop
> > should terminate when begin is zero.  But I wonder why we pass in 'niters'
> Here, it want to calculate begin * pow (step_expr, skipn), yes we can just skip
> the loop when begin is 0.

I mean terminate it when the multiplication overflowed to zero.

As for the MASK_ thing the skip is to be interpreted negative (we
should either not use a 'tree' here or make it have the correct type
maybe).  Can we even handle this here?  It would need to be
a division, no?

So I think we need to disable non-linear IV or masked peeling for
niter/alignment?  But I wonder how we run into this with plain -O3.


* [Bug tree-optimization/111820] [13/14 Regression] Compiler time hog in the vectorizer with `-O3 -fno-tree-vrp`
  2023-10-15  2:02 [Bug c/111820] New: GCC: 14: hangs with a simple while loop 141242068 at smail dot nju.edu.cn
                   ` (5 preceding siblings ...)
  2023-10-16  8:46 ` rguenther at suse dot de
@ 2023-10-17  6:56 ` crazylht at gmail dot com
  2023-10-17  7:16 ` rguenther at suse dot de
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: crazylht at gmail dot com @ 2023-10-17  6:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111820

--- Comment #7 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to rguenther@suse.de from comment #6)
> On Mon, 16 Oct 2023, crazylht at gmail dot com wrote:
> 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111820
> > 
> > --- Comment #5 from Hongtao.liu <crazylht at gmail dot com> ---
> > (In reply to Richard Biener from comment #3)
> > >         for (unsigned i = 0; i != skipn - 1; i++)
> > >           begin = wi::mul (begin, wi::to_wide (step_expr));
> > > 
> > > (gdb) p skipn
> > > $5 = 4294967292
> > > 
> > > niters is 4294967292 in vect_update_ivs_after_vectorizer.  Maybe the loop
> > > should terminate when begin is zero.  But I wonder why we pass in 'niters'
> > Here, it want to calculate begin * pow (step_expr, skipn), yes we can just skip
> > the loop when begin is 0.
> 
> I mean terminate it when the multiplication overflowed to zero.
For pow (3, skipn), it will never overflow to zero.
To solve this problem once and for all, I'm leaning towards setting a threshold
in vect_can_peel_nonlinear_iv_p for vect_step_op_mul: if step_expr is not a
power of 2 (exact_log2 () fails) and niter > TYPE_PRECISION (step_expr), we give
up on vectorization.
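
A small sketch of both points, under the assumption of 32-bit wrapping arithmetic. The names `mults_until_zero` and `can_peel_mul_iv` are hypothetical and only model the proposed check; they are not taken from GCC.

```c
#include <stdbool.h>
#include <stdint.h>

/* Count how many multiplications by `step` it takes for 1 * step^k to wrap
   to zero modulo 2^32.  Returns 0 if that does not happen within `limit`
   multiplications.  An odd step is invertible modulo 2^32, so its powers
   are never zero. */
static unsigned mults_until_zero(uint32_t step, unsigned limit)
{
    uint32_t v = 1;
    for (unsigned k = 1; k <= limit; k++) {
        v = (uint32_t)((uint64_t)v * step);
        if (v == 0)
            return k;
    }
    return 0;
}

static bool is_pow2_u32(uint32_t x)
{
    return x != 0 && (x & (x - 1)) == 0;
}

/* Model of the proposed threshold: a power-of-two step can always be
   handled (it becomes a shift); any other step is accepted only while
   niters does not exceed the type precision. */
static bool can_peel_mul_iv(uint32_t step, uint64_t niters, unsigned precision)
{
    if (is_pow2_u32(step))
        return true;
    return niters <= precision;
}
```

In this model a step of 16 reaches zero after 8 multiplications and a step of 6 after 32, while a step of 3 never does, which is why an early exit on zero alone cannot bound the loop.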
> 
> As for the MASK_ thing the skip is to be interpreted negative (we
> should either not use a 'tree' here or make it have the correct type
> maybe).  Can we even handle this here?  It would need to be
> a division, no?
> 
> So I think we need to disable non-linear IV or masked peeling for
> niter/aligment?  But I wonder how we run into this with plain -O3.
I think we already disabled negative niters_skip in
vect_can_peel_nonlinear_iv_p.

  /* Also doens't support peel for neg when niter is variable.
     ??? generate something like niter_expr & 1 ? init_expr : -init_expr?  */
  niters_skip = LOOP_VINFO_MASK_SKIP_NITERS (loop_vinfo);
  if ((niters_skip != NULL_TREE
       && TREE_CODE (niters_skip) != INTEGER_CST)
      || (!vect_use_loop_mask_for_alignment_p (loop_vinfo)
          && LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) < 0))
    {
      if (dump_enabled_p ())
        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                         "Peeling for alignement is not supported"
                         " for nonlinear induction when niters_skip"
                         " is not constant.\n");
      return false;
    }


* [Bug tree-optimization/111820] [13/14 Regression] Compiler time hog in the vectorizer with `-O3 -fno-tree-vrp`
  2023-10-15  2:02 [Bug c/111820] New: GCC: 14: hangs with a simple while loop 141242068 at smail dot nju.edu.cn
                   ` (6 preceding siblings ...)
  2023-10-17  6:56 ` crazylht at gmail dot com
@ 2023-10-17  7:16 ` rguenther at suse dot de
  2023-10-17  9:13 ` crazylht at gmail dot com
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: rguenther at suse dot de @ 2023-10-17  7:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111820

--- Comment #8 from rguenther at suse dot de <rguenther at suse dot de> ---
On Tue, 17 Oct 2023, crazylht at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111820
> 
> --- Comment #7 from Hongtao.liu <crazylht at gmail dot com> ---
> (In reply to rguenther@suse.de from comment #6)
> > On Mon, 16 Oct 2023, crazylht at gmail dot com wrote:
> > 
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111820
> > > 
> > > --- Comment #5 from Hongtao.liu <crazylht at gmail dot com> ---
> > > (In reply to Richard Biener from comment #3)
> > > >         for (unsigned i = 0; i != skipn - 1; i++)
> > > >           begin = wi::mul (begin, wi::to_wide (step_expr));
> > > > 
> > > > (gdb) p skipn
> > > > $5 = 4294967292
> > > > 
> > > > niters is 4294967292 in vect_update_ivs_after_vectorizer.  Maybe the loop
> > > > should terminate when begin is zero.  But I wonder why we pass in 'niters'
> > > Here, it want to calculate begin * pow (step_expr, skipn), yes we can just skip
> > > the loop when begin is 0.
> > 
> > I mean terminate it when the multiplication overflowed to zero.
> for pow (3, skipn), it will never overflowed to zero.
> To solve this problem once and for all, I'm leaning towards setting a threshold
> in vect_can_peel_nonlinear_iv_p for vect_step_op_mul,if step_expr is not
> exact_log2() and niter > TYPE_PRECISION (step_expr) we give up on doing
> vectorization.

Hm, yeah - that's probably best.

> > 
> > As for the MASK_ thing the skip is to be interpreted negative (we
> > should either not use a 'tree' here or make it have the correct type
> > maybe).  Can we even handle this here?  It would need to be
> > a division, no?
> > 
> > So I think we need to disable non-linear IV or masked peeling for
> > niter/aligment?  But I wonder how we run into this with plain -O3.
> I think we already disabled negative niters_skip in
> vect_can_peel_nonlinear_iv_p.
> 
> 416  /* Also doens't support peel for neg when niter is variable.
> 1417     ??? generate something like niter_expr & 1 ? init_expr : -init_expr? 
> */
> 1418  niters_skip = LOOP_VINFO_MASK_SKIP_NITERS (loop_vinfo);
> 1419  if ((niters_skip != NULL_TREE
> 1420       && TREE_CODE (niters_skip) != INTEGER_CST)

But we end up here with niters_skip being INTEGER_CST and ..

> 1421      || (!vect_use_loop_mask_for_alignment_p (loop_vinfo)

possibly vect_use_loop_mask_for_alignment_p.  Note
LOOP_VINFO_PEELING_FOR_ALIGNMENT < 0 simply means the amount of
peeling is unknown.

But I wonder how we run into this on x86 without enabling
loop masking ...

> 1422          && LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) < 0))
> 1423    {
> 1424      if (dump_enabled_p ())
> 1425        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> 1426                         "Peeling for alignement is not supported"
> 1427                         " for nonlinear induction when niters_skip"
> 1428                         " is not constant.\n");
> 1429      return false;
> 1430    }


* [Bug tree-optimization/111820] [13/14 Regression] Compiler time hog in the vectorizer with `-O3 -fno-tree-vrp`
  2023-10-15  2:02 [Bug c/111820] New: GCC: 14: hangs with a simple while loop 141242068 at smail dot nju.edu.cn
                   ` (7 preceding siblings ...)
  2023-10-17  7:16 ` rguenther at suse dot de
@ 2023-10-17  9:13 ` crazylht at gmail dot com
  2023-10-17  9:47 ` rguenth at gcc dot gnu.org
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: crazylht at gmail dot com @ 2023-10-17  9:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111820

--- Comment #9 from Hongtao.liu <crazylht at gmail dot com> ---

> But we end up here with niters_skip being INTEGER_CST and ..
> 
> > 1421      || (!vect_use_loop_mask_for_alignment_p (loop_vinfo)
> 
> possibly vect_use_loop_mask_for_alignment_p.  Note
> LOOP_VINFO_PEELING_FOR_ALIGNMENT < 0 simply means the amount of
> peeling is unknown.
> 
> But I wonder how we run into this on x86 without enabling
> loop masking ...
> 
> > 1422          && LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) < 0))
> > 1423    {
> > 1424      if (dump_enabled_p ())
> > 1425        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> > 1426                         "Peeling for alignement is not supported"
> > 1427                         " for nonlinear induction when niters_skip"
> > 1428                         " is not constant.\n");
> > 1429      return false;
> > 1430    }

Can you point out where it's assigned as negative?
I saw LOOP_VINFO_MASK_SKIP_NITERS is only assigned in
vect_prepare_for_masked_peels.

When LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) > 0,
it's assigned as vf - npeel (can npeel be > vf?);
otherwise it's assigned from get_misalign_in_elems and should be positive.

  HOST_WIDE_INT elem_size
    = int_cst_value (TYPE_SIZE_UNIT (TREE_TYPE (vectype)));
  tree elem_size_log = build_int_cst (type, exact_log2 (elem_size));

  /* Create:  misalign_in_bytes = addr & (target_align - 1).  */
  tree int_start_addr = fold_convert (type, start_addr);
  tree misalign_in_bytes = fold_build2 (BIT_AND_EXPR, type, int_start_addr,
                                        target_align_minus_1);

  /* Create:  misalign_in_elems = misalign_in_bytes / element_size.  */
  tree misalign_in_elems = fold_build2 (RSHIFT_EXPR, type, misalign_in_bytes,
                                        elem_size_log);

  return misalign_in_elems;

void
vect_prepare_for_masked_peels (loop_vec_info loop_vinfo)
{
  tree misalign_in_elems;
  tree type = TREE_TYPE (LOOP_VINFO_NITERS (loop_vinfo));

  gcc_assert (vect_use_loop_mask_for_alignment_p (loop_vinfo));

  /* From the information recorded in LOOP_VINFO get the number of iterations
     that need to be skipped via masking.  */
  if (LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) > 0)
    {
      poly_int64 misalign = (LOOP_VINFO_VECT_FACTOR (loop_vinfo)
                             - LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo));
      misalign_in_elems = build_int_cst (type, misalign);
    }
  else
    {
      gimple_seq seq1 = NULL, seq2 = NULL;
      misalign_in_elems = get_misalign_in_elems (&seq1, loop_vinfo);
      misalign_in_elems = fold_convert (type, misalign_in_elems);
      misalign_in_elems = force_gimple_operand (misalign_in_elems,
                                                &seq2, true, NULL_TREE);
      gimple_seq_add_seq (&seq1, seq2);
      if (seq1)
        {
          edge pe = loop_preheader_edge (LOOP_VINFO_LOOP (loop_vinfo));
          basic_block new_bb = gsi_insert_seq_on_edge_immediate (pe, seq1);
          gcc_assert (!new_bb);
        }
    }

  if (dump_enabled_p ())
    dump_printf_loc (MSG_NOTE, vect_location,
                     "misalignment for fully-masked loop: %T\n",
                     misalign_in_elems);

  LOOP_VINFO_MASK_SKIP_NITERS (loop_vinfo) = misalign_in_elems;

  vect_update_inits_of_drs (loop_vinfo, misalign_in_elems, MINUS_EXPR);
}
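
For readers unfamiliar with the tree-building code, the arithmetic in the get_misalign_in_elems excerpt above amounts to a mask followed by a shift. A minimal sketch on plain integers follows; the function name `misalign_in_elems_u` and its parameters are hypothetical, and it assumes `target_align` is a power of two.

```c
#include <stdint.h>

/* misalign_in_bytes = addr & (target_align - 1)
   misalign_in_elems = misalign_in_bytes >> log2 (element_size)
   Both results are non-negative by construction. */
static uint32_t misalign_in_elems_u(uintptr_t start_addr,
                                    uint32_t target_align,
                                    uint32_t elem_size_log2)
{
    uintptr_t misalign_in_bytes = start_addr & (uintptr_t)(target_align - 1);
    return (uint32_t)(misalign_in_bytes >> elem_size_log2);
}
```

For example, an address of 0x1008 with a 32-byte target alignment and 4-byte elements gives a misalignment of 8 bytes, or 2 elements.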


* [Bug tree-optimization/111820] [13/14 Regression] Compiler time hog in the vectorizer with `-O3 -fno-tree-vrp`
  2023-10-15  2:02 [Bug c/111820] New: GCC: 14: hangs with a simple while loop 141242068 at smail dot nju.edu.cn
                   ` (8 preceding siblings ...)
  2023-10-17  9:13 ` crazylht at gmail dot com
@ 2023-10-17  9:47 ` rguenth at gcc dot gnu.org
  2023-10-23  1:16 ` cvs-commit at gcc dot gnu.org
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-10-17  9:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111820

--- Comment #10 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Hongtao.liu from comment #9)
> > But we end up here with niters_skip being INTEGER_CST and ..
> > 
> > > 1421      || (!vect_use_loop_mask_for_alignment_p (loop_vinfo)
> > 
> > possibly vect_use_loop_mask_for_alignment_p.  Note
> > LOOP_VINFO_PEELING_FOR_ALIGNMENT < 0 simply means the amount of
> > peeling is unknown.
> > 
> > But I wonder how we run into this on x86 without enabling
> > loop masking ...
> > 
> > > 1422          && LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) < 0))
> > > 1423    {
> > > 1424      if (dump_enabled_p ())
> > > 1425        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> > > 1426                         "Peeling for alignement is not supported"
> > > 1427                         " for nonlinear induction when niters_skip"
> > > 1428                         " is not constant.\n");
> > > 1429      return false;
> > > 1430    }
> 
> Can you point out where it's assigned as nagative?
> I saw LOOP_VINFO_MASK_SKIP_NITERS is only assigned in
> vect_prepare_for_masked_peels.

Yes.

> when LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) > 0
> it's assigned as vf-npeel(will npeel > vf?)

npeel should be < vf

OK, so it should be positive indeed.  But LOOP_VINFO_MASK_SKIP_NITERS
(when vect_use_loop_mask_for_alignment_p ()) means that the
first vector iteration only processes the first
vf - LOOP_VINFO_MASK_SKIP_NITERS scalar iterations, so a

 for (i = start; i < end; ++i)
   ..

loop is executed as

 for (i = start - LOOP_VINFO_MASK_SKIP_NITERS; i < end; ++i)
   if (i >= start)
     {
     }

that is, the loop mask is used to mask out the first
LOOP_VINFO_MASK_SKIP_NITERS elements.
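
The equivalence can be checked with a scalar model. This is only a sketch: `sum_plain`, `sum_masked` and the `skip` parameter are hypothetical stand-ins, with `skip` playing the role of LOOP_VINFO_MASK_SKIP_NITERS and the `if` playing the role of the loop mask.

```c
#include <stddef.h>

/* The original loop: for (i = start; i < end; ++i). */
static long sum_plain(const int *a, ptrdiff_t start, ptrdiff_t end)
{
    long s = 0;
    for (ptrdiff_t i = start; i < end; ++i)
        s += a[i];
    return s;
}

/* The same loop started `skip` iterations early, with the leading
   iterations masked out so they do no work and touch no memory. */
static long sum_masked(const int *a, ptrdiff_t start, ptrdiff_t end,
                       ptrdiff_t skip)
{
    long s = 0;
    for (ptrdiff_t i = start - skip; i < end; ++i)
        if (i >= start)          /* models the loop mask */
            s += a[i];
    return s;
}
```

Both functions return the same value for any non-negative `skip`; only the starting index of the iteration space changes, which is why the initial values of induction variables have to be adjusted accordingly.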

It's a bit difficult to force peeling for alignment on x86, but
usually -fno-vect-cost-model will peel a store (it will peel the
most used ref for alignment).  With --param vect-partial-vector-usage=2
you should get AVX512 masked alignment peeling then I think.


* [Bug tree-optimization/111820] [13/14 Regression] Compiler time hog in the vectorizer with `-O3 -fno-tree-vrp`
  2023-10-15  2:02 [Bug c/111820] New: GCC: 14: hangs with a simple while loop 141242068 at smail dot nju.edu.cn
                   ` (9 preceding siblings ...)
  2023-10-17  9:47 ` rguenth at gcc dot gnu.org
@ 2023-10-23  1:16 ` cvs-commit at gcc dot gnu.org
  2023-10-23  2:39 ` crazylht at gmail dot com
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-10-23  1:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111820

--- Comment #11 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:

https://gcc.gnu.org/g:dbde384bd56f07bfbcae86f81fc74aa92e3786ad

commit r14-4834-gdbde384bd56f07bfbcae86f81fc74aa92e3786ad
Author: liuhongt <hongtao.liu@intel.com>
Date:   Wed Oct 18 10:08:24 2023 +0800

    Avoid compile time hog on vect_peel_nonlinear_iv_init for nonlinear
induction vec_step_op_mul when iteration count is too big.

    There's loop in vect_peel_nonlinear_iv_init to get init_expr *
    pow (step_expr, skip_niters). When skipn_iters is too big, compile time
    hogs. To avoid that, optimize init_expr * pow (step_expr, skip_niters) to
    init_expr << (exact_log2 (step_expr) * skip_niters) when step_expr is
    pow of 2, otherwise give up vectorization when skip_niters >=
    TYPE_PRECISION (TREE_TYPE (init_expr)).

    Also give up vectorization when niters_skip is negative which will be
    used for fully masked loop.

    gcc/ChangeLog:

            PR tree-optimization/111820
            PR tree-optimization/111833
            * tree-vect-loop-manip.cc (vect_can_peel_nonlinear_iv_p): Give
            up vectorization for nonlinear iv vect_step_op_mul when
            step_expr is not exact_log2 and niters is greater than
            TYPE_PRECISION (TREE_TYPE (step_expr)). Also don't vectorize
            for nagative niters_skip which will be used by fully masked
            loop.
            (vect_can_advance_ivs_p): Pass whole phi_info to
            vect_can_peel_nonlinear_iv_p.
            * tree-vect-loop.cc (vect_peel_nonlinear_iv_init): Optimize
            init_expr * pow (step_expr, skipn) to init_expr
            << (log2 (step_expr) * skipn) when step_expr is exact_log2.

    gcc/testsuite/ChangeLog:

            * gcc.target/i386/pr111820-1.c: New test.
            * gcc.target/i386/pr111820-2.c: New test.
            * gcc.target/i386/pr111820-3.c: New test.
            * gcc.target/i386/pr103144-mul-1.c: Adjust testcase.
            * gcc.target/i386/pr103144-mul-2.c: Adjust testcase.


* [Bug tree-optimization/111820] [13/14 Regression] Compiler time hog in the vectorizer with `-O3 -fno-tree-vrp`
  2023-10-15  2:02 [Bug c/111820] New: GCC: 14: hangs with a simple while loop 141242068 at smail dot nju.edu.cn
                   ` (10 preceding siblings ...)
  2023-10-23  1:16 ` cvs-commit at gcc dot gnu.org
@ 2023-10-23  2:39 ` crazylht at gmail dot com
  2023-10-23  9:42 ` [Bug tree-optimization/111820] [13 " rguenth at gcc dot gnu.org
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: crazylht at gmail dot com @ 2023-10-23  2:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111820

--- Comment #12 from Hongtao.liu <crazylht at gmail dot com> ---
Fixed in GCC14, not sure if we want to backport the patch.
If so, the patch needs to be adjusted since GCC13 doesn't support auto_mpz.


* [Bug tree-optimization/111820] [13 Regression] Compiler time hog in the vectorizer with `-O3 -fno-tree-vrp`
  2023-10-15  2:02 [Bug c/111820] New: GCC: 14: hangs with a simple while loop 141242068 at smail dot nju.edu.cn
                   ` (11 preceding siblings ...)
  2023-10-23  2:39 ` crazylht at gmail dot com
@ 2023-10-23  9:42 ` rguenth at gcc dot gnu.org
  2023-10-27  1:06 ` cvs-commit at gcc dot gnu.org
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-10-23  9:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111820

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to fail|14.0                        |
            Summary|[13/14 Regression] Compiler |[13 Regression] Compiler
                   |time hog in the vectorizer  |time hog in the vectorizer
                   |with `-O3 -fno-tree-vrp`    |with `-O3 -fno-tree-vrp`
      Known to work|                            |14.0

--- Comment #13 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Hongtao.liu from comment #12)
> Fixed in GCC14, not sure if we want to backport the patch.
> If so, the patch needs to be adjusted since GCC13 doesn't support auto_mpz.

Yes, we want to backport.


* [Bug tree-optimization/111820] [13 Regression] Compiler time hog in the vectorizer with `-O3 -fno-tree-vrp`
  2023-10-15  2:02 [Bug c/111820] New: GCC: 14: hangs with a simple while loop 141242068 at smail dot nju.edu.cn
                   ` (12 preceding siblings ...)
  2023-10-23  9:42 ` [Bug tree-optimization/111820] [13 " rguenth at gcc dot gnu.org
@ 2023-10-27  1:06 ` cvs-commit at gcc dot gnu.org
  2023-10-27  1:18 ` crazylht at gmail dot com
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-10-27  1:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111820

--- Comment #14 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-13 branch has been updated by hongtao Liu
<liuhongt@gcc.gnu.org>:

https://gcc.gnu.org/g:82919cf4cb232166fed03d84a91fefd07feef6bb

commit r13-7988-g82919cf4cb232166fed03d84a91fefd07feef6bb
Author: liuhongt <hongtao.liu@intel.com>
Date:   Wed Oct 18 10:08:24 2023 +0800

    Avoid compile time hog on vect_peel_nonlinear_iv_init for nonlinear
    induction vect_step_op_mul when iteration count is too big.

    There's a loop in vect_peel_nonlinear_iv_init to compute init_expr *
    pow (step_expr, skip_niters). When skip_niters is too big, it becomes a
    compile time hog. To avoid that, optimize init_expr * pow (step_expr,
    skip_niters) to init_expr << (exact_log2 (step_expr) * skip_niters) when
    step_expr is a power of 2, otherwise give up vectorization when
    skip_niters >= TYPE_PRECISION (TREE_TYPE (init_expr)).

    Also give up vectorization when niters_skip is negative, which is
    used for fully masked loops.

    gcc/ChangeLog:

            PR tree-optimization/111820
            PR tree-optimization/111833
            * tree-vect-loop-manip.cc (vect_can_peel_nonlinear_iv_p): Give
            up vectorization for nonlinear iv vect_step_op_mul when
            step_expr is not exact_log2 and niters is greater than
            TYPE_PRECISION (TREE_TYPE (step_expr)). Also don't vectorize
            for negative niters_skip which will be used by fully masked
            loop.
            (vect_can_advance_ivs_p): Pass whole phi_info to
            vect_can_peel_nonlinear_iv_p.
            * tree-vect-loop.cc (vect_peel_nonlinear_iv_init): Optimize
            init_expr * pow (step_expr, skipn) to init_expr
            << (log2 (step_expr) * skipn) when step_expr is exact_log2.

    gcc/testsuite/ChangeLog:

            * gcc.target/i386/pr111820-1.c: New test.
            * gcc.target/i386/pr111820-2.c: New test.
            * gcc.target/i386/pr111820-3.c: New test.
            * gcc.target/i386/pr103144-mul-1.c: Adjust testcase.
            * gcc.target/i386/pr103144-mul-2.c: Adjust testcase.
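
For illustration only, the closed form described in the commit message can
be sketched in plain C on 32-bit unsigned values. This is not the GCC code:
the names peel_init_naive, peel_init_fast and exact_log2_u32 are
hypothetical, the precision is fixed at 32 bits as an assumption, and a
return value of 0 stands in for "give up vectorization".

```c
#include <stdint.h>

/* Hypothetical helper: log2 (x) if x is a power of two, otherwise -1. */
static int exact_log2_u32(uint32_t x)
{
    if (x == 0 || (x & (x - 1)) != 0)
        return -1;
    int n = 0;
    while (x >>= 1)
        n++;
    return n;
}

/* Naive form: init * pow (step, skip) modulo 2^32, one multiply per
   skipped iteration.  This is the loop that gets slow for huge skip. */
static uint32_t peel_init_naive(uint32_t init, uint32_t step, uint64_t skip)
{
    while (skip--)
        init *= step;
    return init;
}

/* Sketch of the optimized form.  Returns 1 and stores the result in *out
   when a cheap answer exists, 0 when the caller should give up. */
static int peel_init_fast(uint32_t init, uint32_t step, uint64_t skip,
                          uint32_t *out)
{
    const unsigned prec = 32;   /* assumed precision of the type */
    int log2_step = exact_log2_u32(step);

    if (log2_step == 0) {       /* step == 1: value never changes */
        *out = init;
        return 1;
    }
    if (log2_step > 0) {
        /* Power-of-two step: multiply becomes a shift.  Once the shift
           amount reaches the precision, every bit is shifted out. */
        if (skip >= prec) {
            *out = 0;
            return 1;
        }
        uint64_t shift = (uint64_t)log2_step * skip;
        *out = shift >= prec ? 0 : init << shift;
        return 1;
    }
    if (skip >= prec)           /* non-power-of-two step, too many iterations */
        return 0;
    *out = peel_init_naive(init, step, skip);   /* at most prec - 1 multiplies */
    return 1;
}
```

With a step of 16 and a very large skip count, peel_init_fast returns
immediately, whereas peel_init_naive would execute one multiplication per
skipped iteration.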

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Bug tree-optimization/111820] [13 Regression] Compiler time hog in the vectorizer with `-O3 -fno-tree-vrp`
  2023-10-15  2:02 [Bug c/111820] New: GCC: 14: hangs with a simple while loop 141242068 at smail dot nju.edu.cn
                   ` (13 preceding siblings ...)
  2023-10-27  1:06 ` cvs-commit at gcc dot gnu.org
@ 2023-10-27  1:18 ` crazylht at gmail dot com
  2023-10-27  1:20 ` pinskia at gcc dot gnu.org
  2023-10-27  3:29 ` pinskia at gcc dot gnu.org
  16 siblings, 0 replies; 18+ messages in thread
From: crazylht at gmail dot com @ 2023-10-27  1:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111820

--- Comment #15 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Richard Biener from comment #13)
> (In reply to Hongtao.liu from comment #12)
> > Fixed in GCC14, not sure if we want to backport the patch.
> > If so, the patch needs to be adjusted since GCC13 doesn't support auto_mpz.
> 
> Yes, we want to backport.

Also fixed in GCC13.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Bug tree-optimization/111820] [13 Regression] Compiler time hog in the vectorizer with `-O3 -fno-tree-vrp`
  2023-10-15  2:02 [Bug c/111820] New: GCC: 14: hangs with a simple while loop 141242068 at smail dot nju.edu.cn
                   ` (14 preceding siblings ...)
  2023-10-27  1:18 ` crazylht at gmail dot com
@ 2023-10-27  1:20 ` pinskia at gcc dot gnu.org
  2023-10-27  3:29 ` pinskia at gcc dot gnu.org
  16 siblings, 0 replies; 18+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-10-27  1:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111820

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #16 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Bug tree-optimization/111820] [13 Regression] Compiler time hog in the vectorizer with `-O3 -fno-tree-vrp`
  2023-10-15  2:02 [Bug c/111820] New: GCC: 14: hangs with a simple while loop 141242068 at smail dot nju.edu.cn
                   ` (15 preceding siblings ...)
  2023-10-27  1:20 ` pinskia at gcc dot gnu.org
@ 2023-10-27  3:29 ` pinskia at gcc dot gnu.org
  16 siblings, 0 replies; 18+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-10-27  3:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111820

--- Comment #17 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
*** Bug 111833 has been marked as a duplicate of this bug. ***

^ permalink raw reply	[flat|nested] 18+ messages in thread

Thread overview: 18+ messages
2023-10-15  2:02 [Bug c/111820] New: GCC: 14: hangs with a simple while loop 141242068 at smail dot nju.edu.cn
2023-10-15  2:07 ` [Bug tree-optimization/111820] " pinskia at gcc dot gnu.org
2023-10-15  2:09 ` [Bug tree-optimization/111820] [13/14 Regression] Compiler time hog in the vectorizer with `-O3 -fno-tree-vrp` pinskia at gcc dot gnu.org
2023-10-16  7:14 ` rguenth at gcc dot gnu.org
2023-10-16  7:22 ` crazylht at gmail dot com
2023-10-16  7:43 ` crazylht at gmail dot com
2023-10-16  8:46 ` rguenther at suse dot de
2023-10-17  6:56 ` crazylht at gmail dot com
2023-10-17  7:16 ` rguenther at suse dot de
2023-10-17  9:13 ` crazylht at gmail dot com
2023-10-17  9:47 ` rguenth at gcc dot gnu.org
2023-10-23  1:16 ` cvs-commit at gcc dot gnu.org
2023-10-23  2:39 ` crazylht at gmail dot com
2023-10-23  9:42 ` [Bug tree-optimization/111820] [13 " rguenth at gcc dot gnu.org
2023-10-27  1:06 ` cvs-commit at gcc dot gnu.org
2023-10-27  1:18 ` crazylht at gmail dot com
2023-10-27  1:20 ` pinskia at gcc dot gnu.org
2023-10-27  3:29 ` pinskia at gcc dot gnu.org
