public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/101145] New: niter analysis fails for until-wrap condition
@ 2021-06-21  8:59 rguenth at gcc dot gnu.org
  2021-06-21  9:02 ` [Bug tree-optimization/101145] " rguenth at gcc dot gnu.org
                   ` (11 more replies)
  0 siblings, 12 replies; 13+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-06-21  8:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101145

            Bug ID: 101145
           Summary: niter analysis fails for until-wrap condition
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

unsigned foo(unsigned val, unsigned start)
{
  unsigned cnt = 0;
  for (unsigned i = start; i > val; ++i)
    cnt++;
  return cnt;
}

fails niter analysis.  At least for the following variant
the number of iterations should be about UINT_MAX - start.

unsigned foo(unsigned val, unsigned start)
{
  unsigned cnt = 0;
  if (start > val)
    {
      unsigned i = start;
      do
        {
          cnt++;
        }
      while (++i > val);
    }
  return cnt;
}

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug tree-optimization/101145] niter analysis fails for until-wrap condition
  2021-06-21  8:59 [Bug tree-optimization/101145] New: niter analysis fails for until-wrap condition rguenth at gcc dot gnu.org
@ 2021-06-21  9:02 ` rguenth at gcc dot gnu.org
  2021-06-24 15:39 ` amker at gcc dot gnu.org
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-06-21  9:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101145

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2021-06-21
           Keywords|                            |missed-optimization
                 CC|                            |amker at gcc dot gnu.org,
                   |                            |guojiufu at gcc dot gnu.org

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
This comes up with a pending patch to split loops like

void
foo (int *a, int *b, unsigned l, unsigned n)
{
  while (++l != n)
    a[l] = b[l] + 1;
}

into

  while (++l > n)
    a[l] = b[l] + 1;
  while (++l < n)
    a[l] = b[l] + 1;

since for the second loop (the "usual" case involving no wrapping of the IV)
this results in affine IVs and thus analyzable data dependence.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug tree-optimization/101145] niter analysis fails for until-wrap condition
  2021-06-21  8:59 [Bug tree-optimization/101145] New: niter analysis fails for until-wrap condition rguenth at gcc dot gnu.org
  2021-06-21  9:02 ` [Bug tree-optimization/101145] " rguenth at gcc dot gnu.org
@ 2021-06-24 15:39 ` amker at gcc dot gnu.org
  2021-06-25  1:28 ` guojiufu at gcc dot gnu.org
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: amker at gcc dot gnu.org @ 2021-06-24 15:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101145

--- Comment #2 from bin cheng <amker at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #1)
> This comes up with a pending patch to split loops like
> 
> void
> foo (int *a, int *b, unsigned l, unsigned n)
> {
>   while (++l != n)
>     a[l] = b[l] + 1;
> }
> 
> into
> 
>   while (++l > n)
>     a[l] = b[l] + 1;
>   while (++l < n)
>     a[l] = b[l] + 1;
> 
> since for the second loop (the "usual" case involving no wrapping of the IV)
> this results in affine IVs and thus analyzable data dependence.

Special case like "i++ > constant" are handled in function
adjust_cond_for_loop_until_wrap, however, it only handles constant invariant on
the other side right now.

Will see how to cover simple cases as reported here.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug tree-optimization/101145] niter analysis fails for until-wrap condition
  2021-06-21  8:59 [Bug tree-optimization/101145] New: niter analysis fails for until-wrap condition rguenth at gcc dot gnu.org
  2021-06-21  9:02 ` [Bug tree-optimization/101145] " rguenth at gcc dot gnu.org
  2021-06-24 15:39 ` amker at gcc dot gnu.org
@ 2021-06-25  1:28 ` guojiufu at gcc dot gnu.org
  2021-06-25  1:34 ` amker at gcc dot gnu.org
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: guojiufu at gcc dot gnu.org @ 2021-06-25  1:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101145

--- Comment #3 from Jiu Fu Guo <guojiufu at gcc dot gnu.org> ---
Yes, while the code in adjust_cond_for_loop_until_wrap seems somehow tricky:

  /* Only support simple cases for the moment.  */
  if (TREE_CODE (iv0->base) != INTEGER_CST
      || TREE_CODE (iv1->base) != INTEGER_CST)
    return false;

This code requires both sides are constant.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug tree-optimization/101145] niter analysis fails for until-wrap condition
  2021-06-21  8:59 [Bug tree-optimization/101145] New: niter analysis fails for until-wrap condition rguenth at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2021-06-25  1:28 ` guojiufu at gcc dot gnu.org
@ 2021-06-25  1:34 ` amker at gcc dot gnu.org
  2021-06-25 10:13 ` guojiufu at gcc dot gnu.org
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: amker at gcc dot gnu.org @ 2021-06-25  1:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101145

--- Comment #4 from bin cheng <amker at gcc dot gnu.org> ---
(In reply to Jiu Fu Guo from comment #3)
> Yes, while the code in adjust_cond_for_loop_until_wrap seems somehow tricky:
> 
>   /* Only support simple cases for the moment.  */
>   if (TREE_CODE (iv0->base) != INTEGER_CST
>       || TREE_CODE (iv1->base) != INTEGER_CST)
>     return false;
> 
> This code requires both sides are constant.
Actually it requires an IV with constant base.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug tree-optimization/101145] niter analysis fails for until-wrap condition
  2021-06-21  8:59 [Bug tree-optimization/101145] New: niter analysis fails for until-wrap condition rguenth at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2021-06-25  1:34 ` amker at gcc dot gnu.org
@ 2021-06-25 10:13 ` guojiufu at gcc dot gnu.org
  2021-06-25 12:24 ` guojiufu at gcc dot gnu.org
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: guojiufu at gcc dot gnu.org @ 2021-06-25 10:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101145

--- Comment #5 from Jiu Fu Guo <guojiufu at gcc dot gnu.org> ---
(In reply to bin cheng from comment #4)
> (In reply to Jiu Fu Guo from comment #3)
> > Yes, while the code in adjust_cond_for_loop_until_wrap seems somehow tricky:
> > 
> >   /* Only support simple cases for the moment.  */
> >   if (TREE_CODE (iv0->base) != INTEGER_CST
> >       || TREE_CODE (iv1->base) != INTEGER_CST)
> >     return false;
> > 
> > This code requires both sides are constant.
> Actually it requires an IV with constant base.

I also feel that the intention of this function may only require one side
constant for IV0 CODE IV1.
As tests, for below loop, adjust_cond_for_loop_until_wrap return false:

foo (int *__restrict__ a, int *__restrict__ b, unsigned i)
{
  while (++i > 100)
    *a++ = *b++ + 1;
}

For below code, adjust_cond_for_loop_until_wrap returns true:
  i = UINT_MAX - 200;
  while (++i > 100)
    *a++ = *b++ + 1;

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug tree-optimization/101145] niter analysis fails for until-wrap condition
  2021-06-21  8:59 [Bug tree-optimization/101145] New: niter analysis fails for until-wrap condition rguenth at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2021-06-25 10:13 ` guojiufu at gcc dot gnu.org
@ 2021-06-25 12:24 ` guojiufu at gcc dot gnu.org
  2021-06-26  2:53 ` amker at gcc dot gnu.org
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: guojiufu at gcc dot gnu.org @ 2021-06-25 12:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101145

--- Comment #6 from Jiu Fu Guo <guojiufu at gcc dot gnu.org> ---
> As tests, for below loop, adjust_cond_for_loop_until_wrap return false:
> 
> foo (int *__restrict__ a, int *__restrict__ b, unsigned i)
> {
>   while (++i > 100)
>     *a++ = *b++ + 1;
> }
For the above code, niter still may be zero: e.g. "i < 100" at the start.
For the below code, niter can be determined as constant at compiling time.
> 
> For below code, adjust_cond_for_loop_until_wrap returns true:
>   i = UINT_MAX - 200;
>   while (++i > 100)
>     *a++ = *b++ + 1;

For below code, niter is also may be zero: e.g. "UINT_MAX - 100 < n" .
   i = UINT_MAX - 200
   while (++i > n)
     *a++ = *b++ + 1;

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug tree-optimization/101145] niter analysis fails for until-wrap condition
  2021-06-21  8:59 [Bug tree-optimization/101145] New: niter analysis fails for until-wrap condition rguenth at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2021-06-25 12:24 ` guojiufu at gcc dot gnu.org
@ 2021-06-26  2:53 ` amker at gcc dot gnu.org
  2021-07-01  3:42 ` guojiufu at gcc dot gnu.org
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: amker at gcc dot gnu.org @ 2021-06-26  2:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101145

--- Comment #7 from bin cheng <amker at gcc dot gnu.org> ---
(In reply to Jiu Fu Guo from comment #5)
> (In reply to bin cheng from comment #4)
> > (In reply to Jiu Fu Guo from comment #3)
> > > Yes, while the code in adjust_cond_for_loop_until_wrap seems somehow tricky:
> > > 
> > >   /* Only support simple cases for the moment.  */
> > >   if (TREE_CODE (iv0->base) != INTEGER_CST
> > >       || TREE_CODE (iv1->base) != INTEGER_CST)
> > >     return false;
> > > 
> > > This code requires both sides are constant.
> > Actually it requires an IV with constant base.
> 
> I also feel that the intention of this function may only require one side
> constant for IV0 CODE IV1.
> As tests, for below loop, adjust_cond_for_loop_until_wrap return false:
> 
> foo (int *__restrict__ a, int *__restrict__ b, unsigned i)
> {
>   while (++i > 100)
>     *a++ = *b++ + 1;
> }
> 
> For below code, adjust_cond_for_loop_until_wrap returns true:
>   i = UINT_MAX - 200;
>   while (++i > 100)
>     *a++ = *b++ + 1;

Oh sorry for being misleading.  When I mentioned it requires something(...), I
was describing the current behavior, not that the conditions are necessary. 
Feel free to improve such cases.  Looking into niter analysis, these
cases(trade-offs) are not rare.

Thanks

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug tree-optimization/101145] niter analysis fails for until-wrap condition
  2021-06-21  8:59 [Bug tree-optimization/101145] New: niter analysis fails for until-wrap condition rguenth at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2021-06-26  2:53 ` amker at gcc dot gnu.org
@ 2021-07-01  3:42 ` guojiufu at gcc dot gnu.org
  2021-08-25  8:39 ` cvs-commit at gcc dot gnu.org
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: guojiufu at gcc dot gnu.org @ 2021-07-01  3:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101145

--- Comment #8 from Jiu Fu Guo <guojiufu at gcc dot gnu.org> ---
Reference the code of adjust_cond_for_loop_until_wrap, add code for non-const
cases.  Code was added in adjust_cond_for_loop_until_wrap at beginning, to set
may_be_zero and no_overflow, the code was moved to number_of_iterations_lt at
last.

The patch was submitted as: 
https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574110.html

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug tree-optimization/101145] niter analysis fails for until-wrap condition
  2021-06-21  8:59 [Bug tree-optimization/101145] New: niter analysis fails for until-wrap condition rguenth at gcc dot gnu.org
                   ` (7 preceding siblings ...)
  2021-07-01  3:42 ` guojiufu at gcc dot gnu.org
@ 2021-08-25  8:39 ` cvs-commit at gcc dot gnu.org
  2021-08-31 13:27 ` cvs-commit at gcc dot gnu.org
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-08-25  8:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101145

--- Comment #9 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jiu Fu Guo <guojiufu@gcc.gnu.org>:

https://gcc.gnu.org/g:3673dcf6d6baeb67bb70ff03d4cb3f92beed0075

commit r12-3136-g3673dcf6d6baeb67bb70ff03d4cb3f92beed0075
Author: Jiufu Guo <guojiufu@linux.ibm.com>
Date:   Wed Jul 7 13:41:01 2021 +0800

    Analyze niter for until-wrap condition [PR101145]

    For code like:
    unsigned foo(unsigned val, unsigned start)
    {
      unsigned cnt = 0;
      for (unsigned i = start; i > val; ++i)
        cnt++;
      return cnt;
    }

    The number of iterations should be about UINT_MAX - start.

    There is function adjust_cond_for_loop_until_wrap which
    handles similar work for const bases.
    Like adjust_cond_for_loop_until_wrap, this patch enhance
    function number_of_iterations_cond/number_of_iterations_lt
    to analyze number of iterations for this kind of loop.

    gcc/ChangeLog:

    2021-08-25  Jiufu Guo  <guojiufu@linux.ibm.com>

            PR tree-optimization/101145
            * tree-ssa-loop-niter.c (number_of_iterations_until_wrap):
            New function.
            (number_of_iterations_lt): Invoke above function.
            (adjust_cond_for_loop_until_wrap):
            Merge to number_of_iterations_until_wrap.
            (number_of_iterations_cond): Update invokes for
            adjust_cond_for_loop_until_wrap and number_of_iterations_lt.

    gcc/testsuite/ChangeLog:

    2021-08-25  Jiufu Guo  <guojiufu@linux.ibm.com>

            PR tree-optimization/101145
            * gcc.dg/vect/pr101145.c: New test.
            * gcc.dg/vect/pr101145.inc: New test.
            * gcc.dg/vect/pr101145_1.c: New test.
            * gcc.dg/vect/pr101145_2.c: New test.
            * gcc.dg/vect/pr101145_3.c: New test.
            * gcc.dg/vect/pr101145inf.c: New test.
            * gcc.dg/vect/pr101145inf.inc: New test.
            * gcc.dg/vect/pr101145inf_1.c: New test.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug tree-optimization/101145] niter analysis fails for until-wrap condition
  2021-06-21  8:59 [Bug tree-optimization/101145] New: niter analysis fails for until-wrap condition rguenth at gcc dot gnu.org
                   ` (8 preceding siblings ...)
  2021-08-25  8:39 ` cvs-commit at gcc dot gnu.org
@ 2021-08-31 13:27 ` cvs-commit at gcc dot gnu.org
  2021-09-17 10:01 ` marxin at gcc dot gnu.org
  2021-12-01 19:59 ` cvs-commit at gcc dot gnu.org
  11 siblings, 0 replies; 13+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-08-31 13:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101145

--- Comment #10 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:eca730231d5493647bb1e508fb1f853ffee0e95a

commit r12-3255-geca730231d5493647bb1e508fb1f853ffee0e95a
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Tue Aug 31 15:26:14 2021 +0200

    testsuite: Fix gcc.dg/vect/pr101145* tests [PR101145]

    I'm getting:
    FAIL: gcc.dg/vect/pr101145.c scan-tree-dump-times vect "vectorized 1 loops"
7
    FAIL: gcc.dg/vect/pr101145_1.c scan-tree-dump-times vect "vectorized 1
loops" 2
    FAIL: gcc.dg/vect/pr101145_2.c scan-tree-dump-times vect "vectorized 1
loops" 2
    FAIL: gcc.dg/vect/pr101145_3.c scan-tree-dump-times vect "vectorized 1
loops" 2
    FAIL: gcc.dg/vect/pr101145.c -flto -ffat-lto-objects  scan-tree-dump-times
vect "vectorized 1 loops" 7
    FAIL: gcc.dg/vect/pr101145_1.c -flto -ffat-lto-objects 
scan-tree-dump-times vect "vectorized 1 loops" 2
    FAIL: gcc.dg/vect/pr101145_2.c -flto -ffat-lto-objects 
scan-tree-dump-times vect "vectorized 1 loops" 2
    FAIL: gcc.dg/vect/pr101145_3.c -flto -ffat-lto-objects 
scan-tree-dump-times vect "vectorized 1 loops" 2
    on i686-linux (or x86_64-linux with -m32/-mno-sse).
    The problem is that those tests use dg-options, which in */vect/ testsuite
    throws away all the carefully added default options to enable vectorization
    on each target (and which e.g. vect_int etc. effective targets rely on).
    The old way would be to name those tests gcc.dg/vect/O3-pr101145*,
    but we can also use dg-additional-options (which doesn't throw the default
    options, just appends to them) which is IMO better so that we don't have to
    rename the tests.

    2021-08-31  Jakub Jelinek  <jakub@redhat.com>

            PR tree-optimization/101145
            * gcc.dg/vect/pr101145.c: Use dg-additional-options with just -O3
            instead of dg-options with -O3 -fdump-tree-vect-details.
            * gcc.dg/vect/pr101145_1.c: Likewise.
            * gcc.dg/vect/pr101145_2.c: Likewise.
            * gcc.dg/vect/pr101145_3.c: Likewise.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug tree-optimization/101145] niter analysis fails for until-wrap condition
  2021-06-21  8:59 [Bug tree-optimization/101145] New: niter analysis fails for until-wrap condition rguenth at gcc dot gnu.org
                   ` (9 preceding siblings ...)
  2021-08-31 13:27 ` cvs-commit at gcc dot gnu.org
@ 2021-09-17 10:01 ` marxin at gcc dot gnu.org
  2021-12-01 19:59 ` cvs-commit at gcc dot gnu.org
  11 siblings, 0 replies; 13+ messages in thread
From: marxin at gcc dot gnu.org @ 2021-09-17 10:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101145
Bug 101145 depends on bug 102364, which changed state.

Bug 102364 Summary: [12 Regression] wrong code at -O1 and above on x86_64-linux-gnu since r12-3136-g3673dcf6d6baeb67
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102364

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |DUPLICATE

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug tree-optimization/101145] niter analysis fails for until-wrap condition
  2021-06-21  8:59 [Bug tree-optimization/101145] New: niter analysis fails for until-wrap condition rguenth at gcc dot gnu.org
                   ` (10 preceding siblings ...)
  2021-09-17 10:01 ` marxin at gcc dot gnu.org
@ 2021-12-01 19:59 ` cvs-commit at gcc dot gnu.org
  11 siblings, 0 replies; 13+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-12-01 19:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101145

--- Comment #11 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Roger Sayle <sayle@gcc.gnu.org>:

https://gcc.gnu.org/g:de3e5aae6c4b540e808c822c1e878b0a3304d09c

commit r12-5699-gde3e5aae6c4b540e808c822c1e878b0a3304d09c
Author: Roger Sayle <roger@nextmovesoftware.com>
Date:   Wed Dec 1 19:58:40 2021 +0000

    Final value replacement improvements for until-wrap loops.

    This middle-end patch is inspired by the Richard Beiner's until-wrap
    loop example in PR tree-optimization/101145.

    unsigned foo(unsigned val, unsigned start)
    {
      unsigned cnt = 0;
      for (unsigned i = start; i > val; ++i)
        cnt++;
      return cnt;
    }

    For this loop, the tree optimizers currently generate:

    unsigned int foo (unsigned int val, unsigned int start)
    {
      unsigned int cnt;
      unsigned int _1;
      unsigned int _5;

      <bb 2> [local count: 118111600]:
      if (start_3(D) > val_4(D))
        goto <bb 3>; [89.00%]
      else
        goto <bb 4>; [11.00%]

      <bb 3> [local count: 105119324]:
      _1 = start_3(D) + 1;
      _5 = -start_3(D);
      cnt_2 = _1 > val_4(D) ? _5 : 1;

      <bb 4> [local count: 118111600]:
      # cnt_11 = PHI <cnt_2(3), 0(2)>
      return cnt_11;
    }

    or perhaps slightly easier to read:

      if (start > val) {
        cnt = (start+1) > val ? -start : 1;
      } else cnt = 0;

    In this snippet, if we know start > val, then (start+1) > val
    unless start+1 overflows, i.e. (start+1) == 0 and start == ~0.
    We can use this (loop header) context to simplify the ternary
    expression to "(start != -1) ? -start : 1", which with a little
    help from match.pd can be folded to -start.  Hence the optimal
    final value replacement should be:

      cnt = (start > val) ? -start : 0;

    Or as now generated by this patch:

    unsigned int foo (unsigned int val, unsigned int start)
    {
      unsigned int cnt;

      <bb 2> [local count: 118111600]:
      if (start_3(D) > val_4(D))
        goto <bb 3>; [89.00%]
      else
        goto <bb 4>; [11.00%]

      <bb 3> [local count: 105119324]:
      cnt_2 = -start_3(D);

      <bb 4> [local count: 118111600]:
      # cnt_11 = PHI <cnt_2(3), 0(2)>
      return cnt_11;
    }

    We can also improve until-wrap loops that don't have a (suitable) loop
    header, as determined by simplify_using_initial_conditions.

    unsigned bar(unsigned val, unsigned start)
    {
      unsigned cnt = 0;
      unsigned i = start;
      do {
        cnt++;
        i++;
      } while (i > val);
      return cnt;
    }

    which is currently optimized to:

    unsigned int foo (unsigned int val, unsigned int start)
    {
      unsigned int cnt;
      unsigned int _9;
      unsigned int _10;

      <bb 2> [local count: 118111600]:
      _9 = start_4(D) + 1;
      _10 = -start_4(D);
      cnt_3 = val_7(D) < _9 ? _10 : 1;
      return cnt_3;
    }

    Here we have "val < (start+1) ? -start : 1", which again with the
    help of match.pd can be slightly simplified to "val <= start ? -start : 1"
    when dealing with unsigned types, because at the complicating value where
    start == ~0, we fortunately have -start == 1, hence it doesn't matter
    whether the second or third operand of the ternary operator is returned.

    To summarize, this patch (in addition to tweaking may_be_zero in
    number_of_iterations_until_wrap) adds three new constant folding
    transforms to match.pd.

    X != C1 ? -X : C2 simplifies to -X when -C1 == C2.
    which is the generalized form of the simplification above.

    X != C1 ? ~X : C2 simplifies to ~X when ~C1 == C2.
    which is the BIT_NOT_EXPR analog of the NEGATE_EXPR case.

    and the "until-wrap final value replacement without context":

    (X + 1) > Y ? -X : 1 simplifies to X >= Y ? -X : 1 when
    X is unsigned, as when X + 1 overflows, X is -1, so -X == 1.

    2021-12-01  Roger Sayle  <roger@nextmovesoftware.com>
                Richard Biener  <rguenther@suse.de>

    gcc/ChangeLog
            * tree-ssa-loop-niter.c (number_of_iterations_until_wrap):
            Check if simplify_using_initial_conditions allows us to
            simplify the expression for may_be_zero.
            * match.pd (X != C ? -X : -C -> -X): New transform.
            (X != C ? ~X : ~C -> ~X): Likewise.
            ((X+1) > Y ? -X : 1 -> X >= Y ? -X : 1): Likewise.

    gcc/testsuite/ChangeLog
            * gcc.dg/fold-condneg-1.c: New test case.
            * gcc.dg/fold-condneg-2.c: New test case.
            * gcc.dg/fold-condnot-1.c: New test case.
            * gcc.dg/pr101145-1.c: New test case.
            * gcc.dg/pr101145-2.c: New test case.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2021-12-01 19:59 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-21  8:59 [Bug tree-optimization/101145] New: niter analysis fails for until-wrap condition rguenth at gcc dot gnu.org
2021-06-21  9:02 ` [Bug tree-optimization/101145] " rguenth at gcc dot gnu.org
2021-06-24 15:39 ` amker at gcc dot gnu.org
2021-06-25  1:28 ` guojiufu at gcc dot gnu.org
2021-06-25  1:34 ` amker at gcc dot gnu.org
2021-06-25 10:13 ` guojiufu at gcc dot gnu.org
2021-06-25 12:24 ` guojiufu at gcc dot gnu.org
2021-06-26  2:53 ` amker at gcc dot gnu.org
2021-07-01  3:42 ` guojiufu at gcc dot gnu.org
2021-08-25  8:39 ` cvs-commit at gcc dot gnu.org
2021-08-31 13:27 ` cvs-commit at gcc dot gnu.org
2021-09-17 10:01 ` marxin at gcc dot gnu.org
2021-12-01 19:59 ` cvs-commit at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).