public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/66678] New: loop counter not accurately described by vrp
@ 2015-06-26 11:15 vries at gcc dot gnu.org
  2015-06-26 12:53 ` [Bug tree-optimization/66678] " vries at gcc dot gnu.org
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: vries at gcc dot gnu.org @ 2015-06-26 11:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66678

            Bug ID: 66678
           Summary: loop counter not accurately described by vrp
           Product: gcc
           Version: 6.0
            Status: UNCONFIRMED
          Severity: minor
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vries at gcc dot gnu.org
  Target Milestone: ---

consider testcase:
...
void
f (unsigned int n, unsigned int *__restrict__ a, unsigned int *__restrict__ b,
   unsigned int *__restrict__ c)
{
  unsigned int i;

  for (i = 0; i < n; ++i)
    c[i] = a[i] + b[i];
}
...

After vrp1 we have:
...
f (unsigned intD.9 nD.2874, unsigned intD.9 * restrict aD.2875, unsigned intD.9
* restrict bD.2876, unsigned intD.9 * restrict cD.2877)
{
  unsigned intD.9 iD.2880;
  long unsigned intD.10 _5;
  long unsigned intD.10 _6;
  unsigned intD.9 * _8;
  unsigned intD.9 * _10;
  unsigned intD.9 _11;
  unsigned intD.9 * _13;
  unsigned intD.9 _14;
  unsigned intD.9 _15;

;;   basic block 2, loop depth 0, count 0, freq 900, maybe hot
;;    prev block 0, next block 3, flags: (NEW, REACHABLE)
;;    pred:       ENTRY [100.0%]  (FALLTHRU,EXECUTABLE)
  goto <bb 4>;
;;    succ:       4 [100.0%]  (FALLTHRU,EXECUTABLE)

;;   basic block 3, loop depth 1, count 0, freq 9100, maybe hot
;;    prev block 2, next block 4, flags: (NEW, REACHABLE)
;;    pred:       4 [91.0%]  (TRUE_VALUE,EXECUTABLE)
  # RANGE [0, 18446744073709551615] NONZERO 4294967295
  _5 = (long unsigned intD.10) i_1;
  # RANGE [0, 18446744073709551615] NONZERO 18446744073709551612
  _6 = _5 * 4;
  # PT = { D.2900 } (nonlocal)
  _8 = c_7(D) + _6;
  # PT = { D.2898 } (nonlocal)
  _10 = a_9(D) + _6;
  # VUSE <.MEM_2>
  _11 = MEM[(unsigned intD.9 *)_10 clique 1 base 2];
  # PT = { D.2899 } (nonlocal)
  _13 = b_12(D) + _6;
  # VUSE <.MEM_2>
  _14 = MEM[(unsigned intD.9 *)_13 clique 1 base 3];
  # RANGE [0, 4294967295]
  _15 = _11 + _14;
  # .MEM_16 = VDEF <.MEM_2>
  MEM[(unsigned intD.9 *)_8 clique 1 base 1] = _15;
  # RANGE [0, 4294967295]
  i_17 = i_1 + 1;
;;    succ:       4 [100.0%]  (FALLTHRU,DFS_BACK,EXECUTABLE)

;;   basic block 4, loop depth 1, count 0, freq 10000, maybe hot
;;    prev block 3, next block 5, flags: (NEW, REACHABLE)
;;    pred:       2 [100.0%]  (FALLTHRU,EXECUTABLE)
;;                3 [100.0%]  (FALLTHRU,DFS_BACK,EXECUTABLE)
  # i_1 = PHI <0(2), i_17(3)>
  # .MEM_2 = PHI <.MEM_3(D)(2), .MEM_16(3)>
  if (i_1 < n_4(D))
    goto <bb 3>;
  else
    goto <bb 5>;
;;    succ:       3 [91.0%]  (TRUE_VALUE,EXECUTABLE)
;;                5 [9.0%]  (FALSE_VALUE,EXECUTABLE)

;;   basic block 5, loop depth 0, count 0, freq 900, maybe hot
;;    prev block 4, next block 1, flags: (NEW, REACHABLE)
;;    pred:       4 [9.0%]  (FALSE_VALUE,EXECUTABLE)
  # VUSE <.MEM_2>
  return;
;;    succ:       EXIT [100.0%] 

}
...

AFAIU:
- the loop iv i_1 has range [0, 4294967294], and 
- the loop iv increment i_17 has range [1, 4294967295]

But in the dump resulting from vrp1, i_1 has no range assigned, and i_17 has
RANGE [0, 4294967295] (which is equivalent to no range assigned).


During vrp the pass inserts an assert at the start of bb 3:
...
;;   basic block 3, loop depth 1, count 0, freq 9100, maybe hot
;;    prev block 2, next block 4, flags: (NEW, REACHABLE)
;;    pred:       4 [91.0%]  (TRUE_VALUE,EXECUTABLE)
  i_18 = ASSERT_EXPR <i_1, i_1 < n_4(D)>;
  # RANGE [0, 18446744073709551615] NONZERO 4294967295
  _5 = (long unsigned intD.10) i_18;
  # RANGE [0, 18446744073709551615] NONZERO 18446744073709551612
  _6 = _5 * 4;
  # PT = { D.2900 } (nonlocal)
  _8 = c_7(D) + _6;
  # PT = { D.2898 } (nonlocal)
  _10 = a_9(D) + _6;
  # VUSE <.MEM_2>
  _11 = MEM[(unsigned intD.9 *)_10 clique 1 base 2];
  # PT = { D.2899 } (nonlocal)
  _13 = b_12(D) + _6;
  # VUSE <.MEM_2>
  _14 = MEM[(unsigned intD.9 *)_13 clique 1 base 3];
  _15 = _11 + _14;
  # .MEM_16 = VDEF <.MEM_2>
  MEM[(unsigned intD.9 *)_8 clique 1 base 1] = _15;
  i_17 = i_18 + 1;
;;    succ:       4 [100.0%]  (FALLTHRU,DFS_BACK,EXECUTABLE)

;;   basic block 4, loop depth 1, count 0, freq 10000, maybe hot
;;    prev block 3, next block 5, flags: (NEW, REACHABLE)
;;    pred:       2 [100.0%]  (FALLTHRU,EXECUTABLE)
;;                3 [100.0%]  (FALLTHRU,DFS_BACK,EXECUTABLE)
  # i_1 = PHI <0(2), i_17(3)>
  # .MEM_2 = PHI <.MEM_3(D)(2), .MEM_16(3)>
  if (i_1 < n_4(D))
    goto <bb 3>;
  else
    goto <bb 5>;
...

When visiting the assert we get:
...
Visiting statement:
i_18 = ASSERT_EXPR <i_1, i_1 < n_4(D)>;
Intersecting
  [0, n_4(D) + 4294967295]  EQUIVALENCES: { i_1 } (1 elements)
and
  [0, 0]
to
  [0, n_4(D) + 4294967295]  EQUIVALENCES: { i_1 } (1 elements)
Found new range for i_18: [0, n_4(D) + 4294967295]
...

AFAIU, if we have no information on n_4, then range
  [0, n_4(D) + 4294967295]
is equal to 
  [0, 4294967295]

>From the assert however we can also derive a range of
  [0, 4294967294 ]
given that i_1 < n_4 and n_4 is at most 4294967295.

So, a more accurate range is
  [0, MIN (n_4(D) + 4294967295, 4294967294) ].


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug tree-optimization/66678] loop counter not accurately described by vrp
  2015-06-26 11:15 [Bug tree-optimization/66678] New: loop counter not accurately described by vrp vries at gcc dot gnu.org
@ 2015-06-26 12:53 ` vries at gcc dot gnu.org
  2015-06-30  8:32 ` vries at gcc dot gnu.org
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: vries at gcc dot gnu.org @ 2015-06-26 12:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66678

--- Comment #1 from vries at gcc dot gnu.org ---
I.
(In reply to vries from comment #0)

> AFAIU:
> - the loop iv i_1 has range [0, 4294967294], and 

This bit is incorrect. i_1 can actually be 4294967295. 


II.

With this demonstrator patch:
...
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index fdaebe4..3dd2992 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -1890,7 +1890,8 @@ extract_range_from_assert (value_range_t *vr_p, tree
expr)
                max = fold_build2 (PLUS_EXPR, TREE_TYPE (max), max,
                                   build_int_cst (TREE_TYPE (max), -1));
              else
-               max = fold_build2 (MINUS_EXPR, TREE_TYPE (max), max,
+               max = fold_build2 (MINUS_EXPR, TREE_TYPE (max),
+                                  TYPE_MAXVAL (TREE_TYPE (max)),
                                   build_int_cst (TREE_TYPE (max), 1));
              if (EXPR_P (max))
                TREE_NO_WARNING (max) = 1;
...

we get:
...
  # RANGE [1, 4294967295]
  i_17 = i_1 + 1;
...


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug tree-optimization/66678] loop counter not accurately described by vrp
  2015-06-26 11:15 [Bug tree-optimization/66678] New: loop counter not accurately described by vrp vries at gcc dot gnu.org
  2015-06-26 12:53 ` [Bug tree-optimization/66678] " vries at gcc dot gnu.org
@ 2015-06-30  8:32 ` vries at gcc dot gnu.org
  2015-06-30  8:48 ` rguenth at gcc dot gnu.org
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: vries at gcc dot gnu.org @ 2015-06-30  8:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66678

--- Comment #2 from vries at gcc dot gnu.org ---
Created attachment 35878
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35878&action=edit
tentative patch


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug tree-optimization/66678] loop counter not accurately described by vrp
  2015-06-26 11:15 [Bug tree-optimization/66678] New: loop counter not accurately described by vrp vries at gcc dot gnu.org
  2015-06-26 12:53 ` [Bug tree-optimization/66678] " vries at gcc dot gnu.org
  2015-06-30  8:32 ` vries at gcc dot gnu.org
@ 2015-06-30  8:48 ` rguenth at gcc dot gnu.org
  2023-08-09 23:06 ` pinskia at gcc dot gnu.org
  2023-08-10  6:57 ` rguenther at suse dot de
  4 siblings, 0 replies; 6+ messages in thread
From: rguenth at gcc dot gnu.org @ 2015-06-30  8:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66678

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to vries from comment #2)
> Created attachment 35878 [details]
> tentative patch

That single-use case is awfully special ... just add an unrelated use
to the function and we no longer optimize again.

The real issue with VRP is that given an assert-expr we have no chance to
record multiple ranges via equivalences because equivalences are still tied
to SSA defs.  The same issue pops up for other cases.

So instead of hacking in single-use special-casing we should rather fix
that deficiency...

A possible different way to represent equivalences is to not tie value-ranges
directly to SSA names (by indexing the array with their SSA_NAME_VERSION) but
instead introduce an indirection and thus allow arbitrary new value-ranges
to be stored.

So instead of

static value_range_t **vr_value;

have

vec<value_range_t> values;
static unsigned *vr_value;

and thus get_value_range doing

   unsigned ver = SSA_NAME_VERSION (var);
   return &values[vr_value[ver]];

and make equivalences a bitmap of indexes into 'values' instead of
SSA_NAME_VERSIONs.

Then you can always record both ranges from the assert (thus keep both
the symbolic and a constant range when you now have to decide for one)


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug tree-optimization/66678] loop counter not accurately described by vrp
  2015-06-26 11:15 [Bug tree-optimization/66678] New: loop counter not accurately described by vrp vries at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2015-06-30  8:48 ` rguenth at gcc dot gnu.org
@ 2023-08-09 23:06 ` pinskia at gcc dot gnu.org
  2023-08-10  6:57 ` rguenther at suse dot de
  4 siblings, 0 replies; 6+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-08-09 23:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66678

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
So in GCC 12+ after evrp
  # RANGE [0, 4294967294] NONZERO 4294967295
  _1 = (long unsigned intD.16) i_9;
  # RANGE [0, 17179869176] NONZERO 17179869180
  _2 = _1 * 4;
...
  # RANGE [1, 4294967295]
  i_17 = i_9 + 1;

So maybe this has been fix ...

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug tree-optimization/66678] loop counter not accurately described by vrp
  2015-06-26 11:15 [Bug tree-optimization/66678] New: loop counter not accurately described by vrp vries at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2023-08-09 23:06 ` pinskia at gcc dot gnu.org
@ 2023-08-10  6:57 ` rguenther at suse dot de
  4 siblings, 0 replies; 6+ messages in thread
From: rguenther at suse dot de @ 2023-08-10  6:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66678

--- Comment #5 from rguenther at suse dot de <rguenther at suse dot de> ---
On Wed, 9 Aug 2023, pinskia at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66678
> 
> --- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
> So in GCC 12+ after evrp
>   # RANGE [0, 4294967294] NONZERO 4294967295
>   _1 = (long unsigned intD.16) i_9;
>   # RANGE [0, 17179869176] NONZERO 17179869180
>   _2 = _1 * 4;
> ...
>   # RANGE [1, 4294967295]
>   i_17 = i_9 + 1;
> 
> So maybe this has been fix ...

Note with ranger there's a regression WRT loop IVs not handled by
SCEV because it no longer iterates.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2023-08-10  6:57 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-06-26 11:15 [Bug tree-optimization/66678] New: loop counter not accurately described by vrp vries at gcc dot gnu.org
2015-06-26 12:53 ` [Bug tree-optimization/66678] " vries at gcc dot gnu.org
2015-06-30  8:32 ` vries at gcc dot gnu.org
2015-06-30  8:48 ` rguenth at gcc dot gnu.org
2023-08-09 23:06 ` pinskia at gcc dot gnu.org
2023-08-10  6:57 ` rguenther at suse dot de

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).