public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/106365] New: Miss to handle ifn .LEN_STORE in FRE
@ 2022-07-20  5:25 linkw at gcc dot gnu.org
  2022-07-20  7:35 ` [Bug tree-optimization/106365] " linkw at gcc dot gnu.org
                   ` (15 more replies)
  0 siblings, 16 replies; 17+ messages in thread
From: linkw at gcc dot gnu.org @ 2022-07-20  5:25 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106365

            Bug ID: 106365
           Summary: Miss to handle ifn .LEN_STORE in FRE
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: linkw at gcc dot gnu.org
  Target Milestone: ---

While regression testing the patch that adds an unroll factor suggestion to the
vectorizer for the rs6000 port, one failure was exposed on Power10 (which
supports partial vectors with length).

The test case is gcc/testsuite/gcc.dg/tree-ssa/pr84512.c

Options to reproduce: -O3 -mcpu=power10 -fno-vect-cost-model

The resulting IR in the optimized dump:

  <bb 2> [local count: 97603129]:
  MEM <vector(4) int> [(int *)&a] = { 0, 1, 4, 9 };
  MEM <vector(4) int> [(int *)&a + 16B] = { 16, 25, 36, 49 };
  .LEN_STORE (&MEM <int[10]> [(void *)&a + 32B], 128B, 8, { 64, 0, 0, 0, 81, 0, 0, 0, 100, 0, 0, 0, 121, 0, 0, 0 }, 0);
  vect__2.10_6 = MEM <vector(4) int> [(int *)&a];
  vect__2.10_30 = MEM <vector(4) int> [(int *)&a + 16B];
  vect_res_10.11_31 = vect__2.10_6 + vect__2.10_30;
  _33 = VEC_PERM_EXPR <vect_res_10.11_31, { 0, 0, 0, 0 }, { 2, 3, 4, 5 }>;
  _34 = vect_res_10.11_31 + _33;
  _35 = VEC_PERM_EXPR <_34, { 0, 0, 0, 0 }, { 1, 2, 3, 4 }>;
  _36 = _34 + _35;
  stmp_res_10.12_37 = BIT_FIELD_REF <_36, 32, 0>;
  _13 = a[8];
  res_3 = _13 + stmp_res_10.12_37;
  _8 = a[9];
  res_23 = res_3 + _8;
  a ={v} {CLOBBER(eol)};
  return res_23;

instead of:

int foo ()
{
  <bb 2> [local count: 97603129]:
  return 285;

}
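
For reference, the dump above corresponds to a loop of roughly the following shape. This is only a sketch reconstructed from the constants in the IR (the squares of 0..9, which sum to 285); it is not a verbatim copy of pr84512.c, and the names sum_of_squares and check_sum_of_squares are made up for illustration.

```c
#include <assert.h>

/* Sketch reconstructed from the IR above: fill a local array with
   squares, then sum it.  Every value is known at compile time, so the
   whole function can in principle fold to a constant.  */
int sum_of_squares (void)
{
  int a[10];
  int res = 0;

  for (int i = 0; i < 10; i++)
    a[i] = i * i;   /* shows up as the vector stores plus .LEN_STORE */
  for (int i = 0; i < 10; i++)
    res += a[i];    /* shows up as the vector loads and the reduction */
  return res;
}

/* 0 + 1 + 4 + 9 + 16 + 25 + 36 + 49 + 64 + 81 == 285.  */
void check_sum_of_squares (void)
{
  assert (sum_of_squares () == 285);
}
```

Because the array never escapes, value numbering only has to see through the three stores to reach the constant.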

* [Bug tree-optimization/106365] Miss to handle ifn .LEN_STORE in FRE
From: linkw at gcc dot gnu.org @ 2022-07-20  7:35 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106365

Kewen Lin <linkw at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org,
                   |                            |segher at gcc dot gnu.org
           Keywords|                            |missed-optimization
     Ever confirmed|0                           |1
             Target|                            |powerpc*-linux-gnu
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2022-07-20

--- Comment #1 from Kewen Lin <linkw at gcc dot gnu.org> ---
vn_reference_lookup_3 in sccvn can already handle the memcpy built-in, so it
seems we could teach it about this IFN there.  Another idea would be to fold
this IFN early into something that sccvn already supports, such as aggregate
construction with constants?

* [Bug tree-optimization/106365] Miss to handle ifn .LEN_STORE in FRE
From: rguenth at gcc dot gnu.org @ 2022-07-20  8:25 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106365

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rsandifo at gcc dot gnu.org

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
What are the semantics of .LEN_STORE?  I can't find documentation for this :/
There's docs for the len_store optab but how 'mask' and 'bias' relate to its
operands isn't documented anywhere.  If the cited .LEN_STORE is a full store
then sure - folding to a plain MEM = value; is preferred.  Otherwise I wouldn't
split it up.  Handling of partial stores in VN is possible, the "easiest" way
is probably via vn_reference_lookup_3 and its support for partial defs
(for constant masks a store may then be composed of multiple partial defs
and "masked" parts that are required will be taken from earlier stores).

Maybe handling of all partial store IFNs can be commonized somehow.

Alias analysis in general (ref_maybe_used_by_stmt_p, call_may_clobber_ref_p,
stmt_kills_ref_p) also lacks handling of them - possibly some more general
helpers can facilitate that.

* [Bug tree-optimization/106365] Miss to handle ifn .LEN_STORE in FRE
From: linkw at gcc dot gnu.org @ 2022-07-20  8:53 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106365

--- Comment #3 from Kewen Lin <linkw at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #2)
> What's the semantic of .LEN_STORE?  I can't find documentation for this :/ 
> There's docs for the len_store optab but how 'mask' and 'bias' relate to its
> operands isn't documented anywhere.

Yeah, it seems that in general we don't document IFNs; I guess that's because
in most cases an IFN maps to one relevant optab.  The doc for the len_store
optab has some notes on "bias" (operand 3): it's either 0 or -1, and it is
used as part of the value that specifies how many (op2 - op3) vector elements
will be stored.  For now, Power10 uses 0 and s390 uses -1.

" Store (operand 2 - operand 3) vector elements from vector register operand 1
  into memory operand 0, leaving the other elements of operand 0 unchanged. 
...

  Operand 2 can be a variable or a constant amount.  Operand 3 specifies a
  constant bias: it is either a constant 0 or a constant -1.  The predicate on
  operand 3 must only accept the bias values that the target actually supports.
  GCC handles a bias of 0 more efficiently than a bias of -1."

For the statement:

  .LEN_STORE (&MEM <int[10]> [(void *)&a + 32B], 128B, 8, { 64, 0, 0, 0, 81, 0, 0, 0, 100, 0, 0, 0, 121, 0, 0, 0 }, 0);

op0 is the destination memory, op1 (128B) is the alias/alignment info, op2 (8)
is the length in bytes to be stored, op3 is the source constant vector, and
op4 is the bias.

> If the cited .LEN_STORE is a full store
> then sure - folding to a plain MEM = value; is preferred.  

The source constant vector above is 16 bytes and the length is 8 bytes, so it's
not a full store in this case.
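
To make the operand roles concrete, here is a scalar model of the store described above. It is a sketch under assumptions: len_store_model and len_store_demo are hypothetical helpers (not GCC APIs), the length is taken in bytes as in the statement above, and int is assumed to be 4 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Scalar model of a length-controlled store (hypothetical helper, not a
   GCC API): copy the first (len - bias) bytes of SRC to DST and leave
   the remaining bytes of DST untouched.  BIAS is 0 or -1.  */
void len_store_model (void *dst, const void *src, size_t len, int bias)
{
  assert (bias == 0 || bias == -1);
  size_t nbytes = len + (size_t) (-bias);   /* len - bias */
  memcpy (dst, src, nbytes);
}

/* Mirrors the statement above: a 16-byte source vector, length 8,
   bias 0.  Only the first two ints of the destination are written.
   Returns 1 when the destination has the expected contents.  */
int len_store_demo (void)
{
  int src[4] = { 64, 81, 100, 121 };
  int dst[4] = { -1, -1, -1, -1 };

  len_store_model (dst, src, 8, 0);
  return dst[0] == 64 && dst[1] == 81 && dst[2] == -1 && dst[3] == -1;
}
```

With a bias of -1 the length operand holds one less than the number of bytes stored, so the model adds one back.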

> Otherwise I wouldn't
> split it up.  Handling of partial stores in VN is possible, the "easiest" way
> is probably via vn_reference_lookup_3 and its support for partial defs
> (for constant masks a store may then be composed of multiple partial defs
> and "masked" parts that are required will be taken from earlier stores).
> 

OK, thanks for the pointer!  I'll have a look at it.

> Maybe handling of all partial store IFNs can be commonized somehow.
> 

I just tried SVE (partial load/store with a mask) with
-msve-vector-bits=128 --param vect-partial-vector-usage=1, and it also ends up
with sub-optimal code:

  <bb 2> [local count: 97603129]:
  MEM <vector(4) int> [(int *)&a] = { 0, 1, 4, 9 };
  MEM <vector(4) int> [(int *)&a + 16B] = { 16, 25, 36, 49 };
  .MASK_STORE (&MEM <int[10]> [(void *)&a + 32B], 128B, { -1, -1, 0, 0 }, { 64, 81, 100, 121 });
  vect__2.10_13 = MEM <vector(4) int> [(int *)&a];
  vect__2.10_29 = MEM <vector(4) int> [(int *)&a + 16B];
  vect_res_10.11_30 = vect__2.10_13 + vect__2.10_29;
  _35 = (vector(4) int) vect_res_10.11_30;
  vect__7.16_41 = .MASK_LOAD (&MEM <int[10]> [(void *)&a + 32B], 128B, { -1, -1, 0, 0 });
  vect_res_15.17_42 = .COND_ADD ({ -1, -1, 0, 0 }, _35, vect__7.16_41, _35);
  _44 = .REDUC_PLUS (vect_res_15.17_42); [tail call]
  a ={v} {CLOBBER(eol)};
  return _44;

> Alias analysis in general (ref_maybe_used_by_stmt_p, call_may_clobber_ref_p,
> stmt_kills_ref_p) also miss handling of them - possibly some more general
> helpers can facilitate that.

* [Bug tree-optimization/106365] Miss to handle ifn .LEN_STORE in FRE
From: rguenth at gcc dot gnu.org @ 2022-07-20  9:01 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106365

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
int __attribute__((noinline,noclone))
foo (int *out)
{
  int mask[] = { 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
      0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1 };
  int i;
  for (i = 0; i < 32; ++i)
    {
      if (mask[i])
        out[i] = i;
    }
  return out[7];
}

Testcase for x86_64 and .MASK_STORE; since mask[7] is set, it could be
optimized to return 7.  FRE sees

  .MASK_STORE (out_41(D), 32B, mask__7.9_47, { 0, 1, 2, 3, 4, 5, 6, 7 });
  _10 = &mask[8] + 32;
  MEM <vector(8) int> [(int *)_10] = { 0, 1, 0, 1, 0, 1, 0, 1 };

and 'mask' having its address taken makes it clobbered by .MASK_STORE.  There's
also the older issue that when mask is incoming but marked __restrict, that
isn't good enough because __restrict and calls don't work together.

The IL with .LEN_STORE might suffer similar issues at the point FRE gets
to see it.
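
As a point of reference for what value numbering has to prove here, the following is a scalar model of the masked store in the testcase. It is only a sketch: mask_store_model and mask_store_demo are hypothetical helpers (not GCC APIs), and only the first eight lanes are modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar model of a masked store (hypothetical helper, not a GCC API):
   lane I of SRC is written to DST only when MASK[I] is nonzero.  */
void mask_store_model (int *dst, const int *mask, const int *src,
                       size_t lanes)
{
  for (size_t i = 0; i < lanes; i++)
    if (mask[i])
      dst[i] = src[i];
}

/* First vector iteration of the testcase above: lanes 0..7 with the
   alternating mask.  Lane 7 is active, so out[7] is known after the
   store, while out[0] keeps whatever the caller put there.  */
int mask_store_demo (int out0)
{
  int mask[8] = { 0, 1, 0, 1, 0, 1, 0, 1 };
  int src[8] = { 0, 1, 2, 3, 4, 5, 6, 7 };
  int out[8] = { out0, 0, 0, 0, 0, 0, 0, 0 };

  mask_store_model (out, mask, src, 8);
  assert (out[0] == out0);   /* inactive lane is left alone */
  return out[7];
}
```

Only reads of active lanes can be replaced by constants; a read of an inactive lane still has to look through to earlier stores.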

We might be able to improve BB SLP to not code-gen

  _10 = &mask[8] + 32;
  MEM <vector(8) int> [(int *)_10] = { 0, 1, 0, 1, 0, 1, 0, 1 };

here, making 'mask' addressable again.  I have a patch for this in testing.

* [Bug tree-optimization/106365] Miss to handle ifn .LEN_STORE in FRE
From: rguenth at gcc dot gnu.org @ 2022-07-20  9:07 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106365

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
I will try to add handling for .MASK_STORE, hopefully that will be good enough
to massage the code for .LEN_STORE (which IIRC is "easier" since it's a
contiguous store rather than .MASK_STORE which can have multiple "pieces").
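
The difference between "contiguous" and "multiple pieces" can be sketched as follows. This is an illustration under assumptions, not the GCC implementation: struct piece and mask_to_pieces are hypothetical names, and offsets are counted in bytes from the start of the stored vector.

```c
#include <assert.h>
#include <stddef.h>

/* One contiguous run of stored bytes (illustrative type, not GCC's).  */
struct piece
{
  size_t offset;
  size_t size;
};

/* Split a constant lane mask into maximal runs of active lanes and
   record each run as a byte range.  Returns the number of pieces
   written to OUT, which must have room for (lanes + 1) / 2 entries.  */
size_t mask_to_pieces (const int *mask, size_t lanes, size_t elt_size,
                       struct piece *out)
{
  assert (elt_size > 0);
  size_t n = 0;
  size_t i = 0;

  while (i < lanes)
    {
      if (!mask[i])
        {
          i++;
          continue;
        }
      size_t start = i;
      while (i < lanes && mask[i])
        i++;
      out[n].offset = start * elt_size;
      out[n].size = (i - start) * elt_size;
      n++;
    }
  return n;
}
```

A mask such as { -1, -1, 0, 0 } yields a single piece, which is the shape a .LEN_STORE always has; an alternating mask yields one piece per active lane.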

* [Bug tree-optimization/106365] Miss to handle ifn .LEN_STORE in FRE
From: linkw at gcc dot gnu.org @ 2022-07-20  9:11 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106365

--- Comment #6 from Kewen Lin <linkw at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #5)
> I will try to add handling for .MASK_STORE, hopefully that will be good
> enough to massage the code for .LEN_STORE (which IIRC is "easier" since it's
> a contiguous store rather than .MASK_STORE which can have multiple "pieces").

Nice, thanks!  Yeah, it's contiguous. :)

* [Bug tree-optimization/106365] Miss to handle ifn .LEN_STORE in FRE
From: rguenth at gcc dot gnu.org @ 2022-07-20 10:34 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106365

--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
Created attachment 53323
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53323&action=edit
prototype

I'm testing this - for .LEN_STORE you mainly have to compute pd.rhs_off,
pd.offset, pd.size and do a single

  return data->push_partial_def (pd, set, set, offseti, maxsizei);
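
A rough picture of what those three values mean for a .LEN_STORE, in plain C. This is a sketch under assumptions and not the code in tree-ssa-sccvn.cc: struct partial_def, len_store_def and read_from_def are hypothetical, everything is counted in bytes, int is assumed to be 4 bytes, and only reads that lie entirely inside the stored range are handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative partial definition, in bytes.  The field names echo the
   ones mentioned above, but this is not GCC's pd_data.  */
struct partial_def
{
  size_t offset;    /* where the stored bytes start inside the object */
  size_t size;      /* how many bytes the store actually writes */
  size_t rhs_off;   /* first byte of the stored value that is used */
};

/* A length-controlled store is one contiguous piece of (len - bias)
   bytes starting at the store address.  */
struct partial_def len_store_def (size_t store_offset, size_t len, int bias)
{
  assert (bias == 0 || bias == -1);
  struct partial_def pd = { store_offset, len + (size_t) (-bias), 0 };
  return pd;
}

/* If the read [rd_off, rd_off + rd_size) lies entirely inside PD, copy
   the bytes out of the stored value RHS and return 1; otherwise return
   0 so that the caller keeps looking at earlier stores.  */
int read_from_def (const struct partial_def *pd, const void *rhs,
                   size_t rd_off, size_t rd_size, void *result)
{
  if (rd_off < pd->offset || rd_off + rd_size > pd->offset + pd->size)
    return 0;
  const unsigned char *bytes = rhs;
  memcpy (result, bytes + pd->rhs_off + (rd_off - pd->offset), rd_size);
  return 1;
}
```

With the numbers from the original report (store at byte 32 of a, length 8, bias 0), a 4-byte read at byte 36 is covered and yields 81, while a read at byte 40 is not covered by this store.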

* [Bug tree-optimization/106365] Miss to handle ifn .LEN_STORE in FRE
From: rguenth at gcc dot gnu.org @ 2022-07-20 12:07 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106365

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #53323|0                           |1
        is obsolete|                            |

--- Comment #8 from Richard Biener <rguenth at gcc dot gnu.org> ---
Created attachment 53324
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53324&action=edit
updated prototype

* [Bug tree-optimization/106365] Miss to handle ifn .LEN_STORE in FRE
From: rguenth at gcc dot gnu.org @ 2022-07-21  7:26 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106365

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org      |rguenth at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #9 from Richard Biener <rguenth at gcc dot gnu.org> ---
Created attachment 53328
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53328&action=edit
patch

The attached now also handles .LEN_STORE for

int __attribute__((noinline,noclone))
foo ()
{
  int out[10];
  int i;
  for (i = 0; i < 10; ++i)
    {
      out[i] = i;
    }
  return out[9];
}

where I can see

   <bb 2> [local count: 97603129]:
-  _13 = { 4, 5, 6, 7 };
   MEM <vector(4) int> [(int *)&out] = { 0, 1, 2, 3 };
-  vectp_out.7_15 = &out + 16;
-  ivtmp_16 = 1;
-  _20 = _13 + { 4, 4, 4, 4 };
-  MEM <vector(4) int> [(int *)vectp_out.7_15] = _13;
-  vectp_out.7_22 = vectp_out.7_15 + 16;
-  ivtmp_25 = ivtmp_16 + 1;
-  _28 = { 12, 13, 14, 15 };
-  vect_33 = { 8, 0, 0, 0, 9, 0, 0, 0, 10, 0, 0, 0, 11, 0, 0, 0 };
-  .LEN_STORE (&MEM <int[10]> [(void *)&out + 32B], 128B, 8, vect_33, 0);
-  vectp_out.10_31 = &MEM <int[10]> [(void *)&out + 32B] + 16;
-  ivtmp_36 = 16;
-  _39 = MIN_EXPR <ivtmp_36, 8>;
-  _40 = 8 - _39;
-  _41 = MIN_EXPR <_40, 16>;
-  _4 = out[9];
+  MEM <vector(4) int> [(int *)&out + 16B] = { 4, 5, 6, 7 };
+  .LEN_STORE (&MEM <int[10]> [(void *)&out + 32B], 128B, 8, { 8, 0, 0, 0, 9, 0, 0, 0, 10, 0, 0, 0, 11, 0, 0, 0 }, 0);
   out ={v} {CLOBBER(eol)};
-  return _4;
+  return 9;

in the diff from cunroll to fre4.

* [Bug tree-optimization/106365] Miss to handle ifn .LEN_STORE in FRE
From: linkw at gcc dot gnu.org @ 2022-07-21  7:30 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106365

--- Comment #10 from Kewen Lin <linkw at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #7)
> Created attachment 53323 [details]
> prototype
> 
> I'm testing this - for .LEN_STORE you mainly have to compute pd.rhs_off,
> pd.offset, pd.size and do a single
> 
>   return data->push_partial_def (pd, set, set, offseti, maxsizei);

Thanks!  I added the diff below and can confirm that it lets most of the code
be optimized away and generates "return 285;".

diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc
index 0a16984d2ca..ea5b7c54f82 100644
--- a/gcc/tree-ssa-sccvn.cc
+++ b/gcc/tree-ssa-sccvn.cc
@@ -3228,6 +3228,7 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void *data_,
        return (void *)-1;

       tree mask = NULL_TREE;
+      tree len = NULL_TREE;
       switch (fn)
        {
        case IFN_MASK_STORE:
@@ -3236,6 +3237,19 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void *data_,
          if (TREE_CODE (mask) != VECTOR_CST)
            return (void *)-1;
          break;
+       case IFN_LEN_STORE:
+         {
+           /* Extract the length without bias.  */
+           tree len0 = gimple_call_arg (call, 2);
+           len0 = vn_valueize (len0);
+           if (TREE_CODE (len0) != INTEGER_CST)
+             return (void *) -1;
+           tree bias = gimple_call_arg (call, 4);
+           len = fold_build2 (MINUS_EXPR, TREE_TYPE (len0), len0, bias);
+           /* Bias is either 0 or -1, biased length should be constant.  */
+           gcc_assert (TREE_CODE (len) == INTEGER_CST);
+           break;
+         }
        default:
          return (void *)-1;
        }
@@ -3311,6 +3325,16 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void *data_,
                                                       offseti, maxsizei);
                    }
                }
+             else if (len)
+               {
+                 pd.rhs_off = 0;
+                 pd.offset = offset2i;
+                 pd.size = tree_to_uhwi (len) * elsz;
+                 if (ranges_known_overlap_p (offset, maxsize, pd.offset,
+                                             pd.size))
+                   return data->push_partial_def (pd, set, set, offseti,
+                                                  maxsizei);
+               }
              else
                gcc_unreachable ();
              return NULL;

But it still keeps the .LEN_STORE there:

int foo ()
{
  int a[10];

  <bb 2> [local count: 97603129]:
  .LEN_STORE (&MEM <int[10]> [(void *)&a + 32B], 128B, 8, { 64, 0, 0, 0, 81, 0, 0, 0, 100, 0, 0, 0, 121, 0, 0, 0 }, 0);
  a ={v} {CLOBBER(eol)};
  return 285;

}

btw, the latest commit "Add alias disambiguation for vectorizer load/store
IFNs" has been applied.

* [Bug tree-optimization/106365] Miss to handle ifn .LEN_STORE in FRE
From: rguenth at gcc dot gnu.org @ 2022-07-21  7:31 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106365

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #53324|0                           |1
        is obsolete|                            |
  Attachment #53328|0                           |1
        is obsolete|                            |

--- Comment #11 from Richard Biener <rguenth at gcc dot gnu.org> ---
Created attachment 53329
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53329&action=edit
patch

Doh, and I got 'bias' wrong.  Now fixed - the affected byte range is [0, len +
-bias] now.

* [Bug tree-optimization/106365] Miss to handle ifn .LEN_STORE in FRE
From: linkw at gcc dot gnu.org @ 2022-07-21  7:37 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106365

--- Comment #12 from Kewen Lin <linkw at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #9)
> Created attachment 53328 [details]
> patch
> 

Thanks!  Sorry that I didn't see this attachment when posting the above
comment.

> +  MEM <vector(4) int> [(int *)&out + 16B] = { 4, 5, 6, 7 };
> +  .LEN_STORE (&MEM <int[10]> [(void *)&out + 32B], 128B, 8, { 8, 0, 0, 0,
> 9, 0, 0, 0, 10, 0, 0, 0, 11, 0, 0, 0 }, 0);

As in the gcc.dg/tree-ssa/pr84512.c case, shouldn't this store be eliminated as
well?

* [Bug tree-optimization/106365] Miss to handle ifn .LEN_STORE in FRE
From: rguenth at gcc dot gnu.org @ 2022-07-21  7:39 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106365

--- Comment #13 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Kewen Lin from comment #10)
> But it still keeps the .LEN_STORE there:
> 
> int foo ()
> {
>   int a[10];
> 
>   <bb 2> [local count: 97603129]:
>   .LEN_STORE (&MEM <int[10]> [(void *)&a + 32B], 128B, 8, { 64, 0, 0, 0, 81,
> 0, 0, 0, 100, 0, 0, 0, 121, 0, 0, 0 }, 0);
>   a ={v} {CLOBBER(eol)};
>   return 285;
> 
> }
> 
> btw, the latest commit "Add alias disambiguation for vectorizer load/store
> IFNs" has been applied.

I think that DSE doesn't handle the store IFNs yet - maybe adding handling
to initialize_ao_ref_for_dse would be enough - but I think it cannot yet
handle a "conservative" start (for .MASK_STORE), whereas .LEN_STORE should
be possible to describe exactly by giving an exact trimmed size to ao_ref
(the alias disambiguation changes could also be made more precise there).

Can you open a separate bugreport for the DSE issue?

* [Bug tree-optimization/106365] Miss to handle ifn .LEN_STORE in FRE
From: linkw at gcc dot gnu.org @ 2022-07-21  7:57 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106365

--- Comment #14 from Kewen Lin <linkw at gcc dot gnu.org> ---
> I think that DSE doesn't handle the store IFNs yet - maybe adding handling
> to initialize_ao_ref_for_dse would be enough - but I think it cannot yet
> handle a "conservative" start (for .MASK_STORES), but .LEN_STORE should
> be possible to describe exact by giving an exact trimmed size to ao_ref
> (the alias disambiguation changes could be also made more precise there).
> 

Got it, thanks!

> Can you open a separate bugreport for the DSE issue?

Sure, PR106378 filed.

* [Bug tree-optimization/106365] Miss to handle ifn .LEN_STORE in FRE
From: cvs-commit at gcc dot gnu.org @ 2022-07-21 11:06 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106365

--- Comment #15 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:bd9837bc3ca1344c32aef7ba9f8fa1785063132e

commit r13-1777-gbd9837bc3ca1344c32aef7ba9f8fa1785063132e
Author: Richard Biener <rguenther@suse.de>
Date:   Wed Jul 20 12:28:26 2022 +0200

    Teach VN about masked/len stores

    The following teaches VN to handle reads from .MASK_STORE and
    .LEN_STORE.  For this push_partial_def is extended first for
    convenience so we don't have to handle the full def case in the
    caller (possibly other paths can be simplified then).  Also
    the partial definition stored value can have an offset applied
    so we don't have to build a fake RHS when we register the pieces
    of an existing store.

            PR tree-optimization/106365
            * tree-ssa-sccvn.cc (pd_data::rhs_off): New field determining
            the offset to start encoding of RHS from.
            (vn_walk_cb_data::vn_walk_cb_data): Initialize it.
            (vn_walk_cb_data::push_partial_def): Allow the first partial
            definition to be fully providing the def.  Offset RHS
            before encoding if requested.
            (vn_reference_lookup_3): Initialize def_rhs everywhere.
            Add support for .MASK_STORE and .LEN_STORE (partial) definitions.

            * gcc.target/i386/vec-maskstore-vn.c: New testcase.

* [Bug tree-optimization/106365] Miss to handle ifn .LEN_STORE in FRE
From: rguenth at gcc dot gnu.org @ 2022-07-21 11:24 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106365

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #16 from Richard Biener <rguenth at gcc dot gnu.org> ---
Should be fixed.
