public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/114345] New: FRE missing knowledge of semantics of IFN loads
@ 2024-03-15  4:06 tnfchris at gcc dot gnu.org
  2024-03-15  4:11 ` [Bug tree-optimization/114345] " pinskia at gcc dot gnu.org
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: tnfchris at gcc dot gnu.org @ 2024-03-15  4:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114345

            Bug ID: 114345
           Summary: FRE missing knowledge of semantics of IFN loads
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tnfchris at gcc dot gnu.org
  Target Milestone: ---

The following testcase:

---
long tdiff = 10412095;

int main() {
  struct {
    long maximum;
    int nonprimary_delay;
  } delays[] = {{}, {}, {}, {9223372036854775807, 36 * 60 * 60}};

  for (unsigned i = 0; i < sizeof(delays) / sizeof(delays[0]); ++i)
    if (tdiff <= delays[i].maximum)
      return delays[i].nonprimary_delay;

  __builtin_abort();
}
---

compiled with -O2 -fno-vect-cost-model

generates on AArch64:

  vect_cst__45 = {tdiff.0_2, tdiff.0_2};
  vect_array.11 = .LOAD_LANES (MEM <long int[4]> [(long int *)&delays]);
  vect__1.12_40 = vect_array.11[0];
  vect_array.11 ={v} {CLOBBER};
  vect_array.14 = .LOAD_LANES (MEM <long int[4]> [(long int *)&delays + 32B]);
  vect__1.15_43 = vect_array.14[0];
  vect_array.14 ={v} {CLOBBER};
  mask_patt_15.17_46 = vect__1.12_40 >= vect_cst__45;
  mask_patt_15.17_47 = vect__1.15_43 >= vect_cst__45;
  vexit_reduc_51 = mask_patt_15.17_46 | mask_patt_15.17_47;

and on x86_64:

  vect_cst__53 = {tdiff.0_2, tdiff.0_2};
  _37 = { 0, 4294967295, 4294967294, 4294967293 };
  _32 = { 4, 5, 6, 7 };
  vect__1.11_42 = MEM <vector(2) long int> [(long int *)&delays];
  vectp_delays.9_43 = &delays + 16;
  vect__1.12_44 = MEM <vector(2) long int> [(long int *)vectp_delays.9_43];
  vect_perm_even_45 = VEC_PERM_EXPR <vect__1.11_42, vect__1.12_44, { 0, 2 }>;
  vectp_delays.9_47 = &delays + 32;
  vect__1.13_48 = MEM <vector(2) long int> [(long int *)vectp_delays.9_47];
  vectp_delays.9_49 = &delays + 48;
  vect__1.14_50 = MEM <vector(2) long int> [(long int *)vectp_delays.9_49];
  vect_perm_even_51 = VEC_PERM_EXPR <vect__1.13_48, vect__1.14_50, { 0, 2 }>;
  mask_patt_17.15_54 = vect_perm_even_45 >= vect_cst__53;
  mask_patt_17.15_55 = vect_perm_even_51 >= vect_cst__53;
  vexit_reduc_59 = mask_patt_17.15_54 | mask_patt_17.15_55;

which is eventually simplified by FRE into:

  vect_cst__53 = {tdiff.0_2, tdiff.0_2};
  mask_patt_17.15_54 = vect_cst__53 <= { 0, 0 };
  mask_patt_17.15_55 = vect_cst__53 <= { 0, 9223372036854775807 };
  vexit_reduc_59 = mask_patt_17.15_54 | mask_patt_17.15_55;

and FRE thereby realizes that the loads aren't needed.

It looks like the reason is that FRE doesn't understand LOAD_LANES,
MASK_LOAD_LANES, or the other load IFNs.

On AArch64 we thus end up with a spill to the stack and a load of the constants.
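
[Editorial sketch] The folding that FRE would need can be illustrated with a
scalar model of a 2-lane .LOAD_LANES. This is only a sketch under stated
assumptions: the helper names load_lanes2_long and fold_maximum_lanes are
hypothetical and not part of GCC, and the model assumes the struct occupies
exactly two long-sized slots (as on LP64 targets). It shows that lane 0 of
the two loads from the AArch64 dump is already known at compile time.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Hypothetical scalar model of a 2-lane .LOAD_LANES on `long` elements:
   read 4 consecutive longs and de-interleave them, so that out[0] holds
   the even elements and out[1] the odd ones.  */
static void load_lanes2_long(const void *mem, long out[2][2])
{
  long flat[4];
  memcpy(flat, mem, sizeof flat);
  for (int i = 0; i < 2; ++i) {
    out[0][i] = flat[2 * i];      /* lane 0: elements 0 and 2 */
    out[1][i] = flat[2 * i + 1];  /* lane 1: elements 1 and 3 */
  }
}

/* Same layout as the testcase.  */
struct delay {
  long maximum;
  int nonprimary_delay;
};

/* Fills lo/hi with lane 0 of the two loads from the dump
   (at &delays and at &delays + two elements, i.e. + 32B on LP64).  */
static void fold_maximum_lanes(long lo[2], long hi[2])
{
  struct delay delays[4];
  long lanes[2][2];

  /* Assumption of this model: one element spans two `long` slots.  */
  assert(sizeof(struct delay) == 2 * sizeof(long));

  memset(delays, 0, sizeof delays);  /* also zeroes the padding bytes */
  delays[3].maximum = LONG_MAX;
  delays[3].nonprimary_delay = 36 * 60 * 60;

  load_lanes2_long(delays, lanes);
  lo[0] = lanes[0][0];
  lo[1] = lanes[0][1];

  load_lanes2_long((const char *)delays + 2 * sizeof(struct delay), lanes);
  hi[0] = lanes[0][0];
  hi[1] = lanes[0][1];
}
```

Calling fold_maximum_lanes yields { 0, 0 } and { 0, LONG_MAX }, the same
constants FRE derives on x86_64 from the plain vector loads.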

* [Bug tree-optimization/114345] FRE missing knowledge of semantics of IFN loads
From: pinskia at gcc dot gnu.org @ 2024-03-15  4:11 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114345

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2024-03-15
                 CC|                            |pinskia at gcc dot gnu.org
             Status|UNCONFIRMED                 |NEW
           Severity|normal                      |enhancement

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Confirmed.  I thought there was already a bug recording this but I can't find
it.

* [Bug tree-optimization/114345] FRE missing knowledge of semantics of IFN loads
From: pinskia at gcc dot gnu.org @ 2024-03-15  4:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114345

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106365

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Oh, VN does have some knowledge of MASK_STORE and LEN_STORE, just not
LOAD_LANES.

See PR 106365 for the MASK_STORE and LEN_STORE implementation. It shouldn't be
hard to add LOAD_LANES/STORE_LANES there ...

* [Bug tree-optimization/114345] FRE missing knowledge of semantics of IFN loads
From: tnfchris at gcc dot gnu.org @ 2024-03-15  7:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114345

--- Comment #3 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #2)
> Oh VN does have some knowledge of MASK_STORE and LEN_STORE. Just not
> LOAD_LANES.
> 
> See PR 106365 for MASK_STORE and LEN_STORE implementation. Shouldn't be hard
> to add LOAD_LANES/STORE_LANES there ...

Ah, thanks for the pointer!

* [Bug tree-optimization/114345] FRE missing knowledge of semantics of IFN loads
From: rguenth at gcc dot gnu.org @ 2024-03-15  8:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114345

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
Well, the shuffling in .LOAD_LANES will be a bit awkward to do, but sure.  We
basically lack "constant folding" of .LOAD_LANES and similarly of course
we can't see through .STORE_LANES of a constant when later folding a scalar
load from the same memory.

* [Bug tree-optimization/114345] FRE missing knowledge of semantics of IFN loads
From: tnfchris at gcc dot gnu.org @ 2024-03-15  8:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114345

--- Comment #5 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #4)
> Well, the shuffling in .LOAD_LANES will be a bit awkward to do, but sure.  We
> basically lack "constant folding" of .LOAD_LANES and similarly of course
> we can't see through .STORE_LANES of a constant when later folding a scalar
> load from the same memory.

I guess it becomes harder with the 3- and 4-lane ones, but the 2-lane one is
just a single VEC_PERM_EXPR, no?
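
[Editorial sketch] The 2-lane case can be modelled in scalar C as two
contiguous vector loads followed by one permute per result vector, which is
the shape of the x86_64 dump above (selector { 0, 2 }). The helper names
vec_perm2 and load_lane_via_perm are hypothetical; this is an illustration of
the VEC_PERM_EXPR selector semantics, not GCC code.

```c
#include <assert.h>

/* Scalar model of VEC_PERM_EXPR <a, b, sel> for 2-element `long` vectors:
   each selector entry indexes the concatenation {a[0], a[1], b[0], b[1]}.  */
static void vec_perm2(const long a[2], const long b[2],
                      const unsigned sel[2], long out[2])
{
  for (int i = 0; i < 2; ++i) {
    unsigned idx = sel[i] % 4;  /* indices wrap modulo 2 * nelts */
    out[i] = idx < 2 ? a[idx] : b[idx - 2];
  }
}

/* Hypothetical helper: lane `k` (0 or 1) of a 2-lane load of four longs,
   expressed as two contiguous vector loads plus one permute.  */
static void load_lane_via_perm(const long mem[4], unsigned k, long out[2])
{
  const long v0[2] = { mem[0], mem[1] };  /* first contiguous vector */
  const long v1[2] = { mem[2], mem[3] };  /* second contiguous vector */
  const unsigned sel[2] = { k, k + 2 };   /* {0, 2} = even, {1, 3} = odd */

  assert(k < 2);
  vec_perm2(v0, v1, sel, out);
}
```

In this model each result vector of the 2-lane load costs one permute; the
3- and 4-lane variants gather from more than two input vectors, so a single
two-input permute no longer suffices.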

* [Bug tree-optimization/114345] FRE missing knowledge of semantics of IFN loads
From: rguenther at suse dot de @ 2024-03-15  8:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114345

--- Comment #6 from rguenther at suse dot de <rguenther at suse dot de> ---
On Fri, 15 Mar 2024, tnfchris at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114345
> 
> --- Comment #5 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
> (In reply to Richard Biener from comment #4)
> > Well, the shuffling in .LOAD_LANES will be a bit awkward to do, but sure.  We
> > basically lack "constant folding" of .LOAD_LANES and similarly of course
> > we can't see through .STORE_LANES of a constant when later folding a scalar
> > load from the same memory.
> 
> I guess it becomes harder with the 3 and 4 lane ones, but the 2 lanes one is
> just a single VEC_PERM_EXPR no?

It's all about constant folding and thus "shuffling" properly.  But consider
that the vector type might be punned: with a later "long" load from memory
written by a .STORE_LANES with "int" lanes, it gets interesting because we
now have to follow non-consecutive bits... (read: that's not implemented).
That said, a careful set of testcases should accompany support for
.LOAD_LANES and .STORE_LANES handling in VN.

I suppose it should be possible to leverage the GIMPLE FE for this.
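
[Editorial sketch] The punning problem can be illustrated with a scalar
model. The helper names store_lanes2_i32, load_i64 and expected_i64 are
hypothetical, and the interleaving order is an assumption of this model
rather than a statement about GCC internals. The point is that a 64-bit load
from memory written by a 2-lane store with 32-bit lanes takes its bits from
element i of both source vectors, not from consecutive bits of either one.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of a 2-lane .STORE_LANES with 32-bit lanes:
   the two source vectors are interleaved element by element.  */
static void store_lanes2_i32(const int32_t a[2], const int32_t b[2], void *mem)
{
  const int32_t flat[4] = { a[0], b[0], a[1], b[1] };
  memcpy(mem, flat, sizeof flat);
}

/* A punned 64-bit load at byte offset `off` from the stored memory.  */
static int64_t load_i64(const void *mem, size_t off)
{
  int64_t v;
  memcpy(&v, (const char *)mem + off, sizeof v);
  return v;
}

/* What a value-numbering lookup would have to reconstruct: the 64-bit
   value is pieced together from element `i` of both source vectors.  */
static int64_t expected_i64(const int32_t a[2], const int32_t b[2], int i)
{
  const int32_t pair[2] = { a[i], b[i] };
  int64_t v;

  assert(i == 0 || i == 1);
  memcpy(&v, pair, sizeof v);
  return v;
}
```

The comparison goes through memcpy on both sides, so the sketch does not
depend on the target's endianness.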
