public inbox for gcc-patches@gcc.gnu.org
* [PATCH] tree-optimization/107852 - missed optimization with PHIs
From: Richard Biener @ 2022-11-29 13:30 UTC (permalink / raw)
  To: gcc-patches

The following deals with the situation where we have

<bb 2> [local count: 1073741824]:
_5 = bytes.D.25336._M_impl.D.24643._M_start;
_6 = bytes.D.25336._M_impl.D.24643._M_finish;
pretmp_66 = bytes.D.25336._M_impl.D.24643._M_end_of_storage;
if (_5 != _6)
  goto <bb 3>; [70.00%]
else
  goto <bb 4>; [30.00%]

...

<bb 6> [local count: 329045359]:
_89 = operator new (4);
_43 = bytes.D.25336._M_impl.D.24643._M_start;
_Num_44 = _137 - _43;
if (_Num_44 != 0)

but fail to see that _137 is equal to _5, so that _Num_44 would
eventually be zero if operator new could not possibly clobber the
global bytes variable.

The following resolves this in value-numbering by using the
predicated values for _5 == _6 recorded for the dominating
condition.
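
As a rough illustration of that idea, here is a toy model in C. It is
not the code in tree-ssa-sccvn.cc: the names pred_eq, equal_in_block
and phi_common_value are hypothetical, SSA values and basic blocks are
plain non-negative integers, the table is keyed by exact block (a
simplification), and only the first of the two cases handled by the
patch is modelled.

```c
#include <stddef.h>

/* Hypothetical record: "lhs == rhs is known to hold in block BLOCK",
   standing in for a predicated value recorded for a dominating
   condition.  */
struct pred_eq
{
  int lhs, rhs, block;
};

/* Return 1 if X == Y is recorded as true for BLOCK, in either
   operand order, and 0 otherwise.  */
static int
equal_in_block (const struct pred_eq *tab, size_t n, int x, int y, int block)
{
  for (size_t i = 0; i < n; i++)
    if (tab[i].block == block
        && ((tab[i].lhs == x && tab[i].rhs == y)
            || (tab[i].lhs == y && tab[i].rhs == x)))
      return 1;
  return 0;
}

/* ARGS[i] is the PHI argument arriving from predecessor block SRC[i].
   Return the single value all arguments agree on, or -1 if there is
   none (value ids are assumed to be non-negative).  */
int
phi_common_value (const int *args, const int *src, size_t nargs,
                  const struct pred_eq *tab, size_t ntab)
{
  if (nargs == 0)
    return -1;
  int sameval = args[0];
  for (size_t i = 1; i < nargs; i++)
    {
      if (args[i] == sameval)
        continue;
      /* Differently named values may still be known equal on this edge.  */
      if (equal_in_block (tab, ntab, args[i], sameval, src[i]))
        continue;
      return -1;
    }
  return sameval;
}
```

With arguments {5, 6} arriving from blocks {3, 2} and a record saying
5 == 6 holds in block 2, the model reports 5 as the common value;
without that record it reports -1.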

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

	PR tree-optimization/107852
	* tree-ssa-sccvn.cc (visit_phi): Use equivalences recorded
	as predicated values to elide more redundant PHIs.

	* gcc.dg/tree-ssa/ssa-fre-101.c: New testcase.
---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-101.c | 47 +++++++++++++++++++
 gcc/tree-ssa-sccvn.cc                       | 51 ++++++++++++++++++++-
 2 files changed, 97 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-101.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-101.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-101.c
new file mode 100644
index 00000000000..c67f211dcf6
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-101.c
@@ -0,0 +1,47 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-fre1-details" } */
+
+int test1 (int i, int j)
+{
+  int k;
+  if (i != j)
+    k = i;
+  else
+    k = j;
+  return k;
+}
+
+int test2 (int i, int j)
+{
+  int k;
+  if (i != j)
+    k = j;
+  else
+    k = i;
+  return k;
+}
+
+int test3 (int i, int j)
+{
+  int k;
+  if (i == j)
+    k = j;
+  else
+    k = i;
+  return k;
+}
+
+int test4 (int i, int j)
+{
+  int k;
+  if (i == j)
+    k = i;
+  else
+    k = j;
+  return k;
+}
+
+/* We'd expect 4 hits but since we only keep one forwarder the
+   VN predication machinery cannot record something for the entry
+   block since it doesn't work on edges but on their source.  */
+/* { dg-final { scan-tree-dump-times "equal on edge" 2 "fre1" } } */
diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc
index 1f9c6c53b52..6895ae84d13 100644
--- a/gcc/tree-ssa-sccvn.cc
+++ b/gcc/tree-ssa-sccvn.cc
@@ -5814,7 +5814,8 @@ visit_phi (gimple *phi, bool *inserted, bool backedges_varying_p)
 
   /* See if all non-TOP arguments have the same value.  TOP is
      equivalent to everything, so we can ignore it.  */
-  FOR_EACH_EDGE (e, ei, gimple_bb (phi)->preds)
+  basic_block bb = gimple_bb (phi);
+  FOR_EACH_EDGE (e, ei, bb->preds)
     if (e->flags & EDGE_EXECUTABLE)
       {
 	tree def = PHI_ARG_DEF_FROM_EDGE (phi, e);
@@ -5859,6 +5860,54 @@ visit_phi (gimple *phi, bool *inserted, bool backedges_varying_p)
 			 && known_eq (soff, doff))
 		  continue;
 	      }
+	    /* There's also the possibility to use equivalences.  */
+	    if (!FLOAT_TYPE_P (TREE_TYPE (def)))
+	      {
+		vn_nary_op_t vnresult;
+		tree ops[2];
+		ops[0] = def;
+		ops[1] = sameval;
+		tree val = vn_nary_op_lookup_pieces (2, EQ_EXPR,
+						     boolean_type_node,
+						     ops, &vnresult);
+		if (! val && vnresult && vnresult->predicated_values)
+		  {
+		    val = vn_nary_op_get_predicated_value (vnresult, e->src);
+		    if (val && integer_truep (val))
+		      {
+			if (dump_file && (dump_flags & TDF_DETAILS))
+			  {
+			    fprintf (dump_file, "Predication says ");
+			    print_generic_expr (dump_file, def, TDF_NONE);
+			    fprintf (dump_file, " and ");
+			    print_generic_expr (dump_file, sameval, TDF_NONE);
+			    fprintf (dump_file, " are equal on edge %d -> %d\n",
+				     e->src->index, e->dest->index);
+			  }
+			continue;
+		      }
+		    /* If on all previous edges the value was equal to def
+		       we can change sameval to def.  */
+		    if (EDGE_COUNT (bb->preds) == 2
+			&& (val = vn_nary_op_get_predicated_value
+				    (vnresult, EDGE_PRED (bb, 0)->src))
+			&& integer_truep (val))
+		      {
+			if (dump_file && (dump_flags & TDF_DETAILS))
+			  {
+			    fprintf (dump_file, "Predication says ");
+			    print_generic_expr (dump_file, def, TDF_NONE);
+			    fprintf (dump_file, " and ");
+			    print_generic_expr (dump_file, sameval, TDF_NONE);
+			    fprintf (dump_file, " are equal on edge %d -> %d\n",
+				     EDGE_PRED (bb, 0)->src->index,
+				     EDGE_PRED (bb, 0)->dest->index);
+			  }
+			sameval = def;
+			continue;
+		      }
+		  }
+	      }
 	    sameval = NULL_TREE;
 	    break;
 	  }
-- 
2.35.3
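
A small sketch of why the merges in the new testcase are redundant,
using condensed versions of test1 to test4 (same semantics, written
with the conditional operator). The helper phis_are_redundant is a
hypothetical name added for illustration only. On the edge where
i == j holds, both names denote the same value, so each function
always returns the value assigned on the "not equal" edge.

```c
/* Condensed forms of the four testcase functions.  */
static int test1 (int i, int j) { return i != j ? i : j; }
static int test2 (int i, int j) { return i != j ? j : i; }
static int test3 (int i, int j) { return i == j ? j : i; }
static int test4 (int i, int j) { return i == j ? i : j; }

/* Return 1 if every function yields the value from its "not equal"
   arm: i for test1 and test3, j for test2 and test4.  */
int
phis_are_redundant (int i, int j)
{
  return test1 (i, j) == i && test2 (i, j) == j
         && test3 (i, j) == i && test4 (i, j) == j;
}
```

The check holds for equal and unequal inputs alike, which is what lets
value-numbering replace the merged k with a single argument.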


* Re: [PATCH] tree-optimization/107852 - missed optimization with PHIs
From: Jan-Benedict Glaw @ 2022-12-05 21:39 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

On Tue, 2022-11-29 14:30:22 +0100, Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
> 
> 	PR tree-optimization/107852
> 	* tree-ssa-sccvn.cc (visit_phi): Use equivalences recorded
> 	as predicated values to elide more redundant PHIs.
> 
> 	* gcc.dg/tree-ssa/ssa-fre-101.c: New testcase.

This seems to trigger an issue when building the Linux powerpc kernel
for the skiroot_defconfig:

[mk all 2022-12-05 19:50:10]   powerpc64-linux-gcc -Wp,-MMD,drivers/dma-buf/.dma-fence-array.o.d -nostdinc -I./arch/powerpc/include -I./arch/powerpc/include/generated  -I./include -I./arch/powerpc/include/uapi -I./arch/powerpc/include/generated/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/compiler-version.h -include ./include/linux/kconfig.h -include ./include/linux/compiler_types.h -D__KERNEL__ -I ./arch/powerpc -DHAVE_AS_ATHIGH=1 -fmacro-prefix-map=./= -Wall -Wundef -Werror=strict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -fshort-wchar -fno-PIE -Werror=implicit-function-declaration -Werror=implicit-int -Werror=return-type -Wno-format-security -std=gnu11 -mlittle-endian -m64 -msoft-float -pipe -mtraceback=no -mabi=elfv2 -mcmodel=medium -mno-pointers-to-nested-functions -mcpu=power8 -mtune=power10 -mno-prefixed -mno-pcrel -mno-altivec -mno-vsx -mno-mma -fno-asynchronous-unwind-tables -mno-string -Wa,-maltivec -Wa,-mpower4 -Wa,-many -mno-strict-align -mlittle-endian -mstack-protector-guard=tls -mstack-protector-guard-reg=r13 -fno-delete-null-pointer-checks -Wno-frame-address -Wno-format-truncation -Wno-format-overflow -Wno-address-of-packed-member -Os -fno-allow-store-data-races -Wframe-larger-than=2048 -fstack-protector-strong -Wno-main -Wno-unused-but-set-variable -Wno-unused-const-variable -Wno-dangling-pointer -fomit-frame-pointer -ftrivial-auto-var-init=zero -fno-stack-clash-protection -Wdeclaration-after-statement -Wvla -Wno-pointer-sign -Wcast-function-type -Wno-stringop-truncation -Wno-stringop-overflow -Wno-restrict -Wno-maybe-uninitialized -Wno-alloc-size-larger-than -Wimplicit-fallthrough=5 -fno-strict-overflow -fno-stack-check -fconserve-stack -Werror=date-time -Werror=incompatible-pointer-types -Werror=designated-init -Wno-packed-not-aligned -mstack-protector-guard-offset=2800    -DKBUILD_MODFILE='"drivers/dma-buf/dma-fence-array"' -DKBUILD_BASENAME='"dma_fence_array"' -DKBUILD_MODNAME='"dma_fence_array"' -D__KBUILD_MODNAME=kmod_dma_fence_array -c -o drivers/dma-buf/dma-fence-array.o drivers/dma-buf/dma-fence-array.c
[mk all 2022-12-05 19:50:10] drivers/dma-buf/dma-fence-array.c: In function 'dma_fence_array_create':
[mk all 2022-12-05 19:50:10] drivers/dma-buf/dma-fence-array.c:154:25: error: control flow in the middle of basic block 12
[mk all 2022-12-05 19:50:10]   154 | struct dma_fence_array *dma_fence_array_create(int num_fences,
[mk all 2022-12-05 19:50:10]       |                         ^~~~~~~~~~~~~~~~~~~~~~
[mk all 2022-12-05 19:50:10] during GIMPLE pass: ivopts
[mk all 2022-12-05 19:50:10] drivers/dma-buf/dma-fence-array.c:154:25: internal compiler error: verify_flow_info failed
[mk all 2022-12-05 19:50:10] 0x19ea876 internal_error(char const*, ...)
[mk all 2022-12-05 19:50:10]    ???:0
[mk all 2022-12-05 19:50:10] 0x94b00e verify_flow_info()
[mk all 2022-12-05 19:50:10]    ???:0
[mk all 2022-12-05 19:50:10] Please submit a full bug report, with preprocessed source (by using -freport-bug).
[mk all 2022-12-05 19:50:10] Please include the complete backtrace with any bug report.
[mk all 2022-12-05 19:50:10] See <https://gcc.gnu.org/bugs/> for instructions.

Maybe you've got an idea, otherwise I'll try to reproduce it manually.
(That's all automated building.)

Thanks,
  Jan-Benedict


* Re: [PATCH] tree-optimization/107852 - missed optimization with PHIs
From: Richard Biener @ 2022-12-06  7:11 UTC (permalink / raw)
  To: Jan-Benedict Glaw; +Cc: gcc-patches

On Mon, 5 Dec 2022, Jan-Benedict Glaw wrote:

> On Tue, 2022-11-29 14:30:22 +0100, Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> > Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
> > 
> > 	PR tree-optimization/107852
> > 	* tree-ssa-sccvn.cc (visit_phi): Use equivalences recorded
> > 	as predicated values to elide more redundant PHIs.
> > 
> > 	* gcc.dg/tree-ssa/ssa-fre-101.c: New testcase.
> 
> This seems to trigger an issue when building the Linux powerpc kernel
> for the skiroot_defconfig:
> 
> [mk all 2022-12-05 19:50:10]   powerpc64-linux-gcc -Wp,-MMD,drivers/dma-buf/.dma-fence-array.o.d -nostdinc -I./arch/powerpc/include -I./arch/powerpc/include/generated  -I./include -I./arch/powerpc/include/uapi -I./arch/powerpc/include/generated/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/compiler-version.h -include ./include/linux/kconfig.h -include ./include/linux/compiler_types.h -D__KERNEL__ -I ./arch/powerpc -DHAVE_AS_ATHIGH=1 -fmacro-prefix-map=./= -Wall -Wundef -Werror=strict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -fshort-wchar -fno-PIE -Werror=implicit-function-declaration -Werror=implicit-int -Werror=return-type -Wno-format-security -std=gnu11 -mlittle-endian -m64 -msoft-float -pipe -mtraceback=no -mabi=elfv2 -mcmodel=medium -mno-pointers-to-nested-functions -mcpu=power8 -mtune=power10 -mno-prefixed -mno-pcrel -mno-altivec -mno-vsx -mno-mma -fno-asynchronous-unwind-tables -mno-string -Wa,-maltivec -Wa,-mpower4 -Wa,-many -mno-strict-align -mlittle-endian -mstack-protector-guard=tls -mstack-protector-guard-reg=r13 -fno-delete-null-pointer-checks -Wno-frame-address -Wno-format-truncation -Wno-format-overflow -Wno-address-of-packed-member -Os -fno-allow-store-data-races -Wframe-larger-than=2048 -fstack-protector-strong -Wno-main -Wno-unused-but-set-variable -Wno-unused-const-variable -Wno-dangling-pointer -fomit-frame-pointer -ftrivial-auto-var-init=zero -fno-stack-clash-protection -Wdeclaration-after-statement -Wvla -Wno-pointer-sign -Wcast-function-type -Wno-stringop-truncation -Wno-stringop-overflow -Wno-restrict -Wno-maybe-uninitialized -Wno-alloc-size-larger-than -Wimplicit-fallthrough=5 -fno-strict-overflow -fno-stack-check -fconserve-stack -Werror=date-time -Werror=incompatible-pointer-types -Werror=designated-init -Wno-packed-not-aligned -mstack-protector-guard-offset=2800    -DKBUILD_MODFILE='"drivers/dma-buf/dma-fence-array"' -DKBUILD_BASENAME='"dma_fence_array"' -DKBUILD_MODNAME='"dma_fence_array"' -D__KBUILD_MODNAME=kmod_dma_fence_array -c -o drivers/dma-buf/dma-fence-array.o drivers/dma-buf/dma-fence-array.c
> [mk all 2022-12-05 19:50:10] drivers/dma-buf/dma-fence-array.c: In function 'dma_fence_array_create':
> [mk all 2022-12-05 19:50:10] drivers/dma-buf/dma-fence-array.c:154:25: error: control flow in the middle of basic block 12
> [mk all 2022-12-05 19:50:10]   154 | struct dma_fence_array *dma_fence_array_create(int num_fences,
> [mk all 2022-12-05 19:50:10]       |                         ^~~~~~~~~~~~~~~~~~~~~~
> [mk all 2022-12-05 19:50:10] during GIMPLE pass: ivopts
> [mk all 2022-12-05 19:50:10] drivers/dma-buf/dma-fence-array.c:154:25: internal compiler error: verify_flow_info failed
> [mk all 2022-12-05 19:50:10] 0x19ea876 internal_error(char const*, ...)
> [mk all 2022-12-05 19:50:10]    ???:0
> [mk all 2022-12-05 19:50:10] 0x94b00e verify_flow_info()
> [mk all 2022-12-05 19:50:10]    ???:0
> [mk all 2022-12-05 19:50:10] Please submit a full bug report, with preprocessed source (by using -freport-bug).
> [mk all 2022-12-05 19:50:10] Please include the complete backtrace with any bug report.
> [mk all 2022-12-05 19:50:10] See <https://gcc.gnu.org/bugs/> for instructions.
> 
> Maybe you've got an idea, otherwise I'll try to reproduce it manually.
> (That's all automated building.)

Note that the ICE above happens quite a few passes later, during
IVOPTs, so the change has most likely triggered a latent issue.  As a
wild guess, I think some asm goto is being mishandled.
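
For readers unfamiliar with the construct named in that guess: asm goto
is a GNU C extension whose inline assembly may jump to C labels listed
in its last operand section, so the statement itself can transfer
control. That is presumably how a mishandled one could end up reported
as "control flow in the middle of basic block". A minimal sketch
follows; the function and label names are made up for illustration and
are unrelated to the kernel source in the report.

```c
/* Returns 0 when the asm falls through, 1 if it jumped to `taken'.  */
int
asm_goto_demo (void)
{
  /* Empty template: no instruction is emitted, so at run time control
     always falls through.  The compiler still has to treat `taken' as
     a possible successor of this statement.  */
  __asm__ goto ("" : /* no outputs */ : /* no inputs */
                : /* no clobbers */ : taken);
  return 0;
taken:
  return 1;
}
```

This needs a compiler that implements the extension, such as GCC or a
recent Clang.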

Can you please open a bug report and provide the preprocessed source
so that one can reproduce this with a cc1 cross?

Thanks,
Richard.
