public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] ifcvt: Fix regression in aarch64/fcsel_1.c
@ 2023-02-03  9:15 Richard Sandiford
  2023-02-13  6:46 ` Jeff Law
  2023-02-13 10:41 ` [Ping] " Richard Sandiford
  0 siblings, 2 replies; 4+ messages in thread
From: Richard Sandiford @ 2023-02-03  9:15 UTC (permalink / raw)
  To: gcc-patches

aarch64/fcsel_1.c contains:

double
f_2 (double a, double b, double c, double d)
{
  if (a > b)
    return c;
  else
    return d;
}

which started failing in the GCC 12 timeframe.  When it passed,
the RTL had the form:

[A]
  (set (reg ret) (reg c))
  (set (pc) (if_then_else (gt ...) (label_ref ret) (pc)))
    edge to ret, fallthru to else
else:
  (set (reg ret) (reg d))
    fallthru to ret
ret:
  ...exit...

i.e. a branch around.  Now the RTL has form:

[B]
  (set (reg ret) (reg d))
  (set (pc) (if_then_else (gt ...) (label_ref then) (pc)))
    edge to then, fallthru to ret
ret:
  ...exit...

then:
  (set (reg ret) (reg c))
    edge to ret

i.e. a branch out.

Both are valid, of course, and there's no easy way to predict
which we'll get.  But ifcvt canonicalises its representation on:

  if (cond) goto fallthru else goto non-fallthru

That is, it canoncalises on the branch-around case for half-diamonds.
It therefore wants to invert the comparison in [B] to get:

  if (...) goto ret else goto then

But that isn't possible for strict FP gt, so the optimisation fails.

Canonicalising on the branch-around case seems like the wrong choice for
half diamonds.  The natural way of expressing a conditional branch is
for the label_ref to be the "then" destination and pc to be the "else"
destination.  And the natural choice of condition seems to be the one
under which extra stuff *is* done, rather than the one under which extra
stuff *isn't* done.  But that decision goes back at least 20 years and
it doesn't seem like a good idea to change it in stage 4.

This patch instead allows the internal structure to store the
condition in inverted form.  For simplicity it handles only
conditional moves, which is the one case that is needed
to fix the known regression.  (There are probably unknown
regressions too, but still.)

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Richard


gcc/
	* ifcvt.h (noce_if_info::cond_inverted): New field.
	* ifcvt.cc (cond_move_convert_if_block): Swap the then and else
	values when cond_inverted is true.
	(noce_find_if_block): Allow the condition to be inverted when
	handling conditional moves.
---
 gcc/ifcvt.cc | 31 +++++++++++++++++++------------
 gcc/ifcvt.h  |  8 ++++++++
 2 files changed, 27 insertions(+), 12 deletions(-)

diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc
index 008796838f7..63ef42b3c34 100644
--- a/gcc/ifcvt.cc
+++ b/gcc/ifcvt.cc
@@ -4253,6 +4253,9 @@ cond_move_convert_if_block (struct noce_if_info *if_infop,
 	    e = dest;
 	}
 
+      if (if_infop->cond_inverted)
+	std::swap (t, e);
+
       target = noce_emit_cmove (if_infop, dest, code, cond_arg0, cond_arg1,
 				t, e);
       if (!target)
@@ -4405,7 +4408,6 @@ noce_find_if_block (basic_block test_bb, edge then_edge, edge else_edge,
   basic_block then_bb, else_bb, join_bb;
   bool then_else_reversed = false;
   rtx_insn *jump;
-  rtx cond;
   rtx_insn *cond_earliest;
   struct noce_if_info if_info;
   bool speed_p = optimize_bb_for_speed_p (test_bb);
@@ -4481,25 +4483,28 @@ noce_find_if_block (basic_block test_bb, edge then_edge, edge else_edge,
   if (! onlyjump_p (jump))
     return FALSE;
 
-  /* If this is not a standard conditional jump, we can't parse it.  */
-  cond = noce_get_condition (jump, &cond_earliest, then_else_reversed);
-  if (!cond)
-    return FALSE;
-
-  /* We must be comparing objects whose modes imply the size.  */
-  if (GET_MODE (XEXP (cond, 0)) == BLKmode)
-    return FALSE;
-
   /* Initialize an IF_INFO struct to pass around.  */
   memset (&if_info, 0, sizeof if_info);
   if_info.test_bb = test_bb;
   if_info.then_bb = then_bb;
   if_info.else_bb = else_bb;
   if_info.join_bb = join_bb;
-  if_info.cond = cond;
+  if_info.cond = noce_get_condition (jump, &cond_earliest,
+				     then_else_reversed);;
   rtx_insn *rev_cond_earliest;
   if_info.rev_cond = noce_get_condition (jump, &rev_cond_earliest,
 					 !then_else_reversed);
+  if (!if_info.cond && !if_info.rev_cond)
+    return FALSE;
+  if (!if_info.cond)
+    {
+      std::swap (if_info.cond, if_info.rev_cond);
+      std::swap (cond_earliest, rev_cond_earliest);
+      if_info.cond_inverted = true;
+    }
+  /* We must be comparing objects whose modes imply the size.  */
+  if (GET_MODE (XEXP (if_info.cond, 0)) == BLKmode)
+    return FALSE;
   gcc_assert (if_info.rev_cond == NULL_RTX
 	      || rev_cond_earliest == cond_earliest);
   if_info.cond_earliest = cond_earliest;
@@ -4518,7 +4523,9 @@ noce_find_if_block (basic_block test_bb, edge then_edge, edge else_edge,
 
   /* Do the real work.  */
 
-  if (noce_process_if_block (&if_info))
+  /* ??? noce_process_if_block has not yet been updated to handle
+     inverted conditions.  */
+  if (!if_info.cond_inverted && noce_process_if_block (&if_info))
     return TRUE;
 
   if (HAVE_conditional_move
diff --git a/gcc/ifcvt.h b/gcc/ifcvt.h
index c6ea244aae1..be1385aabe4 100644
--- a/gcc/ifcvt.h
+++ b/gcc/ifcvt.h
@@ -86,6 +86,14 @@ struct noce_if_info
      form as well.  */
   bool then_else_reversed;
 
+  /* True if THEN_BB is conditional on !COND rather than COND.
+     This is used if:
+
+     - JUMP branches to THEN_BB on COND
+     - JUMP falls through to JOIN_BB on !COND
+     - COND cannot be reversed.  */
+  bool cond_inverted;
+
   /* True if the contents of then_bb and else_bb are a
      simple single set instruction.  */
   bool then_simple;
-- 
2.25.1


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] ifcvt: Fix regression in aarch64/fcsel_1.c
  2023-02-03  9:15 [PATCH] ifcvt: Fix regression in aarch64/fcsel_1.c Richard Sandiford
@ 2023-02-13  6:46 ` Jeff Law
  2023-02-13 10:41 ` [Ping] " Richard Sandiford
  1 sibling, 0 replies; 4+ messages in thread
From: Jeff Law @ 2023-02-13  6:46 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford



On 2/3/23 02:15, Richard Sandiford via Gcc-patches wrote:
> aarch64/fcsel_1.c contains:
> 
> double
> f_2 (double a, double b, double c, double d)
> {
>    if (a > b)
>      return c;
>    else
>      return d;
> }
> 
> which started failing in the GCC 12 timeframe.  When it passed,
> the RTL had the form:
> 
> [A]
>    (set (reg ret) (reg c))
>    (set (pc) (if_then_else (gt ...) (label_ref ret) (pc)))
>      edge to ret, fallthru to else
> else:
>    (set (reg ret) (reg d))
>      fallthru to ret
> ret:
>    ...exit...
> 
> i.e. a branch around.  Now the RTL has form:
> 
> [B]
>    (set (reg ret) (reg d))
>    (set (pc) (if_then_else (gt ...) (label_ref then) (pc)))
>      edge to then, fallthru to ret
> ret:
>    ...exit...
> 
> then:
>    (set (reg ret) (reg c))
>      edge to ret
> 
> i.e. a branch out.
> 
> Both are valid, of course, and there's no easy way to predict
> which we'll get.  But ifcvt canonicalises its representation on:
> 
>    if (cond) goto fallthru else goto non-fallthru
> 
> That is, it canoncalises on the branch-around case for half-diamonds.
> It therefore wants to invert the comparison in [B] to get:
> 
>    if (...) goto ret else goto then
> 
> But that isn't possible for strict FP gt, so the optimisation fails.
> 
> Canonicalising on the branch-around case seems like the wrong choice for
> half diamonds.  The natural way of expressing a conditional branch is
> for the label_ref to be the "then" destination and pc to be the "else"
> destination.  And the natural choice of condition seems to be the one
> under which extra stuff *is* done, rather than the one under which extra
> stuff *isn't* done.  But that decision goes back at least 20 years and
> it doesn't seem like a good idea to change it in stage 4.
As I was parsing things up to the last sentence my first thought was no 
isn't the right time to fix this :-)


> 
> This patch instead allows the internal structure to store the
> condition in inverted form.  For simplicity it handles only
> conditional moves, which is the one case that is needed
> to fix the known regression.  (There are probably unknown
> regressions too, but still.)
> 
> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
> 
> Richard
> 
> 
> gcc/
> 	* ifcvt.h (noce_if_info::cond_inverted): New field.
> 	* ifcvt.cc (cond_move_convert_if_block): Swap the then and else
> 	values when cond_inverted is true.
> 	(noce_find_if_block): Allow the condition to be inverted when
> 	handling conditional moves.
> ---
>   gcc/ifcvt.cc | 31 +++++++++++++++++++------------
>   gcc/ifcvt.h  |  8 ++++++++
>   2 files changed, 27 insertions(+), 12 deletions(-)
> 
> diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc
> index 008796838f7..63ef42b3c34 100644
> --- a/gcc/ifcvt.cc
> +++ b/gcc/ifcvt.cc
> @@ -4253,6 +4253,9 @@ cond_move_convert_if_block (struct noce_if_info *if_infop,
>   	    e = dest;
>   	}
>   
> +      if (if_infop->cond_inverted)
> +	std::swap (t, e);
> +
>         target = noce_emit_cmove (if_infop, dest, code, cond_arg0, cond_arg1,
>   				t, e);
>         if (!target)
> @@ -4405,7 +4408,6 @@ noce_find_if_block (basic_block test_bb, edge then_edge, edge else_edge,
>     basic_block then_bb, else_bb, join_bb;
>     bool then_else_reversed = false;
>     rtx_insn *jump;
> -  rtx cond;
>     rtx_insn *cond_earliest;
>     struct noce_if_info if_info;
>     bool speed_p = optimize_bb_for_speed_p (test_bb);
> @@ -4481,25 +4483,28 @@ noce_find_if_block (basic_block test_bb, edge then_edge, edge else_edge,
>     if (! onlyjump_p (jump))
>       return FALSE;
>   
> -  /* If this is not a standard conditional jump, we can't parse it.  */
> -  cond = noce_get_condition (jump, &cond_earliest, then_else_reversed);
> -  if (!cond)
> -    return FALSE;
> -
> -  /* We must be comparing objects whose modes imply the size.  */
> -  if (GET_MODE (XEXP (cond, 0)) == BLKmode)
> -    return FALSE;
> -
>     /* Initialize an IF_INFO struct to pass around.  */
>     memset (&if_info, 0, sizeof if_info);
>     if_info.test_bb = test_bb;
>     if_info.then_bb = then_bb;
>     if_info.else_bb = else_bb;
>     if_info.join_bb = join_bb;
> -  if_info.cond = cond;
> +  if_info.cond = noce_get_condition (jump, &cond_earliest,
> +				     then_else_reversed);;
Extraneous ';'.

OK with that nit fixed.

jeff

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Ping] ifcvt: Fix regression in aarch64/fcsel_1.c
  2023-02-03  9:15 [PATCH] ifcvt: Fix regression in aarch64/fcsel_1.c Richard Sandiford
  2023-02-13  6:46 ` Jeff Law
@ 2023-02-13 10:41 ` Richard Sandiford
  2023-02-13 10:46   ` Richard Sandiford
  1 sibling, 1 reply; 4+ messages in thread
From: Richard Sandiford @ 2023-02-13 10:41 UTC (permalink / raw)
  To: gcc-patches

Ping for the patch below

----

aarch64/fcsel_1.c contains:

double
f_2 (double a, double b, double c, double d)
{
  if (a > b)
    return c;
  else
    return d;
}

which started failing in the GCC 12 timeframe.  When it passed,
the RTL had the form:

[A]
  (set (reg ret) (reg c))
  (set (pc) (if_then_else (gt ...) (label_ref ret) (pc)))
    edge to ret, fallthru to else
else:
  (set (reg ret) (reg d))
    fallthru to ret
ret:
  ...exit...

i.e. a branch around.  Now the RTL has form:

[B]
  (set (reg ret) (reg d))
  (set (pc) (if_then_else (gt ...) (label_ref then) (pc)))
    edge to then, fallthru to ret
ret:
  ...exit...

then:
  (set (reg ret) (reg c))
    edge to ret

i.e. a branch out.

Both are valid, of course, and there's no easy way to predict
which we'll get.  But ifcvt canonicalises its representation on:

  if (cond) goto fallthru else goto non-fallthru

That is, it canoncalises on the branch-around case for half-diamonds.
It therefore wants to invert the comparison in [B] to get:

  if (...) goto ret else goto then

But that isn't possible for strict FP gt, so the optimisation fails.

Canonicalising on the branch-around case seems like the wrong choice for
half diamonds.  The natural way of expressing a conditional branch is
for the label_ref to be the "then" destination and pc to be the "else"
destination.  And the natural choice of condition seems to be the one
under which extra stuff *is* done, rather than the one under which extra
stuff *isn't* done.  But that decision goes back at least 20 years and
it doesn't seem like a good idea to change it in stage 4.

This patch instead allows the internal structure to store the
condition in inverted form.  For simplicity it handles only
conditional moves, which is the one case that is needed
to fix the known regression.  (There are probably unknown
regressions too, but still.)

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Richard


gcc/
	* ifcvt.h (noce_if_info::cond_inverted): New field.
	* ifcvt.cc (cond_move_convert_if_block): Swap the then and else
	values when cond_inverted is true.
	(noce_find_if_block): Allow the condition to be inverted when
	handling conditional moves.
---
 gcc/ifcvt.cc | 31 +++++++++++++++++++------------
 gcc/ifcvt.h  |  8 ++++++++
 2 files changed, 27 insertions(+), 12 deletions(-)

diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc
index 008796838f7..63ef42b3c34 100644
--- a/gcc/ifcvt.cc
+++ b/gcc/ifcvt.cc
@@ -4253,6 +4253,9 @@ cond_move_convert_if_block (struct noce_if_info *if_infop,
 	    e = dest;
 	}
 
+      if (if_infop->cond_inverted)
+	std::swap (t, e);
+
       target = noce_emit_cmove (if_infop, dest, code, cond_arg0, cond_arg1,
 				t, e);
       if (!target)
@@ -4405,7 +4408,6 @@ noce_find_if_block (basic_block test_bb, edge then_edge, edge else_edge,
   basic_block then_bb, else_bb, join_bb;
   bool then_else_reversed = false;
   rtx_insn *jump;
-  rtx cond;
   rtx_insn *cond_earliest;
   struct noce_if_info if_info;
   bool speed_p = optimize_bb_for_speed_p (test_bb);
@@ -4481,25 +4483,28 @@ noce_find_if_block (basic_block test_bb, edge then_edge, edge else_edge,
   if (! onlyjump_p (jump))
     return FALSE;
 
-  /* If this is not a standard conditional jump, we can't parse it.  */
-  cond = noce_get_condition (jump, &cond_earliest, then_else_reversed);
-  if (!cond)
-    return FALSE;
-
-  /* We must be comparing objects whose modes imply the size.  */
-  if (GET_MODE (XEXP (cond, 0)) == BLKmode)
-    return FALSE;
-
   /* Initialize an IF_INFO struct to pass around.  */
   memset (&if_info, 0, sizeof if_info);
   if_info.test_bb = test_bb;
   if_info.then_bb = then_bb;
   if_info.else_bb = else_bb;
   if_info.join_bb = join_bb;
-  if_info.cond = cond;
+  if_info.cond = noce_get_condition (jump, &cond_earliest,
+				     then_else_reversed);;
   rtx_insn *rev_cond_earliest;
   if_info.rev_cond = noce_get_condition (jump, &rev_cond_earliest,
 					 !then_else_reversed);
+  if (!if_info.cond && !if_info.rev_cond)
+    return FALSE;
+  if (!if_info.cond)
+    {
+      std::swap (if_info.cond, if_info.rev_cond);
+      std::swap (cond_earliest, rev_cond_earliest);
+      if_info.cond_inverted = true;
+    }
+  /* We must be comparing objects whose modes imply the size.  */
+  if (GET_MODE (XEXP (if_info.cond, 0)) == BLKmode)
+    return FALSE;
   gcc_assert (if_info.rev_cond == NULL_RTX
 	      || rev_cond_earliest == cond_earliest);
   if_info.cond_earliest = cond_earliest;
@@ -4518,7 +4523,9 @@ noce_find_if_block (basic_block test_bb, edge then_edge, edge else_edge,
 
   /* Do the real work.  */
 
-  if (noce_process_if_block (&if_info))
+  /* ??? noce_process_if_block has not yet been updated to handle
+     inverted conditions.  */
+  if (!if_info.cond_inverted && noce_process_if_block (&if_info))
     return TRUE;
 
   if (HAVE_conditional_move
diff --git a/gcc/ifcvt.h b/gcc/ifcvt.h
index c6ea244aae1..be1385aabe4 100644
--- a/gcc/ifcvt.h
+++ b/gcc/ifcvt.h
@@ -86,6 +86,14 @@ struct noce_if_info
      form as well.  */
   bool then_else_reversed;
 
+  /* True if THEN_BB is conditional on !COND rather than COND.
+     This is used if:
+
+     - JUMP branches to THEN_BB on COND
+     - JUMP falls through to JOIN_BB on !COND
+     - COND cannot be reversed.  */
+  bool cond_inverted;
+
   /* True if the contents of then_bb and else_bb are a
      simple single set instruction.  */
   bool then_simple;

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Ping] ifcvt: Fix regression in aarch64/fcsel_1.c
  2023-02-13 10:41 ` [Ping] " Richard Sandiford
@ 2023-02-13 10:46   ` Richard Sandiford
  0 siblings, 0 replies; 4+ messages in thread
From: Richard Sandiford @ 2023-02-13 10:46 UTC (permalink / raw)
  To: gcc-patches

Richard Sandiford via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> Ping for the patch below

Ugh, somehow missed Jeff's OK over the weekend.  Sorry for the noise!

Richard

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2023-02-13 10:46 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-03  9:15 [PATCH] ifcvt: Fix regression in aarch64/fcsel_1.c Richard Sandiford
2023-02-13  6:46 ` Jeff Law
2023-02-13 10:41 ` [Ping] " Richard Sandiford
2023-02-13 10:46   ` Richard Sandiford

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).