public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [patch]update the documentation for TARGET_ZERO_CALL_USED_REGS hook and add an assertion
@ 2022-03-21 21:27 Qing Zhao
  2022-03-31 13:10 ` Richard Sandiford
  2022-03-31 20:03 ` Uros Bizjak
  0 siblings, 2 replies; 5+ messages in thread
From: Qing Zhao @ 2022-03-21 21:27 UTC (permalink / raw)
  To: richard Sandiford; +Cc: gcc-patches Paul A Clarke via, Xi Ruoyao

Hi, 

Per our discussion on: https://gcc.gnu.org/pipermail/gcc-patches/2022-March/592002.html

I come up with the following patch to:

1. Update the documentation for TARGET_ZERO_CALL_USED_REGS hook;
2. Add an assertion in function.cc to make sure the actually zeroed_regs is a subset of all call used regs;
   (The reason I didn’t add a new parameter to TARGET_ZERO_CALL_USED_REGS is, I think adding the 
    assertion in the common place function.cc is simpler to be implemented).
3. This new assertion identified a bug in i386 implementation. Fix this bug in i386.

This patch is bootstrapped on both x86 and aarch64, no regression.

Okay for commit?

thanks.

Qing

===================================
From 2e5bc1b25a707c6a17afbf03da2a8bec5b03454d Mon Sep 17 00:00:00 2001
From: Qing Zhao <qing.zhao@oracle.com>
Date: Fri, 18 Mar 2022 20:49:56 +0000
Subject: [PATCH] Add an assertion: the zeroed_hardregs set is a subset of all
 call used regs.

We should make sure that the hard register set that is actually cleared by
the target hook zero_call_used_regs should be a subset of all call used
registers.

At the same time, update documentation for the target hook
TARGET_ZERO_CALL_USED_REGS.

This new assertion identified a bug in the i386 implemenation, which
incorrectly set the zeroed_hardregs for stack registers. Fixed this bug
in i386 implementation.

gcc/ChangeLog:

2022-03-21  Qing Zhao  <qing.zhao@oracle.com>

	* config/i386/i386.cc (zero_all_st_registers): Return the value of
	num_of_st.
	(ix86_zero_call_used_regs): Update zeroed_hardregs set according to
	the return value of zero_all_st_registers.
	* doc/tm.texi: Update the documentation of TARGET_ZERO_CALL_USED_REGS.
	* function.cc (gen_call_used_regs_seq): Add an assertion.
	* target.def: Update the documentation of TARGET_ZERO_CALL_USED_REGS.
---
 gcc/config/i386/i386.cc | 27 ++++++++++++++++++---------
 gcc/doc/tm.texi         |  7 +++++++
 gcc/function.cc         | 22 ++++++++++++++++++----
 gcc/target.def          |  7 +++++++
 4 files changed, 50 insertions(+), 13 deletions(-)

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 5a561966eb44..d84047a4bc1b 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -3753,16 +3753,17 @@ zero_all_vector_registers (HARD_REG_SET need_zeroed_hardregs)
    needs to be cleared, the whole stack should be cleared.  However,
    x87 stack registers that hold the return value should be excluded.
    x87 returns in the top (two for complex values) register, so
-   num_of_st should be 7/6 when x87 returns, otherwise it will be 8.  */
+   num_of_st should be 7/6 when x87 returns, otherwise it will be 8.
+   return the value of num_of_st.  */
 
 
-static bool
+static int
 zero_all_st_registers (HARD_REG_SET need_zeroed_hardregs)
 {
 
   /* If the FPU is disabled, no need to zero all st registers.  */
   if (! (TARGET_80387 || TARGET_FLOAT_RETURNS_IN_80387))
-    return false;
+    return 0;
 
   unsigned int num_of_st = 0;
   for (unsigned int regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
@@ -3774,7 +3775,7 @@ zero_all_st_registers (HARD_REG_SET need_zeroed_hardregs)
       }
 
   if (num_of_st == 0)
-    return false;
+    return 0;
 
   bool return_with_x87 = false;
   return_with_x87 = (crtl->return_rtx
@@ -3802,7 +3803,7 @@ zero_all_st_registers (HARD_REG_SET need_zeroed_hardregs)
       insn = emit_insn (gen_rtx_SET (st_reg, st_reg));
       add_reg_note (insn, REG_DEAD, st_reg);
     }
-  return true;
+  return num_of_st;
 }
 
 
@@ -3851,7 +3852,7 @@ ix86_zero_call_used_regs (HARD_REG_SET need_zeroed_hardregs)
 {
   HARD_REG_SET zeroed_hardregs;
   bool all_sse_zeroed = false;
-  bool all_st_zeroed = false;
+  int all_st_zeroed_num = 0;
   bool all_mm_zeroed = false;
 
   CLEAR_HARD_REG_SET (zeroed_hardregs);
@@ -3881,9 +3882,17 @@ ix86_zero_call_used_regs (HARD_REG_SET need_zeroed_hardregs)
   if (!exit_with_mmx_mode)
     /* x87 exit mode, we should zero all st registers together.  */
     {
-      all_st_zeroed = zero_all_st_registers (need_zeroed_hardregs);
-      if (all_st_zeroed)
-	SET_HARD_REG_BIT (zeroed_hardregs, FIRST_STACK_REG);
+      all_st_zeroed_num = zero_all_st_registers (need_zeroed_hardregs);
+
+      if (all_st_zeroed_num > 0)
+	for (unsigned int regno = FIRST_STACK_REG; regno <= LAST_STACK_REG; regno++)
+	  /* x87 stack registers that hold the return value should be excluded.
+	     x87 returns in the top (two for complex values) register.  */
+	  if (all_st_zeroed_num == 8
+	      || !((all_st_zeroed_num >= 6 && regno == REGNO (crtl->return_rtx))
+		   || (all_st_zeroed_num == 6
+		       && (regno == (REGNO (crtl->return_rtx) + 1)))))
+	    SET_HARD_REG_BIT (zeroed_hardregs, regno);
     }
   else
     /* MMX exit mode, check whether we can zero all mm registers.  */
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 2f92d37da8c0..c5006afc00d2 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -12330,6 +12330,13 @@ This target hook emits instructions to zero the subset of @var{selected_regs}
 that could conceivably contain values that are useful to an attacker.
 Return the set of registers that were actually cleared.
 
+For most targets, the returned set of registers is a subset of
+@var{selected_regs}, however, for some of the targets (for example MIPS),
+clearing some registers that are in the @var{selected_regs} requires
+clearing other call used registers that are not in the @var{selected_regs},
+under such situation, the returned set of registers must be a subset of all
+call used registers.
+
 The default implementation uses normal move instructions to zero
 all the registers in @var{selected_regs}.  Define this hook if the
 target has more efficient ways of zeroing certain registers,
diff --git a/gcc/function.cc b/gcc/function.cc
index d5ed51a6a663..ad0096a43eff 100644
--- a/gcc/function.cc
+++ b/gcc/function.cc
@@ -5892,7 +5892,9 @@ gen_call_used_regs_seq (rtx_insn *ret, unsigned int zero_regs_type)
   df_simulate_one_insn_backwards (bb, ret, live_out);
 
   HARD_REG_SET selected_hardregs;
+  HARD_REG_SET all_call_used_regs;
   CLEAR_HARD_REG_SET (selected_hardregs);
+  CLEAR_HARD_REG_SET (all_call_used_regs);
   for (unsigned int regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
     {
       if (!crtl->abi->clobbers_full_reg_p (regno))
@@ -5901,6 +5903,13 @@ gen_call_used_regs_seq (rtx_insn *ret, unsigned int zero_regs_type)
 	continue;
       if (REGNO_REG_SET_P (live_out, regno))
 	continue;
+#ifdef LEAF_REG_REMAP
+      if (crtl->uses_only_leaf_regs && LEAF_REG_REMAP (regno) < 0)
+	continue;
+#endif
+      /* This is a call used register that is dead at return.  */
+      SET_HARD_REG_BIT (all_call_used_regs, regno);
+
       if (only_gpr
 	  && !TEST_HARD_REG_BIT (reg_class_contents[GENERAL_REGS], regno))
 	continue;
@@ -5908,10 +5917,6 @@ gen_call_used_regs_seq (rtx_insn *ret, unsigned int zero_regs_type)
 	continue;
       if (only_arg && !FUNCTION_ARG_REGNO_P (regno))
 	continue;
-#ifdef LEAF_REG_REMAP
-      if (crtl->uses_only_leaf_regs && LEAF_REG_REMAP (regno) < 0)
-	continue;
-#endif
 
       /* Now this is a register that we might want to zero.  */
       SET_HARD_REG_BIT (selected_hardregs, regno);
@@ -5925,6 +5930,15 @@ gen_call_used_regs_seq (rtx_insn *ret, unsigned int zero_regs_type)
   HARD_REG_SET zeroed_hardregs;
   start_sequence ();
   zeroed_hardregs = targetm.calls.zero_call_used_regs (selected_hardregs);
+
+  /* For most targets, the returned set of registers is a subset of
+     selected_hardregs, however, for some of the targets (for example MIPS),
+     clearing some registers that are in selected_hardregs requires clearing
+     other call used registers that are not in the selected_hardregs, under
+     such situation, the returned set of registers must be a subset of
+     all call used registers.  */
+  gcc_assert (hard_reg_set_subset_p (zeroed_hardregs, all_call_used_regs));
+
   rtx_insn *seq = get_insns ();
   end_sequence ();
   if (seq)
diff --git a/gcc/target.def b/gcc/target.def
index 72c2e1ef756c..d85adf36a391 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -5122,6 +5122,13 @@ DEFHOOK
 that could conceivably contain values that are useful to an attacker.\n\
 Return the set of registers that were actually cleared.\n\
 \n\
+For most targets, the returned set of registers is a subset of\n\
+@var{selected_regs}, however, for some of the targets (for example MIPS),\n\
+clearing some registers that are in the @var{selected_regs} requires\n\
+clearing other call used registers that are not in the @var{selected_regs},\n\
+under such situation, the returned set of registers must be a subset of all\n\
+call used registers.\n\
+\n\
 The default implementation uses normal move instructions to zero\n\
 all the registers in @var{selected_regs}.  Define this hook if the\n\
 target has more efficient ways of zeroing certain registers,\n\
-- 
2.27.0


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [patch]update the documentation for TARGET_ZERO_CALL_USED_REGS hook and add an assertion
  2022-03-21 21:27 [patch]update the documentation for TARGET_ZERO_CALL_USED_REGS hook and add an assertion Qing Zhao
@ 2022-03-31 13:10 ` Richard Sandiford
  2022-03-31 13:53   ` Qing Zhao
  2022-04-01 18:45   ` Qing Zhao
  2022-03-31 20:03 ` Uros Bizjak
  1 sibling, 2 replies; 5+ messages in thread
From: Richard Sandiford @ 2022-03-31 13:10 UTC (permalink / raw)
  To: Qing Zhao; +Cc: gcc-patches Paul A Clarke via, Xi Ruoyao

Qing Zhao <qing.zhao@oracle.com> writes:
> Hi, 
>
> Per our discussion on: https://gcc.gnu.org/pipermail/gcc-patches/2022-March/592002.html
>
> I come up with the following patch to:
>
> 1. Update the documentation for TARGET_ZERO_CALL_USED_REGS hook;
> 2. Add an assertion in function.cc to make sure the actually zeroed_regs is a subset of all call used regs;
>    (The reason I didn’t add a new parameter to TARGET_ZERO_CALL_USED_REGS is, I think adding the 
>     assertion in the common place function.cc is simpler to be implemented).

Yeah, that's fair.  I guess in theory, passing the parameter would allow
targets to choose between two versions of zeroing the register, one with
a temporary and one without.  But that's a purely hypothetical situation
and we could always add a parameter later if that turns out to be useful.

Perhaps more realistically, there might be other uses of the hook in
future that want to zero registers for different reasons, with their
own rules about which registers can be zeroed.  In other words, the
hook is providing a general facility that happens to be useful for
-fzero-call-used-regs.  But again, we can deal with that if it ever
happens.

So I agree this is the right call, especially for stage 4.

> 3. This new assertion identified a bug in i386 implementation. Fix this bug in i386.
>
> This patch is bootstrapped on both x86 and aarch64, no regression.
>
> Okay for commit?

OK for the non-x86 bits.

Thanks,
Richard

> thanks.
>
> Qing
>
> ===================================
> From 2e5bc1b25a707c6a17afbf03da2a8bec5b03454d Mon Sep 17 00:00:00 2001
> From: Qing Zhao <qing.zhao@oracle.com>
> Date: Fri, 18 Mar 2022 20:49:56 +0000
> Subject: [PATCH] Add an assertion: the zeroed_hardregs set is a subset of all
>  call used regs.
>
> We should make sure that the hard register set that is actually cleared by
> the target hook zero_call_used_regs should be a subset of all call used
> registers.
>
> At the same time, update documentation for the target hook
> TARGET_ZERO_CALL_USED_REGS.
>
> This new assertion identified a bug in the i386 implemenation, which
> incorrectly set the zeroed_hardregs for stack registers. Fixed this bug
> in i386 implementation.
>
> gcc/ChangeLog:
>
> 2022-03-21  Qing Zhao  <qing.zhao@oracle.com>
>
> 	* config/i386/i386.cc (zero_all_st_registers): Return the value of
> 	num_of_st.
> 	(ix86_zero_call_used_regs): Update zeroed_hardregs set according to
> 	the return value of zero_all_st_registers.
> 	* doc/tm.texi: Update the documentation of TARGET_ZERO_CALL_USED_REGS.
> 	* function.cc (gen_call_used_regs_seq): Add an assertion.
> 	* target.def: Update the documentation of TARGET_ZERO_CALL_USED_REGS.
> ---
>  gcc/config/i386/i386.cc | 27 ++++++++++++++++++---------
>  gcc/doc/tm.texi         |  7 +++++++
>  gcc/function.cc         | 22 ++++++++++++++++++----
>  gcc/target.def          |  7 +++++++
>  4 files changed, 50 insertions(+), 13 deletions(-)
>
> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> index 5a561966eb44..d84047a4bc1b 100644
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -3753,16 +3753,17 @@ zero_all_vector_registers (HARD_REG_SET need_zeroed_hardregs)
>     needs to be cleared, the whole stack should be cleared.  However,
>     x87 stack registers that hold the return value should be excluded.
>     x87 returns in the top (two for complex values) register, so
> -   num_of_st should be 7/6 when x87 returns, otherwise it will be 8.  */
> +   num_of_st should be 7/6 when x87 returns, otherwise it will be 8.
> +   return the value of num_of_st.  */
>  
>  
> -static bool
> +static int
>  zero_all_st_registers (HARD_REG_SET need_zeroed_hardregs)
>  {
>  
>    /* If the FPU is disabled, no need to zero all st registers.  */
>    if (! (TARGET_80387 || TARGET_FLOAT_RETURNS_IN_80387))
> -    return false;
> +    return 0;
>  
>    unsigned int num_of_st = 0;
>    for (unsigned int regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
> @@ -3774,7 +3775,7 @@ zero_all_st_registers (HARD_REG_SET need_zeroed_hardregs)
>        }
>  
>    if (num_of_st == 0)
> -    return false;
> +    return 0;
>  
>    bool return_with_x87 = false;
>    return_with_x87 = (crtl->return_rtx
> @@ -3802,7 +3803,7 @@ zero_all_st_registers (HARD_REG_SET need_zeroed_hardregs)
>        insn = emit_insn (gen_rtx_SET (st_reg, st_reg));
>        add_reg_note (insn, REG_DEAD, st_reg);
>      }
> -  return true;
> +  return num_of_st;
>  }
>  
>  
> @@ -3851,7 +3852,7 @@ ix86_zero_call_used_regs (HARD_REG_SET need_zeroed_hardregs)
>  {
>    HARD_REG_SET zeroed_hardregs;
>    bool all_sse_zeroed = false;
> -  bool all_st_zeroed = false;
> +  int all_st_zeroed_num = 0;
>    bool all_mm_zeroed = false;
>  
>    CLEAR_HARD_REG_SET (zeroed_hardregs);
> @@ -3881,9 +3882,17 @@ ix86_zero_call_used_regs (HARD_REG_SET need_zeroed_hardregs)
>    if (!exit_with_mmx_mode)
>      /* x87 exit mode, we should zero all st registers together.  */
>      {
> -      all_st_zeroed = zero_all_st_registers (need_zeroed_hardregs);
> -      if (all_st_zeroed)
> -	SET_HARD_REG_BIT (zeroed_hardregs, FIRST_STACK_REG);
> +      all_st_zeroed_num = zero_all_st_registers (need_zeroed_hardregs);
> +
> +      if (all_st_zeroed_num > 0)
> +	for (unsigned int regno = FIRST_STACK_REG; regno <= LAST_STACK_REG; regno++)
> +	  /* x87 stack registers that hold the return value should be excluded.
> +	     x87 returns in the top (two for complex values) register.  */
> +	  if (all_st_zeroed_num == 8
> +	      || !((all_st_zeroed_num >= 6 && regno == REGNO (crtl->return_rtx))
> +		   || (all_st_zeroed_num == 6
> +		       && (regno == (REGNO (crtl->return_rtx) + 1)))))
> +	    SET_HARD_REG_BIT (zeroed_hardregs, regno);
>      }
>    else
>      /* MMX exit mode, check whether we can zero all mm registers.  */
> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> index 2f92d37da8c0..c5006afc00d2 100644
> --- a/gcc/doc/tm.texi
> +++ b/gcc/doc/tm.texi
> @@ -12330,6 +12330,13 @@ This target hook emits instructions to zero the subset of @var{selected_regs}
>  that could conceivably contain values that are useful to an attacker.
>  Return the set of registers that were actually cleared.
>  
> +For most targets, the returned set of registers is a subset of
> +@var{selected_regs}, however, for some of the targets (for example MIPS),
> +clearing some registers that are in the @var{selected_regs} requires
> +clearing other call used registers that are not in the @var{selected_regs},
> +under such situation, the returned set of registers must be a subset of all
> +call used registers.
> +
>  The default implementation uses normal move instructions to zero
>  all the registers in @var{selected_regs}.  Define this hook if the
>  target has more efficient ways of zeroing certain registers,
> diff --git a/gcc/function.cc b/gcc/function.cc
> index d5ed51a6a663..ad0096a43eff 100644
> --- a/gcc/function.cc
> +++ b/gcc/function.cc
> @@ -5892,7 +5892,9 @@ gen_call_used_regs_seq (rtx_insn *ret, unsigned int zero_regs_type)
>    df_simulate_one_insn_backwards (bb, ret, live_out);
>  
>    HARD_REG_SET selected_hardregs;
> +  HARD_REG_SET all_call_used_regs;
>    CLEAR_HARD_REG_SET (selected_hardregs);
> +  CLEAR_HARD_REG_SET (all_call_used_regs);
>    for (unsigned int regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
>      {
>        if (!crtl->abi->clobbers_full_reg_p (regno))
> @@ -5901,6 +5903,13 @@ gen_call_used_regs_seq (rtx_insn *ret, unsigned int zero_regs_type)
>  	continue;
>        if (REGNO_REG_SET_P (live_out, regno))
>  	continue;
> +#ifdef LEAF_REG_REMAP
> +      if (crtl->uses_only_leaf_regs && LEAF_REG_REMAP (regno) < 0)
> +	continue;
> +#endif
> +      /* This is a call used register that is dead at return.  */
> +      SET_HARD_REG_BIT (all_call_used_regs, regno);
> +
>        if (only_gpr
>  	  && !TEST_HARD_REG_BIT (reg_class_contents[GENERAL_REGS], regno))
>  	continue;
> @@ -5908,10 +5917,6 @@ gen_call_used_regs_seq (rtx_insn *ret, unsigned int zero_regs_type)
>  	continue;
>        if (only_arg && !FUNCTION_ARG_REGNO_P (regno))
>  	continue;
> -#ifdef LEAF_REG_REMAP
> -      if (crtl->uses_only_leaf_regs && LEAF_REG_REMAP (regno) < 0)
> -	continue;
> -#endif
>  
>        /* Now this is a register that we might want to zero.  */
>        SET_HARD_REG_BIT (selected_hardregs, regno);
> @@ -5925,6 +5930,15 @@ gen_call_used_regs_seq (rtx_insn *ret, unsigned int zero_regs_type)
>    HARD_REG_SET zeroed_hardregs;
>    start_sequence ();
>    zeroed_hardregs = targetm.calls.zero_call_used_regs (selected_hardregs);
> +
> +  /* For most targets, the returned set of registers is a subset of
> +     selected_hardregs, however, for some of the targets (for example MIPS),
> +     clearing some registers that are in selected_hardregs requires clearing
> +     other call used registers that are not in the selected_hardregs, under
> +     such situation, the returned set of registers must be a subset of
> +     all call used registers.  */
> +  gcc_assert (hard_reg_set_subset_p (zeroed_hardregs, all_call_used_regs));
> +
>    rtx_insn *seq = get_insns ();
>    end_sequence ();
>    if (seq)
> diff --git a/gcc/target.def b/gcc/target.def
> index 72c2e1ef756c..d85adf36a391 100644
> --- a/gcc/target.def
> +++ b/gcc/target.def
> @@ -5122,6 +5122,13 @@ DEFHOOK
>  that could conceivably contain values that are useful to an attacker.\n\
>  Return the set of registers that were actually cleared.\n\
>  \n\
> +For most targets, the returned set of registers is a subset of\n\
> +@var{selected_regs}, however, for some of the targets (for example MIPS),\n\
> +clearing some registers that are in the @var{selected_regs} requires\n\
> +clearing other call used registers that are not in the @var{selected_regs},\n\
> +under such situation, the returned set of registers must be a subset of all\n\
> +call used registers.\n\
> +\n\
>  The default implementation uses normal move instructions to zero\n\
>  all the registers in @var{selected_regs}.  Define this hook if the\n\
>  target has more efficient ways of zeroing certain registers,\n\

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [patch]update the documentation for TARGET_ZERO_CALL_USED_REGS hook and add an assertion
  2022-03-31 13:10 ` Richard Sandiford
@ 2022-03-31 13:53   ` Qing Zhao
  2022-04-01 18:45   ` Qing Zhao
  1 sibling, 0 replies; 5+ messages in thread
From: Qing Zhao @ 2022-03-31 13:53 UTC (permalink / raw)
  To: Richard Sandiford, ubizjak; +Cc: gcc-patches Paul A Clarke via, Xi Ruoyao

Hi, Richard,

Thanks a lot for your reviewing on the non-x86 part. 

Uros, could you please review the x86 part of the change?

thanks.

Qing

> On Mar 31, 2022, at 8:10 AM, Richard Sandiford <richard.sandiford@arm.com> wrote:
> 
> Qing Zhao <qing.zhao@oracle.com> writes:
>> Hi, 
>> 
>> Per our discussion on: https://gcc.gnu.org/pipermail/gcc-patches/2022-March/592002.html
>> 
>> I come up with the following patch to:
>> 
>> 1. Update the documentation for TARGET_ZERO_CALL_USED_REGS hook;
>> 2. Add an assertion in function.cc to make sure the actually zeroed_regs is a subset of all call used regs;
>>   (The reason I didn’t add a new parameter to TARGET_ZERO_CALL_USED_REGS is, I think adding the 
>>    assertion in the common place function.cc is simpler to be implemented).
> 
> Yeah, that's fair.  I guess in theory, passing the parameter would allow
> targets to choose between two versions of zeroing the register, one with
> a temporary and one without.  But that's a purely hypothetical situation
> and we could always add a parameter later if that turns out to be useful.
> 
> Perhaps more realistically, there might be other uses of the hook in
> future that want to zero registers for different reasons, with their
> own rules about which registers can be zeroed.  In other words, the
> hook is providing a general facility that happens to be useful for
> -fzero-call-used-regs.  But again, we can deal with that if it ever
> happens.
> 
> So I agree this is the right call, especially for stage 4.
> 
>> 3. This new assertion identified a bug in i386 implementation. Fix this bug in i386.
>> 
>> This patch is bootstrapped on both x86 and aarch64, no regression.
>> 
>> Okay for commit?
> 
> OK for the non-x86 bits.
> 
> Thanks,
> Richard
> 
>> thanks.
>> 
>> Qing
>> 
>> ===================================
>> From 2e5bc1b25a707c6a17afbf03da2a8bec5b03454d Mon Sep 17 00:00:00 2001
>> From: Qing Zhao <qing.zhao@oracle.com>
>> Date: Fri, 18 Mar 2022 20:49:56 +0000
>> Subject: [PATCH] Add an assertion: the zeroed_hardregs set is a subset of all
>> call used regs.
>> 
>> We should make sure that the hard register set that is actually cleared by
>> the target hook zero_call_used_regs should be a subset of all call used
>> registers.
>> 
>> At the same time, update documentation for the target hook
>> TARGET_ZERO_CALL_USED_REGS.
>> 
>> This new assertion identified a bug in the i386 implemenation, which
>> incorrectly set the zeroed_hardregs for stack registers. Fixed this bug
>> in i386 implementation.
>> 
>> gcc/ChangeLog:
>> 
>> 2022-03-21  Qing Zhao  <qing.zhao@oracle.com>
>> 
>> 	* config/i386/i386.cc (zero_all_st_registers): Return the value of
>> 	num_of_st.
>> 	(ix86_zero_call_used_regs): Update zeroed_hardregs set according to
>> 	the return value of zero_all_st_registers.
>> 	* doc/tm.texi: Update the documentation of TARGET_ZERO_CALL_USED_REGS.
>> 	* function.cc (gen_call_used_regs_seq): Add an assertion.
>> 	* target.def: Update the documentation of TARGET_ZERO_CALL_USED_REGS.
>> ---
>> gcc/config/i386/i386.cc | 27 ++++++++++++++++++---------
>> gcc/doc/tm.texi         |  7 +++++++
>> gcc/function.cc         | 22 ++++++++++++++++++----
>> gcc/target.def          |  7 +++++++
>> 4 files changed, 50 insertions(+), 13 deletions(-)
>> 
>> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
>> index 5a561966eb44..d84047a4bc1b 100644
>> --- a/gcc/config/i386/i386.cc
>> +++ b/gcc/config/i386/i386.cc
>> @@ -3753,16 +3753,17 @@ zero_all_vector_registers (HARD_REG_SET need_zeroed_hardregs)
>>    needs to be cleared, the whole stack should be cleared.  However,
>>    x87 stack registers that hold the return value should be excluded.
>>    x87 returns in the top (two for complex values) register, so
>> -   num_of_st should be 7/6 when x87 returns, otherwise it will be 8.  */
>> +   num_of_st should be 7/6 when x87 returns, otherwise it will be 8.
>> +   return the value of num_of_st.  */
>> 
>> 
>> -static bool
>> +static int
>> zero_all_st_registers (HARD_REG_SET need_zeroed_hardregs)
>> {
>> 
>>   /* If the FPU is disabled, no need to zero all st registers.  */
>>   if (! (TARGET_80387 || TARGET_FLOAT_RETURNS_IN_80387))
>> -    return false;
>> +    return 0;
>> 
>>   unsigned int num_of_st = 0;
>>   for (unsigned int regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
>> @@ -3774,7 +3775,7 @@ zero_all_st_registers (HARD_REG_SET need_zeroed_hardregs)
>>       }
>> 
>>   if (num_of_st == 0)
>> -    return false;
>> +    return 0;
>> 
>>   bool return_with_x87 = false;
>>   return_with_x87 = (crtl->return_rtx
>> @@ -3802,7 +3803,7 @@ zero_all_st_registers (HARD_REG_SET need_zeroed_hardregs)
>>       insn = emit_insn (gen_rtx_SET (st_reg, st_reg));
>>       add_reg_note (insn, REG_DEAD, st_reg);
>>     }
>> -  return true;
>> +  return num_of_st;
>> }
>> 
>> 
>> @@ -3851,7 +3852,7 @@ ix86_zero_call_used_regs (HARD_REG_SET need_zeroed_hardregs)
>> {
>>   HARD_REG_SET zeroed_hardregs;
>>   bool all_sse_zeroed = false;
>> -  bool all_st_zeroed = false;
>> +  int all_st_zeroed_num = 0;
>>   bool all_mm_zeroed = false;
>> 
>>   CLEAR_HARD_REG_SET (zeroed_hardregs);
>> @@ -3881,9 +3882,17 @@ ix86_zero_call_used_regs (HARD_REG_SET need_zeroed_hardregs)
>>   if (!exit_with_mmx_mode)
>>     /* x87 exit mode, we should zero all st registers together.  */
>>     {
>> -      all_st_zeroed = zero_all_st_registers (need_zeroed_hardregs);
>> -      if (all_st_zeroed)
>> -	SET_HARD_REG_BIT (zeroed_hardregs, FIRST_STACK_REG);
>> +      all_st_zeroed_num = zero_all_st_registers (need_zeroed_hardregs);
>> +
>> +      if (all_st_zeroed_num > 0)
>> +	for (unsigned int regno = FIRST_STACK_REG; regno <= LAST_STACK_REG; regno++)
>> +	  /* x87 stack registers that hold the return value should be excluded.
>> +	     x87 returns in the top (two for complex values) register.  */
>> +	  if (all_st_zeroed_num == 8
>> +	      || !((all_st_zeroed_num >= 6 && regno == REGNO (crtl->return_rtx))
>> +		   || (all_st_zeroed_num == 6
>> +		       && (regno == (REGNO (crtl->return_rtx) + 1)))))
>> +	    SET_HARD_REG_BIT (zeroed_hardregs, regno);
>>     }
>>   else
>>     /* MMX exit mode, check whether we can zero all mm registers.  */
>> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
>> index 2f92d37da8c0..c5006afc00d2 100644
>> --- a/gcc/doc/tm.texi
>> +++ b/gcc/doc/tm.texi
>> @@ -12330,6 +12330,13 @@ This target hook emits instructions to zero the subset of @var{selected_regs}
>> that could conceivably contain values that are useful to an attacker.
>> Return the set of registers that were actually cleared.
>> 
>> +For most targets, the returned set of registers is a subset of
>> +@var{selected_regs}, however, for some of the targets (for example MIPS),
>> +clearing some registers that are in the @var{selected_regs} requires
>> +clearing other call used registers that are not in the @var{selected_regs},
>> +under such situation, the returned set of registers must be a subset of all
>> +call used registers.
>> +
>> The default implementation uses normal move instructions to zero
>> all the registers in @var{selected_regs}.  Define this hook if the
>> target has more efficient ways of zeroing certain registers,
>> diff --git a/gcc/function.cc b/gcc/function.cc
>> index d5ed51a6a663..ad0096a43eff 100644
>> --- a/gcc/function.cc
>> +++ b/gcc/function.cc
>> @@ -5892,7 +5892,9 @@ gen_call_used_regs_seq (rtx_insn *ret, unsigned int zero_regs_type)
>>   df_simulate_one_insn_backwards (bb, ret, live_out);
>> 
>>   HARD_REG_SET selected_hardregs;
>> +  HARD_REG_SET all_call_used_regs;
>>   CLEAR_HARD_REG_SET (selected_hardregs);
>> +  CLEAR_HARD_REG_SET (all_call_used_regs);
>>   for (unsigned int regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
>>     {
>>       if (!crtl->abi->clobbers_full_reg_p (regno))
>> @@ -5901,6 +5903,13 @@ gen_call_used_regs_seq (rtx_insn *ret, unsigned int zero_regs_type)
>> 	continue;
>>       if (REGNO_REG_SET_P (live_out, regno))
>> 	continue;
>> +#ifdef LEAF_REG_REMAP
>> +      if (crtl->uses_only_leaf_regs && LEAF_REG_REMAP (regno) < 0)
>> +	continue;
>> +#endif
>> +      /* This is a call used register that is dead at return.  */
>> +      SET_HARD_REG_BIT (all_call_used_regs, regno);
>> +
>>       if (only_gpr
>> 	  && !TEST_HARD_REG_BIT (reg_class_contents[GENERAL_REGS], regno))
>> 	continue;
>> @@ -5908,10 +5917,6 @@ gen_call_used_regs_seq (rtx_insn *ret, unsigned int zero_regs_type)
>> 	continue;
>>       if (only_arg && !FUNCTION_ARG_REGNO_P (regno))
>> 	continue;
>> -#ifdef LEAF_REG_REMAP
>> -      if (crtl->uses_only_leaf_regs && LEAF_REG_REMAP (regno) < 0)
>> -	continue;
>> -#endif
>> 
>>       /* Now this is a register that we might want to zero.  */
>>       SET_HARD_REG_BIT (selected_hardregs, regno);
>> @@ -5925,6 +5930,15 @@ gen_call_used_regs_seq (rtx_insn *ret, unsigned int zero_regs_type)
>>   HARD_REG_SET zeroed_hardregs;
>>   start_sequence ();
>>   zeroed_hardregs = targetm.calls.zero_call_used_regs (selected_hardregs);
>> +
>> +  /* For most targets, the returned set of registers is a subset of
>> +     selected_hardregs, however, for some of the targets (for example MIPS),
>> +     clearing some registers that are in selected_hardregs requires clearing
>> +     other call used registers that are not in the selected_hardregs, under
>> +     such situation, the returned set of registers must be a subset of
>> +     all call used registers.  */
>> +  gcc_assert (hard_reg_set_subset_p (zeroed_hardregs, all_call_used_regs));
>> +
>>   rtx_insn *seq = get_insns ();
>>   end_sequence ();
>>   if (seq)
>> diff --git a/gcc/target.def b/gcc/target.def
>> index 72c2e1ef756c..d85adf36a391 100644
>> --- a/gcc/target.def
>> +++ b/gcc/target.def
>> @@ -5122,6 +5122,13 @@ DEFHOOK
>> that could conceivably contain values that are useful to an attacker.\n\
>> Return the set of registers that were actually cleared.\n\
>> \n\
>> +For most targets, the returned set of registers is a subset of\n\
>> +@var{selected_regs}, however, for some of the targets (for example MIPS),\n\
>> +clearing some registers that are in the @var{selected_regs} requires\n\
>> +clearing other call used registers that are not in the @var{selected_regs},\n\
>> +under such situation, the returned set of registers must be a subset of all\n\
>> +call used registers.\n\
>> +\n\
>> The default implementation uses normal move instructions to zero\n\
>> all the registers in @var{selected_regs}.  Define this hook if the\n\
>> target has more efficient ways of zeroing certain registers,\n\


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [patch]update the documentation for TARGET_ZERO_CALL_USED_REGS hook and add an assertion
  2022-03-21 21:27 [patch]update the documentation for TARGET_ZERO_CALL_USED_REGS hook and add an assertion Qing Zhao
  2022-03-31 13:10 ` Richard Sandiford
@ 2022-03-31 20:03 ` Uros Bizjak
  1 sibling, 0 replies; 5+ messages in thread
From: Uros Bizjak @ 2022-03-31 20:03 UTC (permalink / raw)
  To: Qing Zhao; +Cc: richard Sandiford, gcc-patches Paul A Clarke via

On Mon, Mar 21, 2022 at 10:28 PM Qing Zhao via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Hi,
>
> Per our discussion on: https://gcc.gnu.org/pipermail/gcc-patches/2022-March/592002.html
>
> I come up with the following patch to:
>
> 1. Update the documentation for TARGET_ZERO_CALL_USED_REGS hook;
> 2. Add an assertion in function.cc to make sure the actually zeroed_regs is a subset of all call used regs;
>    (The reason I didn’t add a new parameter to TARGET_ZERO_CALL_USED_REGS is, I think adding the
>     assertion in the common place function.cc is simpler to be implemented).
> 3. This new assertion identified a bug in i386 implementation. Fix this bug in i386.
>
> This patch is bootstrapped on both x86 and aarch64, no regression.
>
> Okay for commit?
>
> thanks.
>
> Qing
>
> ===================================
> From 2e5bc1b25a707c6a17afbf03da2a8bec5b03454d Mon Sep 17 00:00:00 2001
> From: Qing Zhao <qing.zhao@oracle.com>
> Date: Fri, 18 Mar 2022 20:49:56 +0000
> Subject: [PATCH] Add an assertion: the zeroed_hardregs set is a subset of all
>  call used regs.
>
> We should make sure that the hard register set that is actually cleared by
> the target hook zero_call_used_regs should be a subset of all call used
> registers.
>
> At the same time, update documentation for the target hook
> TARGET_ZERO_CALL_USED_REGS.
>
> This new assertion identified a bug in the i386 implemenation, which
> incorrectly set the zeroed_hardregs for stack registers. Fixed this bug
> in i386 implementation.
>
> gcc/ChangeLog:
>
> 2022-03-21  Qing Zhao  <qing.zhao@oracle.com>
>
>         * config/i386/i386.cc (zero_all_st_registers): Return the value of
>         num_of_st.
>         (ix86_zero_call_used_regs): Update zeroed_hardregs set according to
>         the return value of zero_all_st_registers.
>         * doc/tm.texi: Update the documentation of TARGET_ZERO_CALL_USED_REGS.
>         * function.cc (gen_call_used_regs_seq): Add an assertion.
>         * target.def: Update the documentation of TARGET_ZERO_CALL_USED_REGS.

OK for the i386 part.

Thanks,
Uros.

> ---
>  gcc/config/i386/i386.cc | 27 ++++++++++++++++++---------
>  gcc/doc/tm.texi         |  7 +++++++
>  gcc/function.cc         | 22 ++++++++++++++++++----
>  gcc/target.def          |  7 +++++++
>  4 files changed, 50 insertions(+), 13 deletions(-)
>
> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> index 5a561966eb44..d84047a4bc1b 100644
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -3753,16 +3753,17 @@ zero_all_vector_registers (HARD_REG_SET need_zeroed_hardregs)
>     needs to be cleared, the whole stack should be cleared.  However,
>     x87 stack registers that hold the return value should be excluded.
>     x87 returns in the top (two for complex values) register, so
> -   num_of_st should be 7/6 when x87 returns, otherwise it will be 8.  */
> +   num_of_st should be 7/6 when x87 returns, otherwise it will be 8.
> +   return the value of num_of_st.  */
>
>
> -static bool
> +static int
>  zero_all_st_registers (HARD_REG_SET need_zeroed_hardregs)
>  {
>
>    /* If the FPU is disabled, no need to zero all st registers.  */
>    if (! (TARGET_80387 || TARGET_FLOAT_RETURNS_IN_80387))
> -    return false;
> +    return 0;
>
>    unsigned int num_of_st = 0;
>    for (unsigned int regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
> @@ -3774,7 +3775,7 @@ zero_all_st_registers (HARD_REG_SET need_zeroed_hardregs)
>        }
>
>    if (num_of_st == 0)
> -    return false;
> +    return 0;
>
>    bool return_with_x87 = false;
>    return_with_x87 = (crtl->return_rtx
> @@ -3802,7 +3803,7 @@ zero_all_st_registers (HARD_REG_SET need_zeroed_hardregs)
>        insn = emit_insn (gen_rtx_SET (st_reg, st_reg));
>        add_reg_note (insn, REG_DEAD, st_reg);
>      }
> -  return true;
> +  return num_of_st;
>  }
>
>
> @@ -3851,7 +3852,7 @@ ix86_zero_call_used_regs (HARD_REG_SET need_zeroed_hardregs)
>  {
>    HARD_REG_SET zeroed_hardregs;
>    bool all_sse_zeroed = false;
> -  bool all_st_zeroed = false;
> +  int all_st_zeroed_num = 0;
>    bool all_mm_zeroed = false;
>
>    CLEAR_HARD_REG_SET (zeroed_hardregs);
> @@ -3881,9 +3882,17 @@ ix86_zero_call_used_regs (HARD_REG_SET need_zeroed_hardregs)
>    if (!exit_with_mmx_mode)
>      /* x87 exit mode, we should zero all st registers together.  */
>      {
> -      all_st_zeroed = zero_all_st_registers (need_zeroed_hardregs);
> -      if (all_st_zeroed)
> -       SET_HARD_REG_BIT (zeroed_hardregs, FIRST_STACK_REG);
> +      all_st_zeroed_num = zero_all_st_registers (need_zeroed_hardregs);
> +
> +      if (all_st_zeroed_num > 0)
> +       for (unsigned int regno = FIRST_STACK_REG; regno <= LAST_STACK_REG; regno++)
> +         /* x87 stack registers that hold the return value should be excluded.
> +            x87 returns in the top (two for complex values) register.  */
> +         if (all_st_zeroed_num == 8
> +             || !((all_st_zeroed_num >= 6 && regno == REGNO (crtl->return_rtx))
> +                  || (all_st_zeroed_num == 6
> +                      && (regno == (REGNO (crtl->return_rtx) + 1)))))
> +           SET_HARD_REG_BIT (zeroed_hardregs, regno);
>      }
>    else
>      /* MMX exit mode, check whether we can zero all mm registers.  */
> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> index 2f92d37da8c0..c5006afc00d2 100644
> --- a/gcc/doc/tm.texi
> +++ b/gcc/doc/tm.texi
> @@ -12330,6 +12330,13 @@ This target hook emits instructions to zero the subset of @var{selected_regs}
>  that could conceivably contain values that are useful to an attacker.
>  Return the set of registers that were actually cleared.
>
> +For most targets, the returned set of registers is a subset of
> +@var{selected_regs}, however, for some of the targets (for example MIPS),
> +clearing some registers that are in the @var{selected_regs} requires
> +clearing other call used registers that are not in the @var{selected_regs},
> +under such situation, the returned set of registers must be a subset of all
> +call used registers.
> +
>  The default implementation uses normal move instructions to zero
>  all the registers in @var{selected_regs}.  Define this hook if the
>  target has more efficient ways of zeroing certain registers,
> diff --git a/gcc/function.cc b/gcc/function.cc
> index d5ed51a6a663..ad0096a43eff 100644
> --- a/gcc/function.cc
> +++ b/gcc/function.cc
> @@ -5892,7 +5892,9 @@ gen_call_used_regs_seq (rtx_insn *ret, unsigned int zero_regs_type)
>    df_simulate_one_insn_backwards (bb, ret, live_out);
>
>    HARD_REG_SET selected_hardregs;
> +  HARD_REG_SET all_call_used_regs;
>    CLEAR_HARD_REG_SET (selected_hardregs);
> +  CLEAR_HARD_REG_SET (all_call_used_regs);
>    for (unsigned int regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
>      {
>        if (!crtl->abi->clobbers_full_reg_p (regno))
> @@ -5901,6 +5903,13 @@ gen_call_used_regs_seq (rtx_insn *ret, unsigned int zero_regs_type)
>         continue;
>        if (REGNO_REG_SET_P (live_out, regno))
>         continue;
> +#ifdef LEAF_REG_REMAP
> +      if (crtl->uses_only_leaf_regs && LEAF_REG_REMAP (regno) < 0)
> +       continue;
> +#endif
> +      /* This is a call used register that is dead at return.  */
> +      SET_HARD_REG_BIT (all_call_used_regs, regno);
> +
>        if (only_gpr
>           && !TEST_HARD_REG_BIT (reg_class_contents[GENERAL_REGS], regno))
>         continue;
> @@ -5908,10 +5917,6 @@ gen_call_used_regs_seq (rtx_insn *ret, unsigned int zero_regs_type)
>         continue;
>        if (only_arg && !FUNCTION_ARG_REGNO_P (regno))
>         continue;
> -#ifdef LEAF_REG_REMAP
> -      if (crtl->uses_only_leaf_regs && LEAF_REG_REMAP (regno) < 0)
> -       continue;
> -#endif
>
>        /* Now this is a register that we might want to zero.  */
>        SET_HARD_REG_BIT (selected_hardregs, regno);
> @@ -5925,6 +5930,15 @@ gen_call_used_regs_seq (rtx_insn *ret, unsigned int zero_regs_type)
>    HARD_REG_SET zeroed_hardregs;
>    start_sequence ();
>    zeroed_hardregs = targetm.calls.zero_call_used_regs (selected_hardregs);
> +
> +  /* For most targets, the returned set of registers is a subset of
> +     selected_hardregs, however, for some of the targets (for example MIPS),
> +     clearing some registers that are in selected_hardregs requires clearing
> +     other call used registers that are not in the selected_hardregs, under
> +     such situation, the returned set of registers must be a subset of
> +     all call used registers.  */
> +  gcc_assert (hard_reg_set_subset_p (zeroed_hardregs, all_call_used_regs));
> +
>    rtx_insn *seq = get_insns ();
>    end_sequence ();
>    if (seq)
> diff --git a/gcc/target.def b/gcc/target.def
> index 72c2e1ef756c..d85adf36a391 100644
> --- a/gcc/target.def
> +++ b/gcc/target.def
> @@ -5122,6 +5122,13 @@ DEFHOOK
>  that could conceivably contain values that are useful to an attacker.\n\
>  Return the set of registers that were actually cleared.\n\
>  \n\
> +For most targets, the returned set of registers is a subset of\n\
> +@var{selected_regs}, however, for some of the targets (for example MIPS),\n\
> +clearing some registers that are in the @var{selected_regs} requires\n\
> +clearing other call used registers that are not in the @var{selected_regs},\n\
> +under such situation, the returned set of registers must be a subset of all\n\
> +call used registers.\n\
> +\n\
>  The default implementation uses normal move instructions to zero\n\
>  all the registers in @var{selected_regs}.  Define this hook if the\n\
>  target has more efficient ways of zeroing certain registers,\n\
> --
> 2.27.0
>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [patch]update the documentation for TARGET_ZERO_CALL_USED_REGS hook and add an assertion
  2022-03-31 13:10 ` Richard Sandiford
  2022-03-31 13:53   ` Qing Zhao
@ 2022-04-01 18:45   ` Qing Zhao
  1 sibling, 0 replies; 5+ messages in thread
From: Qing Zhao @ 2022-04-01 18:45 UTC (permalink / raw)
  To: Richard Sandiford, ubizjak; +Cc: gcc-patches Paul A Clarke via, Xi Ruoyao

FYI. 

I have committed the change to upstream as:

31933f4f788b6cd64cbb7ee42076997f6d0fe212

Qing
> On Mar 31, 2022, at 8:10 AM, Richard Sandiford <richard.sandiford@arm.com> wrote:
> 
> Qing Zhao <qing.zhao@oracle.com> writes:
>> Hi, 
>> 
>> Per our discussion on: https://gcc.gnu.org/pipermail/gcc-patches/2022-March/592002.html
>> 
>> I come up with the following patch to:
>> 
>> 1. Update the documentation for TARGET_ZERO_CALL_USED_REGS hook;
>> 2. Add an assertion in function.cc to make sure the actually zeroed_regs is a subset of all call used regs;
>>   (The reason I didn’t add a new parameter to TARGET_ZERO_CALL_USED_REGS is, I think adding the 
>>    assertion in the common place function.cc is simpler to be implemented).
> 
> Yeah, that's fair.  I guess in theory, passing the parameter would allow
> targets to choose between two versions of zeroing the register, one with
> a temporary and one without.  But that's a purely hypothetical situation
> and we could always add a parameter later if that turns out to be useful.
> 
> Perhaps more realistically, there might be other uses of the hook in
> future that want to zero registers for different reasons, with their
> own rules about which registers can be zeroed.  In other words, the
> hook is providing a general facility that happens to be useful for
> -fzero-call-used-regs.  But again, we can deal with that if it ever
> happens.
> 
> So I agree this is the right call, especially for stage 4.
> 
>> 3. This new assertion identified a bug in i386 implementation. Fix this bug in i386.
>> 
>> This patch is bootstrapped on both x86 and aarch64, no regression.
>> 
>> Okay for commit?
> 
> OK for the non-x86 bits.
> 
> Thanks,
> Richard
> 
>> thanks.
>> 
>> Qing
>> 
>> ===================================
>> From 2e5bc1b25a707c6a17afbf03da2a8bec5b03454d Mon Sep 17 00:00:00 2001
>> From: Qing Zhao <qing.zhao@oracle.com>
>> Date: Fri, 18 Mar 2022 20:49:56 +0000
>> Subject: [PATCH] Add an assertion: the zeroed_hardregs set is a subset of all
>> call used regs.
>> 
>> We should make sure that the hard register set that is actually cleared by
>> the target hook zero_call_used_regs should be a subset of all call used
>> registers.
>> 
>> At the same time, update documentation for the target hook
>> TARGET_ZERO_CALL_USED_REGS.
>> 
>> This new assertion identified a bug in the i386 implemenation, which
>> incorrectly set the zeroed_hardregs for stack registers. Fixed this bug
>> in i386 implementation.
>> 
>> gcc/ChangeLog:
>> 
>> 2022-03-21  Qing Zhao  <qing.zhao@oracle.com>
>> 
>> 	* config/i386/i386.cc (zero_all_st_registers): Return the value of
>> 	num_of_st.
>> 	(ix86_zero_call_used_regs): Update zeroed_hardregs set according to
>> 	the return value of zero_all_st_registers.
>> 	* doc/tm.texi: Update the documentation of TARGET_ZERO_CALL_USED_REGS.
>> 	* function.cc (gen_call_used_regs_seq): Add an assertion.
>> 	* target.def: Update the documentation of TARGET_ZERO_CALL_USED_REGS.
>> ---
>> gcc/config/i386/i386.cc | 27 ++++++++++++++++++---------
>> gcc/doc/tm.texi         |  7 +++++++
>> gcc/function.cc         | 22 ++++++++++++++++++----
>> gcc/target.def          |  7 +++++++
>> 4 files changed, 50 insertions(+), 13 deletions(-)
>> 
>> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
>> index 5a561966eb44..d84047a4bc1b 100644
>> --- a/gcc/config/i386/i386.cc
>> +++ b/gcc/config/i386/i386.cc
>> @@ -3753,16 +3753,17 @@ zero_all_vector_registers (HARD_REG_SET need_zeroed_hardregs)
>>    needs to be cleared, the whole stack should be cleared.  However,
>>    x87 stack registers that hold the return value should be excluded.
>>    x87 returns in the top (two for complex values) register, so
>> -   num_of_st should be 7/6 when x87 returns, otherwise it will be 8.  */
>> +   num_of_st should be 7/6 when x87 returns, otherwise it will be 8.
>> +   return the value of num_of_st.  */
>> 
>> 
>> -static bool
>> +static int
>> zero_all_st_registers (HARD_REG_SET need_zeroed_hardregs)
>> {
>> 
>>   /* If the FPU is disabled, no need to zero all st registers.  */
>>   if (! (TARGET_80387 || TARGET_FLOAT_RETURNS_IN_80387))
>> -    return false;
>> +    return 0;
>> 
>>   unsigned int num_of_st = 0;
>>   for (unsigned int regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
>> @@ -3774,7 +3775,7 @@ zero_all_st_registers (HARD_REG_SET need_zeroed_hardregs)
>>       }
>> 
>>   if (num_of_st == 0)
>> -    return false;
>> +    return 0;
>> 
>>   bool return_with_x87 = false;
>>   return_with_x87 = (crtl->return_rtx
>> @@ -3802,7 +3803,7 @@ zero_all_st_registers (HARD_REG_SET need_zeroed_hardregs)
>>       insn = emit_insn (gen_rtx_SET (st_reg, st_reg));
>>       add_reg_note (insn, REG_DEAD, st_reg);
>>     }
>> -  return true;
>> +  return num_of_st;
>> }
>> 
>> 
>> @@ -3851,7 +3852,7 @@ ix86_zero_call_used_regs (HARD_REG_SET need_zeroed_hardregs)
>> {
>>   HARD_REG_SET zeroed_hardregs;
>>   bool all_sse_zeroed = false;
>> -  bool all_st_zeroed = false;
>> +  int all_st_zeroed_num = 0;
>>   bool all_mm_zeroed = false;
>> 
>>   CLEAR_HARD_REG_SET (zeroed_hardregs);
>> @@ -3881,9 +3882,17 @@ ix86_zero_call_used_regs (HARD_REG_SET need_zeroed_hardregs)
>>   if (!exit_with_mmx_mode)
>>     /* x87 exit mode, we should zero all st registers together.  */
>>     {
>> -      all_st_zeroed = zero_all_st_registers (need_zeroed_hardregs);
>> -      if (all_st_zeroed)
>> -	SET_HARD_REG_BIT (zeroed_hardregs, FIRST_STACK_REG);
>> +      all_st_zeroed_num = zero_all_st_registers (need_zeroed_hardregs);
>> +
>> +      if (all_st_zeroed_num > 0)
>> +	for (unsigned int regno = FIRST_STACK_REG; regno <= LAST_STACK_REG; regno++)
>> +	  /* x87 stack registers that hold the return value should be excluded.
>> +	     x87 returns in the top (two for complex values) register.  */
>> +	  if (all_st_zeroed_num == 8
>> +	      || !((all_st_zeroed_num >= 6 && regno == REGNO (crtl->return_rtx))
>> +		   || (all_st_zeroed_num == 6
>> +		       && (regno == (REGNO (crtl->return_rtx) + 1)))))
>> +	    SET_HARD_REG_BIT (zeroed_hardregs, regno);
>>     }
>>   else
>>     /* MMX exit mode, check whether we can zero all mm registers.  */
>> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
>> index 2f92d37da8c0..c5006afc00d2 100644
>> --- a/gcc/doc/tm.texi
>> +++ b/gcc/doc/tm.texi
>> @@ -12330,6 +12330,13 @@ This target hook emits instructions to zero the subset of @var{selected_regs}
>> that could conceivably contain values that are useful to an attacker.
>> Return the set of registers that were actually cleared.
>> 
>> +For most targets, the returned set of registers is a subset of
>> +@var{selected_regs}, however, for some of the targets (for example MIPS),
>> +clearing some registers that are in the @var{selected_regs} requires
>> +clearing other call used registers that are not in the @var{selected_regs},
>> +under such situation, the returned set of registers must be a subset of all
>> +call used registers.
>> +
>> The default implementation uses normal move instructions to zero
>> all the registers in @var{selected_regs}.  Define this hook if the
>> target has more efficient ways of zeroing certain registers,
>> diff --git a/gcc/function.cc b/gcc/function.cc
>> index d5ed51a6a663..ad0096a43eff 100644
>> --- a/gcc/function.cc
>> +++ b/gcc/function.cc
>> @@ -5892,7 +5892,9 @@ gen_call_used_regs_seq (rtx_insn *ret, unsigned int zero_regs_type)
>>   df_simulate_one_insn_backwards (bb, ret, live_out);
>> 
>>   HARD_REG_SET selected_hardregs;
>> +  HARD_REG_SET all_call_used_regs;
>>   CLEAR_HARD_REG_SET (selected_hardregs);
>> +  CLEAR_HARD_REG_SET (all_call_used_regs);
>>   for (unsigned int regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
>>     {
>>       if (!crtl->abi->clobbers_full_reg_p (regno))
>> @@ -5901,6 +5903,13 @@ gen_call_used_regs_seq (rtx_insn *ret, unsigned int zero_regs_type)
>> 	continue;
>>       if (REGNO_REG_SET_P (live_out, regno))
>> 	continue;
>> +#ifdef LEAF_REG_REMAP
>> +      if (crtl->uses_only_leaf_regs && LEAF_REG_REMAP (regno) < 0)
>> +	continue;
>> +#endif
>> +      /* This is a call used register that is dead at return.  */
>> +      SET_HARD_REG_BIT (all_call_used_regs, regno);
>> +
>>       if (only_gpr
>> 	  && !TEST_HARD_REG_BIT (reg_class_contents[GENERAL_REGS], regno))
>> 	continue;
>> @@ -5908,10 +5917,6 @@ gen_call_used_regs_seq (rtx_insn *ret, unsigned int zero_regs_type)
>> 	continue;
>>       if (only_arg && !FUNCTION_ARG_REGNO_P (regno))
>> 	continue;
>> -#ifdef LEAF_REG_REMAP
>> -      if (crtl->uses_only_leaf_regs && LEAF_REG_REMAP (regno) < 0)
>> -	continue;
>> -#endif
>> 
>>       /* Now this is a register that we might want to zero.  */
>>       SET_HARD_REG_BIT (selected_hardregs, regno);
>> @@ -5925,6 +5930,15 @@ gen_call_used_regs_seq (rtx_insn *ret, unsigned int zero_regs_type)
>>   HARD_REG_SET zeroed_hardregs;
>>   start_sequence ();
>>   zeroed_hardregs = targetm.calls.zero_call_used_regs (selected_hardregs);
>> +
>> +  /* For most targets, the returned set of registers is a subset of
>> +     selected_hardregs, however, for some of the targets (for example MIPS),
>> +     clearing some registers that are in selected_hardregs requires clearing
>> +     other call used registers that are not in the selected_hardregs, under
>> +     such situation, the returned set of registers must be a subset of
>> +     all call used registers.  */
>> +  gcc_assert (hard_reg_set_subset_p (zeroed_hardregs, all_call_used_regs));
>> +
>>   rtx_insn *seq = get_insns ();
>>   end_sequence ();
>>   if (seq)
>> diff --git a/gcc/target.def b/gcc/target.def
>> index 72c2e1ef756c..d85adf36a391 100644
>> --- a/gcc/target.def
>> +++ b/gcc/target.def
>> @@ -5122,6 +5122,13 @@ DEFHOOK
>> that could conceivably contain values that are useful to an attacker.\n\
>> Return the set of registers that were actually cleared.\n\
>> \n\
>> +For most targets, the returned set of registers is a subset of\n\
>> +@var{selected_regs}, however, for some of the targets (for example MIPS),\n\
>> +clearing some registers that are in the @var{selected_regs} requires\n\
>> +clearing other call used registers that are not in the @var{selected_regs},\n\
>> +under such situation, the returned set of registers must be a subset of all\n\
>> +call used registers.\n\
>> +\n\
>> The default implementation uses normal move instructions to zero\n\
>> all the registers in @var{selected_regs}.  Define this hook if the\n\
>> target has more efficient ways of zeroing certain registers,\n\


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2022-04-01 18:45 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-21 21:27 [patch]update the documentation for TARGET_ZERO_CALL_USED_REGS hook and add an assertion Qing Zhao
2022-03-31 13:10 ` Richard Sandiford
2022-03-31 13:53   ` Qing Zhao
2022-04-01 18:45   ` Qing Zhao
2022-03-31 20:03 ` Uros Bizjak

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).