public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Jakub Jelinek <jakub@redhat.com>
To: "H.J. Lu" <hjl.tools@gmail.com>, Uros Bizjak <ubizjak@gmail.com>
Cc: gcc-patches@gcc.gnu.org, rep.dot.nop@gmail.com
Subject: Re: [PATCH v4] x86-64: Find a scratch register for large model profiling
Date: Fri, 2 Feb 2024 17:10:05 +0100	[thread overview]
Message-ID: <Zb0T3VJdRuT9ZReR@tucnak> (raw)
In-Reply-To: <20240202154200.306107-1-hjl.tools@gmail.com>

On Fri, Feb 02, 2024 at 07:42:00AM -0800, H.J. Lu wrote:
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -22749,6 +22749,39 @@ current_fentry_section (const char **name)
>    return true;
>  }
>  
> +/* Return an caller-saved register, which isn't live, at entry for
> +   profile.  */
> +
> +static int
> +x86_64_select_profile_regnum (bool r11_ok ATTRIBUTE_UNUSED)
> +{
> +  /* Use %r10 if it isn't used by DRAP.  */
> +  if (!crtl->drap_reg || REGNO (crtl->drap_reg) != R10_REG)

I'd really like to see flag_fentry != 0 || here, if the profiler is
emitted before the prologue (so before initializing the drap register),
%r10 is a fine choice.

> +    return R10_REG;
> +
> +  bitmap reg_live = df_get_live_out (ENTRY_BLOCK_PTR_FOR_FN (cfun));

I meant at pro_and_epilogue time, but perhaps doing it here
can discover arguments of the function which are used somewhere in
the body too.

> +  int i;
> +  for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
> +    if (GENERAL_REGNO_P (i)
> +	&& i != R10_REG
> +#ifdef NO_PROFILE_COUNTERS
> +	&& (r11_ok || i != R11_REG)
> +#else
> +	&& i != R11_REG
> +#endif
> +	&& (!REX2_INT_REGNO_P (i) || TARGET_APX_EGPR)
> +	&& !fixed_regs[i]
> +	&& call_used_regs[i]

I wonder if this shouldn't be && (call_used_regs[i] || X)
where X would cover registers known to be saved in the prologue
which aren't live from the prologue to the body (stuff like hard frame
pointer if used).
Because if the prologue say saves %r12 or %rbx to stack but doesn't
yet set it to something, why couldn't the profiler use it?
I'd expect cfun->machine contains something what has been saved there.

> +	&& !REGNO_REG_SET_P (reg_live, i))
> +      return i;
> +
> +  sorry ("no register available for profiling %<-mcmodel=large%s%>",
> +	 ix86_cmodel == CM_LARGE_PIC ? " -fPIC" : "");
> +
> +  return INVALID_REGNUM;
> +}
> +
>  /* Output assembler code to FILE to increment profiler label # LABELNO
>     for profiling a function entry.  */
>  void
> @@ -22783,42 +22816,60 @@ x86_function_profiler (FILE *file, int labelno ATTRIBUTE_UNUSED)
>  	fprintf (file, "\tleaq\t%sP%d(%%rip), %%r11\n", LPREFIX, labelno);
>  #endif
>  
> +      int scratch;
> +      const char *reg_prefix;
> +      const char *reg;
> +
>        if (!TARGET_PECOFF)
>  	{
>  	  switch (ix86_cmodel)
>  	    {
>  	    case CM_LARGE:
> -	      /* NB: R10 is caller-saved.  Although it can be used as a
> -		 static chain register, it is preserved when calling
> -		 mcount for nested functions.  */
> +	      scratch = x86_64_select_profile_regnum (true);
> +	      reg = hi_reg_name[scratch];
> +	      reg_prefix = LEGACY_INT_REGNO_P (scratch) ? "r" : "";
>  	      if (ASSEMBLER_DIALECT == ASM_INTEL)
> -		fprintf (file, "1:\tmovabs\tr10, OFFSET FLAT:%s\n"
> -			       "\tcall\tr10\n", mcount_name);
> +		fprintf (file,
> +			 "1:\tmovabs\t%s%s, OFFSET FLAT:%s\n"
> +			 "\tcall\t%s%s\n",
> +			 reg_prefix, reg, mcount_name, reg_prefix, reg);
>  	      else
> -		fprintf (file, "1:\tmovabsq\t$%s, %%r10\n\tcall\t*%%r10\n",
> -			 mcount_name);
> +		fprintf (file,
> +			 "1:\tmovabsq\t$%s, %%%s%s\n\tcall\t*%%%s%s\n",
> +			 mcount_name, reg_prefix, reg, reg_prefix, reg);
>  	      break;
>  	    case CM_LARGE_PIC:
>  #ifdef NO_PROFILE_COUNTERS
> +	      scratch = x86_64_select_profile_regnum (false);
> +	      reg = hi_reg_name[scratch];
> +	      reg_prefix = LEGACY_INT_REGNO_P (scratch) ? "r" : "";
>  	      if (ASSEMBLER_DIALECT == ASM_INTEL)
>  		{
>  		  fprintf (file, "1:movabs\tr11, "
>  				 "OFFSET FLAT:_GLOBAL_OFFSET_TABLE_-1b\n");
> -		  fprintf (file, "\tlea\tr10, 1b[rip]\n");
> -		  fprintf (file, "\tadd\tr10, r11\n");
> +		  fprintf (file, "\tlea\t%s%s, 1b[rip]\n",
> +			   reg_prefix, reg);
> +		  fprintf (file, "\tadd\t%s%s, r11\n",
> +			   reg_prefix, reg);
>  		  fprintf (file, "\tmovabs\tr11, OFFSET FLAT:%s@PLTOFF\n",
>  			   mcount_name);
> -		  fprintf (file, "\tadd\tr10, r11\n");
> -		  fprintf (file, "\tcall\tr10\n");
> +		  fprintf (file, "\tadd\t%s%s, r11\n",
> +			   reg_prefix, reg);
> +		  fprintf (file, "\tcall\t%s%s\n",
> +			   reg_prefix, reg);
>  		  break;
>  		}
>  	      fprintf (file,
>  		       "1:\tmovabsq\t$_GLOBAL_OFFSET_TABLE_-1b, %%r11\n");
> -	      fprintf (file, "\tleaq\t1b(%%rip), %%r10\n");
> -	      fprintf (file, "\taddq\t%%r11, %%r10\n");
> +	      fprintf (file, "\tleaq\t1b(%%rip), %%%s%s\n",
> +		       reg_prefix, reg);
> +	      fprintf (file, "\taddq\t%%r11, %%%s%s\n",
> +		       reg_prefix, reg);
>  	      fprintf (file, "\tmovabsq\t$%s@PLTOFF, %%r11\n", mcount_name);
> -	      fprintf (file, "\taddq\t%%r11, %%r10\n");
> -	      fprintf (file, "\tcall\t*%%r10\n");
> +	      fprintf (file, "\taddq\t%%r11, %%%s%s\n",
> +		       reg_prefix, reg);
> +	      fprintf (file, "\tcall\t*%%%s%s\n",
> +		       reg_prefix, reg);
>  #else
>  	      sorry ("profiling %<-mcmodel=large%> with PIC is not supported");
>  #endif
> diff --git a/gcc/testsuite/gcc.target/i386/pr113689-1.c b/gcc/testsuite/gcc.target/i386/pr113689-1.c
> new file mode 100644
> index 00000000000..c32445e0fc4
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr113689-1.c
> @@ -0,0 +1,49 @@
> +/* { dg-do run { target { lp64 && fpic } } } */
> +/* { dg-options "-O2 -fno-pic -fprofile -mcmodel=large" } */
> +
> +#include <stdarg.h>
> +
> +__attribute__((noipa,noclone,noinline))

Just noipa instead of noipa,noclone,noinline, the former implies
noclone,noinline too.  In all tests.

And, eventually you want Uros to review this too.

	Jakub


  reply	other threads:[~2024-02-02 16:10 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-02 15:42 H.J. Lu
2024-02-02 16:10 ` Jakub Jelinek [this message]
2024-02-02 19:19   ` H.J. Lu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Zb0T3VJdRuT9ZReR@tucnak \
    --to=jakub@redhat.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=hjl.tools@gmail.com \
    --cc=rep.dot.nop@gmail.com \
    --cc=ubizjak@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).