public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Ivan Sorokin <vanyacpp@gmail.com>
To: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] [PR91400] build only one __cpu_model variable
Date: Mon, 3 May 2021 12:08:39 +0300	[thread overview]
Message-ID: <1163bb22-4682-1a51-6268-3fdd48485853@gmail.com> (raw)
In-Reply-To: <20210503083948.449-1-vanyacpp@gmail.com>

As GCC 11 is released and the development of GCC 12 begins I would like
to send this patch again.

It is basically unchanged since the previous submission in February. The 
only change I made is fixing formatting by adding a space after gcc_assert.

Please note, that the original author of the code no longer actively 
contribute to GCC, so he is unlikely to review this patch. It would be 
great if someone else takes a look at it.

On 03.05.2021 11:39, Ivan Sorokin wrote:
> Prior to this commit GCC -O2 generated quite bad code for this
> function:
> 
> bool f()
> {
>      return __builtin_cpu_supports("popcnt")
>          && __builtin_cpu_supports("ssse3");
> }
> 
> f:
>          movl    __cpu_model+12(%rip), %eax
>          xorl    %r8d, %r8d
>          testb   $4, %al
>          je      .L1
>          shrl    $6, %eax
>          movl    %eax, %r8d
>          andl    $1, %r8d
> .L1:
>          movl    %r8d, %eax
>          ret
> 
> The problem was caused by the fact that internally every invocation
> of __builtin_cpu_supports built a new variable __cpu_model and a new
> type __processor_model. Because of this GIMPLE level optimizers
> weren't able to CSE the loads of __cpu_model and optimize
> bit-operations properly.
> 
> This commit fixes the problem by caching created __cpu_model
> variable and __processor_model type. Now the GCC -O2 generates:
> 
> f:
>          movl    __cpu_model+12(%rip), %eax
>          andl    $68, %eax
>          cmpl    $68, %eax
>          sete    %al
>          ret
> 
> gcc/ChangeLog:
> 
> 	PR target/91400
> 	* config/i386/i386-builtins.c (fold_builtin_cpu): Extract
> 	building of __cpu_model and __processor_model into new
> 	function.
> 	* config/i386/i386-builtins.c (init_cpu_model_var): New.
> 	Cache creation of __cpu_model and __processor_model.
> 
> gcc/testsuite/Changelog:
> 
> 	PR target/91400
> 	* gcc.target/i386/pr91400.c: New.
> ---
>   gcc/config/i386/i386-builtins.c         | 27 ++++++++++++++++++-------
>   gcc/testsuite/gcc.target/i386/pr91400.c | 11 ++++++++++
>   2 files changed, 31 insertions(+), 7 deletions(-)
>   create mode 100644 gcc/testsuite/gcc.target/i386/pr91400.c
> 
> diff --git a/gcc/config/i386/i386-builtins.c b/gcc/config/i386/i386-builtins.c
> index b66911082ab..b7d9dd18b03 100644
> --- a/gcc/config/i386/i386-builtins.c
> +++ b/gcc/config/i386/i386-builtins.c
> @@ -2103,6 +2103,25 @@ make_var_decl (tree type, const char *name)
>     return new_decl;
>   }
>   
> +static GTY(()) tree __cpu_model_var;
> +static GTY(()) tree __processor_model_type;
> +
> +static void
> +init_cpu_model_var()
> +{
> +  if (__cpu_model_var != NULL_TREE)
> +    {
> +      gcc_assert (__processor_model_type != NULL_TREE);
> +      return;
> +    }
> +
> +  __processor_model_type = build_processor_model_struct ();
> +  __cpu_model_var = make_var_decl (__processor_model_type,
> +				   "__cpu_model");
> +
> +  varpool_node::add (__cpu_model_var);
> +}
> +
>   /* FNDECL is a __builtin_cpu_is or a __builtin_cpu_supports call that is folded
>      into an integer defined in libgcc/config/i386/cpuinfo.c */
>   
> @@ -2114,13 +2133,7 @@ fold_builtin_cpu (tree fndecl, tree *args)
>       = (enum ix86_builtins) DECL_MD_FUNCTION_CODE (fndecl);
>     tree param_string_cst = NULL;
>   
> -  tree __processor_model_type = build_processor_model_struct ();
> -  tree __cpu_model_var = make_var_decl (__processor_model_type,
> -					"__cpu_model");
> -
> -
> -  varpool_node::add (__cpu_model_var);
> -
> +  init_cpu_model_var ();
>     gcc_assert ((args != NULL) && (*args != NULL));
>   
>     param_string_cst = *args;
> diff --git a/gcc/testsuite/gcc.target/i386/pr91400.c b/gcc/testsuite/gcc.target/i386/pr91400.c
> new file mode 100644
> index 00000000000..e8b7d9285f9
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr91400.c
> @@ -0,0 +1,11 @@
> +/* PR target/91400 */
> +/* { dg-do compile { target { ! ia32 } } } */
> +/* { dg-options "-O2" } */
> +/* { dg-final { scan-assembler-times "andl" 1 } } */
> +/* { dg-final { scan-assembler-times "68" 2 } } */
> +/* { dg-final { scan-assembler-not "je" } } */
> +
> +_Bool f()
> +{
> +    return __builtin_cpu_supports("popcnt") && __builtin_cpu_supports("ssse3");
> +}
> 


  reply	other threads:[~2021-05-03  9:08 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-03  8:39 Ivan Sorokin
2021-05-03  9:08 ` Ivan Sorokin [this message]
2021-05-05  7:36 ` Richard Biener
2021-05-05 14:50   ` [PATCH v2] x86: Build only one __cpu_model/__cpu_features2 variables H.J. Lu
2021-05-05 17:59     ` Uros Bizjak
  -- strict thread matches above, loose matches on Subject: below --
2021-02-01 23:00 [PATCH] [PR91400] build only one __cpu_model variable Ivan Sorokin
2021-02-01 23:16 ` Ivan Sorokin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1163bb22-4682-1a51-6268-3fdd48485853@gmail.com \
    --to=vanyacpp@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).