public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] Optimize macro: make it more predictable
@ 2020-10-23 11:47 Martin Liška
  2020-11-03 13:27 ` Richard Biener
                   ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Martin Liška @ 2020-10-23 11:47 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Biener, Jakub Jelinek, Michael Matz

Hey.

This is a follow-up of the discussion that happened in thread about no_stack_protector
attribute: https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545916.html

The current optimize attribute works in the following way:
- 1) we take current global_options as base
- 2) maybe_default_options is called for the currently selected optimization level, which
      means all rules in default_options_table are executed
- 3) attribute values are applied (via decode_options)

So the step 2) is problematic: in case of -O2 -fno-omit-frame-pointer and __attribute__((optimize("-fno-stack-protector")))
ends basically with -O2 -fno-stack-protector because -fno-omit-frame-pointer is default:
     /* -O1 and -Og optimizations.  */
     { OPT_LEVELS_1_PLUS, OPT_fomit_frame_pointer, NULL, 1 },

My patch handled and the current optimize attribute really behaves that same as appending attribute value
to the command line. So far so good. We should also reflect that in documentation entry which is quite
vague right now:

"""
The optimize attribute is used to specify that a function is to be compiled with different optimization options than specified on the command line.
"""

and we may want to handle -Ox in the attribute in a special way. I guess many macro/pragma users expect that

-O2 -ftree-vectorize and __attribute__((optimize(1))) will end with -O1 and not
with -ftree-vectorize -O1 ?

I'm also planning to take a look at the target macro/attribute, I expect similar problems:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97469

Thoughts?
Thanks,
Martin

gcc/c-family/ChangeLog:

	* c-common.c (parse_optimize_options): Decoded attribute options
	with the ones that were already set on the command line.

gcc/ChangeLog:

	* toplev.c (toplev::main): Save decoded Optimization options.
	* toplev.h (save_opt_decoded_options): New.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx512er-vrsqrt28ps-3.c: Disable -ffast-math.
	* gcc.target/i386/avx512er-vrsqrt28ps-5.c: Likewise.
---
  gcc/c-family/c-common.c                           | 15 ++++++++++++++-
  .../gcc.target/i386/avx512er-vrsqrt28ps-3.c       |  2 +-
  .../gcc.target/i386/avx512er-vrsqrt28ps-5.c       |  2 +-
  gcc/toplev.c                                      |  8 ++++++++
  gcc/toplev.h                                      |  1 +
  5 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index e16ca3894bc..d4342e93d0a 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -5727,10 +5727,23 @@ parse_optimize_options (tree args, bool attr_p)
        j++;
      }
    decoded_options_count = j;
+
+  /* Merge the decoded options with save_decoded_options.  */
+  unsigned save_opt_count = save_opt_decoded_options.length ();
+  unsigned merged_decoded_options_count = save_opt_count + decoded_options_count;
+  cl_decoded_option *merged_decoded_options
+    = XNEWVEC (cl_decoded_option, merged_decoded_options_count);
+
+  for (unsigned i = 0; i < save_opt_count; ++i)
+    merged_decoded_options[i] = save_opt_decoded_options[i];
+  for (unsigned i = 0; i < decoded_options_count; ++i)
+    merged_decoded_options[save_opt_count + i] = decoded_options[i];
+
    /* And apply them.  */
    decode_options (&global_options, &global_options_set,
-		  decoded_options, decoded_options_count,
+		  merged_decoded_options, merged_decoded_options_count,
  		  input_location, global_dc, NULL);
+  free (decoded_options);
  
    targetm.override_options_after_change();
  
diff --git a/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-3.c b/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-3.c
index 1ba8172d6e3..40aefb50844 100644
--- a/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-3.c
+++ b/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-3.c
@@ -8,7 +8,7 @@
  #define MAX 1000
  #define EPS 0.00001
  
-__attribute__ ((noinline, optimize (1)))
+__attribute__ ((noinline, optimize (1, "-fno-fast-math")))
  void static
  compute_rsqrt_ref (float *a, float *r)
  {
diff --git a/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-5.c b/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-5.c
index e067a81e562..498f4d50aa5 100644
--- a/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-5.c
+++ b/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-5.c
@@ -8,7 +8,7 @@
  #define MAX 1000
  #define EPS 0.00001
  
-__attribute__ ((noinline, optimize (1)))
+__attribute__ ((noinline, optimize (1, "-fno-fast-math")))
  void static
  compute_sqrt_ref (float *a, float *r)
  {
diff --git a/gcc/toplev.c b/gcc/toplev.c
index 8c1e1e1f44f..27e4a1bc485 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -123,6 +123,9 @@ static bool no_backend;
  struct cl_decoded_option *save_decoded_options;
  unsigned int save_decoded_options_count;
  
+/* Vector of saved Optimization decoded command line options.  */
+auto_vec<cl_decoded_option> save_opt_decoded_options;
+
  /* Used to enable -fvar-tracking, -fweb and -frename-registers according
     to optimize in process_options ().  */
  #define AUTODETECT_VALUE 2
@@ -2430,6 +2433,11 @@ toplev::main (int argc, char **argv)
  						&save_decoded_options,
  						&save_decoded_options_count);
  
+  /* Save Optimization decoded options.  */
+  for (unsigned i = 0; i < save_decoded_options_count; ++i)
+    if (cl_options[save_decoded_options[i].opt_index].flags & CL_OPTIMIZATION)
+      save_opt_decoded_options.safe_push (save_decoded_options[i]);
+
    /* Perform language-specific options initialization.  */
    lang_hooks.init_options (save_decoded_options_count, save_decoded_options);
  
diff --git a/gcc/toplev.h b/gcc/toplev.h
index d6c316962b0..627312f85f5 100644
--- a/gcc/toplev.h
+++ b/gcc/toplev.h
@@ -23,6 +23,7 @@ along with GCC; see the file COPYING3.  If not see
  /* Decoded options, and number of such options.  */
  extern struct cl_decoded_option *save_decoded_options;
  extern unsigned int save_decoded_options_count;
+extern auto_vec<cl_decoded_option> save_opt_decoded_options;
  
  class timer;
  
-- 
2.28.0


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Optimize macro: make it more predictable
  2020-10-23 11:47 [PATCH] Optimize macro: make it more predictable Martin Liška
@ 2020-11-03 13:27 ` Richard Biener
  2020-11-03 13:34   ` Jakub Jelinek
  2020-11-09 10:27   ` Martin Liška
  2020-11-06 17:34 ` Jeff Law
  2021-07-01 13:13 ` Martin Liška
  2 siblings, 2 replies; 30+ messages in thread
From: Richard Biener @ 2020-11-03 13:27 UTC (permalink / raw)
  To: Martin Liška; +Cc: GCC Patches, Jakub Jelinek, Michael Matz

On Fri, Oct 23, 2020 at 1:47 PM Martin Liška <mliska@suse.cz> wrote:
>
> Hey.
>
> This is a follow-up of the discussion that happened in thread about no_stack_protector
> attribute: https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545916.html
>
> The current optimize attribute works in the following way:
> - 1) we take current global_options as base
> - 2) maybe_default_options is called for the currently selected optimization level, which
>       means all rules in default_options_table are executed
> - 3) attribute values are applied (via decode_options)
>
> So the step 2) is problematic: in case of -O2 -fno-omit-frame-pointer and __attribute__((optimize("-fno-stack-protector")))
> ends basically with -O2 -fno-stack-protector because -fno-omit-frame-pointer is default:
>      /* -O1 and -Og optimizations.  */
>      { OPT_LEVELS_1_PLUS, OPT_fomit_frame_pointer, NULL, 1 },
>
> My patch handled and the current optimize attribute really behaves that same as appending attribute value
> to the command line. So far so good. We should also reflect that in documentation entry which is quite
> vague right now:
>
> """
> The optimize attribute is used to specify that a function is to be compiled with different optimization options than specified on the command line.
> """
>
> and we may want to handle -Ox in the attribute in a special way. I guess many macro/pragma users expect that
>
> -O2 -ftree-vectorize and __attribute__((optimize(1))) will end with -O1 and not
> with -ftree-vectorize -O1 ?

Hmm.  I guess the only two reasonable options are to append to the active set
and thus end up with -ftree-vectorize -O1 or to start from an empty set and thus
end up with -O1.

Maybe we can have

@item optimize (@var{level}, @dots{})

reset everything to plain -On and

@item optimize (@var{string}, @dots{})

append?  So optimize("O1") will end up with -O2 -ftree-vectorize -O1 and
optimize(1) with -O1?  How do we handle

void __attribute__((optimize(1),optimize("ftree-vectorize")))

thus two optimize attributes?

> I'm also planning to take a look at the target macro/attribute, I expect similar problems:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97469
>
> Thoughts?
> Thanks,
> Martin
>
> gcc/c-family/ChangeLog:
>
>         * c-common.c (parse_optimize_options): Decoded attribute options
>         with the ones that were already set on the command line.
>
> gcc/ChangeLog:
>
>         * toplev.c (toplev::main): Save decoded Optimization options.
>         * toplev.h (save_opt_decoded_options): New.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/i386/avx512er-vrsqrt28ps-3.c: Disable -ffast-math.
>         * gcc.target/i386/avx512er-vrsqrt28ps-5.c: Likewise.
> ---
>   gcc/c-family/c-common.c                           | 15 ++++++++++++++-
>   .../gcc.target/i386/avx512er-vrsqrt28ps-3.c       |  2 +-
>   .../gcc.target/i386/avx512er-vrsqrt28ps-5.c       |  2 +-
>   gcc/toplev.c                                      |  8 ++++++++
>   gcc/toplev.h                                      |  1 +
>   5 files changed, 25 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
> index e16ca3894bc..d4342e93d0a 100644
> --- a/gcc/c-family/c-common.c
> +++ b/gcc/c-family/c-common.c
> @@ -5727,10 +5727,23 @@ parse_optimize_options (tree args, bool attr_p)
>         j++;
>       }
>     decoded_options_count = j;
> +
> +  /* Merge the decoded options with save_decoded_options.  */
> +  unsigned save_opt_count = save_opt_decoded_options.length ();
> +  unsigned merged_decoded_options_count = save_opt_count + decoded_options_count;
> +  cl_decoded_option *merged_decoded_options
> +    = XNEWVEC (cl_decoded_option, merged_decoded_options_count);
> +
> +  for (unsigned i = 0; i < save_opt_count; ++i)
> +    merged_decoded_options[i] = save_opt_decoded_options[i];
> +  for (unsigned i = 0; i < decoded_options_count; ++i)
> +    merged_decoded_options[save_opt_count + i] = decoded_options[i];
> +
>     /* And apply them.  */
>     decode_options (&global_options, &global_options_set,
> -                 decoded_options, decoded_options_count,
> +                 merged_decoded_options, merged_decoded_options_count,
>                   input_location, global_dc, NULL);
> +  free (decoded_options);
>
>     targetm.override_options_after_change();
>
> diff --git a/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-3.c b/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-3.c
> index 1ba8172d6e3..40aefb50844 100644
> --- a/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-3.c
> +++ b/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-3.c
> @@ -8,7 +8,7 @@
>   #define MAX 1000
>   #define EPS 0.00001
>
> -__attribute__ ((noinline, optimize (1)))
> +__attribute__ ((noinline, optimize (1, "-fno-fast-math")))
>   void static
>   compute_rsqrt_ref (float *a, float *r)
>   {
> diff --git a/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-5.c b/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-5.c
> index e067a81e562..498f4d50aa5 100644
> --- a/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-5.c
> +++ b/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-5.c
> @@ -8,7 +8,7 @@
>   #define MAX 1000
>   #define EPS 0.00001
>
> -__attribute__ ((noinline, optimize (1)))
> +__attribute__ ((noinline, optimize (1, "-fno-fast-math")))
>   void static
>   compute_sqrt_ref (float *a, float *r)
>   {
> diff --git a/gcc/toplev.c b/gcc/toplev.c
> index 8c1e1e1f44f..27e4a1bc485 100644
> --- a/gcc/toplev.c
> +++ b/gcc/toplev.c
> @@ -123,6 +123,9 @@ static bool no_backend;
>   struct cl_decoded_option *save_decoded_options;
>   unsigned int save_decoded_options_count;
>
> +/* Vector of saved Optimization decoded command line options.  */
> +auto_vec<cl_decoded_option> save_opt_decoded_options;
> +
>   /* Used to enable -fvar-tracking, -fweb and -frename-registers according
>      to optimize in process_options ().  */
>   #define AUTODETECT_VALUE 2
> @@ -2430,6 +2433,11 @@ toplev::main (int argc, char **argv)
>                                                 &save_decoded_options,
>                                                 &save_decoded_options_count);
>
> +  /* Save Optimization decoded options.  */
> +  for (unsigned i = 0; i < save_decoded_options_count; ++i)
> +    if (cl_options[save_decoded_options[i].opt_index].flags & CL_OPTIMIZATION)
> +      save_opt_decoded_options.safe_push (save_decoded_options[i]);
> +
>     /* Perform language-specific options initialization.  */
>     lang_hooks.init_options (save_decoded_options_count, save_decoded_options);
>
> diff --git a/gcc/toplev.h b/gcc/toplev.h
> index d6c316962b0..627312f85f5 100644
> --- a/gcc/toplev.h
> +++ b/gcc/toplev.h
> @@ -23,6 +23,7 @@ along with GCC; see the file COPYING3.  If not see
>   /* Decoded options, and number of such options.  */
>   extern struct cl_decoded_option *save_decoded_options;
>   extern unsigned int save_decoded_options_count;
> +extern auto_vec<cl_decoded_option> save_opt_decoded_options;
>
>   class timer;
>
> --
> 2.28.0
>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Optimize macro: make it more predictable
  2020-11-03 13:27 ` Richard Biener
@ 2020-11-03 13:34   ` Jakub Jelinek
  2020-11-03 13:40     ` Richard Biener
  2020-11-09 10:35     ` Martin Liška
  2020-11-09 10:27   ` Martin Liška
  1 sibling, 2 replies; 30+ messages in thread
From: Jakub Jelinek @ 2020-11-03 13:34 UTC (permalink / raw)
  To: Richard Biener; +Cc: Martin Liška, GCC Patches, Michael Matz

On Tue, Nov 03, 2020 at 02:27:52PM +0100, Richard Biener wrote:
> On Fri, Oct 23, 2020 at 1:47 PM Martin Liška <mliska@suse.cz> wrote:
> > This is a follow-up of the discussion that happened in thread about no_stack_protector
> > attribute: https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545916.html
> >
> > The current optimize attribute works in the following way:
> > - 1) we take current global_options as base
> > - 2) maybe_default_options is called for the currently selected optimization level, which
> >       means all rules in default_options_table are executed
> > - 3) attribute values are applied (via decode_options)
> >
> > So the step 2) is problematic: in case of -O2 -fno-omit-frame-pointer and __attribute__((optimize("-fno-stack-protector")))
> > ends basically with -O2 -fno-stack-protector because -fno-omit-frame-pointer is default:
> >      /* -O1 and -Og optimizations.  */
> >      { OPT_LEVELS_1_PLUS, OPT_fomit_frame_pointer, NULL, 1 },
> >
> > My patch handled and the current optimize attribute really behaves that same as appending attribute value
> > to the command line. So far so good. We should also reflect that in documentation entry which is quite
> > vague right now:
> >
> > """
> > The optimize attribute is used to specify that a function is to be compiled with different optimization options than specified on the command line.
> > """
> >
> > and we may want to handle -Ox in the attribute in a special way. I guess many macro/pragma users expect that
> >
> > -O2 -ftree-vectorize and __attribute__((optimize(1))) will end with -O1 and not
> > with -ftree-vectorize -O1 ?
> 
> Hmm.  I guess the only two reasonable options are to append to the active set
> and thus end up with -ftree-vectorize -O1 or to start from an empty set and thus
> end up with -O1.

I'd say we always want to append, but only take into account explicit
options.
So basically get the effect of
take the command line, append to that options from the optimize/target
pragmas in effect and append to that options from optimize/target
attributes and only from that figure out the implicit options.

	Jakub


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Optimize macro: make it more predictable
  2020-11-03 13:34   ` Jakub Jelinek
@ 2020-11-03 13:40     ` Richard Biener
  2020-11-09 10:35     ` Martin Liška
  1 sibling, 0 replies; 30+ messages in thread
From: Richard Biener @ 2020-11-03 13:40 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Martin Liška, GCC Patches, Michael Matz

On Tue, Nov 3, 2020 at 2:35 PM Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Tue, Nov 03, 2020 at 02:27:52PM +0100, Richard Biener wrote:
> > On Fri, Oct 23, 2020 at 1:47 PM Martin Liška <mliska@suse.cz> wrote:
> > > This is a follow-up of the discussion that happened in thread about no_stack_protector
> > > attribute: https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545916.html
> > >
> > > The current optimize attribute works in the following way:
> > > - 1) we take current global_options as base
> > > - 2) maybe_default_options is called for the currently selected optimization level, which
> > >       means all rules in default_options_table are executed
> > > - 3) attribute values are applied (via decode_options)
> > >
> > > So the step 2) is problematic: in case of -O2 -fno-omit-frame-pointer and __attribute__((optimize("-fno-stack-protector")))
> > > ends basically with -O2 -fno-stack-protector because -fno-omit-frame-pointer is default:
> > >      /* -O1 and -Og optimizations.  */
> > >      { OPT_LEVELS_1_PLUS, OPT_fomit_frame_pointer, NULL, 1 },
> > >
> > > My patch handled and the current optimize attribute really behaves that same as appending attribute value
> > > to the command line. So far so good. We should also reflect that in documentation entry which is quite
> > > vague right now:
> > >
> > > """
> > > The optimize attribute is used to specify that a function is to be compiled with different optimization options than specified on the command line.
> > > """
> > >
> > > and we may want to handle -Ox in the attribute in a special way. I guess many macro/pragma users expect that
> > >
> > > -O2 -ftree-vectorize and __attribute__((optimize(1))) will end with -O1 and not
> > > with -ftree-vectorize -O1 ?
> >
> > Hmm.  I guess the only two reasonable options are to append to the active set
> > and thus end up with -ftree-vectorize -O1 or to start from an empty set and thus
> > end up with -O1.
>
> I'd say we always want to append, but only take into account explicit
> options.
> So basically get the effect of
> take the command line, append to that options from the optimize/target
> pragmas in effect and append to that options from optimize/target
> attributes and only from that figure out the implicit options.

OK, so minus target options that is what martins patch does, right?

Richard.

>         Jakub
>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Optimize macro: make it more predictable
  2020-10-23 11:47 [PATCH] Optimize macro: make it more predictable Martin Liška
  2020-11-03 13:27 ` Richard Biener
@ 2020-11-06 17:34 ` Jeff Law
  2020-11-09 10:36   ` Martin Liška
  2021-07-01 13:13 ` Martin Liška
  2 siblings, 1 reply; 30+ messages in thread
From: Jeff Law @ 2020-11-06 17:34 UTC (permalink / raw)
  To: Martin Liška, gcc-patches; +Cc: Jakub Jelinek, Michael Matz


On 10/23/20 5:47 AM, Martin Liška wrote:
> Hey.
>
> This is a follow-up of the discussion that happened in thread about
> no_stack_protector
> attribute: https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545916.html
>
> The current optimize attribute works in the following way:
> - 1) we take current global_options as base
> - 2) maybe_default_options is called for the currently selected
> optimization level, which
>      means all rules in default_options_table are executed
> - 3) attribute values are applied (via decode_options)
>
> So the step 2) is problematic: in case of -O2 -fno-omit-frame-pointer
> and __attribute__((optimize("-fno-stack-protector")))
> ends basically with -O2 -fno-stack-protector because
> -fno-omit-frame-pointer is default:
>     /* -O1 and -Og optimizations.  */
>     { OPT_LEVELS_1_PLUS, OPT_fomit_frame_pointer, NULL, 1 },
>
> My patch handled and the current optimize attribute really behaves
> that same as appending attribute value
> to the command line. So far so good. We should also reflect that in
> documentation entry which is quite
> vague right now:
>
> """
> The optimize attribute is used to specify that a function is to be
> compiled with different optimization options than specified on the
> command line.
> """
>
> and we may want to handle -Ox in the attribute in a special way. I
> guess many macro/pragma users expect that
>
> -O2 -ftree-vectorize and __attribute__((optimize(1))) will end with
> -O1 and not
> with -ftree-vectorize -O1 ?
>
> I'm also planning to take a look at the target macro/attribute, I
> expect similar problems:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97469
>
> Thoughts?
> Thanks,
> Martin
>
> gcc/c-family/ChangeLog:
>
>     * c-common.c (parse_optimize_options): Decoded attribute options
>     with the ones that were already set on the command line.
>
> gcc/ChangeLog:
>
>     * toplev.c (toplev::main): Save decoded Optimization options.
>     * toplev.h (save_opt_decoded_options): New.
>
> gcc/testsuite/ChangeLog:
>
>     * gcc.target/i386/avx512er-vrsqrt28ps-3.c: Disable -ffast-math.
>     * gcc.target/i386/avx512er-vrsqrt28ps-5.c: Likewise.
So you XNEWVEC and store the result into "merge_decoded_options".  But
you free "decoded_options".  Was that intentional?

This seems to bring a bit more predictability, but I suspect there's
more to do here.


jeff



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Optimize macro: make it more predictable
  2020-11-03 13:27 ` Richard Biener
  2020-11-03 13:34   ` Jakub Jelinek
@ 2020-11-09 10:27   ` Martin Liška
  1 sibling, 0 replies; 30+ messages in thread
From: Martin Liška @ 2020-11-09 10:27 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches, Jakub Jelinek, Michael Matz

On 11/3/20 2:27 PM, Richard Biener wrote:
> On Fri, Oct 23, 2020 at 1:47 PM Martin Liška <mliska@suse.cz> wrote:
>>
>> Hey.
>>
>> This is a follow-up of the discussion that happened in thread about no_stack_protector
>> attribute: https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545916.html
>>
>> The current optimize attribute works in the following way:
>> - 1) we take current global_options as base
>> - 2) maybe_default_options is called for the currently selected optimization level, which
>>        means all rules in default_options_table are executed
>> - 3) attribute values are applied (via decode_options)
>>
>> So the step 2) is problematic: in case of -O2 -fno-omit-frame-pointer and __attribute__((optimize("-fno-stack-protector")))
>> ends basically with -O2 -fno-stack-protector because -fno-omit-frame-pointer is default:
>>       /* -O1 and -Og optimizations.  */
>>       { OPT_LEVELS_1_PLUS, OPT_fomit_frame_pointer, NULL, 1 },
>>
>> My patch handled and the current optimize attribute really behaves that same as appending attribute value
>> to the command line. So far so good. We should also reflect that in documentation entry which is quite
>> vague right now:
>>
>> """
>> The optimize attribute is used to specify that a function is to be compiled with different optimization options than specified on the command line.
>> """
>>
>> and we may want to handle -Ox in the attribute in a special way. I guess many macro/pragma users expect that
>>
>> -O2 -ftree-vectorize and __attribute__((optimize(1))) will end with -O1 and not
>> with -ftree-vectorize -O1 ?
> 
> Hmm.  I guess the only two reasonable options are to append to the active set
> and thus end up with -ftree-vectorize -O1 or to start from an empty set and thus
> end up with -O1.
> 
> Maybe we can have
> 
> @item optimize (@var{level}, @dots{})
> 
> reset everything to plain -On and
> 
> @item optimize (@var{string}, @dots{})
> 
> append?  So optimize("O1") will end up with -O2 -ftree-vectorize -O1 and
> optimize(1) with -O1?  How do we handle

Hello.

To be honest I don't like it much, it can be quite confusing for uses of the GCC.

> 
> void __attribute__((optimize(1),optimize("ftree-vectorize")))
> 
> thus two optimize attributes?

Yes, parse_optimize_options is called twice for the situation.
I'm going to reply to Jakub's email ...

> 
>> I'm also planning to take a look at the target macro/attribute, I expect similar problems:
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97469
>>
>> Thoughts?
>> Thanks,
>> Martin
>>
>> gcc/c-family/ChangeLog:
>>
>>          * c-common.c (parse_optimize_options): Decoded attribute options
>>          with the ones that were already set on the command line.
>>
>> gcc/ChangeLog:
>>
>>          * toplev.c (toplev::main): Save decoded Optimization options.
>>          * toplev.h (save_opt_decoded_options): New.
>>
>> gcc/testsuite/ChangeLog:
>>
>>          * gcc.target/i386/avx512er-vrsqrt28ps-3.c: Disable -ffast-math.
>>          * gcc.target/i386/avx512er-vrsqrt28ps-5.c: Likewise.
>> ---
>>    gcc/c-family/c-common.c                           | 15 ++++++++++++++-
>>    .../gcc.target/i386/avx512er-vrsqrt28ps-3.c       |  2 +-
>>    .../gcc.target/i386/avx512er-vrsqrt28ps-5.c       |  2 +-
>>    gcc/toplev.c                                      |  8 ++++++++
>>    gcc/toplev.h                                      |  1 +
>>    5 files changed, 25 insertions(+), 3 deletions(-)
>>
>> diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
>> index e16ca3894bc..d4342e93d0a 100644
>> --- a/gcc/c-family/c-common.c
>> +++ b/gcc/c-family/c-common.c
>> @@ -5727,10 +5727,23 @@ parse_optimize_options (tree args, bool attr_p)
>>          j++;
>>        }
>>      decoded_options_count = j;
>> +
>> +  /* Merge the decoded options with save_decoded_options.  */
>> +  unsigned save_opt_count = save_opt_decoded_options.length ();
>> +  unsigned merged_decoded_options_count = save_opt_count + decoded_options_count;
>> +  cl_decoded_option *merged_decoded_options
>> +    = XNEWVEC (cl_decoded_option, merged_decoded_options_count);
>> +
>> +  for (unsigned i = 0; i < save_opt_count; ++i)
>> +    merged_decoded_options[i] = save_opt_decoded_options[i];
>> +  for (unsigned i = 0; i < decoded_options_count; ++i)
>> +    merged_decoded_options[save_opt_count + i] = decoded_options[i];
>> +
>>      /* And apply them.  */
>>      decode_options (&global_options, &global_options_set,
>> -                 decoded_options, decoded_options_count,
>> +                 merged_decoded_options, merged_decoded_options_count,
>>                    input_location, global_dc, NULL);
>> +  free (decoded_options);
>>
>>      targetm.override_options_after_change();
>>
>> diff --git a/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-3.c b/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-3.c
>> index 1ba8172d6e3..40aefb50844 100644
>> --- a/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-3.c
>> +++ b/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-3.c
>> @@ -8,7 +8,7 @@
>>    #define MAX 1000
>>    #define EPS 0.00001
>>
>> -__attribute__ ((noinline, optimize (1)))
>> +__attribute__ ((noinline, optimize (1, "-fno-fast-math")))
>>    void static
>>    compute_rsqrt_ref (float *a, float *r)
>>    {
>> diff --git a/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-5.c b/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-5.c
>> index e067a81e562..498f4d50aa5 100644
>> --- a/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-5.c
>> +++ b/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-5.c
>> @@ -8,7 +8,7 @@
>>    #define MAX 1000
>>    #define EPS 0.00001
>>
>> -__attribute__ ((noinline, optimize (1)))
>> +__attribute__ ((noinline, optimize (1, "-fno-fast-math")))
>>    void static
>>    compute_sqrt_ref (float *a, float *r)
>>    {
>> diff --git a/gcc/toplev.c b/gcc/toplev.c
>> index 8c1e1e1f44f..27e4a1bc485 100644
>> --- a/gcc/toplev.c
>> +++ b/gcc/toplev.c
>> @@ -123,6 +123,9 @@ static bool no_backend;
>>    struct cl_decoded_option *save_decoded_options;
>>    unsigned int save_decoded_options_count;
>>
>> +/* Vector of saved Optimization decoded command line options.  */
>> +auto_vec<cl_decoded_option> save_opt_decoded_options;
>> +
>>    /* Used to enable -fvar-tracking, -fweb and -frename-registers according
>>       to optimize in process_options ().  */
>>    #define AUTODETECT_VALUE 2
>> @@ -2430,6 +2433,11 @@ toplev::main (int argc, char **argv)
>>                                                  &save_decoded_options,
>>                                                  &save_decoded_options_count);
>>
>> +  /* Save Optimization decoded options.  */
>> +  for (unsigned i = 0; i < save_decoded_options_count; ++i)
>> +    if (cl_options[save_decoded_options[i].opt_index].flags & CL_OPTIMIZATION)
>> +      save_opt_decoded_options.safe_push (save_decoded_options[i]);
>> +
>>      /* Perform language-specific options initialization.  */
>>      lang_hooks.init_options (save_decoded_options_count, save_decoded_options);
>>
>> diff --git a/gcc/toplev.h b/gcc/toplev.h
>> index d6c316962b0..627312f85f5 100644
>> --- a/gcc/toplev.h
>> +++ b/gcc/toplev.h
>> @@ -23,6 +23,7 @@ along with GCC; see the file COPYING3.  If not see
>>    /* Decoded options, and number of such options.  */
>>    extern struct cl_decoded_option *save_decoded_options;
>>    extern unsigned int save_decoded_options_count;
>> +extern auto_vec<cl_decoded_option> save_opt_decoded_options;
>>
>>    class timer;
>>
>> --
>> 2.28.0
>>


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Optimize macro: make it more predictable
  2020-11-03 13:34   ` Jakub Jelinek
  2020-11-03 13:40     ` Richard Biener
@ 2020-11-09 10:35     ` Martin Liška
  2020-11-26 13:56       ` Martin Liška
  1 sibling, 1 reply; 30+ messages in thread
From: Martin Liška @ 2020-11-09 10:35 UTC (permalink / raw)
  To: Jakub Jelinek, Richard Biener; +Cc: GCC Patches, Michael Matz

On 11/3/20 2:34 PM, Jakub Jelinek wrote:
> On Tue, Nov 03, 2020 at 02:27:52PM +0100, Richard Biener wrote:
>> On Fri, Oct 23, 2020 at 1:47 PM Martin Liška <mliska@suse.cz> wrote:
>>> This is a follow-up of the discussion that happened in thread about no_stack_protector
>>> attribute: https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545916.html
>>>
>>> The current optimize attribute works in the following way:
>>> - 1) we take current global_options as base
>>> - 2) maybe_default_options is called for the currently selected optimization level, which
>>>        means all rules in default_options_table are executed
>>> - 3) attribute values are applied (via decode_options)
>>>
>>> So the step 2) is problematic: in case of -O2 -fno-omit-frame-pointer and __attribute__((optimize("-fno-stack-protector")))
>>> ends basically with -O2 -fno-stack-protector because -fno-omit-frame-pointer is default:
>>>       /* -O1 and -Og optimizations.  */
>>>       { OPT_LEVELS_1_PLUS, OPT_fomit_frame_pointer, NULL, 1 },
>>>
>>> My patch handled and the current optimize attribute really behaves that same as appending attribute value
>>> to the command line. So far so good. We should also reflect that in documentation entry which is quite
>>> vague right now:
>>>
>>> """
>>> The optimize attribute is used to specify that a function is to be compiled with different optimization options than specified on the command line.
>>> """
>>>
>>> and we may want to handle -Ox in the attribute in a special way. I guess many macro/pragma users expect that
>>>
>>> -O2 -ftree-vectorize and __attribute__((optimize(1))) will end with -O1 and not
>>> with -ftree-vectorize -O1 ?
>>
>> Hmm.  I guess the only two reasonable options are to append to the active set
>> and thus end up with -ftree-vectorize -O1 or to start from an empty set and thus
>> end up with -O1.
> 
> I'd say we always want to append, but only take into account explicit
> options.

Yes, I also prefer to always append and basically drop the "reset" functionality.

> So basically get the effect of
> take the command line, append to that options from the optimize/target
> pragmas in effect and append to that options from optimize/target
> attributes and only from that figure out the implicit options.

Few notes here:
- target and optimize attributes are separate so parsing happens independently; however
   they use global_options and global_options_set as a starting point
- you can have a series of wrapped optimize/pragma macros and again information is shared
in global_options/global_options_set
- target and optimize options interact, but in a controlled way with SET_OPTION_IF_UNSET

That said, I hope the biggest offender is right now the handling of -Olevel.

@Jakub: Do you see a situation with my patch where it breaks?

Thanks,
Martin

> 
> 	Jakub
> 


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Optimize macro: make it more predictable
  2020-11-06 17:34 ` Jeff Law
@ 2020-11-09 10:36   ` Martin Liška
  0 siblings, 0 replies; 30+ messages in thread
From: Martin Liška @ 2020-11-09 10:36 UTC (permalink / raw)
  To: Jeff Law, gcc-patches; +Cc: Jakub Jelinek, Michael Matz

On 11/6/20 6:34 PM, Jeff Law wrote:
> So you XNEWVEC and store the result into "merge_decoded_options".  But
> you free "decoded_options".  Was that intentional?

Hello.

Good point here.

> 
> This seems to bring a bit more predictability, but I suspect there's
> more to do here.

Yes, both should be freed. One can see the following leak on master:

==12237== 176 bytes in 1 blocks are definitely lost in loss record 669 of 786
==12237==    at 0x483BD7B: realloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==12237==    by 0x1AB10CD: xrealloc (xmalloc.c:179)
==12237==    by 0x1A1AE59: prune_options (opts-common.c:1139)
==12237==    by 0x1A1AE59: decode_cmdline_options_to_array(unsigned int, char const**, unsigned int, cl_decoded_option**, unsigned int*) (opts-common.c:1027)
==12237==    by 0xDCD456: decode_cmdline_options_to_array_default_mask(unsigned int, char const**, cl_decoded_option**, unsigned int*) (opts-global.c:273)
==12237==    by 0x921377: parse_optimize_options(tree_node*, bool) (c-common.c:5709)
==12237==    by 0x9768DB: handle_optimize_attribute(tree_node**, tree_node*, tree_node*, int, bool*) (c-attribs.c:4962)
==12237==    by 0x84596A: decl_attributes(tree_node**, tree_node*, int, tree_node*) (attribs.c:723)
==12237==    by 0x856F88: c_decl_attributes(tree_node**, tree_node*, int) (c-decl.c:5043)
==12237==    by 0x8661E5: start_function(c_declspecs*, c_declarator*, tree_node*) (c-decl.c:9408)
==12237==    by 0x8D644A: c_parser_declaration_or_fndef(c_parser*, bool, bool, bool, bool, bool, tree_node**, vec<c_token, va_heap, vl_ptr>, bool, tree_node*, oacc_routine_data*, bool*) (c-parser.c:2444)
==12237==    by 0x8DF343: c_parser_external_declaration(c_parser*) (c-parser.c:1777)
==12237==    by 0x8DFE41: c_parser_translation_unit (c-parser.c:1650)
==12237==    by 0x8DFE41: c_parse_file() (c-parser.c:21876)

Martin

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Optimize macro: make it more predictable
  2020-11-09 10:35     ` Martin Liška
@ 2020-11-26 13:56       ` Martin Liška
  2020-12-07 11:03         ` Martin Liška
  0 siblings, 1 reply; 30+ messages in thread
From: Martin Liška @ 2020-11-26 13:56 UTC (permalink / raw)
  To: Jakub Jelinek, Richard Biener; +Cc: Michael Matz, GCC Patches

PING^1

On 11/9/20 11:35 AM, Martin Liška wrote:
> On 11/3/20 2:34 PM, Jakub Jelinek wrote:
>> On Tue, Nov 03, 2020 at 02:27:52PM +0100, Richard Biener wrote:
>>> On Fri, Oct 23, 2020 at 1:47 PM Martin Liška <mliska@suse.cz> wrote:
>>>> This is a follow-up of the discussion that happened in thread about no_stack_protector
>>>> attribute: https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545916.html
>>>>
>>>> The current optimize attribute works in the following way:
>>>> - 1) we take current global_options as base
>>>> - 2) maybe_default_options is called for the currently selected optimization level, which
>>>>        means all rules in default_options_table are executed
>>>> - 3) attribute values are applied (via decode_options)
>>>>
>>>> So the step 2) is problematic: in case of -O2 -fno-omit-frame-pointer and __attribute__((optimize("-fno-stack-protector")))
>>>> ends basically with -O2 -fno-stack-protector because -fno-omit-frame-pointer is default:
>>>>       /* -O1 and -Og optimizations.  */
>>>>       { OPT_LEVELS_1_PLUS, OPT_fomit_frame_pointer, NULL, 1 },
>>>>
>>>> My patch handled and the current optimize attribute really behaves that same as appending attribute value
>>>> to the command line. So far so good. We should also reflect that in documentation entry which is quite
>>>> vague right now:
>>>>
>>>> """
>>>> The optimize attribute is used to specify that a function is to be compiled with different optimization options than specified on the command line.
>>>> """
>>>>
>>>> and we may want to handle -Ox in the attribute in a special way. I guess many macro/pragma users expect that
>>>>
>>>> -O2 -ftree-vectorize and __attribute__((optimize(1))) will end with -O1 and not
>>>> with -ftree-vectorize -O1 ?
>>>
>>> Hmm.  I guess the only two reasonable options are to append to the active set
>>> and thus end up with -ftree-vectorize -O1 or to start from an empty set and thus
>>> end up with -O1.
>>
>> I'd say we always want to append, but only take into account explicit
>> options.
> 
> Yes, I also prefer to always append and basically drop the "reset" functionality.
> 
>> So basically get the effect of
>> take the command line, append to that options from the optimize/target
>> pragmas in effect and append to that options from optimize/target
>> attributes and only from that figure out the implicit options.
> 
> Few notes here:
> - target and optimize attributes are separate so parsing happens independently; however
>    they use global_options and global_options_set as a starting point
> - you can have a series of wrapped optimize/pragma macros and again information is shared
> in global_options/global_options_set
> - target and optimize options interact, but in a controlled way with SET_OPTION_IF_UNSET
> 
> That said, I hope the biggest offender is right now the handling of -Olevel.
> 
> @Jakub: Do you see a situation with my patch where it breaks?
> 
> Thanks,
> Martin
> 
>>
>>     Jakub
>>
> 


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Optimize macro: make it more predictable
  2020-11-26 13:56       ` Martin Liška
@ 2020-12-07 11:03         ` Martin Liška
  2021-01-11 13:10           ` Martin Liška
  0 siblings, 1 reply; 30+ messages in thread
From: Martin Liška @ 2020-12-07 11:03 UTC (permalink / raw)
  To: Jakub Jelinek, Richard Biener; +Cc: Michael Matz, GCC Patches

PING^2

On 11/26/20 2:56 PM, Martin Liška wrote:
> PING^1
> 
> On 11/9/20 11:35 AM, Martin Liška wrote:
>> On 11/3/20 2:34 PM, Jakub Jelinek wrote:
>>> On Tue, Nov 03, 2020 at 02:27:52PM +0100, Richard Biener wrote:
>>>> On Fri, Oct 23, 2020 at 1:47 PM Martin Liška <mliska@suse.cz> wrote:
>>>>> This is a follow-up of the discussion that happened in thread about no_stack_protector
>>>>> attribute: https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545916.html
>>>>>
>>>>> The current optimize attribute works in the following way:
>>>>> - 1) we take current global_options as base
>>>>> - 2) maybe_default_options is called for the currently selected optimization level, which
>>>>>        means all rules in default_options_table are executed
>>>>> - 3) attribute values are applied (via decode_options)
>>>>>
>>>>> So the step 2) is problematic: in case of -O2 -fno-omit-frame-pointer and __attribute__((optimize("-fno-stack-protector")))
>>>>> ends basically with -O2 -fno-stack-protector because -fno-omit-frame-pointer is default:
>>>>>       /* -O1 and -Og optimizations.  */
>>>>>       { OPT_LEVELS_1_PLUS, OPT_fomit_frame_pointer, NULL, 1 },
>>>>>
>>>>> My patch handled and the current optimize attribute really behaves that same as appending attribute value
>>>>> to the command line. So far so good. We should also reflect that in documentation entry which is quite
>>>>> vague right now:
>>>>>
>>>>> """
>>>>> The optimize attribute is used to specify that a function is to be compiled with different optimization options than specified on the command line.
>>>>> """
>>>>>
>>>>> and we may want to handle -Ox in the attribute in a special way. I guess many macro/pragma users expect that
>>>>>
>>>>> -O2 -ftree-vectorize and __attribute__((optimize(1))) will end with -O1 and not
>>>>> with -ftree-vectorize -O1 ?
>>>>
>>>> Hmm.  I guess the only two reasonable options are to append to the active set
>>>> and thus end up with -ftree-vectorize -O1 or to start from an empty set and thus
>>>> end up with -O1.
>>>
>>> I'd say we always want to append, but only take into account explicit
>>> options.
>>
>> Yes, I also prefer to always append and basically drop the "reset" functionality.
>>
>>> So basically get the effect of
>>> take the command line, append to that options from the optimize/target
>>> pragmas in effect and append to that options from optimize/target
>>> attributes and only from that figure out the implicit options.
>>
>> Few notes here:
>> - target and optimize attributes are separate so parsing happens independently; however
>>    they use global_options and global_options_set as a starting point
>> - you can have a series of wrapped optimize/pragma macros and again information is shared
>> in global_options/global_options_set
>> - target and optimize options interact, but in a controlled way with SET_OPTION_IF_UNSET
>>
>> That said, I hope the biggest offender is right now the handling of -Olevel.
>>
>> @Jakub: Do you see a situation with my patch where it breaks?
>>
>> Thanks,
>> Martin
>>
>>>
>>>     Jakub
>>>
>>
> 


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Optimize macro: make it more predictable
  2020-12-07 11:03         ` Martin Liška
@ 2021-01-11 13:10           ` Martin Liška
  0 siblings, 0 replies; 30+ messages in thread
From: Martin Liška @ 2021-01-11 13:10 UTC (permalink / raw)
  To: Jakub Jelinek, Richard Biener; +Cc: Michael Matz, GCC Patches

I'm suggesting postponing this to GCC 12 as I'm planning a bigger
target/optimize attribute (pragma) overhaul.

Martin

On 12/7/20 12:03 PM, Martin Liška wrote:
> PING^2
> 
> On 11/26/20 2:56 PM, Martin Liška wrote:
>> PING^1
>>
>> On 11/9/20 11:35 AM, Martin Liška wrote:
>>> On 11/3/20 2:34 PM, Jakub Jelinek wrote:
>>>> On Tue, Nov 03, 2020 at 02:27:52PM +0100, Richard Biener wrote:
>>>>> On Fri, Oct 23, 2020 at 1:47 PM Martin Liška <mliska@suse.cz> wrote:
>>>>>> This is a follow-up of the discussion that happened in thread about no_stack_protector
>>>>>> attribute: https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545916.html
>>>>>>
>>>>>> The current optimize attribute works in the following way:
>>>>>> - 1) we take current global_options as base
>>>>>> - 2) maybe_default_options is called for the currently selected optimization level, which
>>>>>>        means all rules in default_options_table are executed
>>>>>> - 3) attribute values are applied (via decode_options)
>>>>>>
>>>>>> So the step 2) is problematic: in case of -O2 -fno-omit-frame-pointer and __attribute__((optimize("-fno-stack-protector")))
>>>>>> ends basically with -O2 -fno-stack-protector because -fno-omit-frame-pointer is default:
>>>>>>       /* -O1 and -Og optimizations.  */
>>>>>>       { OPT_LEVELS_1_PLUS, OPT_fomit_frame_pointer, NULL, 1 },
>>>>>>
>>>>>> My patch handled and the current optimize attribute really behaves that same as appending attribute value
>>>>>> to the command line. So far so good. We should also reflect that in documentation entry which is quite
>>>>>> vague right now:
>>>>>>
>>>>>> """
>>>>>> The optimize attribute is used to specify that a function is to be compiled with different optimization options than specified on the command line.
>>>>>> """
>>>>>>
>>>>>> and we may want to handle -Ox in the attribute in a special way. I guess many macro/pragma users expect that
>>>>>>
>>>>>> -O2 -ftree-vectorize and __attribute__((optimize(1))) will end with -O1 and not
>>>>>> with -ftree-vectorize -O1 ?
>>>>>
>>>>> Hmm.  I guess the only two reasonable options are to append to the active set
>>>>> and thus end up with -ftree-vectorize -O1 or to start from an empty set and thus
>>>>> end up with -O1.
>>>>
>>>> I'd say we always want to append, but only take into account explicit
>>>> options.
>>>
>>> Yes, I also prefer to always append and basically drop the "reset" functionality.
>>>
>>>> So basically get the effect of
>>>> take the command line, append to that options from the optimize/target
>>>> pragmas in effect and append to that options from optimize/target
>>>> attributes and only from that figure out the implicit options.
>>>
>>> Few notes here:
>>> - target and optimize attributes are separate so parsing happens independently; however
>>>    they use global_options and global_options_set as a starting point
>>> - you can have a series of wrapped optimize/pragma macros and again information is shared
>>> in global_options/global_options_set
>>> - target and optimize options interact, but in a controlled way with SET_OPTION_IF_UNSET
>>>
>>> That said, I hope the biggest offender is right now the handling of -Olevel.
>>>
>>> @Jakub: Do you see a situation with my patch where it breaks?
>>>
>>> Thanks,
>>> Martin
>>>
>>>>
>>>>     Jakub
>>>>
>>>
>>
> 


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Optimize macro: make it more predictable
  2020-10-23 11:47 [PATCH] Optimize macro: make it more predictable Martin Liška
  2020-11-03 13:27 ` Richard Biener
  2020-11-06 17:34 ` Jeff Law
@ 2021-07-01 13:13 ` Martin Liška
  2021-08-10 15:52   ` Martin Liška
  2021-08-24 12:13   ` Richard Biener
  2 siblings, 2 replies; 30+ messages in thread
From: Martin Liška @ 2021-07-01 13:13 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jakub Jelinek, Michael Matz

[-- Attachment #1: Type: text/plain, Size: 4722 bytes --]

On 10/23/20 1:47 PM, Martin Liška wrote:
> Hey.

Hello.

I deferred the patch for GCC 12. Since the time, I messed up with options
I feel more familiar with the option handling. So ...

> 
> This is a follow-up of the discussion that happened in thread about no_stack_protector
> attribute: https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545916.html
> 
> The current optimize attribute works in the following way:
> - 1) we take current global_options as base
> - 2) maybe_default_options is called for the currently selected optimization level, which
>       means all rules in default_options_table are executed
> - 3) attribute values are applied (via decode_options)
> 
> So the step 2) is problematic: in case of -O2 -fno-omit-frame-pointer and __attribute__((optimize("-fno-stack-protector")))
> ends basically with -O2 -fno-stack-protector because -fno-omit-frame-pointer is default:
>      /* -O1 and -Og optimizations.  */
>      { OPT_LEVELS_1_PLUS, OPT_fomit_frame_pointer, NULL, 1 },
> 
> My patch handled and the current optimize attribute really behaves that same as appending attribute value
> to the command line. So far so good. We should also reflect that in documentation entry which is quite
> vague right now:

^^^ all these are still valid arguments, plus I'm adding a new test-case that tests that.

> 
> """
> The optimize attribute is used to specify that a function is to be compiled with different optimization options than specified on the command line.
> """

I addressed that with documentation changes, should be more clear to users. Moreover, I noticed that we declare 'optimize' attribute
as something not for a production use:

"The optimize attribute should be used for debugging purposes only. It is not suitable in production code."

Are we sure about the statement? I know that e.g. glibc uses that.

> 
> and we may want to handle -Ox in the attribute in a special way. I guess many macro/pragma users expect that
> 
> -O2 -ftree-vectorize and __attribute__((optimize(1))) will end with -O1 and not
> with -ftree-vectorize -O1 ?

The situation with 'target' attribute is different. When parsing the attribute, we intentionally drop all existing target flags:

$ cat -n gcc/config/i386/i386-options.c
...
   1245                if (opt == IX86_FUNCTION_SPECIFIC_ARCH)
   1246                  {
   1247                    /* If arch= is set,  clear all bits in x_ix86_isa_flags,
   1248                       except for ISA_64BIT, ABI_64, ABI_X32, and CODE16
   1249                       and all bits in x_ix86_isa_flags2.  */
   1250                    opts->x_ix86_isa_flags &= (OPTION_MASK_ISA_64BIT
   1251                                               | OPTION_MASK_ABI_64
   1252                                               | OPTION_MASK_ABI_X32
   1253                                               | OPTION_MASK_CODE16);
   1254                    opts->x_ix86_isa_flags_explicit &= (OPTION_MASK_ISA_64BIT
   1255                                                        | OPTION_MASK_ABI_64
   1256                                                        | OPTION_MASK_ABI_X32
   1257                                                        | OPTION_MASK_CODE16);
   1258                    opts->x_ix86_isa_flags2 = 0;
   1259                    opts->x_ix86_isa_flags2_explicit = 0;
   1260                  }

That seems logical because target attribute is used for e.g. ifunc multi-versioning and one needs
to be sure all existing ISA flags are dropped. However, I noticed clang behaves differently:

$ cat hreset.c
#pragma GCC target "arch=geode"
#include <immintrin.h>
void foo(unsigned int eax)
{
   _hreset (eax);
}

$ clang hreset.c -mhreset  -c -O2 -m32
$ gcc hreset.c -mhreset  -c -O2 -m32
In file included from /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/x86gprintrin.h:97,
                  from /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/immintrin.h:27,
                  from hreset.c:2:
hreset.c: In function ‘foo’:
/home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/hresetintrin.h:39:1: error: inlining failed in call to ‘always_inline’ ‘_hreset’: target specific option mismatch
    39 | _hreset (unsigned int __EAX)
       | ^~~~~~~
hreset.c:5:3: note: called from here
     5 |   _hreset (eax);
       |   ^~~~~~~~~~~~~

Anyway, I think the current target attribute handling should be preserved.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

> 
> I'm also planning to take a look at the target macro/attribute, I expect similar problems:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97469
> 
> Thoughts?
> Thanks,
> Martin


[-- Attachment #2: 0001-Append-target-optimize-attr-to-the-current-cmdline.patch --]
[-- Type: text/x-patch, Size: 7129 bytes --]

From eaa1892cbe32c6fe73de7708aa17be2d3917bceb Mon Sep 17 00:00:00 2001
From: Martin Liska <mliska@suse.cz>
Date: Wed, 2 Jun 2021 08:44:37 +0200
Subject: [PATCH] Append target/optimize attr to the current cmdline.

gcc/c-family/ChangeLog:

	* c-common.c (parse_optimize_options): Combine optimize
	options with what was provided on the command line.

gcc/ChangeLog:

	* toplev.c (toplev::main): Save decoded optimization options.
	* toplev.h (save_opt_decoded_options): New.
	* doc/extend.texi: Be more clear about optimize and target
	attributes.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx512er-vrsqrt28ps-3.c: Disable fast math.
	* gcc.target/i386/avx512er-vrsqrt28ps-5.c: Likewise.
	* gcc.target/i386/attr-optimize.c: New test.
---
 gcc/c-family/c-common.c                       | 17 +++++++++++--
 gcc/doc/extend.texi                           |  8 +++++--
 gcc/testsuite/gcc.target/i386/attr-optimize.c | 24 +++++++++++++++++++
 .../gcc.target/i386/avx512er-vrsqrt28ps-3.c   |  2 +-
 .../gcc.target/i386/avx512er-vrsqrt28ps-5.c   |  2 +-
 gcc/toplev.c                                  |  8 +++++++
 gcc/toplev.h                                  |  1 +
 7 files changed, 56 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/attr-optimize.c

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 681fcc972f4..f4e56653585 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -5901,9 +5901,22 @@ parse_optimize_options (tree args, bool attr_p)
       j++;
     }
   decoded_options_count = j;
-  /* And apply them.  */
+
+  /* Merge the decoded options with save_decoded_options.  */
+  unsigned save_opt_count = save_opt_decoded_options.length ();
+  unsigned merged_decoded_options_count
+    = save_opt_count + decoded_options_count;
+  cl_decoded_option *merged_decoded_options
+    = XNEWVEC (cl_decoded_option, merged_decoded_options_count);
+
+  for (unsigned i = 0; i < save_opt_count; ++i)
+    merged_decoded_options[i] = save_opt_decoded_options[i];
+  for (unsigned i = 0; i < decoded_options_count; ++i)
+    merged_decoded_options[save_opt_count + i] = decoded_options[i];
+
+   /* And apply them.  */
   decode_options (&global_options, &global_options_set,
-		  decoded_options, decoded_options_count,
+		  merged_decoded_options, merged_decoded_options_count,
 		  input_location, global_dc, NULL);
   free (decoded_options);
 
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 8fc66d626d8..3b501a8be3a 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -3593,7 +3593,10 @@ take function pointer arguments.
 @cindex @code{optimize} function attribute
 The @code{optimize} attribute is used to specify that a function is to
 be compiled with different optimization options than specified on the
-command line.  Valid arguments are constant non-negative integers and
+command line.  The optimize attribute arguments of a function behave
+as if they were added to the command line options.
+
+Valid arguments are constant non-negative integers and
 strings.  Each numeric argument specifies an optimization @var{level}.
 Each @var{string} argument consists of one or more comma-separated
 substrings.  Each substring that begins with the letter @code{O} refers
@@ -3797,7 +3800,8 @@ This attribute prevents stack protection code for the function.
 Multiple target back ends implement the @code{target} attribute
 to specify that a function is to
 be compiled with different target options than specified on the
-command line.  One or more strings can be provided as arguments.
+command line.  The original target command line options are ignored.
+One or more strings can be provided as arguments.
 Each string consists of one or more comma-separated suffixes to
 the @code{-m} prefix jointly forming the name of a machine-dependent
 option.  @xref{Submodel Options,,Machine-Dependent Options}.
diff --git a/gcc/testsuite/gcc.target/i386/attr-optimize.c b/gcc/testsuite/gcc.target/i386/attr-optimize.c
new file mode 100644
index 00000000000..f5db028f1fd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/attr-optimize.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O1 -ftree-slp-vectorize -march=znver1 -fdump-tree-optimized" } */
+
+/* Use -O2, but -ftree-slp-vectorize option should be preserved and used.  */
+#pragma GCC optimize "-O2"
+
+typedef struct {
+  long n[4];
+} secp256k1_fe;
+
+void *a;
+int c;
+static void
+fn1(secp256k1_fe *p1, int p2)
+{
+  p1->n[0] = p1->n[1] = p2;
+}
+void
+fn2()
+{
+  fn1(a, !c);
+}
+
+/* { dg-final { scan-tree-dump { MEM <vector\(2\) long int> \[[^]]*\] = } "optimized" } } */
diff --git a/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-3.c b/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-3.c
index 1ba8172d6e3..40aefb50844 100644
--- a/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-3.c
+++ b/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-3.c
@@ -8,7 +8,7 @@
 #define MAX 1000
 #define EPS 0.00001
 
-__attribute__ ((noinline, optimize (1)))
+__attribute__ ((noinline, optimize (1, "-fno-fast-math")))
 void static
 compute_rsqrt_ref (float *a, float *r)
 {
diff --git a/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-5.c b/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-5.c
index e067a81e562..498f4d50aa5 100644
--- a/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-5.c
+++ b/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-5.c
@@ -8,7 +8,7 @@
 #define MAX 1000
 #define EPS 0.00001
 
-__attribute__ ((noinline, optimize (1)))
+__attribute__ ((noinline, optimize (1, "-fno-fast-math")))
 void static
 compute_sqrt_ref (float *a, float *r)
 {
diff --git a/gcc/toplev.c b/gcc/toplev.c
index 43f1f7d345e..3e26ad97ff0 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -121,6 +121,9 @@ static bool no_backend;
 struct cl_decoded_option *save_decoded_options;
 unsigned int save_decoded_options_count;
 
+/* Vector of saved Optimization decoded command line options.  */
+auto_vec<cl_decoded_option> save_opt_decoded_options;
+
 /* Used to enable -fvar-tracking, -fweb and -frename-registers according
    to optimize in process_options ().  */
 #define AUTODETECT_VALUE 2
@@ -2335,6 +2338,11 @@ toplev::main (int argc, char **argv)
 						&save_decoded_options,
 						&save_decoded_options_count);
 
+  /* Save Optimization decoded options.  */
+  for (unsigned i = 0; i < save_decoded_options_count; ++i)
+    if (cl_options[save_decoded_options[i].opt_index].flags & CL_OPTIMIZATION)
+      save_opt_decoded_options.safe_push (save_decoded_options[i]);
+
   /* Perform language-specific options initialization.  */
   lang_hooks.init_options (save_decoded_options_count, save_decoded_options);
 
diff --git a/gcc/toplev.h b/gcc/toplev.h
index 175944c85b7..9a1d2e6986f 100644
--- a/gcc/toplev.h
+++ b/gcc/toplev.h
@@ -23,6 +23,7 @@ along with GCC; see the file COPYING3.  If not see
 /* Decoded options, and number of such options.  */
 extern struct cl_decoded_option *save_decoded_options;
 extern unsigned int save_decoded_options_count;
+extern auto_vec<cl_decoded_option> save_opt_decoded_options;
 
 class timer;
 
-- 
2.32.0


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Optimize macro: make it more predictable
  2021-07-01 13:13 ` Martin Liška
@ 2021-08-10 15:52   ` Martin Liška
  2021-08-24 11:06     ` Martin Liška
  2021-08-24 12:13   ` Richard Biener
  1 sibling, 1 reply; 30+ messages in thread
From: Martin Liška @ 2021-08-10 15:52 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jakub Jelinek, Michael Matz

PING^1

On 7/1/21 3:13 PM, Martin Liška wrote:
> On 10/23/20 1:47 PM, Martin Liška wrote:
>> Hey.
> 
> Hello.
> 
> I deferred the patch for GCC 12. Since the time, I messed up with options
> I feel more familiar with the option handling. So ...
> 
>>
>> This is a follow-up of the discussion that happened in thread about no_stack_protector
>> attribute: https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545916.html
>>
>> The current optimize attribute works in the following way:
>> - 1) we take current global_options as base
>> - 2) maybe_default_options is called for the currently selected optimization level, which
>>       means all rules in default_options_table are executed
>> - 3) attribute values are applied (via decode_options)
>>
>> So the step 2) is problematic: in case of -O2 -fno-omit-frame-pointer and __attribute__((optimize("-fno-stack-protector")))
>> ends basically with -O2 -fno-stack-protector because -fno-omit-frame-pointer is default:
>>      /* -O1 and -Og optimizations.  */
>>      { OPT_LEVELS_1_PLUS, OPT_fomit_frame_pointer, NULL, 1 },
>>
>> My patch handled and the current optimize attribute really behaves that same as appending attribute value
>> to the command line. So far so good. We should also reflect that in documentation entry which is quite
>> vague right now:
> 
> ^^^ all these are still valid arguments, plus I'm adding a new test-case that tests that.
> 
>>
>> """
>> The optimize attribute is used to specify that a function is to be compiled with different optimization options than specified on the command line.
>> """
> 
> I addressed that with documentation changes, should be more clear to users. Moreover, I noticed that we declare 'optimize' attribute
> as something not for a production use:
> 
> "The optimize attribute should be used for debugging purposes only. It is not suitable in production code."
> 
> Are we sure about the statement? I know that e.g. glibc uses that.
> 
>>
>> and we may want to handle -Ox in the attribute in a special way. I guess many macro/pragma users expect that
>>
>> -O2 -ftree-vectorize and __attribute__((optimize(1))) will end with -O1 and not
>> with -ftree-vectorize -O1 ?
> 
> The situation with 'target' attribute is different. When parsing the attribute, we intentionally drop all existing target flags:
> 
> $ cat -n gcc/config/i386/i386-options.c
> ...
>    1245                if (opt == IX86_FUNCTION_SPECIFIC_ARCH)
>    1246                  {
>    1247                    /* If arch= is set,  clear all bits in x_ix86_isa_flags,
>    1248                       except for ISA_64BIT, ABI_64, ABI_X32, and CODE16
>    1249                       and all bits in x_ix86_isa_flags2.  */
>    1250                    opts->x_ix86_isa_flags &= (OPTION_MASK_ISA_64BIT
>    1251                                               | OPTION_MASK_ABI_64
>    1252                                               | OPTION_MASK_ABI_X32
>    1253                                               | OPTION_MASK_CODE16);
>    1254                    opts->x_ix86_isa_flags_explicit &= (OPTION_MASK_ISA_64BIT
>    1255                                                        | OPTION_MASK_ABI_64
>    1256                                                        | OPTION_MASK_ABI_X32
>    1257                                                        | OPTION_MASK_CODE16);
>    1258                    opts->x_ix86_isa_flags2 = 0;
>    1259                    opts->x_ix86_isa_flags2_explicit = 0;
>    1260                  }
> 
> That seems logical because target attribute is used for e.g. ifunc multi-versioning and one needs
> to be sure all existing ISA flags are dropped. However, I noticed clang behaves differently:
> 
> $ cat hreset.c
> #pragma GCC target "arch=geode"
> #include <immintrin.h>
> void foo(unsigned int eax)
> {
>    _hreset (eax);
> }
> 
> $ clang hreset.c -mhreset  -c -O2 -m32
> $ gcc hreset.c -mhreset  -c -O2 -m32
> In file included from /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/x86gprintrin.h:97,
>                   from /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/immintrin.h:27,
>                   from hreset.c:2:
> hreset.c: In function ‘foo’:
> /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/hresetintrin.h:39:1: error: inlining failed in call to ‘always_inline’ ‘_hreset’: target specific option mismatch
>     39 | _hreset (unsigned int __EAX)
>        | ^~~~~~~
> hreset.c:5:3: note: called from here
>      5 |   _hreset (eax);
>        |   ^~~~~~~~~~~~~
> 
> Anyway, I think the current target attribute handling should be preserved.
> 
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> 
> Ready to be installed?
> Thanks,
> Martin
> 
>>
>> I'm also planning to take a look at the target macro/attribute, I expect similar problems:
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97469
>>
>> Thoughts?
>> Thanks,
>> Martin
> 


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Optimize macro: make it more predictable
  2021-08-10 15:52   ` Martin Liška
@ 2021-08-24 11:06     ` Martin Liška
  0 siblings, 0 replies; 30+ messages in thread
From: Martin Liška @ 2021-08-24 11:06 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jakub Jelinek, Michael Matz

PING^2

On 8/10/21 17:52, Martin Liška wrote:
> PING^1
> 
> On 7/1/21 3:13 PM, Martin Liška wrote:
>> On 10/23/20 1:47 PM, Martin Liška wrote:
>>> Hey.
>>
>> Hello.
>>
>> I deferred the patch for GCC 12. Since the time, I messed up with options
>> I feel more familiar with the option handling. So ...
>>
>>>
>>> This is a follow-up of the discussion that happened in thread about no_stack_protector
>>> attribute: https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545916.html
>>>
>>> The current optimize attribute works in the following way:
>>> - 1) we take current global_options as base
>>> - 2) maybe_default_options is called for the currently selected optimization level, which
>>>       means all rules in default_options_table are executed
>>> - 3) attribute values are applied (via decode_options)
>>>
>>> So the step 2) is problematic: in case of -O2 -fno-omit-frame-pointer and __attribute__((optimize("-fno-stack-protector")))
>>> ends basically with -O2 -fno-stack-protector because -fno-omit-frame-pointer is default:
>>>      /* -O1 and -Og optimizations.  */
>>>      { OPT_LEVELS_1_PLUS, OPT_fomit_frame_pointer, NULL, 1 },
>>>
>>> My patch handled and the current optimize attribute really behaves that same as appending attribute value
>>> to the command line. So far so good. We should also reflect that in documentation entry which is quite
>>> vague right now:
>>
>> ^^^ all these are still valid arguments, plus I'm adding a new test-case that tests that.
>>
>>>
>>> """
>>> The optimize attribute is used to specify that a function is to be compiled with different optimization options than specified on the command line.
>>> """
>>
>> I addressed that with documentation changes, should be more clear to users. Moreover, I noticed that we declare 'optimize' attribute
>> as something not for a production use:
>>
>> "The optimize attribute should be used for debugging purposes only. It is not suitable in production code."
>>
>> Are we sure about the statement? I know that e.g. glibc uses that.
>>
>>>
>>> and we may want to handle -Ox in the attribute in a special way. I guess many macro/pragma users expect that
>>>
>>> -O2 -ftree-vectorize and __attribute__((optimize(1))) will end with -O1 and not
>>> with -ftree-vectorize -O1 ?
>>
>> The situation with 'target' attribute is different. When parsing the attribute, we intentionally drop all existing target flags:
>>
>> $ cat -n gcc/config/i386/i386-options.c
>> ...
>>    1245                if (opt == IX86_FUNCTION_SPECIFIC_ARCH)
>>    1246                  {
>>    1247                    /* If arch= is set,  clear all bits in x_ix86_isa_flags,
>>    1248                       except for ISA_64BIT, ABI_64, ABI_X32, and CODE16
>>    1249                       and all bits in x_ix86_isa_flags2.  */
>>    1250                    opts->x_ix86_isa_flags &= (OPTION_MASK_ISA_64BIT
>>    1251                                               | OPTION_MASK_ABI_64
>>    1252                                               | OPTION_MASK_ABI_X32
>>    1253                                               | OPTION_MASK_CODE16);
>>    1254                    opts->x_ix86_isa_flags_explicit &= (OPTION_MASK_ISA_64BIT
>>    1255                                                        | OPTION_MASK_ABI_64
>>    1256                                                        | OPTION_MASK_ABI_X32
>>    1257                                                        | OPTION_MASK_CODE16);
>>    1258                    opts->x_ix86_isa_flags2 = 0;
>>    1259                    opts->x_ix86_isa_flags2_explicit = 0;
>>    1260                  }
>>
>> That seems logical because target attribute is used for e.g. ifunc multi-versioning and one needs
>> to be sure all existing ISA flags are dropped. However, I noticed clang behaves differently:
>>
>> $ cat hreset.c
>> #pragma GCC target "arch=geode"
>> #include <immintrin.h>
>> void foo(unsigned int eax)
>> {
>>    _hreset (eax);
>> }
>>
>> $ clang hreset.c -mhreset  -c -O2 -m32
>> $ gcc hreset.c -mhreset  -c -O2 -m32
>> In file included from /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/x86gprintrin.h:97,
>>                   from /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/immintrin.h:27,
>>                   from hreset.c:2:
>> hreset.c: In function ‘foo’:
>> /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/hresetintrin.h:39:1: error: inlining failed in call to ‘always_inline’ ‘_hreset’: target specific option mismatch
>>     39 | _hreset (unsigned int __EAX)
>>        | ^~~~~~~
>> hreset.c:5:3: note: called from here
>>      5 |   _hreset (eax);
>>        |   ^~~~~~~~~~~~~
>>
>> Anyway, I think the current target attribute handling should be preserved.
>>
>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>
>> Ready to be installed?
>> Thanks,
>> Martin
>>
>>>
>>> I'm also planning to take a look at the target macro/attribute, I expect similar problems:
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97469
>>>
>>> Thoughts?
>>> Thanks,
>>> Martin
>>
> 


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Optimize macro: make it more predictable
  2021-07-01 13:13 ` Martin Liška
  2021-08-10 15:52   ` Martin Liška
@ 2021-08-24 12:13   ` Richard Biener
  2021-08-24 13:04     ` Martin Liška
  1 sibling, 1 reply; 30+ messages in thread
From: Richard Biener @ 2021-08-24 12:13 UTC (permalink / raw)
  To: Martin Liška; +Cc: GCC Patches, Jakub Jelinek, Michael Matz

On Thu, Jul 1, 2021 at 3:13 PM Martin Liška <mliska@suse.cz> wrote:
>
> On 10/23/20 1:47 PM, Martin Liška wrote:
> > Hey.
>
> Hello.
>
> I deferred the patch for GCC 12. Since the time, I messed up with options
> I feel more familiar with the option handling. So ...
>
> >
> > This is a follow-up of the discussion that happened in thread about no_stack_protector
> > attribute: https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545916.html
> >
> > The current optimize attribute works in the following way:
> > - 1) we take current global_options as base
> > - 2) maybe_default_options is called for the currently selected optimization level, which
> >       means all rules in default_options_table are executed
> > - 3) attribute values are applied (via decode_options)
> >
> > So the step 2) is problematic: in case of -O2 -fno-omit-frame-pointer and __attribute__((optimize("-fno-stack-protector")))
> > ends basically with -O2 -fno-stack-protector because -fno-omit-frame-pointer is default:
> >      /* -O1 and -Og optimizations.  */
> >      { OPT_LEVELS_1_PLUS, OPT_fomit_frame_pointer, NULL, 1 },
> >
> > My patch handled and the current optimize attribute really behaves that same as appending attribute value
> > to the command line. So far so good. We should also reflect that in documentation entry which is quite
> > vague right now:
>
> ^^^ all these are still valid arguments, plus I'm adding a new test-case that tests that.

There is also handle_common_deferred_options that's not called so any
option processed there should
probably be excempt from being set/unset in the optimize attribute?

> >
> > """
> > The optimize attribute is used to specify that a function is to be compiled with different optimization options than specified on the command line.
> > """
>
> I addressed that with documentation changes, should be more clear to users. Moreover, I noticed that we declare 'optimize' attribute
> as something not for a production use:
>
> "The optimize attribute should be used for debugging purposes only. It is not suitable in production code."
>
> Are we sure about the statement? I know that e.g. glibc uses that.

Well, given we're changing behavior now that warning looks valid ;)
I'll also note that

"The optimize attribute arguments of a function behave
as if they were added to the command line options."

is still likely untrue, the global state init is complicated ;)


> >
> > and we may want to handle -Ox in the attribute in a special way. I guess many macro/pragma users expect that
> >
> > -O2 -ftree-vectorize and __attribute__((optimize(1))) will end with -O1 and not
> > with -ftree-vectorize -O1 ?

As implemented your patch seems to turn it into -ftree-vectorize -O1.
IIRC multiple optimize attributes apply
ontop of each other, and it makes sense to me that optimize (2),
optimize ("tree-vectorize") behaves the same
as optimize (2, "tree-vectorize").  I'm not sure this is still the
case after your patch?  Also consider

#pragma GCC optimize ("tree-vectorize")
void foo () { ...}

#pragma GCC optimize ("tree-loop-distribution")
void bar () {... }

I'd expect bar to have both vectorization and loop distribution
enabled? (note I didn't use push/pop here)

> The situation with 'target' attribute is different. When parsing the attribute, we intentionally drop all existing target flags:
>
> $ cat -n gcc/config/i386/i386-options.c
> ...
>    1245                if (opt == IX86_FUNCTION_SPECIFIC_ARCH)
>    1246                  {
>    1247                    /* If arch= is set,  clear all bits in x_ix86_isa_flags,
>    1248                       except for ISA_64BIT, ABI_64, ABI_X32, and CODE16
>    1249                       and all bits in x_ix86_isa_flags2.  */
>    1250                    opts->x_ix86_isa_flags &= (OPTION_MASK_ISA_64BIT
>    1251                                               | OPTION_MASK_ABI_64
>    1252                                               | OPTION_MASK_ABI_X32
>    1253                                               | OPTION_MASK_CODE16);
>    1254                    opts->x_ix86_isa_flags_explicit &= (OPTION_MASK_ISA_64BIT
>    1255                                                        | OPTION_MASK_ABI_64
>    1256                                                        | OPTION_MASK_ABI_X32
>    1257                                                        | OPTION_MASK_CODE16);
>    1258                    opts->x_ix86_isa_flags2 = 0;
>    1259                    opts->x_ix86_isa_flags2_explicit = 0;
>    1260                  }
>
> That seems logical because target attribute is used for e.g. ifunc multi-versioning and one needs
> to be sure all existing ISA flags are dropped. However, I noticed clang behaves differently:
>
> $ cat hreset.c
> #pragma GCC target "arch=geode"
> #include <immintrin.h>
> void foo(unsigned int eax)
> {
>    _hreset (eax);
> }
>
> $ clang hreset.c -mhreset  -c -O2 -m32
> $ gcc hreset.c -mhreset  -c -O2 -m32
> In file included from /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/x86gprintrin.h:97,
>                   from /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/immintrin.h:27,
>                   from hreset.c:2:
> hreset.c: In function ‘foo’:
> /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/hresetintrin.h:39:1: error: inlining failed in call to ‘always_inline’ ‘_hreset’: target specific option mismatch
>     39 | _hreset (unsigned int __EAX)
>        | ^~~~~~~
> hreset.c:5:3: note: called from here
>      5 |   _hreset (eax);
>        |   ^~~~~~~~~~~~~
>
> Anyway, I think the current target attribute handling should be preserved.

I think this and the -O1 argument above suggests that there should be
a way to distinguish
two modes - add to the active set of options and starting from scratch.

Maybe it's over-designing things but do we want to preserve the
existing behavior
and instead add optimize ("+ftree-vectorize") and target ("+avx2") as
a way to amend
the state?

OTOH as we're missing global_options re-init even with your patch we won't get
the defaults correct (aka what toplev::main does with init_options_struct and
the corresponding langhook).  Likewise if lang_hooks.init_options performs any
defaulting a later flag overrides and we override that with optimize() that
doesn't work - I'm thinking of things like flag_complex_method and -fcx-* flags.
So -O2 -fcx-fortran-rules on the command-line and optimize
("no-cx-fortran-rules")
to cancel the -fcx-fortran-rules switch wouldn't work?

Thanks,
Richard.

> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?
> Thanks,
> Martin
>
> >
> > I'm also planning to take a look at the target macro/attribute, I expect similar problems:
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97469
> >
> > Thoughts?
> > Thanks,
> > Martin
>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Optimize macro: make it more predictable
  2021-08-24 12:13   ` Richard Biener
@ 2021-08-24 13:04     ` Martin Liška
  2021-08-26 11:04       ` Richard Biener
  0 siblings, 1 reply; 30+ messages in thread
From: Martin Liška @ 2021-08-24 13:04 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches, Jakub Jelinek, Michael Matz

On 8/24/21 14:13, Richard Biener wrote:
> On Thu, Jul 1, 2021 at 3:13 PM Martin Liška <mliska@suse.cz> wrote:
>>
>> On 10/23/20 1:47 PM, Martin Liška wrote:
>>> Hey.
>>
>> Hello.
>>
>> I deferred the patch for GCC 12. Since the time, I messed up with options
>> I feel more familiar with the option handling. So ...
>>
>>>
>>> This is a follow-up of the discussion that happened in thread about no_stack_protector
>>> attribute: https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545916.html
>>>
>>> The current optimize attribute works in the following way:
>>> - 1) we take current global_options as base
>>> - 2) maybe_default_options is called for the currently selected optimization level, which
>>>        means all rules in default_options_table are executed
>>> - 3) attribute values are applied (via decode_options)
>>>
>>> So the step 2) is problematic: in case of -O2 -fno-omit-frame-pointer and __attribute__((optimize("-fno-stack-protector")))
>>> ends basically with -O2 -fno-stack-protector because -fno-omit-frame-pointer is default:
>>>       /* -O1 and -Og optimizations.  */
>>>       { OPT_LEVELS_1_PLUS, OPT_fomit_frame_pointer, NULL, 1 },
>>>
>>> My patch handled and the current optimize attribute really behaves that same as appending attribute value
>>> to the command line. So far so good. We should also reflect that in documentation entry which is quite
>>> vague right now:
>>
>> ^^^ all these are still valid arguments, plus I'm adding a new test-case that tests that.

Hey.

> There is also handle_common_deferred_options that's not called so any
> option processed there should
> probably be excempt from being set/unset in the optimize attribute?

Looking at the handled options, they have all Defer type and not Optimization.
Thus we should be fine.

> 
>>>
>>> """
>>> The optimize attribute is used to specify that a function is to be compiled with different optimization options than specified on the command line.
>>> """
>>
>> I addressed that with documentation changes, should be more clear to users. Moreover, I noticed that we declare 'optimize' attribute
>> as something not for a production use:
>>
>> "The optimize attribute should be used for debugging purposes only. It is not suitable in production code."
>>
>> Are we sure about the statement? I know that e.g. glibc uses that.
> 
> Well, given we're changing behavior now that warning looks valid ;)

Yeah! True.

> I'll also note that
> 
> "The optimize attribute arguments of a function behave
> as if they were added to the command line options."
> 
> is still likely untrue, the global state init is complicated ;)

Sure, but the situation should be much closer to it :) Do you have a better wording?

> 
> 
>>>
>>> and we may want to handle -Ox in the attribute in a special way. I guess many macro/pragma users expect that
>>>
>>> -O2 -ftree-vectorize and __attribute__((optimize(1))) will end with -O1 and not
>>> with -ftree-vectorize -O1 ?

This is my older suggestion and it will likely make it even much complicated. So ...

> 
> As implemented your patch seems to turn it into -ftree-vectorize -O1.

Yes.

> IIRC multiple optimize attributes apply
> ontop of each other, and it makes sense to me that optimize (2),
> optimize ("tree-vectorize") behaves the same
> as optimize (2, "tree-vectorize").  I'm not sure this is still the
> case after your patch?  Also consider
> 
> #pragma GCC optimize ("tree-vectorize")
> void foo () { ...}
> 
> #pragma GCC optimize ("tree-loop-distribution")
> void bar () {... }
> 
> I'd expect bar to have both vectorization and loop distribution
> enabled? (note I didn't use push/pop here)

Yes, yes and yes. I'm going to verify it.

> 
>> The situation with 'target' attribute is different. When parsing the attribute, we intentionally drop all existing target flags:
>>
>> $ cat -n gcc/config/i386/i386-options.c
>> ...
>>     1245                if (opt == IX86_FUNCTION_SPECIFIC_ARCH)
>>     1246                  {
>>     1247                    /* If arch= is set,  clear all bits in x_ix86_isa_flags,
>>     1248                       except for ISA_64BIT, ABI_64, ABI_X32, and CODE16
>>     1249                       and all bits in x_ix86_isa_flags2.  */
>>     1250                    opts->x_ix86_isa_flags &= (OPTION_MASK_ISA_64BIT
>>     1251                                               | OPTION_MASK_ABI_64
>>     1252                                               | OPTION_MASK_ABI_X32
>>     1253                                               | OPTION_MASK_CODE16);
>>     1254                    opts->x_ix86_isa_flags_explicit &= (OPTION_MASK_ISA_64BIT
>>     1255                                                        | OPTION_MASK_ABI_64
>>     1256                                                        | OPTION_MASK_ABI_X32
>>     1257                                                        | OPTION_MASK_CODE16);
>>     1258                    opts->x_ix86_isa_flags2 = 0;
>>     1259                    opts->x_ix86_isa_flags2_explicit = 0;
>>     1260                  }
>>
>> That seems logical because target attribute is used for e.g. ifunc multi-versioning and one needs
>> to be sure all existing ISA flags are dropped. However, I noticed clang behaves differently:
>>
>> $ cat hreset.c
>> #pragma GCC target "arch=geode"
>> #include <immintrin.h>
>> void foo(unsigned int eax)
>> {
>>     _hreset (eax);
>> }
>>
>> $ clang hreset.c -mhreset  -c -O2 -m32
>> $ gcc hreset.c -mhreset  -c -O2 -m32
>> In file included from /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/x86gprintrin.h:97,
>>                    from /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/immintrin.h:27,
>>                    from hreset.c:2:
>> hreset.c: In function ‘foo’:
>> /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/hresetintrin.h:39:1: error: inlining failed in call to ‘always_inline’ ‘_hreset’: target specific option mismatch
>>      39 | _hreset (unsigned int __EAX)
>>         | ^~~~~~~
>> hreset.c:5:3: note: called from here
>>       5 |   _hreset (eax);
>>         |   ^~~~~~~~~~~~~
>>
>> Anyway, I think the current target attribute handling should be preserved.
> 
> I think this and the -O1 argument above suggests that there should be
> a way to distinguish
> two modes - add to the active set of options and starting from scratch.

Doing that would make it even crazier :)

> 
> Maybe it's over-designing things but do we want to preserve the
> existing behavior
> and instead add optimize ("+ftree-vectorize") and target ("+avx2") as
> a way to amend
> the state?

I prefer doing only the append mode (when one can still use -fno-foo for an explicit
drop of a flag).

> 
> OTOH as we're missing global_options re-init even with your patch we won't get
> the defaults correct (aka what toplev::main does with init_options_struct and
> the corresponding langhook).  Likewise if lang_hooks.init_options performs any
> defaulting a later flag overrides and we override that with optimize() that
> doesn't work - I'm thinking of things like flag_complex_method and -fcx-* flags.
> So -O2 -fcx-fortran-rules on the command-line and optimize
> ("no-cx-fortran-rules")
> to cancel the -fcx-fortran-rules switch wouldn't work?

In most cases it works. What's problematic about -fcx-fortran-rules is that it sets

   /* With -fcx-limited-range, we do cheap and quick complex arithmetic.  */
   if (flag_cx_limited_range)
     flag_complex_method = 0;

   /* With -fcx-fortran-rules, we do something in-between cheap and C99.  */
   if (flag_cx_fortran_rules)
     flag_complex_method = 1;

in process_options (called only for cmdline options) and not in

/* After all options at LOC have been read into OPTS and OPTS_SET,
    finalize settings of those options and diagnose incompatible
    combinations.  */
void
finish_options (struct gcc_options *opts, struct gcc_options *opts_set,
		location_t loc)

which is a place which is called once options are decoded (both from cmdline and when
combined with a attribute or pragma):

#1  0x0000000001b69da3 in finish_options (opts=opts@entry=0x26b13e0 <global_options>, opts_set=opts_set@entry=0x26afdc0 <global_options_set>, loc=loc@entry=258754) at /home/marxin/Programming/gcc/gcc/opts.c:1303

#2  0x0000000000dd9e3b in decode_options (opts=0x26b13e0 <global_options>, opts_set=0x26afdc0 <global_options_set>, decoded_options=<optimized out>, decoded_options_count=decoded_options_count@entry=4, loc=258754, dc=0x26b2b00 <global_diagnostic_context>,

     target_option_override_hook=0x0) at /home/marxin/Programming/gcc/gcc/opts-global.c:324

#3  0x0000000000921144 in parse_optimize_options (args=args@entry=<tree_list 0x7ffff76e1910>, attr_p=attr_p@entry=false) at /home/marxin/Programming/gcc/gcc/c-family/c-common.c:5921

#4  0x0000000000972aab in handle_pragma_optimize (dummy=<optimized out>) at /home/marxin/Programming/gcc/gcc/c-family/c-pragma.c:993

#5  0x00000000008e3118 in c_parser_pragma (parser=0x7ffff7fbeab0, context=pragma_external, if_p=0x0) at /home/marxin/Programming/gcc/gcc/c/c-parser.c:12573


Martin

> 
> Thanks,
> Richard.
> 
>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>
>> Ready to be installed?
>> Thanks,
>> Martin
>>
>>>
>>> I'm also planning to take a look at the target macro/attribute, I expect similar problems:
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97469
>>>
>>> Thoughts?
>>> Thanks,
>>> Martin
>>


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Optimize macro: make it more predictable
  2021-08-24 13:04     ` Martin Liška
@ 2021-08-26 11:04       ` Richard Biener
  2021-08-26 12:39         ` Martin Liška
  2021-09-06 11:37         ` [PATCH] flag_complex_method: support optimize attribute Martin Liška
  0 siblings, 2 replies; 30+ messages in thread
From: Richard Biener @ 2021-08-26 11:04 UTC (permalink / raw)
  To: Martin Liška; +Cc: GCC Patches, Jakub Jelinek, Michael Matz

On Tue, Aug 24, 2021 at 3:04 PM Martin Liška <mliska@suse.cz> wrote:
>
> On 8/24/21 14:13, Richard Biener wrote:
> > On Thu, Jul 1, 2021 at 3:13 PM Martin Liška <mliska@suse.cz> wrote:
> >>
> >> On 10/23/20 1:47 PM, Martin Liška wrote:
> >>> Hey.
> >>
> >> Hello.
> >>
> >> I deferred the patch for GCC 12. Since the time, I messed up with options
> >> I feel more familiar with the option handling. So ...
> >>
> >>>
> >>> This is a follow-up of the discussion that happened in thread about no_stack_protector
> >>> attribute: https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545916.html
> >>>
> >>> The current optimize attribute works in the following way:
> >>> - 1) we take current global_options as base
> >>> - 2) maybe_default_options is called for the currently selected optimization level, which
> >>>        means all rules in default_options_table are executed
> >>> - 3) attribute values are applied (via decode_options)
> >>>
> >>> So the step 2) is problematic: in case of -O2 -fno-omit-frame-pointer and __attribute__((optimize("-fno-stack-protector")))
> >>> ends basically with -O2 -fno-stack-protector because -fno-omit-frame-pointer is default:
> >>>       /* -O1 and -Og optimizations.  */
> >>>       { OPT_LEVELS_1_PLUS, OPT_fomit_frame_pointer, NULL, 1 },
> >>>
> >>> My patch handled and the current optimize attribute really behaves that same as appending attribute value
> >>> to the command line. So far so good. We should also reflect that in documentation entry which is quite
> >>> vague right now:
> >>
> >> ^^^ all these are still valid arguments, plus I'm adding a new test-case that tests that.
>
> Hey.
>
> > There is also handle_common_deferred_options that's not called so any
> > option processed there should
> > probably be excempt from being set/unset in the optimize attribute?
>
> Looking at the handled options, they have all Defer type and not Optimization.
> Thus we should be fine.
>
> >
> >>>
> >>> """
> >>> The optimize attribute is used to specify that a function is to be compiled with different optimization options than specified on the command line.
> >>> """
> >>
> >> I addressed that with documentation changes, should be more clear to users. Moreover, I noticed that we declare 'optimize' attribute
> >> as something not for a production use:
> >>
> >> "The optimize attribute should be used for debugging purposes only. It is not suitable in production code."
> >>
> >> Are we sure about the statement? I know that e.g. glibc uses that.
> >
> > Well, given we're changing behavior now that warning looks valid ;)
>
> Yeah! True.
>
> > I'll also note that
> >
> > "The optimize attribute arguments of a function behave
> > as if they were added to the command line options."
> >
> > is still likely untrue, the global state init is complicated ;)
>
> Sure, but the situation should be much closer to it :) Do you have a better wording?

Maybe "The intent is that the optimize attribute behaves as if the
arguments were
appended to the command line."

But as said originally below I'm not sure that this behavior is what people
expect.  If I say optimize("fno-tree-vectorize") then I do expect that to retain
other command-line arguments.  If I say optimize(1) I'm not sure I would expect
-ftree-vectorize on the command-line to prevail ;)

Is google code search still a thing?  Can one search all of github
somehow?  I really
wonder how 'optimize' is used at the moment.  There are quite some optimize
attributes in the target part of the testsuite for example.  And
testsuite/gcc.dg/vect/bb-slp-41.c
suggests that optimize("-fno-tree-fre") preserves at least the
optimization level?

There's always the possibility to preserve the current behavior for 'optimize'
and add a new 'add_optimize' attribute that does the other thing.

> >
> >
> >>>
> >>> and we may want to handle -Ox in the attribute in a special way. I guess many macro/pragma users expect that
> >>>
> >>> -O2 -ftree-vectorize and __attribute__((optimize(1))) will end with -O1 and not
> >>> with -ftree-vectorize -O1 ?
>
> This is my older suggestion and it will likely make it even much complicated. So ...

In theory it's just dropping the command-line but yes, the question is
what happens
to the non-optimization part of the command-line.  We obviously shouldn't drop
-std=gnu14 and it's side-effects.  So yes, even documenting exactly what this
would do is difficult ;)

> >
> > As implemented your patch seems to turn it into -ftree-vectorize -O1.
>
> Yes.
>
> > IIRC multiple optimize attributes apply
> > ontop of each other, and it makes sense to me that optimize (2),
> > optimize ("tree-vectorize") behaves the same
> > as optimize (2, "tree-vectorize").  I'm not sure this is still the
> > case after your patch?  Also consider
> >
> > #pragma GCC optimize ("tree-vectorize")
> > void foo () { ...}
> >
> > #pragma GCC optimize ("tree-loop-distribution")
> > void bar () {... }
> >
> > I'd expect bar to have both vectorization and loop distribution
> > enabled? (note I didn't use push/pop here)
>
> Yes, yes and yes. I'm going to verify it.
>
> >
> >> The situation with 'target' attribute is different. When parsing the attribute, we intentionally drop all existing target flags:
> >>
> >> $ cat -n gcc/config/i386/i386-options.c
> >> ...
> >>     1245                if (opt == IX86_FUNCTION_SPECIFIC_ARCH)
> >>     1246                  {
> >>     1247                    /* If arch= is set,  clear all bits in x_ix86_isa_flags,
> >>     1248                       except for ISA_64BIT, ABI_64, ABI_X32, and CODE16
> >>     1249                       and all bits in x_ix86_isa_flags2.  */
> >>     1250                    opts->x_ix86_isa_flags &= (OPTION_MASK_ISA_64BIT
> >>     1251                                               | OPTION_MASK_ABI_64
> >>     1252                                               | OPTION_MASK_ABI_X32
> >>     1253                                               | OPTION_MASK_CODE16);
> >>     1254                    opts->x_ix86_isa_flags_explicit &= (OPTION_MASK_ISA_64BIT
> >>     1255                                                        | OPTION_MASK_ABI_64
> >>     1256                                                        | OPTION_MASK_ABI_X32
> >>     1257                                                        | OPTION_MASK_CODE16);
> >>     1258                    opts->x_ix86_isa_flags2 = 0;
> >>     1259                    opts->x_ix86_isa_flags2_explicit = 0;
> >>     1260                  }
> >>
> >> That seems logical because target attribute is used for e.g. ifunc multi-versioning and one needs
> >> to be sure all existing ISA flags are dropped. However, I noticed clang behaves differently:
> >>
> >> $ cat hreset.c
> >> #pragma GCC target "arch=geode"
> >> #include <immintrin.h>
> >> void foo(unsigned int eax)
> >> {
> >>     _hreset (eax);
> >> }
> >>
> >> $ clang hreset.c -mhreset  -c -O2 -m32
> >> $ gcc hreset.c -mhreset  -c -O2 -m32
> >> In file included from /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/x86gprintrin.h:97,
> >>                    from /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/immintrin.h:27,
> >>                    from hreset.c:2:
> >> hreset.c: In function ‘foo’:
> >> /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/hresetintrin.h:39:1: error: inlining failed in call to ‘always_inline’ ‘_hreset’: target specific option mismatch
> >>      39 | _hreset (unsigned int __EAX)
> >>         | ^~~~~~~
> >> hreset.c:5:3: note: called from here
> >>       5 |   _hreset (eax);
> >>         |   ^~~~~~~~~~~~~
> >>
> >> Anyway, I think the current target attribute handling should be preserved.
> >
> > I think this and the -O1 argument above suggests that there should be
> > a way to distinguish
> > two modes - add to the active set of options and starting from scratch.
>
> Doing that would make it even crazier :)
>
> >
> > Maybe it's over-designing things but do we want to preserve the
> > existing behavior
> > and instead add optimize ("+ftree-vectorize") and target ("+avx2") as
> > a way to amend
> > the state?
>
> I prefer doing only the append mode (when one can still use -fno-foo for an explicit
> drop of a flag).
>
> >
> > OTOH as we're missing global_options re-init even with your patch we won't get
> > the defaults correct (aka what toplev::main does with init_options_struct and
> > the corresponding langhook).  Likewise if lang_hooks.init_options performs any
> > defaulting a later flag overrides and we override that with optimize() that
> > doesn't work - I'm thinking of things like flag_complex_method and -fcx-* flags.
> > So -O2 -fcx-fortran-rules on the command-line and optimize
> > ("no-cx-fortran-rules")
> > to cancel the -fcx-fortran-rules switch wouldn't work?
>
> In most cases it works. What's problematic about -fcx-fortran-rules is that it sets
>
>    /* With -fcx-limited-range, we do cheap and quick complex arithmetic.  */
>    if (flag_cx_limited_range)
>      flag_complex_method = 0;
>
>    /* With -fcx-fortran-rules, we do something in-between cheap and C99.  */
>    if (flag_cx_fortran_rules)
>      flag_complex_method = 1;
>
> in process_options (called only for cmdline options) and not in
>
> /* After all options at LOC have been read into OPTS and OPTS_SET,
>     finalize settings of those options and diagnose incompatible
>     combinations.  */
> void
> finish_options (struct gcc_options *opts, struct gcc_options *opts_set,
>                 location_t loc)
>
> which is a place which is called once options are decoded (both from cmdline and when
> combined with a attribute or pragma):

Yes, and that flag_complex_method is initialized via the langhook
mentioned, for example
c-family/c-opts.c has

/* Initialize options structure OPTS.  */
void
c_common_init_options_struct (struct gcc_options *opts)
{
  opts->x_flag_exceptions = c_dialect_cxx ();
  opts->x_warn_pointer_arith = c_dialect_cxx ();
  opts->x_warn_write_strings = c_dialect_cxx ();
  opts->x_flag_warn_unused_result = true;

  /* By default, C99-like requirements for complex multiply and divide.  */
  opts->x_flag_complex_method = 2;
}

so an attempt to "cancel" a command-line option that adjusted any of
the above will not
work because we're not re-initializing global_options appropriately.
But maybe we can
just do that?  That is, call

  /* Initialize global options structures; this must be repeated for
     each structure used for parsing options.  */
  init_options_struct (&global_options, &global_options_set);
  lang_hooks.init_options_struct (&global_options);

and

  /* Perform language-specific options initialization.  */
  lang_hooks.init_options (save_decoded_options_count, save_decoded_options);

as done by toplev.c?  Or if we do not want to do that store that state away
to an 'initialized_options/initialized_options_set' set of vars we can
copy from?

> #1  0x0000000001b69da3 in finish_options (opts=opts@entry=0x26b13e0 <global_options>, opts_set=opts_set@entry=0x26afdc0 <global_options_set>, loc=loc@entry=258754) at /home/marxin/Programming/gcc/gcc/opts.c:1303
>
> #2  0x0000000000dd9e3b in decode_options (opts=0x26b13e0 <global_options>, opts_set=0x26afdc0 <global_options_set>, decoded_options=<optimized out>, decoded_options_count=decoded_options_count@entry=4, loc=258754, dc=0x26b2b00 <global_diagnostic_context>,
>
>      target_option_override_hook=0x0) at /home/marxin/Programming/gcc/gcc/opts-global.c:324
>
> #3  0x0000000000921144 in parse_optimize_options (args=args@entry=<tree_list 0x7ffff76e1910>, attr_p=attr_p@entry=false) at /home/marxin/Programming/gcc/gcc/c-family/c-common.c:5921
>
> #4  0x0000000000972aab in handle_pragma_optimize (dummy=<optimized out>) at /home/marxin/Programming/gcc/gcc/c-family/c-pragma.c:993
>
> #5  0x00000000008e3118 in c_parser_pragma (parser=0x7ffff7fbeab0, context=pragma_external, if_p=0x0) at /home/marxin/Programming/gcc/gcc/c/c-parser.c:12573
>
>
> Martin
>
> >
> > Thanks,
> > Richard.
> >
> >> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> >>
> >> Ready to be installed?
> >> Thanks,
> >> Martin
> >>
> >>>
> >>> I'm also planning to take a look at the target macro/attribute, I expect similar problems:
> >>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97469
> >>>
> >>> Thoughts?
> >>> Thanks,
> >>> Martin
> >>
>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Optimize macro: make it more predictable
  2021-08-26 11:04       ` Richard Biener
@ 2021-08-26 12:39         ` Martin Liška
  2021-08-26 13:20           ` Richard Biener
  2021-08-27  8:35           ` Martin Liška
  2021-09-06 11:37         ` [PATCH] flag_complex_method: support optimize attribute Martin Liška
  1 sibling, 2 replies; 30+ messages in thread
From: Martin Liška @ 2021-08-26 12:39 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches, Jakub Jelinek, Michael Matz

On 8/26/21 13:04, Richard Biener wrote:
> On Tue, Aug 24, 2021 at 3:04 PM Martin Liška <mliska@suse.cz> wrote:
>>
>> On 8/24/21 14:13, Richard Biener wrote:
>>> On Thu, Jul 1, 2021 at 3:13 PM Martin Liška <mliska@suse.cz> wrote:
>>>>
>>>> On 10/23/20 1:47 PM, Martin Liška wrote:
>>>>> Hey.
>>>>
>>>> Hello.
>>>>
>>>> I deferred the patch for GCC 12. Since the time, I messed up with options
>>>> I feel more familiar with the option handling. So ...
>>>>
>>>>>
>>>>> This is a follow-up of the discussion that happened in thread about no_stack_protector
>>>>> attribute: https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545916.html
>>>>>
>>>>> The current optimize attribute works in the following way:
>>>>> - 1) we take current global_options as base
>>>>> - 2) maybe_default_options is called for the currently selected optimization level, which
>>>>>         means all rules in default_options_table are executed
>>>>> - 3) attribute values are applied (via decode_options)
>>>>>
>>>>> So the step 2) is problematic: in case of -O2 -fno-omit-frame-pointer and __attribute__((optimize("-fno-stack-protector")))
>>>>> ends basically with -O2 -fno-stack-protector because -fno-omit-frame-pointer is default:
>>>>>        /* -O1 and -Og optimizations.  */
>>>>>        { OPT_LEVELS_1_PLUS, OPT_fomit_frame_pointer, NULL, 1 },
>>>>>
>>>>> My patch handled and the current optimize attribute really behaves that same as appending attribute value
>>>>> to the command line. So far so good. We should also reflect that in documentation entry which is quite
>>>>> vague right now:
>>>>
>>>> ^^^ all these are still valid arguments, plus I'm adding a new test-case that tests that.
>>
>> Hey.
>>
>>> There is also handle_common_deferred_options that's not called so any
>>> option processed there should
>>> probably be excempt from being set/unset in the optimize attribute?
>>
>> Looking at the handled options, they have all Defer type and not Optimization.
>> Thus we should be fine.
>>
>>>
>>>>>
>>>>> """
>>>>> The optimize attribute is used to specify that a function is to be compiled with different optimization options than specified on the command line.
>>>>> """
>>>>
>>>> I addressed that with documentation changes, should be more clear to users. Moreover, I noticed that we declare 'optimize' attribute
>>>> as something not for a production use:
>>>>
>>>> "The optimize attribute should be used for debugging purposes only. It is not suitable in production code."
>>>>
>>>> Are we sure about the statement? I know that e.g. glibc uses that.
>>>
>>> Well, given we're changing behavior now that warning looks valid ;)
>>
>> Yeah! True.
>>
>>> I'll also note that
>>>
>>> "The optimize attribute arguments of a function behave
>>> as if they were added to the command line options."
>>>
>>> is still likely untrue, the global state init is complicated ;)
>>
>> Sure, but the situation should be much closer to it :) Do you have a better wording?
> 
> Maybe "The intent is that the optimize attribute behaves as if the
> arguments were
> appended to the command line."
> 
> But as said originally below I'm not sure that this behavior is what people
> expect.  If I say optimize("fno-tree-vectorize") then I do expect that to retain
> other command-line arguments.

Yep, that's clear!

> If I say optimize(1) I'm not sure I would expect
> -ftree-vectorize on the command-line to prevail ;)

That's a very good question: theoretically yes, I can imagine various scenarios how can -Ox be used
in an attribute:

1) -O0 for a function that e.g. violate strict aliasing, or when one wants to debug a fn in an optimized binary
2) -Ofast for a function which is performance critical

On the other hand, my original motivation was a kernel compiler with:
-O2 -fno-omit-frame-pointer and __attribute__((optimize("-fno-stack-protector")))

It's not intuitive that you end up with -O2 -fno-stack-protector (and -fomit-frame-pointer).

> 
> Is google code search still a thing?  Can one search all of github
> somehow?  I really
> wonder how 'optimize' is used at the moment.  There are quite some optimize
> attributes in the target part of the testsuite for example.  And
> testsuite/gcc.dg/vect/bb-slp-41.c

I can investigate that.

> suggests that optimize("-fno-tree-fre") preserves at least the
> optimization level?
> 
> There's always the possibility to preserve the current behavior for 'optimize'

Yes, but we should somehow document the current weird behavior when it comes to -Ox options.

> and add a new 'add_optimize' attribute that does the other thing.

Considering that ...

> 
>>>
>>>
>>>>>
>>>>> and we may want to handle -Ox in the attribute in a special way. I guess many macro/pragma users expect that
>>>>>
>>>>> -O2 -ftree-vectorize and __attribute__((optimize(1))) will end with -O1 and not
>>>>> with -ftree-vectorize -O1 ?
>>
>> This is my older suggestion and it will likely make it even much complicated. So ...
> 
> In theory it's just dropping the command-line but yes, the question is
> what happens
> to the non-optimization part of the command-line.  We obviously shouldn't drop
> -std=gnu14 and it's side-effects.  So yes, even documenting exactly what this
> would do is difficult ;)
> 
>>>
>>> As implemented your patch seems to turn it into -ftree-vectorize -O1.
>>
>> Yes.
>>
>>> IIRC multiple optimize attributes apply
>>> ontop of each other, and it makes sense to me that optimize (2),
>>> optimize ("tree-vectorize") behaves the same
>>> as optimize (2, "tree-vectorize").  I'm not sure this is still the
>>> case after your patch?  Also consider
>>>
>>> #pragma GCC optimize ("tree-vectorize")
>>> void foo () { ...}
>>>
>>> #pragma GCC optimize ("tree-loop-distribution")
>>> void bar () {... }
>>>
>>> I'd expect bar to have both vectorization and loop distribution
>>> enabled? (note I didn't use push/pop here)
>>
>> Yes, yes and yes. I'm going to verify it.
>>
>>>
>>>> The situation with 'target' attribute is different. When parsing the attribute, we intentionally drop all existing target flags:
>>>>
>>>> $ cat -n gcc/config/i386/i386-options.c
>>>> ...
>>>>      1245                if (opt == IX86_FUNCTION_SPECIFIC_ARCH)
>>>>      1246                  {
>>>>      1247                    /* If arch= is set,  clear all bits in x_ix86_isa_flags,
>>>>      1248                       except for ISA_64BIT, ABI_64, ABI_X32, and CODE16
>>>>      1249                       and all bits in x_ix86_isa_flags2.  */
>>>>      1250                    opts->x_ix86_isa_flags &= (OPTION_MASK_ISA_64BIT
>>>>      1251                                               | OPTION_MASK_ABI_64
>>>>      1252                                               | OPTION_MASK_ABI_X32
>>>>      1253                                               | OPTION_MASK_CODE16);
>>>>      1254                    opts->x_ix86_isa_flags_explicit &= (OPTION_MASK_ISA_64BIT
>>>>      1255                                                        | OPTION_MASK_ABI_64
>>>>      1256                                                        | OPTION_MASK_ABI_X32
>>>>      1257                                                        | OPTION_MASK_CODE16);
>>>>      1258                    opts->x_ix86_isa_flags2 = 0;
>>>>      1259                    opts->x_ix86_isa_flags2_explicit = 0;
>>>>      1260                  }
>>>>
>>>> That seems logical because target attribute is used for e.g. ifunc multi-versioning and one needs
>>>> to be sure all existing ISA flags are dropped. However, I noticed clang behaves differently:
>>>>
>>>> $ cat hreset.c
>>>> #pragma GCC target "arch=geode"
>>>> #include <immintrin.h>
>>>> void foo(unsigned int eax)
>>>> {
>>>>      _hreset (eax);
>>>> }
>>>>
>>>> $ clang hreset.c -mhreset  -c -O2 -m32
>>>> $ gcc hreset.c -mhreset  -c -O2 -m32
>>>> In file included from /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/x86gprintrin.h:97,
>>>>                     from /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/immintrin.h:27,
>>>>                     from hreset.c:2:
>>>> hreset.c: In function ‘foo’:
>>>> /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/hresetintrin.h:39:1: error: inlining failed in call to ‘always_inline’ ‘_hreset’: target specific option mismatch
>>>>       39 | _hreset (unsigned int __EAX)
>>>>          | ^~~~~~~
>>>> hreset.c:5:3: note: called from here
>>>>        5 |   _hreset (eax);
>>>>          |   ^~~~~~~~~~~~~
>>>>
>>>> Anyway, I think the current target attribute handling should be preserved.
>>>
>>> I think this and the -O1 argument above suggests that there should be
>>> a way to distinguish
>>> two modes - add to the active set of options and starting from scratch.
>>
>> Doing that would make it even crazier :)
>>
>>>
>>> Maybe it's over-designing things but do we want to preserve the
>>> existing behavior
>>> and instead add optimize ("+ftree-vectorize") and target ("+avx2") as
>>> a way to amend
>>> the state?
>>
>> I prefer doing only the append mode (when one can still use -fno-foo for an explicit
>> drop of a flag).
>>
>>>
>>> OTOH as we're missing global_options re-init even with your patch we won't get
>>> the defaults correct (aka what toplev::main does with init_options_struct and
>>> the corresponding langhook).  Likewise if lang_hooks.init_options performs any
>>> defaulting a later flag overrides and we override that with optimize() that
>>> doesn't work - I'm thinking of things like flag_complex_method and -fcx-* flags.
>>> So -O2 -fcx-fortran-rules on the command-line and optimize
>>> ("no-cx-fortran-rules")
>>> to cancel the -fcx-fortran-rules switch wouldn't work?
>>
>> In most cases it works. What's problematic about -fcx-fortran-rules is that it sets
>>
>>     /* With -fcx-limited-range, we do cheap and quick complex arithmetic.  */
>>     if (flag_cx_limited_range)
>>       flag_complex_method = 0;
>>
>>     /* With -fcx-fortran-rules, we do something in-between cheap and C99.  */
>>     if (flag_cx_fortran_rules)
>>       flag_complex_method = 1;
>>
>> in process_options (called only for cmdline options) and not in
>>
>> /* After all options at LOC have been read into OPTS and OPTS_SET,
>>      finalize settings of those options and diagnose incompatible
>>      combinations.  */
>> void
>> finish_options (struct gcc_options *opts, struct gcc_options *opts_set,
>>                  location_t loc)
>>
>> which is a place which is called once options are decoded (both from cmdline and when
>> combined with a attribute or pragma):
> 
> Yes, and that flag_complex_method is initialized via the langhook
> mentioned, for example
> c-family/c-opts.c has
> 
> /* Initialize options structure OPTS.  */
> void
> c_common_init_options_struct (struct gcc_options *opts)
> {
>    opts->x_flag_exceptions = c_dialect_cxx ();
>    opts->x_warn_pointer_arith = c_dialect_cxx ();
>    opts->x_warn_write_strings = c_dialect_cxx ();
>    opts->x_flag_warn_unused_result = true;
> 
>    /* By default, C99-like requirements for complex multiply and divide.  */
>    opts->x_flag_complex_method = 2;
> }
> 
> so an attempt to "cancel" a command-line option that adjusted any of
> the above will not
> work because we're not re-initializing global_options appropriately.
> But maybe we can
> just do that?  That is, call
> 
>    /* Initialize global options structures; this must be repeated for
>       each structure used for parsing options.  */
>    init_options_struct (&global_options, &global_options_set);
>    lang_hooks.init_options_struct (&global_options);
> 
> and
> 
>    /* Perform language-specific options initialization.  */
>    lang_hooks.init_options (save_decoded_options_count, save_decoded_options);
> 
> as done by toplev.c?  Or if we do not want to do that store that state away
> to an 'initialized_options/initialized_options_set' set of vars we can
> copy from?

I think this one is a bit orthogonal to the suggested changes, right? We can hopefully implement it incrementally.

Martin

> 
>> #1  0x0000000001b69da3 in finish_options (opts=opts@entry=0x26b13e0 <global_options>, opts_set=opts_set@entry=0x26afdc0 <global_options_set>, loc=loc@entry=258754) at /home/marxin/Programming/gcc/gcc/opts.c:1303
>>
>> #2  0x0000000000dd9e3b in decode_options (opts=0x26b13e0 <global_options>, opts_set=0x26afdc0 <global_options_set>, decoded_options=<optimized out>, decoded_options_count=decoded_options_count@entry=4, loc=258754, dc=0x26b2b00 <global_diagnostic_context>,
>>
>>       target_option_override_hook=0x0) at /home/marxin/Programming/gcc/gcc/opts-global.c:324
>>
>> #3  0x0000000000921144 in parse_optimize_options (args=args@entry=<tree_list 0x7ffff76e1910>, attr_p=attr_p@entry=false) at /home/marxin/Programming/gcc/gcc/c-family/c-common.c:5921
>>
>> #4  0x0000000000972aab in handle_pragma_optimize (dummy=<optimized out>) at /home/marxin/Programming/gcc/gcc/c-family/c-pragma.c:993
>>
>> #5  0x00000000008e3118 in c_parser_pragma (parser=0x7ffff7fbeab0, context=pragma_external, if_p=0x0) at /home/marxin/Programming/gcc/gcc/c/c-parser.c:12573
>>
>>
>> Martin
>>
>>>
>>> Thanks,
>>> Richard.
>>>
>>>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>>>
>>>> Ready to be installed?
>>>> Thanks,
>>>> Martin
>>>>
>>>>>
>>>>> I'm also planning to take a look at the target macro/attribute, I expect similar problems:
>>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97469
>>>>>
>>>>> Thoughts?
>>>>> Thanks,
>>>>> Martin
>>>>
>>


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Optimize macro: make it more predictable
  2021-08-26 12:39         ` Martin Liška
@ 2021-08-26 13:20           ` Richard Biener
  2021-08-27  8:35           ` Martin Liška
  1 sibling, 0 replies; 30+ messages in thread
From: Richard Biener @ 2021-08-26 13:20 UTC (permalink / raw)
  To: Martin Liška; +Cc: GCC Patches, Jakub Jelinek, Michael Matz

On Thu, Aug 26, 2021 at 2:39 PM Martin Liška <mliska@suse.cz> wrote:
>
> On 8/26/21 13:04, Richard Biener wrote:
> > On Tue, Aug 24, 2021 at 3:04 PM Martin Liška <mliska@suse.cz> wrote:
> >>
> >> On 8/24/21 14:13, Richard Biener wrote:
> >>> On Thu, Jul 1, 2021 at 3:13 PM Martin Liška <mliska@suse.cz> wrote:
> >>>>
> >>>> On 10/23/20 1:47 PM, Martin Liška wrote:
> >>>>> Hey.
> >>>>
> >>>> Hello.
> >>>>
> >>>> I deferred the patch for GCC 12. Since the time, I messed up with options
> >>>> I feel more familiar with the option handling. So ...
> >>>>
> >>>>>
> >>>>> This is a follow-up of the discussion that happened in thread about no_stack_protector
> >>>>> attribute: https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545916.html
> >>>>>
> >>>>> The current optimize attribute works in the following way:
> >>>>> - 1) we take current global_options as base
> >>>>> - 2) maybe_default_options is called for the currently selected optimization level, which
> >>>>>         means all rules in default_options_table are executed
> >>>>> - 3) attribute values are applied (via decode_options)
> >>>>>
> >>>>> So the step 2) is problematic: in case of -O2 -fno-omit-frame-pointer and __attribute__((optimize("-fno-stack-protector")))
> >>>>> ends basically with -O2 -fno-stack-protector because -fno-omit-frame-pointer is default:
> >>>>>        /* -O1 and -Og optimizations.  */
> >>>>>        { OPT_LEVELS_1_PLUS, OPT_fomit_frame_pointer, NULL, 1 },
> >>>>>
> >>>>> My patch handled and the current optimize attribute really behaves that same as appending attribute value
> >>>>> to the command line. So far so good. We should also reflect that in documentation entry which is quite
> >>>>> vague right now:
> >>>>
> >>>> ^^^ all these are still valid arguments, plus I'm adding a new test-case that tests that.
> >>
> >> Hey.
> >>
> >>> There is also handle_common_deferred_options that's not called so any
> >>> option processed there should
> >>> probably be excempt from being set/unset in the optimize attribute?
> >>
> >> Looking at the handled options, they have all Defer type and not Optimization.
> >> Thus we should be fine.
> >>
> >>>
> >>>>>
> >>>>> """
> >>>>> The optimize attribute is used to specify that a function is to be compiled with different optimization options than specified on the command line.
> >>>>> """
> >>>>
> >>>> I addressed that with documentation changes, should be more clear to users. Moreover, I noticed that we declare 'optimize' attribute
> >>>> as something not for a production use:
> >>>>
> >>>> "The optimize attribute should be used for debugging purposes only. It is not suitable in production code."
> >>>>
> >>>> Are we sure about the statement? I know that e.g. glibc uses that.
> >>>
> >>> Well, given we're changing behavior now that warning looks valid ;)
> >>
> >> Yeah! True.
> >>
> >>> I'll also note that
> >>>
> >>> "The optimize attribute arguments of a function behave
> >>> as if they were added to the command line options."
> >>>
> >>> is still likely untrue, the global state init is complicated ;)
> >>
> >> Sure, but the situation should be much closer to it :) Do you have a better wording?
> >
> > Maybe "The intent is that the optimize attribute behaves as if the
> > arguments were
> > appended to the command line."
> >
> > But as said originally below I'm not sure that this behavior is what people
> > expect.  If I say optimize("fno-tree-vectorize") then I do expect that to retain
> > other command-line arguments.
>
> Yep, that's clear!
>
> > If I say optimize(1) I'm not sure I would expect
> > -ftree-vectorize on the command-line to prevail ;)
>
> That's a very good question: theoretically yes, I can imagine various scenarios how can -Ox be used
> in an attribute:
>
> 1) -O0 for a function that e.g. violate strict aliasing, or when one wants to debug a fn in an optimized binary
> 2) -Ofast for a function which is performance critical
>
> On the other hand, my original motivation was a kernel compiler with:
> -O2 -fno-omit-frame-pointer and __attribute__((optimize("-fno-stack-protector")))
>
> It's not intuitive that you end up with -O2 -fno-stack-protector (and -fomit-frame-pointer).
>
> >
> > Is google code search still a thing?  Can one search all of github
> > somehow?  I really
> > wonder how 'optimize' is used at the moment.  There are quite some optimize
> > attributes in the target part of the testsuite for example.  And
> > testsuite/gcc.dg/vect/bb-slp-41.c
>
> I can investigate that.
>
> > suggests that optimize("-fno-tree-fre") preserves at least the
> > optimization level?
> >
> > There's always the possibility to preserve the current behavior for 'optimize'
>
> Yes, but we should somehow document the current weird behavior when it comes to -Ox options.
>
> > and add a new 'add_optimize' attribute that does the other thing.
>
> Considering that ...
>
> >
> >>>
> >>>
> >>>>>
> >>>>> and we may want to handle -Ox in the attribute in a special way. I guess many macro/pragma users expect that
> >>>>>
> >>>>> -O2 -ftree-vectorize and __attribute__((optimize(1))) will end with -O1 and not
> >>>>> with -ftree-vectorize -O1 ?
> >>
> >> This is my older suggestion and it will likely make it even much complicated. So ...
> >
> > In theory it's just dropping the command-line but yes, the question is
> > what happens
> > to the non-optimization part of the command-line.  We obviously shouldn't drop
> > -std=gnu14 and it's side-effects.  So yes, even documenting exactly what this
> > would do is difficult ;)
> >
> >>>
> >>> As implemented your patch seems to turn it into -ftree-vectorize -O1.
> >>
> >> Yes.
> >>
> >>> IIRC multiple optimize attributes apply
> >>> ontop of each other, and it makes sense to me that optimize (2),
> >>> optimize ("tree-vectorize") behaves the same
> >>> as optimize (2, "tree-vectorize").  I'm not sure this is still the
> >>> case after your patch?  Also consider
> >>>
> >>> #pragma GCC optimize ("tree-vectorize")
> >>> void foo () { ...}
> >>>
> >>> #pragma GCC optimize ("tree-loop-distribution")
> >>> void bar () {... }
> >>>
> >>> I'd expect bar to have both vectorization and loop distribution
> >>> enabled? (note I didn't use push/pop here)
> >>
> >> Yes, yes and yes. I'm going to verify it.
> >>
> >>>
> >>>> The situation with 'target' attribute is different. When parsing the attribute, we intentionally drop all existing target flags:
> >>>>
> >>>> $ cat -n gcc/config/i386/i386-options.c
> >>>> ...
> >>>>      1245                if (opt == IX86_FUNCTION_SPECIFIC_ARCH)
> >>>>      1246                  {
> >>>>      1247                    /* If arch= is set,  clear all bits in x_ix86_isa_flags,
> >>>>      1248                       except for ISA_64BIT, ABI_64, ABI_X32, and CODE16
> >>>>      1249                       and all bits in x_ix86_isa_flags2.  */
> >>>>      1250                    opts->x_ix86_isa_flags &= (OPTION_MASK_ISA_64BIT
> >>>>      1251                                               | OPTION_MASK_ABI_64
> >>>>      1252                                               | OPTION_MASK_ABI_X32
> >>>>      1253                                               | OPTION_MASK_CODE16);
> >>>>      1254                    opts->x_ix86_isa_flags_explicit &= (OPTION_MASK_ISA_64BIT
> >>>>      1255                                                        | OPTION_MASK_ABI_64
> >>>>      1256                                                        | OPTION_MASK_ABI_X32
> >>>>      1257                                                        | OPTION_MASK_CODE16);
> >>>>      1258                    opts->x_ix86_isa_flags2 = 0;
> >>>>      1259                    opts->x_ix86_isa_flags2_explicit = 0;
> >>>>      1260                  }
> >>>>
> >>>> That seems logical because target attribute is used for e.g. ifunc multi-versioning and one needs
> >>>> to be sure all existing ISA flags are dropped. However, I noticed clang behaves differently:
> >>>>
> >>>> $ cat hreset.c
> >>>> #pragma GCC target "arch=geode"
> >>>> #include <immintrin.h>
> >>>> void foo(unsigned int eax)
> >>>> {
> >>>>      _hreset (eax);
> >>>> }
> >>>>
> >>>> $ clang hreset.c -mhreset  -c -O2 -m32
> >>>> $ gcc hreset.c -mhreset  -c -O2 -m32
> >>>> In file included from /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/x86gprintrin.h:97,
> >>>>                     from /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/immintrin.h:27,
> >>>>                     from hreset.c:2:
> >>>> hreset.c: In function ‘foo’:
> >>>> /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/hresetintrin.h:39:1: error: inlining failed in call to ‘always_inline’ ‘_hreset’: target specific option mismatch
> >>>>       39 | _hreset (unsigned int __EAX)
> >>>>          | ^~~~~~~
> >>>> hreset.c:5:3: note: called from here
> >>>>        5 |   _hreset (eax);
> >>>>          |   ^~~~~~~~~~~~~
> >>>>
> >>>> Anyway, I think the current target attribute handling should be preserved.
> >>>
> >>> I think this and the -O1 argument above suggests that there should be
> >>> a way to distinguish
> >>> two modes - add to the active set of options and starting from scratch.
> >>
> >> Doing that would make it even crazier :)
> >>
> >>>
> >>> Maybe it's over-designing things but do we want to preserve the
> >>> existing behavior
> >>> and instead add optimize ("+ftree-vectorize") and target ("+avx2") as
> >>> a way to amend
> >>> the state?
> >>
> >> I prefer doing only the append mode (when one can still use -fno-foo for an explicit
> >> drop of a flag).
> >>
> >>>
> >>> OTOH as we're missing global_options re-init even with your patch we won't get
> >>> the defaults correct (aka what toplev::main does with init_options_struct and
> >>> the corresponding langhook).  Likewise if lang_hooks.init_options performs any
> >>> defaulting a later flag overrides and we override that with optimize() that
> >>> doesn't work - I'm thinking of things like flag_complex_method and -fcx-* flags.
> >>> So -O2 -fcx-fortran-rules on the command-line and optimize
> >>> ("no-cx-fortran-rules")
> >>> to cancel the -fcx-fortran-rules switch wouldn't work?
> >>
> >> In most cases it works. What's problematic about -fcx-fortran-rules is that it sets
> >>
> >>     /* With -fcx-limited-range, we do cheap and quick complex arithmetic.  */
> >>     if (flag_cx_limited_range)
> >>       flag_complex_method = 0;
> >>
> >>     /* With -fcx-fortran-rules, we do something in-between cheap and C99.  */
> >>     if (flag_cx_fortran_rules)
> >>       flag_complex_method = 1;
> >>
> >> in process_options (called only for cmdline options) and not in
> >>
> >> /* After all options at LOC have been read into OPTS and OPTS_SET,
> >>      finalize settings of those options and diagnose incompatible
> >>      combinations.  */
> >> void
> >> finish_options (struct gcc_options *opts, struct gcc_options *opts_set,
> >>                  location_t loc)
> >>
> >> which is a place which is called once options are decoded (both from cmdline and when
> >> combined with a attribute or pragma):
> >
> > Yes, and that flag_complex_method is initialized via the langhook
> > mentioned, for example
> > c-family/c-opts.c has
> >
> > /* Initialize options structure OPTS.  */
> > void
> > c_common_init_options_struct (struct gcc_options *opts)
> > {
> >    opts->x_flag_exceptions = c_dialect_cxx ();
> >    opts->x_warn_pointer_arith = c_dialect_cxx ();
> >    opts->x_warn_write_strings = c_dialect_cxx ();
> >    opts->x_flag_warn_unused_result = true;
> >
> >    /* By default, C99-like requirements for complex multiply and divide.  */
> >    opts->x_flag_complex_method = 2;
> > }
> >
> > so an attempt to "cancel" a command-line option that adjusted any of
> > the above will not
> > work because we're not re-initializing global_options appropriately.
> > But maybe we can
> > just do that?  That is, call
> >
> >    /* Initialize global options structures; this must be repeated for
> >       each structure used for parsing options.  */
> >    init_options_struct (&global_options, &global_options_set);
> >    lang_hooks.init_options_struct (&global_options);
> >
> > and
> >
> >    /* Perform language-specific options initialization.  */
> >    lang_hooks.init_options (save_decoded_options_count, save_decoded_options);
> >
> > as done by toplev.c?  Or if we do not want to do that store that state away
> > to an 'initialized_options/initialized_options_set' set of vars we can
> > copy from?
>
> I think this one is a bit orthogonal to the suggested changes, right? We can hopefully implement it incrementally.

It's I think required to get close to "behave as if appended to the
command-line".

Richard.

> Martin
>
> >
> >> #1  0x0000000001b69da3 in finish_options (opts=opts@entry=0x26b13e0 <global_options>, opts_set=opts_set@entry=0x26afdc0 <global_options_set>, loc=loc@entry=258754) at /home/marxin/Programming/gcc/gcc/opts.c:1303
> >>
> >> #2  0x0000000000dd9e3b in decode_options (opts=0x26b13e0 <global_options>, opts_set=0x26afdc0 <global_options_set>, decoded_options=<optimized out>, decoded_options_count=decoded_options_count@entry=4, loc=258754, dc=0x26b2b00 <global_diagnostic_context>,
> >>
> >>       target_option_override_hook=0x0) at /home/marxin/Programming/gcc/gcc/opts-global.c:324
> >>
> >> #3  0x0000000000921144 in parse_optimize_options (args=args@entry=<tree_list 0x7ffff76e1910>, attr_p=attr_p@entry=false) at /home/marxin/Programming/gcc/gcc/c-family/c-common.c:5921
> >>
> >> #4  0x0000000000972aab in handle_pragma_optimize (dummy=<optimized out>) at /home/marxin/Programming/gcc/gcc/c-family/c-pragma.c:993
> >>
> >> #5  0x00000000008e3118 in c_parser_pragma (parser=0x7ffff7fbeab0, context=pragma_external, if_p=0x0) at /home/marxin/Programming/gcc/gcc/c/c-parser.c:12573
> >>
> >>
> >> Martin
> >>
> >>>
> >>> Thanks,
> >>> Richard.
> >>>
> >>>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> >>>>
> >>>> Ready to be installed?
> >>>> Thanks,
> >>>> Martin
> >>>>
> >>>>>
> >>>>> I'm also planning to take a look at the target macro/attribute, I expect similar problems:
> >>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97469
> >>>>>
> >>>>> Thoughts?
> >>>>> Thanks,
> >>>>> Martin
> >>>>
> >>
>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Optimize macro: make it more predictable
  2021-08-26 12:39         ` Martin Liška
  2021-08-26 13:20           ` Richard Biener
@ 2021-08-27  8:35           ` Martin Liška
  2021-08-27  9:05             ` Richard Biener
  1 sibling, 1 reply; 30+ messages in thread
From: Martin Liška @ 2021-08-27  8:35 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jakub Jelinek, Michael Matz, GCC Patches

On 8/26/21 14:39, Martin Liška wrote:
> I can investigate that.

So there are statistics about #pragma GCC optimize and optimize attribute in openSUSE:Factory.
The numbers at the line beginning represent number of functions affected by the pragma or attribute.
Are we smarter in what we want?

Processed 14026 packages, attr/pragma used in 111


--- Mesa-drivers.log ---

1x: "O1"

--- Mesa.log ---

1x: "O1"

--- PrusaSlicer.log ---

44x: "-fno-unsafe-math-optimizations"

--- SVT-AV1.log ---

2x: "unroll-loops"

--- abseil-cpp.log ---

4x: "no-optimize-sibling-calls"

--- analitza.log ---

1x: "-fno-unsafe-math-optimizations"

--- argon2.log ---

2x: "O0"

--- arpack-ng:serial.log ---

2x: "-fno-unsafe-math-optimizations"

--- avogadrolibs.log ---

106x: "-fno-unsafe-math-optimizations"

--- caddy.log ---

32x: "no-tree-vectorize"

--- calligra.log ---

1x: "-fno-unsafe-math-optimizations"

--- ceph-test.log ---

2x: "no-tree-vectorize"

--- ceph.log ---

2x: "no-tree-vectorize"

--- ceres-solver.log ---

130x: "-fno-unsafe-math-optimizations"

--- cmake:mini.log ---

1x: "no-tree-vectorize"

--- cpustat.log ---

15x: "-O3"

--- cross-aarch64-gcc11-bootstrap.log ---

32x: "no-omit-frame-pointer"

--- cross-amdgcn-gcc11.log ---

12x: "O2"

10x: "-fno-tree-loop-distribute-patterns"

1x: "-fno-tree-loop-distribute-patterns""-fno-tree-loop-distribute-patterns"

--- cross-nvptx-gcc11.log ---

6x: "-fno-tree-loop-distribute-patterns"

2x: "O2"

--- csound.log ---

6x: "no-finite-math-only"

1x: "-fno-unsafe-math-optimizations"

--- darktable.log ---

65213x: "unroll-loops", "tree-loop-if-convert", "tree-loop-distribution", "no-strict-aliasing", "loop-interchange", "loop-nest-optimize", "tree-loop-im", "unswitch-loops", "tree-loop-ivcanon", "ira-loop-pressure", "split-ivs-in-unroller", "tree-loop-vectorize", "variable-expansion-in-unroller", "split-loops", "ivopts", "predictive-commoning", "tree-loop-linear", "loop-block", "loop-strip-mine", "finite-math-only", "fp-contract=fast", "fast-math", "no-math-errno"

4981x: "unroll-loops", "tree-loop-if-convert", "tree-loop-distribution", "no-strict-aliasing", "loop-interchange", "loop-nest-optimize", "tree-loop-im", "unswitch-loops", "tree-loop-ivcanon", "ira-loop-pressure", "split-ivs-in-unroller", "variable-expansion-in-unroller", "split-loops", "ivopts", "predictive-commoning", "tree-loop-linear", "loop-block", "loop-strip-mine", "finite-math-only", "fp-contract=fast", "fast-math"

1021x: "unroll-loops", "tree-loop-if-convert", "tree-loop-distribution", "no-strict-aliasing", "loop-interchange", "loop-nest-optimize", "tree-loop-im", "unswitch-loops", "tree-loop-ivcanon", "ira-loop-pressure", "split-ivs-in-unroller", "variable-expansion-in-unroller", "split-loops", "ivopts", "predictive-commoning", "tree-loop-linear", "loop-block", "loop-strip-mine", "fp-contract=fast", "tree-vectorize"

1018x: "unroll-loops", "tree-loop-if-convert", "tree-loop-distribution", "no-strict-aliasing", "loop-interchange", "loop-nest-optimize", "tree-loop-im", "unswitch-loops", "tree-loop-ivcanon", "ira-loop-pressure", "split-ivs-in-unroller", "variable-expansion-in-unroller", "split-loops", "ivopts", "predictive-commoning", "tree-loop-linear", "loop-block", "loop-strip-mine", "finite-math-only", "fp-contract=fast", "fast-math", "unroll-loops", "tree-loop-if-convert", "tree-loop-distribution", "no-strict-aliasing", "loop-interchange", "loop-nest-optimize", "tree-loop-im", "unswitch-loops", "tree-loop-ivcanon", "ira-loop-pressure", "split-ivs-in-unroller", "variable-expansion-in-unroller", "split-loops", "ivopts", "predictive-commoning", "tree-loop-linear", "loop-block", "loop-strip-mine", "finite-math-only", "fp-contract=fast", "fast-math", "unroll-loops", "tree-loop-if-convert", "tree-loop-distribution", "no-strict-aliasing", "loop-interchange", "loop-nest-optimize", "tree-loop-im", "unsw
 itch-loops", "tree-loop-ivcanon", "ira-loop-pressure", "split-ivs-in-unroller", "variable-expansion-in-unroller", "split-loops", "ivopts", "predictive-commoning", "tree-loop-linear", "loop-block", "loop-strip-mine", "finite-math-only", "fp-contract=fast", "fast-math", "unroll-loops", "tree-loop-if-convert", "tree-loop-distribution", "no-strict-aliasing", "loop-interchange", "loop-nest-optimize", "tree-loop-im", "unswitch-loops", "tree-loop-ivcanon", "ira-loop-pressure", "split-ivs-in-unroller", "variable-expansion-in-unroller", "split-loops", "ivopts", "predictive-commoning", "tree-loop-linear", "loop-block", "loop-strip-mine", "finite-math-only", "fp-contract=fast", "fast-math", "tree-vectorize"

1011x: "unroll-loops", "tree-loop-if-convert", "tree-loop-distribution", "no-strict-aliasing", "loop-interchange", "loop-nest-optimize", "tree-loop-im", "unswitch-loops", "tree-loop-ivcanon", "ira-loop-pressure", "split-ivs-in-unroller", "variable-expansion-in-unroller", "split-loops", "ivopts", "predictive-commoning", "tree-loop-linear", "loop-block", "loop-strip-mine", "finite-math-only", "fp-contract=fast", "fast-math", "tree-vectorize", "no-math-errno"

315x: "unroll-loops", "tree-loop-if-convert", "tree-loop-distribution", "no-strict-aliasing", "loop-interchange", "loop-nest-optimize", "tree-loop-im", "unswitch-loops", "tree-loop-ivcanon", "ira-loop-pressure", "split-ivs-in-unroller", "variable-expansion-in-unroller", "split-loops", "ivopts", "predictive-commoning", "tree-loop-linear", "loop-block", "loop-strip-mine", "finite-math-only", "fp-contract=fast", "fast-math", "unroll-loops", "tree-loop-if-convert", "tree-loop-distribution", "no-strict-aliasing", "loop-interchange", "loop-nest-optimize", "tree-loop-im", "unswitch-loops", "tree-loop-ivcanon", "ira-loop-pressure", "split-ivs-in-unroller", "variable-expansion-in-unroller", "split-loops", "ivopts", "predictive-commoning", "tree-loop-linear", "loop-block", "loop-strip-mine", "finite-math-only", "fp-contract=fast", "fast-math"

71x: "unroll-loops", "tree-loop-if-convert", "tree-loop-distribution", "no-strict-aliasing", "loop-interchange", "loop-nest-optimize", "tree-loop-im", "unswitch-loops", "tree-loop-ivcanon", "ira-loop-pressure", "split-ivs-in-unroller", "variable-expansion-in-unroller", "split-loops", "ivopts", "predictive-commoning", "tree-loop-linear", "loop-block", "loop-strip-mine", "finite-math-only", "fp-contract=fast", "fast-math", "no-math-errno"

54x: "finite-math-only"

13x: "unroll-loops", "tree-loop-if-convert", "tree-loop-distribution", "no-strict-aliasing", "loop-interchange", "loop-nest-optimize", "tree-loop-im", "unswitch-loops", "tree-loop-ivcanon", "ira-loop-pressure", "split-ivs-in-unroller", "variable-expansion-in-unroller", "split-loops", "ivopts", "predictive-commoning", "tree-loop-linear", "loop-block", "loop-strip-mine", "finite-math-only", "fp-contract=fast", "fast-math", "unroll-loops", "tree-loop-if-convert", "tree-loop-distribution", "no-strict-aliasing", "loop-interchange", "loop-nest-optimize", "tree-loop-im", "unswitch-loops", "tree-loop-ivcanon", "ira-loop-pressure", "split-ivs-in-unroller", "variable-expansion-in-unroller", "split-loops", "ivopts", "predictive-commoning", "tree-loop-linear", "loop-block", "loop-strip-mine", "finite-math-only", "fp-contract=fast", "fast-math", "unroll-loops", "tree-loop-if-convert", "tree-loop-distribution", "no-strict-aliasing", "loop-interchange", "loop-nest-optimize", "tree-loop-im", "unswit
 ch-loops", "tree-loop-ivcanon", "ira-loop-pressure", "split-ivs-in-unroller", "variable-expansion-in-unroller", "split-loops", "ivopts", "predictive-commoning", "tree-loop-linear", "loop-block", "loop-strip-mine", "finite-math-only", "fp-contract=fast", "fast-math"

5x: "fast-math", "fp-contract=fast"

--- eigen3:docs.log ---

378x: "-fno-unsafe-math-optimizations"

--- eventstat.log ---

2x: "-O3"

--- frr.log ---

3x: "3"

--- fwts.log ---

4x: "-O0"

--- fwupd.log ---

1x: "0"

--- gcc11-testresults.log ---

4151x: "no-isolate-erroneous-paths-dereference"

12x: "O2"

8x: "-fno-optimize-sibling-calls"

3x: ATTR="no-isolate-erroneous-paths-dereference""no-isolate-erroneous-paths-dereference"

2x: ATTR="no-isolate-erroneous-paths-dereference"

1x: "no-isolate-erroneous-paths-dereference"ATTR=

--- gcc11.log ---

5572x: "no-isolate-erroneous-paths-dereference"

16x: "O2"

8x: "-fno-optimize-sibling-calls"

--- ghc-cryptonite.log ---

3x: "O0"

--- glibc.log ---

308x: "-fno-stack-protector"

1x: "-fno-stack-protector"../sysdeps/x86_64/multiarch/memcpy.c:29:1: note: XXX:optimize attribute:

1x: ATTR="-fno-stack-protector"

1x: ATTR="-fno-stack-protector""-fno-stack-protector"

--- glibc:testsuite.log ---

318x: "-fno-stack-protector"

--- glibc:utils.log ---

216x: "-fno-stack-protector"

1x: "-fno-stack-protector"ATTR=

--- icinga2.log ---

5x: "O0"

--- java-16-openjdk.log ---

24167x: "O0"

--- keepalived.log ---

1x: "O0"

--- kstars.log ---

187x: "-fno-unsafe-math-optimizations"

--- lammps.log ---

6x: "no-var-tracking-assignments"

--- lastpass-cli.log ---

1x: "unroll-loops"

--- libgcrypt.log ---

28x: "O0"

--- libkeccak.log ---

378x: "-O0"

--- libopenshot-audio.log ---

7x: "no-associative-math"

--- libqt5-qtbase.log ---

1x: "omit-frame-pointer"

1x: "no-tree-vectorize"

--- libraw.log ---

12x: "no-aggressive-loop-optimizations"

--- libretro-beetle-psx-hw.log ---

10217x: "unroll-loops"

56x: "unroll-loops", "unroll-loops"

--- libretro-beetle-psx.log ---

11428x: "unroll-loops"

56x: "unroll-loops", "unroll-loops"

--- libretro-beetle-saturn.log ---

132x: "no-crossjumping,no-gcse"

39x: "Os"

1x: "no-unroll-loops,no-peel-loops,no-crossjumping"

1x: "O2,no-unroll-loops,no-peel-loops,no-crossjumping"

--- libsemigroups.log ---

3x: "-fno-unsafe-math-optimizations"

--- libunwind.log ---

4x: "-O1"

--- libxtrx.log ---

5x: "O3"

--- libxtrxdsp.log ---

2x: "unroll-loops"

--- links.log ---

464942x: "-ftree-vectorize", "-ffast-math"

--- mariadb.log ---

2x: "-O0"

--- memtest86+.log ---

1x: "O0"

--- mercurial.log ---

19x: "no-tree-vectorize"

--- monitoring-plugins.log ---

11x: "wrapv"

--- mono-core.log ---

3x: "O0"

--- movit.log ---

41x: "-fno-unsafe-math-optimizations"

--- mvapich2:gnu-hpc-testsuite.log ---

1x: "O0"

--- mvapich2:gnu-hpc.log ---

1x: "O0"

--- mvapich2:standard.log ---

1x: "O0"

--- mvapich2:testsuite.log ---

1x: "O0"

--- mysql-connector-cpp.log ---

36x: "no-tree-vectorize"

--- newlib:epiphany.log ---

3x: "-fno-tree-loop-distribute-patterns"

--- newlib:riscv64.log ---

5x: "-fno-tree-loop-distribute-patterns"

--- o2scl.log ---

92x: "-fno-unsafe-math-optimizations"

--- ocaml.log ---

8x: "tree-vectorize"

--- openbabel.log ---

14x: "-fno-unsafe-math-optimizations"

--- openucx.log ---

220x: "O0"

--- pdns.log ---

6x: "no-associative-math"

--- perl-Cpanel-JSON-XS.log ---

1x: "O0"

--- perl-Sereal-Decoder.log ---

2x: "no-tree-vectorize"

--- perl-Sereal-Encoder.log ---

2x: "no-tree-vectorize"

--- php7.log ---

67x: "Os"

--- php7:apache2.log ---

67x: "Os"

--- php7:embed.log ---

67x: "Os"

--- php7:fastcgi.log ---

67x: "Os"

--- php7:fpm.log ---

67x: "Os"

--- php8.log ---

75x: "Os"

--- php8:apache2.log ---

75x: "Os"

--- php8:embed.log ---

75x: "Os"

--- php8:fastcgi.log ---

75x: "Os"

--- php8:fpm.log ---

74x: "Os"

--- picom.log ---

27x: "-fno-fast-math"

--- python-Bottleneck.log ---

25x: "O3"

--- python-grpcio.log ---

12x: "no-optimize-sibling-calls"

--- python-numpy.log ---

9715x: "O3"

48x: "unroll-loops"

--- python-numpy:gnu-hpc.log ---

9709x: "O3"

48x: "unroll-loops"

--- python-pyroomacoustics.log ---

2x: "-fno-unsafe-math-optimizations"

--- python-scipy:gnu-hpc.log ---

6179x: "tree-vectorize", "unsafe-math-optimizations", "unroll-loops"

1006x: "unroll-loops"

1x: "unroll-loops"gcc: scipy/special/cephes/rgamma.c

--- python-scipy:standard.log ---

6179x: "tree-vectorize", "unsafe-math-optimizations", "unroll-loops"

1005x: "unroll-loops"

1x: "unroll-loops"gcc: scipy/special/cephes/tandg.c

1x: ATTR="unroll-loops"

--- python-thewalrus.log ---

2x: "-fno-unsafe-math-optimizations"

--- python-torch:standard.log ---

230x: "-fno-unsafe-math-optimizations"

--- python-zstandard.log ---

12x: "no-tree-vectorize"

--- python3-espressomd.log ---

2x: "no-associative-math"

--- qt6-base.log ---

1x: "omit-frame-pointer"

1x: "no-tree-vectorize"

--- rspamd.log ---

22x: "unroll-loops"

2x: "no-tree-vectorize"

--- sha3sum.log ---

120x: "-O0"

--- shim.log ---

4x: "0"

--- snd.log ---

223x: "tree-vectorize"

--- step.log ---

49x: "-fno-unsafe-math-optimizations"

--- stress-ng.log ---

1070x: "-O3"

7x: "-O0"

2x: "-O1"

--- subversion.log ---

184x: "-O0"

--- sympol.log ---

2x: "-fno-unsafe-math-optimizations"

--- tensorflow2:lite.log ---

80x: "-fno-unsafe-math-optimizations"

--- texlive.log ---

1x: "fast-math"

--- votca-csg.log ---

93x: "-fno-unsafe-math-optimizations"

--- votca-tools.log ---

60x: "-fno-unsafe-math-optimizations"

--- votca-xtp.log ---

215x: "-fno-unsafe-math-optimizations"

--- xf86-video-intel.log ---

625x: "Ofast"

15x: "Ofast", "Ofast"

--- xmrig.log ---

1x: "O0"

--- zstd.log ---

6x: "no-tree-vectorize"



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Optimize macro: make it more predictable
  2021-08-27  8:35           ` Martin Liška
@ 2021-08-27  9:05             ` Richard Biener
  2021-09-13 13:52               ` Martin Liška
  0 siblings, 1 reply; 30+ messages in thread
From: Richard Biener @ 2021-08-27  9:05 UTC (permalink / raw)
  To: Martin Liška; +Cc: Jakub Jelinek, Michael Matz, GCC Patches

On Fri, Aug 27, 2021 at 10:35 AM Martin Liška <mliska@suse.cz> wrote:
>
> On 8/26/21 14:39, Martin Liška wrote:
> > I can investigate that.
>
> So there are statistics about #pragma GCC optimize and optimize attribute in openSUSE:Factory.
> The numbers at the line beginning represent number of functions affected by the pragma or attribute.
> Are we smarter in what we want?

So with ignoring darktable which seems completely insane the cases
will likely continue
to work as intended if we change from the current scheme to appending
as proposed.

Richard.

> Processed 14026 packages, attr/pragma used in 111
>
>
> --- Mesa-drivers.log ---
>
> 1x: "O1"
>
> --- Mesa.log ---
>
> 1x: "O1"
>
> --- PrusaSlicer.log ---
>
> 44x: "-fno-unsafe-math-optimizations"
>
> --- SVT-AV1.log ---
>
> 2x: "unroll-loops"
>
> --- abseil-cpp.log ---
>
> 4x: "no-optimize-sibling-calls"
>
> --- analitza.log ---
>
> 1x: "-fno-unsafe-math-optimizations"
>
> --- argon2.log ---
>
> 2x: "O0"
>
> --- arpack-ng:serial.log ---
>
> 2x: "-fno-unsafe-math-optimizations"
>
> --- avogadrolibs.log ---
>
> 106x: "-fno-unsafe-math-optimizations"
>
> --- caddy.log ---
>
> 32x: "no-tree-vectorize"
>
> --- calligra.log ---
>
> 1x: "-fno-unsafe-math-optimizations"
>
> --- ceph-test.log ---
>
> 2x: "no-tree-vectorize"
>
> --- ceph.log ---
>
> 2x: "no-tree-vectorize"
>
> --- ceres-solver.log ---
>
> 130x: "-fno-unsafe-math-optimizations"
>
> --- cmake:mini.log ---
>
> 1x: "no-tree-vectorize"
>
> --- cpustat.log ---
>
> 15x: "-O3"
>
> --- cross-aarch64-gcc11-bootstrap.log ---
>
> 32x: "no-omit-frame-pointer"
>
> --- cross-amdgcn-gcc11.log ---
>
> 12x: "O2"
>
> 10x: "-fno-tree-loop-distribute-patterns"
>
> 1x: "-fno-tree-loop-distribute-patterns""-fno-tree-loop-distribute-patterns"
>
> --- cross-nvptx-gcc11.log ---
>
> 6x: "-fno-tree-loop-distribute-patterns"
>
> 2x: "O2"
>
> --- csound.log ---
>
> 6x: "no-finite-math-only"
>
> 1x: "-fno-unsafe-math-optimizations"
>
> --- darktable.log ---
>
> 65213x: "unroll-loops", "tree-loop-if-convert", "tree-loop-distribution", "no-strict-aliasing", "loop-interchange", "loop-nest-optimize", "tree-loop-im", "unswitch-loops", "tree-loop-ivcanon", "ira-loop-pressure", "split-ivs-in-unroller", "tree-loop-vectorize", "variable-expansion-in-unroller", "split-loops", "ivopts", "predictive-commoning", "tree-loop-linear", "loop-block", "loop-strip-mine", "finite-math-only", "fp-contract=fast", "fast-math", "no-math-errno"
>
> 4981x: "unroll-loops", "tree-loop-if-convert", "tree-loop-distribution", "no-strict-aliasing", "loop-interchange", "loop-nest-optimize", "tree-loop-im", "unswitch-loops", "tree-loop-ivcanon", "ira-loop-pressure", "split-ivs-in-unroller", "variable-expansion-in-unroller", "split-loops", "ivopts", "predictive-commoning", "tree-loop-linear", "loop-block", "loop-strip-mine", "finite-math-only", "fp-contract=fast", "fast-math"
>
> 1021x: "unroll-loops", "tree-loop-if-convert", "tree-loop-distribution", "no-strict-aliasing", "loop-interchange", "loop-nest-optimize", "tree-loop-im", "unswitch-loops", "tree-loop-ivcanon", "ira-loop-pressure", "split-ivs-in-unroller", "variable-expansion-in-unroller", "split-loops", "ivopts", "predictive-commoning", "tree-loop-linear", "loop-block", "loop-strip-mine", "fp-contract=fast", "tree-vectorize"
>
> 1018x: "unroll-loops", "tree-loop-if-convert", "tree-loop-distribution", "no-strict-aliasing", "loop-interchange", "loop-nest-optimize", "tree-loop-im", "unswitch-loops", "tree-loop-ivcanon", "ira-loop-pressure", "split-ivs-in-unroller", "variable-expansion-in-unroller", "split-loops", "ivopts", "predictive-commoning", "tree-loop-linear", "loop-block", "loop-strip-mine", "finite-math-only", "fp-contract=fast", "fast-math", "unroll-loops", "tree-loop-if-convert", "tree-loop-distribution", "no-strict-aliasing", "loop-interchange", "loop-nest-optimize", "tree-loop-im", "unswitch-loops", "tree-loop-ivcanon", "ira-loop-pressure", "split-ivs-in-unroller", "variable-expansion-in-unroller", "split-loops", "ivopts", "predictive-commoning", "tree-loop-linear", "loop-block", "loop-strip-mine", "finite-math-only", "fp-contract=fast", "fast-math", "unroll-loops", "tree-loop-if-convert", "tree-loop-distribution", "no-strict-aliasing", "loop-interchange", "loop-nest-optimize", "tree-loop-im", "unsw
>  itch-loops", "tree-loop-ivcanon", "ira-loop-pressure", "split-ivs-in-unroller", "variable-expansion-in-unroller", "split-loops", "ivopts", "predictive-commoning", "tree-loop-linear", "loop-block", "loop-strip-mine", "finite-math-only", "fp-contract=fast", "fast-math", "unroll-loops", "tree-loop-if-convert", "tree-loop-distribution", "no-strict-aliasing", "loop-interchange", "loop-nest-optimize", "tree-loop-im", "unswitch-loops", "tree-loop-ivcanon", "ira-loop-pressure", "split-ivs-in-unroller", "variable-expansion-in-unroller", "split-loops", "ivopts", "predictive-commoning", "tree-loop-linear", "loop-block", "loop-strip-mine", "finite-math-only", "fp-contract=fast", "fast-math", "tree-vectorize"
>
> 1011x: "unroll-loops", "tree-loop-if-convert", "tree-loop-distribution", "no-strict-aliasing", "loop-interchange", "loop-nest-optimize", "tree-loop-im", "unswitch-loops", "tree-loop-ivcanon", "ira-loop-pressure", "split-ivs-in-unroller", "variable-expansion-in-unroller", "split-loops", "ivopts", "predictive-commoning", "tree-loop-linear", "loop-block", "loop-strip-mine", "finite-math-only", "fp-contract=fast", "fast-math", "tree-vectorize", "no-math-errno"
>
> 315x: "unroll-loops", "tree-loop-if-convert", "tree-loop-distribution", "no-strict-aliasing", "loop-interchange", "loop-nest-optimize", "tree-loop-im", "unswitch-loops", "tree-loop-ivcanon", "ira-loop-pressure", "split-ivs-in-unroller", "variable-expansion-in-unroller", "split-loops", "ivopts", "predictive-commoning", "tree-loop-linear", "loop-block", "loop-strip-mine", "finite-math-only", "fp-contract=fast", "fast-math", "unroll-loops", "tree-loop-if-convert", "tree-loop-distribution", "no-strict-aliasing", "loop-interchange", "loop-nest-optimize", "tree-loop-im", "unswitch-loops", "tree-loop-ivcanon", "ira-loop-pressure", "split-ivs-in-unroller", "variable-expansion-in-unroller", "split-loops", "ivopts", "predictive-commoning", "tree-loop-linear", "loop-block", "loop-strip-mine", "finite-math-only", "fp-contract=fast", "fast-math"
>
> 71x: "unroll-loops", "tree-loop-if-convert", "tree-loop-distribution", "no-strict-aliasing", "loop-interchange", "loop-nest-optimize", "tree-loop-im", "unswitch-loops", "tree-loop-ivcanon", "ira-loop-pressure", "split-ivs-in-unroller", "variable-expansion-in-unroller", "split-loops", "ivopts", "predictive-commoning", "tree-loop-linear", "loop-block", "loop-strip-mine", "finite-math-only", "fp-contract=fast", "fast-math", "no-math-errno"
>
> 54x: "finite-math-only"
>
> 13x: "unroll-loops", "tree-loop-if-convert", "tree-loop-distribution", "no-strict-aliasing", "loop-interchange", "loop-nest-optimize", "tree-loop-im", "unswitch-loops", "tree-loop-ivcanon", "ira-loop-pressure", "split-ivs-in-unroller", "variable-expansion-in-unroller", "split-loops", "ivopts", "predictive-commoning", "tree-loop-linear", "loop-block", "loop-strip-mine", "finite-math-only", "fp-contract=fast", "fast-math", "unroll-loops", "tree-loop-if-convert", "tree-loop-distribution", "no-strict-aliasing", "loop-interchange", "loop-nest-optimize", "tree-loop-im", "unswitch-loops", "tree-loop-ivcanon", "ira-loop-pressure", "split-ivs-in-unroller", "variable-expansion-in-unroller", "split-loops", "ivopts", "predictive-commoning", "tree-loop-linear", "loop-block", "loop-strip-mine", "finite-math-only", "fp-contract=fast", "fast-math", "unroll-loops", "tree-loop-if-convert", "tree-loop-distribution", "no-strict-aliasing", "loop-interchange", "loop-nest-optimize", "tree-loop-im", "unswit
>  ch-loops", "tree-loop-ivcanon", "ira-loop-pressure", "split-ivs-in-unroller", "variable-expansion-in-unroller", "split-loops", "ivopts", "predictive-commoning", "tree-loop-linear", "loop-block", "loop-strip-mine", "finite-math-only", "fp-contract=fast", "fast-math"
>
> 5x: "fast-math", "fp-contract=fast"
>
> --- eigen3:docs.log ---
>
> 378x: "-fno-unsafe-math-optimizations"
>
> --- eventstat.log ---
>
> 2x: "-O3"
>
> --- frr.log ---
>
> 3x: "3"
>
> --- fwts.log ---
>
> 4x: "-O0"
>
> --- fwupd.log ---
>
> 1x: "0"
>
> --- gcc11-testresults.log ---
>
> 4151x: "no-isolate-erroneous-paths-dereference"
>
> 12x: "O2"
>
> 8x: "-fno-optimize-sibling-calls"
>
> 3x: ATTR="no-isolate-erroneous-paths-dereference""no-isolate-erroneous-paths-dereference"
>
> 2x: ATTR="no-isolate-erroneous-paths-dereference"
>
> 1x: "no-isolate-erroneous-paths-dereference"ATTR=
>
> --- gcc11.log ---
>
> 5572x: "no-isolate-erroneous-paths-dereference"
>
> 16x: "O2"
>
> 8x: "-fno-optimize-sibling-calls"
>
> --- ghc-cryptonite.log ---
>
> 3x: "O0"
>
> --- glibc.log ---
>
> 308x: "-fno-stack-protector"
>
> 1x: "-fno-stack-protector"../sysdeps/x86_64/multiarch/memcpy.c:29:1: note: XXX:optimize attribute:
>
> 1x: ATTR="-fno-stack-protector"
>
> 1x: ATTR="-fno-stack-protector""-fno-stack-protector"
>
> --- glibc:testsuite.log ---
>
> 318x: "-fno-stack-protector"
>
> --- glibc:utils.log ---
>
> 216x: "-fno-stack-protector"
>
> 1x: "-fno-stack-protector"ATTR=
>
> --- icinga2.log ---
>
> 5x: "O0"
>
> --- java-16-openjdk.log ---
>
> 24167x: "O0"
>
> --- keepalived.log ---
>
> 1x: "O0"
>
> --- kstars.log ---
>
> 187x: "-fno-unsafe-math-optimizations"
>
> --- lammps.log ---
>
> 6x: "no-var-tracking-assignments"
>
> --- lastpass-cli.log ---
>
> 1x: "unroll-loops"
>
> --- libgcrypt.log ---
>
> 28x: "O0"
>
> --- libkeccak.log ---
>
> 378x: "-O0"
>
> --- libopenshot-audio.log ---
>
> 7x: "no-associative-math"
>
> --- libqt5-qtbase.log ---
>
> 1x: "omit-frame-pointer"
>
> 1x: "no-tree-vectorize"
>
> --- libraw.log ---
>
> 12x: "no-aggressive-loop-optimizations"
>
> --- libretro-beetle-psx-hw.log ---
>
> 10217x: "unroll-loops"
>
> 56x: "unroll-loops", "unroll-loops"
>
> --- libretro-beetle-psx.log ---
>
> 11428x: "unroll-loops"
>
> 56x: "unroll-loops", "unroll-loops"
>
> --- libretro-beetle-saturn.log ---
>
> 132x: "no-crossjumping,no-gcse"
>
> 39x: "Os"
>
> 1x: "no-unroll-loops,no-peel-loops,no-crossjumping"
>
> 1x: "O2,no-unroll-loops,no-peel-loops,no-crossjumping"
>
> --- libsemigroups.log ---
>
> 3x: "-fno-unsafe-math-optimizations"
>
> --- libunwind.log ---
>
> 4x: "-O1"
>
> --- libxtrx.log ---
>
> 5x: "O3"
>
> --- libxtrxdsp.log ---
>
> 2x: "unroll-loops"
>
> --- links.log ---
>
> 464942x: "-ftree-vectorize", "-ffast-math"
>
> --- mariadb.log ---
>
> 2x: "-O0"
>
> --- memtest86+.log ---
>
> 1x: "O0"
>
> --- mercurial.log ---
>
> 19x: "no-tree-vectorize"
>
> --- monitoring-plugins.log ---
>
> 11x: "wrapv"
>
> --- mono-core.log ---
>
> 3x: "O0"
>
> --- movit.log ---
>
> 41x: "-fno-unsafe-math-optimizations"
>
> --- mvapich2:gnu-hpc-testsuite.log ---
>
> 1x: "O0"
>
> --- mvapich2:gnu-hpc.log ---
>
> 1x: "O0"
>
> --- mvapich2:standard.log ---
>
> 1x: "O0"
>
> --- mvapich2:testsuite.log ---
>
> 1x: "O0"
>
> --- mysql-connector-cpp.log ---
>
> 36x: "no-tree-vectorize"
>
> --- newlib:epiphany.log ---
>
> 3x: "-fno-tree-loop-distribute-patterns"
>
> --- newlib:riscv64.log ---
>
> 5x: "-fno-tree-loop-distribute-patterns"
>
> --- o2scl.log ---
>
> 92x: "-fno-unsafe-math-optimizations"
>
> --- ocaml.log ---
>
> 8x: "tree-vectorize"
>
> --- openbabel.log ---
>
> 14x: "-fno-unsafe-math-optimizations"
>
> --- openucx.log ---
>
> 220x: "O0"
>
> --- pdns.log ---
>
> 6x: "no-associative-math"
>
> --- perl-Cpanel-JSON-XS.log ---
>
> 1x: "O0"
>
> --- perl-Sereal-Decoder.log ---
>
> 2x: "no-tree-vectorize"
>
> --- perl-Sereal-Encoder.log ---
>
> 2x: "no-tree-vectorize"
>
> --- php7.log ---
>
> 67x: "Os"
>
> --- php7:apache2.log ---
>
> 67x: "Os"
>
> --- php7:embed.log ---
>
> 67x: "Os"
>
> --- php7:fastcgi.log ---
>
> 67x: "Os"
>
> --- php7:fpm.log ---
>
> 67x: "Os"
>
> --- php8.log ---
>
> 75x: "Os"
>
> --- php8:apache2.log ---
>
> 75x: "Os"
>
> --- php8:embed.log ---
>
> 75x: "Os"
>
> --- php8:fastcgi.log ---
>
> 75x: "Os"
>
> --- php8:fpm.log ---
>
> 74x: "Os"
>
> --- picom.log ---
>
> 27x: "-fno-fast-math"
>
> --- python-Bottleneck.log ---
>
> 25x: "O3"
>
> --- python-grpcio.log ---
>
> 12x: "no-optimize-sibling-calls"
>
> --- python-numpy.log ---
>
> 9715x: "O3"
>
> 48x: "unroll-loops"
>
> --- python-numpy:gnu-hpc.log ---
>
> 9709x: "O3"
>
> 48x: "unroll-loops"
>
> --- python-pyroomacoustics.log ---
>
> 2x: "-fno-unsafe-math-optimizations"
>
> --- python-scipy:gnu-hpc.log ---
>
> 6179x: "tree-vectorize", "unsafe-math-optimizations", "unroll-loops"
>
> 1006x: "unroll-loops"
>
> 1x: "unroll-loops"gcc: scipy/special/cephes/rgamma.c
>
> --- python-scipy:standard.log ---
>
> 6179x: "tree-vectorize", "unsafe-math-optimizations", "unroll-loops"
>
> 1005x: "unroll-loops"
>
> 1x: "unroll-loops"gcc: scipy/special/cephes/tandg.c
>
> 1x: ATTR="unroll-loops"
>
> --- python-thewalrus.log ---
>
> 2x: "-fno-unsafe-math-optimizations"
>
> --- python-torch:standard.log ---
>
> 230x: "-fno-unsafe-math-optimizations"
>
> --- python-zstandard.log ---
>
> 12x: "no-tree-vectorize"
>
> --- python3-espressomd.log ---
>
> 2x: "no-associative-math"
>
> --- qt6-base.log ---
>
> 1x: "omit-frame-pointer"
>
> 1x: "no-tree-vectorize"
>
> --- rspamd.log ---
>
> 22x: "unroll-loops"
>
> 2x: "no-tree-vectorize"
>
> --- sha3sum.log ---
>
> 120x: "-O0"
>
> --- shim.log ---
>
> 4x: "0"
>
> --- snd.log ---
>
> 223x: "tree-vectorize"
>
> --- step.log ---
>
> 49x: "-fno-unsafe-math-optimizations"
>
> --- stress-ng.log ---
>
> 1070x: "-O3"
>
> 7x: "-O0"
>
> 2x: "-O1"
>
> --- subversion.log ---
>
> 184x: "-O0"
>
> --- sympol.log ---
>
> 2x: "-fno-unsafe-math-optimizations"
>
> --- tensorflow2:lite.log ---
>
> 80x: "-fno-unsafe-math-optimizations"
>
> --- texlive.log ---
>
> 1x: "fast-math"
>
> --- votca-csg.log ---
>
> 93x: "-fno-unsafe-math-optimizations"
>
> --- votca-tools.log ---
>
> 60x: "-fno-unsafe-math-optimizations"
>
> --- votca-xtp.log ---
>
> 215x: "-fno-unsafe-math-optimizations"
>
> --- xf86-video-intel.log ---
>
> 625x: "Ofast"
>
> 15x: "Ofast", "Ofast"
>
> --- xmrig.log ---
>
> 1x: "O0"
>
> --- zstd.log ---
>
> 6x: "no-tree-vectorize"
>
>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH] flag_complex_method: support optimize attribute
  2021-08-26 11:04       ` Richard Biener
  2021-08-26 12:39         ` Martin Liška
@ 2021-09-06 11:37         ` Martin Liška
  2021-09-06 11:46           ` Jakub Jelinek
  1 sibling, 1 reply; 30+ messages in thread
From: Martin Liška @ 2021-09-06 11:37 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches, Jakub Jelinek, Michael Matz

[-- Attachment #1: Type: text/plain, Size: 1666 bytes --]

On 8/26/21 13:04, Richard Biener wrote:
> Yes, and that flag_complex_method is initialized via the langhook
> mentioned, for example
> c-family/c-opts.c has
> 
> /* Initialize options structure OPTS.  */
> void
> c_common_init_options_struct (struct gcc_options *opts)
> {
>    opts->x_flag_exceptions = c_dialect_cxx ();
>    opts->x_warn_pointer_arith = c_dialect_cxx ();
>    opts->x_warn_write_strings = c_dialect_cxx ();
>    opts->x_flag_warn_unused_result = true;
> 
>    /* By default, C99-like requirements for complex multiply and divide.  */
>    opts->x_flag_complex_method = 2;
> }
> 
> so an attempt to "cancel" a command-line option that adjusted any of
> the above will not
> work because we're not re-initializing global_options appropriately.
> But maybe we can
> just do that?  That is, call
> 
>    /* Initialize global options structures; this must be repeated for
>       each structure used for parsing options.  */
>    init_options_struct (&global_options, &global_options_set);
>    lang_hooks.init_options_struct (&global_options);
> 
> and
> 
>    /* Perform language-specific options initialization.  */
>    lang_hooks.init_options (save_decoded_options_count, save_decoded_options);
> 
> as done by toplev.c?  Or if we do not want to do that store that state away
> to an 'initialized_options/initialized_options_set' set of vars we can
> copy from?

Hello.

I think the proper fix is handling the flag_complex_method-related options in finish_options.
Doing that, we can see 'optimize' pragma works with the current master.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

[-- Attachment #2: 0001-flag_complex_method-support-optimize-attribute.patch --]
[-- Type: text/x-patch, Size: 2655 bytes --]

From 203451d5ea5352e75da27dcf8259cc8dd7880512 Mon Sep 17 00:00:00 2001
From: Martin Liska <mliska@suse.cz>
Date: Fri, 3 Sep 2021 10:53:00 +0200
Subject: [PATCH] flag_complex_method: support optimize attribute

gcc/ChangeLog:

	* opts.c (finish_options): Move flag_complex_method setting
	from ...
	* toplev.c (process_options): ... here.

gcc/testsuite/ChangeLog:

	* gcc.c-torture/compile/attr-complex-method.c: New test.
---
 gcc/opts.c                                             |  8 ++++++++
 .../gcc.c-torture/compile/attr-complex-method.c        | 10 ++++++++++
 gcc/toplev.c                                           |  8 --------
 3 files changed, 18 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.c-torture/compile/attr-complex-method.c

diff --git a/gcc/opts.c b/gcc/opts.c
index e0501551ef5..2ac1a932f38 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -1323,6 +1323,14 @@ finish_options (struct gcc_options *opts, struct gcc_options *opts_set,
       = (opts->x_flag_unroll_loops
          || opts->x_flag_peel_loops
          || opts->x_optimize >= 3);
+
+  /* With -fcx-limited-range, we do cheap and quick complex arithmetic.  */
+  if (opts->x_flag_cx_limited_range)
+    flag_complex_method = 0;
+
+  /* With -fcx-fortran-rules, we do something in-between cheap and C99.  */
+  if (opts->x_flag_cx_fortran_rules)
+    flag_complex_method = 1;
 }
 
 #define LEFT_COLUMN	27
diff --git a/gcc/testsuite/gcc.c-torture/compile/attr-complex-method.c b/gcc/testsuite/gcc.c-torture/compile/attr-complex-method.c
new file mode 100644
index 00000000000..f08b72e273f
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/attr-complex-method.c
@@ -0,0 +1,10 @@
+/* { dg-additional-options "-fdump-tree-optimized" } */
+
+#pragma GCC optimize "-fcx-limited-range"
+
+void do_div (_Complex double *a, _Complex double *b)
+{
+  *a = *b / (4.0 - 5.0fi);
+}
+
+/* { dg-final { scan-tree-dump-not "__divdc3" "optimized" } } */
diff --git a/gcc/toplev.c b/gcc/toplev.c
index 14d1335e79e..e1688aae724 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -1695,14 +1695,6 @@ process_options (void)
       flag_stack_check = NO_STACK_CHECK;
     }
 
-  /* With -fcx-limited-range, we do cheap and quick complex arithmetic.  */
-  if (flag_cx_limited_range)
-    flag_complex_method = 0;
-
-  /* With -fcx-fortran-rules, we do something in-between cheap and C99.  */
-  if (flag_cx_fortran_rules)
-    flag_complex_method = 1;
-
   /* Targets must be able to place spill slots at lower addresses.  If the
      target already uses a soft frame pointer, the transition is trivial.  */
   if (!FRAME_GROWS_DOWNWARD && flag_stack_protect)
-- 
2.33.0


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] flag_complex_method: support optimize attribute
  2021-09-06 11:37         ` [PATCH] flag_complex_method: support optimize attribute Martin Liška
@ 2021-09-06 11:46           ` Jakub Jelinek
  2021-09-06 12:16             ` Richard Biener
  0 siblings, 1 reply; 30+ messages in thread
From: Jakub Jelinek @ 2021-09-06 11:46 UTC (permalink / raw)
  To: Martin Liška; +Cc: Richard Biener, GCC Patches, Michael Matz

On Mon, Sep 06, 2021 at 01:37:46PM +0200, Martin Liška wrote:
> --- a/gcc/opts.c
> +++ b/gcc/opts.c
> @@ -1323,6 +1323,14 @@ finish_options (struct gcc_options *opts, struct gcc_options *opts_set,
>        = (opts->x_flag_unroll_loops
>           || opts->x_flag_peel_loops
>           || opts->x_optimize >= 3);
> +
> +  /* With -fcx-limited-range, we do cheap and quick complex arithmetic.  */
> +  if (opts->x_flag_cx_limited_range)
> +    flag_complex_method = 0;
> +
> +  /* With -fcx-fortran-rules, we do something in-between cheap and C99.  */
> +  if (opts->x_flag_cx_fortran_rules)
> +    flag_complex_method = 1;

That should then be opts->x_flag_complex_method instead of flag_complex_method.

Ok with that change.

Note, I think we want to do much more in finish_options and less in
process_options, anything that is about Optimization options rather than
just the global ones.  Though one needs to be careful with the cases where
the code diagnoses something.

	Jakub


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] flag_complex_method: support optimize attribute
  2021-09-06 11:46           ` Jakub Jelinek
@ 2021-09-06 12:16             ` Richard Biener
  2021-09-06 12:24               ` Jakub Jelinek
  2021-09-07  9:42               ` Martin Liška
  0 siblings, 2 replies; 30+ messages in thread
From: Richard Biener @ 2021-09-06 12:16 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Martin Liška, GCC Patches, Michael Matz

On Mon, Sep 6, 2021 at 1:46 PM Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Mon, Sep 06, 2021 at 01:37:46PM +0200, Martin Liška wrote:
> > --- a/gcc/opts.c
> > +++ b/gcc/opts.c
> > @@ -1323,6 +1323,14 @@ finish_options (struct gcc_options *opts, struct gcc_options *opts_set,
> >        = (opts->x_flag_unroll_loops
> >           || opts->x_flag_peel_loops
> >           || opts->x_optimize >= 3);
> > +
> > +  /* With -fcx-limited-range, we do cheap and quick complex arithmetic.  */
> > +  if (opts->x_flag_cx_limited_range)
> > +    flag_complex_method = 0;
> > +
> > +  /* With -fcx-fortran-rules, we do something in-between cheap and C99.  */
> > +  if (opts->x_flag_cx_fortran_rules)
> > +    flag_complex_method = 1;
>
> That should then be opts->x_flag_complex_method instead of flag_complex_method.
>
> Ok with that change.

But the C/C++ langhooks also set flag_complex_method so I fail to see how
this helps?  As said I was referring to -fcx-limited-range on the command-line
and -fno-cx-limited-range in the optimize node to undo this which should
get you the langhook setting of flag_complex_method = 2.

> Note, I think we want to do much more in finish_options and less in
> process_options, anything that is about Optimization options rather than
> just the global ones.  Though one needs to be careful with the cases where
> the code diagnoses something.
>
>         Jakub
>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] flag_complex_method: support optimize attribute
  2021-09-06 12:16             ` Richard Biener
@ 2021-09-06 12:24               ` Jakub Jelinek
  2021-09-07  9:42               ` Martin Liška
  1 sibling, 0 replies; 30+ messages in thread
From: Jakub Jelinek @ 2021-09-06 12:24 UTC (permalink / raw)
  To: Richard Biener; +Cc: Martin Liška, GCC Patches, Michael Matz

On Mon, Sep 06, 2021 at 02:16:58PM +0200, Richard Biener wrote:
> On Mon, Sep 6, 2021 at 1:46 PM Jakub Jelinek <jakub@redhat.com> wrote:
> >
> > On Mon, Sep 06, 2021 at 01:37:46PM +0200, Martin Liška wrote:
> > > --- a/gcc/opts.c
> > > +++ b/gcc/opts.c
> > > @@ -1323,6 +1323,14 @@ finish_options (struct gcc_options *opts, struct gcc_options *opts_set,
> > >        = (opts->x_flag_unroll_loops
> > >           || opts->x_flag_peel_loops
> > >           || opts->x_optimize >= 3);
> > > +
> > > +  /* With -fcx-limited-range, we do cheap and quick complex arithmetic.  */
> > > +  if (opts->x_flag_cx_limited_range)
> > > +    flag_complex_method = 0;
> > > +
> > > +  /* With -fcx-fortran-rules, we do something in-between cheap and C99.  */
> > > +  if (opts->x_flag_cx_fortran_rules)
> > > +    flag_complex_method = 1;
> >
> > That should then be opts->x_flag_complex_method instead of flag_complex_method.
> >
> > Ok with that change.
> 
> But the C/C++ langhooks also set flag_complex_method so I fail to see how
> this helps?  As said I was referring to -fcx-limited-range on the command-line
> and -fno-cx-limited-range in the optimize node to undo this which should
> get you the langhook setting of flag_complex_method = 2.

flag_complex_method is global_options.x_flag_complex_method.
Setting it in a function that has opts and opts_set pointers passed to it
is just unclean, yes, opts will usually be &global_options, but doesn't have
to be always, and everything else in finish_options uses opts->x_* instead
of *.

	Jakub


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] flag_complex_method: support optimize attribute
  2021-09-06 12:16             ` Richard Biener
  2021-09-06 12:24               ` Jakub Jelinek
@ 2021-09-07  9:42               ` Martin Liška
  2021-09-13 13:32                 ` Martin Liška
  2021-09-19 14:45                 ` Jeff Law
  1 sibling, 2 replies; 30+ messages in thread
From: Martin Liška @ 2021-09-07  9:42 UTC (permalink / raw)
  To: Richard Biener, Jakub Jelinek; +Cc: GCC Patches, Michael Matz

[-- Attachment #1: Type: text/plain, Size: 1701 bytes --]

On 9/6/21 14:16, Richard Biener wrote:
> On Mon, Sep 6, 2021 at 1:46 PM Jakub Jelinek <jakub@redhat.com> wrote:
>>
>> On Mon, Sep 06, 2021 at 01:37:46PM +0200, Martin Liška wrote:
>>> --- a/gcc/opts.c
>>> +++ b/gcc/opts.c
>>> @@ -1323,6 +1323,14 @@ finish_options (struct gcc_options *opts, struct gcc_options *opts_set,
>>>         = (opts->x_flag_unroll_loops
>>>            || opts->x_flag_peel_loops
>>>            || opts->x_optimize >= 3);
>>> +
>>> +  /* With -fcx-limited-range, we do cheap and quick complex arithmetic.  */
>>> +  if (opts->x_flag_cx_limited_range)
>>> +    flag_complex_method = 0;
>>> +
>>> +  /* With -fcx-fortran-rules, we do something in-between cheap and C99.  */
>>> +  if (opts->x_flag_cx_fortran_rules)
>>> +    flag_complex_method = 1;
>>
>> That should then be opts->x_flag_complex_method instead of flag_complex_method.
>>
>> Ok with that change.
> 
> But the C/C++ langhooks also set flag_complex_method so I fail to see how
> this helps?  As said I was referring to -fcx-limited-range on the command-line
> and -fno-cx-limited-range in the optimize node to undo this which should
> get you the langhook setting of flag_complex_method = 2.

You are right, it's even more complicated as -fno-cx-limited-range is target specific.
Option handling has been introducing surprises every time ...

The following tested patch should handle it.

Ready to be installed?
Thanks,
Martin

> 
>> Note, I think we want to do much more in finish_options and less in
>> process_options, anything that is about Optimization options rather than
>> just the global ones.  Though one needs to be careful with the cases where
>> the code diagnoses something.
>>
>>          Jakub
>>

[-- Attachment #2: 0001-flag_complex_method-support-optimize-attribute.patch --]
[-- Type: text/x-patch, Size: 6286 bytes --]

From e88ae14be7c5609a969897b5d09f40709fea8a34 Mon Sep 17 00:00:00 2001
From: Martin Liska <mliska@suse.cz>
Date: Fri, 3 Sep 2021 10:53:00 +0200
Subject: [PATCH] flag_complex_method: support optimize attribute

gcc/c-family/ChangeLog:

	* c-opts.c (c_common_init_options_struct): Set also
	  x_flag_default_complex_method.

gcc/ChangeLog:

	* common.opt: Add new variable flag_default_complex_method.
	* opts.c (finish_options): Handle flags related to
	  x_flag_complex_method.
	* toplev.c (process_options): Remove option handling related
	to flag_complex_method.

gcc/go/ChangeLog:

	* go-lang.c (go_langhook_init_options_struct): Set also
	  x_flag_default_complex_method.

gcc/lto/ChangeLog:

	* lto-lang.c (lto_init_options_struct): Set also
	  x_flag_default_complex_method.

gcc/testsuite/ChangeLog:

	* gcc.c-torture/compile/attr-complex-method-2.c: New test.
	* gcc.c-torture/compile/attr-complex-method.c: New test.
---
 gcc/c-family/c-opts.c                                |  1 +
 gcc/common.opt                                       |  3 +++
 gcc/go/go-lang.c                                     |  1 +
 gcc/lto/lto-lang.c                                   |  1 +
 gcc/opts.c                                           | 12 ++++++++++++
 .../gcc.c-torture/compile/attr-complex-method-2.c    | 10 ++++++++++
 .../gcc.c-torture/compile/attr-complex-method.c      | 10 ++++++++++
 gcc/toplev.c                                         |  8 --------
 8 files changed, 38 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.c-torture/compile/attr-complex-method-2.c
 create mode 100644 gcc/testsuite/gcc.c-torture/compile/attr-complex-method.c

diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c
index fdde082158b..3eaab5e1530 100644
--- a/gcc/c-family/c-opts.c
+++ b/gcc/c-family/c-opts.c
@@ -222,6 +222,7 @@ c_common_init_options_struct (struct gcc_options *opts)
 
   /* By default, C99-like requirements for complex multiply and divide.  */
   opts->x_flag_complex_method = 2;
+  opts->x_flag_default_complex_method = opts->x_flag_complex_method;
 }
 
 /* Common initialization before calling option handlers.  */
diff --git a/gcc/common.opt b/gcc/common.opt
index 7d69ab5ef7c..6bfe0b74023 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -59,6 +59,9 @@ enum incremental_link flag_incremental_link = INCREMENTAL_LINK_NONE
 Variable
 int flag_complex_method = 1
 
+Variable
+int flag_default_complex_method = 1
+
 ; Language specific warning pass for unused results.
 Variable
 bool flag_warn_unused_result = false
diff --git a/gcc/go/go-lang.c b/gcc/go/go-lang.c
index a01db8dbdcd..c3ae6f012bb 100644
--- a/gcc/go/go-lang.c
+++ b/gcc/go/go-lang.c
@@ -174,6 +174,7 @@ go_langhook_init_options_struct (struct gcc_options *opts)
   /* Default to avoiding range issues for complex multiply and
      divide.  */
   opts->x_flag_complex_method = 2;
+  opts->x_flag_default_complex_method = opts->x_flag_complex_method;
 
   /* The builtin math functions should not set errno.  */
   opts->x_flag_errno_math = 0;
diff --git a/gcc/lto/lto-lang.c b/gcc/lto/lto-lang.c
index 92f499643b5..a014e5884e0 100644
--- a/gcc/lto/lto-lang.c
+++ b/gcc/lto/lto-lang.c
@@ -813,6 +813,7 @@ lto_init_options_struct (struct gcc_options *opts)
      safe choice.  This will pessimize Fortran code with LTO unless
      people specify a complex method manually or use -ffast-math.  */
   opts->x_flag_complex_method = 2;
+  opts->x_flag_default_complex_method = opts->x_flag_complex_method;
 }
 
 /* Handle command-line option SCODE.  If the option takes an argument, it is
diff --git a/gcc/opts.c b/gcc/opts.c
index e0501551ef5..8cf868ac88d 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -1323,6 +1323,18 @@ finish_options (struct gcc_options *opts, struct gcc_options *opts_set,
       = (opts->x_flag_unroll_loops
          || opts->x_flag_peel_loops
          || opts->x_optimize >= 3);
+
+  /* With -fcx-limited-range, we do cheap and quick complex arithmetic.  */
+  if (opts->x_flag_cx_limited_range)
+    opts->x_flag_complex_method = 0;
+  else if (opts_set->x_flag_cx_limited_range)
+    opts->x_flag_complex_method = opts->x_flag_default_complex_method;
+
+  /* With -fcx-fortran-rules, we do something in-between cheap and C99.  */
+  if (opts->x_flag_cx_fortran_rules)
+    opts->x_flag_complex_method = 1;
+  else if (opts_set->x_flag_cx_fortran_rules)
+    opts->x_flag_complex_method = opts->x_flag_default_complex_method;
 }
 
 #define LEFT_COLUMN	27
diff --git a/gcc/testsuite/gcc.c-torture/compile/attr-complex-method-2.c b/gcc/testsuite/gcc.c-torture/compile/attr-complex-method-2.c
new file mode 100644
index 00000000000..a3dc9c1ba91
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/attr-complex-method-2.c
@@ -0,0 +1,10 @@
+/* { dg-additional-options "-fcx-limited-range -fdump-tree-optimized" } */
+
+#pragma GCC optimize "-fno-cx-limited-range"
+
+void do_div (_Complex double *a, _Complex double *b)
+{
+  *a = *b / (4.0 - 5.0fi);
+}
+
+/* { dg-final { scan-tree-dump "__divdc3" "optimized" } } */
diff --git a/gcc/testsuite/gcc.c-torture/compile/attr-complex-method.c b/gcc/testsuite/gcc.c-torture/compile/attr-complex-method.c
new file mode 100644
index 00000000000..f08b72e273f
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/attr-complex-method.c
@@ -0,0 +1,10 @@
+/* { dg-additional-options "-fdump-tree-optimized" } */
+
+#pragma GCC optimize "-fcx-limited-range"
+
+void do_div (_Complex double *a, _Complex double *b)
+{
+  *a = *b / (4.0 - 5.0fi);
+}
+
+/* { dg-final { scan-tree-dump-not "__divdc3" "optimized" } } */
diff --git a/gcc/toplev.c b/gcc/toplev.c
index 14d1335e79e..e1688aae724 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -1695,14 +1695,6 @@ process_options (void)
       flag_stack_check = NO_STACK_CHECK;
     }
 
-  /* With -fcx-limited-range, we do cheap and quick complex arithmetic.  */
-  if (flag_cx_limited_range)
-    flag_complex_method = 0;
-
-  /* With -fcx-fortran-rules, we do something in-between cheap and C99.  */
-  if (flag_cx_fortran_rules)
-    flag_complex_method = 1;
-
   /* Targets must be able to place spill slots at lower addresses.  If the
      target already uses a soft frame pointer, the transition is trivial.  */
   if (!FRAME_GROWS_DOWNWARD && flag_stack_protect)
-- 
2.33.0


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] flag_complex_method: support optimize attribute
  2021-09-07  9:42               ` Martin Liška
@ 2021-09-13 13:32                 ` Martin Liška
  2021-09-19 14:45                 ` Jeff Law
  1 sibling, 0 replies; 30+ messages in thread
From: Martin Liška @ 2021-09-13 13:32 UTC (permalink / raw)
  To: Richard Biener, Jakub Jelinek; +Cc: Michael Matz, GCC Patches

PING^1

On 9/7/21 11:42, Martin Liška wrote:
> On 9/6/21 14:16, Richard Biener wrote:
>> On Mon, Sep 6, 2021 at 1:46 PM Jakub Jelinek <jakub@redhat.com> wrote:
>>>
>>> On Mon, Sep 06, 2021 at 01:37:46PM +0200, Martin Liška wrote:
>>>> --- a/gcc/opts.c
>>>> +++ b/gcc/opts.c
>>>> @@ -1323,6 +1323,14 @@ finish_options (struct gcc_options *opts, struct gcc_options *opts_set,
>>>>         = (opts->x_flag_unroll_loops
>>>>            || opts->x_flag_peel_loops
>>>>            || opts->x_optimize >= 3);
>>>> +
>>>> +  /* With -fcx-limited-range, we do cheap and quick complex arithmetic.  */
>>>> +  if (opts->x_flag_cx_limited_range)
>>>> +    flag_complex_method = 0;
>>>> +
>>>> +  /* With -fcx-fortran-rules, we do something in-between cheap and C99.  */
>>>> +  if (opts->x_flag_cx_fortran_rules)
>>>> +    flag_complex_method = 1;
>>>
>>> That should then be opts->x_flag_complex_method instead of flag_complex_method.
>>>
>>> Ok with that change.
>>
>> But the C/C++ langhooks also set flag_complex_method so I fail to see how
>> this helps?  As said I was referring to -fcx-limited-range on the command-line
>> and -fno-cx-limited-range in the optimize node to undo this which should
>> get you the langhook setting of flag_complex_method = 2.
> 
> You are right, it's even more complicated as -fno-cx-limited-range is target specific.
> Option handling has been introducing surprises every time ...
> 
> The following tested patch should handle it.
> 
> Ready to be installed?
> Thanks,
> Martin
> 
>>
>>> Note, I think we want to do much more in finish_options and less in
>>> process_options, anything that is about Optimization options rather than
>>> just the global ones.  Though one needs to be careful with the cases where
>>> the code diagnoses something.
>>>
>>>          Jakub
>>>


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Optimize macro: make it more predictable
  2021-08-27  9:05             ` Richard Biener
@ 2021-09-13 13:52               ` Martin Liška
  2021-09-19  5:46                 ` Jeff Law
  0 siblings, 1 reply; 30+ messages in thread
From: Martin Liška @ 2021-09-13 13:52 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jakub Jelinek, Michael Matz, GCC Patches

[-- Attachment #1: Type: text/plain, Size: 493 bytes --]

On 8/27/21 11:05, Richard Biener wrote:
> So with ignoring darktable which seems completely insane the cases
> will likely continue
> to work as intended if we change from the current scheme to appending
> as proposed.

All right, I'm addressing the flag_complex_method in a separate sub-thread.

There's slightly updated version of the patch where I modifed the documentation bits.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

[-- Attachment #2: 0001-Append-target-optimize-attr-to-the-current-cmdline.patch --]
[-- Type: text/x-patch, Size: 7121 bytes --]

From e13e3ec56acfb62543bc1912f1310d00eefba5c3 Mon Sep 17 00:00:00 2001
From: Martin Liska <mliska@suse.cz>
Date: Wed, 2 Jun 2021 08:44:37 +0200
Subject: [PATCH] Append target/optimize attr to the current cmdline.

gcc/c-family/ChangeLog:

	* c-common.c (parse_optimize_options): Combine optimize
	options with what was provided on the command line.

gcc/ChangeLog:

	* toplev.c (toplev::main): Save decoded optimization options.
	* toplev.h (save_opt_decoded_options): New.
	* doc/extend.texi: Be more clear about optimize and target
	attributes.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx512er-vrsqrt28ps-3.c: Disable fast math.
	* gcc.target/i386/avx512er-vrsqrt28ps-5.c: Likewise.
	* gcc.target/i386/attr-optimize.c: New test.
---
 gcc/c-family/c-common.c                       | 17 +++++++++++--
 gcc/doc/extend.texi                           |  8 +++++--
 gcc/testsuite/gcc.target/i386/attr-optimize.c | 24 +++++++++++++++++++
 .../gcc.target/i386/avx512er-vrsqrt28ps-3.c   |  2 +-
 .../gcc.target/i386/avx512er-vrsqrt28ps-5.c   |  2 +-
 gcc/toplev.c                                  |  8 +++++++
 gcc/toplev.h                                  |  1 +
 7 files changed, 56 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/attr-optimize.c

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 017e41537ac..09038e3175f 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -5904,9 +5904,22 @@ parse_optimize_options (tree args, bool attr_p)
       j++;
     }
   decoded_options_count = j;
-  /* And apply them.  */
+
+  /* Merge the decoded options with save_decoded_options.  */
+  unsigned save_opt_count = save_opt_decoded_options.length ();
+  unsigned merged_decoded_options_count
+    = save_opt_count + decoded_options_count;
+  cl_decoded_option *merged_decoded_options
+    = XNEWVEC (cl_decoded_option, merged_decoded_options_count);
+
+  for (unsigned i = 0; i < save_opt_count; ++i)
+    merged_decoded_options[i] = save_opt_decoded_options[i];
+  for (unsigned i = 0; i < decoded_options_count; ++i)
+    merged_decoded_options[save_opt_count + i] = decoded_options[i];
+
+   /* And apply them.  */
   decode_options (&global_options, &global_options_set,
-		  decoded_options, decoded_options_count,
+		  merged_decoded_options, merged_decoded_options_count,
 		  input_location, global_dc, NULL);
   free (decoded_options);
 
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 7fb22ed8063..1cb7e33ca29 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -3639,7 +3639,10 @@ take function pointer arguments.
 @cindex @code{optimize} function attribute
 The @code{optimize} attribute is used to specify that a function is to
 be compiled with different optimization options than specified on the
-command line.  Valid arguments are constant non-negative integers and
+command line.  The optimize attribute arguments of a function behave
+behave as if appended to the command-line.
+
+Valid arguments are constant non-negative integers and
 strings.  Each numeric argument specifies an optimization @var{level}.
 Each @var{string} argument consists of one or more comma-separated
 substrings.  Each substring that begins with the letter @code{O} refers
@@ -3843,7 +3846,8 @@ This attribute prevents stack protection code for the function.
 Multiple target back ends implement the @code{target} attribute
 to specify that a function is to
 be compiled with different target options than specified on the
-command line.  One or more strings can be provided as arguments.
+command line.  The original target command-line options are ignored.
+One or more strings can be provided as arguments.
 Each string consists of one or more comma-separated suffixes to
 the @code{-m} prefix jointly forming the name of a machine-dependent
 option.  @xref{Submodel Options,,Machine-Dependent Options}.
diff --git a/gcc/testsuite/gcc.target/i386/attr-optimize.c b/gcc/testsuite/gcc.target/i386/attr-optimize.c
new file mode 100644
index 00000000000..f5db028f1fd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/attr-optimize.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O1 -ftree-slp-vectorize -march=znver1 -fdump-tree-optimized" } */
+
+/* Use -O2, but -ftree-slp-vectorize option should be preserved and used.  */
+#pragma GCC optimize "-O2"
+
+typedef struct {
+  long n[4];
+} secp256k1_fe;
+
+void *a;
+int c;
+static void
+fn1(secp256k1_fe *p1, int p2)
+{
+  p1->n[0] = p1->n[1] = p2;
+}
+void
+fn2()
+{
+  fn1(a, !c);
+}
+
+/* { dg-final { scan-tree-dump { MEM <vector\(2\) long int> \[[^]]*\] = } "optimized" } } */
diff --git a/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-3.c b/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-3.c
index 1ba8172d6e3..40aefb50844 100644
--- a/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-3.c
+++ b/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-3.c
@@ -8,7 +8,7 @@
 #define MAX 1000
 #define EPS 0.00001
 
-__attribute__ ((noinline, optimize (1)))
+__attribute__ ((noinline, optimize (1, "-fno-fast-math")))
 void static
 compute_rsqrt_ref (float *a, float *r)
 {
diff --git a/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-5.c b/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-5.c
index e067a81e562..498f4d50aa5 100644
--- a/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-5.c
+++ b/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-5.c
@@ -8,7 +8,7 @@
 #define MAX 1000
 #define EPS 0.00001
 
-__attribute__ ((noinline, optimize (1)))
+__attribute__ ((noinline, optimize (1, "-fno-fast-math")))
 void static
 compute_sqrt_ref (float *a, float *r)
 {
diff --git a/gcc/toplev.c b/gcc/toplev.c
index 14d1335e79e..538ffdbd31a 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -121,6 +121,9 @@ static bool no_backend;
 struct cl_decoded_option *save_decoded_options;
 unsigned int save_decoded_options_count;
 
+/* Vector of saved Optimization decoded command line options.  */
+auto_vec<cl_decoded_option> save_opt_decoded_options;
+
 /* Used to enable -fvar-tracking, -fweb and -frename-registers according
    to optimize in process_options ().  */
 #define AUTODETECT_VALUE 2
@@ -2342,6 +2345,11 @@ toplev::main (int argc, char **argv)
 						&save_decoded_options,
 						&save_decoded_options_count);
 
+  /* Save Optimization decoded options.  */
+  for (unsigned i = 0; i < save_decoded_options_count; ++i)
+    if (cl_options[save_decoded_options[i].opt_index].flags & CL_OPTIMIZATION)
+      save_opt_decoded_options.safe_push (save_decoded_options[i]);
+
   /* Perform language-specific options initialization.  */
   lang_hooks.init_options (save_decoded_options_count, save_decoded_options);
 
diff --git a/gcc/toplev.h b/gcc/toplev.h
index f543554b15f..c44c5ff926a 100644
--- a/gcc/toplev.h
+++ b/gcc/toplev.h
@@ -23,6 +23,7 @@ along with GCC; see the file COPYING3.  If not see
 /* Decoded options, and number of such options.  */
 extern struct cl_decoded_option *save_decoded_options;
 extern unsigned int save_decoded_options_count;
+extern auto_vec<cl_decoded_option> save_opt_decoded_options;
 
 class timer;
 
-- 
2.33.0


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Optimize macro: make it more predictable
  2021-09-13 13:52               ` Martin Liška
@ 2021-09-19  5:46                 ` Jeff Law
  0 siblings, 0 replies; 30+ messages in thread
From: Jeff Law @ 2021-09-19  5:46 UTC (permalink / raw)
  To: Martin Liška, Richard Biener
  Cc: Jakub Jelinek, Michael Matz, GCC Patches



On 9/13/2021 7:52 AM, Martin Liška wrote:
> On 8/27/21 11:05, Richard Biener wrote:
>> So with ignoring darktable which seems completely insane the cases
>> will likely continue
>> to work as intended if we change from the current scheme to appending
>> as proposed.
>
> All right, I'm addressing the flag_complex_method in a separate 
> sub-thread.
>
> There's slightly updated version of the patch where I modifed the 
> documentation bits.
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?
> Thanks,
> Martin
>
> 0001-Append-target-optimize-attr-to-the-current-cmdline.patch
>
>  From e13e3ec56acfb62543bc1912f1310d00eefba5c3 Mon Sep 17 00:00:00 2001
> From: Martin Liska <mliska@suse.cz>
> Date: Wed, 2 Jun 2021 08:44:37 +0200
> Subject: [PATCH] Append target/optimize attr to the current cmdline.
>
> gcc/c-family/ChangeLog:
>
> 	* c-common.c (parse_optimize_options): Combine optimize
> 	options with what was provided on the command line.
>
> gcc/ChangeLog:
>
> 	* toplev.c (toplev::main): Save decoded optimization options.
> 	* toplev.h (save_opt_decoded_options): New.
> 	* doc/extend.texi: Be more clear about optimize and target
> 	attributes.
>
> gcc/testsuite/ChangeLog:
>
> 	* gcc.target/i386/avx512er-vrsqrt28ps-3.c: Disable fast math.
> 	* gcc.target/i386/avx512er-vrsqrt28ps-5.c: Likewise.
> 	* gcc.target/i386/attr-optimize.c: New test.
OK
jeff


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] flag_complex_method: support optimize attribute
  2021-09-07  9:42               ` Martin Liška
  2021-09-13 13:32                 ` Martin Liška
@ 2021-09-19 14:45                 ` Jeff Law
  1 sibling, 0 replies; 30+ messages in thread
From: Jeff Law @ 2021-09-19 14:45 UTC (permalink / raw)
  To: Martin Liška, Richard Biener, Jakub Jelinek
  Cc: Michael Matz, GCC Patches



On 9/7/2021 3:42 AM, Martin Liška wrote:
> On 9/6/21 14:16, Richard Biener wrote:
>> On Mon, Sep 6, 2021 at 1:46 PM Jakub Jelinek <jakub@redhat.com> wrote:
>>>
>>> On Mon, Sep 06, 2021 at 01:37:46PM +0200, Martin Liška wrote:
>>>> --- a/gcc/opts.c
>>>> +++ b/gcc/opts.c
>>>> @@ -1323,6 +1323,14 @@ finish_options (struct gcc_options *opts, 
>>>> struct gcc_options *opts_set,
>>>>         = (opts->x_flag_unroll_loops
>>>>            || opts->x_flag_peel_loops
>>>>            || opts->x_optimize >= 3);
>>>> +
>>>> +  /* With -fcx-limited-range, we do cheap and quick complex 
>>>> arithmetic.  */
>>>> +  if (opts->x_flag_cx_limited_range)
>>>> +    flag_complex_method = 0;
>>>> +
>>>> +  /* With -fcx-fortran-rules, we do something in-between cheap and 
>>>> C99.  */
>>>> +  if (opts->x_flag_cx_fortran_rules)
>>>> +    flag_complex_method = 1;
>>>
>>> That should then be opts->x_flag_complex_method instead of 
>>> flag_complex_method.
>>>
>>> Ok with that change.
>>
>> But the C/C++ langhooks also set flag_complex_method so I fail to see 
>> how
>> this helps?  As said I was referring to -fcx-limited-range on the 
>> command-line
>> and -fno-cx-limited-range in the optimize node to undo this which should
>> get you the langhook setting of flag_complex_method = 2.
>
> You are right, it's even more complicated as -fno-cx-limited-range is 
> target specific.
> Option handling has been introducing surprises every time ...
>
> The following tested patch should handle it.
>
> Ready to be installed?
> Thanks,
> Martin
>
>>
>>> Note, I think we want to do much more in finish_options and less in
>>> process_options, anything that is about Optimization options rather 
>>> than
>>> just the global ones.  Though one needs to be careful with the cases 
>>> where
>>> the code diagnoses something.
>>>
>>>          Jakub
>>>
>
> 0001-flag_complex_method-support-optimize-attribute.patch
>
>  From e88ae14be7c5609a969897b5d09f40709fea8a34 Mon Sep 17 00:00:00 2001
> From: Martin Liska <mliska@suse.cz>
> Date: Fri, 3 Sep 2021 10:53:00 +0200
> Subject: [PATCH] flag_complex_method: support optimize attribute
>
> gcc/c-family/ChangeLog:
>
> 	* c-opts.c (c_common_init_options_struct): Set also
> 	  x_flag_default_complex_method.
>
> gcc/ChangeLog:
>
> 	* common.opt: Add new variable flag_default_complex_method.
> 	* opts.c (finish_options): Handle flags related to
> 	  x_flag_complex_method.
> 	* toplev.c (process_options): Remove option handling related
> 	to flag_complex_method.
>
> gcc/go/ChangeLog:
>
> 	* go-lang.c (go_langhook_init_options_struct): Set also
> 	  x_flag_default_complex_method.
>
> gcc/lto/ChangeLog:
>
> 	* lto-lang.c (lto_init_options_struct): Set also
> 	  x_flag_default_complex_method.
>
> gcc/testsuite/ChangeLog:
>
> 	* gcc.c-torture/compile/attr-complex-method-2.c: New test.
> 	* gcc.c-torture/compile/attr-complex-method.c: New test.
OK
jeff


^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2021-09-19 14:45 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-23 11:47 [PATCH] Optimize macro: make it more predictable Martin Liška
2020-11-03 13:27 ` Richard Biener
2020-11-03 13:34   ` Jakub Jelinek
2020-11-03 13:40     ` Richard Biener
2020-11-09 10:35     ` Martin Liška
2020-11-26 13:56       ` Martin Liška
2020-12-07 11:03         ` Martin Liška
2021-01-11 13:10           ` Martin Liška
2020-11-09 10:27   ` Martin Liška
2020-11-06 17:34 ` Jeff Law
2020-11-09 10:36   ` Martin Liška
2021-07-01 13:13 ` Martin Liška
2021-08-10 15:52   ` Martin Liška
2021-08-24 11:06     ` Martin Liška
2021-08-24 12:13   ` Richard Biener
2021-08-24 13:04     ` Martin Liška
2021-08-26 11:04       ` Richard Biener
2021-08-26 12:39         ` Martin Liška
2021-08-26 13:20           ` Richard Biener
2021-08-27  8:35           ` Martin Liška
2021-08-27  9:05             ` Richard Biener
2021-09-13 13:52               ` Martin Liška
2021-09-19  5:46                 ` Jeff Law
2021-09-06 11:37         ` [PATCH] flag_complex_method: support optimize attribute Martin Liška
2021-09-06 11:46           ` Jakub Jelinek
2021-09-06 12:16             ` Richard Biener
2021-09-06 12:24               ` Jakub Jelinek
2021-09-07  9:42               ` Martin Liška
2021-09-13 13:32                 ` Martin Liška
2021-09-19 14:45                 ` Jeff Law

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).