From: "Martin Liška" <mliska@suse.cz>
To: Richard Biener <richard.guenther@gmail.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>,
	Jakub Jelinek <jakub@redhat.com>, Michael Matz <matz@suse.de>
Subject: Re: [PATCH] Optimize macro: make it more predictable
Date: Thu, 26 Aug 2021 14:39:57 +0200
Message-ID: <e3890c97-3c3c-0692-43b7-e115b46cf91e@suse.cz>
In-Reply-To: <CAFiYyc0Kej7BhwoMKuPzPuh9oVtzZPWR=vFK8T5cMgdOmLkD+A@mail.gmail.com>

On 8/26/21 13:04, Richard Biener wrote:
> On Tue, Aug 24, 2021 at 3:04 PM Martin Liška <mliska@suse.cz> wrote:
>>
>> On 8/24/21 14:13, Richard Biener wrote:
>>> On Thu, Jul 1, 2021 at 3:13 PM Martin Liška <mliska@suse.cz> wrote:
>>>>
>>>> On 10/23/20 1:47 PM, Martin Liška wrote:
>>>>> Hey.
>>>>
>>>> Hello.
>>>>
>>>> I deferred the patch for GCC 12. Since then, I have worked with options enough that
>>>> I feel more familiar with the option handling. So ...
>>>>
>>>>>
>>>>> This is a follow-up of the discussion that happened in thread about no_stack_protector
>>>>> attribute: https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545916.html
>>>>>
>>>>> The current optimize attribute works in the following way:
>>>>> - 1) we take current global_options as base
>>>>> - 2) maybe_default_options is called for the currently selected optimization level, which
>>>>>         means all rules in default_options_table are executed
>>>>> - 3) attribute values are applied (via decode_options)
>>>>>
>>>>> So step 2) is problematic: -O2 -fno-omit-frame-pointer combined with __attribute__((optimize("-fno-stack-protector")))
>>>>> basically ends up as -O2 -fno-stack-protector, because -fomit-frame-pointer is a default at -O1 and above:
>>>>>        /* -O1 and -Og optimizations.  */
>>>>>        { OPT_LEVELS_1_PLUS, OPT_fomit_frame_pointer, NULL, 1 },
>>>>>
>>>>> My patch handles that, and the optimize attribute now really behaves the same as appending the attribute value
>>>>> to the command line. So far so good. We should also reflect that in the documentation entry, which is quite
>>>>> vague right now:
>>>>
>>>> ^^^ all these are still valid arguments, plus I'm adding a new test-case that tests that.
>>
>> Hey.
>>
>>> There is also handle_common_deferred_options that's not called so any
>>> option processed there should
>>> probably be exempt from being set/unset in the optimize attribute?
>>
>> Looking at the handled options, they all have the Defer type and not Optimization.
>> Thus we should be fine.
>>
>>>
>>>>>
>>>>> """
>>>>> The optimize attribute is used to specify that a function is to be compiled with different optimization options than specified on the command line.
>>>>> """
>>>>
>>>> I addressed that with documentation changes, should be more clear to users. Moreover, I noticed that we declare 'optimize' attribute
>>>> as something not for a production use:
>>>>
>>>> "The optimize attribute should be used for debugging purposes only. It is not suitable in production code."
>>>>
>>>> Are we sure about the statement? I know that e.g. glibc uses that.
>>>
>>> Well, given we're changing behavior now that warning looks valid ;)
>>
>> Yeah! True.
>>
>>> I'll also note that
>>>
>>> "The optimize attribute arguments of a function behave
>>> as if they were added to the command line options."
>>>
>>> is still likely untrue, the global state init is complicated ;)
>>
>> Sure, but the situation should be much closer to it :) Do you have a better wording?
> 
> Maybe "The intent is that the optimize attribute behaves as if the
> arguments were
> appended to the command line."
> 
> But as said originally below I'm not sure that this behavior is what people
> expect.  If I say optimize("fno-tree-vectorize") then I do expect that to retain
> other command-line arguments.

Yep, that's clear!
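
For instance, in a sketch like the following, the expectation is that the function keeps -O2 and every other command-line flag and only loses vectorization. The names `NO_VECTORIZE` and `sum_novect` are made up for illustration; the compiler guard is only there because the attribute is GCC-specific.

```c
#include <assert.h>
#include <stddef.h>

/* 'optimize' is a GCC attribute; expand to nothing elsewhere.  */
#if defined(__GNUC__) && !defined(__clang__)
# define NO_VECTORIZE __attribute__((optimize("no-tree-vectorize")))
#else
# define NO_VECTORIZE
#endif

/* Hypothetical example function: the attribute string is treated as
   -fno-tree-vectorize for this function only.  */
NO_VECTORIZE int
sum_novect (const int *a, size_t n)
{
  assert (a != NULL || n == 0);
  int s = 0;
  for (size_t i = 0; i < n; i++)
    s += a[i];
  return s;
}
```

Compiling this with e.g. -O3 and checking the assembly for the loop is one way to see whether the rest of the command line survived.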

> If I say optimize(1) I'm not sure I would expect
> -ftree-vectorize on the command-line to prevail ;)

That's a very good question. Theoretically yes: I can imagine various scenarios in which -Ox is used
in an attribute:

1) -O0 for a function that e.g. violates strict aliasing, or when one wants to debug a function in an optimized binary
2) -Ofast for a function which is performance critical

On the other hand, my original motivation was a kernel compiled with:
-O2 -fno-omit-frame-pointer and __attribute__((optimize("-fno-stack-protector")))

It's not intuitive that you end up with -O2 -fno-stack-protector (and -fomit-frame-pointer).
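
A toy model of the difference, just to make it concrete. This is only a sketch: `struct toy_opts`, `apply_level_defaults`, `current_optimize_attr` and `appended_optimize_attr` are made-up names and not GCC internals, only two flags are modelled, and -fstack-protector is assumed to be enabled on the command line.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for global_options; not the real structure.  */
struct toy_opts
{
  int level;                /* -O level */
  bool omit_frame_pointer;  /* -fomit-frame-pointer */
  bool stack_protector;     /* -fstack-protector */
};

/* Stand-in for step 2): re-run the per-level defaults, e.g.
   { OPT_LEVELS_1_PLUS, OPT_fomit_frame_pointer, NULL, 1 }.  */
void
apply_level_defaults (struct toy_opts *o)
{
  assert (o->level >= 0);
  if (o->level >= 1)
    o->omit_frame_pointer = true;
}

/* Current behavior: base options, level defaults, then the attribute.  */
struct toy_opts
current_optimize_attr (struct toy_opts cmdline)
{
  struct toy_opts o = cmdline;   /* 1) global_options as base */
  apply_level_defaults (&o);     /* 2) clobbers -fno-omit-frame-pointer */
  o.stack_protector = false;     /* 3) optimize("-fno-stack-protector") */
  return o;
}

/* Append behavior: the attribute value is just added to the command line.  */
struct toy_opts
appended_optimize_attr (struct toy_opts cmdline)
{
  struct toy_opts o = cmdline;
  o.stack_protector = false;
  return o;
}
```

Starting from { 2, false, true }, i.e. -O2 -fno-omit-frame-pointer -fstack-protector, the first function turns omit_frame_pointer back on while the second leaves it off.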

> 
> Is google code search still a thing?  Can one search all of github
> somehow?  I really
> wonder how 'optimize' is used at the moment.  There are quite some optimize
> attributes in the target part of the testsuite for example.  And
> testsuite/gcc.dg/vect/bb-slp-41.c

I can investigate that.

> suggests that optimize("-fno-tree-fre") preserves at least the
> optimization level?
> 
> There's always the possibility to preserve the current behavior for 'optimize'

Yes, but we should somehow document the current weird behavior when it comes to -Ox options.

> and add a new 'add_optimize' attribute that does the other thing.

Considering that ...

> 
>>>
>>>
>>>>>
>>>>> and we may want to handle -Ox in the attribute in a special way. I guess many macro/pragma users expect that
>>>>>
>>>>> -O2 -ftree-vectorize and __attribute__((optimize(1))) will end with -O1 and not
>>>>> with -ftree-vectorize -O1 ?
>>
>> This is my older suggestion and it will likely make it even more complicated. So ...
> 
> In theory it's just dropping the command-line but yes, the question is
> what happens
> to the non-optimization part of the command-line.  We obviously shouldn't drop
> -std=gnu14 and its side-effects.  So yes, even documenting exactly what this
> would do is difficult ;)
> 
>>>
>>> As implemented your patch seems to turn it into -ftree-vectorize -O1.
>>
>> Yes.
>>
>>> IIRC multiple optimize attributes apply
>>> on top of each other, and it makes sense to me that optimize (2),
>>> optimize ("tree-vectorize") behaves the same
>>> as optimize (2, "tree-vectorize").  I'm not sure this is still the
>>> case after your patch?  Also consider
>>>
>>> #pragma GCC optimize ("tree-vectorize")
>>> void foo () { ...}
>>>
>>> #pragma GCC optimize ("tree-loop-distribution")
>>> void bar () {... }
>>>
>>> I'd expect bar to have both vectorization and loop distribution
>>> enabled? (note I didn't use push/pop here)
>>
>> Yes, yes and yes. I'm going to verify it.
>>
>>>
>>>> The situation with 'target' attribute is different. When parsing the attribute, we intentionally drop all existing target flags:
>>>>
>>>> $ cat -n gcc/config/i386/i386-options.c
>>>> ...
>>>>      1245                if (opt == IX86_FUNCTION_SPECIFIC_ARCH)
>>>>      1246                  {
>>>>      1247                    /* If arch= is set,  clear all bits in x_ix86_isa_flags,
>>>>      1248                       except for ISA_64BIT, ABI_64, ABI_X32, and CODE16
>>>>      1249                       and all bits in x_ix86_isa_flags2.  */
>>>>      1250                    opts->x_ix86_isa_flags &= (OPTION_MASK_ISA_64BIT
>>>>      1251                                               | OPTION_MASK_ABI_64
>>>>      1252                                               | OPTION_MASK_ABI_X32
>>>>      1253                                               | OPTION_MASK_CODE16);
>>>>      1254                    opts->x_ix86_isa_flags_explicit &= (OPTION_MASK_ISA_64BIT
>>>>      1255                                                        | OPTION_MASK_ABI_64
>>>>      1256                                                        | OPTION_MASK_ABI_X32
>>>>      1257                                                        | OPTION_MASK_CODE16);
>>>>      1258                    opts->x_ix86_isa_flags2 = 0;
>>>>      1259                    opts->x_ix86_isa_flags2_explicit = 0;
>>>>      1260                  }
>>>>
>>>> That seems logical because target attribute is used for e.g. ifunc multi-versioning and one needs
>>>> to be sure all existing ISA flags are dropped. However, I noticed clang behaves differently:
>>>>
>>>> $ cat hreset.c
>>>> #pragma GCC target "arch=geode"
>>>> #include <immintrin.h>
>>>> void foo(unsigned int eax)
>>>> {
>>>>      _hreset (eax);
>>>> }
>>>>
>>>> $ clang hreset.c -mhreset  -c -O2 -m32
>>>> $ gcc hreset.c -mhreset  -c -O2 -m32
>>>> In file included from /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/x86gprintrin.h:97,
>>>>                     from /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/immintrin.h:27,
>>>>                     from hreset.c:2:
>>>> hreset.c: In function ‘foo’:
>>>> /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/hresetintrin.h:39:1: error: inlining failed in call to ‘always_inline’ ‘_hreset’: target specific option mismatch
>>>>       39 | _hreset (unsigned int __EAX)
>>>>          | ^~~~~~~
>>>> hreset.c:5:3: note: called from here
>>>>        5 |   _hreset (eax);
>>>>          |   ^~~~~~~~~~~~~
>>>>
>>>> Anyway, I think the current target attribute handling should be preserved.
>>>
>>> I think this and the -O1 argument above suggests that there should be
>>> a way to distinguish
>>> two modes - add to the active set of options and starting from scratch.
>>
>> Doing that would make it even crazier :)
>>
>>>
>>> Maybe it's over-designing things but do we want to preserve the
>>> existing behavior
>>> and instead add optimize ("+ftree-vectorize") and target ("+avx2") as
>>> a way to amend
>>> the state?
>>
>> I prefer doing only the append mode (when one can still use -fno-foo for an explicit
>> drop of a flag).
>>
>>>
>>> OTOH as we're missing global_options re-init even with your patch we won't get
>>> the defaults correct (aka what toplev::main does with init_options_struct and
>>> the corresponding langhook).  Likewise if lang_hooks.init_options performs any
>>> defaulting a later flag overrides and we override that with optimize() that
>>> doesn't work - I'm thinking of things like flag_complex_method and -fcx-* flags.
>>> So -O2 -fcx-fortran-rules on the command-line and optimize
>>> ("no-cx-fortran-rules")
>>> to cancel the -fcx-fortran-rules switch wouldn't work?
>>
>> In most cases it works. What's problematic about -fcx-fortran-rules is that it sets
>>
>>     /* With -fcx-limited-range, we do cheap and quick complex arithmetic.  */
>>     if (flag_cx_limited_range)
>>       flag_complex_method = 0;
>>
>>     /* With -fcx-fortran-rules, we do something in-between cheap and C99.  */
>>     if (flag_cx_fortran_rules)
>>       flag_complex_method = 1;
>>
>> in process_options (called only for cmdline options) and not in
>>
>> /* After all options at LOC have been read into OPTS and OPTS_SET,
>>      finalize settings of those options and diagnose incompatible
>>      combinations.  */
>> void
>> finish_options (struct gcc_options *opts, struct gcc_options *opts_set,
>>                  location_t loc)
>>
>> which is called once options are decoded (both from the cmdline and when
>> combined with an attribute or pragma):
> 
> Yes, and that flag_complex_method is initialized via the langhook
> mentioned, for example
> c-family/c-opts.c has
> 
> /* Initialize options structure OPTS.  */
> void
> c_common_init_options_struct (struct gcc_options *opts)
> {
>    opts->x_flag_exceptions = c_dialect_cxx ();
>    opts->x_warn_pointer_arith = c_dialect_cxx ();
>    opts->x_warn_write_strings = c_dialect_cxx ();
>    opts->x_flag_warn_unused_result = true;
> 
>    /* By default, C99-like requirements for complex multiply and divide.  */
>    opts->x_flag_complex_method = 2;
> }
> 
> so an attempt to "cancel" a command-line option that adjusted any of
> the above will not
> work because we're not re-initializing global_options appropriately.
> But maybe we can
> just do that?  That is, call
> 
>    /* Initialize global options structures; this must be repeated for
>       each structure used for parsing options.  */
>    init_options_struct (&global_options, &global_options_set);
>    lang_hooks.init_options_struct (&global_options);
> 
> and
> 
>    /* Perform language-specific options initialization.  */
>    lang_hooks.init_options (save_decoded_options_count, save_decoded_options);
> 
> as done by toplev.c?  Or if we do not want to do that store that state away
> to an 'initialized_options/initialized_options_set' set of vars we can
> copy from?

I think this one is a bit orthogonal to the suggested changes, right? We can hopefully implement it incrementally.
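
As a toy illustration of the flag_complex_method problem discussed above: if the method is derived from the -fcx-* flags only once for the command line, cancelling a flag in an attribute leaves a stale value, while re-deriving it from the language default after every decode does not. This is a sketch under assumptions, not the actual GCC code: `struct toy_cx_opts`, `toy_finalize_complex_method` and `toy_apply_no_cx_fortran_rules` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the relevant part of the options structure.  */
struct toy_cx_opts
{
  bool cx_limited_range;  /* -fcx-limited-range */
  bool cx_fortran_rules;  /* -fcx-fortran-rules */
  int complex_method;     /* stand-in for flag_complex_method */
};

/* C-family default, as set in c_common_init_options_struct.  */
enum { TOY_COMPLEX_METHOD_C99 = 2 };

/* Derive the method from the flags, starting from the language default
   so that a cancelled flag falls back to it.  */
void
toy_finalize_complex_method (struct toy_cx_opts *o)
{
  assert (o != 0);
  o->complex_method = TOY_COMPLEX_METHOD_C99;
  if (o->cx_limited_range)
    o->complex_method = 0;
  if (o->cx_fortran_rules)
    o->complex_method = 1;
}

/* Model optimize("no-cx-fortran-rules") on top of the command line.
   With REDERIVE false the method is left as computed for the command
   line only.  */
struct toy_cx_opts
toy_apply_no_cx_fortran_rules (struct toy_cx_opts cmdline, bool rederive)
{
  struct toy_cx_opts o = cmdline;
  o.cx_fortran_rules = false;
  if (rederive)
    toy_finalize_complex_method (&o);
  return o;
}
```

With -fcx-fortran-rules on the command line, the non-rederiving variant keeps method 1 after the attribute, whereas the rederiving variant returns to the C99-like method 2.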

Martin

> 
>> #1  0x0000000001b69da3 in finish_options (opts=opts@entry=0x26b13e0 <global_options>, opts_set=opts_set@entry=0x26afdc0 <global_options_set>, loc=loc@entry=258754) at /home/marxin/Programming/gcc/gcc/opts.c:1303
>> #2  0x0000000000dd9e3b in decode_options (opts=0x26b13e0 <global_options>, opts_set=0x26afdc0 <global_options_set>, decoded_options=<optimized out>, decoded_options_count=decoded_options_count@entry=4, loc=258754, dc=0x26b2b00 <global_diagnostic_context>, target_option_override_hook=0x0) at /home/marxin/Programming/gcc/gcc/opts-global.c:324
>> #3  0x0000000000921144 in parse_optimize_options (args=args@entry=<tree_list 0x7ffff76e1910>, attr_p=attr_p@entry=false) at /home/marxin/Programming/gcc/gcc/c-family/c-common.c:5921
>> #4  0x0000000000972aab in handle_pragma_optimize (dummy=<optimized out>) at /home/marxin/Programming/gcc/gcc/c-family/c-pragma.c:993
>> #5  0x00000000008e3118 in c_parser_pragma (parser=0x7ffff7fbeab0, context=pragma_external, if_p=0x0) at /home/marxin/Programming/gcc/gcc/c/c-parser.c:12573
>>
>>
>> Martin
>>
>>>
>>> Thanks,
>>> Richard.
>>>
>>>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>>>
>>>> Ready to be installed?
>>>> Thanks,
>>>> Martin
>>>>
>>>>>
>>>>> I'm also planning to take a look at the target macro/attribute, I expect similar problems:
>>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97469
>>>>>
>>>>> Thoughts?
>>>>> Thanks,
>>>>> Martin
>>>>
>>


