From: "Martin Liška" <mliska@suse.cz>
To: Richard Biener <richard.guenther@gmail.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>,
Jakub Jelinek <jakub@redhat.com>, Michael Matz <matz@suse.de>
Subject: Re: [PATCH] Optimize macro: make it more predictable
Date: Thu, 26 Aug 2021 14:39:57 +0200 [thread overview]
Message-ID: <e3890c97-3c3c-0692-43b7-e115b46cf91e@suse.cz> (raw)
In-Reply-To: <CAFiYyc0Kej7BhwoMKuPzPuh9oVtzZPWR=vFK8T5cMgdOmLkD+A@mail.gmail.com>
On 8/26/21 13:04, Richard Biener wrote:
> On Tue, Aug 24, 2021 at 3:04 PM Martin Liška <mliska@suse.cz> wrote:
>>
>> On 8/24/21 14:13, Richard Biener wrote:
>>> On Thu, Jul 1, 2021 at 3:13 PM Martin Liška <mliska@suse.cz> wrote:
>>>>
>>>> On 10/23/20 1:47 PM, Martin Liška wrote:
>>>>> Hey.
>>>>
>>>> Hello.
>>>>
>>>> I deferred the patch for GCC 12. Since the time, I messed up with options
>>>> I feel more familiar with the option handling. So ...
>>>>
>>>>>
>>>>> This is a follow-up of the discussion that happened in thread about no_stack_protector
>>>>> attribute: https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545916.html
>>>>>
>>>>> The current optimize attribute works in the following way:
>>>>> - 1) we take current global_options as base
>>>>> - 2) maybe_default_options is called for the currently selected optimization level, which
>>>>> means all rules in default_options_table are executed
>>>>> - 3) attribute values are applied (via decode_options)
>>>>>
>>>>> So the step 2) is problematic: in case of -O2 -fno-omit-frame-pointer and __attribute__((optimize("-fno-stack-protector")))
>>>>> ends basically with -O2 -fno-stack-protector because -fno-omit-frame-pointer is default:
>>>>> /* -O1 and -Og optimizations. */
>>>>> { OPT_LEVELS_1_PLUS, OPT_fomit_frame_pointer, NULL, 1 },
>>>>>
>>>>> My patch handled and the current optimize attribute really behaves that same as appending attribute value
>>>>> to the command line. So far so good. We should also reflect that in documentation entry which is quite
>>>>> vague right now:
>>>>
>>>> ^^^ all these are still valid arguments, plus I'm adding a new test-case that tests that.
>>
>> Hey.
>>
>>> There is also handle_common_deferred_options that's not called so any
>>> option processed there should
>>> probably be excempt from being set/unset in the optimize attribute?
>>
>> Looking at the handled options, they have all Defer type and not Optimization.
>> Thus we should be fine.
>>
>>>
>>>>>
>>>>> """
>>>>> The optimize attribute is used to specify that a function is to be compiled with different optimization options than specified on the command line.
>>>>> """
>>>>
>>>> I addressed that with documentation changes, should be more clear to users. Moreover, I noticed that we declare 'optimize' attribute
>>>> as something not for a production use:
>>>>
>>>> "The optimize attribute should be used for debugging purposes only. It is not suitable in production code."
>>>>
>>>> Are we sure about the statement? I know that e.g. glibc uses that.
>>>
>>> Well, given we're changing behavior now that warning looks valid ;)
>>
>> Yeah! True.
>>
>>> I'll also note that
>>>
>>> "The optimize attribute arguments of a function behave
>>> as if they were added to the command line options."
>>>
>>> is still likely untrue, the global state init is complicated ;)
>>
>> Sure, but the situation should be much closer to it :) Do you have a better wording?
>
> Maybe "The intent is that the optimize attribute behaves as if the
> arguments were
> appended to the command line."
>
> But as said originally below I'm not sure that this behavior is what people
> expect. If I say optimize("fno-tree-vectorize") then I do expect that to retain
> other command-line arguments.
Yep, that's clear!
> If I say optimize(1) I'm not sure I would expect
> -ftree-vectorize on the command-line to prevail ;)
That's a very good question: theoretically yes, I can imagine various scenarios how can -Ox be used
in an attribute:
1) -O0 for a function that e.g. violate strict aliasing, or when one wants to debug a fn in an optimized binary
2) -Ofast for a function which is performance critical
On the other hand, my original motivation was a kernel compiler with:
-O2 -fno-omit-frame-pointer and __attribute__((optimize("-fno-stack-protector")))
It's not intuitive that you end up with -O2 -fno-stack-protector (and -fomit-frame-pointer).
>
> Is google code search still a thing? Can one search all of github
> somehow? I really
> wonder how 'optimize' is used at the moment. There are quite some optimize
> attributes in the target part of the testsuite for example. And
> testsuite/gcc.dg/vect/bb-slp-41.c
I can investigate that.
> suggests that optimize("-fno-tree-fre") preserves at least the
> optimization level?
>
> There's always the possibility to preserve the current behavior for 'optimize'
Yes, but we should somehow document the current weird behavior when it comes to -Ox options.
> and add a new 'add_optimize' attribute that does the other thing.
Considering that ...
>
>>>
>>>
>>>>>
>>>>> and we may want to handle -Ox in the attribute in a special way. I guess many macro/pragma users expect that
>>>>>
>>>>> -O2 -ftree-vectorize and __attribute__((optimize(1))) will end with -O1 and not
>>>>> with -ftree-vectorize -O1 ?
>>
>> This is my older suggestion and it will likely make it even much complicated. So ...
>
> In theory it's just dropping the command-line but yes, the question is
> what happens
> to the non-optimization part of the command-line. We obviously shouldn't drop
> -std=gnu14 and it's side-effects. So yes, even documenting exactly what this
> would do is difficult ;)
>
>>>
>>> As implemented your patch seems to turn it into -ftree-vectorize -O1.
>>
>> Yes.
>>
>>> IIRC multiple optimize attributes apply
>>> ontop of each other, and it makes sense to me that optimize (2),
>>> optimize ("tree-vectorize") behaves the same
>>> as optimize (2, "tree-vectorize"). I'm not sure this is still the
>>> case after your patch? Also consider
>>>
>>> #pragma GCC optimize ("tree-vectorize")
>>> void foo () { ...}
>>>
>>> #pragma GCC optimize ("tree-loop-distribution")
>>> void bar () {... }
>>>
>>> I'd expect bar to have both vectorization and loop distribution
>>> enabled? (note I didn't use push/pop here)
>>
>> Yes, yes and yes. I'm going to verify it.
>>
>>>
>>>> The situation with 'target' attribute is different. When parsing the attribute, we intentionally drop all existing target flags:
>>>>
>>>> $ cat -n gcc/config/i386/i386-options.c
>>>> ...
>>>> 1245 if (opt == IX86_FUNCTION_SPECIFIC_ARCH)
>>>> 1246 {
>>>> 1247 /* If arch= is set, clear all bits in x_ix86_isa_flags,
>>>> 1248 except for ISA_64BIT, ABI_64, ABI_X32, and CODE16
>>>> 1249 and all bits in x_ix86_isa_flags2. */
>>>> 1250 opts->x_ix86_isa_flags &= (OPTION_MASK_ISA_64BIT
>>>> 1251 | OPTION_MASK_ABI_64
>>>> 1252 | OPTION_MASK_ABI_X32
>>>> 1253 | OPTION_MASK_CODE16);
>>>> 1254 opts->x_ix86_isa_flags_explicit &= (OPTION_MASK_ISA_64BIT
>>>> 1255 | OPTION_MASK_ABI_64
>>>> 1256 | OPTION_MASK_ABI_X32
>>>> 1257 | OPTION_MASK_CODE16);
>>>> 1258 opts->x_ix86_isa_flags2 = 0;
>>>> 1259 opts->x_ix86_isa_flags2_explicit = 0;
>>>> 1260 }
>>>>
>>>> That seems logical because target attribute is used for e.g. ifunc multi-versioning and one needs
>>>> to be sure all existing ISA flags are dropped. However, I noticed clang behaves differently:
>>>>
>>>> $ cat hreset.c
>>>> #pragma GCC target "arch=geode"
>>>> #include <immintrin.h>
>>>> void foo(unsigned int eax)
>>>> {
>>>> _hreset (eax);
>>>> }
>>>>
>>>> $ clang hreset.c -mhreset -c -O2 -m32
>>>> $ gcc hreset.c -mhreset -c -O2 -m32
>>>> In file included from /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/x86gprintrin.h:97,
>>>> from /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/immintrin.h:27,
>>>> from hreset.c:2:
>>>> hreset.c: In function ‘foo’:
>>>> /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/hresetintrin.h:39:1: error: inlining failed in call to ‘always_inline’ ‘_hreset’: target specific option mismatch
>>>> 39 | _hreset (unsigned int __EAX)
>>>> | ^~~~~~~
>>>> hreset.c:5:3: note: called from here
>>>> 5 | _hreset (eax);
>>>> | ^~~~~~~~~~~~~
>>>>
>>>> Anyway, I think the current target attribute handling should be preserved.
>>>
>>> I think this and the -O1 argument above suggests that there should be
>>> a way to distinguish
>>> two modes - add to the active set of options and starting from scratch.
>>
>> Doing that would make it even crazier :)
>>
>>>
>>> Maybe it's over-designing things but do we want to preserve the
>>> existing behavior
>>> and instead add optimize ("+ftree-vectorize") and target ("+avx2") as
>>> a way to amend
>>> the state?
>>
>> I prefer doing only the append mode (when one can still use -fno-foo for an explicit
>> drop of a flag).
>>
>>>
>>> OTOH as we're missing global_options re-init even with your patch we won't get
>>> the defaults correct (aka what toplev::main does with init_options_struct and
>>> the corresponding langhook). Likewise if lang_hooks.init_options performs any
>>> defaulting a later flag overrides and we override that with optimize() that
>>> doesn't work - I'm thinking of things like flag_complex_method and -fcx-* flags.
>>> So -O2 -fcx-fortran-rules on the command-line and optimize
>>> ("no-cx-fortran-rules")
>>> to cancel the -fcx-fortran-rules switch wouldn't work?
>>
>> In most cases it works. What's problematic about -fcx-fortran-rules is that it sets
>>
>> /* With -fcx-limited-range, we do cheap and quick complex arithmetic. */
>> if (flag_cx_limited_range)
>> flag_complex_method = 0;
>>
>> /* With -fcx-fortran-rules, we do something in-between cheap and C99. */
>> if (flag_cx_fortran_rules)
>> flag_complex_method = 1;
>>
>> in process_options (called only for cmdline options) and not in
>>
>> /* After all options at LOC have been read into OPTS and OPTS_SET,
>> finalize settings of those options and diagnose incompatible
>> combinations. */
>> void
>> finish_options (struct gcc_options *opts, struct gcc_options *opts_set,
>> location_t loc)
>>
>> which is a place which is called once options are decoded (both from cmdline and when
>> combined with a attribute or pragma):
>
> Yes, and that flag_complex_method is initialized via the langhook
> mentioned, for example
> c-family/c-opts.c has
>
> /* Initialize options structure OPTS. */
> void
> c_common_init_options_struct (struct gcc_options *opts)
> {
> opts->x_flag_exceptions = c_dialect_cxx ();
> opts->x_warn_pointer_arith = c_dialect_cxx ();
> opts->x_warn_write_strings = c_dialect_cxx ();
> opts->x_flag_warn_unused_result = true;
>
> /* By default, C99-like requirements for complex multiply and divide. */
> opts->x_flag_complex_method = 2;
> }
>
> so an attempt to "cancel" a command-line option that adjusted any of
> the above will not
> work because we're not re-initializing global_options appropriately.
> But maybe we can
> just do that? That is, call
>
> /* Initialize global options structures; this must be repeated for
> each structure used for parsing options. */
> init_options_struct (&global_options, &global_options_set);
> lang_hooks.init_options_struct (&global_options);
>
> and
>
> /* Perform language-specific options initialization. */
> lang_hooks.init_options (save_decoded_options_count, save_decoded_options);
>
> as done by toplev.c? Or if we do not want to do that store that state away
> to an 'initialized_options/initialized_options_set' set of vars we can
> copy from?
I think this one is a bit orthogonal to the suggested changes, right? We can hopefully implement it incrementally.
Martin
>
>> #1 0x0000000001b69da3 in finish_options (opts=opts@entry=0x26b13e0 <global_options>, opts_set=opts_set@entry=0x26afdc0 <global_options_set>, loc=loc@entry=258754) at /home/marxin/Programming/gcc/gcc/opts.c:1303
>>
>> #2 0x0000000000dd9e3b in decode_options (opts=0x26b13e0 <global_options>, opts_set=0x26afdc0 <global_options_set>, decoded_options=<optimized out>, decoded_options_count=decoded_options_count@entry=4, loc=258754, dc=0x26b2b00 <global_diagnostic_context>,
>>
>> target_option_override_hook=0x0) at /home/marxin/Programming/gcc/gcc/opts-global.c:324
>>
>> #3 0x0000000000921144 in parse_optimize_options (args=args@entry=<tree_list 0x7ffff76e1910>, attr_p=attr_p@entry=false) at /home/marxin/Programming/gcc/gcc/c-family/c-common.c:5921
>>
>> #4 0x0000000000972aab in handle_pragma_optimize (dummy=<optimized out>) at /home/marxin/Programming/gcc/gcc/c-family/c-pragma.c:993
>>
>> #5 0x00000000008e3118 in c_parser_pragma (parser=0x7ffff7fbeab0, context=pragma_external, if_p=0x0) at /home/marxin/Programming/gcc/gcc/c/c-parser.c:12573
>>
>>
>> Martin
>>
>>>
>>> Thanks,
>>> Richard.
>>>
>>>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>>>
>>>> Ready to be installed?
>>>> Thanks,
>>>> Martin
>>>>
>>>>>
>>>>> I'm also planning to take a look at the target macro/attribute, I expect similar problems:
>>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97469
>>>>>
>>>>> Thoughts?
>>>>> Thanks,
>>>>> Martin
>>>>
>>
next prev parent reply other threads:[~2021-08-26 12:39 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-10-23 11:47 Martin Liška
2020-11-03 13:27 ` Richard Biener
2020-11-03 13:34 ` Jakub Jelinek
2020-11-03 13:40 ` Richard Biener
2020-11-09 10:35 ` Martin Liška
2020-11-26 13:56 ` Martin Liška
2020-12-07 11:03 ` Martin Liška
2021-01-11 13:10 ` Martin Liška
2020-11-09 10:27 ` Martin Liška
2020-11-06 17:34 ` Jeff Law
2020-11-09 10:36 ` Martin Liška
2021-07-01 13:13 ` Martin Liška
2021-08-10 15:52 ` Martin Liška
2021-08-24 11:06 ` Martin Liška
2021-08-24 12:13 ` Richard Biener
2021-08-24 13:04 ` Martin Liška
2021-08-26 11:04 ` Richard Biener
2021-08-26 12:39 ` Martin Liška [this message]
2021-08-26 13:20 ` Richard Biener
2021-08-27 8:35 ` Martin Liška
2021-08-27 9:05 ` Richard Biener
2021-09-13 13:52 ` Martin Liška
2021-09-19 5:46 ` Jeff Law
2021-09-06 11:37 ` [PATCH] flag_complex_method: support optimize attribute Martin Liška
2021-09-06 11:46 ` Jakub Jelinek
2021-09-06 12:16 ` Richard Biener
2021-09-06 12:24 ` Jakub Jelinek
2021-09-07 9:42 ` Martin Liška
2021-09-13 13:32 ` Martin Liška
2021-09-19 14:45 ` Jeff Law
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=e3890c97-3c3c-0692-43b7-e115b46cf91e@suse.cz \
--to=mliska@suse.cz \
--cc=gcc-patches@gcc.gnu.org \
--cc=jakub@redhat.com \
--cc=matz@suse.de \
--cc=richard.guenther@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).