From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.220.29]) by sourceware.org (Postfix) with ESMTPS id AF45A3858402 for ; Thu, 26 Aug 2021 12:39:58 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org AF45A3858402 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=suse.cz Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.cz Received: from imap1.suse-dmz.suse.de (imap1.suse-dmz.suse.de [192.168.254.73]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id A8AD3201A9; Thu, 26 Aug 2021 12:39:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1629981597; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ElBzGsGEkPpUJHuygu72bvD5C6N4rVN6Nvh28gZevJE=; b=xQNoDz+fzF9TG3M9lOzUOj+qaFtZ9cvYrg84rmXSeCrg/FLAWnRioZ5KZsf29piS46e/Hg kbMGYzUU0oq32Tur7ickHskHXAdIfPeSM9W/x7HrrmxlwHCVnVZooNIdxsG3sZPMxsQd6W dX03TMhD8r6g2QQQADiEdOuUhlaSgrg= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1629981597; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ElBzGsGEkPpUJHuygu72bvD5C6N4rVN6Nvh28gZevJE=; b=K8PWgqtSR+Cv1LtXAkumAkLpUiQMZ7kL3F2hVlFJZjvaXm6o/xMcyHMGmeDwr80wY5JpZR hwdMyNkurhRYhGBw== Received: from imap1.suse-dmz.suse.de (imap1.suse-dmz.suse.de [192.168.254.73]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap1.suse-dmz.suse.de (Postfix) with ESMTPS id 7FA58132FD; Thu, 26 Aug 2021 12:39:57 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap1.suse-dmz.suse.de with ESMTPSA id 2JguHp2LJ2E7SAAAGKfGzw (envelope-from ); Thu, 26 Aug 2021 12:39:57 +0000 Message-ID: Date: Thu, 26 Aug 2021 14:39:57 +0200 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.0.1 Subject: Re: [PATCH] Optimize macro: make it more predictable Content-Language: en-US To: Richard Biener Cc: GCC Patches , Jakub Jelinek , Michael Matz References: <82e71ebf-7b2e-67e7-1f08-ea525deee4cb@suse.cz> <1dfa7226-3056-d215-4626-01126d428891@suse.cz> <88fde73d-d36a-2010-5837-30f2943d9dad@suse.cz> From: =?UTF-8?Q?Martin_Li=c5=a1ka?= In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-6.0 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, KAM_SHORT, NICE_REPLY_A, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Aug 2021 12:40:09 -0000 On 8/26/21 13:04, Richard Biener wrote: > On Tue, Aug 24, 2021 at 3:04 PM Martin Liška wrote: >> >> On 8/24/21 14:13, Richard Biener wrote: >>> On Thu, Jul 1, 2021 at 3:13 PM Martin Liška wrote: >>>> >>>> On 10/23/20 1:47 PM, Martin Liška wrote: >>>>> Hey. >>>> >>>> Hello. >>>> >>>> I deferred the patch for GCC 12. Since the time, I messed up with options >>>> I feel more familiar with the option handling. So ... >>>> >>>>> >>>>> This is a follow-up of the discussion that happened in thread about no_stack_protector >>>>> attribute: https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545916.html >>>>> >>>>> The current optimize attribute works in the following way: >>>>> - 1) we take current global_options as base >>>>> - 2) maybe_default_options is called for the currently selected optimization level, which >>>>> means all rules in default_options_table are executed >>>>> - 3) attribute values are applied (via decode_options) >>>>> >>>>> So the step 2) is problematic: in case of -O2 -fno-omit-frame-pointer and __attribute__((optimize("-fno-stack-protector"))) >>>>> ends basically with -O2 -fno-stack-protector because -fno-omit-frame-pointer is default: >>>>> /* -O1 and -Og optimizations. */ >>>>> { OPT_LEVELS_1_PLUS, OPT_fomit_frame_pointer, NULL, 1 }, >>>>> >>>>> My patch handled and the current optimize attribute really behaves that same as appending attribute value >>>>> to the command line. So far so good. We should also reflect that in documentation entry which is quite >>>>> vague right now: >>>> >>>> ^^^ all these are still valid arguments, plus I'm adding a new test-case that tests that. >> >> Hey. >> >>> There is also handle_common_deferred_options that's not called so any >>> option processed there should >>> probably be excempt from being set/unset in the optimize attribute? >> >> Looking at the handled options, they have all Defer type and not Optimization. >> Thus we should be fine. >> >>> >>>>> >>>>> """ >>>>> The optimize attribute is used to specify that a function is to be compiled with different optimization options than specified on the command line. >>>>> """ >>>> >>>> I addressed that with documentation changes, should be more clear to users. Moreover, I noticed that we declare 'optimize' attribute >>>> as something not for a production use: >>>> >>>> "The optimize attribute should be used for debugging purposes only. It is not suitable in production code." >>>> >>>> Are we sure about the statement? I know that e.g. glibc uses that. >>> >>> Well, given we're changing behavior now that warning looks valid ;) >> >> Yeah! True. >> >>> I'll also note that >>> >>> "The optimize attribute arguments of a function behave >>> as if they were added to the command line options." >>> >>> is still likely untrue, the global state init is complicated ;) >> >> Sure, but the situation should be much closer to it :) Do you have a better wording? > > Maybe "The intent is that the optimize attribute behaves as if the > arguments were > appended to the command line." > > But as said originally below I'm not sure that this behavior is what people > expect. If I say optimize("fno-tree-vectorize") then I do expect that to retain > other command-line arguments. Yep, that's clear! > If I say optimize(1) I'm not sure I would expect > -ftree-vectorize on the command-line to prevail ;) That's a very good question: theoretically yes, I can imagine various scenarios how can -Ox be used in an attribute: 1) -O0 for a function that e.g. violate strict aliasing, or when one wants to debug a fn in an optimized binary 2) -Ofast for a function which is performance critical On the other hand, my original motivation was a kernel compiler with: -O2 -fno-omit-frame-pointer and __attribute__((optimize("-fno-stack-protector"))) It's not intuitive that you end up with -O2 -fno-stack-protector (and -fomit-frame-pointer). > > Is google code search still a thing? Can one search all of github > somehow? I really > wonder how 'optimize' is used at the moment. There are quite some optimize > attributes in the target part of the testsuite for example. And > testsuite/gcc.dg/vect/bb-slp-41.c I can investigate that. > suggests that optimize("-fno-tree-fre") preserves at least the > optimization level? > > There's always the possibility to preserve the current behavior for 'optimize' Yes, but we should somehow document the current weird behavior when it comes to -Ox options. > and add a new 'add_optimize' attribute that does the other thing. Considering that ... > >>> >>> >>>>> >>>>> and we may want to handle -Ox in the attribute in a special way. I guess many macro/pragma users expect that >>>>> >>>>> -O2 -ftree-vectorize and __attribute__((optimize(1))) will end with -O1 and not >>>>> with -ftree-vectorize -O1 ? >> >> This is my older suggestion and it will likely make it even much complicated. So ... > > In theory it's just dropping the command-line but yes, the question is > what happens > to the non-optimization part of the command-line. We obviously shouldn't drop > -std=gnu14 and it's side-effects. So yes, even documenting exactly what this > would do is difficult ;) > >>> >>> As implemented your patch seems to turn it into -ftree-vectorize -O1. >> >> Yes. >> >>> IIRC multiple optimize attributes apply >>> ontop of each other, and it makes sense to me that optimize (2), >>> optimize ("tree-vectorize") behaves the same >>> as optimize (2, "tree-vectorize"). I'm not sure this is still the >>> case after your patch? Also consider >>> >>> #pragma GCC optimize ("tree-vectorize") >>> void foo () { ...} >>> >>> #pragma GCC optimize ("tree-loop-distribution") >>> void bar () {... } >>> >>> I'd expect bar to have both vectorization and loop distribution >>> enabled? (note I didn't use push/pop here) >> >> Yes, yes and yes. I'm going to verify it. >> >>> >>>> The situation with 'target' attribute is different. When parsing the attribute, we intentionally drop all existing target flags: >>>> >>>> $ cat -n gcc/config/i386/i386-options.c >>>> ... >>>> 1245 if (opt == IX86_FUNCTION_SPECIFIC_ARCH) >>>> 1246 { >>>> 1247 /* If arch= is set, clear all bits in x_ix86_isa_flags, >>>> 1248 except for ISA_64BIT, ABI_64, ABI_X32, and CODE16 >>>> 1249 and all bits in x_ix86_isa_flags2. */ >>>> 1250 opts->x_ix86_isa_flags &= (OPTION_MASK_ISA_64BIT >>>> 1251 | OPTION_MASK_ABI_64 >>>> 1252 | OPTION_MASK_ABI_X32 >>>> 1253 | OPTION_MASK_CODE16); >>>> 1254 opts->x_ix86_isa_flags_explicit &= (OPTION_MASK_ISA_64BIT >>>> 1255 | OPTION_MASK_ABI_64 >>>> 1256 | OPTION_MASK_ABI_X32 >>>> 1257 | OPTION_MASK_CODE16); >>>> 1258 opts->x_ix86_isa_flags2 = 0; >>>> 1259 opts->x_ix86_isa_flags2_explicit = 0; >>>> 1260 } >>>> >>>> That seems logical because target attribute is used for e.g. ifunc multi-versioning and one needs >>>> to be sure all existing ISA flags are dropped. However, I noticed clang behaves differently: >>>> >>>> $ cat hreset.c >>>> #pragma GCC target "arch=geode" >>>> #include >>>> void foo(unsigned int eax) >>>> { >>>> _hreset (eax); >>>> } >>>> >>>> $ clang hreset.c -mhreset -c -O2 -m32 >>>> $ gcc hreset.c -mhreset -c -O2 -m32 >>>> In file included from /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/x86gprintrin.h:97, >>>> from /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/immintrin.h:27, >>>> from hreset.c:2: >>>> hreset.c: In function ‘foo’: >>>> /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/hresetintrin.h:39:1: error: inlining failed in call to ‘always_inline’ ‘_hreset’: target specific option mismatch >>>> 39 | _hreset (unsigned int __EAX) >>>> | ^~~~~~~ >>>> hreset.c:5:3: note: called from here >>>> 5 | _hreset (eax); >>>> | ^~~~~~~~~~~~~ >>>> >>>> Anyway, I think the current target attribute handling should be preserved. >>> >>> I think this and the -O1 argument above suggests that there should be >>> a way to distinguish >>> two modes - add to the active set of options and starting from scratch. >> >> Doing that would make it even crazier :) >> >>> >>> Maybe it's over-designing things but do we want to preserve the >>> existing behavior >>> and instead add optimize ("+ftree-vectorize") and target ("+avx2") as >>> a way to amend >>> the state? >> >> I prefer doing only the append mode (when one can still use -fno-foo for an explicit >> drop of a flag). >> >>> >>> OTOH as we're missing global_options re-init even with your patch we won't get >>> the defaults correct (aka what toplev::main does with init_options_struct and >>> the corresponding langhook). Likewise if lang_hooks.init_options performs any >>> defaulting a later flag overrides and we override that with optimize() that >>> doesn't work - I'm thinking of things like flag_complex_method and -fcx-* flags. >>> So -O2 -fcx-fortran-rules on the command-line and optimize >>> ("no-cx-fortran-rules") >>> to cancel the -fcx-fortran-rules switch wouldn't work? >> >> In most cases it works. What's problematic about -fcx-fortran-rules is that it sets >> >> /* With -fcx-limited-range, we do cheap and quick complex arithmetic. */ >> if (flag_cx_limited_range) >> flag_complex_method = 0; >> >> /* With -fcx-fortran-rules, we do something in-between cheap and C99. */ >> if (flag_cx_fortran_rules) >> flag_complex_method = 1; >> >> in process_options (called only for cmdline options) and not in >> >> /* After all options at LOC have been read into OPTS and OPTS_SET, >> finalize settings of those options and diagnose incompatible >> combinations. */ >> void >> finish_options (struct gcc_options *opts, struct gcc_options *opts_set, >> location_t loc) >> >> which is a place which is called once options are decoded (both from cmdline and when >> combined with a attribute or pragma): > > Yes, and that flag_complex_method is initialized via the langhook > mentioned, for example > c-family/c-opts.c has > > /* Initialize options structure OPTS. */ > void > c_common_init_options_struct (struct gcc_options *opts) > { > opts->x_flag_exceptions = c_dialect_cxx (); > opts->x_warn_pointer_arith = c_dialect_cxx (); > opts->x_warn_write_strings = c_dialect_cxx (); > opts->x_flag_warn_unused_result = true; > > /* By default, C99-like requirements for complex multiply and divide. */ > opts->x_flag_complex_method = 2; > } > > so an attempt to "cancel" a command-line option that adjusted any of > the above will not > work because we're not re-initializing global_options appropriately. > But maybe we can > just do that? That is, call > > /* Initialize global options structures; this must be repeated for > each structure used for parsing options. */ > init_options_struct (&global_options, &global_options_set); > lang_hooks.init_options_struct (&global_options); > > and > > /* Perform language-specific options initialization. */ > lang_hooks.init_options (save_decoded_options_count, save_decoded_options); > > as done by toplev.c? Or if we do not want to do that store that state away > to an 'initialized_options/initialized_options_set' set of vars we can > copy from? I think this one is a bit orthogonal to the suggested changes, right? We can hopefully implement it incrementally. Martin > >> #1 0x0000000001b69da3 in finish_options (opts=opts@entry=0x26b13e0 , opts_set=opts_set@entry=0x26afdc0 , loc=loc@entry=258754) at /home/marxin/Programming/gcc/gcc/opts.c:1303 >> >> #2 0x0000000000dd9e3b in decode_options (opts=0x26b13e0 , opts_set=0x26afdc0 , decoded_options=, decoded_options_count=decoded_options_count@entry=4, loc=258754, dc=0x26b2b00 , >> >> target_option_override_hook=0x0) at /home/marxin/Programming/gcc/gcc/opts-global.c:324 >> >> #3 0x0000000000921144 in parse_optimize_options (args=args@entry=, attr_p=attr_p@entry=false) at /home/marxin/Programming/gcc/gcc/c-family/c-common.c:5921 >> >> #4 0x0000000000972aab in handle_pragma_optimize (dummy=) at /home/marxin/Programming/gcc/gcc/c-family/c-pragma.c:993 >> >> #5 0x00000000008e3118 in c_parser_pragma (parser=0x7ffff7fbeab0, context=pragma_external, if_p=0x0) at /home/marxin/Programming/gcc/gcc/c/c-parser.c:12573 >> >> >> Martin >> >>> >>> Thanks, >>> Richard. >>> >>>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests. >>>> >>>> Ready to be installed? >>>> Thanks, >>>> Martin >>>> >>>>> >>>>> I'm also planning to take a look at the target macro/attribute, I expect similar problems: >>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97469 >>>>> >>>>> Thoughts? >>>>> Thanks, >>>>> Martin >>>> >>