From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.220.29]) by sourceware.org (Postfix) with ESMTPS id C0B043886C4F for ; Thu, 1 Jul 2021 13:13:22 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org C0B043886C4F Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=suse.cz Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.cz Received: from imap.suse.de (imap-alt.suse-dmz.suse.de [192.168.254.47]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 9DD53204E2; Thu, 1 Jul 2021 13:13:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1625145201; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=/2sgfzBXq4T/i3Kq8R1o77XtJMCaI3qxniXvKn2OTwY=; b=WgfSsxhe5zxwPMYlrs4QCqrEkqKT+Cf0wku9E4Nlq/6BHL2E329DCkyquDmZzQ8Q7XLbTc IYIVTIZgbDjMgFQoM1u8yY8R9kUZYDrQyeDypU0J6bqVo+iyW82F/SO3NoYs2PsQPG9eQE k+dg1uwbG9zhn2/pEiNPIgbfu8YtYDg= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1625145201; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=/2sgfzBXq4T/i3Kq8R1o77XtJMCaI3qxniXvKn2OTwY=; b=O6TY+eyY2FLzXIGRkxYuh4guHMCoIG2Gp6tZ/s/skP3SEE1inT3f3K8o9jF/lPsEALc0cU kV5O/IsCDaC6uXDg== Received: from imap3-int (imap-alt.suse-dmz.suse.de [192.168.254.47]) by imap.suse.de (Postfix) with ESMTP id 7810A11CC0; Thu, 1 Jul 2021 13:13:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1625145201; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=/2sgfzBXq4T/i3Kq8R1o77XtJMCaI3qxniXvKn2OTwY=; b=WgfSsxhe5zxwPMYlrs4QCqrEkqKT+Cf0wku9E4Nlq/6BHL2E329DCkyquDmZzQ8Q7XLbTc IYIVTIZgbDjMgFQoM1u8yY8R9kUZYDrQyeDypU0J6bqVo+iyW82F/SO3NoYs2PsQPG9eQE k+dg1uwbG9zhn2/pEiNPIgbfu8YtYDg= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1625145201; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=/2sgfzBXq4T/i3Kq8R1o77XtJMCaI3qxniXvKn2OTwY=; b=O6TY+eyY2FLzXIGRkxYuh4guHMCoIG2Gp6tZ/s/skP3SEE1inT3f3K8o9jF/lPsEALc0cU kV5O/IsCDaC6uXDg== Received: from director2.suse.de ([192.168.254.72]) by imap3-int with ESMTPSA id UqRTHHG/3WBqPgAALh3uQQ (envelope-from ); Thu, 01 Jul 2021 13:13:21 +0000 Subject: Re: [PATCH] Optimize macro: make it more predictable From: =?UTF-8?Q?Martin_Li=c5=a1ka?= To: gcc-patches@gcc.gnu.org Cc: Jakub Jelinek , Michael Matz References: <82e71ebf-7b2e-67e7-1f08-ea525deee4cb@suse.cz> Message-ID: <1dfa7226-3056-d215-4626-01126d428891@suse.cz> Date: Thu, 1 Jul 2021 15:13:21 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0 MIME-Version: 1.0 In-Reply-To: <82e71ebf-7b2e-67e7-1f08-ea525deee4cb@suse.cz> Content-Type: multipart/mixed; boundary="------------0F5B2607330F8EFA5062BC53" Content-Language: en-US X-Spam-Status: No, score=-10.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, NICE_REPLY_A, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 01 Jul 2021 13:13:25 -0000 This is a multi-part message in MIME format. --------------0F5B2607330F8EFA5062BC53 Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 8bit On 10/23/20 1:47 PM, Martin Liška wrote: > Hey. Hello. I deferred the patch for GCC 12. Since the time, I messed up with options I feel more familiar with the option handling. So ... > > This is a follow-up of the discussion that happened in thread about no_stack_protector > attribute: https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545916.html > > The current optimize attribute works in the following way: > - 1) we take current global_options as base > - 2) maybe_default_options is called for the currently selected optimization level, which >      means all rules in default_options_table are executed > - 3) attribute values are applied (via decode_options) > > So the step 2) is problematic: in case of -O2 -fno-omit-frame-pointer and __attribute__((optimize("-fno-stack-protector"))) > ends basically with -O2 -fno-stack-protector because -fno-omit-frame-pointer is default: >     /* -O1 and -Og optimizations.  */ >     { OPT_LEVELS_1_PLUS, OPT_fomit_frame_pointer, NULL, 1 }, > > My patch handled and the current optimize attribute really behaves that same as appending attribute value > to the command line. So far so good. We should also reflect that in documentation entry which is quite > vague right now: ^^^ all these are still valid arguments, plus I'm adding a new test-case that tests that. > > """ > The optimize attribute is used to specify that a function is to be compiled with different optimization options than specified on the command line. > """ I addressed that with documentation changes, should be more clear to users. Moreover, I noticed that we declare 'optimize' attribute as something not for a production use: "The optimize attribute should be used for debugging purposes only. It is not suitable in production code." Are we sure about the statement? I know that e.g. glibc uses that. > > and we may want to handle -Ox in the attribute in a special way. I guess many macro/pragma users expect that > > -O2 -ftree-vectorize and __attribute__((optimize(1))) will end with -O1 and not > with -ftree-vectorize -O1 ? The situation with 'target' attribute is different. When parsing the attribute, we intentionally drop all existing target flags: $ cat -n gcc/config/i386/i386-options.c ... 1245 if (opt == IX86_FUNCTION_SPECIFIC_ARCH) 1246 { 1247 /* If arch= is set, clear all bits in x_ix86_isa_flags, 1248 except for ISA_64BIT, ABI_64, ABI_X32, and CODE16 1249 and all bits in x_ix86_isa_flags2. */ 1250 opts->x_ix86_isa_flags &= (OPTION_MASK_ISA_64BIT 1251 | OPTION_MASK_ABI_64 1252 | OPTION_MASK_ABI_X32 1253 | OPTION_MASK_CODE16); 1254 opts->x_ix86_isa_flags_explicit &= (OPTION_MASK_ISA_64BIT 1255 | OPTION_MASK_ABI_64 1256 | OPTION_MASK_ABI_X32 1257 | OPTION_MASK_CODE16); 1258 opts->x_ix86_isa_flags2 = 0; 1259 opts->x_ix86_isa_flags2_explicit = 0; 1260 } That seems logical because target attribute is used for e.g. ifunc multi-versioning and one needs to be sure all existing ISA flags are dropped. However, I noticed clang behaves differently: $ cat hreset.c #pragma GCC target "arch=geode" #include void foo(unsigned int eax) { _hreset (eax); } $ clang hreset.c -mhreset -c -O2 -m32 $ gcc hreset.c -mhreset -c -O2 -m32 In file included from /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/x86gprintrin.h:97, from /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/immintrin.h:27, from hreset.c:2: hreset.c: In function ‘foo’: /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/12.0.0/include/hresetintrin.h:39:1: error: inlining failed in call to ‘always_inline’ ‘_hreset’: target specific option mismatch 39 | _hreset (unsigned int __EAX) | ^~~~~~~ hreset.c:5:3: note: called from here 5 | _hreset (eax); | ^~~~~~~~~~~~~ Anyway, I think the current target attribute handling should be preserved. Patch can bootstrap on x86_64-linux-gnu and survives regression tests. Ready to be installed? Thanks, Martin > > I'm also planning to take a look at the target macro/attribute, I expect similar problems: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97469 > > Thoughts? > Thanks, > Martin --------------0F5B2607330F8EFA5062BC53 Content-Type: text/x-patch; charset=UTF-8; name="0001-Append-target-optimize-attr-to-the-current-cmdline.patch" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename*0="0001-Append-target-optimize-attr-to-the-current-cmdline.patc"; filename*1="h" >From eaa1892cbe32c6fe73de7708aa17be2d3917bceb Mon Sep 17 00:00:00 2001 From: Martin Liska Date: Wed, 2 Jun 2021 08:44:37 +0200 Subject: [PATCH] Append target/optimize attr to the current cmdline. gcc/c-family/ChangeLog: * c-common.c (parse_optimize_options): Combine optimize options with what was provided on the command line. gcc/ChangeLog: * toplev.c (toplev::main): Save decoded optimization options. * toplev.h (save_opt_decoded_options): New. * doc/extend.texi: Be more clear about optimize and target attributes. gcc/testsuite/ChangeLog: * gcc.target/i386/avx512er-vrsqrt28ps-3.c: Disable fast math. * gcc.target/i386/avx512er-vrsqrt28ps-5.c: Likewise. * gcc.target/i386/attr-optimize.c: New test. --- gcc/c-family/c-common.c | 17 +++++++++++-- gcc/doc/extend.texi | 8 +++++-- gcc/testsuite/gcc.target/i386/attr-optimize.c | 24 +++++++++++++++++++ .../gcc.target/i386/avx512er-vrsqrt28ps-3.c | 2 +- .../gcc.target/i386/avx512er-vrsqrt28ps-5.c | 2 +- gcc/toplev.c | 8 +++++++ gcc/toplev.h | 1 + 7 files changed, 56 insertions(+), 6 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/attr-optimize.c diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c index 681fcc972f4..f4e56653585 100644 --- a/gcc/c-family/c-common.c +++ b/gcc/c-family/c-common.c @@ -5901,9 +5901,22 @@ parse_optimize_options (tree args, bool attr_p) j++; } decoded_options_count = j; - /* And apply them. */ + + /* Merge the decoded options with save_decoded_options. */ + unsigned save_opt_count = save_opt_decoded_options.length (); + unsigned merged_decoded_options_count + = save_opt_count + decoded_options_count; + cl_decoded_option *merged_decoded_options + = XNEWVEC (cl_decoded_option, merged_decoded_options_count); + + for (unsigned i = 0; i < save_opt_count; ++i) + merged_decoded_options[i] = save_opt_decoded_options[i]; + for (unsigned i = 0; i < decoded_options_count; ++i) + merged_decoded_options[save_opt_count + i] = decoded_options[i]; + + /* And apply them. */ decode_options (&global_options, &global_options_set, - decoded_options, decoded_options_count, + merged_decoded_options, merged_decoded_options_count, input_location, global_dc, NULL); free (decoded_options); diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi index 8fc66d626d8..3b501a8be3a 100644 --- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -3593,7 +3593,10 @@ take function pointer arguments. @cindex @code{optimize} function attribute The @code{optimize} attribute is used to specify that a function is to be compiled with different optimization options than specified on the -command line. Valid arguments are constant non-negative integers and +command line. The optimize attribute arguments of a function behave +as if they were added to the command line options. + +Valid arguments are constant non-negative integers and strings. Each numeric argument specifies an optimization @var{level}. Each @var{string} argument consists of one or more comma-separated substrings. Each substring that begins with the letter @code{O} refers @@ -3797,7 +3800,8 @@ This attribute prevents stack protection code for the function. Multiple target back ends implement the @code{target} attribute to specify that a function is to be compiled with different target options than specified on the -command line. One or more strings can be provided as arguments. +command line. The original target command line options are ignored. +One or more strings can be provided as arguments. Each string consists of one or more comma-separated suffixes to the @code{-m} prefix jointly forming the name of a machine-dependent option. @xref{Submodel Options,,Machine-Dependent Options}. diff --git a/gcc/testsuite/gcc.target/i386/attr-optimize.c b/gcc/testsuite/gcc.target/i386/attr-optimize.c new file mode 100644 index 00000000000..f5db028f1fd --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/attr-optimize.c @@ -0,0 +1,24 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-O1 -ftree-slp-vectorize -march=znver1 -fdump-tree-optimized" } */ + +/* Use -O2, but -ftree-slp-vectorize option should be preserved and used. */ +#pragma GCC optimize "-O2" + +typedef struct { + long n[4]; +} secp256k1_fe; + +void *a; +int c; +static void +fn1(secp256k1_fe *p1, int p2) +{ + p1->n[0] = p1->n[1] = p2; +} +void +fn2() +{ + fn1(a, !c); +} + +/* { dg-final { scan-tree-dump { MEM \[[^]]*\] = } "optimized" } } */ diff --git a/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-3.c b/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-3.c index 1ba8172d6e3..40aefb50844 100644 --- a/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-3.c +++ b/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-3.c @@ -8,7 +8,7 @@ #define MAX 1000 #define EPS 0.00001 -__attribute__ ((noinline, optimize (1))) +__attribute__ ((noinline, optimize (1, "-fno-fast-math"))) void static compute_rsqrt_ref (float *a, float *r) { diff --git a/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-5.c b/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-5.c index e067a81e562..498f4d50aa5 100644 --- a/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-5.c +++ b/gcc/testsuite/gcc.target/i386/avx512er-vrsqrt28ps-5.c @@ -8,7 +8,7 @@ #define MAX 1000 #define EPS 0.00001 -__attribute__ ((noinline, optimize (1))) +__attribute__ ((noinline, optimize (1, "-fno-fast-math"))) void static compute_sqrt_ref (float *a, float *r) { diff --git a/gcc/toplev.c b/gcc/toplev.c index 43f1f7d345e..3e26ad97ff0 100644 --- a/gcc/toplev.c +++ b/gcc/toplev.c @@ -121,6 +121,9 @@ static bool no_backend; struct cl_decoded_option *save_decoded_options; unsigned int save_decoded_options_count; +/* Vector of saved Optimization decoded command line options. */ +auto_vec save_opt_decoded_options; + /* Used to enable -fvar-tracking, -fweb and -frename-registers according to optimize in process_options (). */ #define AUTODETECT_VALUE 2 @@ -2335,6 +2338,11 @@ toplev::main (int argc, char **argv) &save_decoded_options, &save_decoded_options_count); + /* Save Optimization decoded options. */ + for (unsigned i = 0; i < save_decoded_options_count; ++i) + if (cl_options[save_decoded_options[i].opt_index].flags & CL_OPTIMIZATION) + save_opt_decoded_options.safe_push (save_decoded_options[i]); + /* Perform language-specific options initialization. */ lang_hooks.init_options (save_decoded_options_count, save_decoded_options); diff --git a/gcc/toplev.h b/gcc/toplev.h index 175944c85b7..9a1d2e6986f 100644 --- a/gcc/toplev.h +++ b/gcc/toplev.h @@ -23,6 +23,7 @@ along with GCC; see the file COPYING3. If not see /* Decoded options, and number of such options. */ extern struct cl_decoded_option *save_decoded_options; extern unsigned int save_decoded_options_count; +extern auto_vec save_opt_decoded_options; class timer; -- 2.32.0 --------------0F5B2607330F8EFA5062BC53--