From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ej1-x62d.google.com (mail-ej1-x62d.google.com [IPv6:2a00:1450:4864:20::62d]) by sourceware.org (Postfix) with ESMTPS id 34F663893655 for ; Thu, 22 Apr 2021 12:16:59 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 34F663893655 Received: by mail-ej1-x62d.google.com with SMTP id r9so68451661ejj.3 for ; Thu, 22 Apr 2021 05:16:59 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=RV5IdLvilET1EADcrGHj9lBNZRAaaw9uEZdE8IirQIE=; b=OLLI7uBsXoC7fqcpifZIXQesLiDjLmgzKHuWa1pWwpAsR5P0bN1BBdnE3SBVd1+V5p I9E5Zb3VVJ0GvFBCbJrUlJmPwlH/RFQa4rUTHynK8GNYra7wdOOItv7rMPHHqrdHvOGm VhdEEblv2o9QI/xG0hD/W2xrv4csLhduEFCWvYcV94dWSTy5hDRd79l9qP8u/FelNXAP VOTVjxWZaoYOdipfQ9RR7LhCaQjRMgctc1TDOuJAiMRtpQi0pN5ZsLBg38PcQhc8Pm1G qjJBEDSH9TzpphKUgyAaNHwRlC6Tsql6oHkTqVZjubLL/p/HclJsl/zLkFKP7wG0uLUd 53Zg== X-Gm-Message-State: AOAM533LHfjH7qqJfNfHFi08PpGAAKYhaTPsonRcJ5GVDPER9T/J5iZB UD9HztvLkVoEpCNv7gbRrqGUqhfK40OoxpvXrgo= X-Google-Smtp-Source: ABdhPJwsilIQR3kcamwi/96+odhFhWeLp/RyrUF9mMUJ32IPLRcB+YNsw12piTcrixP0UAa5KVPfYsisIWG+Tyujbl0= X-Received: by 2002:a17:906:a449:: with SMTP id cb9mr3071002ejb.118.1619093818162; Thu, 22 Apr 2021 05:16:58 -0700 (PDT) MIME-Version: 1.0 References: <20210414223918.230495-1-hjl.tools@gmail.com> <20210414223918.230495-3-hjl.tools@gmail.com> <20210422090201.GO1179226@tucnak> In-Reply-To: From: Richard Biener Date: Thu, 22 Apr 2021 14:16:47 +0200 Message-ID: Subject: Re: [PATCH v4 2/2] x86: Add general_regs_only function attribute To: "H.J. Lu" Cc: Jakub Jelinek , GCC Patches Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-3.2 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Apr 2021 12:17:01 -0000 On Thu, Apr 22, 2021 at 1:58 PM H.J. Lu wrote: > > On Thu, Apr 22, 2021 at 4:23 AM Richard Biener > wrote: > > > > On Thu, Apr 22, 2021 at 12:30 PM Jakub Jelinek via Gcc-patches > > wrote: > > > > > > On Wed, Apr 21, 2021 at 06:01:07PM -0700, H.J. Lu via Gcc-patches wrote: > > > > How about this? > > > > > > > > @item general_regs_only > > > > @cindex @code{general_regs_only} function attribute, x86 > > > > The @code{general_regs_only} function attribute informs the compiler > > > > that the function uses only general purpose registers. When the > > > > compiler inlines a function with the @code{always_inline} attribute, > > > > target-specific compilation options may lead to inline failures. > > > > The @code{general_regs_only} attribute, if applicable, can be used > > > > together with the @code{always_inline} attribute to reduce inlining > > > > failure. > > > > > > I don't really like this attribute. > > > It is very specific to what you want to solve and doesn't address the > > > general problem, that always_inline means different things in different > > > code, and that is a problem for many targets, not just one. > > > > > > As has been written, in some cases it means inline always, error > > > whenever it is called indirectly which can't be optimized into a direct > > > call that can be inlined and error whenever the inlining fails for other > > > reasons. > > > > > > Another case, e.g. in the glibc fortify wrappers, is inline always > > > when the call is direct or an indirect call can be optimized into a direct > > > call and error when the inlining fails, but support indirect calls without > > > errors. > > > > > > Another case, which is most of the x86/aarch64/arm etc. intrinsics, is > > > inline always unless there is a target mismatch (roughly what is > > > actually implemented). > > > > > > Because from the always_inline attribute it is impossible to determine which > > > one of those it is (for the indirect calls the rule could be > > > gnu_inline extern inline means indirect calls are ok, anything else means > > > indirect calls are bad), we need new syntax to distinguish those cases. > > > > > > general_regs_only attribute doesn't seem to be it, e.g. for the glibc > > > fortify wrappers cases I don't see why we should forbid using floating point > > > in such inlines. > > > > > > So IMHO we need some new attribute for one of those, or optional parameter > > > to always_inline. > > > > > > For the intrinsic case, ideal would be if we could record which ISA flags > > > (or more generally which options) are required and which are not. Either > > > have some syntax where those would be explicitly specified in attribute (but > > > frankly that would be a maintainance nightmare), or derive those from > > > surrounding pragmas. Right now we have those wrapped in > > > #ifndef __AVX2__ > > > #pragma GCC push_options > > > #pragma GCC target("avx2") > > > #define __DISABLE_AVX2__ > > > #endif /* __AVX2__ */ > > > > > > ... > > > > > > #ifdef __DISABLE_AVX2__ > > > #undef __DISABLE_AVX2__ > > > #pragma GCC pop_options > > > #endif /* __DISABLE_AVX2__ */ > > > > > > The question is if the pragma GCC target right now behaves incrementally > > > or not, whether > > > #pragma GCC target("avx2") > > > adds -mavx2 to options if it was missing before and nothing otherwise, or if > > > it switches other options off. If it is incremental, we could e.g. try to > > > use the second least significant bit of global_options_set.x_* to mean > > > this option has been set explicitly by some surrounding #pragma GCC target. > > > The normal tests - global_options_set.x_flag_whatever could still work > > > fine because they wouldn't care if the option was explicit from anywhere > > > (command line or GCC target or target attribute) and just & 2 would mean > > > it was explicit from pragma GCC target; though there is the case of > > > bitfields... And then the inlining decision could check the & 2 flags to > > > see what is required and what is just from command line. > > > Or we can have some other pragma GCC that would be like target but would > > > have flags that are explicit (and could e.g. be more restricted, to ISA > > > options only, and let those use in addition to #pragma GCC target. > > > > I'm still curious as to what you think will break if always-inline does what > > it is documented to do. > > No wrong code. But the compiler will generate a different error message > at the later stage if the ISA for the intrinsic isn't enabled. But all issues at hand we're trying to fix are rejections of _valid_ code. Improving diagnostics for invalid code should be secondary to properly accepting valid code. There's nothing perfect here - for diagnostics we might want to consider an __attribute__((requires_isa(a,b,c))) which says that calling a function with such attribute requires the ISAs in the list to be active. Target code can then iterate over calls _before_ inlining to emit diagnostics. Note that the target flags in effect on the callee are not the same as what would be documented as required - if the CU is compiled with -mavx2 then even SSE intrinsics will have AVX2 "enabled" I think: #include __m128 __attribute__((target("no-avx,sse2"))) foo(__m128 A, __m128 B) { return _mm_sub_ss (A, B); } > gcc-10 -S t.c -mavx2 In file included from t.c:1: t.c: In function 'foo': /usr/lib64/gcc/x86_64-suse-linux/10/include/xmmintrin.h:134:1: error: inlining failed in call to 'always_inline' '_mm_sub_ss': target specific option mismatch 134 | _mm_sub_ss (__m128 __A, __m128 __B) | ^~~~~~~~~~ t.c:5:10: note: called from here 5 | return _mm_sub_ss (A, B); | ^~~~~~~~~~~~~~~~~ the above is obviously bogus (_mm_sub_ss inherited AVX2). Richard. > -- > H.J.