From mboxrd@z Thu Jan 1 00:00:00 1970
From: Richard Biener
Subject: Re: [x86-64] RFC: Add nosse abi attribute
Date: Mon, 10 Jul 2023 19:28:16 +0200
Cc: gcc-patches@gcc.gnu.org, Jan Hubicka
To: Michael Matz

> On 10.07.2023 at 17:56, Michael Matz via Gcc-patches wrote:
>
> Hello,
>
> the ELF psABI for x86-64 doesn't have any callee-saved SSE
> registers (there were actual reasons for that, but those don't
> matter anymore).  This starts to hurt some uses, as it means that
> as soon as you have a call (say to memmove/memcpy, even if
> implicit as libcall) in a loop that manipulates floating point
> or vector data you get saves/restores around those calls.
>
> But in reality many functions can be written such that they only need
> to clobber a subset of the 16 XMM registers (or do the save/restore
> themselves in the codepaths that need them, hello memcpy again).
> So we want to introduce a way to specify this, via an ABI attribute
> that basically says "doesn't clobber the high XMM regs".
>
> I've opted to do only the obvious: do something special only for
> xmm8 to xmm15, without a way to specify the clobber set in more detail.
> I think such half/half split is reasonable, and as I don't want to
> change the argument passing anyway (whose regs are always clobbered)
> there isn't that much wiggle room anyway.

What about xmm16 to xmm31, which AVX512 adds, and any possible future
additions to the register file?  (I suppose the "any" variant also covers
zmm, and also future widened variants?)  What about AVX512 mask registers?

> I chose to make it possible to write function definitions with that
> attribute with GCC adding the necessary callee save/restore code in
> the xlogue itself.  Carefully note that this is only possible for
> the SSE2 registers, as other parts of them would need instructions
> that are only optional.  When a function doesn't contain calls to
> unknown functions we can be a bit more lenient: we can make it so that
> GCC simply doesn't touch xmm8-15 at all, then no save/restore is
> necessary.  If a function contains calls then GCC can't know which
> parts of the XMM regset are clobbered by that, it may be parts
> which don't even exist yet (say until avx2048 comes out), so we must
> restrict ourselves to only save/restore the SSE2 parts and then of course
> can only claim to not clobber those parts.
>
> To that end I introduce actually two related attributes (for naming
> see below):
> * nosseclobber: claims (and ensures) that xmm8-15 aren't clobbered
> * noanysseclobber: claims (and ensures) that nothing of any of the
>   registers overlapping xmm8-15 is clobbered (not even future, as of
>   yet unknown, parts)
>
> Ensuring the first is simple: potentially add saves/restores in xlogue
> (e.g. when xmm8 is either used explicitly or implicitly by a call).
> Ensuring the second comes with more: we must also ensure that no
> functions are called that don't guarantee the same thing (in addition
> to just removing all xmm8-15 parts altogether from the available
> registers).
>
> See also the added testcases for what I intended to support.
>
> I chose to use the new target-independent function-abi facility for
> this.  I need some adjustments in generic code:
> * the "default_abi" is actually more like a "current" abi: it happily
>   changes its contents according to conditional_register_usage,
>   and other code assumes that such changes do propagate.
>   But if that conditional_reg_usage is actually done because the current
>   function is of a different ABI, then we must not change default_abi.
> * in insn_callee_abi we do look at a potential fndecl for a call
>   insn (only set when -fipa-ra), but that doesn't work for calls through
>   pointers and (as said) is optional.  So, also always look at the
>   called function's type (it's always recorded in the MEM_EXPR for
>   non-libcalls), before asking the target.
>   (The function-abi accessors working on trees were already doing that,
>   it's just the RTL accessor that missed this)
>
> Accordingly I also implement some more target hooks for function-abi.
> With that it's possible to also move the other ABI-influencing code
> of i386 to function-abi (ms_abi and friends).  I have not done so for
> this patch.
>
> Regarding the names of the attributes: gah!
> I've left them at my mediocre attempts of names in order to hopefully get
> input on better names :-)
>
> I would welcome any comments, about the names, the approach, the attempt
> at documenting the intricacies of these attributes and anything.
>
> FWIW, this particular patch was regstrapped on x86-64-linux
> with trunk from a week ago (and sniff-tested on current trunk).
>
>
> Ciao,
> Michael.
>
> diff --git a/gcc/config/i386/i386-options.cc b/gcc/config/i386/i386-options.cc
> index 37cb5a0dcc4..92358f4ac41 100644
> --- a/gcc/config/i386/i386-options.cc
> +++ b/gcc/config/i386/i386-options.cc
> @@ -3244,6 +3244,16 @@ ix86_set_indirect_branch_type (tree fndecl)
>      }
>  }
>
> +unsigned
> +ix86_fntype_to_abi_id (const_tree fntype)
> +{
> +  if (lookup_attribute ("nosseclobber", TYPE_ATTRIBUTES (fntype)))
> +    return ABI_LESS_SSE;
> +  if (lookup_attribute ("noanysseclobber", TYPE_ATTRIBUTES (fntype)))
> +    return ABI_NO_SSE;
> +  return ABI_DEFAULT;
> +}
> +
>  /* Establish appropriate back-end context for processing the function
>     FNDECL.  The argument might be NULL to indicate processing at top
>     level, outside of any function scope.  */
> @@ -3311,6 +3321,12 @@ ix86_set_current_function (tree fndecl)
>        else
>          TREE_TARGET_GLOBALS (new_tree) = save_target_globals_default_opts ();
>      }
> +
> +  unsigned prev_abi_id = 0;
> +  if (ix86_previous_fndecl)
> +    prev_abi_id = ix86_fntype_to_abi_id (TREE_TYPE (ix86_previous_fndecl));
> +  unsigned this_abi_id = ix86_fntype_to_abi_id (TREE_TYPE (fndecl));
> +
>    ix86_previous_fndecl = fndecl;
>
>    static bool prev_no_caller_saved_registers;
> @@ -3327,6 +3343,8 @@ ix86_set_current_function (tree fndecl)
>    else if (prev_no_caller_saved_registers
>             != cfun->machine->no_caller_saved_registers)
>      reinit_regs ();
> +  else if (prev_abi_id != this_abi_id)
> +    reinit_regs ();
>
>    if (cfun->machine->func_type != TYPE_NORMAL
>        || cfun->machine->no_caller_saved_registers)
> @@ -3940,6 +3958,10 @@ const struct attribute_spec ix86_attribute_table[] =
>      ix86_handle_fndecl_attribute, NULL },
>    { "nodirect_extern_access", 0, 0, true, false, false, false,
>      handle_nodirect_extern_access_attribute, NULL },
> +  { "nosseclobber", 0, 0, false, true, true, true,
> +    NULL, NULL },
> +  { "noanysseclobber", 0, 0, false, true, true, true,
> +    NULL, NULL },
>
>    /* End element.  */
>    { NULL, 0, 0, false, false, false, false, NULL, NULL }
> diff --git a/gcc/config/i386/i386-options.h b/gcc/config/i386/i386-options.h
> index 68666067fea..ad39661d852 100644
> --- a/gcc/config/i386/i386-options.h
> +++ b/gcc/config/i386/i386-options.h
> @@ -53,6 +53,7 @@ extern unsigned int ix86_incoming_stack_boundary;
>  extern char *ix86_offload_options (void);
>  extern void ix86_option_override (void);
>  extern void ix86_override_options_after_change (void);
> +unsigned ix86_fntype_to_abi_id (const_tree fntype);
>  void ix86_set_current_function (tree fndecl);
>  bool ix86_function_naked (const_tree fn);
>  void ix86_simd_clone_adjust (struct cgraph_node *node);
> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> index f0d6167e667..01387a3c38b 100644
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -487,6 +487,20 @@ ix86_conditional_register_usage (void)
>
>    CLEAR_HARD_REG_SET (reg_class_contents[(int)CLOBBERED_REGS]);
>
> +  /* If this function is one of the non-SSE-clobber variants, remove
> +     those from the call_used_regs.  */
> +  if (cfun && ix86_fntype_to_abi_id (TREE_TYPE (cfun->decl)) != ABI_DEFAULT)
> +    {
> +      for (i = XMM8_REG; i < XMM16_REG; i++)
> +        call_used_regs[i] = 0;
> +      if (ix86_fntype_to_abi_id (TREE_TYPE (cfun->decl)) == ABI_NO_SSE)
> +        {
> +          /* And from any accessible regs if this is ABI_NO_SSE.  */
> +          for (i = XMM8_REG; i < XMM16_REG; i++)
> +            CLEAR_HARD_REG_BIT (accessible_reg_set, i);
> +        }
> +    }
> +
>    for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
>      {
>        /* Set/reset conditionally defined registers from
> @@ -1119,6 +1133,8 @@ ix86_comp_type_attributes (const_tree type1, const_tree type2)
>    if (ix86_function_regparm (type1, NULL)
>        != ix86_function_regparm (type2, NULL))
>      return 0;
> +  if (ix86_fntype_to_abi_id (type1) != ix86_fntype_to_abi_id (type2))
> +    return 0;
>
>    return 1;
>  }
> @@ -1791,6 +1807,21 @@ init_cumulative_args (CUMULATIVE_ARGS *cum, /* Argument info to initialize */
>    cum->warn_sse = true;
>    cum->warn_mmx = true;
>
> +  if (ix86_fntype_to_abi_id (TREE_TYPE (cfun->decl)) == ABI_NO_SSE
> +      && (!fntype
> +          || ix86_fntype_to_abi_id (fntype) != ABI_NO_SSE))
> +    {
> +      if (fndecl)
> +        error ("%qD without attribute noanysseclobber cannot be "
> +               "called from functions with that attribute", fndecl);
> +      else if (fntype)
> +        error ("%qT without attribute noanysseclobber cannot be "
> +               "called from functions with that attribute", fntype);
> +      else
> +        error ("functions without attribute noanysseclobber cannot be "
> +               "called from functions with that attribute");
> +    }
> +
>    /* Because type might mismatch in between caller and callee, we need to
>       use actual type of function for local calls.
>       FIXME: cgraph_analyze can be told to actually record if function uses
> @@ -6514,7 +6545,7 @@ ix86_nsaved_sseregs (void)
>    int nregs = 0;
>    int regno;
>
> -  if (!TARGET_64BIT_MS_ABI)
> +  if (!TARGET_64BIT_MS_ABI && crtl->abi->id() == ABI_DEFAULT)
>      return 0;
>    for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
>      if (SSE_REGNO_P (regno) && ix86_save_reg (regno, true, true))
> @@ -20285,6 +20316,34 @@ ix86_hard_regno_mode_ok (unsigned int regno, machine_mode mode)
>    return false;
>  }
>
> +/* Return the descriptor of an nosseclobber ABI_ID.  */
> +
> +static const predefined_function_abi &
> +i386_less_sse_abi (unsigned abi_id)
> +{
> +  predefined_function_abi &myabi = function_abis[abi_id];
> +  if (!myabi.initialized_p ())
> +    {
> +      HARD_REG_SET full_reg_clobbers
> +        = default_function_abi.full_reg_clobbers ();
> +      for (int regno = XMM8_REG; regno < XMM16_REG; regno++)
> +        CLEAR_HARD_REG_BIT (full_reg_clobbers, regno);
> +      myabi.initialize (abi_id, full_reg_clobbers);
> +    }
> +  return myabi;
> +}
> +
> +/* Implement TARGET_FNTYPE_ABI.  */
> +
> +static const predefined_function_abi &
> +i386_fntype_abi (const_tree fntype)
> +{
> +  unsigned abi_id = ix86_fntype_to_abi_id (fntype);
> +  if (abi_id != ABI_DEFAULT)
> +    return i386_less_sse_abi (abi_id);
> +  return default_function_abi;
> +}
> +
>  /* Implement TARGET_INSN_CALLEE_ABI.  */
>
>  const predefined_function_abi &
> @@ -20341,6 +20400,9 @@ ix86_hard_regno_call_part_clobbered (unsigned int abi_id, unsigned int regno,
>              && ((TARGET_64BIT && REX_SSE_REGNO_P (regno))
>                  || LEGACY_SSE_REGNO_P (regno)));
>
> +  if (abi_id == ABI_NO_SSE)
> +    return false;
> +
>    return SSE_REGNO_P (regno) && GET_MODE_SIZE (mode) > 16;
>  }
>
> @@ -25594,6 +25656,9 @@ ix86_libgcc_floating_mode_supported_p
>  #define TARGET_HARD_REGNO_CALL_PART_CLOBBERED \
>    ix86_hard_regno_call_part_clobbered
>
> +#undef TARGET_FNTYPE_ABI
> +#define TARGET_FNTYPE_ABI i386_fntype_abi
> +
>  #undef TARGET_INSN_CALLEE_ABI
>  #define TARGET_INSN_CALLEE_ABI ix86_insn_callee_abi
>
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index 844deeae6cb..44d32ec2e4f 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -471,7 +471,9 @@
>  (define_constants
>    [(ABI_DEFAULT 0)
>     (ABI_VZEROUPPER 1)
> -   (ABI_UNKNOWN 2)])
> +   (ABI_LESS_SSE 2)
> +   (ABI_NO_SSE 3)
> +   (ABI_UNKNOWN 4)])
>
>  ;; Insns whose names begin with "x86_" are emitted by gen_FOO calls
>  ;; from i386.cc.
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index d88fd75e06e..3adbbc75b1c 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -6680,6 +6680,41 @@ Exception handlers should only be used for exceptions that push an error
>  code; you should use an interrupt handler in other cases.  The system
>  will crash if the wrong kind of handler is used.
>
> +@cindex @code{nosseclobber} function attribute, x86
> +@cindex @code{noanysseclobber} function attribute, x86
> +@item nosseclobber
> +@itemx noanysseclobber
> +
> +On 32-bit and 64-bit x86 targets, you can use these attributes to indicate that
> +a so-marked function doesn't clobber a subset of the SSE2 and AVX registers.
> +The @code{nosseclobber} attribute specifies that registers @code{%xmm8} through
> +@code{%xmm15} are not clobbered by a function.  This includes the low 16 bytes
> +of the corresponding AVX2 and AVX512 registers.  You can't make assumptions
> +about the higher parts of these registers, or other registers: those are
> +assumed to be clobbered (or not) according to the base ABI.
> +
> +The @code{noanysseclobber} attribute specifies that the function doesn't
> +clobber @emph{any} parts of the SSE2/AVX2/AVX512 registers @code{%zmm8}
> +through @code{%zmm15}, not even the high parts.
> +
> +Functions marked with @code{nosseclobber} can be defined
> +without restrictions: they can contain arbitrary floating point or vector
> +code, and they can call functions not marked with this attribute (i.e. those
> +that must be assumed to clobber parts of these registers).
> +GCC will insert register saves and restores in the pro- and epilogue in
> +those cases (only the low 16 bytes of the used registers will be
> +saved/restored, like the attribute implies).
> +
> +In comparison, functions defined with @code{noanysseclobber} are severely
> +restricted: they can't call functions not marked with that attribute.
> +They also can't write to any of the @code{%xmm8} through @code{%xmm15}
> +registers (or their extended variants with other ISAs).  GCC does not
> +emit any saves or restores for them.
> +
> +Calls to such functions (other than above) are unrestricted.  The effect
> +is simply that some values can be kept in registers over calls to
> +such marked functions.
> +
>  @cindex @code{target} function attribute
>  @item target (@var{options})
>  As discussed in @ref{Common Function Attributes}, this attribute
> diff --git a/gcc/function-abi.cc b/gcc/function-abi.cc
> index 2ab9b2c5649..efbe114218c 100644
> --- a/gcc/function-abi.cc
> +++ b/gcc/function-abi.cc
> @@ -42,6 +42,26 @@ void
>  predefined_function_abi::initialize (unsigned int id,
>                                       const_hard_reg_set full_reg_clobbers)
>  {
> +  /* Don't reinitialize an ABI struct.  We might be called from reinit_regs
> +     from the target's conditional_register_usage hook which might depend
> +     on cfun and might have changed the global register sets according
> +     to that function's ABI already.  That's not the default ABI anymore.
> +
> +     XXX only avoid this if we're reinitializing the default ABI, and the
> +     current function is _not_ of the default ABI.  That's for
> +     backward compatibility where some backends modify the regsets with
> +     the exception that those changes are then reflected also in the default
> +     ABI (which rather is then the "current" ABI).  E.g. x86_64 with the
> +     ms_abi vs sysv attribute.  They aren't reflected by separate ABI
> +     structs, but handled differently.  The "default" ABI hence changes
> +     back and forth (and is expected to!) between a ms_abi and a sysv
> +     function.  */
> +  if (m_initialized
> +      && id == 0
> +      && cfun
> +      && fndecl_abi (cfun->decl).base_abi ().id() != 0)
> +    return;
> +
>    m_id = id;
>    m_initialized = true;
>    m_full_reg_clobbers = full_reg_clobbers;
> @@ -224,6 +244,13 @@ insn_callee_abi (const rtx_insn *insn)
>    if (tree fndecl = get_call_fndecl (insn))
>      return fndecl_abi (fndecl);
>
> +  if (rtx call = get_call_rtx_from (insn))
> +    {
> +      tree memexp = MEM_EXPR (XEXP (call, 0));
> +      if (memexp)
> +        return fntype_abi (TREE_TYPE (memexp));
> +    }
> +
>    if (targetm.calls.insn_callee_abi)
>      return targetm.calls.insn_callee_abi (insn);
>
> diff --git a/gcc/testsuite/gcc.target/i386/sseclobber-1.c b/gcc/testsuite/gcc.target/i386/sseclobber-1.c
> new file mode 100644
> index 00000000000..8758e2d3109
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/sseclobber-1.c
> @@ -0,0 +1,15 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target sse2 } */
> +/* { dg-options "-O1" } */
> +/* { dg-final { scan-assembler-times {mm[89], [0-9]*\(%rsp\)} 2 } } */
> +/* { dg-final { scan-assembler-times {mm1[0-5], [0-9]*\(%rsp\)} 6 } } */
> +
> +extern int nonsse (int) __attribute__((nosseclobber));
> +extern int normalfunc (int);
> +
> +/* Demonstrate that all regs potentially clobbered by normal psABI
> +   functions are saved/restored by otherabi functions.  */
> +__attribute__((nosseclobber)) int nonsse (int i)
> +{
> +  return normalfunc (i + 2) + 3;
> +}
> diff --git a/gcc/testsuite/gcc.target/i386/sseclobber-2.c b/gcc/testsuite/gcc.target/i386/sseclobber-2.c
> new file mode 100644
> index 00000000000..9abafa0a9ba
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/sseclobber-2.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target sse2 } */
> +/* { dg-options "-O1" } */
> +/* { dg-final { scan-assembler-not {mm[0-9], [0-9]*\(%rsp\)} } } */
> +
> +extern int nonsse (int) __attribute__((nosseclobber));
> +extern int othernonsse (int) __attribute__((nosseclobber));
> +
> +/* Demonstrate that calling a nosseclobber function from a nosseclobber
> +   function does _not_ need to save all the regs (unlike in nonsse).  */
> +__attribute__((nosseclobber)) int nonsse (int i)
> +{
> +  return othernonsse (i + 2) + 3;
> +}
> diff --git a/gcc/testsuite/gcc.target/i386/sseclobber-3.c b/gcc/testsuite/gcc.target/i386/sseclobber-3.c
> new file mode 100644
> index 00000000000..276c7fd926b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/sseclobber-3.c
> @@ -0,0 +1,54 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target sse2 } */
> +/* { dg-options "-O1" } */
> +/* for docalc2 we should use the high xmm regs */
> +/* { dg-final { scan-assembler {xmm[89]} } } */
> +/* for docalc4_notany we should use the high ymm regs */
> +/* { dg-final { scan-assembler {ymm[89]} } } */
> +/* for docalc4 (and nowhere else) we should save/restore exactly
> +   one reg to stack around the inner-loop call */
> +/* { dg-final { scan-assembler-times {ymm[0-9]*, [0-9]*\(%rsp\)} 1 } } */
> +
> +typedef double dbl2 __attribute__((vector_size(16)));
> +typedef double dbl4 __attribute__((vector_size(32)));
> +typedef double dbl8 __attribute__((vector_size(64)));
> +extern __attribute__((nosseclobber,const)) double nonsse (int);
> +
> +/* Demonstrate that some values can be kept in a register over calls
> +   to otherabi functions.  nonsse saves the XMM registers, so those
> +   are usable, hence docalc2 should be able to keep values in registers
> +   over the nonsse call.  */
> +void docalc2 (dbl2 *d, dbl2 *a, dbl2 *b, int n)
> +{
> +  long i;
> +  for (i = 0; i < n; i++)
> +    {
> +      d[i] = a[i] * b[i] * nonsse(i);
> +    }
> +}
> +
> +/* Here we're using YMM registers (four doubles) and those are _not_
> +   saved by nonsse() (only the XMM parts) so docalc4 should not keep
> +   the value in a register over the call to nonsse.  */
> +void __attribute__((target("avx2"))) docalc4 (dbl4 *d, dbl4 *a, dbl4 *b, int n)
> +{
> +  long i;
> +  for (i = 0; i < n; i++)
> +    {
> +      d[i] = a[i] * b[i] * nonsse(i);
> +    }
> +}
> +
> +/* And here we're also using YMM registers, but have a call to a
> +   noanysseclobber function, which _does_ save all [XYZ]MM regs except
> +   arguments, so docalc4_notany should again be able to keep the value
> +   in a register.  */
> +extern __attribute__((noanysseclobber,const)) double notanysse (int);
> +void __attribute__((target("avx2"))) docalc4_notany (dbl4 *d, dbl4 *a, dbl4 *b, int n)
> +{
> +  long i;
> +  for (i = 0; i < n; i++)
> +    {
> +      d[i] = a[i] * b[i] * notanysse(i);
> +    }
> +}
> diff --git a/gcc/testsuite/gcc.target/i386/sseclobber-4.c b/gcc/testsuite/gcc.target/i386/sseclobber-4.c
> new file mode 100644
> index 00000000000..734f25068f0
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/sseclobber-4.c
> @@ -0,0 +1,21 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target sse2 } */
> +/* { dg-options "-O1" } */
> +/* { dg-final { scan-assembler-not {mm[0-9], [0-9]*\(%rsp\)} } } */
> +
> +extern __attribute__((nosseclobber)) int (*nonsse_ptr) (int);
> +
> +/* Demonstrate that some values can be kept in a register over calls
> +   to otherabi functions when called via function pointer.  */
> +double docalc (double d)
> +{
> +  double ret = d;
> +  int i = 0;
> +  while (1) {
> +    int j = nonsse_ptr (i++);
> +    if (!j)
> +      break;
> +    ret += j;
> +  }
> +  return ret;
> +}
> diff --git a/gcc/testsuite/gcc.target/i386/sseclobber-5.c b/gcc/testsuite/gcc.target/i386/sseclobber-5.c
> new file mode 100644
> index 00000000000..1869ae06148
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/sseclobber-5.c
> @@ -0,0 +1,37 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target sse2 } */
> +/* { dg-options "-O1" } */
> +/* { dg-final { scan-assembler-not {mm[89]} } } */
> +/* { dg-final { scan-assembler-not {mm1[0-5]} } } */
> +
> +extern int noanysse (int) __attribute__((noanysseclobber));
> +extern int noanysse2 (int) __attribute__((noanysseclobber));
> +extern __attribute__((noanysseclobber)) double calcstuff (double, double);
> +
> +/* Demonstrate that none of the clobbered SSE (or wider) regs are
> +   used by a noanysse function.  */
> +__attribute__((noanysseclobber)) double calcstuff (double d, double e)
> +{
> +  double s1, s2, s3, s4, s5, s6, s7, s8;
> +  s1 = s2 = s3 = s4 = s5 = s6 = s7 = s8 = 0.0;
> +  while (d > 0.1)
> +    {
> +      s1 += s2 * 2 + d;
> +      s2 += s3 * 3 + e;
> +      s3 += s4 * 5 + d * e;
> +      s4 += e / d;
> +      s5 += s2 * 7 + d - e;
> +      s5 += 2 * d + e;
> +      s6 += 5 * e + d;
> +      s7 += 7 * e * (d+1);
> +      d -= e;
> +    }
> +  return s1 + s2 + s3 + s4 + s5 + s6 + s7;
> +}
> +
> +/* Demonstrate that we can call noanysse functions from noanysse
> +   functions.  */
> +__attribute__((noanysseclobber)) int noanysse2 (int i)
> +{
> +  return noanysse (i + 2) + 3;
> +}
> diff --git a/gcc/testsuite/gcc.target/i386/sseclobber-6.c b/gcc/testsuite/gcc.target/i386/sseclobber-6.c
> new file mode 100644
> index 00000000000..89ece11c9f2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/sseclobber-6.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target sse2 } */
> +/* { dg-options "-O1" } */
> +
> +/* Various ways of invalid usage of the nosse attributes.  */
> +extern __attribute__((nosseclobber)) int nonfndecl; /* { dg-warning "only applies to function types" } */
> +
> +extern int normalfunc (int);
> +__attribute__((nosseclobber)) int (*nonsse_ptr) (int) = normalfunc; /* { dg-warning "from incompatible pointer type" } */
> +
> +extern int noanysse (int) __attribute__((noanysseclobber));
> +/* Demonstrate that it's not allowed to call any functions that
> +   aren't noanysse from noanysse functions.  */
> +__attribute__((noanysseclobber)) int noanysse (int i)
> +{
> +  return normalfunc (i + 2) + 3; /* { dg-error "cannot be called from function" } */
> +}
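
For illustration only, here is a minimal sketch of how user code might adopt the proposed nosseclobber attribute while still building with compilers that lack it. It assumes the attribute name from this RFC, which is not part of any released GCC. NOSSECLOBBER, scale_factor, scale_all and check_scale_all are hypothetical names introduced for this example. The __has_attribute guard makes the macro expand to nothing when the attribute is unknown, so the code compiles either way, just without the ABI guarantee.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical wrapper macro: expands to the attribute proposed in this
   RFC only if the compiler reports support for it, else to nothing.  */
#if defined(__has_attribute)
# if __has_attribute(nosseclobber)
#  define NOSSECLOBBER __attribute__((nosseclobber))
# endif
#endif
#ifndef NOSSECLOBBER
# define NOSSECLOBBER
#endif

/* Hypothetical callee.  With the attribute in effect it promises to
   leave xmm8-xmm15 intact, so callers may keep values there across
   the call.  */
NOSSECLOBBER double scale_factor (int i);

NOSSECLOBBER double
scale_factor (int i)
{
  return (double) (i + 1);
}

/* Hypothetical caller, shaped like docalc2 in sseclobber-3.c: a loop
   over floating point data with a call inside.  */
void
scale_all (double *d, const double *a, const double *b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    d[i] = a[i] * b[i] * scale_factor ((int) i);
}

/* Small self-check of the arithmetic; returns 0 on success.  */
int
check_scale_all (void)
{
  double a[3] = { 1.0, 2.0, 3.0 };
  double b[3] = { 2.0, 2.0, 2.0 };
  double d[3];
  scale_all (d, a, b, 3);
  assert (d[0] == 2.0 && d[1] == 8.0 && d[2] == 18.0);
  return 0;
}
```

The result of scale_all is the same with or without the attribute. With a compiler implementing the RFC, the intended difference is only in code generation: values live across the scale_factor call could stay in xmm8-xmm15 instead of being saved and restored around it.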