Message-ID: <4d3c58e8e5f0361f807be7ad9b36158227b5c0d2.camel@vnet.ibm.com>
Subject: Re: [PATCH 2/5] Add Power10 XXSPLTI* and LXVKQ instructions (LXVKQ)
From: will schmidt
To: Michael Meissner, gcc-patches@gcc.gnu.org, Segher Boessenkool, David Edelsohn, Bill Schmidt, Peter Bergner
Date: Fri, 05 Nov 2021 12:52:51 -0500

On Fri, 2021-11-05 at 00:07 -0400, Michael Meissner wrote:
> Add LXVKQ support.
>
> This patch adds support to generate the LXVKQ instruction to load specific
> IEEE-128 floating point constants.
>
> Compared to the last time I submitted this patch, I modified it so that it
> uses the bit pattern of the vector to see if it can generate the LXVKQ
> instruction.  This means on a little endian Power system, the
> following code will generate a LXVKQ 34,16 instruction:
>
>     vector long long foo (void)
>     {
>     #if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
>       return (vector long long) { 0x0000000000000000, 0x8000000000000000 };
>     #else
>       return (vector long long) { 0x8000000000000000, 0x0000000000000000 };
>     #endif
>     }
>
> because that vector pattern is the same bit pattern as -0.0F128.
>
> 2021-11-05  Michael Meissner
>
> gcc/
>
> 	* config/rs6000/constraints.md (eQ): New constraint.
> 	* config/rs6000/predicates.md (easy_fp_constant): Add support for
> 	generating the LXVKQ instruction.
> 	(easy_vector_constant_ieee128): New predicate.
> 	(easy_vector_constant): Add support for generating the LXVKQ
> 	instruction.
> 	* config/rs6000/rs6000-protos.h (constant_generates_lxvkq): New
> 	declaration.
> 	* config/rs6000/rs6000.c (output_vec_const_move): Add support for
> 	generating LXVKQ.
> 	(constant_generates_lxvkq): New function.
> 	* config/rs6000/rs6000.opt (-mieee128-constant): New debug
> 	option.
> 	* config/rs6000/vsx.md (vsx_mov_64bit): Add support for
> 	generating LXVKQ.
> 	(vsx_mov_32bit): Likewise.
> 	* doc/md.texi (PowerPC and IBM RS6000 constraints): Document the
> 	eQ constraint.
>
> gcc/testsuite/
>
> 	* gcc.target/powerpc/float128-constant.c: New test.
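As a side note on the bit-pattern claim in the description above, here is a
minimal sketch of why that vector counts as -0.0F128.  It assumes the 128-bit
value is viewed as four 32-bit words with the most significant word first,
which is the layout the words[] checks later in the patch appear to use.  The
helper names split_128 and is_ieee128_negative_zero are hypothetical and are
not part of the patch.

```c
#include <stdint.h>

/* Hypothetical helper (not from the patch): split a 128-bit value, given as
   its high and low 64-bit halves, into four 32-bit words, most significant
   word first.  */
void
split_128 (uint64_t high, uint64_t low, uint32_t words[4])
{
  words[0] = (uint32_t) (high >> 32);
  words[1] = (uint32_t) (high & 0xFFFFFFFFu);
  words[2] = (uint32_t) (low >> 32);
  words[3] = (uint32_t) (low & 0xFFFFFFFFu);
}

/* Return 1 if the 128-bit pattern is IEEE binary128 -0.0, i.e. only the
   sign bit (the top bit of the most significant word) is set.  */
int
is_ieee128_negative_zero (uint64_t high, uint64_t low)
{
  uint32_t words[4];

  split_128 (high, low, words);
  return (words[0] == 0x80000000u
          && words[1] == 0 && words[2] == 0 && words[3] == 0);
}
```

Calling is_ieee128_negative_zero (0x8000000000000000ULL, 0) returns 1.  The
#if in the example presumably exists because the element that lands in the
high doubleword of the register differs between little and big endian.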
> --- > gcc/config/rs6000/constraints.md | 6 + > gcc/config/rs6000/predicates.md | 34 ++++ > gcc/config/rs6000/rs6000-protos.h | 1 + > gcc/config/rs6000/rs6000.c | 62 +++++++ > gcc/config/rs6000/rs6000.opt | 4 + > gcc/config/rs6000/vsx.md | 14 ++ > gcc/doc/md.texi | 4 + > .../gcc.target/powerpc/float128-constant.c | 160 ++++++++++++++++++ > 8 files changed, 285 insertions(+) > create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-constant.c > > diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md > index c8cff1a3038..e72132b4c28 100644 > --- a/gcc/config/rs6000/constraints.md > +++ b/gcc/config/rs6000/constraints.md > @@ -213,6 +213,12 @@ (define_constraint "eI" > "A signed 34-bit integer constant if prefixed instructions are supported." > (match_operand 0 "cint34_operand")) > > +;; A TF/KF scalar constant or a vector constant that can load certain IEEE > +;; 128-bit constants into vector registers using LXVKQ. > +(define_constraint "eQ" > + "An IEEE 128-bit constant that can be loaded into VSX registers." > + (match_operand 0 "easy_vector_constant_ieee128")) > + > ;; Floating-point constraints. These two are defined so that insn > ;; length attributes can be calculated exactly. > ok > diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md > index 956e42bc514..e0d1c718e9f 100644 > --- a/gcc/config/rs6000/predicates.md > +++ b/gcc/config/rs6000/predicates.md > @@ -601,6 +601,14 @@ (define_predicate "easy_fp_constant" > if (TARGET_VSX && op == CONST0_RTX (mode)) > return 1; > > + /* Constants that can be generated with ISA 3.1 instructions are easy. */ Easy is relative, but OK. > + vec_const_128bit_type vsx_const; > + if (TARGET_POWER10 && vec_const_128bit_to_bytes (op, mode, &vsx_const)) > + { > + if (constant_generates_lxvkq (&vsx_const) != 0) > + return true; > + } > + > /* Otherwise consider floating point constants hard, so that the > constant gets pushed to memory during the early RTL phases. 
This > has the advantage that double precision constants that can be > @@ -609,6 +617,23 @@ (define_predicate "easy_fp_constant" > return 0; > }) > > +;; Return 1 if the operand is a special IEEE 128-bit value that can be loaded > +;; via the LXVKQ instruction. > + > +(define_predicate "easy_vector_constant_ieee128" > + (match_code "const_vector,const_double") > +{ > + vec_const_128bit_type vsx_const; > + > + /* Can we generate the LXVKQ instruction? */ > + if (!TARGET_IEEE128_CONSTANT || !TARGET_FLOAT128_HW || !TARGET_POWER10 > + || !TARGET_VSX) > + return false; Presumably all of the checks there are valid. (Can we have power10 without float128_hw or ieee128_constant flags set?) I do notice the addition of an ieee128_constant flag below. > + > + return (vec_const_128bit_to_bytes (op, mode, &vsx_const) > + && constant_generates_lxvkq (&vsx_const) != 0); > +}) > + ok > ;; Return 1 if the operand is a constant that can loaded with a XXSPLTIB > ;; instruction and then a VUPKHSB, VECSB2W or VECSB2D instruction. > > @@ -653,6 +678,15 @@ (define_predicate "easy_vector_constant" > if (zero_constant (op, mode) || all_ones_constant (op, mode)) > return true; > > + /* Constants that can be generated with ISA 3.1 instructions are > + easy. */ > + vec_const_128bit_type vsx_const; > + if (TARGET_POWER10 && vec_const_128bit_to_bytes (op, mode, &vsx_const)) > + { > + if (constant_generates_lxvkq (&vsx_const) != 0) > + return true; > + } > + > if (TARGET_P9_VECTOR > && xxspltib_constant_p (op, mode, &num_insns, &value)) > return true; ok. 
> diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h > index 490d6e33736..494a95cc6ee 100644 > --- a/gcc/config/rs6000/rs6000-protos.h > +++ b/gcc/config/rs6000/rs6000-protos.h > @@ -250,6 +250,7 @@ typedef struct { > > extern bool vec_const_128bit_to_bytes (rtx, machine_mode, > vec_const_128bit_type *); > +extern unsigned constant_generates_lxvkq (vec_const_128bit_type *); > #endif /* RTX_CODE */ > > #ifdef TREE_CODE ok > diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c > index f285022294a..06d02085b06 100644 > --- a/gcc/config/rs6000/rs6000.c > +++ b/gcc/config/rs6000/rs6000.c > @@ -6991,6 +6991,17 @@ output_vec_const_move (rtx *operands) > gcc_unreachable (); > } > > + vec_const_128bit_type vsx_const; > + if (TARGET_POWER10 && vec_const_128bit_to_bytes (vec, mode, &vsx_const)) > + { > + unsigned imm = constant_generates_lxvkq (&vsx_const); > + if (imm) > + { > + operands[2] = GEN_INT (imm); > + return "lxvkq %x0,%2"; > + } > + } > + > if (TARGET_P9_VECTOR > && xxspltib_constant_p (vec, mode, &num_insns, &xxspltib_value)) > { ok > @@ -28872,6 +28883,57 @@ vec_const_128bit_to_bytes (rtx op, > return true; > } > > +/* Determine if an IEEE 128-bit constant can be loaded with LXVKQ. Return zero > + if the LXVKQ instruction cannot be used. Otherwise return the immediate > + value to be used with the LXVKQ instruction. */ > + > +unsigned > +constant_generates_lxvkq (vec_const_128bit_type *vsx_const) > +{ > + /* Is the instruction supported with power10 code generation, IEEE 128-bit > + floating point hardware and VSX registers are available. */ > + if (!TARGET_IEEE128_CONSTANT || !TARGET_FLOAT128_HW || !TARGET_POWER10 > + || !TARGET_VSX) > + return 0; > + > + /* Verify that all of the bottom 3 words in the constants loaded by the > + LXVKQ instruction are zero. */ Ok. 
I did look at this a bit before it clicked, so I would suggest a comment
something like "All of the constants that can be loaded by lxvkq will have
zero in the bottom 3 words, so ensure those are zero before we use a switch
based on the nonzero portion of the constant."  It would be fine as-is
too. :-)

> +  if (vsx_const->words[1] != 0
> +      || vsx_const->words[2] != 0
> +      || vsx_const->words[3] != 0)
> +    return 0;
> +
> +  /* See if we have a match.  */
> +  switch (vsx_const->words[0])
> +    {
> +    case 0x3FFF0000U: return 1;	/* IEEE 128-bit +1.0.  */
> +    case 0x40000000U: return 2;	/* IEEE 128-bit +2.0.  */
> +    case 0x40008000U: return 3;	/* IEEE 128-bit +3.0.  */
> +    case 0x40010000U: return 4;	/* IEEE 128-bit +4.0.  */
> +    case 0x40014000U: return 5;	/* IEEE 128-bit +5.0.  */
> +    case 0x40018000U: return 6;	/* IEEE 128-bit +6.0.  */
> +    case 0x4001C000U: return 7;	/* IEEE 128-bit +7.0.  */
> +    case 0x7FFF0000U: return 8;	/* IEEE 128-bit +Infinity.  */
> +    case 0x7FFF8000U: return 9;	/* IEEE 128-bit quiet NaN.  */
> +    case 0x80000000U: return 16;	/* IEEE 128-bit -0.0.  */
> +    case 0xBFFF0000U: return 17;	/* IEEE 128-bit -1.0.  */
> +    case 0xC0000000U: return 18;	/* IEEE 128-bit -2.0.  */
> +    case 0xC0008000U: return 19;	/* IEEE 128-bit -3.0.  */
> +    case 0xC0010000U: return 20;	/* IEEE 128-bit -4.0.  */
> +    case 0xC0014000U: return 21;	/* IEEE 128-bit -5.0.  */
> +    case 0xC0018000U: return 22;	/* IEEE 128-bit -6.0.  */
> +    case 0xC001C000U: return 23;	/* IEEE 128-bit -7.0.  */
> +    case 0xFFFF0000U: return 24;	/* IEEE 128-bit -Infinity.  */
> +
> +      /* anything else cannot be loaded.  */
> +    default:
> +      break;
> +    }
> +
> +  return 0;
> +}
> +
> +
>  struct gcc_target targetm = TARGET_INITIALIZER;

ok

>
>  #include "gt-rs6000.h"
> diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
> index 9d7878f144a..b7433ec4e30 100644
> --- a/gcc/config/rs6000/rs6000.opt
> +++ b/gcc/config/rs6000/rs6000.opt
> @@ -640,6 +640,10 @@ mprivileged
>  Target Var(rs6000_privileged) Init(0)
>  Generate code that will run in privileged state.
>
> +mieee128-constant
> +Target Var(TARGET_IEEE128_CONSTANT) Init(1) Save
> +Generate (do not generate) code that uses the LXVKQ instruction.
> +
>  -param=rs6000-density-pct-threshold=
>  Target Undocumented Joined UInteger Var(rs6000_density_pct_threshold) Init(85) IntegerRange(0, 100) Param
>  When costing for loop vectorization, we probably need to penalize the loop body

I do wonder if this option is necessary.  Presumably it is useful at least
for before/after comparison purposes.  Is there any expectation that this
would be necessary long term?
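For anyone else reading the constant_generates_lxvkq table, here is a
standalone sketch that may help.  It is not the patch's code: the function
names lxvkq_immediate and ieee128_top_word are hypothetical, the target-flag
checks are left out, and it only covers the small-integer and -0.0 rows.  It
assumes IEEE binary128 layout (1 sign bit, 15 exponent bits with bias 16383,
112 fraction bits), so the top 32-bit word holds the sign, the exponent and
the top 16 fraction bits.

```c
#include <stdint.h>

/* Hypothetical helper: top 32 bits of the binary128 encoding of a small
   integer N (1 <= N <= 7), optionally negated.  */
uint32_t
ieee128_top_word (unsigned n, int negative)
{
  unsigned e = 0;

  /* e = floor (log2 (n)).  */
  while ((n >> (e + 1)) != 0)
    e++;

  uint32_t exponent = 16383u + e;                        /* biased exponent */
  uint32_t fraction = ((uint32_t) (n - (1u << e)) << 16) >> e;
  uint32_t sign = negative ? 0x80000000u : 0u;

  return sign | (exponent << 16) | fraction;
}

/* Hypothetical stand-in for the lookup: return the LXVKQ immediate for the
   integer and -0.0 rows of the table, or 0 if there is no match.  WORDS is
   most significant word first.  */
unsigned
lxvkq_immediate (const uint32_t words[4])
{
  /* Every loadable constant has zero in the bottom three words.  */
  if (words[1] != 0 || words[2] != 0 || words[3] != 0)
    return 0;

  if (words[0] == 0x80000000u)
    return 16;                                           /* -0.0 */

  for (unsigned n = 1; n <= 7; n++)
    {
      if (words[0] == ieee128_top_word (n, 0))
        return n;                                        /* +1.0 .. +7.0 */
      if (words[0] == ieee128_top_word (n, 1))
        return n + 16;                                   /* -1.0 .. -7.0 */
    }

  return 0;
}
```

For example, ieee128_top_word (3, 0) gives 0x40008000, matching the +3.0 row
in the switch, and the negative forms differ only in the sign bit, which is
why their immediates are offset by 16.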
> diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md > index 0bf04feb6c4..0a376ee4c28 100644 > --- a/gcc/config/rs6000/vsx.md > +++ b/gcc/config/rs6000/vsx.md > @@ -1192,16 +1192,19 @@ (define_insn_and_split "*xxspltib__split" > > ;; VSX store VSX load VSX move VSX->GPR GPR->VSX LQ (GPR) > ;; STQ (GPR) GPR load GPR store GPR move XXSPLTIB VSPLTISW > +;; LXVKQ > ;; VSX 0/-1 VMX const GPR const LVX (VMX) STVX (VMX) > (define_insn "vsx_mov_64bit" > [(set (match_operand:VSX_M 0 "nonimmediate_operand" > "=ZwO, wa, wa, r, we, ?wQ, > ?&r, ??r, ??Y, , wa, v, > + wa, > ?wa, v, , wZ, v") > > (match_operand:VSX_M 1 "input_operand" > "wa, ZwO, wa, we, r, r, > wQ, Y, r, r, wE, jwM, > + eQ, > ?jwM, W, , v, wZ"))] > > "TARGET_POWERPC64 && VECTOR_MEM_VSX_P (mode) > @@ -1213,35 +1216,43 @@ (define_insn "vsx_mov_64bit" > [(set_attr "type" > "vecstore, vecload, vecsimple, mtvsr, mfvsr, load, > store, load, store, *, vecsimple, vecsimple, > + vecperm, > vecsimple, *, *, vecstore, vecload") > (set_attr "num_insns" > "*, *, *, 2, *, 2, > 2, 2, 2, 2, *, *, > + *, > *, 5, 2, *, *") > (set_attr "max_prefixed_insns" > "*, *, *, *, *, 2, > 2, 2, 2, 2, *, *, > + *, > *, *, *, *, *") > (set_attr "length" > "*, *, *, 8, *, 8, > 8, 8, 8, 8, *, *, > + *, > *, 20, 8, *, *") > (set_attr "isa" > ", , , *, *, *, > *, *, *, *, p9v, *, > + p10, > , *, *, *, *")]) > > ;; VSX store VSX load VSX move GPR load GPR store GPR move > +;; LXVKQ > ;; XXSPLTIB VSPLTISW VSX 0/-1 VMX const GPR const > ;; LVX (VMX) STVX (VMX) > (define_insn "*vsx_mov_32bit" > [(set (match_operand:VSX_M 0 "nonimmediate_operand" > "=ZwO, wa, wa, ??r, ??Y, , > + wa, > wa, v, ?wa, v, , > wZ, v") > > (match_operand:VSX_M 1 "input_operand" > "wa, ZwO, wa, Y, r, r, > + eQ, > wE, jwM, ?jwM, W, , > v, wZ"))] > > @@ -1253,14 +1264,17 @@ (define_insn "*vsx_mov_32bit" > } > [(set_attr "type" > "vecstore, vecload, vecsimple, load, store, *, > + vecperm, > vecsimple, vecsimple, vecsimple, *, *, > vecstore, vecload") > (set_attr 
"length" > "*, *, *, 16, 16, 16, > + *, > *, *, *, 20, 16, > *, *") > (set_attr "isa" > ", , , *, *, *, > + p10, > p9v, *, , *, *, > *, *")]) > Just skimmed this part, nothing jumps out at me. > diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi > index 41f1850bf6e..4af8fd76992 100644 > --- a/gcc/doc/md.texi > +++ b/gcc/doc/md.texi > @@ -3336,6 +3336,10 @@ A constant whose negation is a signed 16-bit constant. > @item eI > A signed 34-bit integer constant if prefixed instructions are supported. > > +@item eQ > +An IEEE 128-bit constant that can be loaded into a VSX register with a > +single instruction. > + > @ifset INTERNALS > @item G > A floating point constant that can be loaded into a register with one Should 'single instruction' be replaced with 'lxvkq'? Or have some lxvkq reference added, since that is the only instruction currently behind this constraint? > diff --git a/gcc/testsuite/gcc.target/powerpc/float128-constant.c b/gcc/testsuite/gcc.target/powerpc/float128-constant.c > new file mode 100644 > index 00000000000..e3286a786a5 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/float128-constant.c > @@ -0,0 +1,160 @@ > +/* { dg-require-effective-target ppc_float128_hw } */ > +/* { dg-require-effective-target power10_ok } */ > +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */ > + Ok. (Nothing further reviewed in detail). thanks -Will > +/* Test whether the LXVKQ instruction is generated to load special IEEE 128-bit > + constants. */ > + > +_Float128 > +return_0 (void) > +{ > + return 0.0f128; /* XXSPLTIB 34,0. */ > +} > + > +_Float128 > +return_1 (void) > +{ > + return 1.0f128; /* LXVKQ 34,1. */ > +} > + > +_Float128 > +return_2 (void) > +{ > + return 2.0f128; /* LXVKQ 34,2. */ > +} > + > +_Float128 > +return_3 (void) > +{ > + return 3.0f128; /* LXVKQ 34,3. */ > +} > + > +_Float128 > +return_4 (void) > +{ > + return 4.0f128; /* LXVKQ 34,4. */ > +} > + > +_Float128 > +return_5 (void) > +{ > + return 5.0f128; /* LXVKQ 34,5. 
*/ > +} > + > +_Float128 > +return_6 (void) > +{ > + return 6.0f128; /* LXVKQ 34,6. */ > +} > + > +_Float128 > +return_7 (void) > +{ > + return 7.0f128; /* LXVKQ 34,7. */ > +} > + > +_Float128 > +return_m0 (void) > +{ > + return -0.0f128; /* LXVKQ 34,16. */ > +} > + > +_Float128 > +return_m1 (void) > +{ > + return -1.0f128; /* LXVKQ 34,17. */ > +} > + > +_Float128 > +return_m2 (void) > +{ > + return -2.0f128; /* LXVKQ 34,18. */ > +} > + > +_Float128 > +return_m3 (void) > +{ > + return -3.0f128; /* LXVKQ 34,19. */ > +} > + > +_Float128 > +return_m4 (void) > +{ > + return -4.0f128; /* LXVKQ 34,20. */ > +} > + > +_Float128 > +return_m5 (void) > +{ > + return -5.0f128; /* LXVKQ 34,21. */ > +} > + > +_Float128 > +return_m6 (void) > +{ > + return -6.0f128; /* LXVKQ 34,22. */ > +} > + > +_Float128 > +return_m7 (void) > +{ > + return -7.0f128; /* LXVKQ 34,23. */ > +} > + > +_Float128 > +return_inf (void) > +{ > + return __builtin_inff128 (); /* LXVKQ 34,8. */ > +} > + > +_Float128 > +return_minf (void) > +{ > + return - __builtin_inff128 (); /* LXVKQ 34,24. */ > +} > + > +_Float128 > +return_nan (void) > +{ > + return __builtin_nanf128 (""); /* LXVKQ 34,9. */ > +} > + > +/* Note, the following NaNs should not generate a LXVKQ instruction. */ > +_Float128 > +return_mnan (void) > +{ > + return - __builtin_nanf128 (""); /* PLXV 34,... */ > +} > + > +_Float128 > +return_nan2 (void) > +{ > + return __builtin_nanf128 ("1"); /* PLXV 34,... */ > +} > + > +_Float128 > +return_nans (void) > +{ > + return __builtin_nansf128 (""); /* PLXV 34,... */ > +} > + > +vector long long > +return_longlong_neg_0 (void) > +{ > + /* This vector is the same pattern as -0.0F128. */ > +#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__ > +#define FIRST 0x8000000000000000 > +#define SECOND 0x0000000000000000 > + > +#else > +#define FIRST 0x0000000000000000 > +#define SECOND 0x8000000000000000 > +#endif > + > + return (vector long long) { FIRST, SECOND }; /* LXVKQ 34,16. 
*/ > +} > + > +/* { dg-final { scan-assembler-times {\mlxvkq\M} 19 } } */ > +/* { dg-final { scan-assembler-times {\mplxv\M} 3 } } */ > +/* { dg-final { scan-assembler-times {\mxxspltib\M} 1 } } */ > + > -- > 2.31.1 > >