From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Roger Sayle"
To: "'GCC Patches'"
Subject: [x86 PATCH] PR target/91681: zero_extendditi2 pattern for more optimizations.
Date: Fri, 3 Jun 2022 10:49:47 +0100
Message-ID: <0af601d8772f$43ba22f0$cb2e68d0$@nextmovesoftware.com>
MIME-Version: 1.0
Content-Type: multipart/mixed;
        boundary="----=_NextPart_000_0AF7_01D87737.A57E8AF0"

This is a multipart message in MIME format.

------=_NextPart_000_0AF7_01D87737.A57E8AF0
Content-Type: text/plain;
        charset="us-ascii"
Content-Transfer-Encoding: 7bit

Technically, PR target/91681 has already been resolved; we now recognize
the highpart multiplication at the tree-level, we no longer use the stack,
and we currently generate the same number of instructions as LLVM.
However, it is still possible to do better. The current x86_64 code
generated for a double word addition of a zero extended operand looks
like:

        xorl    %r11d, %r11d
        addq    %r10, %rax
        adcq    %r11, %rdx

when it's possible (as LLVM does) to use an immediate constant:

        addq    %r10, %rax
        adcq    $0, %rdx

To do this, the backend required one or two simple changes, which in
turn required one or two more obscure tweaks.

The simple starting point is to define a zero_extendditi2 pattern, for
zero extension from DImode to TImode on TARGET_64BIT, that is split after
reload. Double word (TImode) addition/subtraction is split after reload,
so that constrains when things should happen. With zero extension now
visible to combine, we add two new define_insn_and_split patterns that
add/subtract a zero extended operand in double word mode. These apply to
both 32-bit and 64-bit code generation, to produce adc $0 and sbb $0.

The first strange tweak is that these new patterns interfere with the
optimization that recognizes DW:DI = (HI:SI<<32)+LO:SI as a pair of
register moves, or more accurately the combine splitter no longer
triggers, as we're now converting two instructions into two instructions
(not three instructions into two instructions). This is easily repaired
(and extended to handle TImode) by changing from a pair of define_split
(that handle operand commutativity) to a set of four
define_insn_and_split (again to handle operand commutativity).

The other/final strange tweak is that the above splitters now interfere
with AVX512's kunpckdq instruction, which is defined as identical RTL,
DW:DI = (HI:SI<<32)|zero_extend(LO:SI). To distinguish this, and also
to avoid AVX512 mask registers being used by reload to perform SImode
scalar shifts, I've added an explicit (unspec UNSPEC_MASKOP) to the
unpack mask operations, which matches what sse.md does for the other
mask specific (logic) operations.
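
To illustrate the DST = (HI<<32)|LO idiom discussed above, here is a
minimal C sketch; it is an illustration only and not part of the patch.
The function names are hypothetical, and the memcpy variant assumes a
little-endian target such as x86. Because the shifted high part and the
zero extended low part have no set bits in common, plus, ior and xor all
produce the same value, which is why a single any_or_plus iterator can
cover all three operators.

```c
#include <stdint.h>
#include <string.h>

/* The three operator forms matched by any_or_plus.  The operands never
   overlap, so no carries occur and all three agree.  */
uint64_t concat_ior(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | (uint64_t)lo;
}

uint64_t concat_plus(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) + (uint64_t)lo;
}

uint64_t concat_xor(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) ^ (uint64_t)lo;
}

/* The same value built by writing the low part and the high part
   separately, which is the effect of the pair of moves.
   Assumption: little-endian layout, so halves[0] is the low part.  */
uint64_t concat_via_moves(uint32_t hi, uint32_t lo)
{
    uint32_t halves[2];
    uint64_t dst;

    halves[0] = lo;                    /* set lopart */
    halves[1] = hi;                    /* set hipart */
    memcpy(&dst, halves, sizeof dst);  /* view the two halves as one value */
    return dst;
}
```

On a 32-bit target the two halves of the destination are separate
registers, so no shift or logical instruction is needed at all.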
This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32},
with no new failures. Ok for mainline?


2022-06-03  Roger Sayle

gcc/ChangeLog
        PR target/91681
        * config/i386/i386.md (zero_extendditi2): New define_insn_and_split.
        (*add3_doubleword_zext): New define_insn_and_split.
        (*sub3_doubleword_zext): New define_insn_and_split.
        (*concat3_1): New define_insn_and_split replacing
        previous define_split for implementing DST = (HI<<32)|LO as
        pair of move instructions, setting lopart and hipart.
        (*concat3_2): Likewise.
        (*concat3_3): Likewise, where HI is zero_extended.
        (*concat3_4): Likewise, where HI is zero_extended.
        * config/i386/sse.md (kunpckhi): Add UNSPEC_MASKOP unspec.
        (kunpcksi): Likewise, add UNSPEC_MASKOP unspec.
        (kunpckdi): Likewise, add UNSPEC_MASKOP unspec.
        (vec_pack_trunc_qi): Update to specify required UNSPEC_MASKOP
        unspec.
        (vec_pack_trunc_): Likewise.

gcc/testsuite/ChangeLog
        PR target/91681
        * g++.target/i386/pr91681.C: New test case (from the PR).
        * gcc.target/i386/pr91681-1.c: New int128 test case.
        * gcc.target/i386/pr91681-2.c: Likewise.
        * gcc.target/i386/pr91681-3.c: Likewise, but for ia32.

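
For reference, the arithmetic that the new doubleword add and subtract
patterns split into can be sketched in C as below. This is only an
illustration under stated assumptions, not code from the patch: the
names u128_parts, add_zext and sub_zext are hypothetical. The point is
that the high half of a zero extended operand is known to be zero, so
the high half of the result needs only the carry (adc $0) or the borrow
(sbb $0) from the low half.

```c
#include <stdint.h>

/* Hypothetical two-word representation of a double word value.  */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} u128_parts;

/* x + zero_extend (y): add the low halves, then add only the carry
   to the high half, since the high half of y is zero.  */
u128_parts add_zext(u128_parts x, uint64_t y)
{
    u128_parts r;

    r.lo = x.lo + y;           /* addq: low halves, may wrap around */
    r.hi = x.hi + (r.lo < y);  /* adcq $0: carry out of the low add */
    return r;
}

/* x - zero_extend (y): subtract the low halves, then subtract only
   the borrow from the high half.  */
u128_parts sub_zext(u128_parts x, uint64_t y)
{
    u128_parts r;

    r.lo = x.lo - y;           /* subq: low halves, may wrap around */
    r.hi = x.hi - (x.lo < y);  /* sbbq $0: borrow from the low sub */
    return r;
}
```

The carry test compares the wrapped low result against an addend, which
corresponds to the CCC-mode compare in the split of the add pattern.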
Thanks in advance,
Roger
--

------=_NextPart_000_0AF7_01D87737.A57E8AF0
Content-Type: text/plain;
        name="patchzt4.txt"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
        filename="patchzt4.txt"

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 602dfa7..6cce256 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -4325,6 +4325,16 @@
    (set_attr "type" "imovx,mskmov,mskmov")
    (set_attr "mode" "SI,QI,QI")])
 
+(define_insn_and_split "zero_extendditi2"
+  [(set (match_operand:TI 0 "nonimmediate_operand" "=r,o")
+        (zero_extend:TI (match_operand:DI 1 "nonimmediate_operand" "rm,r")))]
+  "TARGET_64BIT"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 3) (match_dup 1))
+   (set (match_dup 4) (const_int 0))]
+  "split_double_mode (TImode, &operands[0], 1, &operands[3], &operands[4]);")
+
 ;; Transform xorl; mov[bw] (set strict_low_part) into movz[bw]l.
 (define_peephole2
   [(parallel [(set (match_operand:SWI48 0 "general_reg_operand")
@@ -6453,6 +6463,33 @@
   [(set_attr "type" "alu")
    (set_attr "mode" "QI")])
 
+(define_insn_and_split "*add3_doubleword_zext"
+  [(set (match_operand: 0 "nonimmediate_operand" "=r,o")
+        (plus:
+          (zero_extend:
+            (match_operand:DWIH 2 "nonimmediate_operand" "rm,r"))
+          (match_operand: 1 "nonimmediate_operand" "0,0")))
+   (clobber (reg:CC FLAGS_REG))]
+  "MEM_P (operands[0]) ? rtx_equal_p (operands[0], operands[1])
+                         && !MEM_P (operands[2])
+                       : !MEM_P (operands[1])"
+  "#"
+  "&& reload_completed"
+  [(parallel [(set (reg:CCC FLAGS_REG)
+                   (compare:CCC
+                     (plus:DWIH (match_dup 1) (match_dup 2))
+                     (match_dup 1)))
+              (set (match_dup 0)
+                   (plus:DWIH (match_dup 1) (match_dup 2)))])
+   (parallel [(set (match_dup 3)
+                   (plus:DWIH
+                     (plus:DWIH
+                       (ltu:DWIH (reg:CC FLAGS_REG) (const_int 0))
+                       (match_dup 4))
+                     (const_int 0)))
+              (clobber (reg:CC FLAGS_REG))])]
+  "split_double_mode (mode, &operands[0], 2, &operands[0], &operands[3]);")
+
 ;; Like DWI, but use POImode instead of OImode.
 (define_mode_attr DPWI [(QI "HI") (HI "SI") (SI "DI") (DI "TI") (TI "POI")])
 
@@ -6903,6 +6940,31 @@
   }
 })
 
+(define_insn_and_split "*sub3_doubleword_zext"
+  [(set (match_operand: 0 "nonimmediate_operand" "=r,o")
+        (minus:
+          (match_operand: 1 "nonimmediate_operand" "0,0")
+          (zero_extend:
+            (match_operand:DWIH 2 "nonimmediate_operand" "rm,r"))))
+   (clobber (reg:CC FLAGS_REG))]
+  "MEM_P (operands[0]) ? rtx_equal_p (operands[0], operands[1])
+                         && !MEM_P (operands[2])
+                       : !MEM_P (operands[1])"
+  "#"
+  "&& reload_completed"
+  [(parallel [(set (reg:CC FLAGS_REG)
+                   (compare:CC (match_dup 1) (match_dup 2)))
+              (set (match_dup 0)
+                   (minus:DWIH (match_dup 1) (match_dup 2)))])
+   (parallel [(set (match_dup 3)
+                   (minus:DWIH
+                     (minus:DWIH
+                       (match_dup 4)
+                       (ltu:DWIH (reg:CC FLAGS_REG) (const_int 0)))
+                     (const_int 0)))
+              (clobber (reg:CC FLAGS_REG))])]
+  "split_double_mode (mode, &operands[0], 2, &operands[0], &operands[3]);")
+
 (define_insn "*sub_1"
   [(set (match_operand:SWI 0 "nonimmediate_operand" "=m,")
         (minus:SWI
@@ -10956,34 +11018,76 @@
 
 ;; Split DST = (HI<<32)|LO early to minimize register usage.
 (define_code_iterator any_or_plus [plus ior xor])
-(define_split
-  [(set (match_operand:DI 0 "register_operand")
-        (any_or_plus:DI
-          (ashift:DI (match_operand:DI 1 "register_operand")
-                     (const_int 32))
-          (zero_extend:DI (match_operand:SI 2 "register_operand"))))]
-  "!TARGET_64BIT"
-  [(set (match_dup 3) (match_dup 4))
-   (set (match_dup 5) (match_dup 2))]
+(define_insn_and_split "*concat3_1"
+  [(set (match_operand: 0 "register_operand" "=r")
+        (any_or_plus:
+          (ashift: (match_operand: 1 "register_operand" "r")
+                   (match_operand: 2 "const_int_operand" "n"))
+          (zero_extend: (match_operand:DWIH 3 "register_operand" "r"))))]
+  "INTVAL (operands[2]) == * BITS_PER_UNIT
+   && ix86_pre_reload_split ()"
+  "#"
+  "&& 1"
+  [(set (match_dup 4) (match_dup 3))
+   (set (match_dup 5) (match_dup 6))]
 {
-  operands[3] = gen_highpart (SImode, operands[0]);
-  operands[4] = gen_lowpart (SImode, operands[1]);
-  operands[5] = gen_lowpart (SImode, operands[0]);
+  operands[4] = gen_lowpart (mode, operands[0]);
+  operands[5] = gen_highpart (mode, operands[0]);
+  operands[6] = gen_lowpart (mode, operands[1]);
 })
 
-(define_split
-  [(set (match_operand:DI 0 "register_operand")
-        (any_or_plus:DI
-          (zero_extend:DI (match_operand:SI 1 "register_operand"))
-          (ashift:DI (match_operand:DI 2 "register_operand")
-                     (const_int 32))))]
-  "!TARGET_64BIT"
-  [(set (match_dup 3) (match_dup 4))
+(define_insn_and_split "*concat3_2"
+  [(set (match_operand: 0 "register_operand" "=r")
+        (any_or_plus:
+          (zero_extend: (match_operand:DWIH 1 "register_operand" "r"))
+          (ashift: (match_operand: 2 "register_operand" "r")
+                   (match_operand: 3 "const_int_operand" "n"))))]
+  "INTVAL (operands[3]) == * BITS_PER_UNIT
+   && ix86_pre_reload_split ()"
+  "#"
+  "&& 1"
+  [(set (match_dup 4) (match_dup 1))
+   (set (match_dup 5) (match_dup 6))]
+{
+  operands[4] = gen_lowpart (mode, operands[0]);
+  operands[5] = gen_highpart (mode, operands[0]);
+  operands[6] = gen_lowpart (mode, operands[2]);
+})
+
+(define_insn_and_split "*concat3_3"
+  [(set (match_operand: 0 "register_operand" "=r")
+        (any_or_plus:
+          (ashift:
+            (zero_extend: (match_operand:DWIH 1 "register_operand" "r"))
+            (match_operand: 2 "const_int_operand" "n"))
+          (zero_extend: (match_operand:DWIH 3 "register_operand" "r"))))]
+  "INTVAL (operands[2]) == * BITS_PER_UNIT
+   && ix86_pre_reload_split ()"
+  "#"
+  "&& 1"
+  [(set (match_dup 4) (match_dup 3))
    (set (match_dup 5) (match_dup 1))]
 {
-  operands[3] = gen_highpart (SImode, operands[0]);
-  operands[4] = gen_lowpart (SImode, operands[2]);
-  operands[5] = gen_lowpart (SImode, operands[0]);
+  operands[4] = gen_lowpart (mode, operands[0]);
+  operands[5] = gen_highpart (mode, operands[0]);
+})
+
+(define_insn_and_split "*concat3_4"
+  [(set (match_operand: 0 "register_operand" "=r")
+        (any_or_plus:
+          (zero_extend: (match_operand:DWIH 1 "register_operand" "r"))
+          (ashift:
+            (zero_extend: (match_operand:DWIH 2 "register_operand" "r"))
+            (match_operand: 3 "const_int_operand" "n"))))]
+  "INTVAL (operands[3]) == * BITS_PER_UNIT
+   && ix86_pre_reload_split ()"
+  "#"
+  "&& 1"
+  [(set (match_dup 4) (match_dup 1))
+   (set (match_dup 5) (match_dup 2))]
+{
+  operands[4] = gen_lowpart (mode, operands[0]);
+  operands[5] = gen_highpart (mode, operands[0]);
 })
 ^L
 ;; Negation instructions
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 8b2602b..0198156 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -2070,7 +2070,8 @@
           (ashift:HI
             (zero_extend:HI (match_operand:QI 1 "register_operand" "k"))
             (const_int 8))
-          (zero_extend:HI (match_operand:QI 2 "register_operand" "k"))))]
+          (zero_extend:HI (match_operand:QI 2 "register_operand" "k"))))
+   (unspec [(const_int 0)] UNSPEC_MASKOP)]
   "TARGET_AVX512F"
   "kunpckbw\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "mode" "HI")
@@ -2083,7 +2084,8 @@
           (ashift:SI
             (zero_extend:SI (match_operand:HI 1 "register_operand" "k"))
             (const_int 16))
-          (zero_extend:SI (match_operand:HI 2 "register_operand" "k"))))]
+          (zero_extend:SI (match_operand:HI 2 "register_operand" "k"))))
+   (unspec [(const_int 0)] UNSPEC_MASKOP)]
   "TARGET_AVX512BW"
   "kunpckwd\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "mode" "SI")])
@@ -2094,7 +2096,8 @@
           (ashift:DI
             (zero_extend:DI (match_operand:SI 1 "register_operand" "k"))
             (const_int 32))
-          (zero_extend:DI (match_operand:SI 2 "register_operand" "k"))))]
+          (zero_extend:DI (match_operand:SI 2 "register_operand" "k"))))
+   (unspec [(const_int 0)] UNSPEC_MASKOP)]
   "TARGET_AVX512BW"
   "kunpckdq\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "mode" "DI")])
@@ -17398,21 +17401,26 @@
 })
 
 (define_expand "vec_pack_trunc_qi"
-  [(set (match_operand:HI 0 "register_operand")
-        (ior:HI (ashift:HI (zero_extend:HI (match_operand:QI 2 "register_operand"))
-                           (const_int 8))
-                (zero_extend:HI (match_operand:QI 1 "register_operand"))))]
+  [(parallel
+    [(set (match_operand:HI 0 "register_operand")
+          (ior:HI
+            (ashift:HI (zero_extend:HI (match_operand:QI 2 "register_operand"))
+                       (const_int 8))
+            (zero_extend:HI (match_operand:QI 1 "register_operand"))))
+     (unspec [(const_int 0)] UNSPEC_MASKOP)])]
   "TARGET_AVX512F")
 
 (define_expand "vec_pack_trunc_"
-  [(set (match_operand: 0 "register_operand")
-        (ior:
-          (ashift:
+  [(parallel
+    [(set (match_operand: 0 "register_operand")
+          (ior:
+            (ashift:
+              (zero_extend:
+                (match_operand:SWI24 2 "register_operand"))
+              (match_dup 3))
             (zero_extend:
-              (match_operand:SWI24 2 "register_operand"))
-            (match_dup 3))
-          (zero_extend:
-            (match_operand:SWI24 1 "register_operand"))))]
+              (match_operand:SWI24 1 "register_operand"))))
+     (unspec [(const_int 0)] UNSPEC_MASKOP)])]
   "TARGET_AVX512BW"
 {
   operands[3] = GEN_INT (GET_MODE_BITSIZE (mode));
diff --git a/gcc/testsuite/g++.target/i386/pr91681.C b/gcc/testsuite/g++.target/i386/pr91681.C
new file mode 100644
index 0000000..0271e43
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/pr91681.C
@@ -0,0 +1,20 @@
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-O2" } */
+
+void multiply128x64x2_3 (
+    const unsigned long a,
+    const unsigned long b,
+    const unsigned long c,
+    const unsigned long d,
+    __uint128_t o[2])
+{
+    __uint128_t B0 = (__uint128_t) b * c;
+    __uint128_t B2 = (__uint128_t) a * c;
+    __uint128_t B1 = (__uint128_t) b * d;
+    __uint128_t B3 = (__uint128_t) a * d;
+
+    o[0] = B2 + (B0 >> 64);
+    o[1] = B3 + (B1 >> 64);
+}
+
+/* { dg-final { scan-assembler-not "xor" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr91681-1.c b/gcc/testsuite/gcc.target/i386/pr91681-1.c
new file mode 100644
index 0000000..ab83cc4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr91681-1.c
@@ -0,0 +1,20 @@
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-O2" } */
+unsigned __int128 m;
+
+unsigned __int128 foo(unsigned __int128 x, unsigned long long y)
+{
+  return x + y;
+}
+
+void bar(unsigned __int128 x, unsigned long long y)
+{
+  m = x + y;
+}
+
+void baz(unsigned long long y)
+{
+  m += y;
+}
+
+/* { dg-final { scan-assembler-not "xor" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr91681-2.c b/gcc/testsuite/gcc.target/i386/pr91681-2.c
new file mode 100644
index 0000000..ea52c72
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr91681-2.c
@@ -0,0 +1,20 @@
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-O2" } */
+unsigned __int128 m;
+
+unsigned __int128 foo(unsigned __int128 x, unsigned long long y)
+{
+  return x - y;
+}
+
+void bar(unsigned __int128 x, unsigned long long y)
+{
+  m = x - y;
+}
+
+void baz(unsigned long long y)
+{
+  m -= y;
+}
+
+/* { dg-final { scan-assembler-not "xor" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr91681-3.c b/gcc/testsuite/gcc.target/i386/pr91681-3.c
new file mode 100644
index 0000000..22a03c2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr91681-3.c
@@ -0,0 +1,16 @@
+/* { dg-do compile { target ia32 } } */
+/* { dg-options "-O2" } */
+
+unsigned long long m;
+
+unsigned long long foo(unsigned long long x, unsigned int y)
+{
+  return x - y;
+}
+
+void bar(unsigned long long x, unsigned int y)
+{
+  m = x - y;
+}
+
+/* { dg-final { scan-assembler-not "xor" } } */

------=_NextPart_000_0AF7_01D87737.A57E8AF0--