From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 87999 invoked by alias); 2 May 2018 13:31:00 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 87979 invoked by uid 89); 2 May 2018 13:30:59 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-11.9 required=5.0 tests=BAYES_00,GIT_PATCH_2,GIT_PATCH_3,SPF_HELO_PASS autolearn=ham version=3.3.2 spammy= X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 02 May 2018 13:30:57 +0000 Received: from smtp.corp.redhat.com (int-mx10.intmail.prod.int.phx2.redhat.com [10.5.11.25]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 5A2C83005168; Wed, 2 May 2018 13:30:56 +0000 (UTC) Received: from tucnak.zalov.cz (unknown [10.36.118.110]) by smtp.corp.redhat.com (Postfix) with ESMTPS id EF04B2015993; Wed, 2 May 2018 13:30:55 +0000 (UTC) Received: from tucnak.zalov.cz (localhost [127.0.0.1]) by tucnak.zalov.cz (8.15.2/8.15.2) with ESMTP id w42DUrb6021038; Wed, 2 May 2018 15:30:53 +0200 Received: (from jakub@localhost) by tucnak.zalov.cz (8.15.2/8.15.2/Submit) id w42DUq5i021037; Wed, 2 May 2018 15:30:52 +0200 Date: Wed, 02 May 2018 13:31:00 -0000 From: Jakub Jelinek To: Uros Bizjak Cc: "gcc-patches@gcc.gnu.org" Subject: Re: [PATCH] Fix the x86 *_doubleword_mask* patterns (PR target/85582) Message-ID: <20180502133052.GW8577@tucnak> Reply-To: Jakub Jelinek References: <20180502074246.GR8577@tucnak> <20180502080205.GS8577@tucnak> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.9.2 (2017-12-15) X-IsSubscribed: yes X-SW-Source: 2018-05/txt/msg00088.txt.bz2 On Wed, May 02, 2018 at 10:09:22AM +0200, Uros Bizjak wrote: > On Wed, May 2, 2018 at 10:02 AM, Jakub Jelinek wrote: > > On Wed, May 02, 2018 at 09:50:07AM +0200, Uros Bizjak wrote: > >> > 2018-05-02 Jakub Jelinek > >> > > >> > PR target/85582 > >> > * config/i386/i386.md (*ashl3_doubleword_mask, > >> > *ashl3_doubleword_mask_1, *3_doubleword_mask, > >> > *3_doubleword_mask_1): If and[sq]i3 is needed, don't > >> > clobber operands[2], instead use a new pseudo. Formatting fixes. > >> > > >> > * gcc.c-torture/execute/pr85582-1.c: New test. > >> > * gcc.c-torture/execute/pr85582-2.c: New test. > >> > >> O. > > > > Thanks, committed. BTW, thinking about these some more, > > isn't INTVAL (operands[3]) <= ( * BITS_PER_UNIT) - 1 > > incorrect? Say for being 4 this is INTVAL (operands[3]) <= 31 > > so it will accept & 0 to & 31 (that is correct), or e.g. & -2 (that is > > incorrect). > > Hm, I wasn't thinking about undefined cases here. It isn't just about undefined cases, see the testcase below which is miscompiled because of these checks. Whether some bit in the mask is set or not still doesn't imply anything on whether the other operand of the bit and will have those bits set or clear. > > Isn't the right guarding condition that > > (INTVAL (operands[3]) & ( * BITS_PER_UNIT)) == 0 > > i.e. that we have guarantee the shift count doesn't have the topmost > > relevant bit set and we can ignore bits above it (UB)? > > Yes, based on above, it looks more robust, indeed. > > > And for the decision if we should use a masking or not perhaps > > if ((INTVAL (operands[3]) & (( * BITS_PER_UNIT) - 1)) > > != ( * BITS_PER_UNIT) - 1) > > , again ignoring bits we don't care about. > > Yes, looks more robust vs. UB, too (but updated guarding condition > won't allow UB values here). Let's also change this, for consistency. > > The patch that changes the condition to your approach is pre-approved. This is what I'll bootstrap/regtest tonight. 2018-05-03 Jakub Jelinek PR target/85582 * config/i386/i386.md (*ashl3_doubleword_mask, *ashl3_doubleword_mask_1, *3_doubleword_mask, *3_doubleword_mask_1): In condition require that the highest significant bit of the shift count mask is clear. In check whether and[sq]i3 is needed verify that all significant bits of the shift count other than the highest are set. * gcc.c-torture/execute/pr85582-3.c: New test. --- gcc/config/i386/i386.md.jj 2018-05-02 09:51:24.898661931 +0200 +++ gcc/config/i386/i386.md 2018-05-02 14:58:58.581494664 +0200 @@ -10366,7 +10366,7 @@ (define_insn_and_split "*ashl3_doub (match_operand:SI 2 "register_operand" "c") (match_operand:SI 3 "const_int_operand")) 0))) (clobber (reg:CC FLAGS_REG))] - "INTVAL (operands[3]) <= ( * BITS_PER_UNIT) - 1 + "(INTVAL (operands[3]) & ( * BITS_PER_UNIT)) == 0 && can_create_pseudo_p ()" "#" "&& 1" @@ -10385,7 +10385,8 @@ (define_insn_and_split "*ashl3_doub operands[8] = GEN_INT ( * BITS_PER_UNIT); - if (INTVAL (operands[3]) < ( * BITS_PER_UNIT) - 1) + if ((INTVAL (operands[3]) & (( * BITS_PER_UNIT) - 1)) + != (( * BITS_PER_UNIT) - 1)) { rtx tem = gen_reg_rtx (SImode); emit_insn (gen_andsi3 (tem, operands[2], operands[3])); @@ -10406,7 +10407,7 @@ (define_insn_and_split "*ashl3_doub (match_operand:QI 2 "register_operand" "c") (match_operand:QI 3 "const_int_operand")))) (clobber (reg:CC FLAGS_REG))] - "INTVAL (operands[3]) <= ( * BITS_PER_UNIT) - 1 + "(INTVAL (operands[3]) & ( * BITS_PER_UNIT)) == 0 && can_create_pseudo_p ()" "#" "&& 1" @@ -10425,7 +10426,8 @@ (define_insn_and_split "*ashl3_doub operands[8] = GEN_INT ( * BITS_PER_UNIT); - if (INTVAL (operands[3]) < ( * BITS_PER_UNIT) - 1) + if ((INTVAL (operands[3]) & (( * BITS_PER_UNIT) - 1)) + != (( * BITS_PER_UNIT) - 1)) { rtx tem = gen_reg_rtx (QImode); emit_insn (gen_andqi3 (tem, operands[2], operands[3])); @@ -11126,7 +11128,7 @@ (define_insn_and_split "* * BITS_PER_UNIT) - 1 + "(INTVAL (operands[3]) & ( * BITS_PER_UNIT)) == 0 && can_create_pseudo_p ()" "#" "&& 1" @@ -11145,7 +11147,8 @@ (define_insn_and_split "* * BITS_PER_UNIT); - if (INTVAL (operands[3]) < ( * BITS_PER_UNIT)-1) + if ((INTVAL (operands[3]) & (( * BITS_PER_UNIT) - 1)) + != (( * BITS_PER_UNIT) - 1)) { rtx tem = gen_reg_rtx (SImode); emit_insn (gen_andsi3 (tem, operands[2], operands[3])); @@ -11166,7 +11169,7 @@ (define_insn_and_split "* * BITS_PER_UNIT) - 1 + "(INTVAL (operands[3]) & ( * BITS_PER_UNIT)) == 0 && can_create_pseudo_p ()" "#" "&& 1" @@ -11185,7 +11188,8 @@ (define_insn_and_split "* * BITS_PER_UNIT); - if (INTVAL (operands[3]) < ( * BITS_PER_UNIT) - 1) + if ((INTVAL (operands[3]) & (( * BITS_PER_UNIT) - 1)) + != (( * BITS_PER_UNIT) - 1)) { rtx tem = gen_reg_rtx (QImode); emit_insn (gen_andqi3 (tem, operands[2], operands[3])); --- gcc/testsuite/gcc.c-torture/execute/pr85582-3.c.jj 2018-05-02 15:02:23.655654507 +0200 +++ gcc/testsuite/gcc.c-torture/execute/pr85582-3.c 2018-05-02 14:55:56.953353103 +0200 @@ -0,0 +1,55 @@ +/* PR target/85582 */ + +#ifdef __SIZEOF_INT128__ +typedef __int128 S; +typedef unsigned __int128 U; +#else +typedef long long S; +typedef unsigned long long U; +#endif + +__attribute__((noipa)) U +f1 (U x, int y) +{ + return x << (y & -2); +} + +__attribute__((noipa)) S +f2 (S x, int y) +{ + return x >> (y & -2); +} + +__attribute__((noipa)) U +f3 (U x, int y) +{ + return x >> (y & -2); +} + +int +main () +{ + U a = (U) 1 << (sizeof (U) * __CHAR_BIT__ - 7); + if (f1 (a, 5) != ((U) 1 << (sizeof (S) * __CHAR_BIT__ - 3))) + __builtin_abort (); + S b = (U) 0x101 << (sizeof (S) * __CHAR_BIT__ / 2 - 7); + if (f1 (b, sizeof (S) * __CHAR_BIT__ / 2) != (U) 0x101 << (sizeof (S) * __CHAR_BIT__ - 7)) + __builtin_abort (); + if (f1 (b, sizeof (S) * __CHAR_BIT__ / 2 + 2) != (U) 0x101 << (sizeof (S) * __CHAR_BIT__ - 5)) + __builtin_abort (); + S c = (U) 1 << (sizeof (S) * __CHAR_BIT__ - 1); + if ((U) f2 (c, 5) != ((U) 0x1f << (sizeof (S) * __CHAR_BIT__ - 5))) + __builtin_abort (); + if ((U) f2 (c, sizeof (S) * __CHAR_BIT__ / 2) != ((U) -1 << (sizeof (S) * __CHAR_BIT__ / 2 - 1))) + __builtin_abort (); + if ((U) f2 (c, sizeof (S) * __CHAR_BIT__ / 2 + 2) != ((U) -1 << (sizeof (S) * __CHAR_BIT__ / 2 - 3))) + __builtin_abort (); + U d = (U) 1 << (sizeof (S) * __CHAR_BIT__ - 1); + if (f3 (c, 5) != ((U) 0x1 << (sizeof (S) * __CHAR_BIT__ - 5))) + __builtin_abort (); + if (f3 (c, sizeof (S) * __CHAR_BIT__ / 2) != ((U) 1 << (sizeof (S) * __CHAR_BIT__ / 2 - 1))) + __builtin_abort (); + if (f3 (c, sizeof (S) * __CHAR_BIT__ / 2 + 2) != ((U) 1 << (sizeof (S) * __CHAR_BIT__ / 2 - 3))) + __builtin_abort (); + return 0; +} Jakub