From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 16066 invoked by alias); 28 Aug 2014 08:57:41 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 16052 invoked by uid 89); 28 Aug 2014 08:57:40 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,RCVD_IN_DNSWL_LOW,SPF_PASS autolearn=ham version=3.3.2 X-HELO: mail-wi0-f177.google.com Received: from mail-wi0-f177.google.com (HELO mail-wi0-f177.google.com) (209.85.212.177) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-SHA encrypted) ESMTPS; Thu, 28 Aug 2014 08:57:38 +0000 Received: by mail-wi0-f177.google.com with SMTP id cc10so477825wib.10 for ; Thu, 28 Aug 2014 01:57:34 -0700 (PDT) MIME-Version: 1.0 X-Received: by 10.194.221.74 with SMTP id qc10mr3256450wjc.39.1409216253265; Thu, 28 Aug 2014 01:57:33 -0700 (PDT) Received: by 10.194.20.69 with HTTP; Thu, 28 Aug 2014 01:57:33 -0700 (PDT) In-Reply-To: <53FEDF34.4090605@linaro.org> References: <53FEDF34.4090605@linaro.org> Date: Thu, 28 Aug 2014 08:57:00 -0000 Message-ID: Subject: Re: [PATCH 2/2] Enable elimination of zext/sext From: Richard Biener To: Kugan Cc: Uros Bizjak , "gcc-patches@gcc.gnu.org" , Jakub Jelinek Content-Type: text/plain; charset=UTF-8 X-IsSubscribed: yes X-SW-Source: 2014-08/txt/msg02551.txt.bz2 On Thu, Aug 28, 2014 at 9:50 AM, Kugan wrote: > > > On 27/08/14 20:07, Richard Biener wrote: >> On Wed, Aug 27, 2014 at 12:01 PM, Uros Bizjak wrote: >>> Hello! >>> >>>> 2014-08-07 Kugan Vivekanandarajah >>>> >>>> * calls.c (precompute_arguments): Check >>>> promoted_for_signed_and_unsigned_p and set the promoted mode. >>>> (promoted_for_signed_and_unsigned_p): New function. >>>> (expand_expr_real_1): Check promoted_for_signed_and_unsigned_p >>>> and set the promoted mode. >>>> * expr.h (promoted_for_signed_and_unsigned_p): New function definition. >>>> * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if >>>> SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED. >>> >>> This patch regresses: >>> >>> Running target unix >>> FAIL: libgomp.fortran/simd7.f90 -O2 execution test >>> FAIL: libgomp.fortran/simd7.f90 -Os execution test >>> >>> on alphaev6-linux-gnu. >>> >>> The problem can be illustrated with attached testcase with a >>> crosscompiler to alphaev68-linux-gnu (-O2 -fopenmp). The problem is in >>> missing SImode extension after DImode shift of SImode subregs for this >>> part: >>> >>> --cut here-- >>> # test.23_12 = PHI <0(37), 1(36)> >>> _242 = ivtmp.181_73 + 2147483645; >>> _240 = _242 * 2; >>> _63 = (integer(kind=4)) _240; >>> if (ubound.6_99 <= 2) >>> goto ; >>> else >>> goto ; >>> ;; succ: 39 >>> ;; 40 >>> >>> ;; basic block 39, loop depth 1 >>> ;; pred: 38 >>> pretmp_337 = test.23_12 | l_76; >>> goto ; >>> ;; succ: 45 >>> >>> ;; basic block 40, loop depth 1 >>> ;; pred: 38 >>> _11 = *c_208[0]; >>> if (_11 != _63) >>> goto ; >>> else >>> goto ; >>> --cut here-- >>> >>> this expands to: >>> >>> (code_label 592 591 593 35 "" [0 uses]) >>> >>> (note 593 592 0 NOTE_INSN_BASIC_BLOCK) >>> >>> ;; _63 = (integer(kind=4)) _240; >>> >>> (insn 594 593 595 (set (reg:SI 538) >>> (const_int 1073741824 [0x40000000])) -1 >>> (nil)) >>> >>> (insn 595 594 596 (set (reg:SI 539) >>> (plus:SI (reg:SI 538) >>> (const_int 1073741824 [0x40000000]))) -1 >>> (nil)) >>> >>> (insn 596 595 597 (set (reg:SI 537) >>> (plus:SI (reg:SI 539) >>> (const_int -3 [0xfffffffffffffffd]))) -1 >>> (expr_list:REG_EQUAL (const_int 2147483645 [0x7ffffffd]) >>> (nil))) >>> >>> (insn 597 596 598 (set (reg:SI 536 [ D.1700 ]) >>> (plus:SI (subreg/s/v/u:SI (reg:DI 144 [ ivtmp.181 ]) 0) >>> (reg:SI 537))) -1 >>> (nil)) >>> >>> (insn 598 597 599 (set (reg:DI 540) >>> (ashift:DI (subreg:DI (reg:SI 536 [ D.1700 ]) 0) >>> (const_int 1 [0x1]))) -1 >>> (nil)) >>> >>> (insn 599 598 0 (set (reg:DI 145 [ D.1694 ]) >>> (reg:DI 540)) -1 >>> (nil)) >>> >>> ... >>> >>> (note 610 609 0 NOTE_INSN_BASIC_BLOCK) >>> >>> ;; _11 = *c_208[0]; >>> >>> (insn 611 610 0 (set (reg:DI 120 [ D.1694 ]) >>> (sign_extend:DI (mem:SI (reg/v/f:DI 227 [ c ]) [7 *c_208+0 S4 >>> A128]))) simd7.f90:12 -1 >>> (nil)) >>> >>> ;; if (_11 != _63) >>> >>> (insn 612 611 613 40 (set (reg:DI 545) >>> (eq:DI (reg:DI 120 [ D.1694 ]) >>> (reg:DI 145 [ D.1694 ]))) simd7.f90:12 -1 >>> (nil)) >>> >>> (jump_insn 613 612 616 40 (set (pc) >>> (if_then_else (eq (reg:DI 545) >>> (const_int 0 [0])) >>> (label_ref 0) >>> (pc))) simd7.f90:12 -1 >>> (int_list:REG_BR_PROB 450 (nil))) >>> >>> which results in following asm: >>> >>> $L35: >>> addl $25,$7,$2 # 597 addsi3/1 [length = 4] >>> addq $2,$2,$2 # 598 ashldi3/1 [length = 4] <------ here >>> bne $24,$L145 # 601 *bcc_normal [length = 4] >>> lda $4,4($20) # 627 *adddi_internal/2 [length = 4] >>> ldl $8,0($20) # 611 *extendsidi2_1/2 [length = 4] >>> lda $3,3($31) # 74 *movdi/2 [length = 4] >>> cmpeq $8,$2,$2 # 612 *setcc_internal [length = 4] <-- compare >>> bne $2,$L40 # 613 *bcc_normal [length = 4] >>> br $31,$L88 # 2403 jump [length = 4] >>> .align 4 >>> ... >>> >>> Tracking the values with the debugger shows wrong calculation: >>> >>> 0x000000012000108c <+1788>: addl t10,t12,t1 >>> 0x0000000120001090 <+1792>: addq t1,t1,t1 >>> ... >>> 0x00000001200010a4 <+1812>: cmpeq t6,t1,t1 >>> 0x00000001200010a8 <+1816>: bne t1,0x1200010c0 >>> >>> (gdb) si >>> 0x000000012000108c 17 l = l .or. any (b /= 7 + i) >>> (gdb) i r t10 t12 >>> t10 0x7 7 >>> t12 0x7ffffffd 2147483645 >>> >>> (gdb) si >>> 0x0000000120001090 17 l = l .or. any (b /= 7 + i) >>> (gdb) i r t1 >>> t1 0xffffffff80000004 -2147483644 >>> >>> (gdb) si >>> 18 l = l .or. any (c /= 8 + 2 * i) >>> (gdb) i r t1 >>> t1 0xffffffff00000008 -4294967288 >>> >>> At this point, the calculation should zero-extend SImode value to full >>> DImode, since compare operates on DImode values. The problematic insn >>> is (insn 599), which is now a DImode assignment instead of >>> zero-extend, due to: >>> >>> --- a/gcc/cfgexpand.c >>> +++ b/gcc/cfgexpand.c >>> @@ -3309,7 +3309,13 @@ expand_gimple_stmt_1 (gimple stmt) >>> GET_MODE (target), temp, unsignedp); >>> } >>> >>> - convert_move (SUBREG_REG (target), temp, unsignedp); >>> + if ((SUBREG_PROMOTED_GET (target) == SRP_SIGNED_AND_UNSIGNED) >>> + && (GET_CODE (temp) == SUBREG) >>> + && (GET_MODE (target) == GET_MODE (temp)) >>> + && (GET_MODE (SUBREG_REG (target)) == GET_MODE (SUBREG_REG (temp)))) >>> + emit_move_insn (SUBREG_REG (target), SUBREG_REG (temp)); >>> + else >>> + convert_move (SUBREG_REG (target), temp, unsignedp); >>> } >>> else if (nontemporal && emit_storent_insn (target, temp)) >>> ; >>> >>> When compiling this code, we have: >>> >>> lhs = _63 >>> target = (subreg/s/v/u:SI (reg:DI 145 [ D.1694 ]) 0) >>> temp = (subreg:SI (reg:DI 540) 0) >>> >>> So, the code assumes that it is possible to copy (reg:DI 540) directly >>> to (reg:DI 154). However, this is not the case, since we still have >>> garbage in the top 32bits. >>> >>> Reverting the part above fixes the runtime failure, since (insn 599) is now: >>> >>> (insn 599 598 0 (set (reg:DI 145 [ D.1694 ]) >>> (zero_extend:DI (subreg:SI (reg:DI 540) 0))) -1 >>> (nil)) >>> >>> It looks to me that we have also to check the temp with SUBREG_PROMOTED_*. >> >> Yeah, that makes sense. >> > > Thanks Richard for your comments. > > I added this part of the code (in cfgexpand.c) to handle binary/unary/.. > gimple operations and used the LHS value range to infer the assigned > value range. I will revert this part of the code as this is wrong. > > I dont think checking promoted_mode for temp will be necessary here as > convert_move will handle it correctly if promoted_mode is set for temp. > > Thus, I will reimplement setting promoted_mode to temp (in > expand_expr_real_2) based on the gimple statement content on RHS. i.e. > by looking at the RHS operands and its value ranges and by calculating > the resulting value range. Does this sound OK to you. No, this sounds backward again and won't work because those operands again could be just truncated - thus you can't rely on their value-range. What you would need is VRP computing value-ranges in the promoted mode from the start (and it doesn't do that). Richard. > Thanks, > Kugan >