From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 111156 invoked by alias); 14 Sep 2015 15:50:24 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 111134 invoked by uid 89); 14 Sep 2015 15:50:24 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.2 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,RCVD_IN_DNSWL_LOW,SPF_PASS autolearn=ham version=3.3.2 X-HELO: mail-vk0-f43.google.com Received: from mail-vk0-f43.google.com (HELO mail-vk0-f43.google.com) (209.85.213.43) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-GCM-SHA256 encrypted) ESMTPS; Mon, 14 Sep 2015 15:50:22 +0000 Received: by vkfp126 with SMTP id p126so61492213vkf.3 for ; Mon, 14 Sep 2015 08:50:20 -0700 (PDT) MIME-Version: 1.0 X-Received: by 10.31.138.141 with SMTP id m135mr14354510vkd.66.1442245820343; Mon, 14 Sep 2015 08:50:20 -0700 (PDT) Received: by 10.103.40.68 with HTTP; Mon, 14 Sep 2015 08:50:20 -0700 (PDT) In-Reply-To: <20150914120342.GB5849@msticlxl57.ims.intel.com> References: <20150619132130.GA15263@msticlxl57.ims.intel.com> <55BFD483.9020107@redhat.com> <20150821134447.GA3232@msticlxl57.ims.intel.com> <55D753F6.6020900@redhat.com> <20150908154904.GA5849@msticlxl57.ims.intel.com> <20150914120342.GB5849@msticlxl57.ims.intel.com> Date: Mon, 14 Sep 2015 16:50:00 -0000 Message-ID: Subject: Re: [RFC, PR target/65105] Use vector instructions for scalar 64bit computations on 32bit target From: Uros Bizjak To: Ilya Enkovich Cc: Jeff Law , "gcc-patches@gcc.gnu.org" Content-Type: text/plain; charset=UTF-8 X-SW-Source: 2015-09/txt/msg00946.txt.bz2 On Mon, Sep 14, 2015 at 2:03 PM, Ilya Enkovich wrote: > On 09 Sep 10:20, Uros Bizjak wrote: >> On Wed, Sep 9, 2015 at 10:12 AM, Uros Bizjak wrote: >> > On Tue, Sep 8, 2015 at 5:49 PM, Ilya Enkovich wrote: >> > >> > Please depend new changes to insn patterns to TARGET_STV. This way, >> > non-STV compiles will behave exactly as now. >> > >> > +;; Math-dependant integer modes with DImode. >> > +(define_mode_iterator SWIM1248x [(QI "TARGET_QIMODE_MATH") >> > + (HI "TARGET_HIMODE_MATH") >> > + SI DI]) >> > + >> > >> > DI should depend on TARGET_STV && TARGET_SSE2 >> > >> > @@ -2093,9 +2098,9 @@ >> > >> > (define_insn "*movdi_internal" >> > [(set (match_operand:DI 0 "nonimmediate_operand" >> > - "=r ,o ,r,r ,r,m ,*y,*y,?*y,?m,?r ,?*Ym,*v,*v,*v,m ,?r >> > ,?r,?*Yi,?*Ym,?*Yi,*k,*k ,*r ,*m") >> > + "=r ,o ,r,r ,r,m ,*y,*y,?*y,?m,?r ,?*Ym,*v,*v,*v,m,?r >> > ,?r,?*Yi,?*Ym,?*Yi,*k,*k ,*r ,*m") >> > (match_operand:DI 1 "general_operand" >> > - "riFo,riF,Z,rem,i,re,C ,*y,m ,*y,*Yn,r ,C ,*v,m ,*v,*Yj,*v,r >> > ,*Yj ,*Yn ,*r ,*km,*k,*k"))] >> > + "riFo,riF,Z,rem,i,re,C ,*y,m ,*y,*Yn,r ,C ,*v,m ,v,*Yj,*v,r >> > ,*Yj ,*Yn ,*r ,*km,*k,*k"))] >> > "!(MEM_P (operands[0]) && MEM_P (operands[1]))" >> > { >> > >> > Please add new alternative and use enabled attribute to conditionaly >> > select correct alternative. Preferrably, the new alternative should be >> > just after the one it changes, so you will have to change many of the >> > alternative's numbers in attribute calculations. >> > >> > +(define_insn_and_split "*anddi3_doubleword" >> > + [(set (match_operand:DI 0 "nonimmediate_operand" "=r,rm,r") >> > + (and:DI >> > + (match_operand:DI 1 "nonimmediate_operand" "%0,0,0") >> > + (match_operand:DI 2 "x86_64_szext_general_operand" "Z,re,rm"))) >> > + (clobber (reg:CC FLAGS_REG))] >> > + "!TARGET_64BIT && ix86_binary_operator_ok (AND, DImode, operands)" >> > + "#" >> > + "!TARGET_64BIT && reload_completed" >> > + [(parallel [(set (match_dup 0) >> > >> > You should add TARGET_STV && TARGET_SSE2 in the above and other added patterns. >> >> (I pushed "send" too fast here ;) ) >> >> Please note that you can use "&& ..." in the split condition to avoid >> duplication with insn condition. >> >> + if (TARGET_SSE4_1) >> + { >> + emit_insn (gen_sse2_loadld (gen_rtx_SUBREG (V4SImode, vreg, 0), >> + CONST0_RTX (V4SImode), >> + gen_rtx_SUBREG (SImode, reg, 0))); >> + emit_insn (gen_sse4_1_pinsrd (gen_rtx_SUBREG (V4SImode, vreg, 0), >> + gen_rtx_SUBREG (V4SImode, vreg, 0), >> + gen_rtx_SUBREG (SImode, reg, 4), >> + GEN_INT (2))); >> + } >> + else if (TARGET_INTER_UNIT_MOVES_TO_VEC) >> + { >> + rtx tmp = gen_reg_rtx (DImode); >> + emit_insn (gen_sse2_loadld (gen_rtx_SUBREG (V4SImode, vreg, 0), >> + CONST0_RTX (V4SImode), >> + gen_rtx_SUBREG (SImode, reg, 0))); >> + emit_insn (gen_sse2_loadld (gen_rtx_SUBREG (V4SImode, tmp, 0), >> + CONST0_RTX (V4SImode), >> + gen_rtx_SUBREG (SImode, reg, 4))); >> + emit_insn (gen_vec_interleave_lowv4si >> + (gen_rtx_SUBREG (V4SImode, vreg, 0), >> + gen_rtx_SUBREG (V4SImode, vreg, 0), >> + gen_rtx_SUBREG (V4SImode, tmp, 0))); >> + } >> + else >> + { >> + rtx tmp = assign_386_stack_local (DImode, SLOT_TEMP); >> + emit_move_insn (adjust_address (tmp, SImode, 0), >> + gen_rtx_SUBREG (SImode, reg, 0)); >> + emit_move_insn (adjust_address (tmp, SImode, 4), >> + gen_rtx_SUBREG (SImode, reg, 4)); >> + emit_move_insn (vreg, tmp); >> + } >> >> As a future cleanup idea, maybe we should reimplement the above code >> as an expander and use it in several places. IIRC, there are already >> several places in the code that would benefit from it. >> >> Uros. > > Hi Uros! > > Thanks a lot for your review! I fixed my patch according to your comments. Everything is under stv target now and should be transparent for non-stv targets. Bootstrapped and regtested for x86_64-unknown-linux-gnu. Does it look OK? The patch addresses all my concerns, I have only one additional small change request below: +(define_insn_and_split "*zext_doubleword" + [(set (match_operand:DI 0 "register_operand" "=r") + (zero_extend:DI (match_operand:SWI24 1 "nonimmediate_operand" "rm")))] + "!TARGET_64BIT && TARGET_STV && TARGET_SSE2" + "#" + "&& reload_completed && GENERAL_REG_P (operands[0])" + [(set (match_dup 0) (zero_extend:SI (match_dup 1))) + (set (match_dup 2) (const_int 0))] + "split_double_mode (DImode, &operands[0], 1, &operands[0], &operands[2]);") + +(define_insn_and_split "*zextqi_doubleword" + [(set (match_operand:DI 0 "register_operand" "=r") + (zero_extend:DI (match_operand:QI 1 "nonimmediate_operand" "qm")))] + "!TARGET_64BIT && TARGET_STV && TARGET_SSE2" + "#" + "&& reload_completed && GENERAL_REG_P (operands[0])" + [(set (match_dup 0) (zero_extend:SI (match_dup 1))) + (set (match_dup 2) (const_int 0))] + "split_double_mode (DImode, &operands[0], 1, &operands[0], &operands[2]);") + Please put the above patterns together with other zero_extend patterns. You can also merge these two patterns using SWI124 mode iterator with mode attribute as a register constraint. Also, no need to check for GENERAL_REG_P after reload, when "r" constraint is in effect: (define_insn_and_split "*zext_doubleword" [(set (match_operand:DI 0 "register_operand" "=r") (zero_extend:DI (match_operand:SWI124 1 "nonimmediate_operand" "m")))] "!TARGET_64BIT && TARGET_STV && TARGET_SSE2" "#" "&& reload_completed" [(set (match_dup 0) (zero_extend:SI (match_dup 1))) (set (match_dup 2) (const_int 0))] "split_double_mode (DImode, &operands[0], 1, &operands[0], &operands[2]);") I didn't look in the algorithmic part thoroughly, I assume that you discussed this with Jeff and come to an acceptable solution. I'll leave the final approval for the algorithmic part to Jeff. Thanks, Uros.