From mboxrd@z Thu Jan  1 00:00:00 1970
Received: (qmail 7151 invoked by alias); 24 Mar 2015 13:43:46 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 7142 invoked by uid 89); 24 Mar 2015 13:43:45 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: Yes, score=5.8 required=5.0 tests=AWL,BAYES_99,BAYES_999,FREEMAIL_FROM,KAM_FROM_URIBL_PCCC,MEDICAL_SUBJECT,RCVD_IN_DNSWL_LOW,SPF_PASS autolearn=no version=3.3.2
X-HELO: mail-qg0-f49.google.com
Received: from mail-qg0-f49.google.com (HELO mail-qg0-f49.google.com) (209.85.192.49)
 by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-GCM-SHA256 encrypted) ESMTPS;
 Tue, 24 Mar 2015 13:43:43 +0000
Received: by qgep97 with SMTP id p97so67743543qge.1 for ;
 Tue, 24 Mar 2015 06:43:41 -0700 (PDT)
X-Received: by 10.140.236.140 with SMTP id h134mr5795005qhc.87.1427204608658;
 Tue, 24 Mar 2015 06:43:28 -0700 (PDT)
Received: from msticlxl57.ims.intel.com (fmdmzpr01-ext.fm.intel.com. [192.55.54.36])
 by mx.google.com with ESMTPSA id k126sm2691323qhc.42.2015.03.24.06.43.24
 (version=TLSv1 cipher=ECDHE-RSA-AES128-SHA bits=128/128);
 Tue, 24 Mar 2015 06:43:27 -0700 (PDT)
Date: Tue, 24 Mar 2015 13:43:00 -0000
From: Kirill Yukhin
To: Ilya Tocar
Cc: GCC Patches, Uros Bizjak
Subject: Re: [PATCH] Make wider use of "v" constraint in i386.md
Message-ID: <20150324134311.GA40649@msticlxl57.ims.intel.com>
References: <20150319092404.GA73948@msticlxl7.ims.intel.com> <20150323160222.GB10265@msticlxl7.ims.intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150323160222.GB10265@msticlxl7.ims.intel.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-IsSubscribed: yes
X-SW-Source: 2015-03/txt/msg01246.txt.bz2

Hello,
On 23 Mar 19:02, Ilya Tocar wrote:
> Hi,
>
> I've renamed EXT_SSE_REG_P into EXT_REX_SSE_REG_P for consistency.
> Ok for stage1?
Patch is OK for stage1.

--
Thanks, K

> On 19 Mar 12:24, Ilya Tocar wrote:
> > Hi,
> >
> > There was some discussion about "x" constraints being too conservative
> > for some patterns in i386.md.
> > Patch below fixes it. This is probably stage1 material.
> >
> > ChangeLog:
> >
> > gcc/
> >
>
> 2015-03-23  Ilya Tocar
>
>     * config/i386/i386.h (EXT_REX_SSE_REG_P): New.
>     * config/i386/i386.md (*cmpi_mixed): Use "v"
>     constraint.
>     (*cmpi_sse): Ditto.
>     (*movxi_internal_avx512f): Ditto.
>     (define_split): Check for xmm16+, when splitting scalar float_extend.
>     (*extendsfdf2_mixed): Use "v" constraint.
>     (*extendsfdf2_sse): Ditto.
>     (define_split): Check for xmm16+, when splitting scalar float_truncate.
>     (*truncdfsf_fast_sse): Use "v" constraint.
>     (fix_trunc_sse): Ditto.
>     (*float2_sse): Ditto.
>     (define_peephole2): Check for xmm16+, when converting scalar
>     float_truncate.
>     (define_peephole2): Check for xmm16+, when converting scalar
>     float_extend.
>     (*fop__comm_mixed): Use "v" constraint.
>     (*fop__comm_sse): Ditto.
>     (*fop__1_mixed): Ditto.
>     (*sqrt2_sse): Ditto.
>     (*ieee_s3): Ditto.
>
>
> ---
>  gcc/config/i386/i386.h  |  2 ++
>  gcc/config/i386/i386.md | 82 +++++++++++++++++++++++++++----------------------
>  2 files changed, 47 insertions(+), 37 deletions(-)
>
> diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> index 1e755d3..70a471b 100644
> --- a/gcc/config/i386/i386.h
> +++ b/gcc/config/i386/i386.h
> @@ -1477,6 +1477,8 @@ enum reg_class
>  #define REX_SSE_REGNO_P(N) \
>    IN_RANGE ((N), FIRST_REX_SSE_REG, LAST_REX_SSE_REG)
>
> +#define EXT_REX_SSE_REG_P(X) (REG_P (X) && EXT_REX_SSE_REGNO_P (REGNO (X)))
> +
>  #define EXT_REX_SSE_REGNO_P(N) \
>    IN_RANGE ((N), FIRST_EXT_REX_SSE_REG, LAST_EXT_REX_SSE_REG)
>
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index 1129b93..dc1cd20 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -1639,8 +1639,8 @@
>  (define_insn "*cmpi_mixed"
>    [(set (reg:FPCMP FLAGS_REG)
>          (compare:FPCMP
> -          (match_operand:MODEF 0 "register_operand" "f,x")
> -          (match_operand:MODEF 1 "nonimmediate_operand" "f,xm")))]
> +          (match_operand:MODEF 0 "register_operand" "f,v")
> +          (match_operand:MODEF 1 "nonimmediate_operand" "f,vm")))]
>    "TARGET_MIX_SSE_I387
>     && SSE_FLOAT_MODE_P (mode)"
>    "* return output_fp_compare (insn, operands, true,
> @@ -1666,8 +1666,8 @@
>  (define_insn "*cmpi_sse"
>    [(set (reg:FPCMP FLAGS_REG)
>          (compare:FPCMP
> -          (match_operand:MODEF 0 "register_operand" "x")
> -          (match_operand:MODEF 1 "nonimmediate_operand" "xm")))]
> +          (match_operand:MODEF 0 "register_operand" "v")
> +          (match_operand:MODEF 1 "nonimmediate_operand" "vm")))]
>    "TARGET_SSE_MATH
>     && SSE_FLOAT_MODE_P (mode)"
>    "* return output_fp_compare (insn, operands, true,
> @@ -1959,8 +1959,8 @@
>     (set_attr "length_immediate" "1")])
>
>  (define_insn "*movxi_internal_avx512f"
> -  [(set (match_operand:XI 0 "nonimmediate_operand" "=x,x ,m")
> -        (match_operand:XI 1 "vector_move_operand" "C ,xm,x"))]
> +  [(set (match_operand:XI 0 "nonimmediate_operand" "=v,v ,m")
> +        (match_operand:XI 1 "vector_move_operand" "C ,vm,v"))]
>    "TARGET_AVX512F && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
>  {
>    switch (which_alternative)
> @@ -4003,7 +4003,9 @@
>            (match_operand:SF 1 "nonimmediate_operand")))]
>    "TARGET_USE_VECTOR_FP_CONVERTS
>     && optimize_insn_for_speed_p ()
> -   && reload_completed && SSE_REG_P (operands[0])"
> +   && reload_completed && SSE_REG_P (operands[0])
> +   && (!EXT_REX_SSE_REG_P (operands[0])
> +       || TARGET_AVX512VL)"
>    [(set (match_dup 2)
>          (float_extend:V2DF
>            (vec_select:V2SF
> @@ -4048,9 +4050,9 @@
>    "operands[2] = gen_rtx_REG (SFmode, REGNO (operands[0]));")
>
>  (define_insn "*extendsfdf2_mixed"
> -  [(set (match_operand:DF 0 "nonimmediate_operand" "=f,m,x")
> +  [(set (match_operand:DF 0 "nonimmediate_operand" "=f,m,v")
>          (float_extend:DF
> -          (match_operand:SF 1 "nonimmediate_operand" "fm,f,xm")))]
> +          (match_operand:SF 1 "nonimmediate_operand" "fm,f,vm")))]
>    "TARGET_SSE2 && TARGET_MIX_SSE_I387"
>  {
>    switch (which_alternative)
> @@ -4071,8 +4073,8 @@
>     (set_attr "mode" "SF,XF,DF")])
>
>  (define_insn "*extendsfdf2_sse"
> -  [(set (match_operand:DF 0 "nonimmediate_operand" "=x")
> -        (float_extend:DF (match_operand:SF 1 "nonimmediate_operand" "xm")))]
> +  [(set (match_operand:DF 0 "nonimmediate_operand" "=v")
> +        (float_extend:DF (match_operand:SF 1 "nonimmediate_operand" "vm")))]
>    "TARGET_SSE2 && TARGET_SSE_MATH"
>    "%vcvtss2sd\t{%1, %d0|%d0, %1}"
>    [(set_attr "type" "ssecvt")
> @@ -4155,7 +4157,9 @@
>            (match_operand:DF 1 "nonimmediate_operand")))]
>    "TARGET_USE_VECTOR_FP_CONVERTS
>     && optimize_insn_for_speed_p ()
> -   && reload_completed && SSE_REG_P (operands[0])"
> +   && reload_completed && SSE_REG_P (operands[0])
> +   && (!EXT_REX_SSE_REG_P (operands[0])
> +       || TARGET_AVX512VL)"
>    [(set (match_dup 2)
>          (vec_concat:V4SF
>            (float_truncate:V2SF
> @@ -4228,9 +4232,9 @@
>  ;; Yes, this one doesn't depend on flag_unsafe_math_optimizations,
>  ;; because nothing we do here is unsafe.
>  (define_insn "*truncdfsf_fast_sse"
> -  [(set (match_operand:SF 0 "nonimmediate_operand" "=x")
> +  [(set (match_operand:SF 0 "nonimmediate_operand" "=v")
>          (float_truncate:SF
> -          (match_operand:DF 1 "nonimmediate_operand" "xm")))]
> +          (match_operand:DF 1 "nonimmediate_operand" "vm")))]
>    "TARGET_SSE2 && TARGET_SSE_MATH"
>    "%vcvtsd2ss\t{%1, %d0|%d0, %1}"
>    [(set_attr "type" "ssecvt")
> @@ -4544,7 +4548,7 @@
>  ;; When SSE is available, it is always faster to use it!
>  (define_insn "fix_trunc_sse"
>    [(set (match_operand:SWI48 0 "register_operand" "=r,r")
> -        (fix:SWI48 (match_operand:MODEF 1 "nonimmediate_operand" "x,m")))]
> +        (fix:SWI48 (match_operand:MODEF 1 "nonimmediate_operand" "v,m")))]
>    "SSE_FLOAT_MODE_P (mode)
>     && (!TARGET_FISTTP || TARGET_SSE_MATH)"
>    "%vcvtt2si\t{%1, %0|%0, %1}"
> @@ -4864,7 +4868,7 @@
>  })
>
>  (define_insn "*float2_sse"
> -  [(set (match_operand:MODEF 0 "register_operand" "=f,x,x")
> +  [(set (match_operand:MODEF 0 "register_operand" "=f,v,v")
>          (float:MODEF
>            (match_operand:SWI48 1 "nonimmediate_operand" "m,r,m")))]
>    "SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH"
> @@ -4967,7 +4971,9 @@
>     && optimize_function_for_speed_p (cfun)
>     && SSE_REG_P (operands[0])
>     && (!SSE_REG_P (operands[1])
> -       || REGNO (operands[0]) != REGNO (operands[1]))"
> +       || REGNO (operands[0]) != REGNO (operands[1]))
> +   && (!EXT_REX_SSE_REG_P (operands[0])
> +       || TARGET_AVX512VL)"
>    [(set (match_dup 0)
>          (vec_merge:V4SF
>            (vec_duplicate:V4SF
> @@ -4994,7 +5000,9 @@
>     && optimize_function_for_speed_p (cfun)
>     && SSE_REG_P (operands[0])
>     && (!SSE_REG_P (operands[1])
> -       || REGNO (operands[0]) != REGNO (operands[1]))"
> +       || REGNO (operands[0]) != REGNO (operands[1]))
> +   && (!EXT_REX_SSE_REG_P (operands[0])
> +       || TARGET_AVX512VL)"
>    [(set (match_dup 0)
>          (vec_merge:V2DF
>            (float_extend:V2DF
> @@ -13617,10 +13625,10 @@
>  ;; so use special patterns for add and mull.
>
>  (define_insn "*fop__comm_mixed"
> -  [(set (match_operand:MODEF 0 "register_operand" "=f,x,x")
> +  [(set (match_operand:MODEF 0 "register_operand" "=f,x,v")
>          (match_operator:MODEF 3 "binary_fp_operator"
> -          [(match_operand:MODEF 1 "nonimmediate_operand" "%0,0,x")
> -           (match_operand:MODEF 2 "nonimmediate_operand" "fm,xm,xm")]))]
> +          [(match_operand:MODEF 1 "nonimmediate_operand" "%0,0,v")
> +           (match_operand:MODEF 2 "nonimmediate_operand" "fm,xm,vm")]))]
>    "SSE_FLOAT_MODE_P (mode) && TARGET_MIX_SSE_I387
>     && COMMUTATIVE_ARITH_P (operands[3])
>     && !(MEM_P (operands[1]) && MEM_P (operands[2]))"
> @@ -13634,7 +13642,7 @@
>                   (const_string "fmul")
>                   (const_string "fop"))))
>     (set_attr "isa" "*,noavx,avx")
> -   (set_attr "prefix" "orig,orig,vex")
> +   (set_attr "prefix" "orig,orig,maybe_evex")
>     (set_attr "mode" "")])
>
>  (define_insn "*fop__comm_sse"
> @@ -13651,7 +13659,7 @@
>                   (const_string "ssemul")
>                   (const_string "sseadd")))
>     (set_attr "isa" "noavx,avx")
> -   (set_attr "prefix" "orig,vex")
> +   (set_attr "prefix" "orig,maybe_evex")
>     (set_attr "mode" "")])
>
>  (define_insn "*fop__comm_i387"
> @@ -13670,10 +13678,10 @@
>     (set_attr "mode" "")])
>
>  (define_insn "*fop__1_mixed"
> -  [(set (match_operand:MODEF 0 "register_operand" "=f,f,x,x")
> +  [(set (match_operand:MODEF 0 "register_operand" "=f,f,x,v")
>          (match_operator:MODEF 3 "binary_fp_operator"
> -          [(match_operand:MODEF 1 "nonimmediate_operand" "0,fm,0,x")
> -           (match_operand:MODEF 2 "nonimmediate_operand" "fm,0,xm,xm")]))]
> +          [(match_operand:MODEF 1 "nonimmediate_operand" "0,fm,0,v")
> +           (match_operand:MODEF 2 "nonimmediate_operand" "fm,0,xm,vm")]))]
>    "SSE_FLOAT_MODE_P (mode) && TARGET_MIX_SSE_I387
>     && !COMMUTATIVE_ARITH_P (operands[3])
>     && !(MEM_P (operands[1]) && MEM_P (operands[2]))"
> @@ -13694,7 +13702,7 @@
>             ]
>             (const_string "fop")))
>     (set_attr "isa" "*,*,noavx,avx")
> -   (set_attr "prefix" "orig,orig,orig,vex")
> +   (set_attr "prefix" "orig,orig,orig,maybe_evex")
>     (set_attr "mode" "")])
>
>  (define_insn "*rcpsf2_sse"
> @@ -13710,10 +13718,10 @@
>     (set_attr "mode" "SF")])
>
>  (define_insn "*fop__1_sse"
> -  [(set (match_operand:MODEF 0 "register_operand" "=x,x")
> +  [(set (match_operand:MODEF 0 "register_operand" "=x,v")
>          (match_operator:MODEF 3 "binary_fp_operator"
> -          [(match_operand:MODEF 1 "register_operand" "0,x")
> -           (match_operand:MODEF 2 "nonimmediate_operand" "xm,xm")]))]
> +          [(match_operand:MODEF 1 "register_operand" "0,v")
> +           (match_operand:MODEF 2 "nonimmediate_operand" "xm,vm")]))]
>    "SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH
>     && !COMMUTATIVE_ARITH_P (operands[3])"
>    "* return output_387_binary_op (insn, operands);"
> @@ -13725,7 +13733,7 @@
>             ]
>             (const_string "sseadd")))
>     (set_attr "isa" "noavx,avx")
> -   (set_attr "prefix" "orig,vex")
> +   (set_attr "prefix" "orig,maybe_evex")
>     (set_attr "mode" "")])
>
>  ;; This pattern is not fully shadowed by the pattern above.
> @@ -14029,9 +14037,9 @@
>  })
>
>  (define_insn "*sqrt2_sse"
> -  [(set (match_operand:MODEF 0 "register_operand" "=x")
> +  [(set (match_operand:MODEF 0 "register_operand" "=v")
>          (sqrt:MODEF
> -          (match_operand:MODEF 1 "nonimmediate_operand" "xm")))]
> +          (match_operand:MODEF 1 "nonimmediate_operand" "vm")))]
>    "SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH"
>    "%vsqrt\t{%1, %d0|%d0, %1}"
>    [(set_attr "type" "sse")
> @@ -16993,17 +17001,17 @@
>     (UNSPEC_IEEE_MIN "min")])
>
>  (define_insn "*ieee_s3"
> -  [(set (match_operand:MODEF 0 "register_operand" "=x,x")
> +  [(set (match_operand:MODEF 0 "register_operand" "=x,v")
>          (unspec:MODEF
> -          [(match_operand:MODEF 1 "register_operand" "0,x")
> -           (match_operand:MODEF 2 "nonimmediate_operand" "xm,xm")]
> +          [(match_operand:MODEF 1 "register_operand" "0,v")
> +           (match_operand:MODEF 2 "nonimmediate_operand" "xm,vm")]
>           IEEE_MAXMIN))]
>    "SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH"
>    "@
>     \t{%2, %0|%0, %2}
>     v\t{%2, %1, %0|%0, %1, %2}"
>    [(set_attr "isa" "noavx,avx")
> -   (set_attr "prefix" "orig,vex")
> +   (set_attr "prefix" "orig,maybe_evex")
>     (set_attr "type" "sseadd")
>     (set_attr "mode" "")])
>
> --
> 1.8.3.1