In-Reply-To: <20220607074133.3296-1-hongtao.liu@intel.com>
From: Uros Bizjak
Date: Tue, 7 Jun 2022 10:13:09 +0200
Subject: Re: [PATCH] Disparages SSE_REGS alternatives sligntly with ?v instead of *v in *mov{si,di}_internal.
To: liuhongt
Cc: "gcc-patches@gcc.gnu.org"
List-Id: Gcc-patches mailing list

On Tue, Jun 7, 2022 at 9:41 AM liuhongt wrote:
>
> So alternative v won't be ignored in record_reg_classes.
>
> Similar for *r alternatives in some vector patterns.
>
> It helps testcase in the PR, also RA now makes better decisions for
> gcc.target/i386/extract-insert-combining.c
>
>         movd    %esi, %xmm0
>         movd    %edi, %xmm1
> -       movl    %esi, -12(%rsp)
>         paddd   %xmm0, %xmm1
>         pinsrd  $0, %esi, %xmm0
>         paddd   %xmm1, %xmm0
>
> The patch has no big impact on SPEC2017 for both O2 and Ofast
> march=native run.
>
> And I noticed there are some changes in SPEC2017
>
> Before:
>         mov     mem, %eax
>         vmovd   %eax, %xmm0
>         ..
>         mov     %eax, 64(%rsp)
>
> After:
>         vmovd   mem, %xmm0
>         ..
>         vmovd   %xmm0, 64(%rsp)
>
> Which should be exactly what we want?
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}
> Ok for trunk?
>
> gcc/ChangeLog:
>
>         * config/i386/i386.md (*movsi_internal): Change alternative
>         from *v to ?v.
>         (*movdi_internal): Ditto.
>         * config/i386/sse.md (vec_set_0): Change alternative *r
>         to ?r.
>         (*vec_extractv4sf_mem): Ditto.
>         (*vec_extracthf): Ditto.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/i386/pr105513-1.c: New test.
>         * gcc.target/i386/extract-insert-combining.c: Add new
>         scan-assembler-not for spill.

Let's experiment with this approach.
The above is also better for TUNE_INTER_UNIT_MOVES_{TO,FROM}_VEC, since
moves between %eax and %xmm will again go through memory (I'm not sure
how much we care for these targets anyway).

OK.

Thanks,
Uros.

> ---
>  gcc/config/i386/i386.md | 8 ++++----
>  gcc/config/i386/sse.md | 8 ++++----
>  .../gcc.target/i386/extract-insert-combining.c | 1 +
>  gcc/testsuite/gcc.target/i386/pr105513-1.c | 16 ++++++++++++++++
>  4 files changed, 25 insertions(+), 8 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr105513-1.c
>
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index 48a98e1b68b..5b538413942 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -2251,9 +2251,9 @@ (define_split
>
>  (define_insn "*movdi_internal"
>    [(set (match_operand:DI 0 "nonimmediate_operand"
> -    "=r ,o ,r,r ,r,m ,*y,*y,?*y,?m,?r,?*y,*v,*v,*v,m ,m,?r ,?*Yd,?r,?*v,?*y,?*x,*k,*k ,*r,*m,*k")
> +    "=r ,o ,r,r ,r,m ,*y,*y,?*y,?m,?r,?*y,?v,?v,?v,m ,m,?r ,?*Yd,?r,?v,?*y,?*x,*k,*k ,*r,*m,*k")
>         (match_operand:DI 1 "general_operand"
> -    "riFo,riF,Z,rem,i,re,C ,*y,Bk ,*y,*y,r ,C ,*v,Bk,*v,v,*Yd,r ,*v,r ,*x ,*y ,*r,*kBk,*k,*k,CBC"))]
> +    "riFo,riF,Z,rem,i,re,C ,*y,Bk ,*y,*y,r ,C ,?v,Bk,?v,v,*Yd,r ,?v,r ,*x ,*y ,*r,*kBk,*k,*k,CBC"))]
>    "!(MEM_P (operands[0]) && MEM_P (operands[1]))
>     && ix86_hardreg_mov_ok (operands[0], operands[1])"
>  {
> @@ -2472,9 +2472,9 @@ (define_peephole2
>
>  (define_insn "*movsi_internal"
>    [(set (match_operand:SI 0 "nonimmediate_operand"
> -    "=r,m ,*y,*y,?*y,?m,?r,?*y,*v,*v,*v,m ,?r,?*v,*k,*k ,*rm,*k")
> +    "=r,m ,*y,*y,?*y,?m,?r,?*y,?v,?v,?v,m ,?r,?v,*k,*k ,*rm,*k")
>         (match_operand:SI 1 "general_operand"
> -    "g ,re,C ,*y,Bk ,*y,*y,r ,C ,*v,Bk,*v,*v,r ,*r,*kBk,*k ,CBC"))]
> +    "g ,re,C ,*y,Bk ,*y,*y,r ,C ,?v,Bk,?v,?v,r ,*r,*kBk,*k ,CBC"))]
>    "!(MEM_P (operands[0]) && MEM_P (operands[1]))
>     && ix86_hardreg_mov_ok (operands[0], operands[1])"
>  {
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index
62688f8e29d..d41ce2e1a9b 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -10590,11 +10590,11 @@ (define_insn "*vec_concatv4sf_0"
>  ;; see comment above inline_secondary_memory_needed function in i386.cc
>  (define_insn "vec_set_0"
>    [(set (match_operand:VI4F_128 0 "nonimmediate_operand"
> -         "=Yr,*x,v,v,v,x,x,v,Yr ,*x ,x ,m ,m ,m")
> +         "=Yr,*x,v,v,v,x,x,v,Yr ,?x ,x ,m ,m ,m")
>         (vec_merge:VI4F_128
>           (vec_duplicate:VI4F_128
>             (match_operand: 2 "general_operand"
> -         " Yr,*x,v,m,r ,m,x,v,*rm,*rm,*rm,!x,!*re,!*fF"))
> +         " Yr,*x,v,m,r ,m,x,v,?rm,?rm,?rm,!x,?re,!*fF"))
>           (match_operand:VI4F_128 1 "nonimm_or_0_operand"
>           " C , C,C,C,C ,C,0,v,0 ,0 ,x ,0 ,0 ,0")
>           (const_int 1)))]
> @@ -11056,7 +11056,7 @@ (define_insn_and_split "*sse4_1_extractps"
>     (set_attr "mode" "V4SF,V4SF,V4SF,*,*")])
>
>  (define_insn_and_split "*vec_extractv4sf_mem"
> -  [(set (match_operand:SF 0 "register_operand" "=v,*r,f")
> +  [(set (match_operand:SF 0 "register_operand" "=v,?r,f")
>         (vec_select:SF
>           (match_operand:V4SF 1 "memory_operand" "o,o,o")
>           (parallel [(match_operand 2 "const_0_to_3_operand")])))]
> @@ -11933,7 +11933,7 @@ (define_insn_and_split "*vec_extract_0"
>    "operands[1] = gen_lowpart (HFmode, operands[1]);")
>
>  (define_insn "*vec_extracthf"
> -  [(set (match_operand:HF 0 "register_sse4nonimm_operand" "=*r,m,x,v")
> +  [(set (match_operand:HF 0 "register_sse4nonimm_operand" "=?r,m,x,v")
>         (vec_select:HF
>           (match_operand:V8HF 1 "register_operand" "v,v,0,v")
>           (parallel
> diff --git a/gcc/testsuite/gcc.target/i386/extract-insert-combining.c b/gcc/testsuite/gcc.target/i386/extract-insert-combining.c
> index 32d951e6832..5a53d4cbf06 100644
> --- a/gcc/testsuite/gcc.target/i386/extract-insert-combining.c
> +++ b/gcc/testsuite/gcc.target/i386/extract-insert-combining.c
> @@ -4,6 +4,7 @@
>  /* { dg-final { scan-assembler-times "(?:vpaddd|paddd)\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]" 2 } } */
>  /* { dg-final { scan-assembler-times "(?:vpinsrd|pinsrd)\[ 
\\t\]+\[^\{\n\]*%xmm\[0-9\]" 1 } } */
>  /* { dg-final { scan-assembler-not "vmovss" } } */
> +/* { dg-final { scan-assembler-not {(?n)mov.*(%rsp)} { target { ! ia32 } } } } */
>
>  #include
>
> diff --git a/gcc/testsuite/gcc.target/i386/pr105513-1.c b/gcc/testsuite/gcc.target/i386/pr105513-1.c
> new file mode 100644
> index 00000000000..530f5292252
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr105513-1.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile { target { ! ia32 } } } */
> +/* { dg-options "-O2 -msse2 -mtune=skylake -mfpmath=sse" } */
> +/* { dg-final { scan-assembler-not "\\(%rsp\\)" } } */
> +
> +static int as_int(float x)
> +{
> +  return (union{float x; int i;}){x}.i;
> +}
> +
> +float f(double y, float x)
> +{
> +  int i = as_int(x);
> +  if (__builtin_expect(i > 99, 0)) return 0;
> +  if (i*2u < 77) if (i==2) return 0;
> +  return y*x;
> +}
> --
> 2.18.1
>
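
For readers following the thread, the difference between *v and ?v can be sketched with a toy scanner. This is only an illustration based on how the GCC internals manual describes the modifiers ('*' makes the following character be ignored when choosing register preferences, and each '?' makes the alternative one unit more costly). It is not the code in record_reg_classes. The names alt_summary and summarize_alternative are hypothetical, the weight used for '!' is arbitrary, and the scanner only understands single-letter constraints (two-letter ones such as Yr or Bk are out of scope).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy summary of one constraint alternative such as "*v" or "?v".
   Hypothetical type, used only for this illustration.  */
struct alt_summary {
    int penalty;          /* cost units added by '?' and '!' */
    bool counts_class;    /* does the class letter feed register preference? */
};

/* Scan a single alternative and report how the given single-letter
   register class would be treated.  Simplified model, not GCC code.  */
struct alt_summary summarize_alternative(const char *alt, char class_letter)
{
    struct alt_summary s = { 0, false };

    assert(alt != NULL);

    for (size_t i = 0; alt[i] != '\0'; i++) {
        switch (alt[i]) {
        case '?':
            s.penalty += 1;        /* "disparage slightly": one unit per '?' */
            break;
        case '!':
            s.penalty += 100;      /* "disparage severely"; weight is arbitrary here */
            break;
        case '*':
            if (alt[i + 1] != '\0')
                i++;               /* next character is skipped for preference */
            break;
        default:
            if (alt[i] == class_letter)
                s.counts_class = true;
            break;
        }
    }
    return s;
}
```

In this model summarize_alternative("*v", 'v') reports that the class is not counted and carries no penalty, while summarize_alternative("?v", 'v') reports that the class is counted with a penalty of one unit. That is the behaviour change the patch is after: SSE_REGS stays visible when preferences are computed, at a slightly higher cost.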