public inbox for gcc-patches@gcc.gnu.org
From: Richard Biener <richard.guenther@gmail.com>
To: Hongyu Wang <hongyu.wang@intel.com>
Cc: gcc-patches@gcc.gnu.org, jakub@redhat.com, hongtao.liu@intel.com,
	 hubicka@ucw.cz
Subject: Re: [PATCH 11/13] [APX EGPR] Handle legacy insns that only support GPR16 (3/5)
Date: Thu, 31 Aug 2023 11:28:54 +0200
Message-ID: <CAFiYyc2-qxZx4KjVf8NEm9EyRm4=LA3j462CAfUv5e_jJrrW6g@mail.gmail.com>
In-Reply-To: <CAFiYyc1E1Ni57TWDoH0yuim4Lbg6s4XU1YWM=BeT33BDcNzAMw@mail.gmail.com>

On Thu, Aug 31, 2023 at 11:26 AM Richard Biener
<richard.guenther@gmail.com> wrote:
>
> On Thu, Aug 31, 2023 at 10:25 AM Hongyu Wang via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> >
> > From: Kong Lingling <lingling.kong@intel.com>
> >
> > Disable EGPR usage for below legacy insns in opcode map2/3 that have vex
> > but no evex counterpart.
> >
> > insn list:
> > 1. phminposuw/vphminposuw
> > 2. ptest/vptest
> > 3. roundps/vroundps, roundpd/vroundpd,
> >    roundss/vroundss, roundsd/vroundsd
> > 4. pcmpestri/vpcmpestri, pcmpestrm/vpcmpestrm
> > 5. pcmpistri/vpcmpistri, pcmpistrm/vpcmpistrm
>
> How are GPRs involved in the above?  Or did I misunderstand something?

Following up on my own question: for the memory operand alternatives, I
guess.  How about simply disabling the memory alternatives when EGPR is
active?  Wouldn't that simplify the initial patchset a lot?  Re-enabling
them when deemed important could then be done as a follow-up.

Richard.

> > 6. aesimc/vaesimc, aeskeygenassist/vaeskeygenassist
> >
> > gcc/ChangeLog:
> >
> >         * config/i386/i386-protos.h (x86_evex_reg_mentioned_p): New
> >         prototype.
> >         * config/i386/i386.cc (x86_evex_reg_mentioned_p): New
> >         function.
> >         * config/i386/i386.md (sse4_1_round<mode>2): Set attr gpr32 0
> >         and constraint Bt/BM to all non-evex alternatives, adjust
> >         alternative outputs if evex reg is mentioned.
> >         * config/i386/sse.md (<sse4_1>_ptest<mode>): Set attr gpr32 0
> >         and constraint Bt/BM to all non-evex alternatives.
> >         (ptesttf2): Likewise.
> >         (<sse4_1>_round<ssemodesuffix><avxsizesuffix>): Likewise.
> >         (sse4_1_round<ssescalarmodesuffix>): Likewise.
> >         (sse4_2_pcmpestri): Likewise.
> >         (sse4_2_pcmpestrm): Likewise.
> >         (sse4_2_pcmpestr_cconly): Likewise.
> >         (sse4_2_pcmpistr): Likewise.
> >         (sse4_2_pcmpistri): Likewise.
> >         (sse4_2_pcmpistrm): Likewise.
> >         (sse4_2_pcmpistr_cconly): Likewise.
> >         (aesimc): Likewise.
> >         (aeskeygenassist): Likewise.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         * gcc.target/i386/apx-legacy-insn-check-norex2.c: Add intrinsic
> >         tests.
> > ---
> >  gcc/config/i386/i386-protos.h                 |  1 +
> >  gcc/config/i386/i386.cc                       | 13 +++
> >  gcc/config/i386/i386.md                       |  3 +-
> >  gcc/config/i386/sse.md                        | 93 +++++++++++++------
> >  .../i386/apx-legacy-insn-check-norex2.c       | 55 ++++++++++-
> >  5 files changed, 132 insertions(+), 33 deletions(-)
> >
> > diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
> > index 78eb3e0f584..bbb219e3039 100644
> > --- a/gcc/config/i386/i386-protos.h
> > +++ b/gcc/config/i386/i386-protos.h
> > @@ -65,6 +65,7 @@ extern bool extended_reg_mentioned_p (rtx);
> >  extern bool x86_extended_QIreg_mentioned_p (rtx_insn *);
> >  extern bool x86_extended_reg_mentioned_p (rtx);
> >  extern bool x86_extended_rex2reg_mentioned_p (rtx);
> > +extern bool x86_evex_reg_mentioned_p (rtx [], int);
> >  extern bool x86_maybe_negate_const_int (rtx *, machine_mode);
> >  extern machine_mode ix86_cc_mode (enum rtx_code, rtx, rtx);
> >
> > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> > index f5d642948bc..ec93c5bab97 100644
> > --- a/gcc/config/i386/i386.cc
> > +++ b/gcc/config/i386/i386.cc
> > @@ -22936,6 +22936,19 @@ x86_extended_rex2reg_mentioned_p (rtx insn)
> >    return false;
> >  }
> >
> > +/* Return true when rtx operands mentions register that must be encoded using
> > +   evex prefix.  */
> > +bool
> > +x86_evex_reg_mentioned_p (rtx operands[], int nops)
> > +{
> > +  int i;
> > +  for (i = 0; i < nops; i++)
> > +    if (EXT_REX_SSE_REG_P (operands[i])
> > +       || x86_extended_rex2reg_mentioned_p (operands[i]))
> > +      return true;
> > +  return false;
> > +}
> > +
> >  /* If profitable, negate (without causing overflow) integer constant
> >     of mode MODE at location LOC.  Return true in this case.  */
> >  bool
> > diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> > index 83ad01b43c1..4c305e72389 100644
> > --- a/gcc/config/i386/i386.md
> > +++ b/gcc/config/i386/i386.md
> > @@ -21603,7 +21603,7 @@ (define_expand "significand<mode>2"
> >  (define_insn "sse4_1_round<mode>2"
> >    [(set (match_operand:MODEFH 0 "register_operand" "=x,x,x,v,v")
> >         (unspec:MODEFH
> > -         [(match_operand:MODEFH 1 "nonimmediate_operand" "0,x,m,v,m")
> > +         [(match_operand:MODEFH 1 "nonimmediate_operand" "0,x,Bt,v,m")
> >            (match_operand:SI 2 "const_0_to_15_operand")]
> >           UNSPEC_ROUND))]
> >    "TARGET_SSE4_1"
> > @@ -21616,6 +21616,7 @@ (define_insn "sse4_1_round<mode>2"
> >    [(set_attr "type" "ssecvt")
> >     (set_attr "prefix_extra" "1,1,1,*,*")
> >     (set_attr "length_immediate" "1")
> > +   (set_attr "gpr32" "1,1,0,1,1")
> >     (set_attr "prefix" "maybe_vex,maybe_vex,maybe_vex,evex,evex")
> >     (set_attr "isa" "noavx512f,noavx512f,noavx512f,avx512f,avx512f")
> >     (set_attr "avx_partial_xmm_update" "false,false,true,false,true")
> > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> > index 05963de9219..456713b991a 100644
> > --- a/gcc/config/i386/sse.md
> > +++ b/gcc/config/i386/sse.md
> > @@ -22617,11 +22617,12 @@ (define_insn "avx2_pblendd<mode>"
> >
> >  (define_insn "sse4_1_phminposuw"
> >    [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,x")
> > -       (unspec:V8HI [(match_operand:V8HI 1 "vector_operand" "YrBm,*xBm,xm")]
> > +       (unspec:V8HI [(match_operand:V8HI 1 "vector_operand" "YrBT,*xBT,xBt")]
> >                      UNSPEC_PHMINPOSUW))]
> >    "TARGET_SSE4_1"
> >    "%vphminposuw\t{%1, %0|%0, %1}"
> >    [(set_attr "isa" "noavx,noavx,avx")
> > +   (set_attr "gpr32" "0")
> >     (set_attr "type" "sselog1")
> >     (set_attr "prefix_extra" "1")
> >     (set_attr "prefix" "orig,orig,vex")
> > @@ -23810,12 +23811,13 @@ (define_insn "avx_vtest<ssemodesuffix><avxsizesuffix>"
> >  (define_insn "*<sse4_1>_ptest<mode>"
> >    [(set (reg FLAGS_REG)
> >         (unspec [(match_operand:V_AVX 0 "register_operand" "Yr, *x, x")
> > -                (match_operand:V_AVX 1 "vector_operand" "YrBm, *xBm, xm")]
> > +                (match_operand:V_AVX 1 "vector_operand" "YrBT, *xBT, xBt")]
> >                 UNSPEC_PTEST))]
> >    "TARGET_SSE4_1 && ix86_match_ptest_ccmode (insn)"
> >    "%vptest\t{%1, %0|%0, %1}"
> >    [(set_attr "isa" "noavx,noavx,avx")
> >     (set_attr "type" "ssecomi")
> > +   (set_attr "gpr32" "0")
> >     (set_attr "prefix_extra" "1")
> >     (set_attr "prefix" "orig,orig,vex")
> >     (set (attr "btver2_decode")
> > @@ -23852,12 +23854,13 @@ (define_expand "<sse4_1>_ptest<mode>"
> >  (define_insn "ptesttf2"
> >    [(set (reg:CC FLAGS_REG)
> >         (unspec:CC [(match_operand:TF 0 "register_operand" "Yr, *x, x")
> > -                   (match_operand:TF 1 "vector_operand" "YrBm, *xBm, xm")]
> > +                   (match_operand:TF 1 "vector_operand" "YrBT, *xBT, xBt")]
> >                    UNSPEC_PTEST))]
> >    "TARGET_SSE4_1"
> >    "%vptest\t{%1, %0|%0, %1}"
> >    [(set_attr "isa" "noavx,noavx,avx")
> >     (set_attr "type" "ssecomi")
> > +   (set_attr "gpr32" "0")
> >     (set_attr "prefix_extra" "1")
> >     (set_attr "prefix" "orig,orig,vex")
> >     (set_attr "mode" "TI")])
> > @@ -23968,13 +23971,14 @@ (define_expand "lrint<mode><sseintvecmodelower>2"
> >  (define_insn "<sse4_1>_round<ssemodesuffix><avxsizesuffix>"
> >    [(set (match_operand:VF_128_256 0 "register_operand" "=Yr,*x,x")
> >         (unspec:VF_128_256
> > -         [(match_operand:VF_128_256 1 "vector_operand" "YrBm,*xBm,xm")
> > +         [(match_operand:VF_128_256 1 "vector_operand" "YrBT,*xBT,xBt")
> >            (match_operand:SI 2 "const_0_to_15_operand")]
> >           UNSPEC_ROUND))]
> >    "TARGET_SSE4_1"
> >    "%vround<ssemodesuffix>\t{%2, %1, %0|%0, %1, %2}"
> >    [(set_attr "isa" "noavx,noavx,avx")
> >     (set_attr "type" "ssecvt")
> > +   (set_attr "gpr32" "0")
> >     (set_attr "prefix_data16" "1,1,*")
> >     (set_attr "prefix_extra" "1")
> >     (set_attr "length_immediate" "1")
> > @@ -24061,19 +24065,32 @@ (define_insn "sse4_1_round<ssescalarmodesuffix>"
> >    [(set (match_operand:VF_128 0 "register_operand" "=Yr,*x,x,v")
> >         (vec_merge:VF_128
> >           (unspec:VF_128
> > -           [(match_operand:VF_128 2 "nonimmediate_operand" "Yrm,*xm,xm,vm")
> > +           [(match_operand:VF_128 2 "nonimmediate_operand" "YrBt,*xBt,xBt,vm")
> >              (match_operand:SI 3 "const_0_to_15_operand")]
> >             UNSPEC_ROUND)
> >           (match_operand:VF_128 1 "register_operand" "0,0,x,v")
> >           (const_int 1)))]
> >    "TARGET_SSE4_1"
> > -  "@
> > -   round<ssescalarmodesuffix>\t{%3, %2, %0|%0, %<iptr>2, %3}
> > -   round<ssescalarmodesuffix>\t{%3, %2, %0|%0, %<iptr>2, %3}
> > -   vround<ssescalarmodesuffix>\t{%3, %2, %1, %0|%0, %1, %<iptr>2, %3}
> > -   vrndscale<ssescalarmodesuffix>\t{%3, %2, %1, %0|%0, %1, %<iptr>2, %3}"
> > -  [(set_attr "isa" "noavx,noavx,avx,avx512f")
> > +{
> > +  switch (which_alternative)
> > +    {
> > +      case 0:
> > +      case 1:
> > +       return "round<ssescalarmodesuffix>\t{%3, %2, %0|%0, %<iptr>2, %3}";
> > +      case 2:
> > +       return "vround<ssescalarmodesuffix>\t{%3, %2, %1, %0|%0, %1, %<iptr>2, %3}";
> > +      case 3:
> > +       if (x86_evex_reg_mentioned_p (operands, 3))
> > +         return "vrndscale<ssescalarmodesuffix>\t{%3, %2, %1, %0|%0, %1, %<iptr>2, %3}";
> > +       else
> > +         return "vround<ssescalarmodesuffix>\t{%3, %2, %1, %0|%0, %1, %<iptr>2, %3}";
> > +      default:
> > +       gcc_unreachable ();
> > +    }
> > +}
> > +  [(set_attr "isa" "noavx,noavx,noavx512f,avx512f")
> >     (set_attr "type" "ssecvt")
> > +   (set_attr "gpr32" "0,0,0,1")
> >     (set_attr "length_immediate" "1")
> >     (set_attr "prefix_data16" "1,1,*,*")
> >     (set_attr "prefix_extra" "1")
> > @@ -24085,19 +24102,32 @@ (define_insn "*sse4_1_round<ssescalarmodesuffix>"
> >         (vec_merge:VFH_128
> >           (vec_duplicate:VFH_128
> >             (unspec:<ssescalarmode>
> > -             [(match_operand:<ssescalarmode> 2 "nonimmediate_operand" "Yrm,*xm,xm,vm")
> > +             [(match_operand:<ssescalarmode> 2 "nonimmediate_operand" "YrBt,*xBt,xBt,vm")
> >                (match_operand:SI 3 "const_0_to_15_operand")]
> >               UNSPEC_ROUND))
> >           (match_operand:VFH_128 1 "register_operand" "0,0,x,v")
> >           (const_int 1)))]
> >    "TARGET_SSE4_1"
> > -  "@
> > -   round<ssescalarmodesuffix>\t{%3, %2, %0|%0, %2, %3}
> > -   round<ssescalarmodesuffix>\t{%3, %2, %0|%0, %2, %3}
> > -   vround<ssescalarmodesuffix>\t{%3, %2, %1, %0|%0, %1, %2, %3}
> > -   vrndscale<ssescalarmodesuffix>\t{%3, %2, %1, %0|%0, %1, %2, %3}"
> > -  [(set_attr "isa" "noavx,noavx,avx,avx512f")
> > +{
> > +  switch (which_alternative)
> > +    {
> > +      case 0:
> > +      case 1:
> > +       return "round<ssescalarmodesuffix>\t{%3, %2, %0|%0, %2, %3}";
> > +      case 2:
> > +       return "vround<ssescalarmodesuffix>\t{%3, %2, %1, %0|%0, %1, %2, %3}";
> > +      case 3:
> > +       if (x86_evex_reg_mentioned_p (operands, 3) || <MODE>mode == V8HFmode)
> > +         return "vrndscale<ssescalarmodesuffix>\t{%3, %2, %1, %0|%0, %1, %2, %3}";
> > +       else
> > +         return "vround<ssescalarmodesuffix>\t{%3, %2, %1, %0|%0, %1, %2, %3}";
> > +      default:
> > +       gcc_unreachable ();
> > +    }
> > +}
> > +  [(set_attr "isa" "noavx,noavx,noavx512f,avx512f")
> >     (set_attr "type" "ssecvt")
> > +   (set_attr "gpr32" "0,0,0,1")
> >     (set_attr "length_immediate" "1")
> >     (set_attr "prefix_data16" "1,1,*,*")
> >     (set_attr "prefix_extra" "1")
> > @@ -24318,7 +24348,7 @@ (define_insn "sse4_2_pcmpestri"
> >         (unspec:SI
> >           [(match_operand:V16QI 1 "register_operand" "x,x")
> >            (match_operand:SI 2 "register_operand" "a,a")
> > -          (match_operand:V16QI 3 "nonimmediate_operand" "x,m")
> > +          (match_operand:V16QI 3 "nonimmediate_operand" "x,Bt")
> >            (match_operand:SI 4 "register_operand" "d,d")
> >            (match_operand:SI 5 "const_0_to_255_operand")]
> >           UNSPEC_PCMPESTR))
> > @@ -24333,6 +24363,7 @@ (define_insn "sse4_2_pcmpestri"
> >    "TARGET_SSE4_2"
> >    "%vpcmpestri\t{%5, %3, %1|%1, %3, %5}"
> >    [(set_attr "type" "sselog")
> > +   (set_attr "gpr32" "0")
> >     (set_attr "prefix_extra" "1")
> >     (set_attr "prefix" "maybe_vex")
> >     (set_attr "length_immediate" "1")
> > @@ -24345,7 +24376,7 @@ (define_insn "sse4_2_pcmpestrm"
> >         (unspec:V16QI
> >           [(match_operand:V16QI 1 "register_operand" "x,x")
> >            (match_operand:SI 2 "register_operand" "a,a")
> > -          (match_operand:V16QI 3 "nonimmediate_operand" "x,m")
> > +          (match_operand:V16QI 3 "nonimmediate_operand" "x,Bt")
> >            (match_operand:SI 4 "register_operand" "d,d")
> >            (match_operand:SI 5 "const_0_to_255_operand")]
> >           UNSPEC_PCMPESTR))
> > @@ -24360,6 +24391,7 @@ (define_insn "sse4_2_pcmpestrm"
> >    "TARGET_SSE4_2"
> >    "%vpcmpestrm\t{%5, %3, %1|%1, %3, %5}"
> >    [(set_attr "type" "sselog")
> > +   (set_attr "gpr32" "0")
> >     (set_attr "prefix_extra" "1")
> >     (set_attr "length_immediate" "1")
> >     (set_attr "prefix" "maybe_vex")
> > @@ -24372,7 +24404,7 @@ (define_insn "sse4_2_pcmpestr_cconly"
> >         (unspec:CC
> >           [(match_operand:V16QI 2 "register_operand" "x,x,x,x")
> >            (match_operand:SI 3 "register_operand" "a,a,a,a")
> > -          (match_operand:V16QI 4 "nonimmediate_operand" "x,m,x,m")
> > +          (match_operand:V16QI 4 "nonimmediate_operand" "x,Bt,x,Bt")
> >            (match_operand:SI 5 "register_operand" "d,d,d,d")
> >            (match_operand:SI 6 "const_0_to_255_operand")]
> >           UNSPEC_PCMPESTR))
> > @@ -24385,6 +24417,7 @@ (define_insn "sse4_2_pcmpestr_cconly"
> >     %vpcmpestri\t{%6, %4, %2|%2, %4, %6}
> >     %vpcmpestri\t{%6, %4, %2|%2, %4, %6}"
> >    [(set_attr "type" "sselog")
> > +   (set_attr "gpr32" "0")
> >     (set_attr "prefix_extra" "1")
> >     (set_attr "length_immediate" "1")
> >     (set_attr "memory" "none,load,none,load")
> > @@ -24396,7 +24429,7 @@ (define_insn_and_split "sse4_2_pcmpistr"
> >    [(set (match_operand:SI 0 "register_operand" "=c,c")
> >         (unspec:SI
> >           [(match_operand:V16QI 2 "register_operand" "x,x")
> > -          (match_operand:V16QI 3 "nonimmediate_operand" "x,m")
> > +          (match_operand:V16QI 3 "nonimmediate_operand" "x,Bt")
> >            (match_operand:SI 4 "const_0_to_255_operand")]
> >           UNSPEC_PCMPISTR))
> >     (set (match_operand:V16QI 1 "register_operand" "=Yz,Yz")
> > @@ -24439,6 +24472,7 @@ (define_insn_and_split "sse4_2_pcmpistr"
> >    DONE;
> >  }
> >    [(set_attr "type" "sselog")
> > +   (set_attr "gpr32" "0")
> >     (set_attr "prefix_extra" "1")
> >     (set_attr "length_immediate" "1")
> >     (set_attr "memory" "none,load")
> > @@ -24448,7 +24482,7 @@ (define_insn "sse4_2_pcmpistri"
> >    [(set (match_operand:SI 0 "register_operand" "=c,c")
> >         (unspec:SI
> >           [(match_operand:V16QI 1 "register_operand" "x,x")
> > -          (match_operand:V16QI 2 "nonimmediate_operand" "x,m")
> > +          (match_operand:V16QI 2 "nonimmediate_operand" "x,Bt")
> >            (match_operand:SI 3 "const_0_to_255_operand")]
> >           UNSPEC_PCMPISTR))
> >     (set (reg:CC FLAGS_REG)
> > @@ -24460,6 +24494,7 @@ (define_insn "sse4_2_pcmpistri"
> >    "TARGET_SSE4_2"
> >    "%vpcmpistri\t{%3, %2, %1|%1, %2, %3}"
> >    [(set_attr "type" "sselog")
> > +   (set_attr "gpr32" "0")
> >     (set_attr "prefix_extra" "1")
> >     (set_attr "length_immediate" "1")
> >     (set_attr "prefix" "maybe_vex")
> > @@ -24471,7 +24506,7 @@ (define_insn "sse4_2_pcmpistrm"
> >    [(set (match_operand:V16QI 0 "register_operand" "=Yz,Yz")
> >         (unspec:V16QI
> >           [(match_operand:V16QI 1 "register_operand" "x,x")
> > -          (match_operand:V16QI 2 "nonimmediate_operand" "x,m")
> > +          (match_operand:V16QI 2 "nonimmediate_operand" "x,Bt")
> >            (match_operand:SI 3 "const_0_to_255_operand")]
> >           UNSPEC_PCMPISTR))
> >     (set (reg:CC FLAGS_REG)
> > @@ -24483,6 +24518,7 @@ (define_insn "sse4_2_pcmpistrm"
> >    "TARGET_SSE4_2"
> >    "%vpcmpistrm\t{%3, %2, %1|%1, %2, %3}"
> >    [(set_attr "type" "sselog")
> > +   (set_attr "gpr32" "0")
> >     (set_attr "prefix_extra" "1")
> >     (set_attr "length_immediate" "1")
> >     (set_attr "prefix" "maybe_vex")
> > @@ -24494,7 +24530,7 @@ (define_insn "sse4_2_pcmpistr_cconly"
> >    [(set (reg:CC FLAGS_REG)
> >         (unspec:CC
> >           [(match_operand:V16QI 2 "register_operand" "x,x,x,x")
> > -          (match_operand:V16QI 3 "nonimmediate_operand" "x,m,x,m")
> > +          (match_operand:V16QI 3 "nonimmediate_operand" "x,Bt,x,Bt")
> >            (match_operand:SI 4 "const_0_to_255_operand")]
> >           UNSPEC_PCMPISTR))
> >     (clobber (match_scratch:V16QI 0 "=Yz,Yz,X,X"))
> > @@ -24506,6 +24542,7 @@ (define_insn "sse4_2_pcmpistr_cconly"
> >     %vpcmpistri\t{%4, %3, %2|%2, %3, %4}
> >     %vpcmpistri\t{%4, %3, %2|%2, %3, %4}"
> >    [(set_attr "type" "sselog")
> > +   (set_attr "gpr32" "0")
> >     (set_attr "prefix_extra" "1")
> >     (set_attr "length_immediate" "1")
> >     (set_attr "memory" "none,load,none,load")
> > @@ -25990,23 +26027,25 @@ (define_insn "aesdeclast"
> >
> >  (define_insn "aesimc"
> >    [(set (match_operand:V2DI 0 "register_operand" "=x")
> > -       (unspec:V2DI [(match_operand:V2DI 1 "vector_operand" "xBm")]
> > +       (unspec:V2DI [(match_operand:V2DI 1 "vector_operand" "xBT")]
> >                       UNSPEC_AESIMC))]
> >    "TARGET_AES"
> >    "%vaesimc\t{%1, %0|%0, %1}"
> >    [(set_attr "type" "sselog1")
> > +   (set_attr "gpr32" "0")
> >     (set_attr "prefix_extra" "1")
> >     (set_attr "prefix" "maybe_vex")
> >     (set_attr "mode" "TI")])
> >
> >  (define_insn "aeskeygenassist"
> >    [(set (match_operand:V2DI 0 "register_operand" "=x")
> > -       (unspec:V2DI [(match_operand:V2DI 1 "vector_operand" "xBm")
> > +       (unspec:V2DI [(match_operand:V2DI 1 "vector_operand" "xBT")
> >                       (match_operand:SI 2 "const_0_to_255_operand")]
> >                      UNSPEC_AESKEYGENASSIST))]
> >    "TARGET_AES"
> >    "%vaeskeygenassist\t{%2, %1, %0|%0, %1, %2}"
> >    [(set_attr "type" "sselog1")
> > +   (set_attr "gpr32" "0")
> >     (set_attr "prefix_extra" "1")
> >     (set_attr "length_immediate" "1")
> >     (set_attr "prefix" "maybe_vex")
> > diff --git a/gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2.c b/gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2.c
> > index 510213a6ca7..771bcb078e1 100644
> > --- a/gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2.c
> > +++ b/gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2.c
> > @@ -45,13 +45,22 @@ typedef union
> >    DTYPE a[16];
> >  } tmp_u;
> >
> > -__attribute__((target("sse4.2")))
> > +__attribute__((target("sse4.2,aes")))
> >  void sse_test ()
> >  {
> >    register tmp_u *tdst __asm__("%r16");
> >    register tmp_u *src1 __asm__("%r17");
> >    register tmp_u *src2 __asm__("%r18");
> > -
> > +
> > +  src1->xi[0] = _mm_minpos_epu16 (src1->xi[1]);
> > +  src1->a[2] = _mm_testc_si128 (src1->xi[3], src2->xi[4]);
> > +  src1->xf[3] = _mm_round_ss (src1->xf[5], src2->xf[6],
> > +                             _MM_FROUND_CUR_DIRECTION);
> > +  src1->xf[4] = _mm_round_ps (src1->xf[7], _MM_FROUND_CUR_DIRECTION);
> > +  src1->xd[0] = _mm_round_sd (src1->xd[2], src2->xd[3],
> > +                             _MM_FROUND_CUR_DIRECTION);
> > +  src1->xd[1] = _mm_round_pd (src1->xd[4], _MM_FROUND_CUR_DIRECTION);
> > +
> >    src1->xi[0] = _mm_hadd_epi16 (tdst->xi[2], src2->xi[3]);
> >    src1->xi[1] = _mm_hadd_epi32 (tdst->xi[0], src2->xi[1]);
> >    tdst->xi[2] = _mm_hadds_epi16 (src1->xi[4], src2->xi[5]);
> > @@ -77,16 +86,33 @@ void sse_test ()
> >    tdst->xi[1] = _mm_sign_epi8 (src1->xi[5], src2->xi[6]);
> >    tdst->xi[2] = _mm_sign_epi16 (src1->xi[7], src2->xi[0]);
> >    tdst->xi[3] = _mm_sign_epi32 (src1->xi[1], src2->xi[2]);
> > +
> > +  tdst->a[2] = _mm_cmpestri (src1->xi[3], 16, src2->xi[4], 16, 0x0c);
> > +  tdst->xi[4] = _mm_cmpestrm (src1->xi[3], 16, src2->xi[4], 16, 0x20);
> > +  tdst->a[5] = _mm_cmpistri (src1->xi[5], src2->xi[6], 0x30);
> > +  tdst->xi[6] = _mm_cmpistrm (src1->xi[5], src2->xi[6], 0x40);
> > +
> > +  tdst->xi[7] = _mm_aesimc_si128 (src1->xi[7]);
> > +  tdst->xi[0] = _mm_aeskeygenassist_si128 (src1->xi[1], 0x1b);
> >  }
> >
> > -__attribute__((target("avx2")))
> > +__attribute__((target("avx2,aes")))
> >  void vex_test ()
> >  {
> >
> >    register tmp_u *tdst __asm__("%r16");
> >    register tmp_u *src1 __asm__("%r17");
> >    register tmp_u *src2 __asm__("%r18");
> > -
> > +
> > +  src1->xi[0] = _mm_minpos_epu16 (src1->xi[1]);
> > +  src1->a[2] = _mm256_testc_si256 (src1->yi[2], src2->yi[3]);
> > +  src1->xf[3] = _mm_round_ss (src1->xf[5], src2->xf[6],
> > +                             _MM_FROUND_CUR_DIRECTION);
> > +  src1->yf[4] = _mm256_round_ps (src1->yf[2], _MM_FROUND_CUR_DIRECTION);
> > +  src1->xd[0] = _mm_round_sd (src1->xd[2], src2->xd[3],
> > +                             _MM_FROUND_CUR_DIRECTION);
> > +  src1->yd[1] = _mm256_round_pd (src1->yd[3], _MM_FROUND_CUR_DIRECTION);
> > +
> >    src1->yi[1] = _mm256_hadd_epi16 (tdst->yi[2], src2->yi[3]);
> >    src1->yi[2] = _mm256_hadd_epi32 (tdst->yi[0], src2->yi[1]);
> >    tdst->yi[3] = _mm256_hadds_epi16 (src1->yi[1], src2->yi[2]);
> > @@ -98,7 +124,6 @@ void vex_test ()
> >    src1->yi[1] = _mm256_cmpgt_epi64 (tdst->yi[3], src2->yi[0]);
> >
> >    tdst->yf[2] = _mm256_dp_ps (src1->yf[0], src2->yf[1], 0xbf);
> > -  tdst->xd[3] = _mm_dp_pd (src1->xd[0], src2->xd[1], 0xbf);
> >
> >    tdst->yi[3] = _mm256_mpsadbw_epu8 (src1->yi[1], src2->yi[1], 0xc1);
> >
> > @@ -112,6 +137,14 @@ void vex_test ()
> >    tdst->yi[2] = _mm256_sign_epi8 (src1->yi[0], src2->yi[1]);
> >    tdst->yi[3] = _mm256_sign_epi16 (src1->yi[2], src2->yi[3]);
> >    tdst->yi[0] = _mm256_sign_epi32 (src1->yi[0], src2->yi[1]);
> > +
> > +  tdst->a[2] = _mm_cmpestri (src1->xi[3], 16, src2->xi[4], 16, 0x0c);
> > +  tdst->xi[4] = _mm_cmpestrm (src1->xi[3], 16, src2->xi[4], 16, 0x20);
> > +  tdst->a[5] = _mm_cmpistri (src1->xi[5], src2->xi[6], 0x30);
> > +  tdst->xi[6] = _mm_cmpistrm (src1->xi[5], src2->xi[6], 0x40);
> > +
> > +  tdst->xi[7] = _mm_aesimc_si128 (src1->xi[7]);
> > +  tdst->xi[0] = _mm_aeskeygenassist_si128 (src1->xi[1], 0x1b);
> >  }
> >
> >  /* { dg-final { scan-assembler-not "v?pcmpeqq\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
> > @@ -134,3 +167,15 @@ void vex_test ()
> >  /* { dg-final { scan-assembler-not "v?psignb\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
> >  /* { dg-final { scan-assembler-not "v?psignw\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
> >  /* { dg-final { scan-assembler-not "v?psignd\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
> > +/* { dg-final { scan-assembler-not "v?phminposuw\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
> > +/* { dg-final { scan-assembler-not "v?ptest\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
> > +/* { dg-final { scan-assembler-not "v?roundss\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
> > +/* { dg-final { scan-assembler-not "v?roundsd\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
> > +/* { dg-final { scan-assembler-not "v?roundps\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
> > +/* { dg-final { scan-assembler-not "v?roundpd\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
> > +/* { dg-final { scan-assembler-not "v?pcmpestri\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
> > +/* { dg-final { scan-assembler-not "v?pcmpistri\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
> > +/* { dg-final { scan-assembler-not "v?pcmpestrm\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
> > +/* { dg-final { scan-assembler-not "v?pcmpistrm\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
> > +/* { dg-final { scan-assembler-not "v?aesimc\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
> > +/* { dg-final { scan-assembler-not "v?aeskeygenassist\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */
> > --
> > 2.31.1
> >
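
The mnemonic selection added to the two round<ssescalarmodesuffix> patterns
in the quoted patch can be modelled outside GCC roughly as follows.  This is
a minimal sketch, not the patch's code: GCC operands are rtx objects, while
the `struct operand` type, the `enum operand_kind` values, and the function
names `evex_reg_mentioned` and `round_scalar_mnemonic` are hypothetical
stand-ins invented for illustration.  It assumes that a register numbered
16..31 (an xmm16-xmm31 register, or an r16-r31 GPR used directly or inside a
memory address) is what forces the EVEX form, and it ignores the extra
V8HFmode check in the second pattern.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical operand model; the real patch works on GCC rtx operands.  */
enum operand_kind { OP_GPR, OP_SSE, OP_MEM, OP_IMM };

struct operand {
  enum operand_kind kind;
  int regno;  /* OP_GPR / OP_SSE: register number 0..31, otherwise unused.  */
  int base;   /* OP_MEM: base GPR number, or -1 if there is none.  */
  int index;  /* OP_MEM: index GPR number, or -1 if there is none.  */
};

/* Registers 16..31 are the ones this model treats as extended.  */
static bool
is_extended (int regno)
{
  return regno >= 16 && regno <= 31;
}

/* Return true if any operand mentions an extended register, either
   directly or as the base/index of a memory address.  */
bool
evex_reg_mentioned (const struct operand *ops, size_t nops)
{
  assert (ops != NULL || nops == 0);
  for (size_t i = 0; i < nops; i++)
    switch (ops[i].kind)
      {
      case OP_GPR:
      case OP_SSE:
        if (is_extended (ops[i].regno))
          return true;
        break;
      case OP_MEM:
        if (is_extended (ops[i].base) || is_extended (ops[i].index))
          return true;
        break;
      case OP_IMM:
        break;
      }
  return false;
}

/* Mirror of "case 3" in the quoted patterns: use the EVEX-only mnemonic
   only when an extended register is actually involved.  */
const char *
round_scalar_mnemonic (const struct operand *ops, size_t nops)
{
  return evex_reg_mentioned (ops, nops) ? "vrndscale" : "vround";
}
```

In this model, a memory operand addressed through r17 makes
`round_scalar_mnemonic` pick "vrndscale", while the same operands addressed
through a register below 16 keep "vround".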
