In-Reply-To: <20211206034101.1663134-1-hongtao.liu@intel.com>
From: Hongtao Liu
Date: Mon, 6 Dec 2021 11:44:34 +0800
Subject: Re: [PATCH] [i386] Prefer INT_SSE_REGS for SSE_FLOAT_MODE_P in preferred_reload_class.
To: Uros Bizjak
Cc: GCC Patches

I forgot --in-reply-to when running git send-email.

> I was thinking about:
>
> --cut here--
> @@ -19194,9 +19194,17 @@ ix86_preferred_reload_class (rtx x, reg_class_t regclass)
>          return NO_REGS;
>      }
>
> -  /* Prefer SSE regs only, if we can use them for math. */
> +  /* Prefer SSE if we can use them for math. Also allow integer regs
> +     when moves between register units are cheap. */
>    if (SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH)
> -    return SSE_CLASS_P (regclass) ? regclass : NO_REGS;
> +    {
> +      if (TARGET_INTER_UNIT_MOVES_FROM_VEC
> +          && TARGET_INTER_UNIT_MOVES_TO_VEC
> +          && GET_MODE_SIZE (mode) <= GET_MODE_SIZE (word_mode))
> +        return INT_SSE_CLASS_P (regclass) ? regclass : NO_REGS;
> +      else
> +        return SSE_CLASS_P (regclass) ? regclass : NO_REGS;
> +    }
>
>    /* Generally when we see PLUS here, it's the function invariant
>       (plus soft-fp const_int). Which can only be computed into general
> --cut here--
>
> So, INT_SSE class is allowed when interunit moves are enabled. The
> patch also takes care for 64-bit moves which are expensive on 32-bit
> targets.

I like your version; the patch is updated accordingly.

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,} w/ and w/o -march=k8.

On Mon, Dec 6, 2021 at 11:41 AM liuhongt wrote:
>
> When moves between integer and sse registers are cheap.
>
> 2021-12-06  Hongtao Liu
>             Uroš Bizjak
> gcc/ChangeLog:
>
>         PR target/95740
>         * config/i386/i386.c (ix86_preferred_reload_class): Allow
>         integer regs when moves between register units are cheap.
>         * config/i386/i386.h (INT_SSE_CLASS_P): New.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/i386/pr95740.c: New test.
> ---
>  gcc/config/i386/i386.c                  | 12 ++++++++++--
>  gcc/config/i386/i386.h                  |  2 ++
>  gcc/testsuite/gcc.target/i386/pr95740.c | 26 +++++++++++++++++++++++++
>  3 files changed, 38 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr95740.c
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 80fee627358..e3c2e294988 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -19194,9 +19194,17 @@ ix86_preferred_reload_class (rtx x, reg_class_t regclass)
>          return NO_REGS;
>      }
>
> -  /* Prefer SSE regs only, if we can use them for math. */
> +  /* Prefer SSE if we can use them for math. Also allow integer regs
> +     when moves between register units are cheap. */
>    if (SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH)
> -    return SSE_CLASS_P (regclass) ? regclass : NO_REGS;
> +    {
> +      if (TARGET_INTER_UNIT_MOVES_FROM_VEC
> +          && TARGET_INTER_UNIT_MOVES_TO_VEC
> +          && GET_MODE_SIZE (mode) <= GET_MODE_SIZE (word_mode))
> +        return INT_SSE_CLASS_P (regclass) ? regclass : NO_REGS;
> +      else
> +        return SSE_CLASS_P (regclass) ? regclass : NO_REGS;
> +    }
>
>    /* Generally when we see PLUS here, it's the function invariant
>       (plus soft-fp const_int).
>       Which can only be computed into general
> diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> index 2fda1e0686e..ec90e47904b 100644
> --- a/gcc/config/i386/i386.h
> +++ b/gcc/config/i386/i386.h
> @@ -1283,6 +1283,8 @@ enum reg_class
>    reg_class_subset_p ((CLASS), FLOAT_REGS)
>  #define SSE_CLASS_P(CLASS) \
>    reg_class_subset_p ((CLASS), ALL_SSE_REGS)
> +#define INT_SSE_CLASS_P(CLASS) \
> +  reg_class_subset_p ((CLASS), INT_SSE_REGS)
>  #define MMX_CLASS_P(CLASS) \
>    ((CLASS) == MMX_REGS)
>  #define MASK_CLASS_P(CLASS) \
> diff --git a/gcc/testsuite/gcc.target/i386/pr95740.c b/gcc/testsuite/gcc.target/i386/pr95740.c
> new file mode 100644
> index 00000000000..7ecd71ba8c1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr95740.c
> @@ -0,0 +1,26 @@
> +/* { dg-do compile { target { ! ia32 } } } */
> +/* { dg-options "-msse2 -O2 -mtune=generic -mtune-ctrl=use_incdec -masm=att -mfpmath=sse" } */
> +/* { dg-final { scan-assembler-times {(?n)movd[\t ]*%xmm0.*%eax} 1 } } */
> +/* { dg-final { scan-assembler-times {(?n)incl[\t ]*%eax} 1 } } */
> +/* { dg-final { scan-assembler-times {(?n)movq[\t ]*%xmm0.*%rax} 1 } } */
> +/* { dg-final { scan-assembler-times {(?n)incq[\t ]*%rax} 1 } } */
> +
> +int
> +foo (float a)
> +{
> +  union{
> +    int b;
> +    float a;}u;
> +  u.a = a;
> +  return u.b + 1;
> +}
> +
> +long long
> +foo1 (double a)
> +{
> +  union{
> +    long long b;
> +    double a;}u;
> +  u.a = a;
> +  return u.b + 1;
> +}
> --
> 2.18.2
>

-- 
BR,
Hongtao
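
For readers following the thread, the class selection in the quoted i386.c hunk can be sketched as a standalone toy model. This is not GCC code: the `toy_` names, the four-value enum, and the `toy_target` struct are hypothetical stand-ins for the real register classes, `reg_class_subset_p`, the `TARGET_INTER_UNIT_MOVES_*` tuning flags, and `GET_MODE_SIZE`. It also assumes the `SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH` guard already holds.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified register classes (not GCC's enum).  */
enum toy_class
{
  TOY_NO_REGS,
  TOY_GENERAL_REGS,
  TOY_SSE_REGS,
  TOY_INT_SSE_REGS   /* union of general and SSE registers */
};

/* Hypothetical target description replacing the tuning macros.  */
struct toy_target
{
  bool inter_unit_moves_from_vec;  /* cheap SSE -> integer moves */
  bool inter_unit_moves_to_vec;    /* cheap integer -> SSE moves */
  unsigned word_size;              /* bytes: 4 for -m32, 8 for -m64 */
};

/* Toy subset test standing in for reg_class_subset_p.  */
static bool
toy_subset_p (enum toy_class c, enum toy_class super)
{
  if (c == TOY_NO_REGS || c == super)
    return true;
  if (super == TOY_INT_SSE_REGS)
    return c == TOY_GENERAL_REGS || c == TOY_SSE_REGS;
  return false;
}

/* Mirrors the shape of the new branch: allow integer registers for a
   float-mode value only when inter-unit moves are cheap in both
   directions and the value fits in one machine word.  */
static enum toy_class
toy_preferred_reload_class (const struct toy_target *t,
                            unsigned mode_size,
                            enum toy_class regclass)
{
  if (t->inter_unit_moves_from_vec
      && t->inter_unit_moves_to_vec
      && mode_size <= t->word_size)
    return toy_subset_p (regclass, TOY_INT_SSE_REGS) ? regclass : TOY_NO_REGS;
  else
    return toy_subset_p (regclass, TOY_SSE_REGS) ? regclass : TOY_NO_REGS;
}
```

In this model, with an 8-byte word and both flags set, an 8-byte (double-sized) reload requested in the general class is kept. With a 4-byte word the same request falls back to the SSE-only rule, which corresponds to the remark about 64-bit moves being expensive on 32-bit targets.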