public inbox for gcc-patches@gcc.gnu.org
* Fix gcc.target/i386/pr61403.c
@ 2017-10-25 19:36 Jan Hubicka
  2017-10-25 20:10 ` Evgeny Stupachenko
  0 siblings, 1 reply; 2+ messages in thread
From: Jan Hubicka @ 2017-10-25 19:36 UTC
  To: gcc-patches, rguenther, evstupac

Hi,
my core tuning patch caused a regression in gcc.target/i386/pr61403.c that I
missed in my testing.  The testcase looks for a blend instruction, which is no
longer output.  The reason is that the loop is now vectorized with SLP, whereas
before my changes the cost model claimed SLP vectorization was not profitable
and the vectorizer disabled it.

I have tested that on Skylake the new code is about 11% faster.  The PR itself
is only about vectorizing the loop.  I am not quite sure what the intention
of the testcase was, but perhaps we can just check for a vectorized sqrt,
which is output in either case?

Honza

Index: ../../gcc/testsuite/gcc.target/i386/pr61403.c
===================================================================
--- ../../gcc/testsuite/gcc.target/i386/pr61403.c       (revision 253935)
+++ ../../gcc/testsuite/gcc.target/i386/pr61403.c       (working copy)
@@ -23,4 +23,4 @@ norm (struct XYZ *in, struct XYZ *out, i
     }
 }

-/* { dg-final { scan-assembler "blend" } } */
+/* { dg-final { scan-assembler "rsqrtps" } } */


* Re: Fix gcc.target/i386/pr61403.c
  2017-10-25 19:36 Fix gcc.target/i386/pr61403.c Jan Hubicka
@ 2017-10-25 20:10 ` Evgeny Stupachenko
  0 siblings, 0 replies; 2+ messages in thread
From: Evgeny Stupachenko @ 2017-10-25 20:10 UTC
  To: Jan Hubicka; +Cc: GCC Patches, Richard Biener

Hi Honza,

That should be fine unless vectorization is done using extract/insert
instructions.

Thanks,
Evgeny
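
As a sketch only, the caveat above could be encoded in the test with DejaGnu's
scan-assembler-not directive, which fails when a pattern appears in the
assembly.  The two mnemonics below are assumptions about what "extract/insert
instructions" refers to; they are not taken from this thread, and a properly
vectorized loop might still use them legitimately, so treat this as an
illustration rather than a proposed change.

```c
/* Check from the patch: a packed reciprocal square root is emitted.  */
/* { dg-final { scan-assembler "rsqrtps" } } */

/* Hypothetical additions (assumed mnemonics): reject code that assembles
   the vector element by element.  */
/* { dg-final { scan-assembler-not "insertps" } } */
/* { dg-final { scan-assembler-not "extractps" } } */
```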

On Wed, Oct 25, 2017 at 12:25 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
> Hi,
> my core tuning patch caused a regression in gcc.target/i386/pr61403.c that I
> missed in my testing.  The testcase looks for a blend instruction, which is no
> longer output.  The reason is that the loop is now vectorized with SLP, whereas
> before my changes the cost model claimed SLP vectorization was not profitable
> and the vectorizer disabled it.
>
> I have tested that on Skylake the new code is about 11% faster.  The PR itself
> is only about vectorizing the loop.  I am not quite sure what the intention
> of the testcase was, but perhaps we can just check for a vectorized sqrt,
> which is output in either case?
>
> Honza
>
> Index: ../../gcc/testsuite/gcc.target/i386/pr61403.c
> ===================================================================
> --- ../../gcc/testsuite/gcc.target/i386/pr61403.c       (revision 253935)
> +++ ../../gcc/testsuite/gcc.target/i386/pr61403.c       (working copy)
> @@ -23,4 +23,4 @@ norm (struct XYZ *in, struct XYZ *out, i
>      }
>  }
>
> -/* { dg-final { scan-assembler "blend" } } */
> +/* { dg-final { scan-assembler "rsqrtps" } } */
>
