public inbox for gcc-patches@gcc.gnu.org
* [r12-3893 Regression] FAIL: gcc.target/i386/vect-pr97352.c scan-assembler-times vmov.pd 4 on Linux/x86_64
@ 2021-09-27 18:28 sunil.k.pandey
  2021-09-28  6:59 ` Richard Biener
  0 siblings, 1 reply; 5+ messages in thread
From: sunil.k.pandey @ 2021-09-27 18:28 UTC
  To: gcc-patches, gcc-regression, rguenther

On Linux/x86_64,

6390c5047adb75960f86d56582e6322aaa4d9281 is the first bad commit
commit 6390c5047adb75960f86d56582e6322aaa4d9281
Author: Richard Biener <rguenther@suse.de>
Date:   Wed Nov 18 09:36:57 2020 +0100

    Allow different vector types for stmt groups

caused

FAIL: gcc.dg/vect/bb-slp-17.c -flto -ffat-lto-objects  scan-tree-dump-times slp2 "optimized: basic block" 1
FAIL: gcc.dg/vect/bb-slp-17.c scan-tree-dump-times slp2 "optimized: basic block" 1
FAIL: gcc.dg/vect/bb-slp-pr65935.c -flto -ffat-lto-objects  scan-tree-dump-times slp1 "optimized: basic block" 10
FAIL: gcc.dg/vect/bb-slp-pr65935.c scan-tree-dump-times slp1 "optimized: basic block" 10
FAIL: gcc.target/i386/vect-pr97352.c scan-assembler-times vmov.pd 4

with GCC configured with

../../gcc/configure --prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-3893/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl --enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/bb-slp-17.c --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/bb-slp-17.c --target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/bb-slp-pr65935.c --target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/bb-slp-pr65935.c --target_board='unix{-m64\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="i386.exp=gcc.target/i386/vect-pr97352.c --target_board='unix{-m32\ -march=cascadelake}'"

(Please do not reply to this email. For questions about this report, contact me at skpgkp2 at gmail dot com)

* Re: [r12-3893 Regression] FAIL: gcc.target/i386/vect-pr97352.c scan-assembler-times vmov.pd 4 on Linux/x86_64
  2021-09-27 18:28 [r12-3893 Regression] FAIL: gcc.target/i386/vect-pr97352.c scan-assembler-times vmov.pd 4 on Linux/x86_64 sunil.k.pandey
@ 2021-09-28  6:59 ` Richard Biener
  2021-09-28  7:43   ` Hongtao Liu
  0 siblings, 1 reply; 5+ messages in thread
From: Richard Biener @ 2021-09-28  6:59 UTC
  To: sunil.k.pandey; +Cc: gcc-patches, richard.sandiford, ubizjak

On Mon, 27 Sep 2021, sunil.k.pandey wrote:

> On Linux/x86_64,
> 
> 6390c5047adb75960f86d56582e6322aaa4d9281 is the first bad commit
> commit 6390c5047adb75960f86d56582e6322aaa4d9281
> Author: Richard Biener <rguenther@suse.de>
> Date:   Wed Nov 18 09:36:57 2020 +0100
> 
>     Allow different vector types for stmt groups
> 
> caused
> 
> FAIL: gcc.dg/vect/bb-slp-17.c -flto -ffat-lto-objects  scan-tree-dump-times slp2 "optimized: basic block" 1
> FAIL: gcc.dg/vect/bb-slp-17.c scan-tree-dump-times slp2 "optimized: basic block" 1

This shows that it may be a bad idea to support V2SImode vectorization
with -m32 when we refuse to implement even plus.

OTOH it's just the mode that's available: autovectorize_vector_modes
doesn't include the corresponding mode, but we still pick it up via
the related vector mode for group-size == 2.

> FAIL: gcc.dg/vect/bb-slp-pr65935.c -flto -ffat-lto-objects  scan-tree-dump-times slp1 "optimized: basic block" 10
> FAIL: gcc.dg/vect/bb-slp-pr65935.c scan-tree-dump-times slp1 "optimized: basic block" 10

We are now vectorizing the SSE tail when vectorizing with AVX.  I'll 
adjust the testcase to prefer SSE.

> FAIL: gcc.target/i386/vect-pr97352.c scan-assembler-times vmov.pd 4

With -march=cascadelake we get

        vpermpd $68, c, %ymm0
        vpermpd $238, c, %ymm0

instead of

        vmovapd c, %ymm1
        vinsertf128     $1, %xmm1, %ymm1, %ymm0
        vperm2f128      $49, %ymm1, %ymm1, %ymm0

What's a way to disallow additional -march= options from taking effect?
It's really impossible to cater for all possible ISA variants in these
kinds of testcases.

Richard.

* Re: [r12-3893 Regression] FAIL: gcc.target/i386/vect-pr97352.c scan-assembler-times vmov.pd 4 on Linux/x86_64
  2021-09-28  6:59 ` Richard Biener
@ 2021-09-28  7:43   ` Hongtao Liu
  2021-09-28  8:02     ` Richard Biener
  0 siblings, 1 reply; 5+ messages in thread
From: Hongtao Liu @ 2021-09-28  7:43 UTC
  To: Richard Biener; +Cc: sunil.k.pandey, Richard Sandiford, GCC Patches

On Tue, Sep 28, 2021 at 2:59 PM Richard Biener via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> On Mon, 27 Sep 2021, sunil.k.pandey wrote:
>
> > On Linux/x86_64,
> >
> > 6390c5047adb75960f86d56582e6322aaa4d9281 is the first bad commit
> > commit 6390c5047adb75960f86d56582e6322aaa4d9281
> > Author: Richard Biener <rguenther@suse.de>
> > Date:   Wed Nov 18 09:36:57 2020 +0100
> >
> >     Allow different vector types for stmt groups
> >
> > caused
> >
> > FAIL: gcc.dg/vect/bb-slp-17.c -flto -ffat-lto-objects  scan-tree-dump-times slp2 "optimized: basic block" 1
> > FAIL: gcc.dg/vect/bb-slp-17.c scan-tree-dump-times slp2 "optimized: basic block" 1
>
> This shows that it is maybe a bad idea to support V2SImode vectorization
> with -m32 when we refuse to implement even plus.
>
> OTOH it's just the mode that's available, autovectorize_vector_modes
> doesn't include the corresponding mode but we still pick it up via
> the related vector mode for group-size == 2.
>
> > FAIL: gcc.dg/vect/bb-slp-pr65935.c -flto -ffat-lto-objects  scan-tree-dump-times slp1 "optimized: basic block" 10
> > FAIL: gcc.dg/vect/bb-slp-pr65935.c scan-tree-dump-times slp1 "optimized: basic block" 10
>
> We are now vectorizing the SSE tail when vectorizing with AVX.  I'll
> adjust the testcase to prefer SSE.
>
> > FAIL: gcc.target/i386/vect-pr97352.c scan-assembler-times vmov.pd 4
>
> With -march=cascadelake we get
>
>         vpermpd $68, c, %ymm0
>         vpermpd $238, c, %ymm0
>
> instead of
>
>         vmovapd c, %ymm1
>         vinsertf128     $1, %xmm1, %ymm1, %ymm0
>         vperm2f128      $49, %ymm1, %ymm1, %ymm0
>
> what's a way to disallow additional -march= from taking effect?  It's
I usually add -mno-{avx,avx512f} and -mtune=generic, or sometimes
-mprefer-vector-width=*, to the testcases, or use a pattern such as
(?:vinsertf128|vpermpd) to accept alternative instructions.
> really impossible to cater for all possible ISA variants in these kind
> of testcases.
Testing with an additional -march=cascadelake option can sometimes find real regressions.
>
> Richard.
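
The option-pinning approach described above could look roughly like the
testcase below. This is a minimal sketch and not the actual
vect-pr97352.c: the function name dup_pairs, the arrays, the exact
option set, and the scan directive are all assumptions chosen for
illustration. The idea is that explicit -mno-* options in dg-options
are meant to keep ISA extensions implied by an extra -march= on the
target board from changing the instruction selection being tested.

```c
/* { dg-do compile } */
/* Hypothetical option set: require AVX, but explicitly switch off AVX2
   so that AVX2-only instructions are not available to the compiler.  */
/* { dg-options "-O3 -mavx -mno-avx2 -mtune=generic" } */

double a[4], b[4], c[4];

/* Store each pair of doubles from c twice, so that a and b can each be
   written with a single 256-bit vector store once vectorized.  */
void
dup_pairs (void)
{
  a[0] = c[0];
  a[1] = c[1];
  a[2] = c[0];
  a[3] = c[1];
  b[0] = c[2];
  b[1] = c[3];
  b[2] = c[2];
  b[3] = c[3];
}

/* vpermpd with an immediate needs AVX2 or later, so it should not show
   up when AVX2 is disabled.  */
/* { dg-final { scan-assembler-not "vpermpd" } } */
```

The other route mentioned above is to leave the options alone and make
the scan pattern accept either instruction, for example with
(?:vinsertf128|vpermpd); the expected count then has to be chosen so
that it holds for every variant.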



-- 
BR,
Hongtao

* Re: [r12-3893 Regression] FAIL: gcc.target/i386/vect-pr97352.c scan-assembler-times vmov.pd 4 on Linux/x86_64
  2021-09-28  7:43   ` Hongtao Liu
@ 2021-09-28  8:02     ` Richard Biener
  2021-09-28  8:18       ` Richard Biener
  0 siblings, 1 reply; 5+ messages in thread
From: Richard Biener @ 2021-09-28  8:02 UTC
  To: Hongtao Liu; +Cc: sunil.k.pandey, Richard Sandiford, GCC Patches

On Tue, 28 Sep 2021, Hongtao Liu wrote:

> On Tue, Sep 28, 2021 at 2:59 PM Richard Biener via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> >
> > On Mon, 27 Sep 2021, sunil.k.pandey wrote:
> >
> > > On Linux/x86_64,
> > >
> > > 6390c5047adb75960f86d56582e6322aaa4d9281 is the first bad commit
> > > commit 6390c5047adb75960f86d56582e6322aaa4d9281
> > > Author: Richard Biener <rguenther@suse.de>
> > > Date:   Wed Nov 18 09:36:57 2020 +0100
> > >
> > >     Allow different vector types for stmt groups
> > >
> > > caused
> > >
> > > FAIL: gcc.dg/vect/bb-slp-17.c -flto -ffat-lto-objects  scan-tree-dump-times slp2 "optimized: basic block" 1
> > > FAIL: gcc.dg/vect/bb-slp-17.c scan-tree-dump-times slp2 "optimized: basic block" 1
> >
> > This shows that it is maybe a bad idea to support V2SImode vectorization
> > with -m32 when we refuse to implement even plus.
> >
> > OTOH it's just the mode that's available, autovectorize_vector_modes
> > doesn't include the corresponding mode but we still pick it up via
> > the related vector mode for group-size == 2.

It looks like we could define the vectorize.related_mode hook to
reject V2SImode when !TARGET_MMX_WITH_SSE - the default implementation
just checks for vector_mode_supported_p.

> > > FAIL: gcc.dg/vect/bb-slp-pr65935.c -flto -ffat-lto-objects  scan-tree-dump-times slp1 "optimized: basic block" 10
> > > FAIL: gcc.dg/vect/bb-slp-pr65935.c scan-tree-dump-times slp1 "optimized: basic block" 10
> >
> > We are now vectorizing the SSE tail when vectorizing with AVX.  I'll
> > adjust the testcase to prefer SSE.
> >
> > > FAIL: gcc.target/i386/vect-pr97352.c scan-assembler-times vmov.pd 4
> >
> > With -march=cascadelake we get
> >
> >         vpermpd $68, c, %ymm0
> >         vpermpd $238, c, %ymm0
> >
> > instead of
> >
> >         vmovapd c, %ymm1
> >         vinsertf128     $1, %xmm1, %ymm1, %ymm0
> >         vperm2f128      $49, %ymm1, %ymm1, %ymm0
> >
> > what's a way to disallow additional -march= from taking effect?  It's
> I usually add -mno-{avx,avx512f} and -mtune=generic or sometimes
> -mprefer-vector-width=* to the testcases.

OK, I will try this route then.

Thanks,
Richard.

* Re: [r12-3893 Regression] FAIL: gcc.target/i386/vect-pr97352.c scan-assembler-times vmov.pd 4 on Linux/x86_64
  2021-09-28  8:02     ` Richard Biener
@ 2021-09-28  8:18       ` Richard Biener
  0 siblings, 0 replies; 5+ messages in thread
From: Richard Biener @ 2021-09-28  8:18 UTC
  To: Hongtao Liu; +Cc: sunil.k.pandey, Richard Sandiford, GCC Patches

On Tue, 28 Sep 2021, Richard Biener wrote:

> On Tue, 28 Sep 2021, Hongtao Liu wrote:
> 
> > On Tue, Sep 28, 2021 at 2:59 PM Richard Biener via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> > >
> > > On Mon, 27 Sep 2021, sunil.k.pandey wrote:
> > >
> > > > On Linux/x86_64,
> > > >
> > > > 6390c5047adb75960f86d56582e6322aaa4d9281 is the first bad commit
> > > > commit 6390c5047adb75960f86d56582e6322aaa4d9281
> > > > Author: Richard Biener <rguenther@suse.de>
> > > > Date:   Wed Nov 18 09:36:57 2020 +0100
> > > >
> > > >     Allow different vector types for stmt groups
> > > >
> > > > caused
> > > >
> > > > FAIL: gcc.dg/vect/bb-slp-17.c -flto -ffat-lto-objects  scan-tree-dump-times slp2 "optimized: basic block" 1
> > > > FAIL: gcc.dg/vect/bb-slp-17.c scan-tree-dump-times slp2 "optimized: basic block" 1
> > >
> > > This shows that it is maybe a bad idea to support V2SImode vectorization
> > > with -m32 when we refuse to implement even plus.
> > >
> > > OTOH it's just the mode that's available, autovectorize_vector_modes
> > > doesn't include the corresponding mode but we still pick it up via
> > > the related vector mode for group-size == 2.
> 
> It looks like we could define the vectorize.related_mode hook to
> reject V2SImode when !TARGET_MMX_WITH_SSE - the default implementation
> just checks for vector_mode_supported_p.

Meh, that doesn't work.  We then fall through to

  else if (SCALAR_INT_MODE_P (prevailing_mode)
           || !related_vector_mode (prevailing_mode,
                                    inner_mode, nunits).exists (&simd_mode))
    {
      /* Fall back to using mode_for_vector, mostly in the hope of being
         able to use an integer mode.  */
      if (known_eq (nunits, 0U)
          && !multiple_p (GET_MODE_SIZE (prevailing_mode), nbytes, &nunits))
        return NULL_TREE;

      if (!mode_for_vector (inner_mode, nunits).exists (&simd_mode))
        return NULL_TREE;

and return V2SImode anyway from mode_for_vector ...

So - should we only allow integer modes here as the comment suggests?
With that, thus

      if (!mode_for_vector (inner_mode, nunits).exists (&simd_mode)
          || GET_MODE_CLASS (simd_mode) != MODE_INT)
        return NULL_TREE;

we "properly" do _not_ use V2SImode for vectorization on x86 when
!TARGET_MMX_WITH_SSE.  Note that this will also not use V2SImode
for vectorizing copies (which are properly supported).  So I'm
not sure rejecting V2SImode outright is "proper" ...
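
To make the control flow above easier to follow, here is a toy model of
the decision in plain C. It is only a sketch under stated assumptions:
none of the toy_* names exist in GCC, the related-mode hook is modelled
as simply failing when !TARGET_MMX_WITH_SSE, and mode_for_vector is
modelled as always handing back V2SImode for two SImode elements, as
observed above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for GCC's mode classes (hypothetical, not machmode.h).  */
enum toy_mode_class { TOY_MODE_INT, TOY_MODE_VECTOR_INT };

struct toy_mode
{
  const char *name;
  enum toy_mode_class cls;
};

/* Models a related-mode hook that rejects V2SImode when the target
   lacks MMX-with-SSE support.  */
static bool
toy_related_vector_mode (bool mmx_with_sse, struct toy_mode *out)
{
  if (!mmx_with_sse)
    return false;
  out->name = "V2SI";
  out->cls = TOY_MODE_VECTOR_INT;
  return true;
}

/* Models the fallback lookup: assumed to always find V2SImode, because
   the mode itself is supported (for example for copies).  */
static bool
toy_mode_for_vector (struct toy_mode *out)
{
  out->name = "V2SI";
  out->cls = TOY_MODE_VECTOR_INT;
  return true;
}

/* Returns the name of the chosen mode, or NULL when none is usable.
   require_int_mode models the extra GET_MODE_CLASS check proposed
   above for the fallback path.  */
static const char *
toy_pick_mode (bool mmx_with_sse, bool require_int_mode)
{
  struct toy_mode m;

  if (toy_related_vector_mode (mmx_with_sse, &m))
    return m.name;

  /* Fallback path: the hook said no, but the generic lookup may still
     produce the very mode the hook rejected.  */
  if (!toy_mode_for_vector (&m))
    return NULL;
  if (require_int_mode && m.cls != TOY_MODE_INT)
    return NULL;
  return m.name;
}
```

In this model, toy_pick_mode (false, false) still yields "V2SI" through
the fallback, which mirrors the problem described above, while
toy_pick_mode (false, true) yields NULL.  The model deliberately leaves
out the copy case, where V2SImode would still be wanted.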

Richard.
