public inbox for gcc-regression@sourceware.org
help / color / mirror / Atom feed
* [r13-7135 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64
@ 2023-04-13  1:48 haochen.jiang
  2023-04-13  9:15 ` Andre Simoes Dias Vieira
  0 siblings, 1 reply; 13+ messages in thread
From: haochen.jiang @ 2023-04-13  1:48 UTC (permalink / raw)
  To: andre.simoesdiasvieira, gcc-regression, gcc-patches, haochen.jiang

On Linux/x86_64,

58c8c1b383bc3c286d6527fc6e8fb62463f9a877 is the first bad commit
commit 58c8c1b383bc3c286d6527fc6e8fb62463f9a877
Author: Andre Vieira <andre.simoesdiasvieira@arm.com>
Date:   Tue Apr 11 10:07:43 2023 +0100

    if-conv: Restore MASK_CALL conversion [PR108888]

caused

FAIL: gcc.dg/vect/vect-simd-clone-16e.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 3
FAIL: gcc.dg/vect/vect-simd-clone-16f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2
FAIL: gcc.dg/vect/vect-simd-clone-17e.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 3
FAIL: gcc.dg/vect/vect-simd-clone-17f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2
FAIL: gcc.dg/vect/vect-simd-clone-18e.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 3
FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2

with GCC configured with

../../gcc/configure --prefix=/export/users/haochenj/src/gcc-bisect/master/master/r13-7135/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl --enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-16e.c --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-16e.c --target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-16f.c --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-16f.c --target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-17e.c --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-17e.c --target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-17f.c --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-17f.c --target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-18e.c --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-18e.c --target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-18f.c --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-18f.c --target_board='unix{-m32\ -march=cascadelake}'"

(Please do not reply to this email, for question about this report, contact me at haochen dot jiang at intel.com)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [r13-7135 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64
  2023-04-13  1:48 [r13-7135 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64 haochen.jiang
@ 2023-04-13  9:15 ` Andre Simoes Dias Vieira
  2023-04-13 10:01   ` Andrew Stubbs
  0 siblings, 1 reply; 13+ messages in thread
From: Andre Simoes Dias Vieira @ 2023-04-13  9:15 UTC (permalink / raw)
  To: gcc-regression, gcc-patches, haochen.jiang, Andrew Stubbs

Hi,

@Andrew: Could you have a look at these? I had a quick look at 17f.c and it looks to me like the target selectors aren't specific enough. Unfortunately I am not familiar enough with target selectors (or targets for that matter) for x86_64. From what I could tell with -m32 gcc decides to unroll the vectorized loop so you end up with 4 simdclones rather than the 2 it tests for, GCC probably uses a different cost structure for -m32 that decides it is profitable to unroll?

As for -march=cascadelake, that seems to prevent gcc from using the inbranch simdclones altogether, so I suspect that cascadelake doesn't support these inbranch simdclones or the vector types it is trying to use.

Kind regards,
Andre

________________________________________
From: haochen.jiang <haochenj@ecsmtp.sh.intel.com>
Sent: Thursday, April 13, 2023 2:48 AM
To: Andre Simoes Dias Vieira; gcc-regression@gcc.gnu.org; gcc-patches@gcc.gnu.org; haochen.jiang@intel.com
Subject: [r13-7135 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64

On Linux/x86_64,

58c8c1b383bc3c286d6527fc6e8fb62463f9a877 is the first bad commit
commit 58c8c1b383bc3c286d6527fc6e8fb62463f9a877
Author: Andre Vieira <andre.simoesdiasvieira@arm.com>
Date:   Tue Apr 11 10:07:43 2023 +0100

    if-conv: Restore MASK_CALL conversion [PR108888]

caused

FAIL: gcc.dg/vect/vect-simd-clone-16e.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 3
FAIL: gcc.dg/vect/vect-simd-clone-16f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2
FAIL: gcc.dg/vect/vect-simd-clone-17e.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 3
FAIL: gcc.dg/vect/vect-simd-clone-17f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2
FAIL: gcc.dg/vect/vect-simd-clone-18e.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 3
FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2

with GCC configured with

../../gcc/configure --prefix=/export/users/haochenj/src/gcc-bisect/master/master/r13-7135/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl --enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-16e.c --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-16e.c --target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-16f.c --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-16f.c --target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-17e.c --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-17e.c --target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-17f.c --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-17f.c --target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-18e.c --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-18e.c --target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-18f.c --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-18f.c --target_board='unix{-m32\ -march=cascadelake}'"

(Please do not reply to this email, for question about this report, contact me at haochen dot jiang at intel.com)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [r13-7135 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64
  2023-04-13  9:15 ` Andre Simoes Dias Vieira
@ 2023-04-13 10:01   ` Andrew Stubbs
  2023-04-13 12:59     ` Andre Vieira (lists)
  0 siblings, 1 reply; 13+ messages in thread
From: Andrew Stubbs @ 2023-04-13 10:01 UTC (permalink / raw)
  To: Andre Simoes Dias Vieira, gcc-regression, gcc-patches, haochen.jiang

Hi Andre,

I don't have a cascadelake device to test on, nor any knowledge about 
what makes it different from regular x86_64.

If the cascadelake device is supposed to work the same as other x86_64 
devices for these vectors then the test has found a bug in the compiler 
and you should be looking to fix that, not fudge the testcase.

Alternatively, if the device's capabilities really are different and the 
tests should behave differently, then the actual expectations need to be 
encoded in the dejagnu directives. If you can't tell the difference by 
looking at the "x86_64*-*-*" target selector alone then the correct 
solution is to invent a new "effective-target" selector. There are lots 
of examples of using these throughout the testsuite (you could use 
dg-require-effective-target to disable the whole testcase, or just use 
the name in the scan-tree-dump-times directive to customise the 
expectations), and the definitions can be found in the 
lib/target-supports.exp and lib/target-supports-dg.exp scripts. Some are 
fixed expressions and some run the compiler to probe the configuration, 
but in this case you probably want to do something with "check-flags".

For the unroll problem, you can probably tweak the optimization options 
to disable that, the same as has been done for the epilogues feature 
that had the same problem.

Since these are new tests for a new feature, I don't really understand 
why this is classed as a regression?

Andrew

P.S. there was a commit to these tests in the last few days, so make 
sure you pull that before making changes.
On 13/04/2023 10:15, Andre Simoes Dias Vieira wrote:
> Hi,
> 
> @Andrew: Could you have a look at these? I had a quick look at 17f.c and it looks to me like the target selectors aren't specific enough. Unfortunately I am not familiar enough with target selectors (or targets for that matter) for x86_64. From what I could tell with -m32 gcc decides to unroll the vectorized loop so you end up with 4 simdclones rather than the 2 it tests for, GCC probably uses a different cost structure for -m32 that decides it is profitable to unroll?
> 
> As for -march=cascadelake, that seems to prevent gcc from using the inbranch simdclones altogether, so I suspect that cascadelake doesn't support these inbranch simdclones or the vector types it is trying to use.
> 
> Kind regards,
> Andre
> 
> ________________________________________
> From: haochen.jiang <haochenj@ecsmtp.sh.intel.com>
> Sent: Thursday, April 13, 2023 2:48 AM
> To: Andre Simoes Dias Vieira; gcc-regression@gcc.gnu.org; gcc-patches@gcc.gnu.org; haochen.jiang@intel.com
> Subject: [r13-7135 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64
> 
> On Linux/x86_64,
> 
> 58c8c1b383bc3c286d6527fc6e8fb62463f9a877 is the first bad commit
> commit 58c8c1b383bc3c286d6527fc6e8fb62463f9a877
> Author: Andre Vieira <andre.simoesdiasvieira@arm.com>
> Date:   Tue Apr 11 10:07:43 2023 +0100
> 
>      if-conv: Restore MASK_CALL conversion [PR108888]
> 
> caused
> 
> FAIL: gcc.dg/vect/vect-simd-clone-16e.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 3
> FAIL: gcc.dg/vect/vect-simd-clone-16f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2
> FAIL: gcc.dg/vect/vect-simd-clone-17e.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 3
> FAIL: gcc.dg/vect/vect-simd-clone-17f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2
> FAIL: gcc.dg/vect/vect-simd-clone-18e.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 3
> FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2
> 
> with GCC configured with
> 
> ../../gcc/configure --prefix=/export/users/haochenj/src/gcc-bisect/master/master/r13-7135/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl --enable-libmpx x86_64-linux --disable-bootstrap
> 
> To reproduce:
> 
> $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-16e.c --target_board='unix{-m32}'"
> $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-16e.c --target_board='unix{-m32\ -march=cascadelake}'"
> $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-16f.c --target_board='unix{-m32}'"
> $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-16f.c --target_board='unix{-m32\ -march=cascadelake}'"
> $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-17e.c --target_board='unix{-m32}'"
> $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-17e.c --target_board='unix{-m32\ -march=cascadelake}'"
> $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-17f.c --target_board='unix{-m32}'"
> $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-17f.c --target_board='unix{-m32\ -march=cascadelake}'"
> $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-18e.c --target_board='unix{-m32}'"
> $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-18e.c --target_board='unix{-m32\ -march=cascadelake}'"
> $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-18f.c --target_board='unix{-m32}'"
> $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-18f.c --target_board='unix{-m32\ -march=cascadelake}'"
> 
> (Please do not reply to this email, for question about this report, contact me at haochen dot jiang at intel.com)


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [r13-7135 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64
  2023-04-13 10:01   ` Andrew Stubbs
@ 2023-04-13 12:59     ` Andre Vieira (lists)
  2023-04-13 14:00       ` Richard Biener
  0 siblings, 1 reply; 13+ messages in thread
From: Andre Vieira (lists) @ 2023-04-13 12:59 UTC (permalink / raw)
  To: Andrew Stubbs, gcc-regression, gcc-patches, haochen.jiang



On 13/04/2023 11:01, Andrew Stubbs wrote:
> Hi Andre,
> 
> I don't have a cascadelake device to test on, nor any knowledge about 
> what makes it different from regular x86_64.

Not sure you need one, but yeah I don't know either, it looks like it 
fails because:
in-branch vector clones are not yet supported for integer mask modes.

A quick look tells me this is because mask_mode is not VOIDmode. 
i386.cc's TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN will set 
mask_mode to either DI or SI mode when TARGET_AVX512F. So I suspect 
cascadelake is TARGET_AVX512F.

This is where I bail out as I really don't want to dive into the target 
specific simd clone handling of x86 ;)

> 
> If the cascadelake device is supposed to work the same as other x86_64 
> devices for these vectors then the test has found a bug in the compiler 
> and you should be looking to fix that, not fudge the testcase.
> 
> Alternatively, if the device's capabilities really are different and the 
> tests should behave differently, then the actual expectations need to be 
> encoded in the dejagnu directives. If you can't tell the difference by 
> looking at the "x86_64*-*-*" target selector alone then the correct 
> solution is to invent a new "effective-target" selector. There are lots 
> of examples of using these throughout the testsuite (you could use 
> dg-require-effective-target to disable the whole testcase, or just use 
> the name in the scan-tree-dump-times directive to customise the 
> expectations), and the definitions can be found in the 
> lib/target-supports.exp and lib/target-supports-dg.exp scripts. Some are 
> fixed expressions and some run the compiler to probe the configuration, 
> but in this case you probably want to do something with "check-flags".

Even though I agree with you, I'm not the right person to do this 
digging for such target specific stuff. So for now I'd probably suggest 
xfailing this for avx512f.
> 
> For the unroll problem, you can probably tweak the optimization options 
> to disable that, the same as has been done for the epilogues feature 
> that had the same problem.

I mistaken the current behaviour for unrolling, it's actually because of 
a latent bug. The vectorizer calls `vect_get_smallest_scalar_type` to 
determine the vectype of a stmt. For a function like foo, that has the 
same type (long long) everywhere this wouldn't be a problem, however, 
because you transformed it into a MASK_CALL that has a function pointer 
(which is 32-bit in -m32) that now becomes the 'smallest' type.

This is all a red-herring though, I don't think we should be calling 
this function for potential simdclone calls as the type on which the 
veclen is not necessarily the 'smallest' type. And some arguments (like 
uniform and linear) should be ignored anyway as they won't be mapped to 
vectors.  So I do think this might have been broken even before your 
changes, but needs further investigation.
> Since these are new tests for a new feature, I don't really understand 
> why this is classed as a regression?
> 
> Andrew
> 
> P.S. there was a commit to these tests in the last few days, so make 
> sure you pull that before making changes.

The latest commit to these tests was mine, it's the one Haochen is 
reporting this regression against. My commit was to fix the issue richi 
had introduced that was preventing the feature you introduced from 
working. The reason nobody noticed was because the tests you introduced 
didn't actually test your feature, since you didn't specify 'inbranch' 
the omp declare simd pragma was allowing the use of not-inbranch simd 
clones and the vectorizer was being smart enough to circumvent the 
conditional and was still able to use simdclones (non inbranch ones) so 
when the inbranch stopped working, the test didn't notice.

The other changes to this test were already after the fix for 108888 
that broke the inbranch feature you added, and so it was fixing a 
cascadelake testism but for the not-inbranch simdclones. So basically 
fixing a testism of a testism :/


I am working on simdclone's for AArch64 for next Stage 1 so I don't mind 
looking at the issue with the vectype being chosen wrongly, as for the 
other x86 specific testisms I'll leave them to someone else.

Kind Regards,
Andre

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [r13-7135 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64
  2023-04-13 12:59     ` Andre Vieira (lists)
@ 2023-04-13 14:00       ` Richard Biener
  2023-04-13 14:25         ` Andre Vieira (lists)
  0 siblings, 1 reply; 13+ messages in thread
From: Richard Biener @ 2023-04-13 14:00 UTC (permalink / raw)
  To: Andre Vieira (lists)
  Cc: Andrew Stubbs, gcc-regression, gcc-patches, haochen.jiang

On Thu, Apr 13, 2023 at 3:00 PM Andre Vieira (lists) via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
>
>
> On 13/04/2023 11:01, Andrew Stubbs wrote:
> > Hi Andre,
> >
> > I don't have a cascadelake device to test on, nor any knowledge about
> > what makes it different from regular x86_64.
>
> Not sure you need one, but yeah I don't know either, it looks like it
> fails because:
> in-branch vector clones are not yet supported for integer mask modes.
>
> A quick look tells me this is because mask_mode is not VOIDmode.
> i386.cc's TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN will set
> mask_mode to either DI or SI mode when TARGET_AVX512F. So I suspect
> cascadelake is TARGET_AVX512F.
>
> This is where I bail out as I really don't want to dive into the target
> specific simd clone handling of x86 ;)
>
> >
> > If the cascadelake device is supposed to work the same as other x86_64
> > devices for these vectors then the test has found a bug in the compiler
> > and you should be looking to fix that, not fudge the testcase.
> >
> > Alternatively, if the device's capabilities really are different and the
> > tests should behave differently, then the actual expectations need to be
> > encoded in the dejagnu directives. If you can't tell the difference by
> > looking at the "x86_64*-*-*" target selector alone then the correct
> > solution is to invent a new "effective-target" selector. There are lots
> > of examples of using these throughout the testsuite (you could use
> > dg-require-effective-target to disable the whole testcase, or just use
> > the name in the scan-tree-dump-times directive to customise the
> > expectations), and the definitions can be found in the
> > lib/target-supports.exp and lib/target-supports-dg.exp scripts. Some are
> > fixed expressions and some run the compiler to probe the configuration,
> > but in this case you probably want to do something with "check-flags".
>
> Even though I agree with you, I'm not the right person to do this
> digging for such target specific stuff. So for now I'd probably suggest
> xfailing this for avx512f.
> >
> > For the unroll problem, you can probably tweak the optimization options
> > to disable that, the same as has been done for the epilogues feature
> > that had the same problem.
>
> I mistaken the current behaviour for unrolling, it's actually because of
> a latent bug. The vectorizer calls `vect_get_smallest_scalar_type` to
> determine the vectype of a stmt. For a function like foo, that has the
> same type (long long) everywhere this wouldn't be a problem, however,
> because you transformed it into a MASK_CALL that has a function pointer
> (which is 32-bit in -m32) that now becomes the 'smallest' type.
>
> This is all a red-herring though, I don't think we should be calling
> this function for potential simdclone calls as the type on which the
> veclen is not necessarily the 'smallest' type. And some arguments (like
> uniform and linear) should be ignored anyway as they won't be mapped to
> vectors.  So I do think this might have been broken even before your
> changes, but needs further investigation.
> > Since these are new tests for a new feature, I don't really understand
> > why this is classed as a regression?
> >
> > Andrew
> >
> > P.S. there was a commit to these tests in the last few days, so make
> > sure you pull that before making changes.
>
> The latest commit to these tests was mine, it's the one Haochen is
> reporting this regression against. My commit was to fix the issue richi
> had introduced that was preventing the feature you introduced from
> working. The reason nobody noticed was because the tests you introduced
> didn't actually test your feature, since you didn't specify 'inbranch'
> the omp declare simd pragma was allowing the use of not-inbranch simd
> clones and the vectorizer was being smart enough to circumvent the
> conditional and was still able to use simdclones (non inbranch ones) so
> when the inbranch stopped working, the test didn't notice.
>
> The other changes to this test were already after the fix for 108888
> that broke the inbranch feature you added, and so it was fixing a
> cascadelake testism but for the not-inbranch simdclones. So basically
> fixing a testism of a testism :/
>
>
> I am working on simdclone's for AArch64 for next Stage 1 so I don't mind
> looking at the issue with the vectype being chosen wrongly, as for the
> other x86 specific testisms I'll leave them to someone else.

Btw, the new testsuite FAILs could be just epiloge vectorizations, so
maybe try the usual --param vect-epilogues-nomask=0 ...

> Kind Regards,
> Andre

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [r13-7135 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64
  2023-04-13 14:00       ` Richard Biener
@ 2023-04-13 14:25         ` Andre Vieira (lists)
  2023-04-14  6:55           ` Richard Biener
  0 siblings, 1 reply; 13+ messages in thread
From: Andre Vieira (lists) @ 2023-04-13 14:25 UTC (permalink / raw)
  To: Richard Biener; +Cc: Andrew Stubbs, gcc-regression, gcc-patches, haochen.jiang



On 13/04/2023 15:00, Richard Biener wrote:
> On Thu, Apr 13, 2023 at 3:00 PM Andre Vieira (lists) via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>>
>>
>>
>> On 13/04/2023 11:01, Andrew Stubbs wrote:
>>> Hi Andre,
>>>
>>> I don't have a cascadelake device to test on, nor any knowledge about
>>> what makes it different from regular x86_64.
>>
>> Not sure you need one, but yeah I don't know either, it looks like it
>> fails because:
>> in-branch vector clones are not yet supported for integer mask modes.
>>
>> A quick look tells me this is because mask_mode is not VOIDmode.
>> i386.cc's TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN will set
>> mask_mode to either DI or SI mode when TARGET_AVX512F. So I suspect
>> cascadelake is TARGET_AVX512F.
>>
>> This is where I bail out as I really don't want to dive into the target
>> specific simd clone handling of x86 ;)
>>
>>>
>>> If the cascadelake device is supposed to work the same as other x86_64
>>> devices for these vectors then the test has found a bug in the compiler
>>> and you should be looking to fix that, not fudge the testcase.
>>>
>>> Alternatively, if the device's capabilities really are different and the
>>> tests should behave differently, then the actual expectations need to be
>>> encoded in the dejagnu directives. If you can't tell the difference by
>>> looking at the "x86_64*-*-*" target selector alone then the correct
>>> solution is to invent a new "effective-target" selector. There are lots
>>> of examples of using these throughout the testsuite (you could use
>>> dg-require-effective-target to disable the whole testcase, or just use
>>> the name in the scan-tree-dump-times directive to customise the
>>> expectations), and the definitions can be found in the
>>> lib/target-supports.exp and lib/target-supports-dg.exp scripts. Some are
>>> fixed expressions and some run the compiler to probe the configuration,
>>> but in this case you probably want to do something with "check-flags".
>>
>> Even though I agree with you, I'm not the right person to do this
>> digging for such target specific stuff. So for now I'd probably suggest
>> xfailing this for avx512f.
>>>
>>> For the unroll problem, you can probably tweak the optimization options
>>> to disable that, the same as has been done for the epilogues feature
>>> that had the same problem.
>>
>> I mistaken the current behaviour for unrolling, it's actually because of
>> a latent bug. The vectorizer calls `vect_get_smallest_scalar_type` to
>> determine the vectype of a stmt. For a function like foo, that has the
>> same type (long long) everywhere this wouldn't be a problem, however,
>> because you transformed it into a MASK_CALL that has a function pointer
>> (which is 32-bit in -m32) that now becomes the 'smallest' type.
>>
>> This is all a red-herring though, I don't think we should be calling
>> this function for potential simdclone calls as the type on which the
>> veclen is not necessarily the 'smallest' type. And some arguments (like
>> uniform and linear) should be ignored anyway as they won't be mapped to
>> vectors.  So I do think this might have been broken even before your
>> changes, but needs further investigation.
>>> Since these are new tests for a new feature, I don't really understand
>>> why this is classed as a regression?
>>>
>>> Andrew
>>>
>>> P.S. there was a commit to these tests in the last few days, so make
>>> sure you pull that before making changes.
>>
>> The latest commit to these tests was mine, it's the one Haochen is
>> reporting this regression against. My commit was to fix the issue richi
>> had introduced that was preventing the feature you introduced from
>> working. The reason nobody noticed was because the tests you introduced
>> didn't actually test your feature, since you didn't specify 'inbranch'
>> the omp declare simd pragma was allowing the use of not-inbranch simd
>> clones and the vectorizer was being smart enough to circumvent the
>> conditional and was still able to use simdclones (non inbranch ones) so
>> when the inbranch stopped working, the test didn't notice.
>>
>> The other changes to this test were already after the fix for 108888
>> that broke the inbranch feature you added, and so it was fixing a
>> cascadelake testism but for the not-inbranch simdclones. So basically
>> fixing a testism of a testism :/
>>
>>
>> I am working on simdclone's for AArch64 for next Stage 1 so I don't mind
>> looking at the issue with the vectype being chosen wrongly, as for the
>> other x86 specific testisms I'll leave them to someone else.
> 
> Btw, the new testsuite FAILs could be just epiloge vectorizations, so
> maybe try the usual --param vect-epilogues-nomask=0 ...
It already has those, Jakub added them.

But that's not it, I've been looking at it, and there is code in place 
that does what I expected which is defer the choice of vectype for simd 
clones until vectorizable_simd_clone_call, unfortunately it has a 
mistaken assumption that simdclones don't return :/
see vect_get_vector_types_for_stmt:
...
   if (gimple_get_lhs (stmt) == NULL_TREE
       /* MASK_STORE has no lhs, but is ok.  */
       && !gimple_call_internal_p (stmt, IFN_MASK_STORE))
     {
       if (is_a <gcall *> (stmt))
         {
           /* Ignore calls with no lhs.  These must be calls to
              #pragma omp simd functions, and what vectorization factor
              it really needs can't be determined until
              vectorizable_simd_clone_call.  */
           if (dump_enabled_p ())
             dump_printf_loc (MSG_NOTE, vect_location,
                              "defer to SIMD clone analysis.\n");
           return opt_result::success ();
         }

       return opt_result::failure_at (stmt,
                                      "not vectorized: irregular 
stmt.%G", stmt);
     }
...

I'm working on a patch.
> 
>> Kind Regards,
>> Andre

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [r13-7135 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64
  2023-04-13 14:25         ` Andre Vieira (lists)
@ 2023-04-14  6:55           ` Richard Biener
  2023-04-14  8:43             ` Andre Vieira (lists)
  0 siblings, 1 reply; 13+ messages in thread
From: Richard Biener @ 2023-04-14  6:55 UTC (permalink / raw)
  To: Andre Vieira (lists)
  Cc: Andrew Stubbs, gcc-regression, gcc-patches, haochen.jiang

On Thu, Apr 13, 2023 at 4:25 PM Andre Vieira (lists)
<andre.simoesdiasvieira@arm.com> wrote:
>
>
>
> On 13/04/2023 15:00, Richard Biener wrote:
> > On Thu, Apr 13, 2023 at 3:00 PM Andre Vieira (lists) via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> >>
> >>
> >>
> >> On 13/04/2023 11:01, Andrew Stubbs wrote:
> >>> Hi Andre,
> >>>
> >>> I don't have a cascadelake device to test on, nor any knowledge about
> >>> what makes it different from regular x86_64.
> >>
> >> Not sure you need one, but yeah I don't know either, it looks like it
> >> fails because:
> >> in-branch vector clones are not yet supported for integer mask modes.
> >>
> >> A quick look tells me this is because mask_mode is not VOIDmode.
> >> i386.cc's TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN will set
> >> mask_mode to either DI or SI mode when TARGET_AVX512F. So I suspect
> >> cascadelake is TARGET_AVX512F.
> >>
> >> This is where I bail out as I really don't want to dive into the target
> >> specific simd clone handling of x86 ;)
> >>
> >>>
> >>> If the cascadelake device is supposed to work the same as other x86_64
> >>> devices for these vectors then the test has found a bug in the compiler
> >>> and you should be looking to fix that, not fudge the testcase.
> >>>
> >>> Alternatively, if the device's capabilities really are different and the
> >>> tests should behave differently, then the actual expectations need to be
> >>> encoded in the dejagnu directives. If you can't tell the difference by
> >>> looking at the "x86_64*-*-*" target selector alone then the correct
> >>> solution is to invent a new "effective-target" selector. There are lots
> >>> of examples of using these throughout the testsuite (you could use
> >>> dg-require-effective-target to disable the whole testcase, or just use
> >>> the name in the scan-tree-dump-times directive to customise the
> >>> expectations), and the definitions can be found in the
> >>> lib/target-supports.exp and lib/target-supports-dg.exp scripts. Some are
> >>> fixed expressions and some run the compiler to probe the configuration,
> >>> but in this case you probably want to do something with "check-flags".
> >>
> >> Even though I agree with you, I'm not the right person to do this
> >> digging for such target specific stuff. So for now I'd probably suggest
> >> xfailing this for avx512f.
> >>>
> >>> For the unroll problem, you can probably tweak the optimization options
> >>> to disable that, the same as has been done for the epilogues feature
> >>> that had the same problem.
> >>
> >> I mistaken the current behaviour for unrolling, it's actually because of
> >> a latent bug. The vectorizer calls `vect_get_smallest_scalar_type` to
> >> determine the vectype of a stmt. For a function like foo, that has the
> >> same type (long long) everywhere this wouldn't be a problem, however,
> >> because you transformed it into a MASK_CALL that has a function pointer
> >> (which is 32-bit in -m32) that now becomes the 'smallest' type.
> >>
> >> This is all a red-herring though, I don't think we should be calling
> >> this function for potential simdclone calls as the type on which the
> >> veclen is not necessarily the 'smallest' type. And some arguments (like
> >> uniform and linear) should be ignored anyway as they won't be mapped to
> >> vectors.  So I do think this might have been broken even before your
> >> changes, but needs further investigation.
> >>> Since these are new tests for a new feature, I don't really understand
> >>> why this is classed as a regression?
> >>>
> >>> Andrew
> >>>
> >>> P.S. there was a commit to these tests in the last few days, so make
> >>> sure you pull that before making changes.
> >>
> >> The latest commit to these tests was mine, it's the one Haochen is
> >> reporting this regression against. My commit was to fix the issue richi
> >> had introduced that was preventing the feature you introduced from
> >> working. The reason nobody noticed was because the tests you introduced
> >> didn't actually test your feature, since you didn't specify 'inbranch'
> >> the omp declare simd pragma was allowing the use of not-inbranch simd
> >> clones and the vectorizer was being smart enough to circumvent the
> >> conditional and was still able to use simdclones (non inbranch ones) so
> >> when the inbranch stopped working, the test didn't notice.
> >>
> >> The other changes to this test were already after the fix for 108888
> >> that broke the inbranch feature you added, and so it was fixing a
> >> cascadelake testism but for the not-inbranch simdclones. So basically
> >> fixing a testism of a testism :/
> >>
> >>
> >> I am working on simdclone's for AArch64 for next Stage 1 so I don't mind
> >> looking at the issue with the vectype being chosen wrongly, as for the
> >> other x86 specific testisms I'll leave them to someone else.
> >
> > Btw, the new testsuite FAILs could be just epiloge vectorizations, so
> > maybe try the usual --param vect-epilogues-nomask=0 ...
> It already has those, Jakub added them.
>
> But that's not it, I've been looking at it, and there is code in place
> that does what I expected which is defer the choice of vectype for simd
> clones until vectorizable_simd_clone_call, unfortunately it has a
> mistaken assumption that simdclones don't return :/

I think that's not it - when the SIMD clone returns a vector we have to
determine the vector type in this function.  We cannot defer this.

> see vect_get_vector_types_for_stmt:
> ...
>    if (gimple_get_lhs (stmt) == NULL_TREE
>        /* MASK_STORE has no lhs, but is ok.  */
>        && !gimple_call_internal_p (stmt, IFN_MASK_STORE))
>      {
>        if (is_a <gcall *> (stmt))
>          {
>            /* Ignore calls with no lhs.  These must be calls to
>               #pragma omp simd functions, and what vectorization factor
>               it really needs can't be determined until
>               vectorizable_simd_clone_call.  */
>            if (dump_enabled_p ())
>              dump_printf_loc (MSG_NOTE, vect_location,
>                               "defer to SIMD clone analysis.\n");
>            return opt_result::success ();
>          }
>
>        return opt_result::failure_at (stmt,
>                                       "not vectorized: irregular
> stmt.%G", stmt);
>      }
> ...
>
> I'm working on a patch.
> >
> >> Kind Regards,
> >> Andre

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [r13-7135 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64
  2023-04-14  6:55           ` Richard Biener
@ 2023-04-14  8:43             ` Andre Vieira (lists)
  2023-04-14  9:09               ` Richard Biener
  0 siblings, 1 reply; 13+ messages in thread
From: Andre Vieira (lists) @ 2023-04-14  8:43 UTC (permalink / raw)
  To: Richard Biener; +Cc: Andrew Stubbs, gcc-regression, gcc-patches, haochen.jiang

Resending this to everyone (sorry for the double send Richard).

On 14/04/2023 09:15, Andre Vieira (lists) wrote:
 >
 >
 > On 14/04/2023 07:55, Richard Biener wrote:
 >> On Thu, Apr 13, 2023 at 4:25 PM Andre Vieira (lists)
 >> <andre.simoesdiasvieira@arm.com> wrote:
 >>>
 >>>
 >>>
 >>> On 13/04/2023 15:00, Richard Biener wrote:
 >>>> On Thu, Apr 13, 2023 at 3:00 PM Andre Vieira (lists) via Gcc-patches
 >>>> <gcc-patches@gcc.gnu.org> wrote:
 >>>>>
 >>>>>
 >>>>>
 >>>
 >>> But that's not it, I've been looking at it, and there is code in place
 >>> that does what I expected which is defer the choice of vectype for simd
 >>> clones until vectorizable_simd_clone_call, unfortunately it has a
 >>> mistaken assumption that simdclones don't return :/
 >>
 >> I think that's not it - when the SIMD clone returns a vector we have to
 >> determine the vector type in this function.  We cannot defer this.
 >
 > What's 'this function' here, do you mean we have to determine the
 > vectype in 'vect_get_vector_types_for_stmt' &
 > 'vect_determine_vf_for_stmt' ? Because at that time we don't yet know
 > what clone we will be using, this choice is done inside
 > vectorizable_simd_clone_call. In fact, to choose the simd clone, we need
 > to know the vf as that has to be a multiple of the chosen clone's
 > simdlen. So we simply can't use the simdclone's types (as that depends
 > on the simdlen) to choose the vf because the choice of simdlend depends
 > on the vf. And there was already code in place to handle this,
 > unfortunately that code was wrong and had the wrong assumption that
 > simdclones didn't return (probably was true at some point and bitrotted).
 >
 >>
 >>> see vect_get_vector_types_for_stmt:
 >>> ...
 >>>     if (gimple_get_lhs (stmt) == NULL_TREE
 >>>         /* MASK_STORE has no lhs, but is ok.  */
 >>>         && !gimple_call_internal_p (stmt, IFN_MASK_STORE))
 >>>       {
 >>>         if (is_a <gcall *> (stmt))
 >>>           {
 >>>             /* Ignore calls with no lhs.  These must be calls to
 >>>                #pragma omp simd functions, and what vectorization 
factor
 >>>                it really needs can't be determined until
 >>>                vectorizable_simd_clone_call.  */
 >>>             if (dump_enabled_p ())
 >>>               dump_printf_loc (MSG_NOTE, vect_location,
 >>>                                "defer to SIMD clone analysis.\n");
 >>>             return opt_result::success ();
 >>>           }
 >>>
 >>>         return opt_result::failure_at (stmt,
 >>>                                        "not vectorized: irregular
 >>> stmt.%G", stmt);
 >>>       }
 >>> ...
 >>>
 >>> I'm working on a patch.
 >>>>
 >>>>> Kind Regards,
 >>>>> Andre

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [r13-7135 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64
  2023-04-14  8:43             ` Andre Vieira (lists)
@ 2023-04-14  9:09               ` Richard Biener
  2023-04-14  9:14                 ` Richard Biener
  2023-04-14  9:42                 ` Andre Vieira (lists)
  0 siblings, 2 replies; 13+ messages in thread
From: Richard Biener @ 2023-04-14  9:09 UTC (permalink / raw)
  To: Andre Vieira (lists)
  Cc: Andrew Stubbs, gcc-regression, gcc-patches, haochen.jiang

On Fri, Apr 14, 2023 at 10:43 AM Andre Vieira (lists)
<andre.simoesdiasvieira@arm.com> wrote:
>
> Resending this to everyone (sorry for the double send Richard).
>
> On 14/04/2023 09:15, Andre Vieira (lists) wrote:
>  >
>  >
>  > On 14/04/2023 07:55, Richard Biener wrote:
>  >> On Thu, Apr 13, 2023 at 4:25 PM Andre Vieira (lists)
>  >> <andre.simoesdiasvieira@arm.com> wrote:
>  >>>
>  >>>
>  >>>
>  >>> On 13/04/2023 15:00, Richard Biener wrote:
>  >>>> On Thu, Apr 13, 2023 at 3:00 PM Andre Vieira (lists) via Gcc-patches
>  >>>> <gcc-patches@gcc.gnu.org> wrote:
>  >>>>>
>  >>>>>
>  >>>>>
>  >>>
>  >>> But that's not it, I've been looking at it, and there is code in place
>  >>> that does what I expected which is defer the choice of vectype for simd
>  >>> clones until vectorizable_simd_clone_call, unfortunately it has a
>  >>> mistaken assumption that simdclones don't return :/
>  >>
>  >> I think that's not it - when the SIMD clone returns a vector we have to
>  >> determine the vector type in this function.  We cannot defer this.
>  >
>  > What's 'this function' here, do you mean we have to determine the
>  > vectype in 'vect_get_vector_types_for_stmt' &
>  > 'vect_determine_vf_for_stmt' ?

Yes.

> Because at that time we don't yet know
>  > what clone we will be using, this choice is done inside
>  > vectorizable_simd_clone_call. In fact, to choose the simd clone, we need
>  > to know the vf as that has to be a multiple of the chosen clone's
>  > simdlen. So we simply can't use the simdclone's types (as that depends
>  > on the simdlen) to choose the vf because the choice of simdlend depends
>  > on the vf. And there was already code in place to handle this,
>  > unfortunately that code was wrong and had the wrong assumption that
>  > simdclones didn't return (probably was true at some point and bitrotted).

But to compute the VF we need to know the vector types!  We're only
calling vectorizable_* when the VF is final.  That said, the code you quote:

>  >>
>  >>> see vect_get_vector_types_for_stmt:
>  >>> ...
>  >>>     if (gimple_get_lhs (stmt) == NULL_TREE

is just for the case of a function without return value.  For this case
it's OK to do nothing - 'vectype' is the vector type of all vector defs
a stmt produces.

For calls with a LHS it should fall through to generic code doing
get_vectype_for_scalar_type on the LHS type.

>  >>>         /* MASK_STORE has no lhs, but is ok.  */
>  >>>         && !gimple_call_internal_p (stmt, IFN_MASK_STORE))
>  >>>       {
>  >>>         if (is_a <gcall *> (stmt))
>  >>>           {
>  >>>             /* Ignore calls with no lhs.  These must be calls to
>  >>>                #pragma omp simd functions, and what vectorization
> factor
>  >>>                it really needs can't be determined until
>  >>>                vectorizable_simd_clone_call.  */
>  >>>             if (dump_enabled_p ())
>  >>>               dump_printf_loc (MSG_NOTE, vect_location,
>  >>>                                "defer to SIMD clone analysis.\n");
>  >>>             return opt_result::success ();
>  >>>           }
>  >>>
>  >>>         return opt_result::failure_at (stmt,
>  >>>                                        "not vectorized: irregular
>  >>> stmt.%G", stmt);
>  >>>       }
>  >>> ...
>  >>>
>  >>> I'm working on a patch.
>  >>>>
>  >>>>> Kind Regards,
>  >>>>> Andre

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [r13-7135 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64
  2023-04-14  9:09               ` Richard Biener
@ 2023-04-14  9:14                 ` Richard Biener
  2023-04-14  9:42                 ` Andre Vieira (lists)
  1 sibling, 0 replies; 13+ messages in thread
From: Richard Biener @ 2023-04-14  9:14 UTC (permalink / raw)
  To: Andre Vieira (lists)
  Cc: Andrew Stubbs, gcc-regression, gcc-patches, haochen.jiang

On Fri, Apr 14, 2023 at 11:09 AM Richard Biener
<richard.guenther@gmail.com> wrote:
>
> On Fri, Apr 14, 2023 at 10:43 AM Andre Vieira (lists)
> <andre.simoesdiasvieira@arm.com> wrote:
> >
> > Resending this to everyone (sorry for the double send Richard).
> >
> > On 14/04/2023 09:15, Andre Vieira (lists) wrote:
> >  >
> >  >
> >  > On 14/04/2023 07:55, Richard Biener wrote:
> >  >> On Thu, Apr 13, 2023 at 4:25 PM Andre Vieira (lists)
> >  >> <andre.simoesdiasvieira@arm.com> wrote:
> >  >>>
> >  >>>
> >  >>>
> >  >>> On 13/04/2023 15:00, Richard Biener wrote:
> >  >>>> On Thu, Apr 13, 2023 at 3:00 PM Andre Vieira (lists) via Gcc-patches
> >  >>>> <gcc-patches@gcc.gnu.org> wrote:
> >  >>>>>
> >  >>>>>
> >  >>>>>
> >  >>>
> >  >>> But that's not it, I've been looking at it, and there is code in place
> >  >>> that does what I expected which is defer the choice of vectype for simd
> >  >>> clones until vectorizable_simd_clone_call, unfortunately it has a
> >  >>> mistaken assumption that simdclones don't return :/
> >  >>
> >  >> I think that's not it - when the SIMD clone returns a vector we have to
> >  >> determine the vector type in this function.  We cannot defer this.
> >  >
> >  > What's 'this function' here, do you mean we have to determine the
> >  > vectype in 'vect_get_vector_types_for_stmt' &
> >  > 'vect_determine_vf_for_stmt' ?
>
> Yes.
>
> > Because at that time we don't yet know
> >  > what clone we will be using, this choice is done inside
> >  > vectorizable_simd_clone_call. In fact, to choose the simd clone, we need
> >  > to know the vf as that has to be a multiple of the chosen clone's
> >  > simdlen. So we simply can't use the simdclone's types (as that depends
> >  > on the simdlen) to choose the vf because the choice of simdlend depends
> >  > on the vf. And there was already code in place to handle this,
> >  > unfortunately that code was wrong and had the wrong assumption that
> >  > simdclones didn't return (probably was true at some point and bitrotted).
>
> But to compute the VF we need to know the vector types!  We're only
> calling vectorizable_* when the VF is final.  That said, the code you quote:
>
> >  >>
> >  >>> see vect_get_vector_types_for_stmt:
> >  >>> ...
> >  >>>     if (gimple_get_lhs (stmt) == NULL_TREE
>
> is just for the case of a function without return value.  For this case
> it's OK to do nothing - 'vectype' is the vector type of all vector defs
> a stmt produces.
>
> For calls with a LHS it should fall through to generic code doing
> get_vectype_for_scalar_type on the LHS type.

Isn't just the target selector wrong in these tests?

/* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone}
2 "vect" { target { { ! avx_runtime } && { ! { i686*-*-* && { ! lp64 }
} } } } } } */
/* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone}
3 "vect" { target { avx_runtime && { ! { i686*-*-* && { ! lp64 } } } }
} } } */
/* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone}
4 "vect" { target { i686*-*-* && { ! lp64 } } } } } */

the i686-*-* should be { i?86-*-* x86_64-*-* } && { ! lp64 } to
properly check x86_64 -m32?

I'm going to patch that.

Richard.

> >  >>>         /* MASK_STORE has no lhs, but is ok.  */
> >  >>>         && !gimple_call_internal_p (stmt, IFN_MASK_STORE))
> >  >>>       {
> >  >>>         if (is_a <gcall *> (stmt))
> >  >>>           {
> >  >>>             /* Ignore calls with no lhs.  These must be calls to
> >  >>>                #pragma omp simd functions, and what vectorization
> > factor
> >  >>>                it really needs can't be determined until
> >  >>>                vectorizable_simd_clone_call.  */
> >  >>>             if (dump_enabled_p ())
> >  >>>               dump_printf_loc (MSG_NOTE, vect_location,
> >  >>>                                "defer to SIMD clone analysis.\n");
> >  >>>             return opt_result::success ();
> >  >>>           }
> >  >>>
> >  >>>         return opt_result::failure_at (stmt,
> >  >>>                                        "not vectorized: irregular
> >  >>> stmt.%G", stmt);
> >  >>>       }
> >  >>> ...
> >  >>>
> >  >>> I'm working on a patch.
> >  >>>>
> >  >>>>> Kind Regards,
> >  >>>>> Andre

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [r13-7135 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64
  2023-04-14  9:09               ` Richard Biener
  2023-04-14  9:14                 ` Richard Biener
@ 2023-04-14  9:42                 ` Andre Vieira (lists)
  2023-04-14 11:47                   ` Richard Biener
  1 sibling, 1 reply; 13+ messages in thread
From: Andre Vieira (lists) @ 2023-04-14  9:42 UTC (permalink / raw)
  To: Richard Biener; +Cc: Andrew Stubbs, gcc-regression, gcc-patches, haochen.jiang



On 14/04/2023 10:09, Richard Biener wrote:
> On Fri, Apr 14, 2023 at 10:43 AM Andre Vieira (lists)
> <andre.simoesdiasvieira@arm.com> wrote:
>>
>> Resending this to everyone (sorry for the double send Richard).
>>
>> On 14/04/2023 09:15, Andre Vieira (lists) wrote:
>>   >
>>   >
>>   > On 14/04/2023 07:55, Richard Biener wrote:
>>   >> On Thu, Apr 13, 2023 at 4:25 PM Andre Vieira (lists)
>>   >> <andre.simoesdiasvieira@arm.com> wrote:
>>   >>>
>>   >>>
>>   >>>
>>   >>> On 13/04/2023 15:00, Richard Biener wrote:
>>   >>>> On Thu, Apr 13, 2023 at 3:00 PM Andre Vieira (lists) via Gcc-patches
>>   >>>> <gcc-patches@gcc.gnu.org> wrote:
>>   >>>>>
>>   >>>>>
>>   >>>>>
>>   >>>
>>   >>> But that's not it, I've been looking at it, and there is code in place
>>   >>> that does what I expected which is defer the choice of vectype for simd
>>   >>> clones until vectorizable_simd_clone_call, unfortunately it has a
>>   >>> mistaken assumption that simdclones don't return :/
>>   >>
>>   >> I think that's not it - when the SIMD clone returns a vector we have to
>>   >> determine the vector type in this function.  We cannot defer this.
>>   >
>>   > What's 'this function' here, do you mean we have to determine the
>>   > vectype in 'vect_get_vector_types_for_stmt' &
>>   > 'vect_determine_vf_for_stmt' ?
> 
> Yes.
> 
>> Because at that time we don't yet know
>>   > what clone we will be using, this choice is done inside
>>   > vectorizable_simd_clone_call. In fact, to choose the simd clone, we need
>>   > to know the vf as that has to be a multiple of the chosen clone's
>>   > simdlen. So we simply can't use the simdclone's types (as that depends
>>   > on the simdlen) to choose the vf because the choice of simdlend depends
>>   > on the vf. And there was already code in place to handle this,
>>   > unfortunately that code was wrong and had the wrong assumption that
>>   > simdclones didn't return (probably was true at some point and bitrotted).
> 
> But to compute the VF we need to know the vector types!  We're only
> calling vectorizable_* when the VF is final.  That said, the code you quote:
> 
>>   >>
>>   >>> see vect_get_vector_types_for_stmt:
>>   >>> ...
>>   >>>     if (gimple_get_lhs (stmt) == NULL_TREE
> 
> is just for the case of a function without return value.  For this case
> it's OK to do nothing - 'vectype' is the vector type of all vector defs
> a stmt produces.
> 
> For calls with a LHS it should fall through to generic code doing
> get_vectype_for_scalar_type on the LHS type.

I think that may work, but right now it will still go and look at the 
arguments of the call and use the smallest type among them to adjust the 
nunits (in 'vect_get_vector_types_for_stmt').

In fact (this is just for illustration) if I hack that function like this:
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -12745,8 +12745,11 @@ vect_get_vector_types_for_stmt (vec_info 
*vinfo, stmt_vec_info stmt_info,
        /* The number of units is set according to the smallest scalar
          type (or the largest vector size, but we only support one
          vector size per vectorization).  */
-      scalar_type = vect_get_smallest_scalar_type (stmt_info,
-                                                  TREE_TYPE (vectype));
+      if (simd_clone_call_p (stmt_info->stmt))
+       scalar_type = TREE_TYPE (vectype);
+      else
+       scalar_type = vect_get_smallest_scalar_type (stmt_info,
+                                                    TREE_TYPE (vectype));
        if (scalar_type != TREE_TYPE (vectype))
         {
           if (dump_enabled_p ())

It will use the same types as before without (-m32), like I explained 
before the -m32 turns the pointer inside MASK_CALL into a 32-bit pointer 
so now the smallest size is 32-bits. This makes it pick V8SI instead of 
the original V4DI (scalar return type is DImode). Changing the VF to 8, 
thus unrolling the loop as it needs to make 2 calls, each handling 4 nunits.




^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [r13-7135 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64
  2023-04-14  9:42                 ` Andre Vieira (lists)
@ 2023-04-14 11:47                   ` Richard Biener
  2023-04-14 12:57                     ` Andre Vieira (lists)
  0 siblings, 1 reply; 13+ messages in thread
From: Richard Biener @ 2023-04-14 11:47 UTC (permalink / raw)
  To: Andre Vieira (lists)
  Cc: Andrew Stubbs, gcc-regression, gcc-patches, haochen.jiang

On Fri, Apr 14, 2023 at 11:42 AM Andre Vieira (lists)
<andre.simoesdiasvieira@arm.com> wrote:
>
>
>
> On 14/04/2023 10:09, Richard Biener wrote:
> > On Fri, Apr 14, 2023 at 10:43 AM Andre Vieira (lists)
> > <andre.simoesdiasvieira@arm.com> wrote:
> >>
> >> Resending this to everyone (sorry for the double send Richard).
> >>
> >> On 14/04/2023 09:15, Andre Vieira (lists) wrote:
> >>   >
> >>   >
> >>   > On 14/04/2023 07:55, Richard Biener wrote:
> >>   >> On Thu, Apr 13, 2023 at 4:25 PM Andre Vieira (lists)
> >>   >> <andre.simoesdiasvieira@arm.com> wrote:
> >>   >>>
> >>   >>>
> >>   >>>
> >>   >>> On 13/04/2023 15:00, Richard Biener wrote:
> >>   >>>> On Thu, Apr 13, 2023 at 3:00 PM Andre Vieira (lists) via Gcc-patches
> >>   >>>> <gcc-patches@gcc.gnu.org> wrote:
> >>   >>>>>
> >>   >>>>>
> >>   >>>>>
> >>   >>>
> >>   >>> But that's not it, I've been looking at it, and there is code in place
> >>   >>> that does what I expected which is defer the choice of vectype for simd
> >>   >>> clones until vectorizable_simd_clone_call, unfortunately it has a
> >>   >>> mistaken assumption that simdclones don't return :/
> >>   >>
> >>   >> I think that's not it - when the SIMD clone returns a vector we have to
> >>   >> determine the vector type in this function.  We cannot defer this.
> >>   >
> >>   > What's 'this function' here, do you mean we have to determine the
> >>   > vectype in 'vect_get_vector_types_for_stmt' &
> >>   > 'vect_determine_vf_for_stmt' ?
> >
> > Yes.
> >
> >> Because at that time we don't yet know
> >>   > what clone we will be using, this choice is done inside
> >>   > vectorizable_simd_clone_call. In fact, to choose the simd clone, we need
> >>   > to know the vf as that has to be a multiple of the chosen clone's
> >>   > simdlen. So we simply can't use the simdclone's types (as that depends
> >>   > on the simdlen) to choose the vf because the choice of simdlend depends
> >>   > on the vf. And there was already code in place to handle this,
> >>   > unfortunately that code was wrong and had the wrong assumption that
> >>   > simdclones didn't return (probably was true at some point and bitrotted).
> >
> > But to compute the VF we need to know the vector types!  We're only
> > calling vectorizable_* when the VF is final.  That said, the code you quote:
> >
> >>   >>
> >>   >>> see vect_get_vector_types_for_stmt:
> >>   >>> ...
> >>   >>>     if (gimple_get_lhs (stmt) == NULL_TREE
> >
> > is just for the case of a function without return value.  For this case
> > it's OK to do nothing - 'vectype' is the vector type of all vector defs
> > a stmt produces.
> >
> > For calls with a LHS it should fall through to generic code doing
> > get_vectype_for_scalar_type on the LHS type.
>
> I think that may work, but right now it will still go and look at the
> arguments of the call and use the smallest type among them to adjust the
> nunits (in 'vect_get_vector_types_for_stmt').
>
> In fact (this is just for illustration) if I hack that function like this:
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -12745,8 +12745,11 @@ vect_get_vector_types_for_stmt (vec_info
> *vinfo, stmt_vec_info stmt_info,
>         /* The number of units is set according to the smallest scalar
>           type (or the largest vector size, but we only support one
>           vector size per vectorization).  */
> -      scalar_type = vect_get_smallest_scalar_type (stmt_info,
> -                                                  TREE_TYPE (vectype));
> +      if (simd_clone_call_p (stmt_info->stmt))
> +       scalar_type = TREE_TYPE (vectype);
> +      else
> +       scalar_type = vect_get_smallest_scalar_type (stmt_info,
> +                                                    TREE_TYPE (vectype));
>         if (scalar_type != TREE_TYPE (vectype))
>          {
>            if (dump_enabled_p ())
>
> It will use the same types as before without (-m32), like I explained
> before the -m32 turns the pointer inside MASK_CALL into a 32-bit pointer
> so now the smallest size is 32-bits. This makes it pick V8SI instead of
> the original V4DI (scalar return type is DImode). Changing the VF to 8,
> thus unrolling the loop as it needs to make 2 calls, each handling 4 nunits.

Ah, but then vect_get_smallest_scalar_type should simply ignore that
pointer in MASK_CALL.  It should only look at the arguments relevant
for vectorization.  So

diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
index 8daf7bd7dd3..70bb8595951 100644
--- a/gcc/tree-vect-data-refs.cc
+++ b/gcc/tree-vect-data-refs.cc
@@ -165,6 +165,8 @@ vect_get_smallest_scalar_type (stmt_vec_info
stmt_info, tree scalar_type)
            }
          else if (internal_fn_mask_index (ifn) == 0)
            i = 1;
+         else if (ifn == IFN_MASK_CALL)
+           i = 1;
        }
       if (i < gimple_call_num_args (call))
        {

?

Richard.

>
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [r13-7135 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64
  2023-04-14 11:47                   ` Richard Biener
@ 2023-04-14 12:57                     ` Andre Vieira (lists)
  0 siblings, 0 replies; 13+ messages in thread
From: Andre Vieira (lists) @ 2023-04-14 12:57 UTC (permalink / raw)
  To: Richard Biener; +Cc: Andrew Stubbs, gcc-regression, gcc-patches, haochen.jiang



On 14/04/2023 12:47, Richard Biener wrote:
> On Fri, Apr 14, 2023 at 11:42 AM Andre Vieira (lists)
> <andre.simoesdiasvieira@arm.com> wrote:
>>
>>
> 
> Ah, but then vect_get_smallest_scalar_type should simply ignore that
> pointer in MASK_CALL.  It should only look at the arguments relevant
> for vectorization.  So
> 
> diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
> index 8daf7bd7dd3..70bb8595951 100644
> --- a/gcc/tree-vect-data-refs.cc
> +++ b/gcc/tree-vect-data-refs.cc
> @@ -165,6 +165,8 @@ vect_get_smallest_scalar_type (stmt_vec_info
> stmt_info, tree scalar_type)
>              }
>            else if (internal_fn_mask_index (ifn) == 0)
>              i = 1;
> +         else if (ifn == IFN_MASK_CALL)
> +           i = 1;
>          }
>         if (i < gimple_call_num_args (call))
>          {
> 
> ?

The problem with that is that not all arguments of a simdclone are 
mapped to vectors, for instance if one of them is 'uniform' then it's 
passed as a single scalar as uniform says it won't change between 
iteration, thus it has no effect on VF.

End of the day, the logic behind VF selection for simd clones is very 
different to that of a normal statement. Take for instance the logic 
behind simdlen selection for SVE, its not yet in trunk, where we select 
the number of elements to handle based on the widest element type. So say a:
#pragma omp declare simd
int foo (int, long long);

would have 4 NEON simd clones with (based on Narrowest element type):
/* 128 bit */
V4SI foo (V4SI, V4DI)
V4SI foo (V4SI, V4DI, V4BI)
/* 64 bit */
V2SI foo (V2SI, V2DI)
V2SI foo (V2SI, V2DI, V2BI)

and SVE (based on Widest element type):
VNx2SI foo (VN2xSI, VN2xDI, VN2xBI)

This also shows how basing this on the return value wouldn't quite work 
either for SVE :(

For now I think all targets base the selection on narrowest element type 
and I think we could get away with only looking at the types that map to 
vectors using the simdclone information, that's a pretty straight 
forward change to the existing code I think. And I'll make sure to come 
up with a way of dealing with this in a more robust way for SVE to work.

I'll try to write something up, but probably won't be able to post it 
until Monday.

Kind regards,
Andre

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2023-04-14 12:57 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-04-13  1:48 [r13-7135 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64 haochen.jiang
2023-04-13  9:15 ` Andre Simoes Dias Vieira
2023-04-13 10:01   ` Andrew Stubbs
2023-04-13 12:59     ` Andre Vieira (lists)
2023-04-13 14:00       ` Richard Biener
2023-04-13 14:25         ` Andre Vieira (lists)
2023-04-14  6:55           ` Richard Biener
2023-04-14  8:43             ` Andre Vieira (lists)
2023-04-14  9:09               ` Richard Biener
2023-04-14  9:14                 ` Richard Biener
2023-04-14  9:42                 ` Andre Vieira (lists)
2023-04-14 11:47                   ` Richard Biener
2023-04-14 12:57                     ` Andre Vieira (lists)

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).