public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing
       [not found] <bug-40770-4@http.gcc.gnu.org/bugzilla/>
@ 2011-09-13  9:55 ` vincenzo.innocente at cern dot ch
  2011-09-27 14:13 ` rguenth at gcc dot gnu.org
                   ` (14 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: vincenzo.innocente at cern dot ch @ 2011-09-13  9:55 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770

--- Comment #10 from vincenzo Innocente <vincenzo.innocente at cern dot ch> 2011-09-13 09:52:53 UTC ---
resurrecting this:
just checked with gcc version 4.7.0 20110910

-mveclibabi=svml
-LwhereverYouhaveIntelSoftware/linux/x86_64/Compiler/11.1/072/lib/intel64/
-lsvml  -lirc

and sincos does not vectorize nor emits svml symbols (sin, cos independently
are ok)


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing
       [not found] <bug-40770-4@http.gcc.gnu.org/bugzilla/>
  2011-09-13  9:55 ` [Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing vincenzo.innocente at cern dot ch
@ 2011-09-27 14:13 ` rguenth at gcc dot gnu.org
  2020-09-22  6:35 ` rguenth at gcc dot gnu.org
                   ` (13 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: rguenth at gcc dot gnu.org @ 2011-09-27 14:13 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770

--- Comment #11 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-09-27 13:59:57 UTC ---
*** Bug 50534 has been marked as a duplicate of this bug. ***


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing
       [not found] <bug-40770-4@http.gcc.gnu.org/bugzilla/>
  2011-09-13  9:55 ` [Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing vincenzo.innocente at cern dot ch
  2011-09-27 14:13 ` rguenth at gcc dot gnu.org
@ 2020-09-22  6:35 ` rguenth at gcc dot gnu.org
  2020-09-22  7:04 ` kangshan0910 at hotmail dot com
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-09-22  6:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kangshan0910 at hotmail dot com

--- Comment #12 from Richard Biener <rguenth at gcc dot gnu.org> ---
*** Bug 97149 has been marked as a duplicate of this bug. ***

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing
       [not found] <bug-40770-4@http.gcc.gnu.org/bugzilla/>
                   ` (2 preceding siblings ...)
  2020-09-22  6:35 ` rguenth at gcc dot gnu.org
@ 2020-09-22  7:04 ` kangshan0910 at hotmail dot com
  2020-09-22  8:11 ` rguenther at suse dot de
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: kangshan0910 at hotmail dot com @ 2020-09-22  7:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770

--- Comment #13 from 康 珊 <kangshan0910 at hotmail dot com> ---
(In reply to Richard Biener from comment #12)
> *** Bug 97149 has been marked as a duplicate of this bug. ***

Will you look at this?

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing
       [not found] <bug-40770-4@http.gcc.gnu.org/bugzilla/>
                   ` (3 preceding siblings ...)
  2020-09-22  7:04 ` kangshan0910 at hotmail dot com
@ 2020-09-22  8:11 ` rguenther at suse dot de
  2020-09-22  9:03 ` jakub at gcc dot gnu.org
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: rguenther at suse dot de @ 2020-09-22  8:11 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770

--- Comment #14 from rguenther at suse dot de <rguenther at suse dot de> ---
On Tue, 22 Sep 2020, kangshan0910 at hotmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770
> 
> --- Comment #13 from ? ? <kangshan0910 at hotmail dot com> ---
> (In reply to Richard Biener from comment #12)
> > *** Bug 97149 has been marked as a duplicate of this bug. ***
> 
> Will you look at this?

glibc needs to think of an ABI for OMP SIMD for this, then we can
autovectorize.  Alternatively we could of course use
vector sin and vector cos separately which might be better than
not vectorizing it at all?

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing
       [not found] <bug-40770-4@http.gcc.gnu.org/bugzilla/>
                   ` (4 preceding siblings ...)
  2020-09-22  8:11 ` rguenther at suse dot de
@ 2020-09-22  9:03 ` jakub at gcc dot gnu.org
  2020-09-22  9:37 ` fw at gcc dot gnu.org
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-09-22  9:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fweimer at redhat dot com,
                   |                            |jakub at gcc dot gnu.org

--- Comment #15 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
OpenMP has a way to express this,
#pragma omp declare simd notinbranch linear(y,z)
double sincos (double x, double *y, double *z);
As can be seen on -O2 -fopenmp-simd:
#pragma omp declare simd notinbranch linear(y, z)
double foo (double x, double *y, double *z);

double a[1024], b[1024], c[1024];

void
bar (void)
{
  #pragma omp simd
  for (int i = 0; i < 1024; i++)
    foo (a[i], &b[i], &c[i]);
}
it works fine with that.  But if #pragma omp simd is commented out, it is not,
because the compiler has no assurance on what the function call behavior will
be e.g. for data reference analysis (it could e.g. modify the global vars).
Now, for sincos proper, the compiler knows what the function does if not
-fno-builtin-sincos.

Unfortunately, I think glibc currently defines sincos in math-vector.h the
unuseful way with just simd ("notinbranch") attribute, which effectively means
that the caller is supposed to pass vectors of pointers (well integers with
those sizes) and so needs scatter under the hood.
So the best thing would be to change glibc first.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing
       [not found] <bug-40770-4@http.gcc.gnu.org/bugzilla/>
                   ` (5 preceding siblings ...)
  2020-09-22  9:03 ` jakub at gcc dot gnu.org
@ 2020-09-22  9:37 ` fw at gcc dot gnu.org
  2020-09-22  9:49 ` rguenth at gcc dot gnu.org
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: fw at gcc dot gnu.org @ 2020-09-22  9:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770

Florian Weimer <fw at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fw at gcc dot gnu.org

--- Comment #16 from Florian Weimer <fw at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #15)
> Unfortunately, I think glibc currently defines sincos in math-vector.h the
> unuseful way with just simd ("notinbranch") attribute, which effectively
> means
> that the caller is supposed to pass vectors of pointers (well integers with
> those sizes) and so needs scatter under the hood.
> So the best thing would be to change glibc first.

What exactly do we need to change? Factor out the declaration of sincos, so
that we can apply “linear(__p1, __p2)” to it?

Should glibc provide a cexpi function? The complex result would avoid having to
go through memory at all.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing
       [not found] <bug-40770-4@http.gcc.gnu.org/bugzilla/>
                   ` (6 preceding siblings ...)
  2020-09-22  9:37 ` fw at gcc dot gnu.org
@ 2020-09-22  9:49 ` rguenth at gcc dot gnu.org
  2020-09-22  9:58 ` jakub at gcc dot gnu.org
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-09-22  9:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770

--- Comment #17 from Richard Biener <rguenth at gcc dot gnu.org> ---
IIRC cepxi has the complication that we have to define the mangling for vector
_Complex double.  But cexpi is what the vectorizer currently sees, of course
(we can of course vectorize it as sincos with arguments in memory again).

So yeah, glibc is lacking here.

Can one actually define multiple OMP SIMD variants for a single function?

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing
       [not found] <bug-40770-4@http.gcc.gnu.org/bugzilla/>
                   ` (7 preceding siblings ...)
  2020-09-22  9:49 ` rguenth at gcc dot gnu.org
@ 2020-09-22  9:58 ` jakub at gcc dot gnu.org
  2020-09-22  9:58 ` jakub at gcc dot gnu.org
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-09-22  9:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770

--- Comment #18 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Florian Weimer from comment #16)
> (In reply to Jakub Jelinek from comment #15)
> > Unfortunately, I think glibc currently defines sincos in math-vector.h the
> > unuseful way with just simd ("notinbranch") attribute, which effectively
> > means
> > that the caller is supposed to pass vectors of pointers (well integers with
> > those sizes) and so needs scatter under the hood.
> > So the best thing would be to change glibc first.
> 
> What exactly do we need to change? Factor out the declaration of sincos, so
> that we can apply “linear(__p1, __p2)” to it?

It could provide
#pragma omp declare simd linear(__p1, __p2)
double sincos (double __p0, double *__p1, double *__p2);
prototypes and assembly definition for that function.
Note, right now the simd attribute syntax doesn't allow specifying linear
arguments, there have been talks about extending the syntax, but it hasn't been
finished yet.  Looking at sysdeps/x86_64/fpu/*sincos*.S I actually see the
correct symbols there already, just they aren't exported from the library (the
symbols with *l4l4*, *l8l8* etc. in there).
So I guess we first need to extend the simd attribute syntax, then change glibc
to use it and export those, and finally benefit.

> Should glibc provide a cexpi function? The complex result would avoid having
> to go through memory at all.

Not sure if that would help, because vectorization of complex types doesn't
really work well in GCC unless they are lowered to scalar ops before
vectorization.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing
       [not found] <bug-40770-4@http.gcc.gnu.org/bugzilla/>
                   ` (8 preceding siblings ...)
  2020-09-22  9:58 ` jakub at gcc dot gnu.org
@ 2020-09-22  9:58 ` jakub at gcc dot gnu.org
  2020-09-22 10:15 ` kangshan0910 at hotmail dot com
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-09-22  9:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770

--- Comment #19 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #17)
> Can one actually define multiple OMP SIMD variants for a single function?

Yes.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing
       [not found] <bug-40770-4@http.gcc.gnu.org/bugzilla/>
                   ` (9 preceding siblings ...)
  2020-09-22  9:58 ` jakub at gcc dot gnu.org
@ 2020-09-22 10:15 ` kangshan0910 at hotmail dot com
  2020-09-22 10:19 ` kangshan0910 at hotmail dot com
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: kangshan0910 at hotmail dot com @ 2020-09-22 10:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770

--- Comment #20 from 康 珊 <kangshan0910 at hotmail dot com> ---
Created attachment 49255
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49255&action=edit
test source file

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing
       [not found] <bug-40770-4@http.gcc.gnu.org/bugzilla/>
                   ` (10 preceding siblings ...)
  2020-09-22 10:15 ` kangshan0910 at hotmail dot com
@ 2020-09-22 10:19 ` kangshan0910 at hotmail dot com
  2020-09-22 10:43 ` rguenth at gcc dot gnu.org
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: kangshan0910 at hotmail dot com @ 2020-09-22 10:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770

--- Comment #21 from 康 珊 <kangshan0910 at hotmail dot com> ---
(In reply to Jakub Jelinek from comment #15)
> OpenMP has a way to express this,
> #pragma omp declare simd notinbranch linear(y,z)
> double sincos (double x, double *y, double *z);
> As can be seen on -O2 -fopenmp-simd:
> #pragma omp declare simd notinbranch linear(y, z)
> double foo (double x, double *y, double *z);
> 
> double a[1024], b[1024], c[1024];
> 
> void
> bar (void)
> {
>   #pragma omp simd
>   for (int i = 0; i < 1024; i++)
>     foo (a[i], &b[i], &c[i]);
> }
> it works fine with that.  But if #pragma omp simd is commented out, it is
> not, because the compiler has no assurance on what the function call
> behavior will be e.g. for data reference analysis (it could e.g. modify the
> global vars).
> Now, for sincos proper, the compiler knows what the function does if not
> -fno-builtin-sincos.
> 
> Unfortunately, I think glibc currently defines sincos in math-vector.h the
> unuseful way with just simd ("notinbranch") attribute, which effectively
> means
> that the caller is supposed to pass vectors of pointers (well integers with
> those sizes) and so needs scatter under the hood.
> So the best thing would be to change glibc first.

Thanks for your reply.(In reply to Jakub Jelinek from comment #15)
> OpenMP has a way to express this,
> #pragma omp declare simd notinbranch linear(y,z)
> double sincos (double x, double *y, double *z);
> As can be seen on -O2 -fopenmp-simd:
> #pragma omp declare simd notinbranch linear(y, z)
> double foo (double x, double *y, double *z);
> 
> double a[1024], b[1024], c[1024];
> 
> void
> bar (void)
> {
>   #pragma omp simd
>   for (int i = 0; i < 1024; i++)
>     foo (a[i], &b[i], &c[i]);
> }
> it works fine with that.  But if #pragma omp simd is commented out, it is
> not, because the compiler has no assurance on what the function call
> behavior will be e.g. for data reference analysis (it could e.g. modify the
> global vars).
> Now, for sincos proper, the compiler knows what the function does if not
> -fno-builtin-sincos.
> 
> Unfortunately, I think glibc currently defines sincos in math-vector.h the
> unuseful way with just simd ("notinbranch") attribute, which effectively
> means
> that the caller is supposed to pass vectors of pointers (well integers with
> those sizes) and so needs scatter under the hood.
> So the best thing would be to change glibc first.

Thanks for your reply. I tried your method. But it still doesn't work for
sincos. 

#pragma omp declare simd notinbranch linear(y, z)
void sincos(double x, double *y, double *z);

for (unsigned int i = 0; i < loopCount; i++)
    {
#pragma omp simd
        for (unsigned int j = i * dim; j < (i + 1) * dim; j++)
        {
            sincos(input_array[j], &result_array[j], &result_array1[j]);
        }
    }

$ gcc -m64 -g -O2 -fopenmp-simd -fPIC -Wall test_sincos.c -o test_sincos -lm
-march=skylake-avx512 -ffast-math -ftree-loop-vectorize
$ ldd test_sincos
        linux-vdso.so.1 (0x00007ffd2393f000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f574f49d000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f574f2a2000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f574f601000)
My gcc version is "10.1.1 20200507".
And I have uploaded my test source file.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing
       [not found] <bug-40770-4@http.gcc.gnu.org/bugzilla/>
                   ` (11 preceding siblings ...)
  2020-09-22 10:19 ` kangshan0910 at hotmail dot com
@ 2020-09-22 10:43 ` rguenth at gcc dot gnu.org
  2020-09-22 11:14 ` kangshan0910 at hotmail dot com
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 26+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-09-22 10:43 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770

--- Comment #22 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to 康 珊 from comment #21)
> $ gcc -m64 -g -O2 -fopenmp-simd -fPIC -Wall test_sincos.c -o test_sincos -lm
> -march=skylake-avx512 -ffast-math -ftree-loop-vectorize

Try adding -fno-builtin-sincos here

> $ ldd test_sincos
>         linux-vdso.so.1 (0x00007ffd2393f000)
>         libm.so.6 => /lib64/libm.so.6 (0x00007f574f49d000)
>         libc.so.6 => /lib64/libc.so.6 (0x00007f574f2a2000)
>         /lib64/ld-linux-x86-64.so.2 (0x00007f574f601000)
> My gcc version is "10.1.1 20200507".
> And I have uploaded my test source file.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing
       [not found] <bug-40770-4@http.gcc.gnu.org/bugzilla/>
                   ` (12 preceding siblings ...)
  2020-09-22 10:43 ` rguenth at gcc dot gnu.org
@ 2020-09-22 11:14 ` kangshan0910 at hotmail dot com
  2020-09-22 11:17 ` rguenther at suse dot de
  2020-09-22 11:31 ` kangshan0910 at hotmail dot com
  15 siblings, 0 replies; 26+ messages in thread
From: kangshan0910 at hotmail dot com @ 2020-09-22 11:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770

--- Comment #23 from 康 珊 <kangshan0910 at hotmail dot com> ---
It seems that gcc will use "_ZGVeN8vl8l8_sincos" which doesn't exit in libmvec.
/usr/bin/ld: /tmp/ccTYyEZM.o: in function `main':
/home/pnp/ks/test_sincos.c:38: undefined reference to `_ZGVeN8vl8l8_sincos'
/usr/bin/ld: /home/pnp/ks/test_sincos.c:38: undefined reference to
`_ZGVeN8vl8l8_sincos'
collect2: error: ld returned 1 exit status

I'm using clear linux the glibc of which is 2.31 and I also tried Ubuntu 19.10
the glibc of which is 2.30.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing
       [not found] <bug-40770-4@http.gcc.gnu.org/bugzilla/>
                   ` (13 preceding siblings ...)
  2020-09-22 11:14 ` kangshan0910 at hotmail dot com
@ 2020-09-22 11:17 ` rguenther at suse dot de
  2020-09-22 11:31 ` kangshan0910 at hotmail dot com
  15 siblings, 0 replies; 26+ messages in thread
From: rguenther at suse dot de @ 2020-09-22 11:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770

--- Comment #24 from rguenther at suse dot de <rguenther at suse dot de> ---
On September 22, 2020 1:14:48 PM GMT+02:00, kangshan0910 at hotmail dot com
<gcc-bugzilla@gcc.gnu.org> wrote:
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770
>
>--- Comment #23 from 康 珊 <kangshan0910 at hotmail dot com> ---
>It seems that gcc will use "_ZGVeN8vl8l8_sincos" which doesn't exit in
>libmvec.
>/usr/bin/ld: /tmp/ccTYyEZM.o: in function `main':
>/home/pnp/ks/test_sincos.c:38: undefined reference to
>`_ZGVeN8vl8l8_sincos'
>/usr/bin/ld: /home/pnp/ks/test_sincos.c:38: undefined reference to
>`_ZGVeN8vl8l8_sincos'
>collect2: error: ld returned 1 exit status

That's expected and the thing that needs to be fixed. 

>I'm using clear linux the glibc of which is 2.31 and I also tried
>Ubuntu 19.10
>the glibc of which is 2.30.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing
       [not found] <bug-40770-4@http.gcc.gnu.org/bugzilla/>
                   ` (14 preceding siblings ...)
  2020-09-22 11:17 ` rguenther at suse dot de
@ 2020-09-22 11:31 ` kangshan0910 at hotmail dot com
  15 siblings, 0 replies; 26+ messages in thread
From: kangshan0910 at hotmail dot com @ 2020-09-22 11:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770

--- Comment #25 from 康 珊 <kangshan0910 at hotmail dot com> ---
(In reply to rguenther@suse.de from comment #24)
> On September 22, 2020 1:14:48 PM GMT+02:00, kangshan0910 at hotmail dot com
> <gcc-bugzilla@gcc.gnu.org> wrote:
> >https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770
> >
> >--- Comment #23 from 康 珊 <kangshan0910 at hotmail dot com> ---
> >It seems that gcc will use "_ZGVeN8vl8l8_sincos" which doesn't exit in
> >libmvec.
> >/usr/bin/ld: /tmp/ccTYyEZM.o: in function `main':
> >/home/pnp/ks/test_sincos.c:38: undefined reference to
> >`_ZGVeN8vl8l8_sincos'
> >/usr/bin/ld: /home/pnp/ks/test_sincos.c:38: undefined reference to
> >`_ZGVeN8vl8l8_sincos'
> >collect2: error: ld returned 1 exit status
> 
> That's expected and the thing that needs to be fixed. 
> 
> >I'm using clear linux the glibc of which is 2.31 and I also tried
> >Ubuntu 19.10
> >the glibc of which is 2.30.

But it has _ZGVeN8vvv_sincos(__m512d, __m512i, __m512i), why gcc not use this
call?

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing
  2009-07-16  9:42 [Bug tree-optimization/40770] New: " rguenth at gcc dot gnu dot org
                   ` (8 preceding siblings ...)
  2009-07-20 12:30 ` ubizjak at gmail dot com
@ 2009-07-20 12:55 ` rguenther at suse dot de
  9 siblings, 0 replies; 26+ messages in thread
From: rguenther at suse dot de @ 2009-07-20 12:55 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #9 from rguenther at suse dot de  2009-07-20 12:55 -------
Subject: Re:  Vectorization of complex types,
 vectorization of sincos missing

On Mon, 20 Jul 2009, irar at il dot ibm dot com wrote:

> 
> 
> ------- Comment #7 from irar at il dot ibm dot com  2009-07-20 11:18 -------
> AFAIU, querying for the component type of complex type is not difficult to
> implement. 
> I think, that loop-based vectorization is preferable here, so we should stay
> with vectorization factor of 2 for doubles.
> 
> The next problem is to vectorize 
>   D.1611_4 = IMAGPART_EXPR <sincostmp.1_1>;
> and
>   D.1612_6 = REALPART_EXPR <sincostmp.1_1>;
> 
> Currently, we support only loads and stores with IMAGPART/REALPART_EXPR,
> vectorizing them as strided accesses, with extract odd and even operations for
> loads. So, we will have to support interleaving of non-memory variables. 
> 
> Does __builtin_cexpi have a vector implementation? If so, does it return two
> vectors?

No, currently cexpi doesn't have a vectorized version.  We could add
an internal builtin for that that takes a vector as argument and
returns a vector with complex components.  And lower this during expansion
to a suitable available form (eventually just two calls).

> If not, I guess, we need something like:
> 
>   sincostmp.1 = __builtin_cexpi (xd[i]);
>   sincostmp.2 = __builtin_cexpi (xd[i+1]);
>   v1 = VEC_EXTRACT_EVEN (sincostmp.1, sincostmp.2);
>   v2 = VEC_EXTRACT_ODD (sincostmp.1, sincostmp.2);
>   sf[i:i+1] = v1;
>   cf[i:i+1] = v2;
>   i = i + 2;

Yes, that was my initial idea.

> Or we can use the two vectors from vectorized __builtin_cexpi as parameters of
> extract operations.
> Does that make sense?

Yes, I think so.  With a vectorized builtin we'd have

  v0 = xd[i:i+1];
  sincostmp.1 = __builtin_vect_cexpi (v0);
  v1 = VEC_EXTRACT_EVEN (sincostmp.1[0], sincostmp.1[1]);
  v2 = VEC_EXTRACT_ODD (sincostmp.1[0], sincostmp.1[1]);
  sf[i:i+1] = v1;
  cf[i:i+1] = v2;
  i = i + 2;

where sincostmp.1[0] would select the lower half of a V4DF and
sincostmp.1[1] the upper half of a V4DF.  But that's probably
more difficult as we'd have both V2DF and V4DF in the IL.

Richard.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing
  2009-07-16  9:42 [Bug tree-optimization/40770] New: " rguenth at gcc dot gnu dot org
                   ` (7 preceding siblings ...)
  2009-07-20 11:18 ` irar at il dot ibm dot com
@ 2009-07-20 12:30 ` ubizjak at gmail dot com
  2009-07-20 12:55 ` rguenther at suse dot de
  9 siblings, 0 replies; 26+ messages in thread
From: ubizjak at gmail dot com @ 2009-07-20 12:30 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #8 from ubizjak at gmail dot com  2009-07-20 12:30 -------
(In reply to comment #7)

> Does __builtin_cexpi have a vector implementation? If so, does it return two
> vectors?

No, only vectorized sincos is implemented (see also links at PR40766, Comment
#11).


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing
  2009-07-16  9:42 [Bug tree-optimization/40770] New: " rguenth at gcc dot gnu dot org
                   ` (6 preceding siblings ...)
  2009-07-16 17:31 ` irar at il dot ibm dot com
@ 2009-07-20 11:18 ` irar at il dot ibm dot com
  2009-07-20 12:30 ` ubizjak at gmail dot com
  2009-07-20 12:55 ` rguenther at suse dot de
  9 siblings, 0 replies; 26+ messages in thread
From: irar at il dot ibm dot com @ 2009-07-20 11:18 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #7 from irar at il dot ibm dot com  2009-07-20 11:18 -------
AFAIU, querying for the component type of complex type is not difficult to
implement. 
I think, that loop-based vectorization is preferable here, so we should stay
with vectorization factor of 2 for doubles.

The next problem is to vectorize 
  D.1611_4 = IMAGPART_EXPR <sincostmp.1_1>;
and
  D.1612_6 = REALPART_EXPR <sincostmp.1_1>;

Currently, we support only loads and stores with IMAGPART/REALPART_EXPR,
vectorizing them as strided accesses, with extract odd and even operations for
loads. So, we will have to support interleaving of non-memory variables. 

Does __builtin_cexpi have a vector implementation? If so, does it return two
vectors?

If not, I guess, we need something like:

  sincostmp.1 = __builtin_cexpi (xd[i]);
  sincostmp.2 = __builtin_cexpi (xd[i+1]);
  v1 = VEC_EXTRACT_EVEN (sincostmp.1, sincostmp.2);
  v2 = VEC_EXTRACT_ODD (sincostmp.1, sincostmp.2);
  sf[i:i+1] = v1;
  cf[i:i+1] = v2;
  i = i + 2;

Or we can use the two vectors from vectorized __builtin_cexpi as parameters of
extract operations.
Does that make sense?

Ira


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing
  2009-07-16  9:42 [Bug tree-optimization/40770] New: " rguenth at gcc dot gnu dot org
                   ` (5 preceding siblings ...)
  2009-07-16 13:58 ` rguenther at suse dot de
@ 2009-07-16 17:31 ` irar at il dot ibm dot com
  2009-07-20 11:18 ` irar at il dot ibm dot com
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 26+ messages in thread
From: irar at il dot ibm dot com @ 2009-07-16 17:31 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #6 from irar at il dot ibm dot com  2009-07-16 17:31 -------
(In reply to comment #3)
> > make_vector_type returns NULL for this type.
> Yes - there is no vector type for complex double.  But the vectorizer
> could query for a vector type for the complex component type (double)
> and divide the vector element count by 2 (for complex) to get the
> vectorization factor which would be 1 here.  

I see.

> Should SLP the be possible
> for that loop?

Not with the current implementation - SLP needs strided stores to start. Here
the stores are not even adjacent. I think, it would be better to vectorize this
loop with regular loop-based vectorization to avoid permutations. I'll take a
better look on Sunday.

Ira

> Richard.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing
  2009-07-16  9:42 [Bug tree-optimization/40770] New: " rguenth at gcc dot gnu dot org
                   ` (4 preceding siblings ...)
  2009-07-16 13:32 ` burnus at gcc dot gnu dot org
@ 2009-07-16 13:58 ` rguenther at suse dot de
  2009-07-16 17:31 ` irar at il dot ibm dot com
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 26+ messages in thread
From: rguenther at suse dot de @ 2009-07-16 13:58 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #5 from rguenther at suse dot de  2009-07-16 13:57 -------
Subject: Re:  Vectorization of complex types,
 vectorization of sincos missing

On Thu, 16 Jul 2009, burnus at gcc dot gnu dot org wrote:

> ------- Comment #4 from burnus at gcc dot gnu dot org  2009-07-16 13:32 -------
> (In reply to comment #3)
> > Yes - there is no vector type for complex double.  But the vectorizer
> > could query for a vector type for the complex component type (double)
> > and divide the vector element count by 2 (for complex) to get the
> > vectorization factor which would be 1 here.
> 
> I do not know much about this, but wouldn't that fail if one wants to vectorize
> true complex functions such as ccosf (assuming that they are in principle
> vectorizable)?

Well, for ccosf we would have a vectorization factor of 2 left for V4SF.
Of course this assumes that we present the vectorizer with a vectorized
ccosf with the signature v4sf (*)(v4sf).  Or we would need to introduce
complex vector modes - which I'd rather avoid.

Richard.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing
  2009-07-16  9:42 [Bug tree-optimization/40770] New: " rguenth at gcc dot gnu dot org
                   ` (3 preceding siblings ...)
  2009-07-16 13:05 ` rguenther at suse dot de
@ 2009-07-16 13:32 ` burnus at gcc dot gnu dot org
  2009-07-16 13:58 ` rguenther at suse dot de
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 26+ messages in thread
From: burnus at gcc dot gnu dot org @ 2009-07-16 13:32 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #4 from burnus at gcc dot gnu dot org  2009-07-16 13:32 -------
(In reply to comment #3)
> Yes - there is no vector type for complex double.  But the vectorizer
> could query for a vector type for the complex component type (double)
> and divide the vector element count by 2 (for complex) to get the
> vectorization factor which would be 1 here.

I do not know much about this, but wouldn't that fail if one wants to vectorize
true complex functions such as ccosf (assuming that they are in principle
vectorizable)?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing
  2009-07-16  9:42 [Bug tree-optimization/40770] New: " rguenth at gcc dot gnu dot org
                   ` (2 preceding siblings ...)
  2009-07-16 12:29 ` irar at il dot ibm dot com
@ 2009-07-16 13:05 ` rguenther at suse dot de
  2009-07-16 13:32 ` burnus at gcc dot gnu dot org
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 26+ messages in thread
From: rguenther at suse dot de @ 2009-07-16 13:05 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #3 from rguenther at suse dot de  2009-07-16 13:05 -------
Subject: Re:  Vectorization of complex types,
 vectorization of sincos missing

On Thu, 16 Jul 2009, irar at il dot ibm dot com wrote:

> ------- Comment #2 from irar at il dot ibm dot com  2009-07-16 12:29 -------
> pr40770.c:20: note: ==> examining statement: sincostmp.21_1 = __builtin_cexpi
> (D.1625_3);
> pr40770.c:20: note: get vectype for scalar type:  complex double
> pr40770.c:20: note: not vectorized: unsupported data-type complex double
> 
> make_vector_type returns NULL for this type.

Yes - there is no vector type for complex double.  But the vectorizer
could query for a vector type for the complex component type (double)
and divide the vector element count by 2 (for complex) to get the
vectorization factor which would be 1 here.  Should SLP the be possible
for that loop?

Richard.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing
  2009-07-16  9:42 [Bug tree-optimization/40770] New: " rguenth at gcc dot gnu dot org
  2009-07-16  9:43 ` [Bug tree-optimization/40770] " ubizjak at gmail dot com
  2009-07-16  9:44 ` rguenth at gcc dot gnu dot org
@ 2009-07-16 12:29 ` irar at il dot ibm dot com
  2009-07-16 13:05 ` rguenther at suse dot de
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 26+ messages in thread
From: irar at il dot ibm dot com @ 2009-07-16 12:29 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #2 from irar at il dot ibm dot com  2009-07-16 12:29 -------
pr40770.c:20: note: ==> examining statement: sincostmp.21_1 = __builtin_cexpi
(D.1625_3);
pr40770.c:20: note: get vectype for scalar type:  complex double
pr40770.c:20: note: not vectorized: unsupported data-type complex double

make_vector_type returns NULL for this type.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing
  2009-07-16  9:42 [Bug tree-optimization/40770] New: " rguenth at gcc dot gnu dot org
  2009-07-16  9:43 ` [Bug tree-optimization/40770] " ubizjak at gmail dot com
@ 2009-07-16  9:44 ` rguenth at gcc dot gnu dot org
  2009-07-16 12:29 ` irar at il dot ibm dot com
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 26+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-07-16  9:44 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #1 from rguenth at gcc dot gnu dot org  2009-07-16 09:44 -------
The middle-end presents the vectorizer with

<bb 3>:
  # i_13 = PHI <i_7(4), 0(2)>
  # ivtmp.26_8 = PHI <ivtmp.26_16(4), 1024(2)>
  D.1623_3 = xd[i_13];
  sincostmp.21_1 = __builtin_cexpi (D.1623_3);
  D.1624_4 = IMAGPART_EXPR <sincostmp.21_1>;
  sd[i_13] = D.1624_4;
  D.1625_6 = REALPART_EXPR <sincostmp.21_1>;
  cd[i_13] = D.1625_6;
  i_7 = i_13 + 1;
  ivtmp.26_16 = ivtmp.26_8 - 1;
  if (ivtmp.26_16 != 0)
    goto <bb 4>;
  else
    goto <bb 5>;

which has first of all complex types (they should be recognized as V2DF
with vectorization factor 1, thus SLP-able).

For the float case

<bb 3>:
  # i_13 = PHI <i_7(4), 0(2)>
  # ivtmp.6_8 = PHI <ivtmp.6_16(4), 1024(2)>
  D.1610_3 = xf[i_13];
  sincostmp.1_1 = __builtin_cexpif (D.1610_3);
  D.1611_4 = IMAGPART_EXPR <sincostmp.1_1>;
  sf[i_13] = D.1611_4;
  D.1612_6 = REALPART_EXPR <sincostmp.1_1>;
  cf[i_13] = D.1612_6;
  i_7 = i_13 + 1;
  ivtmp.6_16 = ivtmp.6_8 - 1;
  if (ivtmp.6_16 != 0)
    goto <bb 4>;
  else
    goto <bb 5>;

they should be V2SF, thus use V4SF and vectorization factor 2.  Still use
SLP probably.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing
  2009-07-16  9:42 [Bug tree-optimization/40770] New: " rguenth at gcc dot gnu dot org
@ 2009-07-16  9:43 ` ubizjak at gmail dot com
  2009-07-16  9:44 ` rguenth at gcc dot gnu dot org
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 26+ messages in thread
From: ubizjak at gmail dot com @ 2009-07-16  9:43 UTC (permalink / raw)
  To: gcc-bugs



-- 

ubizjak at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2009-07-16 09:43:14
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770


^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2020-09-22 11:31 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <bug-40770-4@http.gcc.gnu.org/bugzilla/>
2011-09-13  9:55 ` [Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing vincenzo.innocente at cern dot ch
2011-09-27 14:13 ` rguenth at gcc dot gnu.org
2020-09-22  6:35 ` rguenth at gcc dot gnu.org
2020-09-22  7:04 ` kangshan0910 at hotmail dot com
2020-09-22  8:11 ` rguenther at suse dot de
2020-09-22  9:03 ` jakub at gcc dot gnu.org
2020-09-22  9:37 ` fw at gcc dot gnu.org
2020-09-22  9:49 ` rguenth at gcc dot gnu.org
2020-09-22  9:58 ` jakub at gcc dot gnu.org
2020-09-22  9:58 ` jakub at gcc dot gnu.org
2020-09-22 10:15 ` kangshan0910 at hotmail dot com
2020-09-22 10:19 ` kangshan0910 at hotmail dot com
2020-09-22 10:43 ` rguenth at gcc dot gnu.org
2020-09-22 11:14 ` kangshan0910 at hotmail dot com
2020-09-22 11:17 ` rguenther at suse dot de
2020-09-22 11:31 ` kangshan0910 at hotmail dot com
2009-07-16  9:42 [Bug tree-optimization/40770] New: " rguenth at gcc dot gnu dot org
2009-07-16  9:43 ` [Bug tree-optimization/40770] " ubizjak at gmail dot com
2009-07-16  9:44 ` rguenth at gcc dot gnu dot org
2009-07-16 12:29 ` irar at il dot ibm dot com
2009-07-16 13:05 ` rguenther at suse dot de
2009-07-16 13:32 ` burnus at gcc dot gnu dot org
2009-07-16 13:58 ` rguenther at suse dot de
2009-07-16 17:31 ` irar at il dot ibm dot com
2009-07-20 11:18 ` irar at il dot ibm dot com
2009-07-20 12:30 ` ubizjak at gmail dot com
2009-07-20 12:55 ` rguenther at suse dot de

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).