public inbox for gcc-bugs@sourceware.org
* [Bug ada/46006] New: vectorization outside of loops
@ 2010-10-13 14:24 jakub at gcc dot gnu.org
2010-10-17 13:22 ` [Bug tree-optimization/46006] " irar at il dot ibm.com
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: jakub at gcc dot gnu.org @ 2010-10-13 14:24 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46006
Summary: vectorization outside of loops
Product: gcc
Version: 4.6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: ada
AssignedTo: unassigned@gcc.gnu.org
ReportedBy: jakub@gcc.gnu.org
CC: irar@gcc.gnu.org
Are there any plans to try to vectorize parts of code like:
struct A
{
  double x, y, z;
};
struct B
{
  struct A a, b;
};
struct C
{
  struct A c;
  double d;
};
__attribute__((noinline, noclone)) int
foo (const struct C *u, struct B v)
{
  double a, b, c, d;
  a = v.b.x * v.b.x + v.b.y * v.b.y + v.b.z * v.b.z;
  b = 2.0 * v.b.x * (v.a.x - u->c.x)
      + 2.0 * v.b.y * (v.a.y - u->c.y) + 2.0 * v.b.z * (v.a.z - u->c.z);
  c = u->c.x * u->c.x + u->c.y * u->c.y + u->c.z * u->c.z
      + v.a.x * v.a.x + v.a.y * v.a.y + v.a.z * v.a.z
      + 2.0 * (-u->c.x * v.a.x - u->c.y * v.a.y - u->c.z * v.a.z)
      - u->d * u->d;
  if ((d = b * b - 4.0 * a * c) < 0.0)
    return 0;
  return d;
}
int
main (void)
{
  int i, j;
  struct C c = { { 1.0, 1.0, 1.0 }, 1.0 };
  struct B b = { { 1.0, 1.0, 1.0 }, { 1.0, 1.0, 1.0 } };
  for (i = 0; i < 100000000; i++)
    {
      asm volatile ("" : : "r" (&c), "r" (&b) : "memory");
      j = foo (&c, b);
      asm volatile ("" : : "r" (j));
    }
  return 0;
}
(This is the hot spot from the c-ray benchmark. The real function is larger,
but according to callgrind the early return on d < 0.0 is taken in most cases.
Because the function is large and called from multiple places, it isn't
inlined.)
I haven't tried coding it by hand with intrinsics, but I'd expect significant
speedups at -O3 -ffast-math from doing many of the multiplications and
additions in parallel, especially with AVX.
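For illustration only, here is a minimal hand-written sketch of what "in
parallel" could look like, using GNU C vector extensions rather than
intrinsics. The names v4df, hsum and foo_vec are hypothetical and not part of
the report. The sketch pads each (x, y, z) triple to four lanes and rewrites
c as |v.a - u->c|^2 - d^2, which is algebraically equal to the original
expression but reassociates the floating-point operations, so it is only a
fair comparison under -ffast-math. No speedup has been measured.

```c
#include <assert.h>

/* Four doubles per vector: one 256-bit register with AVX; GCC lowers it
   to narrower operations when AVX is not enabled.  */
typedef double v4df __attribute__ ((vector_size (32)));

struct A { double x, y, z; };
struct B { struct A a, b; };
struct C { struct A c; double d; };

/* Hypothetical helper: horizontal sum of the four lanes.  */
static inline double
hsum (const v4df *t)
{
  return ((*t)[0] + (*t)[1]) + ((*t)[2] + (*t)[3]);
}

/* Hypothetical vector counterpart of foo from the report.  */
int
foo_vec (const struct C *u, struct B v)
{
  assert (u != 0);

  /* Pad each triple with a zero lane so it does not affect the sums.  */
  v4df va = { v.a.x, v.a.y, v.a.z, 0.0 };
  v4df vb = { v.b.x, v.b.y, v.b.z, 0.0 };
  v4df uc = { u->c.x, u->c.y, u->c.z, 0.0 };

  v4df diff = va - uc;   /* v.a - u->c, all lanes at once */
  v4df bb = vb * vb;     /* lanes of v.b . v.b */
  v4df bd = vb * diff;   /* lanes of v.b . (v.a - u->c) */
  v4df dd = diff * diff; /* lanes of |v.a - u->c|^2 */

  double a = hsum (&bb);
  double b = 2.0 * hsum (&bd);
  /* |uc|^2 + |va|^2 - 2 uc.va == |va - uc|^2 (reassociated).  */
  double c = hsum (&dd) - u->d * u->d;
  double d = b * b - 4.0 * a * c;
  return d < 0.0 ? 0 : (int) d;
}
```

With the inputs used in main above, both foo and foo_vec evaluate d to 12.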
* [Bug tree-optimization/46006] vectorization outside of loops
From: irar at il dot ibm.com @ 2010-10-17 13:22 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46006
Ira Rosen <irar at il dot ibm.com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |irar at il dot ibm.com
--- Comment #1 from Ira Rosen <irar at il dot ibm.com> 2010-10-17 13:22:18 UTC ---
This code requires SLP to originate from loads, which seems a bit more
complicated than the currently implemented use-def scan (it will also need to
reduce/extract scalars from the vectors at the end of the vector computation).
I don't see any major obstacles to this, but I don't currently plan to work on
it.
Another required feature is handling groups bigger than the vectorization
factor, i.e., combining 2 statements in this example and leaving the 3rd one
scalar.
Ira
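As a rough source-level picture of the "2 vectorized, 1 scalar" split
described above, the computation of a from the test case could be written by
hand as below. This is only a sketch using GNU C vector extensions with a
vectorization factor of 2 for double; the names v2df and len2_split are
hypothetical, and it says nothing about how the SLP pass itself would
represent the split.

```c
#include <assert.h>

/* Two doubles per vector, matching a vectorization factor of 2.  */
typedef double v2df __attribute__ ((vector_size (16)));

struct A { double x, y, z; };

/* Hypothetical hand-split version of x*x + y*y + z*z.  */
double
len2_split (const struct A *p)
{
  assert (p != 0);

  v2df xy = { p->x, p->y };  /* group starts at the loads of x and y */
  v2df sq = xy * xy;         /* two of the three multiplies as one vector op */
  double zz = p->z * p->z;   /* the third statement stays scalar */

  /* Extract the scalars from the vector at the end and finish the sum.  */
  return (sq[0] + sq[1]) + zz;
}
```

For example, calling len2_split on { 1.0, 2.0, 3.0 } returns 14.0.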
* [Bug tree-optimization/46006] vectorization outside of loops
From: pinskia at gcc dot gnu.org @ 2012-03-13 23:16 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46006
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Keywords| |missed-optimization
Status|UNCONFIRMED |NEW
Last reconfirmed| |2012-03-13
Ever Confirmed|0 |1
Severity|normal |enhancement
--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> 2012-03-13 22:59:24 UTC ---
Confirmed.
* [Bug tree-optimization/46006] vectorization outside of loops starting from loads
From: rguenth at gcc dot gnu.org @ 2023-06-21 13:17 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=46006
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Last reconfirmed|2021-08-24 00:00:00 |2023-6-21
--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
We're almost there:
t2.c:22:5: note: Starting SLP discovery for
t2.c:22:5: note: powmult_4 = v$b$z_53 * v$b$z_53;
t2.c:22:5: note: powmult_1 = v$b$x_51 * v$b$x_51;
t2.c:22:5: note: powmult_2 = v$b$y_52 * v$b$y_52;
but:
t2.c:22:5: note: vectype: vector(2) double
t2.c:22:5: note: nunits = 2
t2.c:22:5: missed: Build SLP failed: unrolling required in basic block SLP
and for reductions we do not try to split the group.