public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/53243] New: Use vector comparisons for if cascades
@ 2012-05-05 4:06 drepper.fsp at gmail dot com
2012-05-07 9:33 ` [Bug tree-optimization/53243] " rguenth at gcc dot gnu.org
0 siblings, 1 reply; 2+ messages in thread
From: drepper.fsp at gmail dot com @ 2012-05-05 4:06 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53243
Bug #: 53243
Summary: Use vector comparisons for if cascades
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: tree-optimization
AssignedTo: unassigned@gcc.gnu.org
ReportedBy: drepper.fsp@gmail.com
Target: x86_64-linux
Created attachment 27312
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27312
Test program (compile with and without -DOLD)
The vector units can compare multiple comparisons concurrently but this is not
used automatically in gcc in situations where it can lead to better
performance. Assume a function like this:
void
f(float a)
{
if (a < 1.0)
cb(1);
else if (a < 2.0)
cb(2);
else if (a < 3.0)
cb(3);
else if (a < 4.0)
cb(4);
else if (a < 5.0)
cb(5);
else if (a < 6.0)
cb(6);
else if (a < 7.0)
cb(7);
else if (a < 8.0)
cb(8);
else
++o;
}
In this case the first or second if is not marked with __builtin_expect as
likely, otherwise the following *might* not apply.
The routine can be rewritten for AVX machines like this:
void
f(float a)
{
const __m256 fv = _mm256_set_ps(8.0,7.0,6.0,5.0,4.0,3.0,2.0,1.0);
__m256 r = _mm256_cmp_ps(fv, _mm256_set1_ps(a), _CMP_LT_OS);
int i = _mm256_movemask_ps(r);
asm goto ("bsr %0, %0; jz %l[less1]; .pushsection .rodata; 1: .quad %l2, %l3,
%l4, %l5, %l6, %l7, %l8, %l9; .popsection; jmp *1b(,%0,8)" : : "r" (i) : :
less1, less2, less3, less4, less5, less6, less7, less8, gt8);
__builtin_unreachable ();
less1:
cb(1);
return;
less2:
cb(2);
return;
less3:
cb(3);
return;
less4:
cb(4);
return;
less5:
cb(5);
return;
less6:
cb(6);
return;
less7:
cb(7);
return;
less8:
cb(8);
return;
gt8:
++o;
}
This might not generate the absolute best code but it runs for the test program
which I attach 20% faster.
The same technique can be applied to integer comparisons. More complex if
cascades can also be simplified a lot by masking the integer bsr result
accordingly. This should still be faster.
^ permalink raw reply [flat|nested] 2+ messages in thread
* [Bug tree-optimization/53243] Use vector comparisons for if cascades
2012-05-05 4:06 [Bug tree-optimization/53243] New: Use vector comparisons for if cascades drepper.fsp at gmail dot com
@ 2012-05-07 9:33 ` rguenth at gcc dot gnu.org
0 siblings, 0 replies; 2+ messages in thread
From: rguenth at gcc dot gnu.org @ 2012-05-07 9:33 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53243
Richard Guenther <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |NEW
Last reconfirmed| |2012-05-07
Ever Confirmed|0 |1
--- Comment #1 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-05-07 09:31:59 UTC ---
How does performance look like when you do not have random distribution of
comparison results but branch pedictors can do a good job? Thus, are
branch pedictors not disabled by this transform, only retaining branch
target buffering?
Your code can probably be optimized further by recognizing that only the
parameter to cb is conditional if we don't take the else branch, thus do
if (!(a < 8.0))
++o;
else
do vector compare, compute cb argument from result and call cb
doing no branches at all.
I don't see where your or the above transform would best fit into.
It's straight-line code as written, still we'd need to perform
some if-conversion (difficult for calls, so only pairs with the above
idea) and then vectorization.
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2012-05-07 9:32 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-05-05 4:06 [Bug tree-optimization/53243] New: Use vector comparisons for if cascades drepper.fsp at gmail dot com
2012-05-07 9:33 ` [Bug tree-optimization/53243] " rguenth at gcc dot gnu.org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).