public inbox for gcc-bugs@sourceware.org
* [Bug c/56160] New: unnecessary additions in loop [x86, x86_64]
From: jtaylor.debian at gmail dot com @ 2013-01-31 10:43 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56160
Bug #: 56160
Summary: unnecessary additions in loop [x86, x86_64]
Classification: Unclassified
Product: gcc
Version: 4.4.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
AssignedTo: unassigned@gcc.gnu.org
ReportedBy: jtaylor.debian@gmail.com
The attached code, which does complex float multiplication using SSE3, produces 4
unnecessary integer additions if the NaN fallback function comp_mult is inlined.
The assembly for the loop generated with -msse3 -O3 -std=c99 in gcc 4.4, 4.6,
4.7 and 4.8 svn 195604 looks like this:
28: 0f 28 0e movaps (%esi),%xmm1
2b: f3 0f 12 c1 movsldup %xmm1,%xmm0
2f: 8b 55 08 mov 0x8(%ebp),%edx
32: 0f 28 13 movaps (%ebx),%xmm2
35: f3 0f 16 c9 movshdup %xmm1,%xmm1
39: 0f 59 c2 mulps %xmm2,%xmm0
3c: 0f c6 d2 b1 shufps $0xb1,%xmm2,%xmm2
40: 0f 59 ca mulps %xmm2,%xmm1
43: f2 0f d0 c1 addsubps %xmm1,%xmm0
47: 0f 29 04 fa movaps %xmm0,(%edx,%edi,8)
4b: 0f c2 c0 04 cmpneqps %xmm0,%xmm0
4f: 0f 50 c0 movmskps %xmm0,%eax
52: 85 c0 test %eax,%eax
54: 75 1d jne 73 <sse3_mult+0x73> // inlined comp_mult
56: 83 c7 02 add $0x2,%edi
59: 83 c6 10 add $0x10,%esi
5c: 83 c3 10 add $0x10,%ebx
5f: 83 c1 10 add $0x10,%ecx
62: 83 45 e4 10 addl $0x10,-0x1c(%ebp)
66: 39 7d 14 cmp %edi,0x14(%ebp)
69: 7f bd jg 28 <sse3_mult+0x28>
...
The 4 adds to esi, ebx, ecx and the stack slot at -0x1c(%ebp) are completely unnecessary and reduce
performance by about 20% on my Core 2 Duo.
On amd64 it also creates two seemingly unnecessary additions, but I did not test
the performance.
A way to coax gcc into emitting proper code is to not allow it to inline the fallback.
It then generates the following good assembly with only one integer add:
a8: 0f 28 0c df movaps (%edi,%ebx,8),%xmm1
ac: f3 0f 12 c1 movsldup %xmm1,%xmm0
b0: 8b 45 08 mov 0x8(%ebp),%eax
b3: 0f 28 14 de movaps (%esi,%ebx,8),%xmm2
b7: f3 0f 16 c9 movshdup %xmm1,%xmm1
bb: 0f 59 c2 mulps %xmm2,%xmm0
be: 0f c6 d2 b1 shufps $0xb1,%xmm2,%xmm2
c2: 0f 59 ca mulps %xmm2,%xmm1
c5: f2 0f d0 c1 addsubps %xmm1,%xmm0
c9: 0f 29 04 d8 movaps %xmm0,(%eax,%ebx,8)
cd: 0f c2 c0 04 cmpneqps %xmm0,%xmm0
d1: 0f 50 c0 movmskps %xmm0,%eax
d4: 85 c0 test %eax,%eax
d6: 75 10 jne e8 <sse3_mult+0x58> // non-inlined comp_mult
d8: 83 c3 02 add $0x2,%ebx
db: 39 5d 14 cmp %ebx,0x14(%ebp)
de: 7f c8 jg a8 <sse3_mult+0x18>
...
* [Bug c/56160] unnecessary additions in loop [x86, x86_64]
From: jtaylor.debian at gmail dot com @ 2013-01-31 10:44 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56160
--- Comment #1 from Julian Taylor <jtaylor.debian at gmail dot com> 2013-01-31 10:43:51 UTC ---
Created attachment 29313
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29313
code
* [Bug c/56160] unnecessary additions in loop [x86, x86_64]
From: jtaylor.debian at gmail dot com @ 2013-01-31 10:48 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56160
--- Comment #2 from Julian Taylor <jtaylor.debian at gmail dot com> 2013-01-31 10:47:54 UTC ---
These three lines are missing at the top of the attachment:
#include <complex.h>
#include <pmmintrin.h>
#define UNLIKELY(x) __builtin_expect((x),0)
* [Bug middle-end/56160] unnecessary additions in loop [x86, x86_64]
From: pinskia at gcc dot gnu.org @ 2013-01-31 17:51 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56160
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What     |Removed |Added
----------------------------------------------------------------------------
Component|c       |middle-end
--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> 2013-01-31 17:50:42 UTC ---
Can you try a newer compiler? 4.4 is no longer maintained.
* [Bug middle-end/56160] unnecessary additions in loop [x86, x86_64]
From: jtaylor.debian at gmail dot com @ 2013-01-31 17:53 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56160
Julian Taylor <jtaylor.debian at gmail dot com> changed:
What   |Removed |Added
----------------------------------------------------------------------------
Version|4.4.1   |4.8.0
--- Comment #4 from Julian Taylor <jtaylor.debian at gmail dot com> 2013-01-31 17:53:17 UTC ---
It is still the case in 4.8 svn r195604 (built on i586 Fedora 11) and the
versions in between; 4.4 is the oldest I tested.
* [Bug target/56160] unnecessary additions in loop [x86, x86_64]
From: pinskia at gcc dot gnu.org @ 2021-07-24 4:53 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56160
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What     |Removed    |Added
----------------------------------------------------------------------------
Keywords |           |missed-optimization
Target   |           |x86_64-linux-gnu
Component|middle-end |target
--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
This is just IV-OPTs going wrong.
One thing which I will note does improve the code is doing:
    __m128 n = _mm_cmpneq_ps(res, res);
    int need = _mm_movemask_ps(n);
    if (UNLIKELY(need)) {
        comp_mult(r, a, b, i);
    } else
        _mm_store_ps((float*)&r[i], res);