public inbox for gcc-bugs@sourceware.org
* [Bug c++/32412] New: Passing struct as parameter breaks SRA for stack-allocated struct inside called function
@ 2007-06-20 8:30 scovich at gmail dot com
2007-06-20 12:57 ` [Bug middle-end/32412] " pinskia at gcc dot gnu dot org
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: scovich at gmail dot com @ 2007-06-20 8:30 UTC (permalink / raw)
To: gcc-bugs
sra-bug.C (below) contains a function which stack-allocates a local struct
containing two small arrays. The function depends on SRA to eliminate repeated
memory accesses to the two arrays as it streams over a large, third array.
The performance of the executables resulting from
g++ -Wall -O3 -msse3 -fpeel-loops sra-bug.C
and
g++ -Wall -O3 -msse3 -fpeel-loops sra-bug.C -DTRIGGER_BUG
differs by exactly 2x on my machine (a 2.66GHz Core2 quad Xeon), with the
runtime increasing from .395 ns/value/entry to .790 ns/value/entry.
The only difference between the two versions is whether the array pointer and
count are passed as separate arguments (fast) or wrapped in a struct (slow),
even though the latter gets copied into local variables before use. Use of the
__restrict keyword didn't seem to make a difference. The assembler output shows
that excessive loads and stores nearly double the instruction count of the
unrolled inner loop for the slower case.
FYI gcc-4.2.0 shows similar behavior, though its output is slower than 4.1 for
both cases (.420 and 1.10 ns/value/entry, respectively). gcc-4.3-20070617
performs equally badly on both versions of the code (.690 ns/value/entry).
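For readers unfamiliar with the optimization, here is a minimal sketch of what
scalar replacement of aggregates (SRA) is hoped to achieve. It uses plain ints
instead of __m128i so it builds anywhere. The names Acc, match_mask_struct and
match_mask_scalars are hypothetical and do not appear in sra-bug.C; the second
function is written by hand to imitate the optimizer's output, it is not
something GCC emitted.

```cpp
#include <cassert>

// Hypothetical stand-in for Action16: a local struct holding two small arrays.
struct Acc {
    int results[2];
    int values[2];
};

// Source form: each iteration goes through the struct's arrays. Unless the
// compiler splits the struct into scalars, these are loads and stores to
// the stack on every pass through the loop.
int match_mask_struct(int const* entries, int count, int v0, int v1) {
    Acc a = {{0, 0}, {v0, v1}};
    for (int i = 0; i < count; i++)
        for (int k = 0; k < 2; k++)
            a.results[k] |= (a.values[k] == entries[i]);
    return a.results[0] | (a.results[1] << 1);
}

// Hand-written imitation of the result after SRA and unrolling of the
// inner loop: every array element has become an independent scalar that
// can stay in a register for the whole outer loop.
int match_mask_scalars(int const* entries, int count, int v0, int v1) {
    int r0 = 0, r1 = 0;
    for (int i = 0; i < count; i++) {
        r0 |= (v0 == entries[i]);
        r1 |= (v1 == entries[i]);
    }
    return r0 | (r1 << 1);
}
```

Both functions return the same mask for the same inputs; the report is about
the compiler no longer getting from the first form to the second.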
sra-bug.C:
===========================================================
#include <emmintrin.h>
#include <stdint.h>
#include <cassert>
#include <cstdio>
#include <sys/time.h>
struct stopwatch_t {
struct timeval tv; long long mark;
stopwatch_t() { reset(); }
double time_ns() {
long long old_mark = mark; reset(); return 1e3*(mark - old_mark);
}
void reset() {
gettimeofday(&tv, NULL); mark = tv.tv_usec + tv.tv_sec*1000000ll;
}
};
template<int N, class T, class Action>
inline void unrolled_loop(T* entries, Action &action) {
for(int i=0; i < N; i++) action(entries[i]);
}
static __m128i const ALL_ZEROS = {0ull, 0ull};
static __m128i const ALL_ONES = {~0ull, ~0ull};
static int const COUNT=4;
struct Action16 {
__m128i _results[COUNT];
__m128i _values[COUNT];
__m128i* _dest;
Action16(__m128i* dest, uint64_t const* values) : _dest(dest) {
for(int i=0; i < COUNT; i++) {
_results[i] = ALL_ZEROS;
_values[i] = _mm_set1_epi16((short) values[i]);
}
}
void operator()(__m128i const &entry) {
for(int i=0; i < COUNT; i++)
_results[i] |= _mm_cmpeq_epi16(_values[i], entry);
}
~Action16() {
for(int i=0; i < COUNT; i++)
_dest[i] = _mm_movemask_epi8(_results[i])? ALL_ONES : ALL_ZEROS;
}
};
struct wrapper {
__m128i const* entries;
int count;
};
#ifdef TRIGGER_BUG
void foo(__m128i* dest, uint64_t const* values, wrapper const &w) {
__m128i const* entries = w.entries; int count = w.count;
#else
void foo(__m128i* dest, uint64_t const* values, __m128i const* entries, int count) {
#endif
static int const unroll_count=16;
Action16 action(dest, values);
assert((count % unroll_count) == 0);
for(int i=0; i+unroll_count < count; i+=unroll_count)
unrolled_loop<unroll_count>(&entries[i], action);
}
int main() {
int VALUE_COUNT = 1000000;
int LIST_SIZE = 2048;
uint64_t* values = new uint64_t[VALUE_COUNT];
__m128i* dest = (__m128i*) _mm_malloc(16*VALUE_COUNT, 16);
__m128i entries[LIST_SIZE];
wrapper w = {entries, LIST_SIZE};
stopwatch_t timer;
for(int j=0; j < 5; j++) {
for(int i=0; i < VALUE_COUNT; i+= COUNT) {
#ifdef TRIGGER_BUG
foo(dest+i, values+i, w);
#else
foo(dest+i, values+i, entries, LIST_SIZE);
#endif
}
printf("%.3lf ns/value/entry\n", timer.time_ns()/LIST_SIZE/VALUE_COUNT);
}
}
--
Summary: Passing struct as parameter breaks SRA for stack-allocated struct inside called function
Product: gcc
Version: 4.1.2
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: c++
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: scovich at gmail dot com
GCC target triplet: x86_64-unknown-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32412
* [Bug middle-end/32412] Passing struct as parameter breaks SRA for stack-allocated struct inside called function
From: pinskia at gcc dot gnu dot org @ 2007-06-20 12:57 UTC (permalink / raw)
To: gcc-bugs
------- Comment #1 from pinskia at gcc dot gnu dot org 2007-06-20 12:57 -------
wrapper const &w
You are passing via reference which does not break SRA, just changes the ABI
and such.
This is a very very hard problem to solve without the whole program.
I am wondering if I should close it as won't fix.
--
pinskia at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Component|c++ |middle-end
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32412
* [Bug middle-end/32412] Passing struct as parameter breaks SRA for stack-allocated struct inside called function
From: scovich at gmail dot com @ 2007-06-20 17:49 UTC (permalink / raw)
To: gcc-bugs
------- Comment #2 from scovich at gmail dot com 2007-06-20 17:49 -------
(In reply to comment #1)
> wrapper const &w
>
> You are passing via reference which does not break SRA, just changes the ABI
> and such.
>
> This is a very very hard problem to solve without the whole program.
>
> I am wondering if I should close it as won't fix.
>
I'm not convinced the ABI change by itself is the culprit:
1. Passing w by value gives the same result. Granted, passing a struct at all
changes the ABI, but the const ref part isn't an issue, at least.
2. You have to actually use the wrapper's 'entries' pointer for the problem to
appear (diff for modified test case below).
3. The problem goes away if you convert Action16 to use scalars instead of
arrays, so SRA for structs is unaffected.
Why does passing a pointer inside a struct on the stack instead of passing it
in a register suddenly require the whole program to analyze properly? There's
no way stack-allocated arrays can alias with arrays passed into the function. I
would have expected a few extra instructions in the function prologue to load
the values into registers, followed by business as usual.
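To make that expectation concrete, here is a minimal sketch using ints in
place of __m128i. The names wrapper_sketch, scan_args and scan_wrapped are
hypothetical, not taken from sra-bug.C. The only intended difference between
the two functions is the two loads at the top of scan_wrapped; the local
arrays never have their address taken outside the function, so they cannot
alias the caller's data in either version.

```cpp
#include <cassert>

// Hypothetical stand-in for the 'wrapper' struct in the test case.
struct wrapper_sketch {
    int const* entries;
    int count;
};

// Pointer and count arrive as separate arguments (the fast case).
int scan_args(int const* entries, int count, int needle) {
    int results[2] = {0, 0};              // local, address never escapes
    int values[2] = {needle, needle + 1}; // local, address never escapes
    for (int i = 0; i < count; i++)
        for (int k = 0; k < 2; k++)
            results[k] |= (values[k] == entries[i]);
    return results[0] | (results[1] << 1);
}

// Same loop, but pointer and count arrive inside a struct (the slow case).
int scan_wrapped(wrapper_sketch const& w, int needle) {
    int const* entries = w.entries;       // expected cost: one extra load
    int count = w.count;                  // expected cost: one extra load
    int results[2] = {0, 0};
    int values[2] = {needle, needle + 1};
    for (int i = 0; i < count; i++)
        for (int k = 0; k < 2; k++)
            results[k] |= (values[k] == entries[i]);
    return results[0] | (results[1] << 1);
}
```

Nothing in the second function gives the loop a reason to touch memory other
than entries[i], which is why the doubled instruction count is surprising.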
$ diff sra-bug.C.orig sra-bug.C
==============================================================
51a52,54
> void foo(__m128i* dest, uint64_t const* values,
> __m128i const* _entries, int _count, wrapper w)
> {
53d55
< void foo(__m128i* dest, uint64_t const* values, wrapper const &w) {
56c58
< void foo(__m128i* dest, uint64_t const* values, __m128i const* entries, int count) {
---
> __m128i const* entries = _entries; int count = _count;
75,79c77
< #ifdef TRIGGER_BUG
< foo(dest+i, values+i, w);
< #else
< foo(dest+i, values+i, entries, LIST_SIZE);
< #endif
---
> foo(dest+i, values+i, entries, LIST_SIZE, w);
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32412
* [Bug middle-end/32412] Passing struct as parameter breaks SRA for stack-allocated struct inside called function
From: scovich at gmail dot com @ 2007-06-20 18:22 UTC (permalink / raw)
To: gcc-bugs
------- Comment #3 from scovich at gmail dot com 2007-06-20 18:22 -------
(In reply to comment #1)
Sorry for the double post, but I just tried creating a wrapper_foo() that
copies the values out of the struct, then passes them on to foo() as scalars.
The problem only appears if foo() gets inlined into wrapper_foo().
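The wrapper_foo() experiment was not posted, so the following is only a
guess at its shape, again with ints in place of __m128i. The names
wrapper_sketch and foo_sketch are hypothetical. The noinline attribute is a
GCC/Clang extension and is used here as one assumed way to compare the
inlined and non-inlined cases; removing it lets the compiler inline
foo_sketch() into wrapper_foo() again.

```cpp
#include <cassert>

// Hypothetical stand-in for the 'wrapper' struct in the test case.
struct wrapper_sketch {
    int const* entries;
    int count;
};

// Scalar-argument worker. The attribute (GCC/Clang specific) keeps it out
// of line, so the struct accesses in the caller cannot be merged into
// this function's body.
__attribute__((noinline))
int foo_sketch(int const* entries, int count, int needle) {
    int hits = 0;
    for (int i = 0; i < count; i++)
        hits += (entries[i] == needle);
    return hits;
}

// Copies the fields out of the struct and forwards them as scalars.
int wrapper_foo(wrapper_sketch const& w, int needle) {
    return foo_sketch(w.entries, w.count, needle);
}
```

Timing this with and without the attribute would be one way to check whether
inlining alone is enough to trigger the slowdown.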
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32412
[parent not found: <bug-32412-4@http.gcc.gnu.org/bugzilla/>]
end of thread, other threads:[~2012-01-19 5:13 UTC | newest]
2007-06-20 8:30 [Bug c++/32412] New: Passing struct as parameter breaks SRA for stack-allocated struct inside called function scovich at gmail dot com
2007-06-20 12:57 ` [Bug middle-end/32412] " pinskia at gcc dot gnu dot org
2007-06-20 17:49 ` scovich at gmail dot com
2007-06-20 18:22 ` scovich at gmail dot com
[not found] <bug-32412-4@http.gcc.gnu.org/bugzilla/>
2012-01-19 5:31 ` pinskia at gcc dot gnu.org