public inbox for gcc-bugs@sourceware.org
* [Bug inline-asm/38671] New: [4.4 Regression] speed regression with sse intrinsics
From: tim at klingt dot org @ 2008-12-30 12:58 UTC (permalink / raw)
To: gcc-bugs
i experience some speed regressions with gcc-4.4, with sse intrinsics on a
core2 (x86_64). the code is:
namespace detail
{
/** compute x1 * (1 + x2 * amount) */
__m128 inline amp_mod4_loop(__m128 x1, __m128 x2, __m128 amount, __m128 one)
{
    return _mm_mul_ps(x1,
                      _mm_add_ps(one,
                                 _mm_mul_ps(x2, amount)));
}
} /* namespace detail */

template <>
inline void amp_mod4(float * out, const float * in1, const float * in2,
                     const float amount, unsigned int n)
{
    n = n >> 2;
    const __m128 one = detail::gen_one();
    const __m128 amnt = _mm_set_ps1(amount);
    do
    {
        const __m128 x1 = _mm_load_ps(in1);
        in1 += 4;
        const __m128 x2 = _mm_load_ps(in2);
        in2 += 4;
        const __m128 result = detail::amp_mod4_loop(x1, x2, amnt, one);
        _mm_store_ps(out, result);
        out += 4;
    }
    while (--n);
}
the results for different compilers (using hardware performance counters) are:
gcc-4.4:
cycles: 1416276094
branch misses: 425897
gcc-4.4 -march=core2:
cycles: 1520034636
branch misses: 3263912
gcc-4.3:
cycles: 1548838336
branch misses: 5990424
gcc-4.3 -march=core2:
cycles: 1386605444
branch misses: 5609
gcc-4.2:
cycles: 1321697674
branch misses: 3682
it seems that gcc-4.3 with -march=core2 and gcc-4.2 generate code that is
friendlier to the branch predictor. tuning for core2 on gcc-4.4 actually
seems to generate worse code.
the best code (gcc-4.2) is:
0000000000400de0 <bench_1_simd(unsigned int)>:
400de0: 66 0f ef c0 pxor %xmm0,%xmm0
400de4: c1 ef 02 shr $0x2,%edi
400de7: 0f 28 15 32 0f 00 00 movaps 0xf32(%rip),%xmm2 # 401d20 <_IO_stdin_used+0xb0>
400dee: 31 c0 xor %eax,%eax
400df0: 66 0f 76 c0 pcmpeqd %xmm0,%xmm0
400df4: 66 0f 72 d0 19 psrld $0x19,%xmm0
400df9: 66 0f 72 f0 17 pslld $0x17,%xmm0
400dfe: 0f 28 c8 movaps %xmm0,%xmm1
400e01: 0f 28 80 e0 26 60 00 movaps 0x6026e0(%rax),%xmm0
400e08: 0f 59 c2 mulps %xmm2,%xmm0
400e0b: 0f 58 c1 addps %xmm1,%xmm0
400e0e: 0f 59 80 e0 25 60 00 mulps 0x6025e0(%rax),%xmm0
400e15: 0f 29 80 e0 24 60 00 movaps %xmm0,0x6024e0(%rax)
400e1c: 48 83 c0 10 add $0x10,%rax
400e20: 83 ef 01 sub $0x1,%edi
400e23: 75 dc jne 400e01 <bench_1_simd(unsigned int)+0x21>
400e25: f3 c3 repz retq
400e27: 90 nop
400e28: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
the worst code (gcc-4.4, -march=core2) is 15% slower:
0000000000400e70 <bench_1_simd(unsigned int)>:
400e70: 66 0f ef d2 pxor %xmm2,%xmm2
400e74: 89 fa mov %edi,%edx
400e76: 66 0f 76 d2 pcmpeqd %xmm2,%xmm2
400e7a: c1 ea 02 shr $0x2,%edx
400e7d: 66 0f 72 d2 19 psrld $0x19,%xmm2
400e82: ff ca dec %edx
400e84: 66 0f 72 f2 17 pslld $0x17,%xmm2
400e89: 48 ff c2 inc %rdx
400e8c: 0f 28 0d 7d 17 00 00 movaps 0x177d(%rip),%xmm1 # 402610 <_IO_stdin_used+0xb0>
400e93: 48 c1 e2 04 shl $0x4,%rdx
400e97: 31 c0 xor %eax,%eax
400e99: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)
400ea0: 0f 28 c1 movaps %xmm1,%xmm0
400ea3: 0f 59 80 e0 36 60 00 mulps 0x6036e0(%rax),%xmm0
400eaa: 0f 58 c2 addps %xmm2,%xmm0
400ead: 0f 59 80 e0 35 60 00 mulps 0x6035e0(%rax),%xmm0
400eb4: 0f 29 80 e0 34 60 00 movaps %xmm0,0x6034e0(%rax)
400ebb: 48 83 c0 10 add $0x10,%rax
400ebf: 48 39 d0 cmp %rdx,%rax
400ec2: 75 dc jne 400ea0 <bench_1_simd(unsigned int)+0x30>
400ec4: f3 c3 repz retq
400ec6: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
400ecd: 00 00 00
--
Summary: [4.4 Regression] speed regression with sse intrinsics
Product: gcc
Version: 4.4.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: inline-asm
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: tim at klingt dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38671
* [Bug inline-asm/38671] [4.4 Regression] speed regression with sse intrinsics
From: tim at klingt dot org @ 2008-12-30 12:59 UTC (permalink / raw)
To: gcc-bugs
------- Comment #1 from tim at klingt dot org 2008-12-30 12:58 -------
Created an attachment (id=17013)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17013&action=view)
preprocessed source (gcc-4.4)
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38671
* [Bug target/38671] [4.4 Regression] speed regression with sse intrinsics
From: pinskia at gcc dot gnu dot org @ 2008-12-30 16:23 UTC (permalink / raw)
To: gcc-bugs
--
pinskia at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Component|inline-asm |target
GCC target triplet| |i?86-*-*
Keywords| |missed-optimization
Target Milestone|--- |4.4.0
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38671
* [Bug target/38671] [4.4 Regression] speed regression with sse intrinsics
From: pinskia at gcc dot gnu dot org @ 2008-12-31 7:50 UTC (permalink / raw)
To: gcc-bugs
------- Comment #2 from pinskia at gcc dot gnu dot org 2008-12-31 07:47 -------
sys_perf_counter_open always returns less than zero for me.
This is with:
Linux gcc13 2.6.18-6-vserver-amd64 #1 SMP Sun Feb 10 17:55:04 UTC 2008 x86_64
GNU/Linux
What system call is it trying to do and why?
--
pinskia at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
GCC target triplet|i?86-*-* |x86_64-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38671
* [Bug target/38671] [4.4 Regression] speed regression with sse intrinsics
From: pinskia at gcc dot gnu dot org @ 2008-12-31 7:57 UTC (permalink / raw)
To: gcc-bugs
------- Comment #3 from pinskia at gcc dot gnu dot org 2008-12-31 07:56 -------
t.cc: In function 'float __vector__ nova::detail::gen_one()':
t.cc:34160: warning: 'x' is used uninitialized in this function

inline __m128 gen_one(void)
{
    __m128i x;
    __m128i ones = _mm_cmpeq_epi32(x, x);
    return (__m128)_mm_slli_epi32 (_mm_srli_epi32(ones, 25), 23);
}
Is undefined code I think.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38671
* [Bug middle-end/38671] [4.4 Regression] extra code for setting up loops
From: pinskia at gcc dot gnu dot org @ 2008-12-31 8:11 UTC (permalink / raw)
To: gcc-bugs
------- Comment #4 from pinskia at gcc dot gnu dot org 2008-12-31 08:10 -------
D.45587 = VIEW_CONVERT_EXPR<__v4si>(x);
D.45589 = __builtin_ia32_pcmpeqd128 (D.45587, D.45587);
D.45591 = __builtin_ia32_psrldi128 (D.45589, 25);
D.45594 = __builtin_ia32_pslldi128 (D.45591, 23);
one = VIEW_CONVERT_EXPR<__m128>(VIEW_CONVERT_EXPR<__m128i>(D.45594));
D.45644 = (long unsigned int) ((n >> 2) + 4294967295) + 1 * 16;
ivtmp.516 = 0;
So the inner loop is not the issue, only the setup code.
The extra subtract/add comes from D.45644.
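
A scalar sketch of why the subtract/add pair cannot simply be folded away. It assumes, going by the dec %edx / inc %rdx / shl $0x4,%rdx sequence in the gcc-4.4 listing, that the bound is ((n >> 2) - 1, computed in 32 bits, widened to 64 bits, plus 1) times 16. The function names are made up for illustration.

```cpp
#include <cstdint>

// Hypothetical model of the loop bound as the dump above computes it.
inline std::uint64_t bound_as_emitted(std::uint32_t n)
{
    // Adding 4294967295 in 32 bits is a wrapping subtraction of 1.
    std::uint32_t count_minus_one = (n >> 2) + 4294967295u;
    return (static_cast<std::uint64_t>(count_minus_one) + 1) * 16;
}

// What one would get by widening first and dropping the -1/+1 pair.
inline std::uint64_t bound_simplified(std::uint32_t n)
{
    return static_cast<std::uint64_t>(n >> 2) * 16;
}
```

The two agree for every nonzero count and differ only when n >> 2 is zero, where the do/while loop with --n runs 2^32 times. Without knowing the count is nonzero, the compiler has to keep the wraparound step.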
--
pinskia at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Component|target |middle-end
Summary|[4.4 Regression] speed regression with sse intrinsics |[4.4 Regression] extra code for setting up loops
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38671
* [Bug middle-end/38671] [4.4 Regression] extra code for setting up loops (IV-opts and 32bits vs 64bits)
From: pinskia at gcc dot gnu dot org @ 2008-12-31 8:14 UTC (permalink / raw)
To: gcc-bugs
------- Comment #5 from pinskia at gcc dot gnu dot org 2008-12-31 08:12 -------
Confirmed, though I don't have a fully reduced testcase yet. Basically it
comes down to using unsigned int rather than size_t. If you had used size_t as
the index, the code would have worked correctly.
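
A sketch of the suggested change, using a scalar stand-in for the SSE loop so that it compiles without intrinsics headers. The function name and the scalar body are assumptions made for this example; only the counter type (std::size_t instead of unsigned int) is the point. Whether a given GCC version then emits the shorter loop setup has to be checked in the generated assembly.

```cpp
#include <cstddef>

// Hypothetical scalar stand-in for amp_mod4:
//   out[i] = in1[i] * (1 + in2[i] * amount)
// n must be a nonzero multiple of 4, as the original do/while loop requires.
inline void amp_mod4_scalar(float* out, const float* in1, const float* in2,
                            float amount, std::size_t n)
{
    n >>= 2;  // number of four-float blocks; the counter is pointer-width
    do {
        for (int i = 0; i < 4; ++i)
            out[i] = in1[i] * (1.0f + in2[i] * amount);
        out += 4;
        in1 += 4;
        in2 += 4;
    } while (--n);
}
```

With a 64-bit counter there is no 32-bit wraparound to preserve when the trip count is turned into a byte offset.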
--
pinskia at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |NEW
Ever Confirmed|0 |1
Last reconfirmed date|0000-00-00 00:00:00 |2008-12-31 08:12:50
Summary|[4.4 Regression] extra code for setting up loops |[4.4 Regression] extra code for setting up loops (IV-opts and 32bits vs 64bits)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38671
* [Bug middle-end/38671] [4.4 Regression] extra code for setting up loops (IV-opts and 32bits vs 64bits)
From: tim at klingt dot org @ 2008-12-31 9:21 UTC (permalink / raw)
To: gcc-bugs
------- Comment #6 from tim at klingt dot org 2008-12-31 09:20 -------
> sys_perf_counter_open always returns less than zero for me.
> This is with:
> Linux gcc13 2.6.18-6-vserver-amd64 #1 SMP Sun Feb 10 17:55:04 UTC 2008 x86_64
> GNU/Linux
>
> What system call is it trying to do and why?
>
it is trying to open the performance counters
(http://lwn.net/Articles/310176/). it requires a patched kernel, though ...
(In reply to comment #3)
> t.cc: In function 'float __vector__ nova::detail::gen_one()':
> t.cc:34160: warning: 'x' is used uninitialized in this function
>
> inline __m128 gen_one(void)
> {
>     __m128i x;
>     __m128i ones = _mm_cmpeq_epi32(x, x);
>     return (__m128)_mm_slli_epi32 (_mm_srli_epi32(ones, 25), 23);
> }
>
> Is undefined code I think.
this code is valid. the uninitialized xmm register x is compared with itself in
order to set the register ones to ffffffffffffffffffffffffffffffff.
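
The bit pattern behind gen_one() can be checked in scalar code. The sketch below is an illustration, not code from the attachment: it starts from an explicit all-ones word (the value each 32-bit lane holds after a register is compared with itself), applies the same two shifts, and reinterprets the result as a float. The helper name one_from_all_ones is made up for this example.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: scalar model of one 32-bit lane of gen_one().
inline float one_from_all_ones()
{
    std::uint32_t ones = 0xFFFFFFFFu;         // lane value after comparing x with x
    std::uint32_t bits = (ones >> 25) << 23;  // 0x7F << 23 == 0x3F800000
    float result;
    std::memcpy(&result, &bits, sizeof result);  // reinterpret as IEEE-754 single
    return result;                               // 0x3F800000 encodes 1.0f
}
```

In the intrinsics version, starting from _mm_setzero_si128() instead of an uninitialized variable should give the same all-ones result from _mm_cmpeq_epi32 and avoid the warning.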
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38671
* [Bug middle-end/38671] [4.4 Regression] extra code for setting up loops (IV-opts and 32bits vs 64bits)
From: rguenth at gcc dot gnu dot org @ 2009-01-05 11:28 UTC (permalink / raw)
To: gcc-bugs
--
rguenth at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Priority|P3 |P2
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38671
* [Bug middle-end/38671] [4.4/4.5 Regression] extra code for setting up loops (IV-opts and 32bits vs 64bits)
From: jakub at gcc dot gnu dot org @ 2009-04-21 16:00 UTC (permalink / raw)
To: gcc-bugs
--
jakub at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|4.4.0 |4.4.1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38671
* [Bug middle-end/38671] [4.4/4.5 Regression] extra code for setting up loops (IV-opts and 32bits vs 64bits)
From: jakub at gcc dot gnu dot org @ 2009-07-22 10:35 UTC (permalink / raw)
To: gcc-bugs
--
jakub at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|4.4.1 |4.4.2
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38671
* [Bug middle-end/38671] [4.4/4.5 Regression] extra code for setting up loops (IV-opts and 32bits vs 64bits)
From: jakub at gcc dot gnu dot org @ 2009-10-15 12:54 UTC (permalink / raw)
To: gcc-bugs
--
jakub at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|4.4.2 |4.4.3
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38671
* [Bug middle-end/38671] [4.4/4.5 Regression] extra code for setting up loops (IV-opts and 32bits vs 64bits)
From: jakub at gcc dot gnu dot org @ 2010-01-21 13:16 UTC (permalink / raw)
To: gcc-bugs
--
jakub at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|4.4.3 |4.4.4
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38671
* [Bug middle-end/38671] [4.3/4.4/4.5 Regression] extra code for setting up loops (IV-opts and 32bits vs 64bits)
From: pinskia at gcc dot gnu dot org @ 2010-03-01 23:32 UTC (permalink / raw)
To: gcc-bugs
------- Comment #7 from pinskia at gcc dot gnu dot org 2010-03-01 23:32 -------
Still getting:
D.41749_31 = n_13 + 4294967295;
D.41750_32 = (long unsigned int) D.41749_31;
D.41751_49 = D.41750_32 + 1;
Reduced testcase:
int f(int *a, int n, int *b)
{
    n = n >> 2;
    do {
        *b = *a;
        a += 4;
        b += 4;
    } while (--n);
}
--- CUT ---
I want to say this was introduced by the POINTER_PLUS_EXPR work :(.
--
pinskia at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Known to fail| |4.3.2 4.5.0
Known to work| |4.2.4
Summary|[4.4/4.5 Regression] extra code for setting up loops (IV-opts and 32bits vs 64bits) |[4.3/4.4/4.5 Regression] extra code for setting up loops (IV-opts and 32bits vs 64bits)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38671
* [Bug middle-end/38671] [4.3/4.4/4.5 Regression] selecting one IV instead of three
From: pinskia at gcc dot gnu dot org @ 2010-03-01 23:35 UTC (permalink / raw)
To: gcc-bugs
------- Comment #8 from pinskia at gcc dot gnu dot org 2010-03-01 23:34 -------
For 4.2, we use three IVs; while from 4.3 and above, we use one IV.
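
In source terms the two strategies look roughly as follows. This is an illustrative sketch based on the reduced testcase from comment #7, not compiler output; the function names are made up, and the counter is unsigned here so that the wraparound is well defined.

```cpp
#include <cstddef>

// Three induction variables, as described for 4.2: two pointers and a
// down-counter. No bound has to be computed before the loop.
inline void copy_three_ivs(const int* a, int* b, unsigned n)
{
    n >>= 2;
    do {
        *b = *a;
        a += 4;
        b += 4;
    } while (--n);
}

// One induction variable, as described for 4.3 and above: a single offset
// compared against a precomputed bound. Computing `bound` is the extra
// setup code (32-bit subtract, widen, add, scale).
inline void copy_one_iv(const int* a, int* b, unsigned n)
{
    const std::size_t bound =
        (static_cast<std::size_t>((n >> 2) - 1u) + 1) * 4;
    std::size_t i = 0;
    do {
        b[i] = a[i];
        i += 4;
    } while (i != bound);
}
```

Both functions copy every fourth element for a nonzero count; they differ only in where the loop bookkeeping is done.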
--
pinskia at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Summary|[4.3/4.4/4.5 Regression] extra code for setting up loops (IV-opts and 32bits vs 64bits) |[4.3/4.4/4.5 Regression] selecting one IV instead of three
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38671
* [Bug middle-end/38671] [4.3/4.4/4.5/4.6 Regression] selecting one IV instead of three
From: jakub at gcc dot gnu dot org @ 2010-04-30 9:25 UTC (permalink / raw)
To: gcc-bugs
--
jakub at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|4.4.4 |4.4.5
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38671
Thread overview: 16+ messages
2008-12-30 12:58 [Bug inline-asm/38671] New: [4.4 Regression] speed regression with sse intrinsics tim at klingt dot org
2008-12-30 12:59 ` [Bug inline-asm/38671] " tim at klingt dot org
2008-12-30 16:23 ` [Bug target/38671] " pinskia at gcc dot gnu dot org
2008-12-31 7:50 ` pinskia at gcc dot gnu dot org
2008-12-31 7:57 ` pinskia at gcc dot gnu dot org
2008-12-31 8:11 ` [Bug middle-end/38671] [4.4 Regression] extra code for setting up loops pinskia at gcc dot gnu dot org
2008-12-31 8:14 ` [Bug middle-end/38671] [4.4 Regression] extra code for setting up loops (IV-opts and 32bits vs 64bits) pinskia at gcc dot gnu dot org
2008-12-31 9:21 ` tim at klingt dot org
2009-01-05 11:28 ` rguenth at gcc dot gnu dot org
2009-04-21 16:00 ` [Bug middle-end/38671] [4.4/4.5 " jakub at gcc dot gnu dot org
2009-07-22 10:35 ` jakub at gcc dot gnu dot org
2009-10-15 12:54 ` jakub at gcc dot gnu dot org
2010-01-21 13:16 ` jakub at gcc dot gnu dot org
2010-03-01 23:32 ` [Bug middle-end/38671] [4.3/4.4/4.5 " pinskia at gcc dot gnu dot org
2010-03-01 23:35 ` [Bug middle-end/38671] [4.3/4.4/4.5 Regression] selecting one IV instead of three pinskia at gcc dot gnu dot org
2010-04-30 9:25 ` [Bug middle-end/38671] [4.3/4.4/4.5/4.6 " jakub at gcc dot gnu dot org