public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug c/38856] New: loop iv detection failure, SSA autoincrement
@ 2009-01-15 18:24 sergei_lus at yahoo dot com
2009-01-15 18:42 ` [Bug c/38856] " pinskia at gcc dot gnu dot org
` (6 more replies)
0 siblings, 7 replies; 9+ messages in thread
From: sergei_lus at yahoo dot com @ 2009-01-15 18:24 UTC (permalink / raw)
To: gcc-bugs
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain, Size: 3575 bytes --]
I apologize if it is a well disguised feature, but I am forced to consider this
being a performance regression/bug.
In the following trivial example:
void
VecADD(
long long *In1,
long long *In2,
long long *Out,
unsigned int samples
){
int i;
for (i = 0; i < samples; i++) {
Out[i] = In1[i] + In2[i];
}
}
there is an implicit imprecision in the way C is used - type of 'samples' is
unsigned, while type of 'i' is signed.
The problem on the high level - induction variable analysis fails for this
loop, which impairs further tree level loop optimizations from functioning
properly (including autoincrement). In my port performance is off by 50% for
this loop. GCC 3.4.6 was able to handle this situation fine.
What I believe to be the problem at the lowest level is a non-minimal (or
overly restrictive) SSA representation right before the iv detection:
VecADD (In1, In2, Out, samples)
{
int i;
long long int D.1857;
long long int D.1856;
long long int * D.1855;
long long int D.1854;
long long int * D.1853;
long long int * D.1852;
unsigned int D.1851;
unsigned int i.0;
<bb 2>:
<bb 6>:
# i_10 = PHI <0(2)>
i.0_5 = (unsigned int) i_10;
if (i.0_5 < samples_4(D))
goto <bb 3>;
else
goto <bb 5>;
<bb 3>:
# i.0_9 = PHI <i.0_3(4), i.0_5(6)>
# i_14 = PHI <i_1(4), i_10(6)>
D.1851_6 = i.0_9 * 8;
D.1852_8 = Out_7(D) + D.1851_6;
D.1853_12 = In1_11(D) + D.1851_6;
D.1854_13 = *D.1853_12;
D.1855_17 = In2_16(D) + D.1851_6;
D.1856_18 = *D.1855_17;
D.1857_19 = D.1854_13 + D.1856_18;
*D.1852_8 = D.1857_19;
i_20 = i_14 + 1;
<bb 4>:
# i_1 = PHI <i_20(3)>
i.0_3 = (unsigned int) i_1;
if (i.0_3 < samples_4(D))
goto <bb 3>;
else
goto <bb 5>;
<bb 5>:
return;
}
The two PHI nodes in the beginning of BB3 break the iv detection. Same example
when types of i and samples would match will be analyzed perfectly fine
with the SSA at the same point looking like this:
VecADD (In1, In2, Out, samples)
{
int i;
long long int D.1857;
long long int D.1856;
long long int * D.1855;
long long int D.1854;
long long int * D.1853;
long long int * D.1852;
unsigned int D.1851;
unsigned int i.0;
<bb 2>:
<bb 6>:
# i_9 = PHI <0(2)>
if (i_9 < samples_3(D))
goto <bb 3>;
else
goto <bb 5>;
<bb 3>:
# i_13 = PHI <i_1(4), i_9(6)>
i.0_4 = (unsigned int) i_13;
D.1851_5 = i.0_4 * 8;
D.1852_7 = Out_6(D) + D.1851_5;
D.1853_11 = In1_10(D) + D.1851_5;
D.1854_12 = *D.1853_11;
D.1855_16 = In2_15(D) + D.1851_5;
D.1856_17 = *D.1855_16;
D.1857_18 = D.1854_12 + D.1856_17;
*D.1852_7 = D.1857_18;
i_19 = i_13 + 1;
<bb 4>:
# i_1 = PHI <i_19(3)>
if (i_1 < samples_3(D))
goto <bb 3>;
else
goto <bb 5>;
<bb 5>:
return;
}
On one hand I seem to understand that a danger of signed/unsigned overflow at
increment can force this kind of conservatism, but on the high level this
situation was handled fine by gcc 3.4.6 and is handled with no issues by
another SSA based compiler. If there is a way to relax this strict
interpretation of C rules by GCC 4.3.2, I would gladly learn about it, but my
brief flag mining exercise yielded no results. Thank you.
--
Summary: loop iv detection failure, SSA autoincrement
Product: gcc
Version: 4.3.2
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: sergei_lus at yahoo dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38856
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug c/38856] loop iv detection failure, SSA autoincrement
2009-01-15 18:24 [Bug c/38856] New: loop iv detection failure, SSA autoincrement sergei_lus at yahoo dot com
@ 2009-01-15 18:42 ` pinskia at gcc dot gnu dot org
2009-01-15 21:03 ` sergei_lus at yahoo dot com
` (5 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2009-01-15 18:42 UTC (permalink / raw)
To: gcc-bugs
------- Comment #1 from pinskia at gcc dot gnu dot org 2009-01-15 18:41 -------
Well first off, iv-opts on the tree level should be the place where
autoincrement is helped out. See PR 31849 which I think this is a duplicate
of.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38856
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug c/38856] loop iv detection failure, SSA autoincrement
2009-01-15 18:24 [Bug c/38856] New: loop iv detection failure, SSA autoincrement sergei_lus at yahoo dot com
2009-01-15 18:42 ` [Bug c/38856] " pinskia at gcc dot gnu dot org
@ 2009-01-15 21:03 ` sergei_lus at yahoo dot com
2009-01-15 21:13 ` [Bug middle-end/38856] loop iv detection failure pinskia at gcc dot gnu dot org
` (4 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: sergei_lus at yahoo dot com @ 2009-01-15 21:03 UTC (permalink / raw)
To: gcc-bugs
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain, Size: 715 bytes --]
------- Comment #2 from sergei_lus at yahoo dot com 2009-01-15 21:03 -------
Andrew, thank you for the prompt reply.
I have seen PR 31849 and used the patch suggested. Without it autoincrement
would not work at all for either case. But the patch seems to deal with the
case when iv _were_ detected and generally alters cost computation to take that
addressing mode into account, while in this case I see that ivs are _not_
detected in the first place. This seems to be a more fundamental issue -
should SSA be made "minimal" at that point or should iv detection be improved
to "see" trough aliased PHI node I lack better words to describe it.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38856
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug middle-end/38856] loop iv detection failure
2009-01-15 18:24 [Bug c/38856] New: loop iv detection failure, SSA autoincrement sergei_lus at yahoo dot com
2009-01-15 18:42 ` [Bug c/38856] " pinskia at gcc dot gnu dot org
2009-01-15 21:03 ` sergei_lus at yahoo dot com
@ 2009-01-15 21:13 ` pinskia at gcc dot gnu dot org
2009-01-15 21:17 ` pinskia at gcc dot gnu dot org
` (3 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2009-01-15 21:13 UTC (permalink / raw)
To: gcc-bugs
------- Comment #3 from pinskia at gcc dot gnu dot org 2009-01-15 21:13 -------
This bug is a bit funny, autoincrement is not the issue, just the detecting of
the induction variable's limits correctly. For an example I noticed that if I
compile this on a LP64 target, the induction variable is found correctly.
--
pinskia at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Component|c |middle-end
Keywords| |missed-optimization
Summary|loop iv detection failure, |loop iv detection failure
|SSA autoincrement |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38856
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug middle-end/38856] loop iv detection failure
2009-01-15 18:24 [Bug c/38856] New: loop iv detection failure, SSA autoincrement sergei_lus at yahoo dot com
` (2 preceding siblings ...)
2009-01-15 21:13 ` [Bug middle-end/38856] loop iv detection failure pinskia at gcc dot gnu dot org
@ 2009-01-15 21:17 ` pinskia at gcc dot gnu dot org
2009-01-16 9:57 ` rguenth at gcc dot gnu dot org
` (2 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2009-01-15 21:17 UTC (permalink / raw)
To: gcc-bugs
------- Comment #4 from pinskia at gcc dot gnu dot org 2009-01-15 21:17 -------
Here is a good testcase:
void
VecADD(
int *In1,
int *In2,
int *Out,
unsigned int samples
){
int i;
for (i = 0; i < samples; i++) {
Out[i] = In1[i];
}
}
This testcase should be vectorized with -maltivec -O3 on PowerPC but is not
while it is with -m64.
--
pinskia at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |NEW
Ever Confirmed|0 |1
Last reconfirmed|0000-00-00 00:00:00 |2009-01-15 21:17:17
date| |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38856
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug middle-end/38856] loop iv detection failure
2009-01-15 18:24 [Bug c/38856] New: loop iv detection failure, SSA autoincrement sergei_lus at yahoo dot com
` (3 preceding siblings ...)
2009-01-15 21:17 ` pinskia at gcc dot gnu dot org
@ 2009-01-16 9:57 ` rguenth at gcc dot gnu dot org
2009-01-16 11:46 ` Andrew Thomas Pinski
2009-01-16 11:46 ` pinskia at gmail dot com
2009-01-16 18:53 ` pinskia at gcc dot gnu dot org
6 siblings, 1 reply; 9+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-01-16 9:57 UTC (permalink / raw)
To: gcc-bugs
------- Comment #5 from rguenth at gcc dot gnu dot org 2009-01-16 09:57 -------
I think this boils down to the usual POINTER_PLUS fallout.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38856
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Bug middle-end/38856] loop iv detection failure
2009-01-16 9:57 ` rguenth at gcc dot gnu dot org
@ 2009-01-16 11:46 ` Andrew Thomas Pinski
0 siblings, 0 replies; 9+ messages in thread
From: Andrew Thomas Pinski @ 2009-01-16 11:46 UTC (permalink / raw)
To: gcc-bugzilla; +Cc: gcc-bugs
Sent from my iPhone
On Jan 16, 2009, at 1:57 AM, "rguenth at gcc dot gnu dot org" <gcc-bugzilla@gcc.gnu.org
> wrote:
>
>
> ------- Comment #5 from rguenth at gcc dot gnu dot org 2009-01-16
> 09:57 -------
> I think this boils down to the usual POINTER_PLUS fallout.
It failed in 4.1 also so nope :).
>
>
>
> --
>
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38856
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug middle-end/38856] loop iv detection failure
2009-01-15 18:24 [Bug c/38856] New: loop iv detection failure, SSA autoincrement sergei_lus at yahoo dot com
` (4 preceding siblings ...)
2009-01-16 9:57 ` rguenth at gcc dot gnu dot org
@ 2009-01-16 11:46 ` pinskia at gmail dot com
2009-01-16 18:53 ` pinskia at gcc dot gnu dot org
6 siblings, 0 replies; 9+ messages in thread
From: pinskia at gmail dot com @ 2009-01-16 11:46 UTC (permalink / raw)
To: gcc-bugs
------- Comment #6 from pinskia at gmail dot com 2009-01-16 11:46 -------
Subject: Re: loop iv detection failure
Sent from my iPhone
On Jan 16, 2009, at 1:57 AM, "rguenth at gcc dot gnu dot org"
<gcc-bugzilla@gcc.gnu.org
> wrote:
>
>
> ------- Comment #5 from rguenth at gcc dot gnu dot org 2009-01-16
> 09:57 -------
> I think this boils down to the usual POINTER_PLUS fallout.
It failed in 4.1 also so nope :).
>
>
>
> --
>
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38856
>
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38856
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug middle-end/38856] loop iv detection failure
2009-01-15 18:24 [Bug c/38856] New: loop iv detection failure, SSA autoincrement sergei_lus at yahoo dot com
` (5 preceding siblings ...)
2009-01-16 11:46 ` pinskia at gmail dot com
@ 2009-01-16 18:53 ` pinskia at gcc dot gnu dot org
6 siblings, 0 replies; 9+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2009-01-16 18:53 UTC (permalink / raw)
To: gcc-bugs
------- Comment #7 from pinskia at gcc dot gnu dot org 2009-01-16 18:53 -------
Here is a testcase where IV can be found but we end up with two copies of the
same register in the loop:
int f(unsigned long s, unsigned long *c)
{
long j;
for(j = 0;j < s; j++)
{
c[j]=j;
}
}
Which is basically the same issue.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38856
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2009-01-16 18:53 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-01-15 18:24 [Bug c/38856] New: loop iv detection failure, SSA autoincrement sergei_lus at yahoo dot com
2009-01-15 18:42 ` [Bug c/38856] " pinskia at gcc dot gnu dot org
2009-01-15 21:03 ` sergei_lus at yahoo dot com
2009-01-15 21:13 ` [Bug middle-end/38856] loop iv detection failure pinskia at gcc dot gnu dot org
2009-01-15 21:17 ` pinskia at gcc dot gnu dot org
2009-01-16 9:57 ` rguenth at gcc dot gnu dot org
2009-01-16 11:46 ` Andrew Thomas Pinski
2009-01-16 11:46 ` pinskia at gmail dot com
2009-01-16 18:53 ` pinskia at gcc dot gnu dot org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).