public inbox for gcc-bugs@sourceware.org
* [Bug c/38856]  New: loop iv detection failure, SSA autoincrement
@ 2009-01-15 18:24 sergei_lus at yahoo dot com
  2009-01-15 18:42 ` [Bug c/38856] " pinskia at gcc dot gnu dot org
                   ` (6 more replies)
  0 siblings, 7 replies; 11+ messages in thread
From: sergei_lus at yahoo dot com @ 2009-01-15 18:24 UTC (permalink / raw)
  To: gcc-bugs

I apologize if this is a well-disguised feature, but I have to consider it a
performance regression/bug.

In the following trivial example:
void
VecADD(
    long long *In1,
    long long *In2,
    long long *Out,
    unsigned int samples
){
  int i;
  for (i = 0; i < samples; i++) {
    Out[i] = In1[i] + In2[i];
  }
}

there is an implicit imprecision in the way C is used: the type of 'samples' is
unsigned, while the type of 'i' is signed.
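
To make the type mismatch concrete, here is a minimal sketch of what the
comparison in the loop condition does under the usual arithmetic conversions.
The function names mixed_less and explicit_less are hypothetical and only
serve the illustration; they are not part of the original testcase.

```c
#include <assert.h>

/* Mixed-sign comparison, as written in the loop condition `i < samples`.
   The usual arithmetic conversions convert i to unsigned int first. */
int mixed_less(int i, unsigned int samples)
{
    return i < samples;
}

/* The same comparison with the implicit conversion spelled out. This is
   the form that shows up in the dump as i.0 = (unsigned int) i. */
int explicit_less(int i, unsigned int samples)
{
    return (unsigned int) i < samples;
}

/* Both forms agree for every input, including negative i, which wraps
   to a large unsigned value before the comparison. */
int forms_agree(int i, unsigned int samples)
{
    assert(mixed_less(i, samples) == explicit_less(i, samples));
    return 1;
}
```

Calling forms_agree(-1, 4u) passes its internal check, and both comparisons
yield 0 there, because -1 becomes UINT_MAX once converted.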

The high-level problem is that induction variable analysis fails for this
loop, which prevents later tree-level loop optimizations (including
autoincrement) from working properly. In my port, performance is off by 50% for
this loop. GCC 3.4.6 handled this situation fine.

What I believe to be the problem at the lowest level is a non-minimal (or
overly restrictive) SSA representation right before the iv detection:

VecADD (In1, In2, Out, samples)
{
  int i;
  long long int D.1857;
  long long int D.1856;
  long long int * D.1855;
  long long int D.1854;
  long long int * D.1853;
  long long int * D.1852;
  unsigned int D.1851;
  unsigned int i.0;

<bb 2>:

<bb 6>:
  # i_10 = PHI <0(2)>
  i.0_5 = (unsigned int) i_10;
  if (i.0_5 < samples_4(D))
    goto <bb 3>;
  else
    goto <bb 5>;

<bb 3>:
  # i.0_9 = PHI <i.0_3(4), i.0_5(6)>
  # i_14 = PHI <i_1(4), i_10(6)>
  D.1851_6 = i.0_9 * 8;
  D.1852_8 = Out_7(D) + D.1851_6;
  D.1853_12 = In1_11(D) + D.1851_6;
  D.1854_13 = *D.1853_12;
  D.1855_17 = In2_16(D) + D.1851_6;
  D.1856_18 = *D.1855_17;
  D.1857_19 = D.1854_13 + D.1856_18;
  *D.1852_8 = D.1857_19;
  i_20 = i_14 + 1;

<bb 4>:
  # i_1 = PHI <i_20(3)>
  i.0_3 = (unsigned int) i_1;
  if (i.0_3 < samples_4(D))
    goto <bb 3>;
  else
    goto <bb 5>;

<bb 5>:
  return;
}

The two PHI nodes at the beginning of BB3 break the iv detection. The same
example with matching types for 'i' and 'samples' is analyzed perfectly well,
with the SSA at the same point looking like this:

VecADD (In1, In2, Out, samples)
{
  int i;
  long long int D.1857;
  long long int D.1856;
  long long int * D.1855;
  long long int D.1854;
  long long int * D.1853;
  long long int * D.1852;
  unsigned int D.1851;
  unsigned int i.0;

<bb 2>:

<bb 6>:
  # i_9 = PHI <0(2)>
  if (i_9 < samples_3(D))
    goto <bb 3>;
  else
    goto <bb 5>;

<bb 3>:
  # i_13 = PHI <i_1(4), i_9(6)>
  i.0_4 = (unsigned int) i_13;
  D.1851_5 = i.0_4 * 8;
  D.1852_7 = Out_6(D) + D.1851_5;
  D.1853_11 = In1_10(D) + D.1851_5;
  D.1854_12 = *D.1853_11;
  D.1855_16 = In2_15(D) + D.1851_5;
  D.1856_17 = *D.1855_16;
  D.1857_18 = D.1854_12 + D.1856_17;
  *D.1852_7 = D.1857_18;
  i_19 = i_13 + 1;

<bb 4>:
  # i_1 = PHI <i_19(3)>
  if (i_1 < samples_3(D))
    goto <bb 3>;
  else
    goto <bb 5>;

<bb 5>:
  return;
}

On one hand, I understand that the danger of signed/unsigned overflow at the
increment can force this kind of conservatism, but this situation was handled
fine by GCC 3.4.6 and is handled with no issues by another SSA-based compiler.
If there is a way to relax this strict interpretation of C rules in GCC 4.3.2, I
would gladly learn about it, but my brief flag-mining exercise yielded no
results. Thank you.
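
For reference, a source-level sketch of the matching-types case. This variant
changes the index to unsigned int, which is one of two ways to make the types
agree (the other is a signed 'samples'). The names VecADD_unsigned and
VecADD_unsigned_selfcheck are hypothetical, and whether this restores the
optimization on a given port or GCC version is an assumption, not something
verified here.

```c
#include <assert.h>

/* Hypothetical variant of VecADD: the index has the same type as samples,
   so the loop condition needs no signed-to-unsigned conversion. */
void VecADD_unsigned(const long long *In1, const long long *In2,
                     long long *Out, unsigned int samples)
{
    unsigned int i;
    for (i = 0; i < samples; i++) {
        Out[i] = In1[i] + In2[i];
    }
}

/* Small self-check on three elements; returns 1 if the asserts pass. */
int VecADD_unsigned_selfcheck(void)
{
    long long a[3] = {1, 2, 3};
    long long b[3] = {10, 20, 30};
    long long out[3] = {0, 0, 0};

    VecADD_unsigned(a, b, out, 3u);
    assert(out[0] == 11);
    assert(out[1] == 22);
    assert(out[2] == 33);
    return 1;
}
```

The computed sums are the same as in the original function for any 'samples'
that fits in an int; only the type of the loop counter differs.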


-- 
           Summary: loop iv detection failure, SSA autoincrement
           Product: gcc
           Version: 4.3.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: sergei_lus at yahoo dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38856



* [Bug c/38856] loop iv detection failure, SSA autoincrement
  2009-01-15 18:24 [Bug c/38856] New: loop iv detection failure, SSA autoincrement sergei_lus at yahoo dot com
@ 2009-01-15 18:42 ` pinskia at gcc dot gnu dot org
  2009-01-15 21:03 ` sergei_lus at yahoo dot com
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2009-01-15 18:42 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #1 from pinskia at gcc dot gnu dot org  2009-01-15 18:41 -------
Well, first off, iv-opts at the tree level should be the place where
autoincrement is helped out.  See PR 31849, which I think this is a duplicate
of.



* [Bug c/38856] loop iv detection failure, SSA autoincrement
  2009-01-15 18:24 [Bug c/38856] New: loop iv detection failure, SSA autoincrement sergei_lus at yahoo dot com
  2009-01-15 18:42 ` [Bug c/38856] " pinskia at gcc dot gnu dot org
@ 2009-01-15 21:03 ` sergei_lus at yahoo dot com
  2009-01-15 21:13 ` [Bug middle-end/38856] loop iv detection failure pinskia at gcc dot gnu dot org
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: sergei_lus at yahoo dot com @ 2009-01-15 21:03 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #2 from sergei_lus at yahoo dot com  2009-01-15 21:03 -------
Andrew, thank you for the prompt reply. 

I have seen PR 31849 and used the suggested patch. Without it, autoincrement
would not work at all in either case. But the patch seems to deal with the
case where ivs _were_ detected, and generally alters the cost computation to
take that addressing mode into account, while in this case I see that ivs are
_not_ detected in the first place.  This seems to be a more fundamental issue:
should SSA be made "minimal" at that point, or should iv detection be improved
to "see" through the "aliased" PHI node? I lack better words to describe it.
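
As a rough illustration of the "aliased" PHI nodes, the loop from the first
dump can be replayed by hand in C. In this sketch, i stands for the signed
counter (i_14, i_20) and iu for its unsigned copy (i.0_9, i.0_3). The function
name count_iterations is hypothetical, and the code only mirrors the control
flow of the dump; it says nothing about how GCC's analysis is implemented.

```c
#include <assert.h>

/* Replays the loop structure of the first dump: a guard block (bb 6),
   the body (bb 3), and the latch test (bb 4). Returns the trip count. */
unsigned int count_iterations(unsigned int samples)
{
    int i = 0;                            /* i_10 */
    unsigned int iu = (unsigned int) i;   /* i.0_5 */
    unsigned int trips = 0;

    if (iu < samples) {
        do {
            /* Both PHI results in bb 3 describe the same value:
               iu is always the unsigned conversion of i. */
            assert(iu == (unsigned int) i);
            trips++;
            i = i + 1;                    /* i_20 */
            iu = (unsigned int) i;        /* i.0_3 */
        } while (iu < samples);
    }
    return trips;
}
```

The assertion inside the loop holds on every iteration, which is the sense in
which the second PHI carries no information beyond the first.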



* [Bug middle-end/38856] loop iv detection failure
  2009-01-15 18:24 [Bug c/38856] New: loop iv detection failure, SSA autoincrement sergei_lus at yahoo dot com
  2009-01-15 18:42 ` [Bug c/38856] " pinskia at gcc dot gnu dot org
  2009-01-15 21:03 ` sergei_lus at yahoo dot com
@ 2009-01-15 21:13 ` pinskia at gcc dot gnu dot org
  2009-01-15 21:17 ` pinskia at gcc dot gnu dot org
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2009-01-15 21:13 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #3 from pinskia at gcc dot gnu dot org  2009-01-15 21:13 -------
This bug is a bit funny: autoincrement is not the issue, just detecting the
induction variable's limits correctly.  For example, I noticed that if I
compile this on an LP64 target, the induction variable is found correctly.


-- 

pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|c                           |middle-end
           Keywords|                            |missed-optimization
            Summary|loop iv detection failure,  |loop iv detection failure
                   |SSA autoincrement           |



* [Bug middle-end/38856] loop iv detection failure
  2009-01-15 18:24 [Bug c/38856] New: loop iv detection failure, SSA autoincrement sergei_lus at yahoo dot com
                   ` (2 preceding siblings ...)
  2009-01-15 21:13 ` [Bug middle-end/38856] loop iv detection failure pinskia at gcc dot gnu dot org
@ 2009-01-15 21:17 ` pinskia at gcc dot gnu dot org
  2009-01-16  9:57 ` rguenth at gcc dot gnu dot org
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2009-01-15 21:17 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #4 from pinskia at gcc dot gnu dot org  2009-01-15 21:17 -------
Here is a good testcase:
void
VecADD(
    int *In1,
    int *In2,
    int *Out,
    unsigned int samples
){
  int i;
  for (i = 0; i < samples; i++) {
    Out[i] = In1[i];
  }
}

This testcase should be vectorized with -maltivec -O3 on PowerPC, but it is not,
while it is with -m64.


-- 

pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2009-01-15 21:17:17
               date|                            |



* [Bug middle-end/38856] loop iv detection failure
  2009-01-15 18:24 [Bug c/38856] New: loop iv detection failure, SSA autoincrement sergei_lus at yahoo dot com
                   ` (3 preceding siblings ...)
  2009-01-15 21:17 ` pinskia at gcc dot gnu dot org
@ 2009-01-16  9:57 ` rguenth at gcc dot gnu dot org
  2009-01-16 11:46   ` Andrew Thomas Pinski
  2009-01-16 11:46 ` pinskia at gmail dot com
  2009-01-16 18:53 ` pinskia at gcc dot gnu dot org
  6 siblings, 1 reply; 11+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-01-16  9:57 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #5 from rguenth at gcc dot gnu dot org  2009-01-16 09:57 -------
I think this boils down to the usual POINTER_PLUS fallout.



* [Bug middle-end/38856] loop iv detection failure
  2009-01-15 18:24 [Bug c/38856] New: loop iv detection failure, SSA autoincrement sergei_lus at yahoo dot com
                   ` (4 preceding siblings ...)
  2009-01-16  9:57 ` rguenth at gcc dot gnu dot org
@ 2009-01-16 11:46 ` pinskia at gmail dot com
  2009-01-16 18:53 ` pinskia at gcc dot gnu dot org
  6 siblings, 0 replies; 11+ messages in thread
From: pinskia at gmail dot com @ 2009-01-16 11:46 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #6 from pinskia at gmail dot com  2009-01-16 11:46 -------
Subject: Re:  loop iv detection failure

On Jan 16, 2009, at 1:57 AM, "rguenth at gcc dot gnu dot org"
<gcc-bugzilla@gcc.gnu.org> wrote:

> ------- Comment #5 from rguenth at gcc dot gnu dot org  2009-01-16 09:57 -------
> I think this boils down to the usual POINTER_PLUS fallout.

It failed in 4.1 also, so nope :).

* [Bug middle-end/38856] loop iv detection failure
  2009-01-15 18:24 [Bug c/38856] New: loop iv detection failure, SSA autoincrement sergei_lus at yahoo dot com
                   ` (5 preceding siblings ...)
  2009-01-16 11:46 ` pinskia at gmail dot com
@ 2009-01-16 18:53 ` pinskia at gcc dot gnu dot org
  6 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2009-01-16 18:53 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #7 from pinskia at gcc dot gnu dot org  2009-01-16 18:53 -------
Here is a testcase where the IV can be found, but we end up with two copies of
the same register in the loop:
int f(unsigned long s, unsigned long *c)
{
  long j;
  for(j = 0;j < s; j++)
    {
      c[j]=j;
    }
}

Which is basically the same issue.



* [Bug middle-end/38856] loop iv detection failure
       [not found] <bug-38856-4@http.gcc.gnu.org/bugzilla/>
  2010-12-01 20:49 ` slarin at codeaurora dot org
@ 2021-11-28  4:31 ` pinskia at gcc dot gnu.org
  1 sibling, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-11-28  4:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=38856

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to fail|                            |

--- Comment #9 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I think the original issue has been fixed since GCC 4.5.0 or so.


* [Bug middle-end/38856] loop iv detection failure
       [not found] <bug-38856-4@http.gcc.gnu.org/bugzilla/>
@ 2010-12-01 20:49 ` slarin at codeaurora dot org
  2021-11-28  4:31 ` pinskia at gcc dot gnu.org
  1 sibling, 0 replies; 11+ messages in thread
From: slarin at codeaurora dot org @ 2010-12-01 20:49 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38856

Sergei Larin <slarin at codeaurora dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |slarin at codeaurora dot
                   |                            |org

--- Comment #8 from Sergei Larin <slarin at codeaurora dot org> 2010-12-01 20:48:47 UTC ---

Hello, 

  This has not been touched in a while; nevertheless, the issue still exists
with 4.5.1.


