public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/23855] New: Should not do loop header copying for inner loop
@ 2005-09-13  9:46 rguenth at gcc dot gnu dot org
  2005-09-13 13:12 ` [Bug tree-optimization/23855] " rguenth at gcc dot gnu dot org
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2005-09-13  9:46 UTC (permalink / raw)
  To: gcc-bugs

void bar(void);
void foo(int ie, int je)
{
  int i, j;
  for (j=0; j<je; ++j)
    for (i=0; i<ie; ++i)
      bar();
}

should _not_ be transformed to

foo (ie, je)
{ 
  int j;
  int i;
  
<bb 0>:
  if (je > 0) goto <L23>; else goto <L5>;
  
<L23>:;
  j = 0;
  goto <bb 3> (<L2>);

<L22>:;
  i = 0;

<L1>:;
  bar ();
  i = i + 1;
  if (ie != i) goto <L1>; else goto <L3>;

<L3>:;
  j = j + 1;
  if (je != j) goto <L2>; else goto <L5>;

<L2>:;
  if (ie > 0) goto <L22>; else goto <L3>;

<L5>:;
  return;

}

i.e. containing an loop-invariant check if (ie > 0).

Both DOM and copy-header do this transformation.  Disabling both
we get

;; Function foo (foo)
  
foo (ie, je)
{
  int j;
  int i;

<bb 0>:
  j = 0;
  goto <bb 4> (<L4>);

<L1>:;
  bar ();
  i = i + 1;

<L2>:;
  if (i < ie) goto <L1>; else goto <L3>;

<L3>:;
  j = j + 1;

<L4>:;
  if (j < je) goto <L8>; else goto <L5>;

<L8>:;
  i = 0;
  goto <bb 2> (<L2>);

<L5>:;
  return;

}

which is a _lot_ faster for small ie.  Optimally we would hoist the
loop invariant check out of the j loop.

-- 
           Summary: Should not do loop header copying for inner loop
           Product: gcc
           Version: 4.1.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P2
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: rguenth at gcc dot gnu dot org
                CC: gcc-bugs at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23855


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug tree-optimization/23855] Should not do loop header copying for inner loop
  2005-09-13  9:46 [Bug tree-optimization/23855] New: Should not do loop header copying for inner loop rguenth at gcc dot gnu dot org
@ 2005-09-13 13:12 ` rguenth at gcc dot gnu dot org
  2005-09-13 13:30 ` rakdver at gcc dot gnu dot org
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2005-09-13 13:12 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From rguenth at gcc dot gnu dot org  2005-09-13 13:12 -------
Unswitching should clean this up, but unfortunately(?) only looks at innermost
loops.  For a reason it seems, just removing this checks results in an ICE.
Testcase for unswitching:

void bar(void);
void foo(int ie, int je)
{
  int j;
  int i;

  if (je <= 0)
    goto L5;

  j = 0;
L2:
  if (ie <= 0)
    goto L3;

  i = 0;
L1:
  bar ();
  i = i + 1;
  if (ie != i) goto L1;

L3:
  j = j + 1;
  if (je != j) goto L2;

L5:
  return;
}


-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rakdver at atrey dot karlin
                   |                            |dot mff dot cuni dot cz


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23855


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug tree-optimization/23855] Should not do loop header copying for inner loop
  2005-09-13  9:46 [Bug tree-optimization/23855] New: Should not do loop header copying for inner loop rguenth at gcc dot gnu dot org
  2005-09-13 13:12 ` [Bug tree-optimization/23855] " rguenth at gcc dot gnu dot org
@ 2005-09-13 13:30 ` rakdver at gcc dot gnu dot org
  2005-09-13 13:47 ` rguenth at gcc dot gnu dot org
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: rakdver at gcc dot gnu dot org @ 2005-09-13 13:30 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From rakdver at gcc dot gnu dot org  2005-09-13 13:29 -------
It is not clear to me why you find the code without header copying better -- 
number of checks of each condition is exactly the same in both cases, and with 
right ordering of basic blocks, there should be one less jump executed. 
Depending on branch prediction on the target, either the original code or the 
code with copied headers may be faster, but it is hard to determine which one. 
 
Unswitching of non-innermost loops should not be hard to implement (most of 
code is in place, it only was quite some time that it was tested for anything 
but innermost loops, so there probably is some bitrot; and a few pieces on tree 
level may be missing). 

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23855


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug tree-optimization/23855] Should not do loop header copying for inner loop
  2005-09-13  9:46 [Bug tree-optimization/23855] New: Should not do loop header copying for inner loop rguenth at gcc dot gnu dot org
  2005-09-13 13:12 ` [Bug tree-optimization/23855] " rguenth at gcc dot gnu dot org
  2005-09-13 13:30 ` rakdver at gcc dot gnu dot org
@ 2005-09-13 13:47 ` rguenth at gcc dot gnu dot org
  2005-09-13 14:01 ` pinskia at gcc dot gnu dot org
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2005-09-13 13:47 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From rguenth at gcc dot gnu dot org  2005-09-13 13:47 -------
Well, it is the case that I have some numerical application that has
such loops and the case of small ie (1 or 2) does happen during boundary
updates, so instead of

  if (ie <= 0)
    return;
  for (j=0; j<je; ++j)
    i=0;
    do {
      foo();
      i=i+1;
    } while (i<ie);

and executing the if (ie <= 0) exactly one time, we execute it
je times, which is as often as half times the number of total iterations
of the inner loop body.  You are probably right that the execution number
without loop header copying is exactly the same, but unswitching this
condition certainly helps for small ie (where loop versioning with constant
ie and completely unrolled inner loop would help even more, of course).

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23855


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug tree-optimization/23855] Should not do loop header copying for inner loop
  2005-09-13  9:46 [Bug tree-optimization/23855] New: Should not do loop header copying for inner loop rguenth at gcc dot gnu dot org
                   ` (2 preceding siblings ...)
  2005-09-13 13:47 ` rguenth at gcc dot gnu dot org
@ 2005-09-13 14:01 ` pinskia at gcc dot gnu dot org
  2005-09-13 14:09 ` rguenth at gcc dot gnu dot org
  2005-09-13 20:08 ` pinskia at gcc dot gnu dot org
  5 siblings, 0 replies; 7+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2005-09-13 14:01 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2005-09-13 14:01 -------
On PPC, we get optimal and almost unswitched loop:
L4:
        ble- cr4,L7
        li r31,0
L6:
        addi r31,r31,1
        bl _bar
        cmpw cr7,r30,r31
        bne+ cr7,L6
L7:
        addi r29,r29,1
        cmpw cr7,r28,r29
        bne+ cr7,L4


cr4 is a callee save register.

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23855


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug tree-optimization/23855] Should not do loop header copying for inner loop
  2005-09-13  9:46 [Bug tree-optimization/23855] New: Should not do loop header copying for inner loop rguenth at gcc dot gnu dot org
                   ` (3 preceding siblings ...)
  2005-09-13 14:01 ` pinskia at gcc dot gnu dot org
@ 2005-09-13 14:09 ` rguenth at gcc dot gnu dot org
  2005-09-13 20:08 ` pinskia at gcc dot gnu dot org
  5 siblings, 0 replies; 7+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2005-09-13 14:09 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From rguenth at gcc dot gnu dot org  2005-09-13 14:09 -------
Heh - you unswitched the comparison but not the jump!

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23855


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug tree-optimization/23855] Should not do loop header copying for inner loop
  2005-09-13  9:46 [Bug tree-optimization/23855] New: Should not do loop header copying for inner loop rguenth at gcc dot gnu dot org
                   ` (4 preceding siblings ...)
  2005-09-13 14:09 ` rguenth at gcc dot gnu dot org
@ 2005-09-13 20:08 ` pinskia at gcc dot gnu dot org
  5 siblings, 0 replies; 7+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2005-09-13 20:08 UTC (permalink / raw)
  To: gcc-bugs



-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23855


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2005-09-13 20:08 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-09-13  9:46 [Bug tree-optimization/23855] New: Should not do loop header copying for inner loop rguenth at gcc dot gnu dot org
2005-09-13 13:12 ` [Bug tree-optimization/23855] " rguenth at gcc dot gnu dot org
2005-09-13 13:30 ` rakdver at gcc dot gnu dot org
2005-09-13 13:47 ` rguenth at gcc dot gnu dot org
2005-09-13 14:01 ` pinskia at gcc dot gnu dot org
2005-09-13 14:09 ` rguenth at gcc dot gnu dot org
2005-09-13 20:08 ` pinskia at gcc dot gnu dot org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).