public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/42720]  New: Empty loop generated at unswitch-loops with -O2 -fprofile-use
@ 2010-01-13  2:47 jingyu at google dot com
  2010-01-29 23:34 ` [Bug tree-optimization/42720] Problematic condition simplification logic at unswitch-loops pass jingyu at google dot com
                   ` (14 more replies)
  0 siblings, 15 replies; 16+ messages in thread
From: jingyu at google dot com @ 2010-01-13  2:47 UTC (permalink / raw)
  To: gcc-bugs

The bug is triggered with -O2 -fprofile-use.

test case, loop.cpp:

int fun_b(int hbs[], int num, void *obj) {
 int i;
 int s = 0;
 for (i = 0; i < num; i++) {
   if (obj != 0) {
     if ((int)obj - hbs[i] > 0) {
       s += hbs[i];
     }
   }
 }
 return s;
}

int main () {
 int i;
 int s = 0;
 int hbs[100];
 for (i = 0; i < 100; ++i) {
   hbs[i] = i * 2000 + 100000;
 }
 for (i = 0; i < 20; ++i) {
   s += fun_b (hbs, 100, &hbs[i]);
 }
 return s;
}

Profile the program. Apparently the loop inside fun_b() is hot.

$arm-eabi-g++ loop.cpp -O2 -fprofile-use --save-temps -c -o loop.o

We we see an empty loop (.L5) if obj==0, in function fun_b.

_Z5fun_bPiiPv:
       @ args = 0, pretend = 0, frame = 0
       @ frame_needed = 0, uses_anonymous_args = 0
       @ link register save eliminated.
       cmp     r1, #0
       stmfd   sp!, {r4, r5}
       mov     r3, r0
       ble     .L57
       cmp     r2, #0   <--- "if (obj != 0)" is moved out of loop
       beq     .L5
     ....
.L3:
       ldmfd   sp!, {r4, r5}
       bx      lr
.L5:                     ;;  if (obj == 0), empty loop
       add     r2, r2, #1    ;;
       cmp     r2, r1        ;;
       bne     .L5           ;;
.L57:
       mov     r0, #0
       b       .L3

The empty loop (.L5) should have been eliminated. I have tested -O2 without
-fprofile-use, where the empty loop is gone.

I find that the root cause of the inefficiency of -O2 FDO is that during
unswitch-loops, the simplification of loop conditions is missed when FDO is on. 

Let's say,
Version A: "-O2 -funswitch-loops", which does right thing.
Version B: "-O2 -fprofile-use". Version B generates an empty loop which should
be eliminated.

Before switch-loop pass, the loop (inner-most, hot) is

 loop {
    if (obj != 0) {
      ...
    }
  }

Both version A and version B perform one pass of unswitch-loop on this loop
body.
In function tree_unswitch_single_loop(),
after "nloop = tree_unswitch_loop (loop, bbs[i], cond)", the loop becomes

if (obj != 0) {
 loop {               <---- original copy of the loop
   if (obj != 0) {
     ...
   }
 }
} else {
 loop {              <----- "nloop": a new copy of the loop
    if (obj != 0) {
      ...
    }
  }
}

Then, right before the end of tree_unswitch_single_loop(), gcc recursively
calls itself on modified loops.
  tree_unswitch_single_loop (nloop, num + 1);

>From here, Version A and Version B starts to perform differently.

For Version A ("-O2 -funswitch-loops"), gcc conditions looking for
unswitch-loop opportunity in the new loop "nloop".
It finds that the condition of the new loop can be simplified. Since obj is 0
when it comes to the new loop, gcc
replaces obj by 0. Thus the loop becomes

if (obj != 0) {
 loop {               <---- original copy of the loop
   if (obj != 0) {
     ...
   }
 }
} else {
 loop {                   <----- "nloop": a new copy of the loop
    if (0 != 0) {     <--- obj is replaced by "0"
      ...
    }
  }
}
Therefore, in the TODO pass cleanup-cfg, the "nloop" is entirely removed.

However, for Version B ("-O2 -fprofile-use"), gcc finds that the "nloop" is a
cold loop, so it returns immediately, without checking if the condition can be
simplified. Thus nloop is not cleaned up by the following cleanup-cfg pass and
results in an empty loop.

The problematic code in is unswitch_single_loop() in loop-unswitch.c.

static void
unswitch_single_loop(struct loop *loop, ...)
{ ...
  /* Do not unswitch in cold areas.  */
  if (optimize_loop_for_size_p (loop))
    {
       dump
       return;
     }
  ...
  do
    {   ...
       /* Check whether the result can be predicted.  */
       for (acond = cond_checked; acond; acond = XEXP (acond, 1))
           simplify_using_condition (XEXP (acond, 0), &cond, NULL);
          ...
      } while (repeat);
    ...
  /* Unswitch the loop on this condition.  */
  nloop = unswitch_loop (loop, bbs[i], cond, cinsn);
  ...

  /* Invoke itself on modified loops.  */
  unswitch_single_loop (nloop, rconds, num + 1);
  unswitch_single_loop (loop, conds, num + 1);
  ...
}

To fix the empty loop problem, my thought is to propagate the conditions
immediately after nloop is inserted.

Any suggestion?

Thanks,
Jing


-- 
           Summary: Empty loop generated at unswitch-loops with -O2 -
                    fprofile-use
           Product: gcc
           Version: 4.5.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: jingyu at google dot com
 GCC build triplet: X86_64-linux-gnu
  GCC host triplet: X86_64-linux-gnu
GCC target triplet: arm-unknown-eabi


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42720


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug tree-optimization/42720] Problematic condition simplification logic at unswitch-loops pass
  2010-01-13  2:47 [Bug tree-optimization/42720] New: Empty loop generated at unswitch-loops with -O2 -fprofile-use jingyu at google dot com
@ 2010-01-29 23:34 ` jingyu at google dot com
  2010-01-29 23:38 ` pinskia at gcc dot gnu dot org
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: jingyu at google dot com @ 2010-01-29 23:34 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #1 from jingyu at google dot com  2010-01-29 23:33 -------
The problem also exists on non-FDO use. So I changed the title of the bug.

Since unswitch-loop is set at -O3, I give -O3 to the command line. Parameter
"max-unswitch-level" means we only do unswitch-loop at most once.

$arm-eabi-g++ loop.cpp -O3 --param max-unswitch-level=0 -S

We can see from the assembly code that when "if (obj==0)", the empty loop is
not eliminated.

.LFB0:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        cmp     r1, #0
        stmfd   sp!, {r4, r5}
.LCFI0:
        mov     r3, r0
        ble     .L11
        cmp     r2, #0  <---"if (obj != 0)" is moved out of loop

        movne   r0, #0
        movne   ip, r0
        beq     .L5
...
.L5:
        add     r2, r2, #1 ;;  if (obj == 0), empty loop
        cmp     r2, r1     ;;
        bne     .L5        ;;
.L11:
        mov     r0, #0
        b       .L3


I made a mistake when I wrote the first bug report for this issue. The
problematic logic is on tree_unswitch_single_loop() (tree-ssa-loop-unswitch.c).

tree_unswitch_single_loop (loop, num):
  check if num > max-unswitch-level, return false.
  check if optimize_loop_for_size_p(loop), return false.
  check if loop is not innermost, return false.
  check if loop size > max_unswitch_insn, return false.
  while(1) {
    find a condition cond inside loop that can be unswitched.
    if cond can be simplified by loop entry checks, simplify cond and continue.
    Otherwise, break
  }
  unswitch loop. Now nloop is a copy of loop.
  ...
  tree_unswitch_single_loop (nloop, num+1);
  tree_unswitch_single_loop (loop, num+1);


Notice that after loop is unswitched, the conditions of loop and nloop are not
simplified. The condition simplification of nloop and loop are expected to be
done on the recursive call of tree_unswitch_single_loop(nloop, num+1), and
tree_unswitch_single_loop (loop, num+1). However, on the recursive call, if
something has changed and makes any "check" fail, the condition simplification
will not be done.

Ideally, we need to simplify loop and nloop conditions immediately after
unswitch. That is the correct thing we want to do. But given the current logic,
I am proposing another type of fix.

There are four checks at the beginning of tree_unswitch_single_loop(). Two
checks may fail during recursive calls:
  check if num > max-unswitch-level, return false.
  check if optimize_loop_for_size_p(loop), return false.

So I think we can simply move these two conditions after condition
simplification loop. It is always good to simplify conditions inside the loop
even if the loop is cold.

I pasted here the patch on gcc-4.4.0. I have tested the patch on the FDO case
and the -O3 case. The empty loop is gone in both cases.

185,192d184
<   /* Do not unswitch too much.  */
<   if (num > PARAM_VALUE (PARAM_MAX_UNSWITCH_LEVEL))
<     {
<       if (dump_file && (dump_flags & TDF_DETAILS))
<       fprintf (dump_file, ";; Not unswitching anymore, hit max level\n");
<       return false;
<     }
< 
201,208d192
<   /* Do not unswitch in cold regions.  */
<   if (optimize_loop_for_size_p (loop))
<     {
<       if (dump_file && (dump_flags & TDF_DETAILS))
<       fprintf (dump_file, ";; Not unswitching cold loops\n");
<       return false;
<     }
< 
254a239,254
>   /* Do not unswitch too much.  */
>   if (num > PARAM_VALUE (PARAM_MAX_UNSWITCH_LEVEL))
>     {
>       if (dump_file && (dump_flags & TDF_DETAILS))
>       fprintf (dump_file, ";; Not unswitching anymore, hit max level\n");
>       return false;
>     }
> 
>   /* Do not unswitch in cold regions.  */
>   if (optimize_loop_for_size_p (loop))
>     {
>       if (dump_file && (dump_flags & TDF_DETAILS))
>       fprintf (dump_file, ";; Not unswitching cold loops\n");
>       return false;
>     }
> 

Do you think if the patch would work?

Thanks!


-- 

jingyu at google dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to fail|                            |4.5.0 4.4.0
            Summary|Empty loop generated at     |Problematic condition
                   |unswitch-loops with -O2 -   |simplification logic at
                   |fprofile-use                |unswitch-loops pass


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42720


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug tree-optimization/42720] Problematic condition simplification logic at unswitch-loops pass
  2010-01-13  2:47 [Bug tree-optimization/42720] New: Empty loop generated at unswitch-loops with -O2 -fprofile-use jingyu at google dot com
  2010-01-29 23:34 ` [Bug tree-optimization/42720] Problematic condition simplification logic at unswitch-loops pass jingyu at google dot com
@ 2010-01-29 23:38 ` pinskia at gcc dot gnu dot org
  2010-01-29 23:59 ` jingyu at google dot com
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2010-01-29 23:38 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #2 from pinskia at gcc dot gnu dot org  2010-01-29 23:38 -------
On x86, the empty loop is removed at -O3 on the tree level:
<bb 3>:
  if (obj_7(D) != 0B)
    goto <bb 4>;
  else
    goto <bb 5>;

<bb 4>:
  # i_18 = PHI <0(3), i_20(4)>
  # s_23 = PHI <0(3), s_2(4)>
  obj.0_8 = (int) obj_7(D);
  D.2053_25 = (unsigned int) i_18;
  D.1247_13 = MEM[base: hbs_11(D), index: D.2053_25, step: 4];
  D.1248_14 = obj.0_8 - D.1247_13;
  s_19 = s_23 + D.1247_13;
  s_2 = [cond_expr] D.1248_14 > 0 ? s_19 : s_23;
  i_20 = i_18 + 1;
  if (i_20 != num_6(D))
    goto <bb 4>;
  else
    goto <bb 5>;

<bb 5>:
  # s_3 = PHI <s_2(4), 0(3)>


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42720


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug tree-optimization/42720] Problematic condition simplification logic at unswitch-loops pass
  2010-01-13  2:47 [Bug tree-optimization/42720] New: Empty loop generated at unswitch-loops with -O2 -fprofile-use jingyu at google dot com
  2010-01-29 23:34 ` [Bug tree-optimization/42720] Problematic condition simplification logic at unswitch-loops pass jingyu at google dot com
  2010-01-29 23:38 ` pinskia at gcc dot gnu dot org
@ 2010-01-29 23:59 ` jingyu at google dot com
  2010-01-30  0:05 ` pinskia at gcc dot gnu dot org
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: jingyu at google dot com @ 2010-01-29 23:59 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #3 from jingyu at google dot com  2010-01-29 23:59 -------
You must set "--param max-unswitch-level=0" to trigger the bug in non-FDO use.

I just tried gcc-4.2.4 on X86 platform. The problem exists.

$ gcc loop.cpp -O3  --param max-unswitch-level=0 -m32 -S
        testl   %eax, %eax
        jne     .L5
        xorl    %eax, %eax
        .p2align 4,,7
.L7:
        addl    $1, %eax      <--- empty loop
        cmpl    %edx, %eax
        jne     .L7
        xorl    %ecx, %ecx

By default, max-unswitch-level is 3. So if you don't change max-unswitch-level,
after unswitch loop once, the conditions of nloop and loop can be simplified by
recursive calls.

Rather than writing a complicated test case which will do unswitch-loop 4
times, I would like to change the max-unswitch-level=0 to trigger the bug
early.

In FDO use, you can reproduce the bug with "-O2 -fprofile-use" on x86 with
gcc-4.4.0 or higher. I have checked gcc-4.2.x. The check "if
optimize_loop_for_size_p(loop)" is not in gcc-4.2.x. So gcc-4.2.x does not have
this problem in FDO use.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42720


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug tree-optimization/42720] Problematic condition simplification logic at unswitch-loops pass
  2010-01-13  2:47 [Bug tree-optimization/42720] New: Empty loop generated at unswitch-loops with -O2 -fprofile-use jingyu at google dot com
                   ` (2 preceding siblings ...)
  2010-01-29 23:59 ` jingyu at google dot com
@ 2010-01-30  0:05 ` pinskia at gcc dot gnu dot org
  2010-01-30  0:21 ` jingyu at google dot com
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2010-01-30  0:05 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #4 from pinskia at gcc dot gnu dot org  2010-01-30 00:05 -------
(In reply to comment #3)
> You must set "--param max-unswitch-level=0" to trigger the bug in non-FDO use.
> 

So the problem is really if (optimize_loop_for_size_p (loop)) .  I think you
need to figure out why that is returning true.  I can see why to some extent as
we have a pointer comparison against a NULL pointer; we assume the pointer will
not be null when duplicating the loop.  


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42720


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug tree-optimization/42720] Problematic condition simplification logic at unswitch-loops pass
  2010-01-13  2:47 [Bug tree-optimization/42720] New: Empty loop generated at unswitch-loops with -O2 -fprofile-use jingyu at google dot com
                   ` (3 preceding siblings ...)
  2010-01-30  0:05 ` pinskia at gcc dot gnu dot org
@ 2010-01-30  0:21 ` jingyu at google dot com
  2010-01-30 10:59 ` rguenth at gcc dot gnu dot org
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: jingyu at google dot com @ 2010-01-30  0:21 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #5 from jingyu at google dot com  2010-01-30 00:21 -------
>So the problem is really if (optimize_loop_for_size_p (loop)) .  I think you
>need to figure out why that is returning true.

That's because after unswitch-loop, one copy of the loop becomes cold. Let's
look at loop.cpp.

Before switch-loop pass, the loop (inner-most, hot) is

 loop {        
    if (obj != 0) {
      ...      <---- hot!
    }
  }

In function tree_unswitch_single_loop(),
after "nloop = tree_unswitch_loop (loop, bbs[i], cond)", the loop becomes

if (obj != 0) {
 loop {               <---- original copy of the loop
   if (obj != 0) { 
     ...    <---  hot!
   }
 }
} else {  <--- cold! becuase obj==0 is rarely seen
 loop {              <----- "nloop": a new copy of the loop
    if (obj != 0) {
      ...
    }
  }
}

nloop becomes a cold loop, but we still want to simplify its condition.
Otherwise, nloop becomes an empty loop in the end.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42720


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug tree-optimization/42720] Problematic condition simplification logic at unswitch-loops pass
  2010-01-13  2:47 [Bug tree-optimization/42720] New: Empty loop generated at unswitch-loops with -O2 -fprofile-use jingyu at google dot com
                   ` (4 preceding siblings ...)
  2010-01-30  0:21 ` jingyu at google dot com
@ 2010-01-30 10:59 ` rguenth at gcc dot gnu dot org
  2010-01-30 11:00 ` rguenth at gcc dot gnu dot org
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2010-01-30 10:59 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #6 from rguenth at gcc dot gnu dot org  2010-01-30 10:59 -------
The interesting thing is that the empty loop is not removed by
the control-dependent DCE pass that follows the 2nd VRP.

This happens because


<bb 3>:
  if (obj_7(D) != 0B)
    goto <bb 5>;
  else
    goto <bb 4>;

<bb 4>:
  # i_36 = PHI <0(3)>

<bb 10>:
  # i_25 = PHI <i_36(4), i_35(12)>
  i_35 = i_25 + 1;
  if (i_35 != num_6(D))
    goto <bb 12>;
  else
    goto <bb 8>;

<bb 12>:
  goto <bb 10>;

...

<bb 7>:
Invalid sum of incoming frequencies 6888, should be 7735
  # s_2 = PHI <s_19(6), s_23(11)>
  i_20 = i_18 + 1;
  if (i_20 != num_6(D))
    goto <bb 13>;
  else
    goto <bb 8>;

...

<bb 8>:
  # s_3 = PHI <s_2(7), 0(10)>


the PHI node defining s_3 marks the controlling statement necessary
which in turn marks the loop necessary.  Now the question is if this
is an inherent limitation of CD-DCE or if the controlling stmt
should be that in BB3 instead (the immediate common dominator of
bb7 and bb10 is bb3).  Steven?  You implemented that CD-DCE stuff?


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu dot
                   |                            |org, hubicka at gcc dot gnu
                   |                            |dot org
           Keywords|                            |missed-optimization


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42720


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug tree-optimization/42720] Problematic condition simplification logic at unswitch-loops pass
  2010-01-13  2:47 [Bug tree-optimization/42720] New: Empty loop generated at unswitch-loops with -O2 -fprofile-use jingyu at google dot com
                   ` (5 preceding siblings ...)
  2010-01-30 10:59 ` rguenth at gcc dot gnu dot org
@ 2010-01-30 11:00 ` rguenth at gcc dot gnu dot org
  2010-01-30 12:01 ` rakdver at gcc dot gnu dot org
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2010-01-30 11:00 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #7 from rguenth at gcc dot gnu dot org  2010-01-30 11:00 -------
Oh, and Zdenek might have an idea about the condition simplification in
unswitching.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rakdver at gcc dot gnu dot
                   |                            |org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42720


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug tree-optimization/42720] Problematic condition simplification logic at unswitch-loops pass
  2010-01-13  2:47 [Bug tree-optimization/42720] New: Empty loop generated at unswitch-loops with -O2 -fprofile-use jingyu at google dot com
                   ` (6 preceding siblings ...)
  2010-01-30 11:00 ` rguenth at gcc dot gnu dot org
@ 2010-01-30 12:01 ` rakdver at gcc dot gnu dot org
  2010-01-30 16:07 ` rguenth at gcc dot gnu dot org
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: rakdver at gcc dot gnu dot org @ 2010-01-30 12:01 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #8 from rakdver at gcc dot gnu dot org  2010-01-30 12:01 -------
(In reply to comment #7)
> Oh, and Zdenek might have an idea about the condition simplification in
> unswitching.

I agree that some of the checks in tree_unswitch_single_loop are badly placed
-- it does not make sense to check them repeatedly in the recursion.  I'd
suggest to move them to tree_ssa_unswitch_loops, i.e.,

Index: tree-ssa-loop-unswitch.c
===================================================================
*** tree-ssa-loop-unswitch.c    (revision 155960)
--- tree-ssa-loop-unswitch.c    (working copy)
*************** tree_ssa_unswitch_loops (void)
*** 88,93 ****
--- 88,113 ----
    /* Go through inner loops (only original ones).  */
    FOR_EACH_LOOP (li, loop, LI_ONLY_INNERMOST)
      {
+       if (dump_file && (dump_flags & TDF_DETAILS))
+       fprintf (dump_file, ";; Considering loop %d\n", loop->num);
+
+       /* Do not unswitch in cold regions.  */
+       if (optimize_loop_for_size_p (loop))
+       {
+         if (dump_file && (dump_flags & TDF_DETAILS))
+           fprintf (dump_file, ";; Not unswitching cold loops\n");
+         continue;
+       }
+
+       /* The loop should not be too large, to limit code growth.  */
+       if (tree_num_loop_insns (loop, &eni_size_weights)
+         > (unsigned) PARAM_VALUE (PARAM_MAX_UNSWITCH_INSNS))
+       {
+         if (dump_file && (dump_flags & TDF_DETAILS))
+           fprintf (dump_file, ";; Not unswitching, loop too big\n");
+         continue;
+       }
+
        changed |= tree_unswitch_single_loop (loop, 0);
      }

*************** tree_unswitch_single_loop (struct loop *
*** 189,219 ****
        return false;
      }

-   /* Only unswitch innermost loops.  */
-   if (loop->inner)
-     {
-       if (dump_file && (dump_flags & TDF_DETAILS))
-       fprintf (dump_file, ";; Not unswitching, not innermost loop\n");
-       return false;
-     }
-
-   /* Do not unswitch in cold regions.  */
-   if (optimize_loop_for_size_p (loop))
-     {
-       if (dump_file && (dump_flags & TDF_DETAILS))
-       fprintf (dump_file, ";; Not unswitching cold loops\n");
-       return false;
-     }
-
-   /* The loop should not be too large, to limit code growth.  */
-   if (tree_num_loop_insns (loop, &eni_size_weights)
-       > (unsigned) PARAM_VALUE (PARAM_MAX_UNSWITCH_INSNS))
-     {
-       if (dump_file && (dump_flags & TDF_DETAILS))
-       fprintf (dump_file, ";; Not unswitching, loop too big\n");
-       return false;
-     }
-
    i = 0;
    bbs = get_loop_body (loop);

--- 209,214 ----


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42720


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug tree-optimization/42720] Problematic condition simplification logic at unswitch-loops pass
  2010-01-13  2:47 [Bug tree-optimization/42720] New: Empty loop generated at unswitch-loops with -O2 -fprofile-use jingyu at google dot com
                   ` (7 preceding siblings ...)
  2010-01-30 12:01 ` rakdver at gcc dot gnu dot org
@ 2010-01-30 16:07 ` rguenth at gcc dot gnu dot org
  2010-01-30 16:14 ` steven at gcc dot gnu dot org
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2010-01-30 16:07 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #9 from rguenth at gcc dot gnu dot org  2010-01-30 16:06 -------
(In reply to comment #6)
> The interesting thing is that the empty loop is not removed by
> the control-dependent DCE pass that follows the 2nd VRP.
> 
> This happens because
> 
> 
> <bb 3>:
>   if (obj_7(D) != 0B)
>     goto <bb 5>;
>   else
>     goto <bb 4>;
> 
> <bb 4>:
>   # i_36 = PHI <0(3)>
> 
> <bb 10>:
>   # i_25 = PHI <i_36(4), i_35(12)>
>   i_35 = i_25 + 1;
>   if (i_35 != num_6(D))
>     goto <bb 12>;
>   else
>     goto <bb 8>;
> 
> <bb 12>:
>   goto <bb 10>;
> 
> ...
> 
> <bb 7>:
> Invalid sum of incoming frequencies 6888, should be 7735
>   # s_2 = PHI <s_19(6), s_23(11)>
>   i_20 = i_18 + 1;
>   if (i_20 != num_6(D))
>     goto <bb 13>;
>   else
>     goto <bb 8>;
> 
> ...
> 
> <bb 8>:
>   # s_3 = PHI <s_2(7), 0(10)>
> 
> 
> the PHI node defining s_3 marks the controlling statement necessary
> which in turn marks the loop necessary.  Now the question is if this
> is an inherent limitation of CD-DCE or if the controlling stmt
> should be that in BB3 instead (the immediate common dominator of
> bb7 and bb10 is bb3).  Steven?  You implemented that CD-DCE stuff?

-> PR42906


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42720


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug tree-optimization/42720] Problematic condition simplification logic at unswitch-loops pass
  2010-01-13  2:47 [Bug tree-optimization/42720] New: Empty loop generated at unswitch-loops with -O2 -fprofile-use jingyu at google dot com
                   ` (8 preceding siblings ...)
  2010-01-30 16:07 ` rguenth at gcc dot gnu dot org
@ 2010-01-30 16:14 ` steven at gcc dot gnu dot org
  2010-01-30 23:33 ` jingyu at google dot com
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: steven at gcc dot gnu dot org @ 2010-01-30 16:14 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #10 from steven at gcc dot gnu dot org  2010-01-30 16:14 -------
That "CD-DCE stuff" has by now deviated so far from the text book
implementation that I don't recognize it anymore at all.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42720


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug tree-optimization/42720] Problematic condition simplification logic at unswitch-loops pass
  2010-01-13  2:47 [Bug tree-optimization/42720] New: Empty loop generated at unswitch-loops with -O2 -fprofile-use jingyu at google dot com
                   ` (9 preceding siblings ...)
  2010-01-30 16:14 ` steven at gcc dot gnu dot org
@ 2010-01-30 23:33 ` jingyu at google dot com
  2010-02-02 23:57 ` jingyu at google dot com
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: jingyu at google dot com @ 2010-01-30 23:33 UTC (permalink / raw)
  To: gcc-bugs

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain, Size: 524 bytes --]



------- Comment #11 from jingyu at google dot com  2010-01-30 23:33 -------
Subject: Re:  Problematic condition 
        simplification logic at unswitch-loops pass

> I agree that some of the checks in tree_unswitch_single_loop are badly placed
> -- it does not make sense to check them repeatedly in the recursion.  I'd
> suggest to move them to tree_ssa_unswitch_loops, i.e.,
>

I agree on this modification. Theses three checks only need to be checked once.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42720


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug tree-optimization/42720] Problematic condition simplification logic at unswitch-loops pass
  2010-01-13  2:47 [Bug tree-optimization/42720] New: Empty loop generated at unswitch-loops with -O2 -fprofile-use jingyu at google dot com
                   ` (10 preceding siblings ...)
  2010-01-30 23:33 ` jingyu at google dot com
@ 2010-02-02 23:57 ` jingyu at google dot com
  2010-02-07 20:28 ` pinskia at gcc dot gnu dot org
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: jingyu at google dot com @ 2010-02-02 23:57 UTC (permalink / raw)
  To: gcc-bugs

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain, Size: 783 bytes --]



------- Comment #12 from jingyu at google dot com  2010-02-02 23:57 -------
Subject: Re:  Problematic condition 
        simplification logic at unswitch-loops pass

Zdenek,

I did dejagnu tests on your patch. It gave no regression on
"--target_board=arm-sim/thumb".
Do you think this patch will be accepted when GCC enters stage1?

Thanks,
Jing


On Sat, Jan 30, 2010 at 3:33 PM, Jing Yu <jingyu@google.com> wrote:
>> I agree that some of the checks in tree_unswitch_single_loop are badly placed
>> -- it does not make sense to check them repeatedly in the recursion.  I'd
>> suggest to move them to tree_ssa_unswitch_loops, i.e.,
>>
>
> I agree on this modification. Theses three checks only need to be checked once.
>


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42720


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug tree-optimization/42720] Problematic condition simplification logic at unswitch-loops pass
  2010-01-13  2:47 [Bug tree-optimization/42720] New: Empty loop generated at unswitch-loops with -O2 -fprofile-use jingyu at google dot com
                   ` (11 preceding siblings ...)
  2010-02-02 23:57 ` jingyu at google dot com
@ 2010-02-07 20:28 ` pinskia at gcc dot gnu dot org
  2010-04-08 18:17 ` jingyu at gcc dot gnu dot org
  2010-05-13 18:09 ` jingyu at google dot com
  14 siblings, 0 replies; 16+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2010-02-07 20:28 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #13 from pinskia at gcc dot gnu dot org  2010-02-07 20:28 -------
Confirmed.


-- 

pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2010-02-07 20:28:09
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42720


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug tree-optimization/42720] Problematic condition simplification logic at unswitch-loops pass
  2010-01-13  2:47 [Bug tree-optimization/42720] New: Empty loop generated at unswitch-loops with -O2 -fprofile-use jingyu at google dot com
                   ` (12 preceding siblings ...)
  2010-02-07 20:28 ` pinskia at gcc dot gnu dot org
@ 2010-04-08 18:17 ` jingyu at gcc dot gnu dot org
  2010-05-13 18:09 ` jingyu at google dot com
  14 siblings, 0 replies; 16+ messages in thread
From: jingyu at gcc dot gnu dot org @ 2010-04-08 18:17 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #14 from jingyu at gcc dot gnu dot org  2010-04-08 18:17 -------
Subject: Bug 42720

Author: jingyu
Date: Thu Apr  8 18:16:57 2010
New Revision: 158138

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=158138
Log:
Fix a problematic logic at unswitch-loops pass.

2010-04-07  Jing Yu  <jingyu@google.com>
            Zdenek Dvorak  <ook@ucw.cz>

        PR tree-optimization/42720
        * tree-ssa-loop-unswitch.c (tree_ssa_unswitch_loops): Move one-time
        loop unswitch conditions here from
        (tree_unswitch_single_loop).

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/tree-ssa-loop-unswitch.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42720


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug tree-optimization/42720] Problematic condition simplification logic at unswitch-loops pass
  2010-01-13  2:47 [Bug tree-optimization/42720] New: Empty loop generated at unswitch-loops with -O2 -fprofile-use jingyu at google dot com
                   ` (13 preceding siblings ...)
  2010-04-08 18:17 ` jingyu at gcc dot gnu dot org
@ 2010-05-13 18:09 ` jingyu at google dot com
  14 siblings, 0 replies; 16+ messages in thread
From: jingyu at google dot com @ 2010-05-13 18:09 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #15 from jingyu at google dot com  2010-05-13 18:09 -------
Patch was committed to trunk (4.6) r158138.

Resolved.


-- 

jingyu at google dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42720


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2010-05-13 18:09 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-01-13  2:47 [Bug tree-optimization/42720] New: Empty loop generated at unswitch-loops with -O2 -fprofile-use jingyu at google dot com
2010-01-29 23:34 ` [Bug tree-optimization/42720] Problematic condition simplification logic at unswitch-loops pass jingyu at google dot com
2010-01-29 23:38 ` pinskia at gcc dot gnu dot org
2010-01-29 23:59 ` jingyu at google dot com
2010-01-30  0:05 ` pinskia at gcc dot gnu dot org
2010-01-30  0:21 ` jingyu at google dot com
2010-01-30 10:59 ` rguenth at gcc dot gnu dot org
2010-01-30 11:00 ` rguenth at gcc dot gnu dot org
2010-01-30 12:01 ` rakdver at gcc dot gnu dot org
2010-01-30 16:07 ` rguenth at gcc dot gnu dot org
2010-01-30 16:14 ` steven at gcc dot gnu dot org
2010-01-30 23:33 ` jingyu at google dot com
2010-02-02 23:57 ` jingyu at google dot com
2010-02-07 20:28 ` pinskia at gcc dot gnu dot org
2010-04-08 18:17 ` jingyu at gcc dot gnu dot org
2010-05-13 18:09 ` jingyu at google dot com

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).