public inbox for gcc-bugs@sourceware.org
* [Bug gcov/profile/24487]  New: Basic block frequencies inaccurate
@ 2005-10-22 20:24 dje at gcc dot gnu dot org
  2005-10-23 20:11 ` [Bug gcov/profile/24487] [Regression 3.4/4.0/4.1] " dje at gcc dot gnu dot org
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: dje at gcc dot gnu dot org @ 2005-10-22 20:24 UTC (permalink / raw)
  To: gcc-bugs

The basic block frequencies used when compiling without profiling information
appear to be nonsensical.  For instance, block 0 is fed by the entry block but
has a frequency of 0.  This causes EDGE_FREQUENCY to return zero, which affects
optimizations such as final.c:compute_alignments().
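
To illustrate why a zero block frequency forces a zero edge frequency, here is
a minimal sketch of the computation.  It assumes the edge frequency is the
rounded product of the source block's frequency and the edge probability on a
fixed scale.  PROB_BASE, struct block, struct cfg_edge and edge_frequency are
illustrative names chosen for this sketch, not GCC's actual definitions.

```c
#include <assert.h>

/* Assumed probability scale: PROB_BASE stands for "always taken". */
#define PROB_BASE 10000

/* Hypothetical, stripped-down stand-ins for a basic block and a CFG edge. */
struct block { int frequency; };
struct cfg_edge { const struct block *src; int probability; };

/* Rounded product of the source block's frequency and the edge's
   probability.  A source frequency of 0 yields 0 for every probability. */
int edge_frequency(const struct cfg_edge *e)
{
    assert(e->probability >= 0 && e->probability <= PROB_BASE);
    return (e->src->frequency * e->probability + PROB_BASE / 2) / PROB_BASE;
}
```

With this model, an edge that is always taken out of a block with frequency 0
still has frequency 0, so any heuristic keyed on edge frequency treats it as
cold.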

An example is deflate.c from gzip-1.2.4a.  Compiled with GCC mainline on
powerpc-linux with options

-m32 -O3 -mcpu=power4 -ffast-math -funroll-loops -fpeel-loops
-ftree-loop-linear

deflate.c.50.compgotos shows the following information for function deflate:

Reordered sequence:
 0 bb 0  [0]
 1 bb 1  [0]
 2 bb 2  [1]
 3 bb 3  [1]
 4 bb 4  [0]
 5 bb 5  [0]
 6 bb 6  [0]
 7 bb 7  [0]
 8 bb 8  [0]
 9 bb 9  [0]
 10 bb 10  [0]
 11 bb 11  [0]
 12 bb 12  [0]
 13 bb 13  [0]
 14 bb 14  [0]
 15 bb 15  [0]
 16 bb 16  [0]
 17 bb 17  [0]
 18 bb 18  [1]
 19 bb 19  [0]
 20 bb 20  [0]
 21 bb 21  [0]
 22 bb 22  [5]
 23 bb 23  [5]
 24 bb 24  [5]
 25 bb 25  [3]
 26 bb 26  [4]
 27 bb 27  [4]
 28 bb 28  [2]
 29 bb 29  [1]
 30 bb 30  [1250]
 31 bb 31  [625]
 32 bb 32  [1250]
 33 bb 33  [625]
 34 bb 34  [1250]
 35 bb 35  [625]
 36 bb 36  [1250]
 37 bb 37  [625]
 38 bb 38  [1250]
 39 bb 39  [625]
 40 bb 40  [1250]
...

Basic blocks 10-11 contain a critical loop, but the basic block frequencies
misrepresent it as a very cold block.


-- 
           Summary: Basic block frequencies inaccurate
           Product: gcc
           Version: 4.1.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P2
         Component: gcov/profile
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: dje at gcc dot gnu dot org
GCC target triplet: powerpc-*-*


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24487



* [Bug gcov/profile/24487] [Regression 3.4/4.0/4.1] Basic block frequencies inaccurate
  2005-10-22 20:24 [Bug gcov/profile/24487] New: Basic block frequencies inaccurate dje at gcc dot gnu dot org
@ 2005-10-23 20:11 ` dje at gcc dot gnu dot org
  2005-10-23 20:16 ` [Bug gcov/profile/24487] [3.4/4.0/4.1 Regression] " pinskia at gcc dot gnu dot org
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: dje at gcc dot gnu dot org @ 2005-10-23 20:11 UTC (permalink / raw)
  To: gcc-bugs



-- 

dje at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
      Known to fail|                            |3.4.3 4.0.2 4.1.0
      Known to work|                            |3.3.3
   Last reconfirmed|0000-00-00 00:00:00         |2005-10-23 20:11:24
               date|                            |
            Summary|Basic block frequencies     |[Regression 3.4/4.0/4.1]
                   |inaccurate                  |Basic block frequencies
                   |                            |inaccurate



* [Bug gcov/profile/24487] [3.4/4.0/4.1 Regression] Basic block frequencies inaccurate
  2005-10-22 20:24 [Bug gcov/profile/24487] New: Basic block frequencies inaccurate dje at gcc dot gnu dot org
  2005-10-23 20:11 ` [Bug gcov/profile/24487] [Regression 3.4/4.0/4.1] " dje at gcc dot gnu dot org
@ 2005-10-23 20:16 ` pinskia at gcc dot gnu dot org
  2005-10-24 21:36 ` dje at gcc dot gnu dot org
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2005-10-23 20:16 UTC (permalink / raw)
  To: gcc-bugs



-- 

pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |4.0.3



* [Bug gcov/profile/24487] [3.4/4.0/4.1 Regression] Basic block frequencies inaccurate
  2005-10-22 20:24 [Bug gcov/profile/24487] New: Basic block frequencies inaccurate dje at gcc dot gnu dot org
  2005-10-23 20:11 ` [Bug gcov/profile/24487] [Regression 3.4/4.0/4.1] " dje at gcc dot gnu dot org
  2005-10-23 20:16 ` [Bug gcov/profile/24487] [3.4/4.0/4.1 Regression] " pinskia at gcc dot gnu dot org
@ 2005-10-24 21:36 ` dje at gcc dot gnu dot org
  2005-10-28 16:32 ` hubicka at gcc dot gnu dot org
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: dje at gcc dot gnu dot org @ 2005-10-24 21:36 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #1 from dje at gcc dot gnu dot org  2005-10-24 21:36 -------
fill_window() includes a loop whose count is based on WSIZE.  The profile
prediction pass calculates an accurate count estimate of 0x8000.  Other loops
have unknown loop bounds, which are estimated with a count of 3-10.  When
fill_window is inlined into deflate(), the large known count interferes with
the frequencies calculated for the other loops, causing them all to appear cold
and the basic block frequencies to become very inaccurate.
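
The effect can be sketched with a small scaling function.  This assumes that
raw execution estimates are normalized so the hottest block maps to a fixed
integer maximum.  FREQ_MAX and scale_frequency are illustrative names, and the
rounding shown is an assumption rather than the exact code in the compiler.

```c
#include <assert.h>

/* Assumed integer scale for block frequencies. */
#define FREQ_MAX 10000

/* Scale a raw execution estimate so that the hottest block in the
   function maps to FREQ_MAX, rounding to the nearest integer. */
int scale_frequency(long raw, long hottest)
{
    assert(hottest > 0 && raw >= 0 && raw <= hottest);
    return (int)((raw * FREQ_MAX + hottest / 2) / hottest);
}
```

Under these assumptions, once a loop body estimated at 0x8000 executions sets
the maximum, a block executed once scales to 0 and a loop body guessed at 10
iterations scales to 3, so almost everything outside the known loop looks cold.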


-- 

dje at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 GCC target triplet|powerpc-*-*                 |



* [Bug gcov/profile/24487] [3.4/4.0/4.1 Regression] Basic block frequencies inaccurate
  2005-10-22 20:24 [Bug gcov/profile/24487] New: Basic block frequencies inaccurate dje at gcc dot gnu dot org
                   ` (2 preceding siblings ...)
  2005-10-24 21:36 ` dje at gcc dot gnu dot org
@ 2005-10-28 16:32 ` hubicka at gcc dot gnu dot org
  2005-10-28 17:31 ` dje at gcc dot gnu dot org
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: hubicka at gcc dot gnu dot org @ 2005-10-28 16:32 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #2 from hubicka at gcc dot gnu dot org  2005-10-28 16:32 -------
I've benchmarked a change that reduces the maximum number of iterations
predicted for loops with constant bounds to 100 and to 10, respectively.  100
makes no actual change to the x86-64 SPEC run; 10 seems to result in a small
degradation:
                     Base     Base       Base      Peak     Peak       Peak
   Benchmark     Ref Time Run Time      Ratio  Ref Time Run Time      Ratio
   164.gzip          1400     117        1193*     1400     117        1195*
   175.vpr           1400     166         845*     1400     166         844*
   176.gcc           1100     107        1031*     1100     106        1035*
   181.mcf           1800     335         537*     1800     333         541*
   186.crafty        1000      49.8      2010*     1000      49.7      2012*
   197.parser        1800     232         777*     1800     230         783*
   252.eon           1300      71.3      1823*     1300      75.4      1724*
   253.perlbmk       1800     125        1440*     1800     124        1446*
   254.gap           1100      95.3      1154*     1100      96.0      1146*
   255.vortex        1900     112        1689*     1900     113        1689*
   256.bzip2         1500     149        1005*     1500     149        1005*
   300.twolf         3000     353         850*     3000     353         851*
   Est. SPECint_base2000                 1118
   Est. SPECint2000                                                    1114
(specFP still in progress).  I will also try the idea of increasing the
estimated number of iterations for loops without constant bounds when some high
limit is known for a loop with constant bounds within the function, but
actually I don't like that idea much anymore, as it would most probably make
the profile artificially steep in the wrong places :(
Does the misprediction manifest somehow on PPC?


-- 

hubicka at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |hubicka at gcc dot gnu dot
                   |dot org                     |org
             Status|NEW                         |ASSIGNED
   Last reconfirmed|2005-10-23 20:11:24         |2005-10-28 16:32:22
               date|                            |



* [Bug gcov/profile/24487] [3.4/4.0/4.1 Regression] Basic block frequencies inaccurate
  2005-10-22 20:24 [Bug gcov/profile/24487] New: Basic block frequencies inaccurate dje at gcc dot gnu dot org
                   ` (3 preceding siblings ...)
  2005-10-28 16:32 ` hubicka at gcc dot gnu dot org
@ 2005-10-28 17:31 ` dje at gcc dot gnu dot org
  2005-10-28 21:29 ` hubicka at ucw dot cz
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: dje at gcc dot gnu dot org @ 2005-10-28 17:31 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #3 from dje at gcc dot gnu dot org  2005-10-28 17:30 -------
The misprediction causes (or at least contributes to) compute_alignments in
final.c not aligning the critical loop in longest_match.  The recent changes to
bb-reorder.c moved the start of the loop from an ideal alignment to the worst
alignment, causing an 8-10% performance drop due to instruction fetch problems.
With the explicit alignment directives omitted because of the misprediction,
the alignment of the loop is random.
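
As a rough illustration of how a frequency-driven alignment decision can skip
a hot loop, consider the sketch below.  The rule and its threshold (one tenth
of the maximum frequency) are assumptions made for this example, and
should_align_loop is a hypothetical name; the actual conditions in
compute_alignments are more involved.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed integer scale for block and edge frequencies. */
#define FREQ_MAX 10000

/* Hypothetical rule: emit an alignment directive for a loop header only
   when the back edge into it is more than a tenth as frequent as the
   hottest possible block. */
bool should_align_loop(int back_edge_frequency)
{
    assert(back_edge_frequency >= 0 && back_edge_frequency <= FREQ_MAX);
    return back_edge_frequency > FREQ_MAX / 10;
}
```

A loop whose estimated frequency has collapsed to 0 or 1 fails any test of
this shape, however hot it is at run time.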


-- 

dje at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dberlin at gcc dot gnu dot
                   |                            |org



* [Bug gcov/profile/24487] [3.4/4.0/4.1 Regression] Basic block frequencies inaccurate
  2005-10-22 20:24 [Bug gcov/profile/24487] New: Basic block frequencies inaccurate dje at gcc dot gnu dot org
                   ` (4 preceding siblings ...)
  2005-10-28 17:31 ` dje at gcc dot gnu dot org
@ 2005-10-28 21:29 ` hubicka at ucw dot cz
  2005-10-31  6:39 ` mmitchel at gcc dot gnu dot org
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: hubicka at ucw dot cz @ 2005-10-28 21:29 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #4 from hubicka at ucw dot cz  2005-10-28 21:29 -------
Subject: Re: [Bug gcov/profile/24487] [3.4/4.0/4.1 Regression] Basic block
frequencies inaccurate

> 
> 
> ------- Comment #3 from dje at gcc dot gnu dot org  2005-10-28 17:30 -------
> The misprediction causes (or at least contributes to) compute_alignments in
> final.c not aligning the critical loop in longest_match.  The recent changes to
> bb-reorder.c moved the start of the loop from an ideal alignment to the worst
> alignment, causing a 8-10% performance drop due to instruction fetch problems. 
> Without the explicit alignment directives omitted because of misprediction, the
> alignment of the loop is random.

I see.  Athlon is less sensitive to alignment, which might explain it.  I
will try to fix my SPEC testing setup on PPC after the SVN revamp and
benchmark it there too...

Honza


* [Bug gcov/profile/24487] [3.4/4.0/4.1 Regression] Basic block frequencies inaccurate
  2005-10-22 20:24 [Bug gcov/profile/24487] New: Basic block frequencies inaccurate dje at gcc dot gnu dot org
                   ` (5 preceding siblings ...)
  2005-10-28 21:29 ` hubicka at ucw dot cz
@ 2005-10-31  6:39 ` mmitchel at gcc dot gnu dot org
  2005-10-31 14:49 ` hubicka at gcc dot gnu dot org
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: mmitchel at gcc dot gnu dot org @ 2005-10-31  6:39 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #5 from mmitchel at gcc dot gnu dot org  2005-10-31 06:39 -------
Rather than increasing the estimate for loops with unknown bounds or throttling
the maximum for loops with known bounds, why not notice, when inlining, that
we've mixed the two, and drop all frequency guesses in the resulting function? 
(This is the usual lattice arithmetic idea.)  If we don't know, we just don't
know.  It's probably better to admit that we have no information than to
pretend that we understand what's going on.  (I have no evidence that my idea
actually helps, though; it could be horrible.)
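
One way to read the lattice idea is sketched below.  The enum values and
est_meet are hypothetical names invented for this example: each function's
frequency estimates carry a tag saying how they were derived, and merging two
differently tagged bodies (as inlining would) falls to "unknown".

```c
#include <assert.h>

/* Hypothetical provenance tags for a function's frequency estimates.
   EST_NONE is the lattice top (nothing merged yet); EST_UNKNOWN is the
   bottom (estimates should be dropped). */
enum est_kind { EST_NONE, EST_KNOWN_BOUND, EST_GUESSED_BOUND, EST_UNKNOWN };

/* Meet of two tags: equal tags are preserved, EST_NONE is neutral, and
   any mix of different tags yields EST_UNKNOWN. */
enum est_kind est_meet(enum est_kind a, enum est_kind b)
{
    assert(a <= EST_UNKNOWN && b <= EST_UNKNOWN);
    if (a == EST_NONE)
        return b;
    if (b == EST_NONE)
        return a;
    return a == b ? a : EST_UNKNOWN;
}
```

Here, inlining a callee tagged EST_KNOWN_BOUND into a caller tagged
EST_GUESSED_BOUND gives EST_UNKNOWN, which is the point at which the guesses
would be discarded.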

Leaving as P2.



* [Bug gcov/profile/24487] [3.4/4.0/4.1 Regression] Basic block frequencies inaccurate
  2005-10-22 20:24 [Bug gcov/profile/24487] New: Basic block frequencies inaccurate dje at gcc dot gnu dot org
                   ` (6 preceding siblings ...)
  2005-10-31  6:39 ` mmitchel at gcc dot gnu dot org
@ 2005-10-31 14:49 ` hubicka at gcc dot gnu dot org
  2005-10-31 14:54 ` hubicka at gcc dot gnu dot org
  2005-11-03  8:22 ` [Bug gcov/profile/24487] [3.4/4.0 " hubicka at gcc dot gnu dot org
  9 siblings, 0 replies; 11+ messages in thread
From: hubicka at gcc dot gnu dot org @ 2005-10-31 14:49 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #6 from hubicka at gcc dot gnu dot org  2005-10-31 14:49 -------
Subject: Bug 24487

Author: hubicka
Date: Mon Oct 31 14:48:57 2005
New Revision: 106276

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=106276
Log:
        PR profile/24487
        * predict.c (predict_loops): Do not estimate more than
        MAX_PRED_LOOP_ITERATIONS in PRED_LOOP_ITERATIONS heuristic.
        * predict.def (MAX_PRED_LOOP_ITERATIONS): Define.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/predict.c
    trunk/gcc/predict.def
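
The idea behind the change can be sketched as follows.  This is not the
committed code: it assumes the loop-exit probability is derived as the
probability scale divided by the predicted iteration count, and it takes the
cap as a parameter because the log above does not state the value of
MAX_PRED_LOOP_ITERATIONS.  PROB_BASE and loop_exit_probability are
illustrative names.

```c
#include <assert.h>

/* Assumed probability scale: PROB_BASE stands for "always taken". */
#define PROB_BASE 10000

/* Exit probability for a loop predicted to iterate niter times, after
   clamping niter to max_iter (a hypothetical parameter for the cap). */
int loop_exit_probability(long niter, long max_iter)
{
    assert(niter > 0 && max_iter > 0);
    if (niter > max_iter)
        niter = max_iter;
    return (int)((PROB_BASE + niter / 2) / niter);
}
```

Without a cap, a count of 0x8000 rounds the exit probability down to 0 on this
scale; with a cap of 100 it becomes 100, which keeps the known loop hot without
flattening every other loop in the function.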



* [Bug gcov/profile/24487] [3.4/4.0/4.1 Regression] Basic block frequencies inaccurate
  2005-10-22 20:24 [Bug gcov/profile/24487] New: Basic block frequencies inaccurate dje at gcc dot gnu dot org
                   ` (7 preceding siblings ...)
  2005-10-31 14:49 ` hubicka at gcc dot gnu dot org
@ 2005-10-31 14:54 ` hubicka at gcc dot gnu dot org
  2005-11-03  8:22 ` [Bug gcov/profile/24487] [3.4/4.0 " hubicka at gcc dot gnu dot org
  9 siblings, 0 replies; 11+ messages in thread
From: hubicka at gcc dot gnu dot org @ 2005-10-31 14:54 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #7 from hubicka at gcc dot gnu dot org  2005-10-31 14:54 -------
Concerning Mark's comment (I noticed it only after committing the patch): I am
not sure what exactly Mark has in mind.  This situation does not actually
depend on inlining.  We might easily have a function with two loops, one an
initialization loop with large known bounds and the other iterating as many
times as the initialization loop but without obvious bounds in it.
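
A minimal example of such a function, written for illustration only
(TABLE_SIZE, table and init_then_scan are made-up names, with the size chosen
to echo the 0x8000 count from comment #1):

```c
#include <assert.h>
#include <stddef.h>

#define TABLE_SIZE 0x8000

static unsigned char table[TABLE_SIZE];

/* Two loops in one function, no inlining involved. */
size_t init_then_scan(void)
{
    size_t i, n = 0;

    /* Loop 1: the trip count is a compile-time constant. */
    for (i = 0; i < TABLE_SIZE; i++)
        table[i] = 1;
    table[TABLE_SIZE - 1] = 0;   /* sentinel that ends the scan below */

    /* Loop 2: runs almost as many times as loop 1, but its bound is only
       implied by the data, so a static predictor has to guess it. */
    while (table[n] != 0)
        n++;

    assert(n < TABLE_SIZE);      /* the sentinel keeps the scan in range */
    return n;
}
```

Both loops are equally hot at run time, yet only the first has a bound the
compiler can see.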

Also, all the frequency estimates we have everywhere are guessed, so we almost
never know, and I don't see why dropping them in this specific case is a good
idea.  Even the bad estimates we had before the patch were producing better
code than no estimates at all (I've just tested on x86).

I am keeping the bug open for a moment so we can clarify the idea before
forgetting about it.



* [Bug gcov/profile/24487] [3.4/4.0 Regression] Basic block frequencies inaccurate
  2005-10-22 20:24 [Bug gcov/profile/24487] New: Basic block frequencies inaccurate dje at gcc dot gnu dot org
                   ` (8 preceding siblings ...)
  2005-10-31 14:54 ` hubicka at gcc dot gnu dot org
@ 2005-11-03  8:22 ` hubicka at gcc dot gnu dot org
  9 siblings, 0 replies; 11+ messages in thread
From: hubicka at gcc dot gnu dot org @ 2005-11-03  8:22 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #8 from hubicka at gcc dot gnu dot org  2005-11-03 08:22 -------
No longer a 4.1 regression.


-- 

hubicka at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
      Known to fail|3.4.3 4.0.2 4.1.0           |3.4.3 4.0.2
         Resolution|                            |FIXED
            Summary|[3.4/4.0/4.1 Regression]    |[3.4/4.0 Regression] Basic
                   |Basic block frequencies     |block frequencies inaccurate
                   |inaccurate                  |


