public inbox for gcc-bugs@sourceware.org
From: "matt at use dot net" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug middle-end/51182] [ipa-iterations] running multiple passes of early IPA on a file produces different code when it shouldn't
Date: Fri, 18 Nov 2011 02:20:00 -0000
Message-ID: <bug-51182-4-oLSPNyIn2N@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-51182-4@http.gcc.gnu.org/bugzilla/>

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51182

--- Comment #4 from Matt Hargett <matt at use dot net> 2011-11-18 01:43:32 UTC ---
Ah, okay. I read in your email that you were looking for evidence of bugs, and
the behaviour looked fishy to me. Regardless, here is a performance improvement
that perhaps should be achievable within a single iteration.

Attached is the combined.i from pmccabe, which can be compiled and linked
directly to be an executable (on a Debian/Ubuntu-ish amd64 system, anyway).

Using -O3 (or -Ofast), two iterations produce a binary that performs better
than one iteration does. Performance was measured at the macro level, based on
timings when run against tens of thousands of files while in single-user mode
on a ramdisk. In addition, performance at the micro level was measured by
looking at cache misses and branch misprediction rates using callgrind (a tool
within valgrind), with output below. The second iteration reduces the I1 miss
rate as well as the misprediction rate. (Multiple iterations at -O2 are more of
a mixed bag at the micro level, for some reason, and appear to have no
macro-level performance impact.)


matt@matt-desktop:~/src/pmccabe-2.7$ valgrind --tool=callgrind --branch-sim=yes --cache-sim=yes ./pmccabe.o3i1.loop.whopr *.c test0[012][0123456]

==4119== 
==4119== Events    : Ir Dr Dw I1mr D1mr D1mw ILmr DLmr DLmw Bc Bcm Bi Bim
==4119== Collected : 10312284 2549768 1398563 3869 3209 1417 1285 2045 990 2534056 74514 208896 8052
==4119== 
==4119== I   refs:      10,312,284
==4119== I1  misses:         3,869
==4119== LLi misses:         1,285
==4119== I1  miss rate:        0.3%
==4119== LLi miss rate:        0.1%
==4119== 
==4119== D   refs:       3,948,331  (2,549,768 rd + 1,398,563 wr)
==4119== D1  misses:         4,626  (    3,209 rd +     1,417 wr)
==4119== LLd misses:         3,035  (    2,045 rd +       990 wr)
==4119== D1  miss rate:        0.1% (      0.1%   +       0.1%  )
==4119== LLd miss rate:        0.0% (      0.0%   +       0.0%  )
==4119== 
==4119== LL refs:            8,495  (    7,078 rd +     1,417 wr)
==4119== LL misses:          4,320  (    3,330 rd +       990 wr)
==4119== LL miss rate:         0.0% (      0.0%   +       0.0%  )
==4119== 
==4119== Branches:       2,742,952  (2,534,056 cond +   208,896 ind)
==4119== Mispredicts:       82,566  (   74,514 cond +     8,052 ind)
==4119== Mispred rate:         3.0% (      2.9%     +       3.8%   )


matt@matt-desktop:~/src/pmccabe-2.7$ valgrind --tool=callgrind --branch-sim=yes --cache-sim=yes ./pmccabe.o3i2.loop.whopr *.c test0[012][0123456]

==4122== 
==4122== Events    : Ir Dr Dw I1mr D1mr D1mw ILmr DLmr DLmw Bc Bcm Bi Bim
==4122== Collected : 10312147 2549768 1398563 3054 3209 1416 1286 2049 989 2534056 74071 208896 7618
==4122== 
==4122== I   refs:      10,312,147
==4122== I1  misses:         3,054
==4122== LLi misses:         1,286
==4122== I1  miss rate:        0.2%
==4122== LLi miss rate:        0.1%
==4122== 
==4122== D   refs:       3,948,331  (2,549,768 rd + 1,398,563 wr)
==4122== D1  misses:         4,625  (    3,209 rd +     1,416 wr)
==4122== LLd misses:         3,038  (    2,049 rd +       989 wr)
==4122== D1  miss rate:        0.1% (      0.1%   +       0.1%  )
==4122== LLd miss rate:        0.0% (      0.0%   +       0.0%  )
==4122== 
==4122== LL refs:            7,679  (    6,263 rd +     1,416 wr)
==4122== LL misses:          4,324  (    3,335 rd +       989 wr)
==4122== LL miss rate:         0.0% (      0.0%   +       0.0%  )
==4122== 
==4122== Branches:       2,742,952  (2,534,056 cond +   208,896 ind)
==4122== Mispredicts:       81,689  (   74,071 cond +     7,618 ind)
==4122== Mispred rate:         2.9% (      2.9%     +       3.6%   )
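
For anyone who wants to compare the two runs without eyeballing the summaries, here is a rough sketch in Python that recomputes the relevant rates from the raw "Collected" counters above. The run labels and helper names (parse, rates, relative_change) are made up for illustration and are not part of callgrind or pmccabe. The percentages are computed directly from the counters, so they carry more digits than, and may not match, the rounded figures printed in the summaries.

```python
# Event names, in the order given by the "Events" lines in the logs above.
EVENTS = "Ir Dr Dw I1mr D1mr D1mw ILmr DLmr DLmw Bc Bcm Bi Bim".split()

# Raw "Collected" counters copied from the two callgrind runs above.
# The labels are hypothetical names for the o3i1 and o3i2 binaries.
RUNS = {
    "one iteration": "10312284 2549768 1398563 3869 3209 1417 1285 2045 990 "
                     "2534056 74514 208896 8052",
    "two iterations": "10312147 2549768 1398563 3054 3209 1416 1286 2049 989 "
                      "2534056 74071 208896 7618",
}


def parse(collected):
    """Map each event name to its integer counter."""
    return dict(zip(EVENTS, map(int, collected.split())))


def rates(c):
    """Miss and misprediction rates, in percent, from raw counters."""
    return {
        "I1 miss %": 100.0 * c["I1mr"] / c["Ir"],
        "cond mispredict %": 100.0 * c["Bcm"] / c["Bc"],
        "ind mispredict %": 100.0 * c["Bim"] / c["Bi"],
    }


def relative_change(before, after):
    """Percentage change from before to after (negative means fewer events)."""
    return 100.0 * (after - before) / before


counters = {name: parse(line) for name, line in RUNS.items()}

for name, c in counters.items():
    summary = ", ".join(f"{k} = {v:.4f}" for k, v in rates(c).items())
    print(f"{name}: {summary}")

one, two = counters["one iteration"], counters["two iterations"]
i1_change = relative_change(one["I1mr"], two["I1mr"])
mispredict_change = relative_change(one["Bcm"] + one["Bim"],
                                    two["Bcm"] + two["Bim"])
print(f"I1 misses change: {i1_change:.1f}%")
print(f"total mispredicts change: {mispredict_change:.1f}%")
```

This only restates the numbers already in the logs; it says nothing about run-to-run noise, so the macro-level timings remain the more meaningful measurement.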

