[Bug middle-end/51233] New: [ipa-iterations] running multiple passes of early IPA on zlib produces more optimal code

public inbox for gcc-bugs@sourceware.org

* [Bug middle-end/51233] New: [ipa-iterations] running multiple passes of early IPA on zlib produces more optimal code
@ 2011-11-20  3:31 matt at use dot net
  2012-08-10  9:52 ` [Bug middle-end/51233] " rguenth at gcc dot gnu.org
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: matt at use dot net @ 2011-11-20  3:31 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51233

             Bug #: 51233
           Summary: [ipa-iterations] running multiple passes of early IPA
                    on zlib produces more optimal code
    Classification: Unclassified
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: matt@use.net

Using current trunk with Maxim's eipa-iterations patch, I modified the zlib
1.2.3.4 makefile (from Ubuntu 11.10's source package) as follows for building on
my Ubuntu 11.10/amd64 system:

CC=gcc
CFLAGS=--param eipa-iterations=3 -flto -Ofast
SFLAGS=$(CFLAGS) -shared -fPIC
LDFLAGS=-flto -L. libz.a

I then built and tested the resulting minigzip utility both at the macro
level (total runtime) and at the micro level (using callgrind's cache miss
and branch misprediction benchmarks). At the macro level, a run on a single
50MB file on a ramdisk in single-user mode shows minor improvements that may
qualify as noise. At the micro level, callgrind shows 0.4% fewer branch
mispredictions and a dramatic decrease in data accesses (but a slight uptick
in data cache misses).

While there are some notable code differences between 2 and 3 iterations, they
don't appear to have an effect on the performance at the macro- or micro-level.

Given the relative simplicity of the code in the library, these additional
optimizations could possibly have been obtained within a single iteration.
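
To illustrate why a second iteration of a fixed pass pipeline can find more than the first, here is a toy sketch in Python. It is not GCC code and not Maxim's patch: the tiny IR, the two passes (`devirtualize`, `fold_constant_calls`), and the function names in `example_program` are all hypothetical. It only shows the general effect where a later pass exposes work for an earlier one, so the pipeline has to run again to pick it up.

```python
def devirtualize(funcs):
    """Turn ("icall", f) into ("call", g) once f is known to return the name g."""
    changed = False
    for name, (kind, arg) in list(funcs.items()):
        if kind == "icall" and funcs[arg][0] == "const" and funcs[arg][1] in funcs:
            funcs[name] = ("call", funcs[arg][1])
            changed = True
    return changed


def fold_constant_calls(funcs):
    """Replace ("call", f) with f's body once that body is a constant."""
    changed = False
    for name, (kind, arg) in list(funcs.items()):
        if kind == "call" and funcs[arg][0] == "const":
            funcs[name] = funcs[arg]
            changed = True
    return changed


def iterate(funcs, max_iterations):
    """Run devirtualize then fold, repeating until nothing changes or the limit is hit.

    Returns the number of iterations that changed something.
    """
    productive = 0
    for _ in range(max_iterations):
        changed = devirtualize(funcs)
        changed = fold_constant_calls(funcs) or changed
        if not changed:
            break
        productive += 1
    return productive


def example_program():
    """Hypothetical program: main calls through a pointer obtained from get_ptr."""
    return {
        "target": ("const", 42),
        "get_ptr": ("call", "ptr_source"),
        "ptr_source": ("const", "target"),
        "main": ("icall", "get_ptr"),
    }


if __name__ == "__main__":
    for limit in (1, 3):
        program = example_program()
        iterate(program, limit)
        print(limit, program["main"])
```

In this toy, folding runs after devirtualization, so the indirect call in `main` can only be resolved on the second trip through the pipeline. Whether the zlib improvements have this shape, or are missed optimizations that a single iteration should have caught, is exactly the open question in this report.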


* [Bug middle-end/51233] [ipa-iterations] running multiple passes of early IPA on zlib produces more optimal code
  2011-11-20  3:31 [Bug middle-end/51233] New: [ipa-iterations] running multiple passes of early IPA on zlib produces more optimal code matt at use dot net
@ 2012-08-10  9:52 ` rguenth at gcc dot gnu.org
  2012-08-14  0:26 ` matt at use dot net
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: rguenth at gcc dot gnu.org @ 2012-08-10  9:52 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51233

Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |lto
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2012-08-10
     Ever Confirmed|0                           |1

--- Comment #1 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-08-10 09:52:37 UTC ---
We've discussed the possibility of iterating the LTO WPA phase by outputting
LTO bytecode from the LTRANS phase (at some point before expansion, likely
even before loop optimizations).  That would be the way to go here, I think,
apart from the idea that came up of how to do better at early optimization
of cycles in the cgraph.
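
For readers unfamiliar with what "cycles in the cgraph" means, here is a minimal Python sketch that finds the strongly connected components (SCCs) of a call graph using Tarjan's algorithm. It is a generic illustration, not GCC's cgraph code; the call graph and its function names are hypothetical. Functions in a multi-node SCC are mutually recursive, so no bottom-up order exists in which every callee has been optimized before its caller.

```python
def strongly_connected_components(graph):
    """Return the SCCs of `graph` (caller -> list of callees), callees first."""
    index = {}        # discovery order of each node
    lowlink = {}      # smallest index reachable from the node's DFS subtree
    stack = []
    on_stack = set()
    result = []
    counter = 0

    def visit(node):
        nonlocal counter
        index[node] = lowlink[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for callee in graph.get(node, ()):
            if callee not in index:
                visit(callee)
                lowlink[node] = min(lowlink[node], lowlink[callee])
            elif callee in on_stack:
                lowlink[node] = min(lowlink[node], index[callee])
        if lowlink[node] == index[node]:
            # node is the root of an SCC: pop its members off the stack.
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            result.append(sorted(component))

    for node in graph:
        if node not in index:
            visit(node)
    return result


# Hypothetical call graph: parse_expr and parse_term call each other.
CALL_GRAPH = {
    "main": ["parse", "emit"],
    "parse": ["parse_expr"],
    "parse_expr": ["parse_term"],
    "parse_term": ["parse_expr", "leaf"],
    "leaf": [],
    "emit": ["leaf"],
}

if __name__ == "__main__":
    for component in strongly_connected_components(CALL_GRAPH):
        print(component)
```

Tarjan's algorithm emits components in reverse topological order, so callees come out before callers. Only the `parse_expr`/`parse_term` component has more than one member, and that is the kind of region where a single bottom-up sweep cannot see fully optimized callees.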



* [Bug middle-end/51233] [ipa-iterations] running multiple passes of early IPA on zlib produces more optimal code
  2011-11-20  3:31 [Bug middle-end/51233] New: [ipa-iterations] running multiple passes of early IPA on zlib produces more optimal code matt at use dot net
  2012-08-10  9:52 ` [Bug middle-end/51233] " rguenth at gcc dot gnu.org
@ 2012-08-14  0:26 ` matt at use dot net
  2012-08-14  8:23 ` rguenth at gcc dot gnu.org
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: matt at use dot net @ 2012-08-14  0:26 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51233

--- Comment #2 from Matt Hargett <matt at use dot net> 2012-08-14 00:26:35 UTC ---
Okay. I filed this bug at your request last year because of your concerns that
some of the improvements seen with multiple iterations might be "papering over"
existing bugs in the optimizers. Does this mean that in this zlib case the
passes are all fine, but multiple iterations legitimately help?

The original discussion was in the context of Maxim's devirt patches. Would the
approach you mention still allow for the testcases from his proposed patches to
pass? (We can discuss this second question on-list, if you like.)

Thanks for reviving this; we saw dramatic performance improvements with the
4.6-based deliverable we got from Maxim.



* [Bug middle-end/51233] [ipa-iterations] running multiple passes of early IPA on zlib produces more optimal code
  2011-11-20  3:31 [Bug middle-end/51233] New: [ipa-iterations] running multiple passes of early IPA on zlib produces more optimal code matt at use dot net
  2012-08-10  9:52 ` [Bug middle-end/51233] " rguenth at gcc dot gnu.org
  2012-08-14  0:26 ` matt at use dot net
@ 2012-08-14  8:23 ` rguenth at gcc dot gnu.org
  2012-08-14 17:43 ` matt at use dot net
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: rguenth at gcc dot gnu.org @ 2012-08-14  8:23 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51233

--- Comment #3 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-08-14 08:23:16 UTC ---
Multiple iterations may still paper over missed-optimization bugs in passes.
Using LTO to drive the iteration makes more sense (well, if iterating makes
any sense ...), as it will consider the whole program when iterating, not just
a single translation unit.
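
The difference between per-unit and whole-program visibility can be sketched with a toy Python model. This is an assumption-laden illustration, not a description of GCC's LTO implementation: the file names `a.c` and `b.c`, the functions in them, and the single folding pass are all hypothetical.

```python
def fold_constant_calls(visible):
    """Fold ("call", f) into f's constant body when f is defined in `visible`."""
    folded = dict(visible)
    for name, (kind, arg) in visible.items():
        callee = visible.get(arg)
        if kind == "call" and callee is not None and callee[0] == "const":
            folded[name] = callee
    return folded


# Hypothetical program split across two translation units.
UNITS = {
    "a.c": {"main": ("call", "helper")},
    "b.c": {"helper": ("const", 7)},
}

# Per-unit view: each unit is optimized seeing only its own function bodies.
per_unit = {unit: fold_constant_calls(bodies) for unit, bodies in UNITS.items()}

# Whole-program view: all bodies are visible at once.
whole_program = fold_constant_calls({**UNITS["a.c"], **UNITS["b.c"]})

if __name__ == "__main__":
    print(per_unit["a.c"]["main"])
    print(whole_program["main"])
```

In the per-unit view the call in `main` cannot be folded, no matter how many times the pass is repeated, because the body of `helper` lives in another unit. With the whole program visible, one run of the pass is enough.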



* [Bug middle-end/51233] [ipa-iterations] running multiple passes of early IPA on zlib produces more optimal code
  2011-11-20  3:31 [Bug middle-end/51233] New: [ipa-iterations] running multiple passes of early IPA on zlib produces more optimal code matt at use dot net
                   ` (2 preceding siblings ...)
  2012-08-14  8:23 ` rguenth at gcc dot gnu.org
@ 2012-08-14 17:43 ` matt at use dot net
  2012-12-04 20:35 ` matt at use dot net
  2012-12-05  9:24 ` richard.guenther at gmail dot com
  5 siblings, 0 replies; 7+ messages in thread
From: matt at use dot net @ 2012-08-14 17:43 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51233

--- Comment #4 from Matt Hargett <matt at use dot net> 2012-08-14 17:43:30 UTC ---
I agree it's more appropriate in LTO, but it can still provide measurable
benefit for template-heavy C++ applications where lots of implementation
bodies are in header files by necessity.

With regard to zlib performance improving with multiple iterations, is there a
multiple iterations patch against trunk you'd like me to retest on this
specific testcase? We never did an RTL/tree dump in the 4.6/4.7 case for each
iteration to see if the improvements could have been caught in a single
iteration.



* [Bug middle-end/51233] [ipa-iterations] running multiple passes of early IPA on zlib produces more optimal code
  2011-11-20  3:31 [Bug middle-end/51233] New: [ipa-iterations] running multiple passes of early IPA on zlib produces more optimal code matt at use dot net
                   ` (3 preceding siblings ...)
  2012-08-14 17:43 ` matt at use dot net
@ 2012-12-04 20:35 ` matt at use dot net
  2012-12-05  9:24 ` richard.guenther at gmail dot com
  5 siblings, 0 replies; 7+ messages in thread
From: matt at use dot net @ 2012-12-04 20:35 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51233

--- Comment #5 from Matt Hargett <matt at use dot net> 2012-12-04 20:35:09 UTC ---
ping? if you're more comfortable with relegating multiple passes to LTO, I
think that's a good starting point. we can wait for a per-unit C++ template
case to come up after that's in.

is there anything you'd like me to do to get this moving again?

thanks!



* [Bug middle-end/51233] [ipa-iterations] running multiple passes of early IPA on zlib produces more optimal code
  2011-11-20  3:31 [Bug middle-end/51233] New: [ipa-iterations] running multiple passes of early IPA on zlib produces more optimal code matt at use dot net
                   ` (4 preceding siblings ...)
  2012-12-04 20:35 ` matt at use dot net
@ 2012-12-05  9:24 ` richard.guenther at gmail dot com
  5 siblings, 0 replies; 7+ messages in thread
From: richard.guenther at gmail dot com @ 2012-12-05  9:24 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51233

--- Comment #6 from richard.guenther at gmail dot com <richard.guenther at gmail dot com> 2012-12-05 09:23:56 UTC ---
On Tue, Dec 4, 2012 at 9:35 PM, matt at use dot net
<gcc-bugzilla@gcc.gnu.org> wrote:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51233
>
> --- Comment #5 from Matt Hargett <matt at use dot net> 2012-12-04 20:35:09 UTC ---
> ping? if you're more comfortable with relegating multiple passes to LTO, I
> think that's a good starting point. we can wait for a per-unit C++ template
> case to come up after that's in.

Yes, multiple LTO passes are what I think should be done (or alternatively,
if one really dislikes this, better processing of cgraph SCCs during early
optimizations, as I outlined in some e-mail response to the original
patches).  But the LTO approach should be more powerful anyway.

> is there anything you'd like me to do to get this moving again?

Produce patches for review?

;)

