From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugs-return-481431-listarch-gcc-bugs=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 1815 invoked by alias); 24 Mar 2015 13:39:05 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-bugs.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-bugs/>
List-Post: <mailto:gcc-bugs@gcc.gnu.org>
List-Help: <mailto:gcc-bugs-help@gcc.gnu.org>
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 1547 invoked by uid 48); 24 Mar 2015 13:39:00 -0000
From: "jamborm at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug ipa/65478] [5 regression] crafty performance regression
Date: Tue, 24 Mar 2015 14:10:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: ipa
X-Bugzilla-Version: 5.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: jamborm at gcc dot gnu.org
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 5.0
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: cc
Message-ID: <bug-65478-4-ZwcXvZQ7ME@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-65478-4@http.gcc.gnu.org/bugzilla/>
References: <bug-65478-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2015-03/txt/msg02575.txt.bz2

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65478

Martin Jambor <jamborm at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jamborm at gcc dot gnu.org
--- Comment #6 from Martin Jambor <jamborm at gcc dot gnu.org> ---
I can confirm a fairly consistent 4% run time increase caused by
r219863 on my desktop (from ~22.74s to ~23.64s).  However, when I
disable cloning of the Search function, for example by using the
noclone attribute, I only get down to ~23.31s, which is still 2.5%
slower than before r219863.  (All the times are of course subject to
noise, but I have measured them repeatedly and, as I said, they are
fairly consistent.)  This suggests that cloning of function Search and
not inlining NextMove is only part of the story.
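
For reference, the percentages follow directly from the timings quoted
in this comment.  The sketch below only redoes that arithmetic; the
variable names are made up for illustration and the times are the
approximate ones given above.

```python
# Approximate run times (seconds) quoted in this comment.
baseline = 22.74       # before r219863
with_r219863 = 23.64   # after r219863
with_noclone = 23.31   # after r219863, Search marked noclone


def slowdown_percent(before, after):
    """Relative run time increase of `after` over `before`, in percent."""
    return (after - before) / before * 100.0


full_regression = slowdown_percent(baseline, with_r219863)
remaining = slowdown_percent(baseline, with_noclone)

# Share of the regression that goes away when Search is not cloned.
recovered_share = (with_r219863 - with_noclone) / (with_r219863 - baseline)

print(f"full regression:    {full_regression:.1f}%")
print(f"still slower by:    {remaining:.1f}%")
print(f"recovered by noclone: {recovered_share:.0%} of the regression")
```

With these numbers, suppressing the clone wins back only about a third
of the lost time, which is why the cloning cannot be the whole story.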


> I would suggest we may disable/add negative hint for cloning in the
> case where the specialized function will end up calling
> unspecialized version of itself with non-cold edge.

Recursion is handled in IPA-CP by iterating over SCCs in the call
graph, and the redirection of the final call that "closes" an SCC is
done in a different iteration than the first cloning.  This
unfortunately means that when function decide_about_value reasons
about whether to clone, it does not know which recursive calls are
going to be redirected and which are not.  Making it aware of this
would require a hack in the cgraph_edge_brings_value_p functions.  I
may try writing it, but I wonder whether it is really easier than
undoing all cloning in an SCC, which is the right way to implement
this as it would also work for recursions involving two or more
functions.
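
To illustrate what "an SCC in the call graph" means here, the
following is a minimal, self-contained sketch of Tarjan's algorithm on
a toy call graph.  It is not GCC code and does not reflect how IPA-CP
stores or walks the call graph; the graph itself (including the
functions f and g) is an assumption made up for the example.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm; `graph` maps a function to its callees."""
    index, lowlink = {}, {}
    stack, on_stack = [], set()
    sccs = []
    counter = 0

    def visit(node):
        nonlocal counter
        index[node] = lowlink[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for callee in graph.get(node, []):
            if callee not in index:
                visit(callee)
                lowlink[node] = min(lowlink[node], lowlink[callee])
            elif callee in on_stack:
                lowlink[node] = min(lowlink[node], index[callee])
        if lowlink[node] == index[node]:
            # `node` is the root of an SCC: pop its members off the stack.
            component = set()
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.add(member)
                if member == node:
                    break
            sccs.append(component)

    for node in graph:
        if node not in index:
            visit(node)
    return sccs


def is_recursive(component, graph):
    """True for a multi-function cycle or a single self-recursive function."""
    if len(component) > 1:
        return True
    (only,) = component
    return only in graph.get(only, [])


# Toy call graph (hypothetical): Search calls itself; f and g call each other.
call_graph = {
    "main": ["Search", "f"],
    "Search": ["Search", "NextMove"],
    "NextMove": [],
    "f": ["g"],
    "g": ["f"],
}

sccs = strongly_connected_components(call_graph)
recursive = [scc for scc in sccs if is_recursive(scc, call_graph)]
print(sorted(sorted(scc) for scc in recursive))
```

Both the self-recursive Search and the mutually recursive pair show up
as recursive SCCs, which is the case an "undo all cloning in the SCC"
approach would cover uniformly, whereas a special case for direct
self-calls would only catch the first.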

> We also may consider adding bit of negative hints for cases where
> cloning would turn function called once (by noncold edge) to a
> function called twice.

This would be much easier, although the penalty would have to be quite
big because the goodness number calculated by
good_cloning_opportunity_p is 830 and the threshold is 500.

But given the above, perhaps, for GCC 5 at least, we might want to
introduce a penalty factor of 0.7 for this and another penalty factor
of 0.7 just for being within an SCC?
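
A quick check of what the numbers above imply.  This assumes the
penalties are simply multiplied into the goodness value before it is
compared with the threshold; how they would actually be applied inside
good_cloning_opportunity_p is not specified here, so treat it as
arithmetic only.

```python
GOODNESS = 830    # value computed for the Search clone, as quoted above
THRESHOLD = 500   # cloning threshold, as quoted above
PENALTY = 0.7     # proposed factor (applied multiplicatively: an assumption)

# A single penalty must be below this factor to prevent the clone.
required_factor = THRESHOLD / GOODNESS

one_penalty = GOODNESS * PENALTY              # roughly 581
both_penalties = GOODNESS * PENALTY * PENALTY  # roughly 407

print(f"single factor needed: < {required_factor:.3f}")
print(f"one 0.7 penalty:  {one_penalty:.0f} "
      f"({'clone' if one_penalty >= THRESHOLD else 'no clone'})")
print(f"two 0.7 penalties: {both_penalties:.0f} "
      f"({'clone' if both_penalties >= THRESHOLD else 'no clone'})")
```

So a single 0.7 penalty would still leave the value above the
threshold (a lone penalty would need a factor of about 0.6 or lower),
while the two combined, a factor of 0.49, would push it below.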