From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-294098-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 15044 invoked by alias); 9 Jun 2011 23:45:35 -0000
Received: (qmail 15027 invoked by uid 22791); 9 Jun 2011 23:45:34 -0000
X-SWARE-Spam-Status: No, hits=-1.2 required=5.0	tests=AWL,BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,SPF_HELO_PASS,T_RP_MATCHES_RCVD
X-Spam-Check-By: sourceware.org
Received: from smtp-out.google.com (HELO smtp-out.google.com) (216.239.44.51)    by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Thu, 09 Jun 2011 23:45:18 +0000
Received: from kpbe13.cbf.corp.google.com (kpbe13.cbf.corp.google.com [172.25.105.77])	by smtp-out.google.com with ESMTP id p59NjHuh010673	for <gcc-patches@gcc.gnu.org>; Thu, 9 Jun 2011 16:45:17 -0700
Received: from gxk9 (gxk9.prod.google.com [10.202.11.9])	by kpbe13.cbf.corp.google.com with ESMTP id p59NhVLQ017054	(version=TLSv1/SSLv3 cipher=RC4-SHA bits=128 verify=NOT)	for <gcc-patches@gcc.gnu.org>; Thu, 9 Jun 2011 16:45:11 -0700
Received: by gxk9 with SMTP id 9so1435439gxk.26        for <gcc-patches@gcc.gnu.org>; Thu, 09 Jun 2011 16:45:11 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.100.239.3 with SMTP id m3mr1322526anh.15.1307663110972; Thu, 09 Jun 2011 16:45:10 -0700 (PDT)
Received: by 10.101.107.5 with HTTP; Thu, 9 Jun 2011 16:45:10 -0700 (PDT)
In-Reply-To: <BANLkTi=-Xg9AY2yNw-MgAQ==6RuuGxiyPg@mail.gmail.com>
References: <BANLkTik0OK=0ksWUosRPGW3x23-OAA34ujryD7iimFZH51x58w@mail.gmail.com>	<BANLkTi=70RMg26xoi=ORR_4ZV50xEfjSrJUq+chZR1uOWnJXfg@mail.gmail.com>	<BANLkTi=-Xg9AY2yNw-MgAQ==6RuuGxiyPg@mail.gmail.com>
Date: Fri, 10 Jun 2011 00:08:00 -0000
Message-ID: <BANLkTim-NWd+=mS6KOAzfHTWp8cyZO00VJO0OpD0zEDd68zZUA@mail.gmail.com>
Subject: Re: [google] pessimize stack accounting during inlining
From: Mark Heffernan <meheff@google.com>
To: Richard Guenther <richard.guenther@gmail.com>
Cc: Xinliang David Li <davidxl@google.com>,        GCC Patches <gcc-patches@gcc.gnu.org>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
X-System-Of-Record: true
X-IsSubscribed: yes
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
X-SW-Source: 2011-06/txt/msg00783.txt.bz2

On Wed, Jun 8, 2011 at 1:54 AM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> Well, it's still not a hard limit as we can't tell how many spill slots
> or extra call argument or return value slots we need.

Agreed.  It's not perfect.  But I've found this does a reasonable job
of preventing the inliner from pushing the frame size much beyond the
imposed limit, especially if the limit is large (e.g., many kilobytes)
relative to the typical total size of spill slots, arguments, etc.
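
To make the difference concrete, here is a minimal standalone sketch
(not code from the patch; the function names and the flat "one caller
with several directly inlined callees" model are assumptions made for
illustration).  The old accounting tracked a peak, so sibling callees
were assumed to reuse the same stack region; the new accounting sums
the self sizes of all inlined callees.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the old accounting: sibling callees are assumed
   to share stack space, so only the largest callee frame is added to
   the caller's own frame.  */
long
optimistic_frame_size (long self_size, const long *callee_sizes, size_t n)
{
  long peak = self_size;
  for (size_t i = 0; i < n; i++)
    {
      assert (callee_sizes[i] >= 0);
      if (self_size + callee_sizes[i] > peak)
        peak = self_size + callee_sizes[i];
    }
  return peak;
}

/* Hypothetical model of the new accounting: assume no sharing at all,
   so the estimate is the caller's own frame plus the sum of the frames
   of every inlined callee.  */
long
pessimistic_frame_size (long self_size, const long *callee_sizes, size_t n)
{
  long total = self_size;
  for (size_t i = 0; i < n; i++)
    {
      assert (callee_sizes[i] >= 0);
      total += callee_sizes[i];
    }
  return total;
}
```

For a 512-byte caller with inlined callees of 4096, 2048 and 1024
bytes, the first function reports 4608 bytes and the second 7680.
Neither number accounts for spill slots or argument slots, which is
why the limit is still not a hard one.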

Mark

>
> Richard.
>
> > Thanks,
> >
> > David
> >
> > On Tue, Jun 7, 2011 at 4:29 PM, Mark Heffernan <meheff@google.com> wrote:
> >> This patch pessimizes stack accounting during inlining.  This enables
> >> setting a firm stack size limit (via parameters "large-stack-frame" and
> >> "large-stack-frame-growth").  Without this patch the inliner is overly
> >> optimistic about potential stack reuse resulting in actual stack frames much
> >> larger than the parameterized limits.
> >> Internal benchmarks show minor performance differences with non-fdo and
> >> lipo, but overall neutral.  Tested/bootstrapped on x86-64.
> >> Ok for google-main?
> >> Mark
> >>
> >> 2011-06-07  Mark Heffernan  <meheff@google.com>
> >>         * cgraph.h (cgraph_global_info): Remove field.
> >>         * ipa-inline.c (cgraph_clone_inlined_nodes): Change
> >>         stack frame computation.
> >>         (cgraph_check_inline_limits): Ditto.
> >>         (compute_inline_parameters): Remove dead initialization.
> >>
> >> Index: gcc/cgraph.h
> >> ===================================================================
> >> --- gcc/cgraph.h        (revision 174512)
> >> +++ gcc/cgraph.h        (working copy)
> >> @@ -136,8 +136,6 @@ struct GTY(()) cgraph_local_info {
> >>  struct GTY(()) cgraph_global_info {
> >>    /* Estimated stack frame consumption by the function.  */
> >>    HOST_WIDE_INT estimated_stack_size;
> >> -  /* Expected offset of the stack frame of inlined function.  */
> >> -  HOST_WIDE_INT stack_frame_offset;
> >>
> >>    /* For inline clones this points to the function they will be
> >>       inlined into.  */
> >> Index: gcc/ipa-inline.c
> >> ===================================================================
> >> --- gcc/ipa-inline.c    (revision 174512)
> >> +++ gcc/ipa-inline.c    (working copy)
> >> @@ -229,8 +229,6 @@ void
> >>  cgraph_clone_inlined_nodes (struct cgraph_edge *e, bool duplicate,
> >>                             bool update_original)
> >>  {
> >> -  HOST_WIDE_INT peak;
> >> -
> >>    if (duplicate)
> >>      {
> >>        /* We may eliminate the need for out-of-line copy to be output.
> >> @@ -279,13 +277,13 @@ cgraph_clone_inlined_nodes (struct cgrap
> >>      e->callee->global.inlined_to = e->caller->global.inlined_to;
> >>    else
> >>      e->callee->global.inlined_to = e->caller;
> >> -  e->callee->global.stack_frame_offset
> >> -    = e->caller->global.stack_frame_offset
> >> -      + inline_summary (e->caller)->estimated_self_stack_size;
> >> -  peak = e->callee->global.stack_frame_offset
> >> -      + inline_summary (e->callee)->estimated_self_stack_size;
> >> -  if (e->callee->global.inlined_to->global.estimated_stack_size < peak)
> >> -    e->callee->global.inlined_to->global.estimated_stack_size = peak;
> >> +
> >> +  /* Pessimistically assume no sharing of stack space.  That is, the
> >> +     frame size of a function is estimated as the original frame size
> >> +     plus the sum of the frame sizes of all inlined callees.  */
> >> +  e->callee->global.inlined_to->global.estimated_stack_size +=
> >> +    inline_summary (e->callee)->estimated_self_stack_size;
> >> +
> >>    cgraph_propagate_frequency (e->callee);
> >>
> >>    /* Recursively clone all bodies.  */
> >> @@ -430,8 +428,7 @@ cgraph_check_inline_limits (struct cgrap
> >>
> >>    stack_size_limit += stack_size_limit * PARAM_VALUE (PARAM_STACK_FRAME_GROWTH) / 100;
> >>
> >> -  inlined_stack = (to->global.stack_frame_offset
> >> -                  + inline_summary (to)->estimated_self_stack_size
> >> +  inlined_stack = (to->global.estimated_stack_size
> >>                    + what->global.estimated_stack_size);
> >>    if (inlined_stack  > stack_size_limit
> >>        && inlined_stack > PARAM_VALUE (PARAM_LARGE_STACK_FRAME))
> >> @@ -2064,7 +2061,6 @@ compute_inline_parameters (struct cgraph
> >>    self_stack_size = optimize ? estimated_stack_frame_size (node) : 0;
> >>    inline_summary (node)->estimated_self_stack_size = self_stack_size;
> >>    node->global.estimated_stack_size = self_stack_size;
> >> -  node->global.stack_frame_offset = 0;
> >>
> >>    /* Can this function be inlined at all?  */
> >>    node->local.inlinable = tree_inlinable_function_p (node->decl);
> >>
> >