From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-293937-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 3106 invoked by alias); 8 Jun 2011 08:54:24 -0000
Received: (qmail 3091 invoked by uid 22791); 8 Jun 2011 08:54:23 -0000
X-SWARE-Spam-Status: No, hits=-2.4 required=5.0	tests=AWL,BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FROM,RCVD_IN_DNSWL_LOW,RFC_ABUSE_POST
X-Spam-Check-By: sourceware.org
Received: from mail-ww0-f41.google.com (HELO mail-ww0-f41.google.com) (74.125.82.41)    by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Wed, 08 Jun 2011 08:54:07 +0000
Received: by wwi18 with SMTP id 18so3137425wwi.2        for <gcc-patches@gcc.gnu.org>; Wed, 08 Jun 2011 01:54:06 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.227.55.67 with SMTP id t3mr7137369wbg.90.1307523246034; Wed, 08 Jun 2011 01:54:06 -0700 (PDT)
Received: by 10.227.37.152 with HTTP; Wed, 8 Jun 2011 01:54:06 -0700 (PDT)
In-Reply-To: <BANLkTi=70RMg26xoi=ORR_4ZV50xEfjSrJUq+chZR1uOWnJXfg@mail.gmail.com>
References: <BANLkTik0OK=0ksWUosRPGW3x23-OAA34ujryD7iimFZH51x58w@mail.gmail.com>	<BANLkTi=70RMg26xoi=ORR_4ZV50xEfjSrJUq+chZR1uOWnJXfg@mail.gmail.com>
Date: Wed, 08 Jun 2011 09:11:00 -0000
Message-ID: <BANLkTi=-Xg9AY2yNw-MgAQ==6RuuGxiyPg@mail.gmail.com>
Subject: Re: [google] pessimize stack accounting during inlining
From: Richard Guenther <richard.guenther@gmail.com>
To: Xinliang David Li <davidxl@google.com>
Cc: Mark Heffernan <meheff@google.com>, GCC Patches <gcc-patches@gcc.gnu.org>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
X-IsSubscribed: yes
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
X-SW-Source: 2011-06/txt/msg00622.txt.bz2

On Wed, Jun 8, 2011 at 1:36 AM, Xinliang David Li <davidxl@google.com> wrote:
> Ok for google/main.  A good candidate patch for trunk too.

Well, it's still not a hard limit, as we can't tell how many spill slots,
or extra call-argument or return-value slots, we will need.

Richard.

> Thanks,
>
> David
>
> On Tue, Jun 7, 2011 at 4:29 PM, Mark Heffernan <meheff@google.com> wrote:
>> This patch pessimizes stack accounting during inlining.  This enables
>> setting a firm stack size limit (via parameters "large-stack-frame" and
>> "large-stack-frame-growth").  Without this patch the inliner is overly
>> optimistic about potential stack reuse resulting in actual stack frames much
>> larger than the parameterized limits.
>> Internal benchmarks show minor performance differences with non-fdo and
>> lipo, but overall neutral.  Tested/bootstrapped on x86-64.
>> Ok for google-main?
>> Mark
>>
>> 2011-06-07  Mark Heffernan  <meheff@google.com>
>>         * cgraph.h (cgraph_global_info): Remove field.
>>         * ipa-inline.c (cgraph_clone_inlined_nodes): Change
>>         stack frame computation.
>>         (cgraph_check_inline_limits): Ditto.
>>         (compute_inline_parameters): Remove dead initialization.
>>
>> Index: gcc/cgraph.h
>> ===================================================================
>> --- gcc/cgraph.h        (revision 174512)
>> +++ gcc/cgraph.h        (working copy)
>> @@ -136,8 +136,6 @@ struct GTY(()) cgraph_local_info {
>>  struct GTY(()) cgraph_global_info {
>>    /* Estimated stack frame consumption by the function.  */
>>    HOST_WIDE_INT estimated_stack_size;
>> -  /* Expected offset of the stack frame of inlined function.  */
>> -  HOST_WIDE_INT stack_frame_offset;
>>
>>    /* For inline clones this points to the function they will be
>>       inlined into.  */
>> Index: gcc/ipa-inline.c
>> ===================================================================
>> --- gcc/ipa-inline.c    (revision 174512)
>> +++ gcc/ipa-inline.c    (working copy)
>> @@ -229,8 +229,6 @@ void
>>  cgraph_clone_inlined_nodes (struct cgraph_edge *e, bool duplicate,
>>                             bool update_original)
>>  {
>> -  HOST_WIDE_INT peak;
>> -
>>    if (duplicate)
>>      {
>>        /* We may eliminate the need for out-of-line copy to be output.
>> @@ -279,13 +277,13 @@ cgraph_clone_inlined_nodes (struct cgrap
>>      e->callee->global.inlined_to = e->caller->global.inlined_to;
>>    else
>>      e->callee->global.inlined_to = e->caller;
>> -  e->callee->global.stack_frame_offset
>> -    = e->caller->global.stack_frame_offset
>> -      + inline_summary (e->caller)->estimated_self_stack_size;
>> -  peak = e->callee->global.stack_frame_offset
>> -      + inline_summary (e->callee)->estimated_self_stack_size;
>> -  if (e->callee->global.inlined_to->global.estimated_stack_size < peak)
>> -    e->callee->global.inlined_to->global.estimated_stack_size = peak;
>> +
>> +  /* Pessimistically assume no sharing of stack space.  That is, the
>> +     frame size of a function is estimated as the original frame size
>> +     plus the sum of the frame sizes of all inlined callees.  */
>> +  e->callee->global.inlined_to->global.estimated_stack_size +=
>> +    inline_summary (e->callee)->estimated_self_stack_size;
>> +
>>    cgraph_propagate_frequency (e->callee);
>>
>>    /* Recursively clone all bodies.  */
>> @@ -430,8 +428,7 @@ cgraph_check_inline_limits (struct cgrap
>>
>>    stack_size_limit += stack_size_limit * PARAM_VALUE (PARAM_STACK_FRAME_GROWTH) / 100;
>>
>> -  inlined_stack = (to->global.stack_frame_offset
>> -                  + inline_summary (to)->estimated_self_stack_size
>> +  inlined_stack = (to->global.estimated_stack_size
>>                    + what->global.estimated_stack_size);
>>    if (inlined_stack  > stack_size_limit
>>        && inlined_stack > PARAM_VALUE (PARAM_LARGE_STACK_FRAME))
>> @@ -2064,7 +2061,6 @@ compute_inline_parameters (struct cgraph
>>    self_stack_size = optimize ? estimated_stack_frame_size (node) : 0;
>>    inline_summary (node)->estimated_self_stack_size = self_stack_size;
>>    node->global.estimated_stack_size = self_stack_size;
>> -  node->global.stack_frame_offset = 0;
>>
>>    /* Can this function be inlined at all?  */
>>    node->local.inlinable = tree_inlinable_function_p (node->decl);
>>
>
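
To make the accounting change in the quoted patch concrete, here is a
minimal C sketch; it is not GCC code.  It models one caller with several
callees inlined directly into it.  The names peak_estimate, sum_estimate,
exceeds_stack_limit and base_limit are hypothetical; only the arithmetic
follows the removed and added lines of the diff.  How stack_size_limit is
initialized is not shown in the patch, so the sketch takes it as a
parameter rather than guessing.

```c
#include <stdbool.h>

/* Old scheme (removed lines): each directly inlined callee is placed at
   offset `self`, and the caller's estimate is the largest peak seen.
   Sibling callees are therefore assumed to share stack space.  */
long peak_estimate(long self, const long *callees, int n)
{
    long estimate = self;
    for (int i = 0; i < n; i++) {
        long peak = self + callees[i];
        if (estimate < peak)
            estimate = peak;
    }
    return estimate;
}

/* New scheme (added lines): pessimistically assume no sharing, so the
   estimate is the caller's own frame plus every inlined callee's frame.  */
long sum_estimate(long self, const long *callees, int n)
{
    long estimate = self;
    for (int i = 0; i < n; i++)
        estimate += callees[i];
    return estimate;
}

/* Limit check modeled on the cgraph_check_inline_limits hunk: inlining
   is rejected only when the combined estimate exceeds both the grown
   limit and the large-stack-frame threshold.  base_limit is an assumed
   input; growth_percent and large_frame stand in for the two params.  */
bool exceeds_stack_limit(long to_estimate, long what_estimate,
                         long base_limit, long growth_percent,
                         long large_frame)
{
    long limit = base_limit + base_limit * growth_percent / 100;
    long inlined_stack = to_estimate + what_estimate;
    return inlined_stack > limit && inlined_stack > large_frame;
}
```

For a caller with a 100-byte frame and two directly inlined callees of 64
and 128 bytes, the old scheme estimates 228 bytes and the new one 292.
Neither figure accounts for spill slots or argument and return-value
slots, so the check remains an estimate.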