public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PR debug/47106] account used vars only once
@ 2011-01-06 11:20 Alexandre Oliva
  2011-01-10 15:24 ` Richard Guenther
  0 siblings, 1 reply; 46+ messages in thread
From: Alexandre Oliva @ 2011-01-06 11:20 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1781 bytes --]

This -fcompare-debug failure occurs because we remove variables from the
lexical blocks in the compilation without debug info, but we account the
stack space they won't use in the compilation with debug info, which
leads to different inlining decisions when stack-growth limits are set.

It turns out that the use of TREE_USED in the stack frame size
estimation functions seems to be misguided: we initially set it for all
local decls that don't appear in lexical blocks (not shown in patch),
then account the size of all LOCAL_DECLs that have it set (which means
accounting only those that don't appear in lexical blocks), set it on
all LOCAL_DECLs, and then account the size of decls in lexical blocks if
they have TREE_USED (which means accounting only those that are in
LOCAL_DECLs, not accounting those that aren't).

I fixed it so that we leave TREE_USED alone after initialization: we
account variables that aren't in lexical blocks while looping over
LOCAL_DECLs, and then account all variables in lexical blocks, as long
as they are actually used, which avoids accounting declarations retained
only for purposes of debug information generation.

I suppose restoring the assignment and test of TREE_USED would also
work, but AFAICT it would just waste CPU cycles.

This was regstrapped on x86_{32,64}-linux-gnu.  I also bootstrapped it
with BOOT_CFLAGS='-O2 -g -fpartial-inlining -flto -fconserve-stack', and
regstrapping succeeded except for gfortran.dg/realloc_on_assign_2.exe at
all torture levels: stage3 f951 fails to compile it, although a
non-bootstrap f951 does, on both x86_{32,64}.  From that I concluded
that stage[23]s are miscompiled with these BOOT_CFLAGS above, even
before my patch, but I haven't debugged or investigated that any
further.

Ok to install?


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: ipa-inline-stack-growth-compare-debug-pr47106.patch --]
[-- Type: text/x-diff, Size: 1194 bytes --]

for  gcc/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	PR debug/47106
	* cfgexpand.c (account_used_vars_for_block): Only account vars
	that are annotated as used.
	(estimated_stack_frame_size): Don't set TREE_USED.

Index: gcc/cfgexpand.c
===================================================================
--- gcc/cfgexpand.c.orig	2010-12-30 18:05:43.930553884 -0200
+++ gcc/cfgexpand.c	2011-01-04 02:39:50.745382341 -0200
@@ -1325,7 +1325,7 @@ account_used_vars_for_block (tree block,
 
   /* Expand all variables at this level.  */
   for (t = BLOCK_VARS (block); t ; t = DECL_CHAIN (t))
-    if (TREE_USED (t))
+    if (var_ann (t) && var_ann (t)->used)
       size += expand_one_var (t, toplevel, false);
 
   /* Expand all variables at containing levels.  */
@@ -1389,9 +1389,10 @@ estimated_stack_frame_size (tree decl)
 
   FOR_EACH_LOCAL_DECL (cfun, ix, var)
     {
+      /* TREE_USED marks local variables that do not appear in lexical
+	 blocks.  We don't want to expand those that do twice.  */
       if (TREE_USED (var))
         size += expand_one_var (var, true, false);
-      TREE_USED (var) = 1;
     }
   size += account_used_vars_for_block (outer_block, true);
 

[-- Attachment #3: Type: text/plain, Size: 257 bytes --]


-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist      Red Hat Brazil Compiler Engineer

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-01-06 11:20 [PR debug/47106] account used vars only once Alexandre Oliva
@ 2011-01-10 15:24 ` Richard Guenther
  2011-01-10 15:50   ` Jan Hubicka
  2011-01-20 22:01   ` Alexandre Oliva
  0 siblings, 2 replies; 46+ messages in thread
From: Richard Guenther @ 2011-01-10 15:24 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: gcc-patches, Jan Hubicka

On Thu, Jan 6, 2011 at 11:11 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
> This -fcompare-debug failure occurs because we remove variables from the
> lexical blocks in the compilation without debug info, but we account the
> stack space they won't use in the compilation with debug info, which
> leads to different inlining decisions when stack-growth limits are set.
>
> It turns out that the use of TREE_USED in the stack frame size
> estimation functions seems to be misguided: we initially set it for all
> local decls that don't appear in lexical blocks (not shown in patch),
> then account the size of all LOCAL_DECLs that have it set (which means
> accounting only those that don't appear in lexical blocks), set it on
> all LOCAL_DECLs, and then account the size of decls in lexical blocks if
> they have TREE_USED (which means accounting only those that are in
> LOCAL_DECLs, not accounting those that aren't).
>
> I fixed it so that we leave TREE_USED alone after initialization: we
> account variables that aren't in lexical blocks while looping over
> LOCAL_DECLs, and then account all variables in lexical blocks, as long
> as they are actually used, which avoids accounting declarations retained
> only for purposes of debug information generation.
>
> I suppose restoring the assignment and test of TREE_USED would also
> work, but AFAICT it would just waste CPU cycles.
>
> This was regstrapped on x86_{32,64}-linux-gnu.  I also bootstrapped it
> with BOOT_CFLAGS='-O2 -g -fpartial-inlining -flto -fconserve-stack', and
> regstrapping succeeded except for gfortran.dg/realloc_on_assign_2.exe at
> all torture levels: stage3 f951 fails to compile it, although a
> non-bootstrap f951 does, on both x86_{32,64}.  From that I concluded
> that stage[23]s are miscompiled with these BOOT_CFLAGS above, even
> before my patch, but I haven't debugged or investigated that any
> further.
>
> Ok to install?

It looks ok when just looking at the documentation of the var_ann
used filed, but I cannot find the code that computes it during
out-of-SSA, instead do we rely on remove-unused-locals computing it?
Are we sure that we at least
run local vars removal once before we estimated stack frame
size the first time - I see pass_inline_parameters before the
first run?  All variables are initially created with used
set to false (which is a non-conservative default, likely false
even).  I guess I'm more happy with the patch if you switch the
initialization in create_var_ann to mark the variable used.

I can't make sense of the TREE_USED setting in
estimate_stack_frame_size in the first place - maybe if
the loop in account_used_vars_for_block would be
testing !TREE_USED ..., but then only using the
var-anns used flag in the for-each-local-decl loop will
have the same problem with debug-compare?

Thus, I can't see how the duplicate-accounting avoiding
works (even before your patch), and I smell an inconsistency
when only partially using var-anns used flags.

Maybe Honza can explain how the code works ...?

Thanks,
Richard.

>
>
> --
> Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/   FSF Latin America board member
> Free Software Evangelist      Red Hat Brazil Compiler Engineer
>
>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-01-10 15:24 ` Richard Guenther
@ 2011-01-10 15:50   ` Jan Hubicka
  2011-01-10 16:31     ` Richard Guenther
  2011-01-20 22:01   ` Alexandre Oliva
  1 sibling, 1 reply; 46+ messages in thread
From: Jan Hubicka @ 2011-01-10 15:50 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Alexandre Oliva, gcc-patches, Jan Hubicka

> It looks ok when just looking at the documentation of the var_ann
> used filed, but I cannot find the code that computes it during
> out-of-SSA, instead do we rely on remove-unused-locals computing it?

AFAIK remove-unused-locals was supposed to clear it and it was supposed
to be run after passes removing references to locals, so it should be run
before we get into inlinine parameters.
> Are we sure that we at least
> run local vars removal once before we estimated stack frame
> size the first time - I see pass_inline_parameters before the
> first run?  All variables are initially created with used
> set to false (which is a non-conservative default, likely false
> even).  I guess I'm more happy with the patch if you switch the
> initialization in create_var_ann to mark the variable used.
> 
> I can't make sense of the TREE_USED setting in
> estimate_stack_frame_size in the first place - maybe if
> the loop in account_used_vars_for_block would be
> testing !TREE_USED ..., but then only using the
> var-anns used flag in the for-each-local-decl loop will
> have the same problem with debug-compare?
> 
> Thus, I can't see how the duplicate-accounting avoiding
> works (even before your patch), and I smell an inconsistency
> when only partially using var-anns used flags.
> 
> Maybe Honza can explain how the code works ...?
Nope, I think this is Steven's code, since the code originally just executed
the Rth's stack packing path and got its result.

Honza

> 
> Thanks,
> Richard.
> 
> >
> >
> > --
> > Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
> > You must be the change you wish to see in the world. -- Gandhi
> > Be Free! -- http://FSFLA.org/   FSF Latin America board member
> > Free Software Evangelist      Red Hat Brazil Compiler Engineer
> >
> >

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-01-10 15:50   ` Jan Hubicka
@ 2011-01-10 16:31     ` Richard Guenther
  2011-01-10 16:58       ` Jan Hubicka
  0 siblings, 1 reply; 46+ messages in thread
From: Richard Guenther @ 2011-01-10 16:31 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Alexandre Oliva, gcc-patches

2011/1/10 Jan Hubicka <hubicka@ucw.cz>:
>> It looks ok when just looking at the documentation of the var_ann
>> used filed, but I cannot find the code that computes it during
>> out-of-SSA, instead do we rely on remove-unused-locals computing it?
>
> AFAIK remove-unused-locals was supposed to clear it and it was supposed
> to be run after passes removing references to locals, so it should be run
> before we get into inlinine parameters.

inline-parameters is run in lowering passes, so for cycles we'll end
up estimating stack size to zero?  Or can we drop the early
inline-parameters pass run and rely on that after early opts?

>> Are we sure that we at least
>> run local vars removal once before we estimated stack frame
>> size the first time - I see pass_inline_parameters before the
>> first run?  All variables are initially created with used
>> set to false (which is a non-conservative default, likely false
>> even).  I guess I'm more happy with the patch if you switch the
>> initialization in create_var_ann to mark the variable used.
>>
>> I can't make sense of the TREE_USED setting in
>> estimate_stack_frame_size in the first place - maybe if
>> the loop in account_used_vars_for_block would be
>> testing !TREE_USED ..., but then only using the
>> var-anns used flag in the for-each-local-decl loop will
>> have the same problem with debug-compare?
>>
>> Thus, I can't see how the duplicate-accounting avoiding
>> works (even before your patch), and I smell an inconsistency
>> when only partially using var-anns used flags.
>>
>> Maybe Honza can explain how the code works ...?
> Nope, I think this is Steven's code, since the code originally just executed
> the Rth's stack packing path and got its result.

Hm.

Richard.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-01-10 16:31     ` Richard Guenther
@ 2011-01-10 16:58       ` Jan Hubicka
  0 siblings, 0 replies; 46+ messages in thread
From: Jan Hubicka @ 2011-01-10 16:58 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Jan Hubicka, Alexandre Oliva, gcc-patches

> 2011/1/10 Jan Hubicka <hubicka@ucw.cz>:
> >> It looks ok when just looking at the documentation of the var_ann
> >> used filed, but I cannot find the code that computes it during
> >> out-of-SSA, instead do we rely on remove-unused-locals computing it?
> >
> > AFAIK remove-unused-locals was supposed to clear it and it was supposed
> > to be run after passes removing references to locals, so it should be run
> > before we get into inlinine parameters.
> 
> inline-parameters is run in lowering passes, so for cycles we'll end
> up estimating stack size to zero?  Or can we drop the early
> inline-parameters pass run and rely on that after early opts?

well, the first inline-parameters is run for early inlining to work.
Early inlining needs both estimates of functon it is inlining into and functions
it is inlininig.  First is computed by the first invocation of inline-parameters,
the second is computed by the inline-parameters at the end of early optimizations.


I think we can simply move the first inline_parameters just befre early inlining.

Honza
> 
> >> Are we sure that we at least
> >> run local vars removal once before we estimated stack frame
> >> size the first time - I see pass_inline_parameters before the
> >> first run?  All variables are initially created with used
> >> set to false (which is a non-conservative default, likely false
> >> even).  I guess I'm more happy with the patch if you switch the
> >> initialization in create_var_ann to mark the variable used.
> >>
> >> I can't make sense of the TREE_USED setting in
> >> estimate_stack_frame_size in the first place - maybe if
> >> the loop in account_used_vars_for_block would be
> >> testing !TREE_USED ..., but then only using the
> >> var-anns used flag in the for-each-local-decl loop will
> >> have the same problem with debug-compare?
> >>
> >> Thus, I can't see how the duplicate-accounting avoiding
> >> works (even before your patch), and I smell an inconsistency
> >> when only partially using var-anns used flags.
> >>
> >> Maybe Honza can explain how the code works ...?
> > Nope, I think this is Steven's code, since the code originally just executed
> > the Rth's stack packing path and got its result.
> 
> Hm.
> 
> Richard.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-01-10 15:24 ` Richard Guenther
  2011-01-10 15:50   ` Jan Hubicka
@ 2011-01-20 22:01   ` Alexandre Oliva
  2011-01-21 10:51     ` Richard Guenther
  2011-01-21 23:27     ` Alexandre Oliva
  1 sibling, 2 replies; 46+ messages in thread
From: Alexandre Oliva @ 2011-01-20 22:01 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc-patches, Jan Hubicka

[-- Attachment #1: Type: text/plain, Size: 204 bytes --]

On Jan 10, 2011, Richard Guenther <richard.guenther@gmail.com> wrote:

> I guess I'm more happy with the patch if you switch the
> initialization in create_var_ann to mark the variable used.

Like this?


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: ipa-inline-stack-growth-compare-debug-pr47106.patch --]
[-- Type: text/x-diff, Size: 1673 bytes --]

for  gcc/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	PR debug/47106
	* cfgexpand.c (account_used_vars_for_block): Only account vars
	that are annotated as used.
	(estimated_stack_frame_size): Don't set TREE_USED.
	* tree-dfa.c (create_var_ann): Mark variable as used.

Index: gcc/cfgexpand.c
===================================================================
--- gcc/cfgexpand.c.orig	2011-01-20 14:55:54.405562397 -0200
+++ gcc/cfgexpand.c	2011-01-20 15:18:29.628014702 -0200
@@ -1325,7 +1325,7 @@ account_used_vars_for_block (tree block,
 
   /* Expand all variables at this level.  */
   for (t = BLOCK_VARS (block); t ; t = DECL_CHAIN (t))
-    if (TREE_USED (t))
+    if (var_ann (t) && var_ann (t)->used)
       size += expand_one_var (t, toplevel, false);
 
   /* Expand all variables at containing levels.  */
@@ -1389,9 +1389,10 @@ estimated_stack_frame_size (tree decl)
 
   FOR_EACH_LOCAL_DECL (cfun, ix, var)
     {
+      /* TREE_USED marks local variables that do not appear in lexical
+	 blocks.  We don't want to expand those that do twice.  */
       if (TREE_USED (var))
         size += expand_one_var (var, true, false);
-      TREE_USED (var) = 1;
     }
   size += account_used_vars_for_block (outer_block, true);
 
Index: gcc/tree-dfa.c
===================================================================
--- gcc/tree-dfa.c.orig	2011-01-20 15:42:18.837908738 -0200
+++ gcc/tree-dfa.c	2011-01-20 15:44:10.794077945 -0200
@@ -137,6 +137,9 @@ create_var_ann (tree t)
   ann = ggc_alloc_cleared_var_ann_d ();
   *DECL_VAR_ANN_PTR (t) = ann;
 
+  /* Assume the variable is used, at least for now.  */
+  ann->used = true;
+
   return ann;
 }
 

[-- Attachment #3: Type: text/plain, Size: 258 bytes --]



-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist      Red Hat Brazil Compiler Engineer

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-01-20 22:01   ` Alexandre Oliva
@ 2011-01-21 10:51     ` Richard Guenther
  2011-01-21 19:01       ` H.J. Lu
  2011-01-21 23:27     ` Alexandre Oliva
  1 sibling, 1 reply; 46+ messages in thread
From: Richard Guenther @ 2011-01-21 10:51 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: gcc-patches, Jan Hubicka

On Thu, Jan 20, 2011 at 10:27 PM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Jan 10, 2011, Richard Guenther <richard.guenther@gmail.com> wrote:
>
>> I guess I'm more happy with the patch if you switch the
>> initialization in create_var_ann to mark the variable used.
>
> Like this?

Yes.  This is ok if it passes bootstap & regtest.

Thanks,
Richard.

>
>
>
> --
> Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/   FSF Latin America board member
> Free Software Evangelist      Red Hat Brazil Compiler Engineer
>
>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-01-21 10:51     ` Richard Guenther
@ 2011-01-21 19:01       ` H.J. Lu
  2011-01-21 22:13         ` Alexandre Oliva
  0 siblings, 1 reply; 46+ messages in thread
From: H.J. Lu @ 2011-01-21 19:01 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Alexandre Oliva, gcc-patches, Jan Hubicka

On Fri, Jan 21, 2011 at 2:35 AM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Thu, Jan 20, 2011 at 10:27 PM, Alexandre Oliva <aoliva@redhat.com> wrote:
>> On Jan 10, 2011, Richard Guenther <richard.guenther@gmail.com> wrote:
>>
>>> I guess I'm more happy with the patch if you switch the
>>> initialization in create_var_ann to mark the variable used.
>>
>> Like this?
>
> Yes.  This is ok if it passes bootstap & regtest.
>

Alexandre, did you run bootstap before checking in your
change? It caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47402


-- 
H.J.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-01-21 19:01       ` H.J. Lu
@ 2011-01-21 22:13         ` Alexandre Oliva
  0 siblings, 0 replies; 46+ messages in thread
From: Alexandre Oliva @ 2011-01-21 22:13 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Richard Guenther, gcc-patches, Jan Hubicka

On Jan 21, 2011, "H.J. Lu" <hjl.tools@gmail.com> wrote:

> Alexandre, did you run bootstap before checking in your
> change?

Yep.  According to my records, I ran it on top of revision 169058.
However, my subsequent bootstrap, with revision 169093 (pristine),
showed a -fcompare-debug regression in fortran/module.o.  Very odd...
Looking into it.

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist      Red Hat Brazil Compiler Engineer

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-01-20 22:01   ` Alexandre Oliva
  2011-01-21 10:51     ` Richard Guenther
@ 2011-01-21 23:27     ` Alexandre Oliva
  2011-01-31 12:24       ` Alexandre Oliva
  1 sibling, 1 reply; 46+ messages in thread
From: Alexandre Oliva @ 2011-01-21 23:27 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc-patches, Jan Hubicka

On Jan 20, 2011, Alexandre Oliva <aoliva@redhat.com> wrote:

> 	* tree-dfa.c (create_var_ann): Mark variable as used.

> Index: gcc/tree-dfa.c
> ===================================================================
> --- gcc/tree-dfa.c.orig	2011-01-20 15:42:18.837908738 -0200
> +++ gcc/tree-dfa.c	2011-01-20 15:44:10.794077945 -0200
> @@ -137,6 +137,9 @@ create_var_ann (tree t)
>    ann = ggc_alloc_cleared_var_ann_d ();
>    *DECL_VAR_ANN_PTR (t) = ann;
 
> +  /* Assume the variable is used, at least for now.  */
> +  ann->used = true;
> +
>    return ann;
>  }
 
This hunk caused PR 47402, so I've reverted it, but I left the rest of
the patch in.  I'll be away tomorrow, but I'll look back into this on
Sunday to figure out why we're making different decisions regarding the
(partial) inlining of get_integer, and their different stack sizes.  I
suspect we may have to propagate used flags when cloning variables, or
run some pass to figure out what's used/unused after (partial) cloning,
for we seem to be looking at default-initialized (not recomputed)
information, a possibility that IIUC you and honza were concered about
when reviewing the original patch, but that went way over the top of my
head :-(

As I said, I'll look into it on Sunday, but any tips on where to insert
recomputation or somesuch would be highly appreciated.

TIA,

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist      Red Hat Brazil Compiler Engineer

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-01-21 23:27     ` Alexandre Oliva
@ 2011-01-31 12:24       ` Alexandre Oliva
  2011-01-31 14:01         ` Richard Guenther
  0 siblings, 1 reply; 46+ messages in thread
From: Alexandre Oliva @ 2011-01-31 12:24 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc-patches, Jan Hubicka

[-- Attachment #1: Type: text/plain, Size: 3022 bytes --]

On Jan 21, 2011, Alexandre Oliva <aoliva@redhat.com> wrote:

> we seem to be looking at default-initialized (not recomputed)
> information, a possibility that IIUC you and honza were concered about
> when reviewing the original patch, but that went way over the top of
> my head :-(

I ended up introducing some infrastructure to check that we don't use
uninitialized “used” in var_ann, and that helped me catch a few
problems.

First, it reported that var decls created corresponding to result decls
of inlined functions were added twice to the LOCAL_DECLs list.  The
first patch below fixes that.

Second, it showed that we completely failed to compute used flags after
inlining.  At first, I split the computation of used flags out of
remove_unused_locals() and refrain from actually removing variables, but
then I decided calling removed_unused_locals() after inlining might get
us smaller functions and more accurate computations for additional
inlining.

But that was not enough to ensure all accesses were properly
initialized.  Temporaries created after the referenced_vars gimple pass
are not recorded in referenced_vars, so the resetting of the used flag
in remove_unused_vars doesn't apply to them.  I considered going over
LOCAL_DECLS, but instead I decided to arrange for variables to be added
to referenced_vars as they are created.  I hope that's not just ok, but
also desirable.

With these fixes, in the second patch, I could get a full bootstrap and
library build on x86_64-linux-gnu, fixing both PR 47106 bug and the
regression it caused.


The third patch introduces an abstract interface to set and clear the
used flag, so that introducing the checks could be done with a simpler
patch, and the fourth patch introduces the rejection of uses before
initialization of this flag using a 3-state variable rather than a
boolean, and additional checks to verify that the flags are at the
expected states before we start computing what locals are unused.  I
don't expect these two patches to be installed, and I'm only posting
them for the record, but if you think they'd useful (say, the abstract
interface is desirable, or the additional checks should be enabled given
some compile-time --enable-checking flag), let me know and I'll prepare
and submit the patches for inclusion.


I'm regstrapping them all together on x86_64-linux-gnu and
i686-linux-gnu, at -O1 and -O3 in addition to the regular -O2 bootstrap
(with -fcompare-debug at stage3, just because), and then I'll regstrap
only the first two.  This will take some time, because of the hardware
failures I experienced, but I wanted to post the patches early so that,
if they require changes or aren't acceptable, I can save some
no-longer-copious machine time and devote it to the other bugs that are
on my plate.  Thanks for your understanding.

Ok to install the first 2 if they pass regstrap?  Opinions on the other 2?


1st: fix duplicate LOCAL_DECLs for inlined result decls


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: no-duplicate-inlined-return-local-decl.patch --]
[-- Type: text/x-diff, Size: 921 bytes --]

for  gcc/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	* tree-inline.c (declare_return_variable): Add result decl to
	local decls only once.

Index: gcc/tree-inline.c
===================================================================
--- gcc/tree-inline.c.orig	2011-01-29 22:21:06.066627335 -0200
+++ gcc/tree-inline.c	2011-01-29 22:23:01.011659701 -0200
@@ -2755,6 +2755,8 @@ declare_return_variable (copy_body_data 
   tree caller_type;
   tree var, use;
 
+  gcc_checking_assert (cfun->decl == caller);
+
   /* Handle type-mismatches in the function declaration return type
      vs. the call expression.  */
   if (modify_dest)
@@ -2864,7 +2866,6 @@ declare_return_variable (copy_body_data 
     }
 
   DECL_SEEN_IN_BIND_EXPR_P (var) = 1;
-  add_local_decl (DECL_STRUCT_FUNCTION (caller), var);
 
   /* Do not have the rest of GCC warn about this variable as it should
      not be visible to the user.  */

[-- Attachment #3: Type: text/plain, Size: 116 bytes --]


2nd: fix bug and regression: remove unused vars after inlining and add
newly-created variables to referenced_vars


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #4: ipa-compute-unused-locals-pr47106.patch --]
[-- Type: text/x-diff, Size: 1193 bytes --]

for  gcc/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	PR debug/47106
	PR debug/47402
	* tree-inline.c (tree_function_versioning): Remove unused locals.
	* gimple-low.c (record_vars_into): Mark newly-created variables
	as referenced.
	
Index: gcc/tree-inline.c
===================================================================
--- gcc/tree-inline.c.orig	2011-01-29 22:22:05.308644185 -0200
+++ gcc/tree-inline.c	2011-01-29 22:22:05.590644263 -0200
@@ -5250,6 +5250,9 @@ tree_function_versioning (tree old_decl,
 
   gcc_assert (!id.debug_stmts);
   VEC_free (gimple, heap, init_stmts);
+
+  remove_unused_locals ();
+
   pop_cfun ();
   current_function_decl = old_current_function_decl;
   gcc_assert (!current_function_decl
Index: gcc/gimple-low.c
===================================================================
--- gcc/gimple-low.c.orig	2011-01-29 22:32:12.653832705 -0200
+++ gcc/gimple-low.c	2011-01-29 22:32:48.931843458 -0200
@@ -907,6 +907,8 @@ record_vars_into (tree vars, tree fn)
 
       /* Record the variable.  */
       add_local_decl (cfun, var);
+      if (gimple_referenced_vars (cfun))
+	add_referenced_var (var);
     }
 
   if (fn != current_function_decl)

[-- Attachment #5: Type: text/plain, Size: 68 bytes --]


3rd: introduce abstract interface to clear and test the used flag


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #6: ipa-var-used-abstract-interface.patch --]
[-- Type: text/x-diff, Size: 3743 bytes --]

Index: gcc/cfgexpand.c
===================================================================
--- gcc/cfgexpand.c.orig	2011-01-28 07:19:45.337737263 -0200
+++ gcc/cfgexpand.c	2011-01-28 07:21:02.132628818 -0200
@@ -1325,7 +1325,7 @@ account_used_vars_for_block (tree block,
 
   /* Expand all variables at this level.  */
   for (t = BLOCK_VARS (block); t ; t = DECL_CHAIN (t))
-    if (var_ann (t) && var_ann (t)->used)
+    if (var_ann (t) && is_used_p (t))
       size += expand_one_var (t, toplevel, false);
 
   /* Expand all variables at containing levels.  */
Index: gcc/tree-flow-inline.h
===================================================================
--- gcc/tree-flow-inline.h.orig	2011-01-28 07:18:44.211743324 -0200
+++ gcc/tree-flow-inline.h	2011-01-28 07:21:02.158638868 -0200
@@ -569,9 +569,27 @@ static inline void
 set_is_used (tree var)
 {
   var_ann_t ann = get_var_ann (var);
-  ann->used = 1;
+  ann->used = true;
 }
 
+/* Clear VAR's used flag, so that it may be discarded during rtl
+   expansion.  */
+
+static inline void
+clear_is_used (tree var)
+{
+  var_ann_t ann = var_ann (var);
+  ann->used = false;
+}
+
+/* Return true if VAR is marked as used.  */
+
+static inline bool
+is_used_p (tree var)
+{
+  var_ann_t ann = var_ann (var);
+  return ann->used;
+}
 
 /* Return true if T (assumed to be a DECL) is a global variable.
    A variable is considered global if its storage is not automatic.  */
Index: gcc/tree-nrv.c
===================================================================
--- gcc/tree-nrv.c.orig	2011-01-28 07:18:44.211743324 -0200
+++ gcc/tree-nrv.c	2011-01-28 07:21:02.189650852 -0200
@@ -263,7 +263,7 @@ tree_nrv (void)
   DECL_HAS_VALUE_EXPR_P (found) = 1;
 
   /* FOUND is no longer used.  Ensure it gets removed.  */
-  var_ann (found)->used = 0;
+  clear_is_used (found);
   return 0;
 }
 
Index: gcc/tree-ssa-live.c
===================================================================
--- gcc/tree-ssa-live.c.orig	2011-01-28 07:19:45.526714936 -0200
+++ gcc/tree-ssa-live.c	2011-01-28 07:21:02.210658970 -0200
@@ -468,7 +468,7 @@ remove_unused_scope_block_p (tree scope)
 	 Exception are the scope blocks not containing any instructions
 	 at all so user can't get into the scopes at first place.  */
       else if ((ann = var_ann (*t)) != NULL
-		&& ann->used)
+	       && is_used_p (*t))
 	unused = false;
       else if (TREE_CODE (*t) == LABEL_DECL && TREE_USED (*t))
 	/* For labels that are still used in the IL, the decision to
@@ -708,7 +708,7 @@ remove_unused_locals (void)
 
   /* Assume all locals are unused.  */
   FOR_EACH_REFERENCED_VAR (t, rvi)
-    var_ann (t)->used = false;
+    clear_is_used (t);
 
   /* Walk the CFG marking all referenced symbols.  */
   FOR_EACH_BB (bb)
@@ -769,7 +769,7 @@ remove_unused_locals (void)
       var = VEC_index (tree, cfun->local_decls, srcidx);
       if (TREE_CODE (var) != FUNCTION_DECL
 	  && (!(ann = var_ann (var))
-	      || !ann->used))
+	      || !is_used_p (var)))
 	{
 	  if (is_global_var (var))
 	    {
@@ -801,7 +801,7 @@ remove_unused_locals (void)
 	if (TREE_CODE (var) == VAR_DECL
 	    && is_global_var (var)
 	    && (ann = var_ann (var)) != NULL
-	    && ann->used)
+	    && is_used_p (var))
 	  mark_all_vars_used (&DECL_INITIAL (var), global_unused_vars);
 
       num = VEC_length (tree, cfun->local_decls);
@@ -827,8 +827,8 @@ remove_unused_locals (void)
     if (!is_global_var (t)
 	&& TREE_CODE (t) != PARM_DECL
 	&& TREE_CODE (t) != RESULT_DECL
-	&& !(ann = var_ann (t))->used
-	&& !ann->is_heapvar)
+	&& !is_used_p (t)
+	&& !var_ann (t)->is_heapvar)
       remove_referenced_var (t);
   remove_unused_scope_block_p (DECL_INITIAL (current_function_decl));
   if (dump_file && (dump_flags & TDF_DETAILS))

[-- Attachment #7: Type: text/plain, Size: 166 bytes --]


4th: check that used flag is never accessed before initialization, and
that it's uninitialized after inlining and cleared after clearing it for
all referenced_vars


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #8: ipa-check-uninit-used.patch --]
[-- Type: text/x-diff, Size: 3292 bytes --]

Index: gcc/tree-flow-inline.h
===================================================================
--- gcc/tree-flow-inline.h.orig	2011-01-29 23:00:20.738072144 -0200
+++ gcc/tree-flow-inline.h	2011-01-29 23:00:21.006072156 -0200
@@ -569,7 +569,7 @@ static inline void
 set_is_used (tree var)
 {
   var_ann_t ann = get_var_ann (var);
-  ann->used = true;
+  ann->used3 = TRI_TRUE;
 }
 
 /* Clear VAR's used flag, so that it may be discarded during rtl
@@ -579,7 +579,7 @@ static inline void
 clear_is_used (tree var)
 {
   var_ann_t ann = var_ann (var);
-  ann->used = false;
+  ann->used3 = TRI_FALSE;
 }
 
 /* Return true if VAR is marked as used.  */
@@ -588,7 +588,8 @@ static inline bool
 is_used_p (tree var)
 {
   var_ann_t ann = var_ann (var);
-  return ann->used;
+  gcc_checking_assert (ann->used3 != TRI_UNINIT);
+  return ann->used3 == TRI_TRUE;
 }
 
 /* Return true if T (assumed to be a DECL) is a global variable.
Index: gcc/tree-flow.h
===================================================================
--- gcc/tree-flow.h.orig	2011-01-29 22:17:07.349555724 -0200
+++ gcc/tree-flow.h	2011-01-29 23:00:21.023072157 -0200
@@ -159,13 +159,15 @@ enum need_phi_state {
 };
 
 
+enum tristate { TRI_FALSE = -1, TRI_UNINIT = 0, TRI_TRUE = 1 };
+
 struct GTY(()) var_ann_d {
   /* Used when building base variable structures in a var_map.  */
   unsigned base_var_processed : 1;
 
   /* Nonzero if this variable was used after SSA optimizations were
      applied.  We set this when translating out of SSA form.  */
-  unsigned used : 1;
+  ENUM_BITFIELD (tristate) used3 : 2;
 
   /* This field indicates whether or not the variable may need PHI nodes.
      See the enum's definition for more detailed information about the
Index: gcc/tree-ssa-live.c
===================================================================
--- gcc/tree-ssa-live.c.orig	2011-01-29 23:00:20.780072146 -0200
+++ gcc/tree-ssa-live.c	2011-01-29 23:08:00.188089207 -0200
@@ -636,8 +636,8 @@ dump_scope_block (FILE *file, int indent
       var_ann_t ann;
 
       if ((ann = var_ann (var))
-	  && ann->used)
-	used = true;
+	  && ann->used3)
+	used = ann->used3 == TRI_TRUE;
 
       fprintf (file, "%*s",indent, "");
       print_generic_decl (file, var, flags);
@@ -709,6 +709,14 @@ remove_unused_locals (void)
   /* Assume all locals are unused.  */
   FOR_EACH_REFERENCED_VAR (t, rvi)
     clear_is_used (t);
+  FOR_EACH_LOCAL_DECL (cfun, num, t)
+    {
+      var_ann_t ann = var_ann (t);
+      if (!ann || is_global_var (t))
+	continue;
+
+      gcc_checking_assert (ann->used3 == TRI_FALSE);
+    }
 
   /* Walk the CFG marking all referenced symbols.  */
   FOR_EACH_BB (bb)
Index: gcc/tree-inline.c
===================================================================
--- gcc/tree-inline.c.orig	2011-01-29 23:00:20.493072133 -0200
+++ gcc/tree-inline.c	2011-01-29 23:00:21.091072160 -0200
@@ -5251,6 +5251,17 @@ tree_function_versioning (tree old_decl,
   gcc_assert (!id.debug_stmts);
   VEC_free (gimple, heap, init_stmts);
 
+#ifdef ENABLE_CHECKING
+  FOR_EACH_LOCAL_DECL (cfun, i, p)
+    {
+      var_ann_t ann = var_ann (p);
+      if (!ann || is_global_var (p))
+	continue;
+
+      gcc_checking_assert (ann->used3 == TRI_UNINIT);
+    }
+#endif
+
   remove_unused_locals ();
 
   pop_cfun ();

[-- Attachment #9: Type: text/plain, Size: 258 bytes --]



-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist      Red Hat Brazil Compiler Engineer

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-01-31 12:24       ` Alexandre Oliva
@ 2011-01-31 14:01         ` Richard Guenther
  2011-02-01 22:25           ` Alexandre Oliva
  2011-02-02  6:34           ` Alexandre Oliva
  0 siblings, 2 replies; 46+ messages in thread
From: Richard Guenther @ 2011-01-31 14:01 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: gcc-patches, Jan Hubicka

On Mon, Jan 31, 2011 at 11:58 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Jan 21, 2011, Alexandre Oliva <aoliva@redhat.com> wrote:
>
>> we seem to be looking at default-initialized (not recomputed)
>> information, a possibility that IIUC you and honza were concered about
>> when reviewing the original patch, but that went way over the top of
>> my head :-(
>
> I ended up introducing some infrastructure to check that we don't use
> uninitialized “used” in var_ann, and that helped me catch a few
> problems.
>
> First, it reported that var decls created corresponding to result decls
> of inlined functions were added twice to the LOCAL_DECLs list.  The
> first patch below fixes that.
>
> Second, it showed that we completely failed to compute used flags after
> inlining.  At first, I split the computation of used flags out of
> remove_unused_locals() and refrain from actually removing variables, but
> then I decided calling removed_unused_locals() after inlining might get
> us smaller functions and more accurate computations for additional
> inlining.
>
> But that was not enough to ensure all accesses were properly
> initialized.  Temporaries created after the referenced_vars gimple pass
> are not recorded in referenced_vars, so the resetting of the used flag
> in remove_unused_vars doesn't apply to them.  I considered going over
> LOCAL_DECLS, but instead I decided to arrange for variables to be added
> to referenced_vars as they are created.  I hope that's not just ok, but
> also desirable.
>
> With these fixes, in the second patch, I could get a full bootstrap and
> library build on x86_64-linux-gnu, fixing both PR 47106 bug and the
> regression it caused.
>
>
> The third patch introduces an abstract interface to set and clear the
> used flag, so that introducing the checks could be done with a simpler
> patch, and the fourth patch introduces the rejection of uses before
> initialization of this flag using a 3-state variable rather than a
> boolean, and additional checks to verify that the flags are at the
> expected states before we start computing what locals are unused.  I
> don't expect these two patches to be installed, and I'm only posting
> them for the record, but if you think they'd useful (say, the abstract
> interface is desirable, or the additional checks should be enabled given
> some compile-time --enable-checking flag), let me know and I'll prepare
> and submit the patches for inclusion.
>
>
> I'm regstrapping them all together on x86_64-linux-gnu and
> i686-linux-gnu, at -O1 and -O3 in addition to the regular -O2 bootstrap
> (with -fcompare-debug at stage3, just because), and then I'll regstrap
> only the first two.  This will take some time, because of the hardware
> failures I experienced, but I wanted to post the patches early so that,
> if they require changes or aren't acceptable, I can save some
> no-longer-copious machine time and devote it to the other bugs that are
> on my plate.  Thanks for your understanding.
>
> Ok to install the first 2 if they pass regstrap?  Opinions on the other 2?
>
>
> 1st: fix duplicate LOCAL_DECLs for inlined result decls

Ok (I think the assert is superfluous, a lot of things would already be
broken if that didn't hold ...).  A further patch might simply pass
a struct function to add_local_decl instead.

> 2nd: fix bug and regression: remove unused vars after inlining and add
> newly-created variables to referenced_vars

Err, you add remove-unused-vars in function versioning.  Which means
if that is useful we copied a function with unused locals - that's the place
this should (eventually) be fixed, not this one.

The 2nd part of the patch is ok (the gimple-low.c change).

>
>
> 3rd: introduce abstract interface to clear and test the used flag

+/* Clear VAR's used flag, so that it may be discarded during rtl
+   expansion.  */
+
+static inline void
+clear_is_used (tree var)

I think this should just say "Clear VAR's used flag.".  If it is just used
during RTL expansion that pass should simply compute it and
keep a local array/bitmap, not using var_ann (which really should go away).

That said, the patch is ok.

>
>
> 4th: check that used flag is never accessed before initialization, and
> that it's uninitialized after inlining and cleared after clearing it for
> all referenced_vars

This is not ok (at this stage anyway).  I think we should compute
"used" where we need it (like remove-unused-vars does).  If expand
needs it we should compute it there (I think there are no other users).

Richard.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-01-31 14:01         ` Richard Guenther
@ 2011-02-01 22:25           ` Alexandre Oliva
  2011-02-01 22:40             ` Richard Guenther
  2011-02-02  6:34           ` Alexandre Oliva
  1 sibling, 1 reply; 46+ messages in thread
From: Alexandre Oliva @ 2011-02-01 22:25 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc-patches, Jan Hubicka

On Jan 31, 2011, Richard Guenther <richard.guenther@gmail.com> wrote:

>> Second, it showed that we completely failed to compute used flags after
>> inlining.  At first, I split the computation of used flags out of
>> remove_unused_locals() and refrain from actually removing variables, but
>> then I decided calling removed_unused_locals() after inlining might get
>> us smaller functions and more accurate computations for additional
>> inlining.

> Err, you add remove-unused-vars in function versioning.  Which means
> if that is useful we copied a function with unused locals - that's the place
> this should (eventually) be fixed, not this one.

No, it just means we need part of the computation fo
remove_unused_vars().  Maybe my assumption above, that it would make a
difference in terms of function size and accuracy, is wrong, but we
still need to compute what is used and what isn't.

Simply setting newly-copied variables as used doesn't do it: if the
original function has unused user variables preserved for debug info, we
want them preserved as unused, rather than having them set to used as in
the previous version of the patch did.

I suppose we could simply copy the state of the used flag when
duplicating a variable.  I haven't tried that, but it's probabl worth a
shot, to avoid having to compute used/unused variables after versioning.

>> 4th: check that used flag is never accessed before initialization, and
>> that it's uninitialized after inlining and cleared after clearing it for
>> all referenced_vars

> This is not ok (at this stage anyway)

It wasn't meant to be installed anyway, it was just for the record.
Like the 3rd patch.  Now, if you find this useful for the next stage,
you think it should be enabled given some (new?) checking option?

> I think we should compute "used" where we need it (like
> remove-unused-vars does).  If expand needs it we should compute it
> there (I think there are no other users).

One of the problems is that we only compute used for referenced vars,
but I found we sometimes used it for LOCAL_DECLs or block-scope decls
that were not in referenced vars, so their flags weren't cleared or
recomputed.

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist      Red Hat Brazil Compiler Engineer

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-02-01 22:25           ` Alexandre Oliva
@ 2011-02-01 22:40             ` Richard Guenther
  2011-02-02  0:42               ` Alexandre Oliva
  0 siblings, 1 reply; 46+ messages in thread
From: Richard Guenther @ 2011-02-01 22:40 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: gcc-patches, Jan Hubicka

On Tue, Feb 1, 2011 at 11:25 PM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Jan 31, 2011, Richard Guenther <richard.guenther@gmail.com> wrote:
>
>>> Second, it showed that we completely failed to compute used flags after
>>> inlining.  At first, I split the computation of used flags out of
>>> remove_unused_locals() and refrain from actually removing variables, but
>>> then I decided calling removed_unused_locals() after inlining might get
>>> us smaller functions and more accurate computations for additional
>>> inlining.
>
>> Err, you add remove-unused-vars in function versioning.  Which means
>> if that is useful we copied a function with unused locals - that's the place
>> this should (eventually) be fixed, not this one.
>
> No, it just means we need part of the computation fo
> remove_unused_vars().  Maybe my assumption above, that it would make a
> difference in terms of function size and accuracy, is wrong, but we
> still need to compute what is used and what isn't.
>
> Simply setting newly-copied variables as used doesn't do it: if the
> original function has unused user variables preserved for debug info, we
> want them preserved as unused, rather than having them set to used as in
> the previous version of the patch did.
>
> I suppose we could simply copy the state of the used flag when
> duplicating a variable.  I haven't tried that, but it's probabl worth a
> shot, to avoid having to compute used/unused variables after versioning.
>
>>> 4th: check that used flag is never accessed before initialization, and
>>> that it's uninitialized after inlining and cleared after clearing it for
>>> all referenced_vars
>
>> This is not ok (at this stage anyway)
>
> It wasn't meant to be installed anyway, it was just for the record.
> Like the 3rd patch.  Now, if you find this useful for the next stage,
> you think it should be enabled given some (new?) checking option?
>
>> I think we should compute "used" where we need it (like
>> remove-unused-vars does).  If expand needs it we should compute it
>> there (I think there are no other users).
>
> One of the problems is that we only compute used for referenced vars,
> but I found we sometimes used it for LOCAL_DECLs or block-scope decls
> that were not in referenced vars, so their flags weren't cleared or
> recomputed.

Did you spot any other users than remove-unused-locals and expand?
What I said above is that expand should compute a used flag for
the variables it cares about (not necessarily restricted to referenced-vars),
and _not_ use var_ann ()->used for this (but for example a bitmap of UIDs).
var_ann () should go away.  eventually.

Richard.

> --
> Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/   FSF Latin America board member
> Free Software Evangelist      Red Hat Brazil Compiler Engineer
>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-02-01 22:40             ` Richard Guenther
@ 2011-02-02  0:42               ` Alexandre Oliva
  2011-02-02 11:30                 ` Richard Guenther
  0 siblings, 1 reply; 46+ messages in thread
From: Alexandre Oliva @ 2011-02-02  0:42 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc-patches, Jan Hubicka

On Feb  1, 2011, Richard Guenther <richard.guenther@gmail.com> wrote:

> Did you spot any other users than remove-unused-locals and expand?

Yeah, the just-introduced use that uses the used flag to estimate stack
sizes for inlining.

Do you suggest we should compute the global usedness of variables of
each function every time we're about to decide on inlining?  This sounds
very expensive to me.  Perhaps we could compute usedness for all
variables before going through global inlining/versioning decisions, but
then we might have to update it locally, for the inlined-into function
after each inlining, and for the new function version after it's
created, no?

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist      Red Hat Brazil Compiler Engineer

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-01-31 14:01         ` Richard Guenther
  2011-02-01 22:25           ` Alexandre Oliva
@ 2011-02-02  6:34           ` Alexandre Oliva
  1 sibling, 0 replies; 46+ messages in thread
From: Alexandre Oliva @ 2011-02-02  6:34 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc-patches, Jan Hubicka

[-- Attachment #1: Type: text/plain, Size: 882 bytes --]

On Jan 31, 2011, Richard Guenther <richard.guenther@gmail.com> wrote:

>> 1st: fix duplicate LOCAL_DECLs for inlined result decls

> Ok (I think the assert is superfluous, a lot of things would already be
> broken if that didn't hold ...)

Ok, assert removed, then installed a separate patch when I realized that
the removal of the assert made the caller variable unused.

> +/* Clear VAR's used flag, so that it may be discarded during rtl
> +   expansion.  */
> +
> +static inline void
> +clear_is_used (tree var)

> I think this should just say "Clear VAR's used flag.".

Adjusted.  I also adjusted the last remaining direct use of the used
field from a dumping function.

Here are the patches I installed so far.  I got my build machines back,
and I've nearly finished the testing of the other patches in my queue,
so I'll try copying var_ann->used when duplicating variables.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: no-duplicate-inlined-return-local-decl.patch --]
[-- Type: text/x-diff, Size: 1213 bytes --]

for  gcc/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	PR debug/47106
	PR debug/47402
	* tree-inline.c (declare_return_variable): Add result decl to
	local decls only once.
	* gimple-low.c (record_vars_into): Mark newly-created variables
	as referenced.

Index: gcc/tree-inline.c
===================================================================
--- gcc/tree-inline.c.orig	2011-02-02 03:41:34.689416199 -0200
+++ gcc/tree-inline.c	2011-02-02 03:43:02.887665322 -0200
@@ -2864,7 +2864,6 @@ declare_return_variable (copy_body_data 
     }
 
   DECL_SEEN_IN_BIND_EXPR_P (var) = 1;
-  add_local_decl (DECL_STRUCT_FUNCTION (caller), var);
 
   /* Do not have the rest of GCC warn about this variable as it should
      not be visible to the user.  */
Index: gcc/gimple-low.c
===================================================================
--- gcc/gimple-low.c.orig	2011-02-02 02:29:49.452253490 -0200
+++ gcc/gimple-low.c	2011-02-02 03:43:02.917664388 -0200
@@ -907,6 +907,8 @@ record_vars_into (tree vars, tree fn)
 
       /* Record the variable.  */
       add_local_decl (cfun, var);
+      if (gimple_referenced_vars (cfun))
+	add_referenced_var (var);
     }
 
   if (fn != current_function_decl)

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #3: no-duplicate-inlined-return-local-decl-more.patch --]
[-- Type: text/x-diff, Size: 639 bytes --]

for  gcc/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	PR debug/47106
	PR debug/47402
	* tree-inline.c (declare_return_variable): Remove unused caller
	variable.

Index: gcc/tree-inline.c
===================================================================
--- gcc/tree-inline.c.orig	2011-02-02 04:21:17.917531792 -0200
+++ gcc/tree-inline.c	2011-02-02 04:22:01.370137012 -0200
@@ -2749,7 +2749,6 @@ declare_return_variable (copy_body_data 
 			 basic_block entry_bb)
 {
   tree callee = id->src_fn;
-  tree caller = id->dst_fn;
   tree result = DECL_RESULT (callee);
   tree callee_type = TREE_TYPE (result);
   tree caller_type;

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #4: ipa-var-used-abstract-interface.patch --]
[-- Type: text/x-diff, Size: 4553 bytes --]

for  gcc/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	PR debug/47106
	PR debug/47402
	* tree-flow-inline.h (clear_is_used, is_used_p): New.
	* cfgexpand.c (account_used_vars_for_block): Use them.
	* tree-nrv.c (tree_nrv): Likewise.
	* tree-ssa-live.c (remove_unused_scope_block_p): Likewise.
	(dump_scope_block): Likewise.
	(remove_unused_locals): Likewise.

Index: gcc/cfgexpand.c
===================================================================
--- gcc/cfgexpand.c.orig	2011-01-31 09:04:28.000000000 -0200
+++ gcc/cfgexpand.c	2011-02-02 03:45:49.306492981 -0200
@@ -1325,7 +1325,7 @@ account_used_vars_for_block (tree block,
 
   /* Expand all variables at this level.  */
   for (t = BLOCK_VARS (block); t ; t = DECL_CHAIN (t))
-    if (var_ann (t) && var_ann (t)->used)
+    if (var_ann (t) && is_used_p (t))
       size += expand_one_var (t, toplevel, false);
 
   /* Expand all variables at containing levels.  */
Index: gcc/tree-flow-inline.h
===================================================================
--- gcc/tree-flow-inline.h.orig	2011-01-31 09:04:28.000000000 -0200
+++ gcc/tree-flow-inline.h	2011-02-02 03:59:32.083138770 -0200
@@ -569,9 +569,26 @@ static inline void
 set_is_used (tree var)
 {
   var_ann_t ann = get_var_ann (var);
-  ann->used = 1;
+  ann->used = true;
 }
 
+/* Clear VAR's used flag.  */
+
+static inline void
+clear_is_used (tree var)
+{
+  var_ann_t ann = var_ann (var);
+  ann->used = false;
+}
+
+/* Return true if VAR is marked as used.  */
+
+static inline bool
+is_used_p (tree var)
+{
+  var_ann_t ann = var_ann (var);
+  return ann->used;
+}
 
 /* Return true if T (assumed to be a DECL) is a global variable.
    A variable is considered global if its storage is not automatic.  */
Index: gcc/tree-nrv.c
===================================================================
--- gcc/tree-nrv.c.orig	2011-01-31 09:04:28.000000000 -0200
+++ gcc/tree-nrv.c	2011-02-02 03:45:49.314492733 -0200
@@ -263,7 +263,7 @@ tree_nrv (void)
   DECL_HAS_VALUE_EXPR_P (found) = 1;
 
   /* FOUND is no longer used.  Ensure it gets removed.  */
-  var_ann (found)->used = 0;
+  clear_is_used (found);
   return 0;
 }
 
Index: gcc/tree-ssa-live.c
===================================================================
--- gcc/tree-ssa-live.c.orig	2011-01-31 09:04:28.000000000 -0200
+++ gcc/tree-ssa-live.c	2011-02-02 04:10:16.132912744 -0200
@@ -468,7 +468,7 @@ remove_unused_scope_block_p (tree scope)
 	 Exception are the scope blocks not containing any instructions
 	 at all so user can't get into the scopes at first place.  */
       else if ((ann = var_ann (*t)) != NULL
-		&& ann->used)
+	       && is_used_p (*t))
 	unused = false;
       else if (TREE_CODE (*t) == LABEL_DECL && TREE_USED (*t))
 	/* For labels that are still used in the IL, the decision to
@@ -633,13 +633,11 @@ dump_scope_block (FILE *file, int indent
   for (var = BLOCK_VARS (scope); var; var = DECL_CHAIN (var))
     {
       bool used = false;
-      var_ann_t ann;
 
-      if ((ann = var_ann (var))
-	  && ann->used)
-	used = true;
+      if (var_ann (var))
+	used = is_used_p (var);
 
-      fprintf (file, "%*s",indent, "");
+      fprintf (file, "%*s", indent, "");
       print_generic_decl (file, var, flags);
       fprintf (file, "%s\n", used ? "" : " (unused)");
     }
@@ -708,7 +706,7 @@ remove_unused_locals (void)
 
   /* Assume all locals are unused.  */
   FOR_EACH_REFERENCED_VAR (t, rvi)
-    var_ann (t)->used = false;
+    clear_is_used (t);
 
   /* Walk the CFG marking all referenced symbols.  */
   FOR_EACH_BB (bb)
@@ -769,7 +767,7 @@ remove_unused_locals (void)
       var = VEC_index (tree, cfun->local_decls, srcidx);
       if (TREE_CODE (var) != FUNCTION_DECL
 	  && (!(ann = var_ann (var))
-	      || !ann->used))
+	      || !is_used_p (var)))
 	{
 	  if (is_global_var (var))
 	    {
@@ -801,7 +799,7 @@ remove_unused_locals (void)
 	if (TREE_CODE (var) == VAR_DECL
 	    && is_global_var (var)
 	    && (ann = var_ann (var)) != NULL
-	    && ann->used)
+	    && is_used_p (var))
 	  mark_all_vars_used (&DECL_INITIAL (var), global_unused_vars);
 
       num = VEC_length (tree, cfun->local_decls);
@@ -827,8 +825,8 @@ remove_unused_locals (void)
     if (!is_global_var (t)
 	&& TREE_CODE (t) != PARM_DECL
 	&& TREE_CODE (t) != RESULT_DECL
-	&& !(ann = var_ann (t))->used
-	&& !ann->is_heapvar)
+	&& !is_used_p (t)
+	&& !var_ann (t)->is_heapvar)
       remove_referenced_var (t);
   remove_unused_scope_block_p (DECL_INITIAL (current_function_decl));
   if (dump_file && (dump_flags & TDF_DETAILS))

[-- Attachment #5: Type: text/plain, Size: 258 bytes --]



-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist      Red Hat Brazil Compiler Engineer

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-02-02  0:42               ` Alexandre Oliva
@ 2011-02-02 11:30                 ` Richard Guenther
  2011-02-02 19:35                   ` Alexandre Oliva
  0 siblings, 1 reply; 46+ messages in thread
From: Richard Guenther @ 2011-02-02 11:30 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: gcc-patches, Jan Hubicka

On Wed, Feb 2, 2011 at 1:42 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Feb  1, 2011, Richard Guenther <richard.guenther@gmail.com> wrote:
>
>> Did you spot any other users than remove-unused-locals and expand?
>
> Yeah, the just-introduced use that uses the used flag to estimate stack
> sizes for inlining.
>
> Do you suggest we should compute the global usedness of variables of
> each function every time we're about to decide on inlining?  This sounds
> very expensive to me.  Perhaps we could compute usedness for all
> variables before going through global inlining/versioning decisions, but
> then we might have to update it locally, for the inlined-into function
> after each inlining, and for the new function version after it's
> created, no?

No.  It's an estimate only and thus can even use a simpler implementation
compared to that in cfgexpand.  Also for inline parameter computation
we already look at all statements, so the cost of determining used
variables looks cheap if it is done at the same time.

Richard.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-02-02 11:30                 ` Richard Guenther
@ 2011-02-02 19:35                   ` Alexandre Oliva
  2011-02-03  5:51                     ` Alexandre Oliva
  0 siblings, 1 reply; 46+ messages in thread
From: Alexandre Oliva @ 2011-02-02 19:35 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc-patches, Jan Hubicka

On Feb  2, 2011, Richard Guenther <richard.guenther@gmail.com> wrote:

> No.  It's an estimate only and thus can even use a simpler implementation
> compared to that in cfgexpand.

Ok, but it can't be so simple that it gives different results depending
on whether or not unused variables were retained to debug information,
and this is what's hitting us ATM.

Perhaps relying on the used flag and going over all scope blocks wasn't
such a good idea, after all; I suppose we could get something more
reliable using referenced_vars only.  Although I'm finding it difficult
to figure out whether presence in referenced_vars should ever be
different from having the used flag marked, it appears that presence in
referenced_vars is maintained more accurately, even during inlining.

My earlier patch, that calls remove_unused_locals after versioning, and
an earlier version, that split out the part of remove_unused_locals that
only computed used and called just that, both work, even when checking
for uninitialized uses of used, but my attempts to copy the used flag
from the inlined function are proving to be too much of a headache.  I
think I learned to not try to fit a round block in a square hole when I
was very, very young ;-)

I'm going to trying to fix the problem of stack size estimation based on
referenced_vars only.

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist      Red Hat Brazil Compiler Engineer

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-02-02 19:35                   ` Alexandre Oliva
@ 2011-02-03  5:51                     ` Alexandre Oliva
  2011-02-03 10:19                       ` Richard Guenther
  0 siblings, 1 reply; 46+ messages in thread
From: Alexandre Oliva @ 2011-02-03  5:51 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc-patches, Jan Hubicka

[-- Attachment #1: Type: text/plain, Size: 828 bytes --]

On Feb  2, 2011, Alexandre Oliva <aoliva@redhat.com> wrote:

> Perhaps relying on the used flag and going over all scope blocks wasn't
> such a good idea, after all; I suppose we could get something more
> reliable using referenced_vars only.  Although I'm finding it difficult
> to figure out whether presence in referenced_vars should ever be
> different from having the used flag marked, it appears that presence in
> referenced_vars is maintained more accurately, even during inlining.

Indeed, early in compilation we don't have referenced_vars at all, but
we can make do without them, going through the scope blocks.

Something very close to this patch (I'd accidentally left
init_vars_expansion() out) regstrapped without regressions on
x86_64-linux-gnu and i686-pc-linux-gnu.  I'm now retesting this.  Ok if
it passes?


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: ipa-inline-stack-growth-referenced-vars-pr47106.patch --]
[-- Type: text/x-diff, Size: 6295 bytes --]

for  gcc/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	PR debug/47106
	PR debug/47402
	* cfgexpand.c (account_used_vars_for_block): Disregard used flag.
	(estimated_stack_frame_size): Prefer referenced vars over scope
	block vars.
	* tree-flow.h (referenced_var_lookup_in): Declare.
	* tree-inline.c (remap_decl): Only mark VAR_DECLs as referenced
	that were referenced in the original function.
	(copy_decl_for_dup_finish): Likewise.
	* tree-dfa.c (referenced_var_lookup_in): Split out of...
	(referenced_var_lookup): ... this.

for  gcc/testsuite/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	PR debug/47106
	PR debug/47402
	* g++.dg/debug/pr47106.C: New.

Index: gcc/cfgexpand.c
===================================================================
--- gcc/cfgexpand.c.orig	2011-02-02 19:58:10.013374804 -0200
+++ gcc/cfgexpand.c	2011-02-02 20:21:51.148296315 -0200
@@ -1325,8 +1325,7 @@ account_used_vars_for_block (tree block,
 
   /* Expand all variables at this level.  */
   for (t = BLOCK_VARS (block); t ; t = DECL_CHAIN (t))
-    if (var_ann (t) && is_used_p (t))
-      size += expand_one_var (t, toplevel, false);
+    size += expand_one_var (t, toplevel, false);
 
   /* Expand all variables at containing levels.  */
   for (t = BLOCK_SUBBLOCKS (block); t ; t = BLOCK_CHAIN (t))
@@ -1381,20 +1380,38 @@ estimated_stack_frame_size (tree decl)
   size_t i;
   tree var, outer_block = DECL_INITIAL (current_function_decl);
   unsigned ix;
+  referenced_var_iterator rvi;
   tree old_cur_fun_decl = current_function_decl;
+
   current_function_decl = decl;
   push_cfun (DECL_STRUCT_FUNCTION (decl));
 
-  init_vars_expansion ();
-
-  FOR_EACH_LOCAL_DECL (cfun, ix, var)
+  /* We want to count only referenced vars, but if asked to estimate
+     stack size before we compute them, guess based on local decls and
+     scopes.  We have to make sure we compute the same values for
+     debug and non-debug compilations, in spite of the removal of
+     unused variables from lexical blocks, but this removal never
+     occurs before referenced vars are computed.  Even when we perform
+     inlining and versioning, we register referenced vars, and can
+     thus use them for further stack size estimation.  */
+  if (gimple_in_ssa_p (cfun))
+    {
+      gcc_checking_assert (gimple_referenced_vars (cfun));
+      FOR_EACH_REFERENCED_VAR (var, rvi)
+	size += expand_one_var (var, true, false);
+    }
+  else
     {
-      /* TREE_USED marks local variables that do not appear in lexical
-	 blocks.  We don't want to expand those that do twice.  */
-      if (TREE_USED (var))
-        size += expand_one_var (var, true, false);
+      FOR_EACH_LOCAL_DECL (cfun, ix, var)
+	{
+	  /* TREE_USED marks local variables that do not appear in
+	     lexical blocks.  We don't want to expand those that do
+	     twice.  */
+	  if (TREE_USED (var))
+	    size += expand_one_var (var, true, false);
+	}
+      size += account_used_vars_for_block (outer_block, true);
     }
-  size += account_used_vars_for_block (outer_block, true);
 
   if (stack_vars_num > 0)
     {
Index: gcc/tree-flow.h
===================================================================
--- gcc/tree-flow.h.orig	2011-02-02 17:31:40.430302181 -0200
+++ gcc/tree-flow.h	2011-02-02 20:05:17.169414197 -0200
@@ -319,6 +319,7 @@ typedef struct
        !end_referenced_vars_p (&(ITER)); \
        (VAR) = next_referenced_var (&(ITER)))
 
+extern tree referenced_var_lookup_in (htab_t, unsigned int);
 extern tree referenced_var_lookup (unsigned int);
 extern bool referenced_var_check_and_insert (tree);
 #define num_referenced_vars htab_elements (gimple_referenced_vars (cfun))
Index: gcc/tree-inline.c
===================================================================
--- gcc/tree-inline.c.orig	2011-02-02 17:31:41.279276897 -0200
+++ gcc/tree-inline.c	2011-02-03 00:08:30.100962123 -0200
@@ -317,7 +317,12 @@ remap_decl (tree decl, copy_body_data *i
 	      || TREE_CODE (t) == RESULT_DECL || TREE_CODE (t) == PARM_DECL))
 	{
 	  get_var_ann (t);
-	  add_referenced_var (t);
+	  if (TREE_CODE (decl) != VAR_DECL
+	      || !gimple_referenced_vars (DECL_STRUCT_FUNCTION (id->src_fn))
+	      || referenced_var_lookup_in (gimple_referenced_vars
+					   (DECL_STRUCT_FUNCTION (id->src_fn)),
+					   DECL_UID (decl)))
+	    add_referenced_var (t);
 	}
       return t;
     }
@@ -4753,6 +4758,14 @@ copy_decl_for_dup_finish (copy_body_data
        new function.  */
     DECL_CONTEXT (copy) = id->dst_fn;
 
+  if (TREE_CODE (decl) == VAR_DECL
+      && gimple_referenced_vars (cfun)
+      && gimple_referenced_vars (DECL_STRUCT_FUNCTION (id->src_fn))
+      && referenced_var_lookup_in (gimple_referenced_vars
+				   (DECL_STRUCT_FUNCTION (id->src_fn)),
+				   DECL_UID (decl)))
+    add_referenced_var (copy);
+
   return copy;
 }
 
Index: gcc/tree-dfa.c
===================================================================
--- gcc/tree-dfa.c.orig	2011-02-02 16:21:12.934940723 -0200
+++ gcc/tree-dfa.c	2011-02-02 20:05:17.249411771 -0200
@@ -490,10 +490,18 @@ find_referenced_vars_in (gimple stmt)
 tree
 referenced_var_lookup (unsigned int uid)
 {
+  return referenced_var_lookup_in (gimple_referenced_vars (cfun), uid);
+}
+
+/* Lookup UID in REFVARS and return the associated variable.  */
+
+tree
+referenced_var_lookup_in (htab_t refvars, unsigned int uid)
+{
   tree h;
   struct tree_decl_minimal in;
   in.uid = uid;
-  h = (tree) htab_find_with_hash (gimple_referenced_vars (cfun), &in, uid);
+  h = (tree) htab_find_with_hash (refvars, &in, uid);
   return h;
 }
 
Index: gcc/testsuite/g++.dg/debug/pr47106.C
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ gcc/testsuite/g++.dg/debug/pr47106.C	2011-02-02 20:18:16.826787838 -0200
@@ -0,0 +1,37 @@
+// { dg-do compile }
+// { dg-options "-O -fpartial-inlining -flto -fconserve-stack -fcompare-debug" }
+
+void end (int, int) __attribute__ ((__noreturn__));
+
+struct S
+{
+  int i;
+  S *s;
+};
+
+inline bool f (S *s)
+{
+  if (!s->s)
+    end (0, 0);
+  return s->s == s;
+}
+
+inline bool
+baz (S s1, S)
+{
+  while (f (&s1));
+}
+
+inline bool
+bar (S s1, S s2, S)
+{
+  baz (s1, s2);
+}
+
+S getS ();
+
+bool
+foo ()
+{
+  bar (getS (), getS (), getS ());
+}

[-- Attachment #3: Type: text/plain, Size: 257 bytes --]


-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist      Red Hat Brazil Compiler Engineer

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-02-03  5:51                     ` Alexandre Oliva
@ 2011-02-03 10:19                       ` Richard Guenther
  2011-02-03 13:25                         ` Jan Hubicka
  2011-02-04  7:49                         ` Alexandre Oliva
  0 siblings, 2 replies; 46+ messages in thread
From: Richard Guenther @ 2011-02-03 10:19 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: gcc-patches, Jan Hubicka

On Thu, Feb 3, 2011 at 6:50 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Feb  2, 2011, Alexandre Oliva <aoliva@redhat.com> wrote:
>
>> Perhaps relying on the used flag and going over all scope blocks wasn't
>> such a good idea, after all; I suppose we could get something more
>> reliable using referenced_vars only.  Although I'm finding it difficult
>> to figure out whether presence in referenced_vars should ever be
>> different from having the used flag marked, it appears that presence in
>> referenced_vars is maintained more accurately, even during inlining.
>
> Indeed, early in compilation we don't have referenced_vars at all, but
> we can make do without them, going through the scope blocks.

I think we can simply move pass_inline_parameters to after
pass_referenced_vars.  It doesn't make much sense at its current
position.  Honza?

referenced_var_lookup_in should rather take a struct function argument
than a hashtable (in fact, given the few existing callers I'd just change
the referenced_var_lookup signature).

Can you do a quick comparison between the old and the new
stack frame estimations on some random files from gcc?

Thanks,
Richard.

> Something very close to this patch (I'd accidentally left
> init_vars_expansion() out) regstrapped without regressions on
> x86_64-linux-gnu and i686-pc-linux-gnu.  I'm now retesting this.  Ok if
> it passes?
>
>
>
> --
> Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/   FSF Latin America board member
> Free Software Evangelist      Red Hat Brazil Compiler Engineer
>
>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-02-03 10:19                       ` Richard Guenther
@ 2011-02-03 13:25                         ` Jan Hubicka
  2011-02-04  7:49                         ` Alexandre Oliva
  1 sibling, 0 replies; 46+ messages in thread
From: Jan Hubicka @ 2011-02-03 13:25 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Alexandre Oliva, gcc-patches, Jan Hubicka

> On Thu, Feb 3, 2011 at 6:50 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
> > On Feb  2, 2011, Alexandre Oliva <aoliva@redhat.com> wrote:
> >
> >> Perhaps relying on the used flag and going over all scope blocks wasn't
> >> such a good idea, after all; I suppose we could get something more
> >> reliable using referenced_vars only.  Although I'm finding it difficult
> >> to figure out whether presence in referenced_vars should ever be
> >> different from having the used flag marked, it appears that presence in
> >> referenced_vars is maintained more accurately, even during inlining.
> >
> > Indeed, early in compilation we don't have referenced_vars at all, but
> > we can make do without them, going through the scope blocks.
> 
> I think we can simply move pass_inline_parameters to after
> pass_referenced_vars.  It doesn't make much sense at its current
> position.  Honza?

Yes, it is there because we used to have early early inlining for profling.
I would suggest to move the first pass just before early inlining.  

I hope early inliner won't be confused by seeing functions with parameters
not computed yet, but those are not inlinable anyway since they are not yet
in SSA form.

Honza
> 
> referenced_var_lookup_in should rather take a struct function argument
> than a hashtable (in fact, given the few existing callers I'd just change
> the referenced_var_lookup signature).
> 
> Can you do a quick comparison between the old and the new
> stack frame estimations on some random files from gcc?
> 
> Thanks,
> Richard.
> 
> > Something very close to this patch (I'd accidentally left
> > init_vars_expansion() out) regstrapped without regressions on
> > x86_64-linux-gnu and i686-pc-linux-gnu.  I'm now retesting this.  Ok if
> > it passes?
> >
> >
> >
> > --
> > Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
> > You must be the change you wish to see in the world. -- Gandhi
> > Be Free! -- http://FSFLA.org/   FSF Latin America board member
> > Free Software Evangelist      Red Hat Brazil Compiler Engineer
> >
> >

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-02-03 10:19                       ` Richard Guenther
  2011-02-03 13:25                         ` Jan Hubicka
@ 2011-02-04  7:49                         ` Alexandre Oliva
  2011-02-07 12:32                           ` Alexandre Oliva
  1 sibling, 1 reply; 46+ messages in thread
From: Alexandre Oliva @ 2011-02-04  7:49 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc-patches, Jan Hubicka

[-- Attachment #1: Type: text/plain, Size: 2462 bytes --]

On Feb  3, 2011, Richard Guenther <richard.guenther@gmail.com> wrote:

> On Thu, Feb 3, 2011 at 6:50 AM, Alexandre Oliva <aoliva@redhat.com> wrote:
>> On Feb  2, 2011, Alexandre Oliva <aoliva@redhat.com> wrote:
>> 
>>> Perhaps relying on the used flag and going over all scope blocks wasn't
>>> such a good idea, after all; I suppose we could get something more
>>> reliable using referenced_vars only.  Although I'm finding it difficult
>>> to figure out whether presence in referenced_vars should ever be
>>> different from having the used flag marked, it appears that presence in
>>> referenced_vars is maintained more accurately, even during inlining.
>> 
>> Indeed, early in compilation we don't have referenced_vars at all, but
>> we can make do without them, going through the scope blocks.

> I think we can simply move pass_inline_parameters to after
> pass_referenced_vars.

That isn't enough.  We often attempt to estimate the stack size of a
function before we as much as put it in SSA form (let alone compute
referenced_vras), while processing other functions.

> referenced_var_lookup_in should rather take a struct function argument
> than a hashtable (in fact, given the few existing callers I'd just change
> the referenced_var_lookup signature).

*nod*

> Can you do a quick comparison between the old and the new
> stack frame estimations on some random files from gcc?

I guess...  Once we have another working patch.  My previous patch did
away with the need for rearranging passes by using the current
computation when referenced_vars are not available, but if we switch to
referenced_vars only, we have to figure out how to rearrange the passes
so that we make reasonable estimates.

Here's where I stand ATM.  The first patch is supposed to implement the
changes you suggested, plus some attempts to fix things up or otherwise
detect problems.  It finds too many.  The second patch restores some of
the removed functionality to error out if the stack size computations
differ.  None of these are meant for inclusion, I post them for the
record in case any of you guys would like to pick it up while I get some
sleep.  Suggestions on how to proceed are welcome.  I'm thinking of
rearranging the passes so we compute referenced_vars for all functions
before advancing to other passes for any other functions, but I have no
idea of how to do that.  I'll dig it up if nobody beats me to it.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: ipa-inline-stack-growth-referenced-vars-pr47106.patch --]
[-- Type: text/x-diff, Size: 10734 bytes --]

for  gcc/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	PR debug/47106
	PR debug/47402
	* cfgexpand.c (account_used_vars_for_block): Disregard used flag.
	(estimated_stack_frame_size): Prefer referenced vars over scope
	block vars.
	* tree-flow.h (referenced_var_lookup_in): Declare.
	* tree-inline.c (remap_decl): Only mark VAR_DECLs as referenced
	that were referenced in the original function.
	(copy_decl_for_dup_finish): Likewise.
	* tree-dfa.c (referenced_var_lookup_in): Split out of...
	(referenced_var_lookup): ... this.

for  gcc/testsuite/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	PR debug/47106
	PR debug/47402
	* g++.dg/debug/pr47106.C: New.

Index: gcc/cfgexpand.c
===================================================================
--- gcc/cfgexpand.c.orig	2011-02-04 05:33:51.822915638 -0200
+++ gcc/cfgexpand.c	2011-02-04 05:34:59.067593689 -0200
@@ -520,7 +520,7 @@ update_alias_info_with_stack_vars (void)
 	     for -O0 where we are preserving even unreferenced variables.  */
 	  gcc_assert (DECL_P (decl)
 		      && (!optimize
-			  || referenced_var_lookup (DECL_UID (decl))));
+			  || referenced_var_lookup (cfun, DECL_UID (decl))));
 	  bitmap_set_bit (part, uid);
 	  *((bitmap *) pointer_map_insert (decls_to_partitions,
 					   (void *)(size_t) uid)) = part;
@@ -1311,30 +1311,6 @@ create_stack_guard (void)
   crtl->stack_protect_guard = guard;
 }
 
-/* A subroutine of expand_used_vars.  Walk down through the BLOCK tree
-   expanding variables.  Those variables that can be put into registers
-   are allocated pseudos; those that can't are put on the stack.
-
-   TOPLEVEL is true if this is the outermost BLOCK.  */
-
-static HOST_WIDE_INT
-account_used_vars_for_block (tree block, bool toplevel)
-{
-  tree t;
-  HOST_WIDE_INT size = 0;
-
-  /* Expand all variables at this level.  */
-  for (t = BLOCK_VARS (block); t ; t = DECL_CHAIN (t))
-    if (var_ann (t) && is_used_p (t))
-      size += expand_one_var (t, toplevel, false);
-
-  /* Expand all variables at containing levels.  */
-  for (t = BLOCK_SUBBLOCKS (block); t ; t = BLOCK_CHAIN (t))
-    size += account_used_vars_for_block (t, false);
-
-  return size;
-}
-
 /* Prepare for expanding variables.  */
 static void
 init_vars_expansion (void)
@@ -1379,22 +1355,19 @@ estimated_stack_frame_size (tree decl)
 {
   HOST_WIDE_INT size = 0;
   size_t i;
-  tree var, outer_block = DECL_INITIAL (current_function_decl);
-  unsigned ix;
+  tree var;
   tree old_cur_fun_decl = current_function_decl;
+  referenced_var_iterator rvi;
+
+  if (!gimple_referenced_vars (cfun))
+    return -1;
+
   current_function_decl = decl;
   push_cfun (DECL_STRUCT_FUNCTION (decl));
 
-  init_vars_expansion ();
-
-  FOR_EACH_LOCAL_DECL (cfun, ix, var)
-    {
-      /* TREE_USED marks local variables that do not appear in lexical
-	 blocks.  We don't want to expand those that do twice.  */
-      if (TREE_USED (var))
-        size += expand_one_var (var, true, false);
-    }
-  size += account_used_vars_for_block (outer_block, true);
+  gcc_checking_assert (gimple_referenced_vars (cfun));
+  FOR_EACH_REFERENCED_VAR (var, rvi)
+    size += expand_one_var (var, true, false);
 
   if (stack_vars_num > 0)
     {
Index: gcc/tree-flow.h
===================================================================
--- gcc/tree-flow.h.orig	2011-02-04 05:33:52.413868892 -0200
+++ gcc/tree-flow.h	2011-02-04 05:34:59.082592500 -0200
@@ -319,7 +319,7 @@ typedef struct
        !end_referenced_vars_p (&(ITER)); \
        (VAR) = next_referenced_var (&(ITER)))
 
-extern tree referenced_var_lookup (unsigned int);
+extern tree referenced_var_lookup (struct function *, unsigned int);
 extern bool referenced_var_check_and_insert (tree);
 #define num_referenced_vars htab_elements (gimple_referenced_vars (cfun))
 
Index: gcc/tree-inline.c
===================================================================
--- gcc/tree-inline.c.orig	2011-02-04 05:33:52.276879729 -0200
+++ gcc/tree-inline.c	2011-02-04 05:34:59.121589412 -0200
@@ -317,7 +317,11 @@ remap_decl (tree decl, copy_body_data *i
 	      || TREE_CODE (t) == RESULT_DECL || TREE_CODE (t) == PARM_DECL))
 	{
 	  get_var_ann (t);
-	  add_referenced_var (t);
+	  if (TREE_CODE (decl) != VAR_DECL
+	      || !gimple_referenced_vars (DECL_STRUCT_FUNCTION (id->src_fn))
+	      || referenced_var_lookup (DECL_STRUCT_FUNCTION (id->src_fn),
+					DECL_UID (decl)))
+	    add_referenced_var (t);
 	}
       return t;
     }
@@ -4753,6 +4757,13 @@ copy_decl_for_dup_finish (copy_body_data
        new function.  */
     DECL_CONTEXT (copy) = id->dst_fn;
 
+  if (TREE_CODE (decl) == VAR_DECL
+      && gimple_referenced_vars (cfun)
+      && gimple_referenced_vars (DECL_STRUCT_FUNCTION (id->src_fn))
+      && referenced_var_lookup (DECL_STRUCT_FUNCTION (id->src_fn),
+				DECL_UID (decl)))
+    add_referenced_var (copy);
+
   return copy;
 }
 
Index: gcc/tree-dfa.c
===================================================================
--- gcc/tree-dfa.c.orig	2011-02-04 05:33:52.126891590 -0200
+++ gcc/tree-dfa.c	2011-02-04 05:34:59.139587983 -0200
@@ -488,12 +488,12 @@ find_referenced_vars_in (gimple stmt)
    variable.  */
 
 tree
-referenced_var_lookup (unsigned int uid)
+referenced_var_lookup (struct function *fn, unsigned int uid)
 {
   tree h;
   struct tree_decl_minimal in;
   in.uid = uid;
-  h = (tree) htab_find_with_hash (gimple_referenced_vars (cfun), &in, uid);
+  h = (tree) htab_find_with_hash (gimple_referenced_vars (fn), &in, uid);
   return h;
 }
 
Index: gcc/testsuite/g++.dg/debug/pr47106.C
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ gcc/testsuite/g++.dg/debug/pr47106.C	2011-02-04 05:34:59.190583952 -0200
@@ -0,0 +1,37 @@
+// { dg-do compile }
+// { dg-options "-O -fpartial-inlining -flto -fconserve-stack -fcompare-debug" }
+
+void end (int, int) __attribute__ ((__noreturn__));
+
+struct S
+{
+  int i;
+  S *s;
+};
+
+inline bool f (S *s)
+{
+  if (!s->s)
+    end (0, 0);
+  return s->s == s;
+}
+
+inline bool
+baz (S s1, S)
+{
+  while (f (&s1));
+}
+
+inline bool
+bar (S s1, S s2, S)
+{
+  baz (s1, s2);
+}
+
+S getS ();
+
+bool
+foo ()
+{
+  bar (getS (), getS (), getS ());
+}
Index: gcc/gimple-pretty-print.c
===================================================================
--- gcc/gimple-pretty-print.c.orig	2011-02-04 05:33:51.900909541 -0200
+++ gcc/gimple-pretty-print.c	2011-02-04 05:34:59.218581733 -0200
@@ -542,7 +542,7 @@ pp_points_to_solution (pretty_printer *b
       pp_string (buffer, "{ ");
       EXECUTE_IF_SET_IN_BITMAP (pt->vars, 0, i, bi)
 	{
-	  tree var = referenced_var_lookup (i);
+	  tree var = referenced_var_lookup (cfun, i);
 	  if (var)
 	    {
 	      dump_generic_node (buffer, var, 0, dump_flags, false);
Index: gcc/passes.c
===================================================================
--- gcc/passes.c.orig	2011-02-04 05:33:52.192886360 -0200
+++ gcc/passes.c	2011-02-04 05:34:59.244579675 -0200
@@ -729,7 +729,6 @@ init_optimization_passes (void)
   NEXT_PASS (pass_build_cfg);
   NEXT_PASS (pass_warn_function_return);
   NEXT_PASS (pass_build_cgraph_edges);
-  NEXT_PASS (pass_inline_parameters);
   *p = NULL;
 
   /* Interprocedural optimization passes.  */
@@ -753,6 +752,7 @@ init_optimization_passes (void)
 	 inline functions to be inlined even at -O0.  This does not
 	 happen during the first early inline pass.  */
       NEXT_PASS (pass_rebuild_cgraph_edges);
+      NEXT_PASS (pass_inline_parameters);
       NEXT_PASS (pass_early_inline);
       NEXT_PASS (pass_all_early_optimizations);
 	{
Index: gcc/tree-flow-inline.h
===================================================================
--- gcc/tree-flow-inline.h.orig	2011-02-04 05:33:51.976903462 -0200
+++ gcc/tree-flow-inline.h	2011-02-04 05:34:59.256578726 -0200
@@ -103,7 +103,7 @@ next_htab_element (htab_iterator *hti)
 static inline tree
 referenced_var (unsigned int uid)
 {
-  tree var = referenced_var_lookup (uid);
+  tree var = referenced_var_lookup (cfun, uid);
   gcc_assert (var || uid == 0);
   return var;
 }
Index: gcc/tree-into-ssa.c
===================================================================
--- gcc/tree-into-ssa.c.orig	2011-02-04 05:33:52.052897449 -0200
+++ gcc/tree-into-ssa.c	2011-02-04 05:34:59.280576826 -0200
@@ -1469,7 +1469,7 @@ dump_decl_set (FILE *file, bitmap set)
 
       EXECUTE_IF_SET_IN_BITMAP (set, 0, i, bi)
 	{
-	  tree var = referenced_var_lookup (i);
+	  tree var = referenced_var_lookup (cfun, i);
 	  if (var)
 	    print_generic_expr (file, var, 0);
 	  else
Index: gcc/tree-ssa.c
===================================================================
--- gcc/tree-ssa.c.orig	2011-02-04 05:33:52.345874268 -0200
+++ gcc/tree-ssa.c	2011-02-04 05:34:59.298575401 -0200
@@ -1902,7 +1902,7 @@ maybe_optimize_var (tree var, bitmap add
 
   /* If the variable is not in the list of referenced vars then we
      do not need to touch it nor can we rename it.  */
-  if (!referenced_var_lookup (DECL_UID (var)))
+  if (!referenced_var_lookup (cfun, DECL_UID (var)))
     return false;
 
   if (TREE_ADDRESSABLE (var)
Index: gcc/ipa-inline.c
===================================================================
--- gcc/ipa-inline.c.orig	2011-02-04 05:35:01.984362779 -0200
+++ gcc/ipa-inline.c	2011-02-04 05:35:07.164952668 -0200
@@ -275,11 +275,14 @@ cgraph_clone_inlined_nodes (struct cgrap
     e->callee->global.inlined_to = e->caller->global.inlined_to;
   else
     e->callee->global.inlined_to = e->caller;
+  gcc_checking_assert (inline_summary (e->caller)->estimated_self_stack_size != -1);
   e->callee->global.stack_frame_offset
     = e->caller->global.stack_frame_offset
       + inline_summary (e->caller)->estimated_self_stack_size;
+  gcc_checking_assert (inline_summary (e->callee)->estimated_self_stack_size != -1)
   peak = e->callee->global.stack_frame_offset
       + inline_summary (e->callee)->estimated_self_stack_size;
+  gcc_checking_assert (e->callee->global.inlined_to->global.estimated_stack_size != -1);
   if (e->callee->global.inlined_to->global.estimated_stack_size < peak)
     e->callee->global.inlined_to->global.estimated_stack_size = peak;
   cgraph_propagate_frequency (e->callee);
@@ -419,9 +422,11 @@ cgraph_check_inline_limits (struct cgrap
     }
 
   stack_size_limit = inline_summary (to)->estimated_self_stack_size;
+  gcc_checking_assert (stack_size_limit != -1);
 
   stack_size_limit += stack_size_limit * PARAM_VALUE (PARAM_STACK_FRAME_GROWTH) / 100;
 
+  gcc_checking_assert (what->global.estimated_stack_size != -1);
   inlined_stack = (to->global.stack_frame_offset
 		   + inline_summary (to)->estimated_self_stack_size
 		   + what->global.estimated_stack_size);

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #3: ipa-inline-stack-growth-referenced-vars-pr47106-compare.patch --]
[-- Type: text/x-diff, Size: 1766 bytes --]

Index: gcc/cfgexpand.c
===================================================================
--- gcc/cfgexpand.c.orig	2011-02-04 05:26:40.137008650 -0200
+++ gcc/cfgexpand.c	2011-02-04 05:27:40.951213575 -0200
@@ -1311,6 +1311,29 @@ create_stack_guard (void)
   crtl->stack_protect_guard = guard;
 }
 
+/* A subroutine of expand_used_vars.  Walk down through the BLOCK tree
+   expanding variables.  Those variables that can be put into registers
+   are allocated pseudos; those that can't are put on the stack.
+
+   TOPLEVEL is true if this is the outermost BLOCK.  */
+
+static HOST_WIDE_INT
+account_used_vars_for_block (tree block, bool toplevel)
+{
+  tree t;
+  HOST_WIDE_INT size = 0;
+
+  /* Expand all variables at this level.  */
+  for (t = BLOCK_VARS (block); t ; t = DECL_CHAIN (t))
+    size += expand_one_var (t, toplevel, false);
+
+  /* Expand all variables at containing levels.  */
+  for (t = BLOCK_SUBBLOCKS (block); t ; t = BLOCK_CHAIN (t))
+    size += account_used_vars_for_block (t, false);
+
+  return size;
+}
+
 /* Prepare for expanding variables.  */
 static void
 init_vars_expansion (void)
@@ -1354,7 +1377,10 @@ HOST_WIDE_INT
 estimated_stack_frame_size (tree decl)
 {
   HOST_WIDE_INT size = 0;
+  HOST_WIDE_INT osize = 0;
+  unsigned ix;
   size_t i;
+  tree outer_block = DECL_INITIAL (current_function_decl);
   tree var;
   tree old_cur_fun_decl = current_function_decl;
   referenced_var_iterator rvi;
@@ -1378,6 +1404,11 @@ estimated_stack_frame_size (tree decl)
       size += account_stack_vars ();
       fini_vars_expansion ();
     }
+
+  if (size != osize)
+    error ("est stack size changed from %li to %li",
+	   (long)osize, (long)size);
+
   pop_cfun ();
   current_function_decl = old_cur_fun_decl;
   return size;

[-- Attachment #4: Type: text/plain, Size: 258 bytes --]



-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist      Red Hat Brazil Compiler Engineer

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-02-04  7:49                         ` Alexandre Oliva
@ 2011-02-07 12:32                           ` Alexandre Oliva
  2011-02-07 13:39                             ` Richard Guenther
                                               ` (2 more replies)
  0 siblings, 3 replies; 46+ messages in thread
From: Alexandre Oliva @ 2011-02-07 12:32 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc-patches, Jan Hubicka

[-- Attachment #1: Type: text/plain, Size: 816 bytes --]

On Feb  4, 2011, Alexandre Oliva <aoliva@redhat.com> wrote:

> That isn't enough.  We often attempt to estimate the stack size of a
> function before we as much as put it in SSA form (let alone compute
> referenced_vras), while processing other functions.

> if we switch to referenced_vars only, we have to figure out how to
> rearrange the passes so that we make reasonable estimates.

Like this.  This has completed bootstrap and is almost done building
target libs on amd64-linux-gnu.  Ok to install if it passes regression
testing?

I've used a patch to report differences in the computation of stack
size, and in nearly all cases the size goes down, but there are
exceptions in which it goes up by 4 or 8 bytes.  I'll run a full
bootstrap with that patch once I'm done with testing and summarize the
results.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: ipa-inline-stack-growth-referenced-vars-pr47106.patch --]
[-- Type: text/x-diff, Size: 18785 bytes --]

for  gcc/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	PR debug/47106
	PR debug/47402
	* tree-flow.h (referenced_var_lookup): Add fn parameter.
	Adjust all callers.
	* tree-dfa.c (referenced_var_lookup): Use fn instead of cfun.
	* tree-flow-inline.h: Adjust.
	* gimple-pretty-print.c: Adjust.
	* tree-into-ssa.c: Adjust.
	* tree-ssa.c: Adjust.
	* cfgexpand.c: Adjust.
	(account_used_vars_for_block): Remove.
	(estimated_stack_frame_size): Use referenced vars.
	* tree-inline.c (remap_decl): Only mark VAR_DECLs as referenced
	that were referenced in the original function.
	(copy_decl_for_dup_finish): Likewise.
	* tree-optimize.c (pass_early_lowering_passes): New.
	(pass_early_lowered_passes): New.
	* tree-pass.h (pass_early_lowering_passes): Declare.
	(pass_early_lowered_passes): Declare.
	* passes.c (init_optimization_passes): Introduce
	pass_early_lowering_passes and pass_early_lowered_passes.
	Move first pass_inline_parameters into the latter.
	(execute_ipa_pass_list): Adjust so PLUGIN_EARLY_GIMPLE_PASSES get
	called at the same spots.
	* ipa-inline.c (cgraph_early_inlining): Fixup cfg.
	(pass_inline_parameters): Require PROP_referenced_vars.
	* cgraphunit.c (cgraph_process_new_functions): Don't run
	compute_inline_parameters explicitly.
	* predict.c (gate_estimate_probability): Skip for already-guessed
	versioned functions.

for  gcc/testsuite/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	PR debug/47106
	PR debug/47402
	* g++.dg/debug/pr47106.C: New.

Index: gcc/tree-flow.h
===================================================================
--- gcc/tree-flow.h.orig	2011-02-07 09:23:32.698049422 -0200
+++ gcc/tree-flow.h	2011-02-07 09:25:20.342781531 -0200
@@ -319,7 +319,7 @@ typedef struct
        !end_referenced_vars_p (&(ITER)); \
        (VAR) = next_referenced_var (&(ITER)))
 
-extern tree referenced_var_lookup (unsigned int);
+extern tree referenced_var_lookup (struct function *, unsigned int);
 extern bool referenced_var_check_and_insert (tree);
 #define num_referenced_vars htab_elements (gimple_referenced_vars (cfun))
 
Index: gcc/tree-dfa.c
===================================================================
--- gcc/tree-dfa.c.orig	2011-02-07 09:23:32.112067191 -0200
+++ gcc/tree-dfa.c	2011-02-07 09:25:20.391780043 -0200
@@ -488,12 +488,12 @@ find_referenced_vars_in (gimple stmt)
    variable.  */
 
 tree
-referenced_var_lookup (unsigned int uid)
+referenced_var_lookup (struct function *fn, unsigned int uid)
 {
   tree h;
   struct tree_decl_minimal in;
   in.uid = uid;
-  h = (tree) htab_find_with_hash (gimple_referenced_vars (cfun), &in, uid);
+  h = (tree) htab_find_with_hash (gimple_referenced_vars (fn), &in, uid);
   return h;
 }
 
Index: gcc/tree-flow-inline.h
===================================================================
--- gcc/tree-flow-inline.h.orig	2011-02-07 09:23:31.268092772 -0200
+++ gcc/tree-flow-inline.h	2011-02-07 09:25:20.476777464 -0200
@@ -103,7 +103,7 @@ next_htab_element (htab_iterator *hti)
 static inline tree
 referenced_var (unsigned int uid)
 {
-  tree var = referenced_var_lookup (uid);
+  tree var = referenced_var_lookup (cfun, uid);
   gcc_assert (var || uid == 0);
   return var;
 }
Index: gcc/gimple-pretty-print.c
===================================================================
--- gcc/gimple-pretty-print.c.orig	2011-02-07 09:23:30.988101348 -0200
+++ gcc/gimple-pretty-print.c	2011-02-07 09:25:20.438778618 -0200
@@ -542,7 +542,7 @@ pp_points_to_solution (pretty_printer *b
       pp_string (buffer, "{ ");
       EXECUTE_IF_SET_IN_BITMAP (pt->vars, 0, i, bi)
 	{
-	  tree var = referenced_var_lookup (i);
+	  tree var = referenced_var_lookup (cfun, i);
 	  if (var)
 	    {
 	      dump_generic_node (buffer, var, 0, dump_flags, false);
Index: gcc/tree-into-ssa.c
===================================================================
--- gcc/tree-into-ssa.c.orig	2011-02-07 09:23:31.697080038 -0200
+++ gcc/tree-into-ssa.c	2011-02-07 09:25:20.500776736 -0200
@@ -1469,7 +1469,7 @@ dump_decl_set (FILE *file, bitmap set)
 
       EXECUTE_IF_SET_IN_BITMAP (set, 0, i, bi)
 	{
-	  tree var = referenced_var_lookup (i);
+	  tree var = referenced_var_lookup (cfun, i);
 	  if (var)
 	    print_generic_expr (file, var, 0);
 	  else
Index: gcc/tree-ssa.c
===================================================================
--- gcc/tree-ssa.c.orig	2011-02-07 09:23:32.553053907 -0200
+++ gcc/tree-ssa.c	2011-02-07 09:25:20.526775945 -0200
@@ -1902,7 +1902,7 @@ maybe_optimize_var (tree var, bitmap add
 
   /* If the variable is not in the list of referenced vars then we
      do not need to touch it nor can we rename it.  */
-  if (!referenced_var_lookup (DECL_UID (var)))
+  if (!referenced_var_lookup (cfun, DECL_UID (var)))
     return false;
 
   if (TREE_ADDRESSABLE (var)
Index: gcc/cfgexpand.c
===================================================================
--- gcc/cfgexpand.c.orig	2011-02-07 09:23:30.841105782 -0200
+++ gcc/cfgexpand.c	2011-02-07 09:25:20.326782016 -0200
@@ -520,7 +520,7 @@ update_alias_info_with_stack_vars (void)
 	     for -O0 where we are preserving even unreferenced variables.  */
 	  gcc_assert (DECL_P (decl)
 		      && (!optimize
-			  || referenced_var_lookup (DECL_UID (decl))));
+			  || referenced_var_lookup (cfun, DECL_UID (decl))));
 	  bitmap_set_bit (part, uid);
 	  *((bitmap *) pointer_map_insert (decls_to_partitions,
 					   (void *)(size_t) uid)) = part;
@@ -1311,30 +1311,6 @@ create_stack_guard (void)
   crtl->stack_protect_guard = guard;
 }
 
-/* A subroutine of expand_used_vars.  Walk down through the BLOCK tree
-   expanding variables.  Those variables that can be put into registers
-   are allocated pseudos; those that can't are put on the stack.
-
-   TOPLEVEL is true if this is the outermost BLOCK.  */
-
-static HOST_WIDE_INT
-account_used_vars_for_block (tree block, bool toplevel)
-{
-  tree t;
-  HOST_WIDE_INT size = 0;
-
-  /* Expand all variables at this level.  */
-  for (t = BLOCK_VARS (block); t ; t = DECL_CHAIN (t))
-    if (var_ann (t) && is_used_p (t))
-      size += expand_one_var (t, toplevel, false);
-
-  /* Expand all variables at containing levels.  */
-  for (t = BLOCK_SUBBLOCKS (block); t ; t = BLOCK_CHAIN (t))
-    size += account_used_vars_for_block (t, false);
-
-  return size;
-}
-
 /* Prepare for expanding variables.  */
 static void
 init_vars_expansion (void)
@@ -1379,22 +1355,18 @@ estimated_stack_frame_size (tree decl)
 {
   HOST_WIDE_INT size = 0;
   size_t i;
-  tree var, outer_block = DECL_INITIAL (current_function_decl);
-  unsigned ix;
+  tree var;
   tree old_cur_fun_decl = current_function_decl;
+  referenced_var_iterator rvi;
+
+  gcc_checking_assert (gimple_referenced_vars (cfun));
+
   current_function_decl = decl;
   push_cfun (DECL_STRUCT_FUNCTION (decl));
 
-  init_vars_expansion ();
-
-  FOR_EACH_LOCAL_DECL (cfun, ix, var)
-    {
-      /* TREE_USED marks local variables that do not appear in lexical
-	 blocks.  We don't want to expand those that do twice.  */
-      if (TREE_USED (var))
-        size += expand_one_var (var, true, false);
-    }
-  size += account_used_vars_for_block (outer_block, true);
+  gcc_checking_assert (gimple_referenced_vars (cfun));
+  FOR_EACH_REFERENCED_VAR (var, rvi)
+    size += expand_one_var (var, true, false);
 
   if (stack_vars_num > 0)
     {
Index: gcc/tree-inline.c
===================================================================
--- gcc/tree-inline.c.orig	2011-02-07 09:23:32.407058212 -0200
+++ gcc/tree-inline.c	2011-02-07 09:25:20.376780499 -0200
@@ -317,7 +317,11 @@ remap_decl (tree decl, copy_body_data *i
 	      || TREE_CODE (t) == RESULT_DECL || TREE_CODE (t) == PARM_DECL))
 	{
 	  get_var_ann (t);
-	  add_referenced_var (t);
+	  if (TREE_CODE (decl) != VAR_DECL
+	      || !gimple_referenced_vars (DECL_STRUCT_FUNCTION (id->src_fn))
+	      || referenced_var_lookup (DECL_STRUCT_FUNCTION (id->src_fn),
+					DECL_UID (decl)))
+	    add_referenced_var (t);
 	}
       return t;
     }
@@ -4753,6 +4757,13 @@ copy_decl_for_dup_finish (copy_body_data
        new function.  */
     DECL_CONTEXT (copy) = id->dst_fn;
 
+  if (TREE_CODE (decl) == VAR_DECL
+      && gimple_referenced_vars (cfun)
+      && gimple_referenced_vars (DECL_STRUCT_FUNCTION (id->src_fn))
+      && referenced_var_lookup (DECL_STRUCT_FUNCTION (id->src_fn),
+				DECL_UID (decl)))
+    add_referenced_var (copy);
+
   return copy;
 }
 
Index: gcc/tree-optimize.c
===================================================================
--- gcc/tree-optimize.c.orig	2011-02-07 09:23:33.152035632 -0200
+++ gcc/tree-optimize.c	2011-02-07 09:25:20.538775583 -0200
@@ -101,6 +101,44 @@ execute_all_early_local_passes (void)
   return 0;
 }
 
+struct simple_ipa_opt_pass pass_early_lowering_passes =
+{
+ {
+  SIMPLE_IPA_PASS,
+  "*early_lowering",			/* name */
+  gate_all_early_local_passes,		/* gate */
+  NULL,					/* execute */
+  NULL,					/* sub */
+  NULL,					/* next */
+  0,					/* static_pass_number */
+  TV_EARLY_LOCAL,			/* tv_id */
+  0,					/* properties_required */
+  PROP_referenced_vars,			/* properties_provided */
+  0,					/* properties_destroyed */
+  0,					/* todo_flags_start */
+  0	 				/* todo_flags_finish */
+ }
+};
+
+struct simple_ipa_opt_pass pass_early_lowered_passes =
+{
+ {
+  SIMPLE_IPA_PASS,
+  "*early_lowered",			/* name */
+  gate_all_early_local_passes,		/* gate */
+  NULL,					/* execute */
+  NULL,					/* sub */
+  NULL,					/* next */
+  0,					/* static_pass_number */
+  TV_EARLY_LOCAL,			/* tv_id */
+  PROP_referenced_vars,			/* properties_required */
+  0,					/* properties_provided */
+  0,					/* properties_destroyed */
+  0,					/* todo_flags_start */
+  0	 				/* todo_flags_finish */
+ }
+};
+
 struct simple_ipa_opt_pass pass_early_local_passes =
 {
  {
Index: gcc/tree-pass.h
===================================================================
--- gcc/tree-pass.h.orig	2011-02-07 09:23:31.407088633 -0200
+++ gcc/tree-pass.h	2011-02-07 09:25:20.550775218 -0200
@@ -452,6 +452,8 @@ extern struct simple_ipa_opt_pass pass_i
 extern struct simple_ipa_opt_pass pass_ipa_function_and_variable_visibility;
 extern struct simple_ipa_opt_pass pass_ipa_tree_profile;
 
+extern struct simple_ipa_opt_pass pass_early_lowering_passes;
+extern struct simple_ipa_opt_pass pass_early_lowered_passes;
 extern struct simple_ipa_opt_pass pass_early_local_passes;
 
 extern struct ipa_opt_pass_d pass_ipa_whole_program_visibility;
Index: gcc/passes.c
===================================================================
--- gcc/passes.c.orig	2011-02-07 09:23:32.266062537 -0200
+++ gcc/passes.c	2011-02-07 09:25:20.462777888 -0200
@@ -729,7 +729,6 @@ init_optimization_passes (void)
   NEXT_PASS (pass_build_cfg);
   NEXT_PASS (pass_warn_function_return);
   NEXT_PASS (pass_build_cgraph_edges);
-  NEXT_PASS (pass_inline_parameters);
   *p = NULL;
 
   /* Interprocedural optimization passes.  */
@@ -739,51 +738,59 @@ init_optimization_passes (void)
   NEXT_PASS (pass_early_local_passes);
     {
       struct opt_pass **p = &pass_early_local_passes.pass.sub;
-      NEXT_PASS (pass_fixup_cfg);
-      NEXT_PASS (pass_init_datastructures);
-      NEXT_PASS (pass_expand_omp);
-
-      NEXT_PASS (pass_referenced_vars);
-      NEXT_PASS (pass_build_ssa);
-      NEXT_PASS (pass_lower_vector);
-      NEXT_PASS (pass_early_warn_uninitialized);
-      /* Note that it is not strictly necessary to schedule an early
-	 inline pass here.  However, some test cases (e.g.,
-	 g++.dg/other/p334435.C g++.dg/other/i386-1.C) expect extern
-	 inline functions to be inlined even at -O0.  This does not
-	 happen during the first early inline pass.  */
-      NEXT_PASS (pass_rebuild_cgraph_edges);
-      NEXT_PASS (pass_early_inline);
-      NEXT_PASS (pass_all_early_optimizations);
-	{
-	  struct opt_pass **p = &pass_all_early_optimizations.pass.sub;
-	  NEXT_PASS (pass_remove_cgraph_callee_edges);
-	  NEXT_PASS (pass_rename_ssa_copies);
-	  NEXT_PASS (pass_ccp);
-	  NEXT_PASS (pass_forwprop);
-	  /* pass_build_ealias is a dummy pass that ensures that we
-	     execute TODO_rebuild_alias at this point.  Re-building
-	     alias information also rewrites no longer addressed
-	     locals into SSA form if possible.  */
-	  NEXT_PASS (pass_build_ealias);
-	  NEXT_PASS (pass_sra_early);
-	  NEXT_PASS (pass_copy_prop);
-	  NEXT_PASS (pass_merge_phi);
-	  NEXT_PASS (pass_cd_dce);
-	  NEXT_PASS (pass_early_ipa_sra);
-	  NEXT_PASS (pass_tail_recursion);
-	  NEXT_PASS (pass_convert_switch);
-          NEXT_PASS (pass_cleanup_eh);
-          NEXT_PASS (pass_profile);
-          NEXT_PASS (pass_local_pure_const);
-	  /* Split functions creates parts that are not run through
-	     early optimizations again.  It is thus good idea to do this
-	     late.  */
-          NEXT_PASS (pass_split_functions);
+      NEXT_PASS (pass_early_lowering_passes);
+        {
+	  struct opt_pass **p = &pass_early_lowering_passes.pass.sub;
+	  NEXT_PASS (pass_fixup_cfg);
+	  NEXT_PASS (pass_init_datastructures);
+	  NEXT_PASS (pass_expand_omp);
+
+	  /* We have to run this pass for all functions before we go into
+	     early local passes, particularly pass_inline_parameters, that
+	     relies on referenced_vars being already computed.  */
+	  NEXT_PASS (pass_referenced_vars);
+
+	  NEXT_PASS (pass_build_ssa);
+	  NEXT_PASS (pass_lower_vector);
+	  NEXT_PASS (pass_early_warn_uninitialized);
+	  NEXT_PASS (pass_rebuild_cgraph_edges);
+	}
+      NEXT_PASS (pass_early_lowered_passes);
+        {
+	  struct opt_pass **p = &pass_early_lowered_passes.pass.sub;
+	  NEXT_PASS (pass_inline_parameters);
+	  NEXT_PASS (pass_early_inline);
+	  NEXT_PASS (pass_all_early_optimizations);
+	  {
+	    struct opt_pass **p = &pass_all_early_optimizations.pass.sub;
+	    NEXT_PASS (pass_remove_cgraph_callee_edges);
+	    NEXT_PASS (pass_rename_ssa_copies);
+	    NEXT_PASS (pass_ccp);
+	    NEXT_PASS (pass_forwprop);
+	    /* pass_build_ealias is a dummy pass that ensures that we
+	       execute TODO_rebuild_alias at this point.  Re-building
+	       alias information also rewrites no longer addressed
+	       locals into SSA form if possible.  */
+	    NEXT_PASS (pass_build_ealias);
+	    NEXT_PASS (pass_sra_early);
+	    NEXT_PASS (pass_copy_prop);
+	    NEXT_PASS (pass_merge_phi);
+	    NEXT_PASS (pass_cd_dce);
+	    NEXT_PASS (pass_early_ipa_sra);
+	    NEXT_PASS (pass_tail_recursion);
+	    NEXT_PASS (pass_convert_switch);
+	    NEXT_PASS (pass_cleanup_eh);
+	    NEXT_PASS (pass_profile);
+	    NEXT_PASS (pass_local_pure_const);
+	    /* Split functions creates parts that are not run through
+	       early optimizations again.  It is thus good idea to do this
+	       late.  */
+	    NEXT_PASS (pass_split_functions);
+	  }
+	  NEXT_PASS (pass_release_ssa_names);
+	  NEXT_PASS (pass_rebuild_cgraph_edges);
+	  NEXT_PASS (pass_inline_parameters);
 	}
-      NEXT_PASS (pass_release_ssa_names);
-      NEXT_PASS (pass_rebuild_cgraph_edges);
-      NEXT_PASS (pass_inline_parameters);
     }
   NEXT_PASS (pass_ipa_tree_profile);
     {
@@ -1927,18 +1934,20 @@ execute_ipa_pass_list (struct opt_pass *
       gcc_assert (pass->type == SIMPLE_IPA_PASS || pass->type == IPA_PASS);
       if (execute_one_pass (pass) && pass->sub)
 	{
+	  if (pass == &pass_early_local_passes.pass)
+	    invoke_plugin_callbacks (PLUGIN_EARLY_GIMPLE_PASSES_START, NULL);
+
 	  if (pass->sub->type == GIMPLE_PASS)
-	    {
-	      invoke_plugin_callbacks (PLUGIN_EARLY_GIMPLE_PASSES_START, NULL);
-	      do_per_function_toporder ((void (*)(void *))execute_pass_list,
-					pass->sub);
-	      invoke_plugin_callbacks (PLUGIN_EARLY_GIMPLE_PASSES_END, NULL);
-	    }
+	    do_per_function_toporder ((void (*)(void *))execute_pass_list,
+				      pass->sub);
 	  else if (pass->sub->type == SIMPLE_IPA_PASS
 		   || pass->sub->type == IPA_PASS)
 	    execute_ipa_pass_list (pass->sub);
 	  else
 	    gcc_unreachable ();
+
+	  if (pass == &pass_early_local_passes.pass)
+	    invoke_plugin_callbacks (PLUGIN_EARLY_GIMPLE_PASSES_END, NULL);
 	}
       gcc_assert (!current_function_decl);
       cgraph_process_new_functions ();
Index: gcc/ipa-inline.c
===================================================================
--- gcc/ipa-inline.c.orig	2011-02-07 09:23:31.554084216 -0200
+++ gcc/ipa-inline.c	2011-02-07 09:25:20.570774611 -0200
@@ -1794,7 +1794,7 @@ cgraph_early_inlining (void)
 
   cfun->always_inline_functions_inlined = true;
 
-  return todo;
+  return todo | execute_fixup_cfg ();
 }
 
 struct gimple_opt_pass pass_early_inline =
@@ -2044,7 +2044,7 @@ struct gimple_opt_pass pass_inline_param
   NULL,					/* next */
   0,					/* static_pass_number */
   TV_INLINE_HEURISTICS,			/* tv_id */
-  0,	                                /* properties_required */
+  PROP_referenced_vars,	                /* properties_required */
   0,					/* properties_provided */
   0,					/* properties_destroyed */
   0,					/* todo_flags_start */
Index: gcc/cgraphunit.c
===================================================================
--- gcc/cgraphunit.c.orig	2011-02-07 09:23:31.962071817 -0200
+++ gcc/cgraphunit.c	2011-02-07 09:25:20.590774004 -0200
@@ -246,7 +246,6 @@ cgraph_process_new_functions (void)
 	    cgraph_analyze_function (node);
 	  push_cfun (DECL_STRUCT_FUNCTION (fndecl));
 	  current_function_decl = fndecl;
-	  compute_inline_parameters (node);
 	  if ((cgraph_state == CGRAPH_STATE_IPA_SSA
 	      && !gimple_in_ssa_p (DECL_STRUCT_FUNCTION (fndecl)))
 	      /* When not optimizing, be sure we run early local passes anyway
Index: gcc/predict.c
===================================================================
--- gcc/predict.c.orig	2011-02-07 09:23:31.828075834 -0200
+++ gcc/predict.c	2011-02-07 09:53:09.096282921 -0200
@@ -2266,6 +2266,11 @@ compute_function_frequency (void)
 static bool
 gate_estimate_probability (void)
 {
+  /* Functions cloned for versioning get profile information from
+     the original functions.  */
+  if (profile_status != PROFILE_ABSENT)
+    return false;
+
   return flag_guess_branch_prob;
 }
 
Index: gcc/testsuite/g++.dg/debug/pr47106.C
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ gcc/testsuite/g++.dg/debug/pr47106.C	2011-02-07 09:25:20.420779164 -0200
@@ -0,0 +1,37 @@
+// { dg-do compile }
+// { dg-options "-O -fpartial-inlining -flto -fconserve-stack -fcompare-debug" }
+
+void end (int, int) __attribute__ ((__noreturn__));
+
+struct S
+{
+  int i;
+  S *s;
+};
+
+inline bool f (S *s)
+{
+  if (!s->s)
+    end (0, 0);
+  return s->s == s;
+}
+
+inline bool
+baz (S s1, S)
+{
+  while (f (&s1));
+}
+
+inline bool
+bar (S s1, S s2, S)
+{
+  baz (s1, s2);
+}
+
+S getS ();
+
+bool
+foo ()
+{
+  bar (getS (), getS (), getS ());
+}

[-- Attachment #3: Type: text/plain, Size: 257 bytes --]


-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist      Red Hat Brazil Compiler Engineer

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-02-07 12:32                           ` Alexandre Oliva
@ 2011-02-07 13:39                             ` Richard Guenther
  2011-02-07 16:01                               ` Jan Hubicka
                                                 ` (2 more replies)
  2011-02-07 16:03                             ` Jan Hubicka
  2011-02-07 16:09                             ` Jan Hubicka
  2 siblings, 3 replies; 46+ messages in thread
From: Richard Guenther @ 2011-02-07 13:39 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: gcc-patches, Jan Hubicka

On Mon, Feb 7, 2011 at 1:31 PM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Feb  4, 2011, Alexandre Oliva <aoliva@redhat.com> wrote:
>
>> That isn't enough.  We often attempt to estimate the stack size of a
>> function before we as much as put it in SSA form (let alone compute
>> referenced_vras), while processing other functions.
>
>> if we switch to referenced_vars only, we have to figure out how to
>> rearrange the passes so that we make reasonable estimates.
>
> Like this.  This has completed bootstrap and is almost done building
> target libs on amd64-linux-gnu.  Ok to install if it passes regression
> testing?
>
> I've used a patch to report differences in the computation of stack
> size, and in nearly all cases the size goes down, but there are
> exceptions in which it goes up by 4 or 8 bytes.  I'll run a full
> bootstrap with that patch once I'm done with testing and summarize the
> results.

The referenced_var_lookup interface change is ok.

   tree old_cur_fun_decl = current_function_decl;
+  referenced_var_iterator rvi;
+
+  gcc_checking_assert (gimple_referenced_vars (cfun));
+
   current_function_decl = decl;
   push_cfun (DECL_STRUCT_FUNCTION (decl));

-  init_vars_expansion ();
-
-  FOR_EACH_LOCAL_DECL (cfun, ix, var)
-    {
-      /* TREE_USED marks local variables that do not appear in lexical
-        blocks.  We don't want to expand those that do twice.  */
-      if (TREE_USED (var))
-        size += expand_one_var (var, true, false);
-    }
-  size += account_used_vars_for_block (outer_block, true);
+  gcc_checking_assert (gimple_referenced_vars (cfun));
+  FOR_EACH_REFERENCED_VAR (var, rvi)
+    size += expand_one_var (var, true, false);

looks like only the assert after pushing cfun is required?  Can we get
away w/o cfun pushing here now?


Index: gcc/tree-inline.c
===================================================================
--- gcc/tree-inline.c.orig      2011-02-07 09:23:32.407058212 -0200
+++ gcc/tree-inline.c   2011-02-07 09:25:20.376780499 -0200
@@ -317,7 +317,11 @@ remap_decl (tree decl, copy_body_data *i
              || TREE_CODE (t) == RESULT_DECL || TREE_CODE (t) == PARM_DECL))
        {
          get_var_ann (t);
-         add_referenced_var (t);
+         if (TREE_CODE (decl) != VAR_DECL
+             || !gimple_referenced_vars (DECL_STRUCT_FUNCTION (id->src_fn))
+             || referenced_var_lookup (DECL_STRUCT_FUNCTION (id->src_fn),
+                                       DECL_UID (decl)))
+           add_referenced_var (t);

get_var_ann is redundant with add_referenced_var.  It probably isn't
needed for debug vars, is it?  id-> contains struct function pointers,
use them, not
DECL_STRUCT_FUNCTION.  Do we really run into the
!gimple_referenced_vars case here?  I think we can't (cfun is in SSA form
and we do not inline non-SSA fns into SSA fns).

@@ -4753,6 +4757,13 @@ copy_decl_for_dup_finish (copy_body_data
        new function.  */
     DECL_CONTEXT (copy) = id->dst_fn;

+  if (TREE_CODE (decl) == VAR_DECL
+      && gimple_referenced_vars (cfun)
+      && gimple_referenced_vars (DECL_STRUCT_FUNCTION (id->src_fn))
+      && referenced_var_lookup (DECL_STRUCT_FUNCTION (id->src_fn),
+                               DECL_UID (decl)))
+    add_referenced_var (copy);

Why only for VAR_DECL decls and not to-var-decl translated PARM_DECLs?
Please use the same (single) check as the other piece, gimple_in_ssa_p (cfun).

I wanted to ask you about

> That isn't enough.  We often attempt to estimate the stack size of a
> function before we as much as put it in SSA form (let alone compute
> referenced_vras), while processing other functions.

anyway, as I don't see why we do that (nor where).  And I'd rather
fix that than doing the pass re-org you did (which are not optimal
for cache locality reasons).  Honza correctly noted that we never
inline non-SSA fns into SSA fns, but maybe we'd query estimates
in wrong order (and we should fix that, instead of your reorg I think).

Thus,

Index: gcc/passes.c
===================================================================
--- gcc/passes.c        (revision 169878)
+++ gcc/passes.c        (working copy)
@@ -729,7 +729,6 @@ init_optimization_passes (void)
   NEXT_PASS (pass_build_cfg);
   NEXT_PASS (pass_warn_function_return);
   NEXT_PASS (pass_build_cgraph_edges);
-  NEXT_PASS (pass_inline_parameters);
   *p = NULL;

   /* Interprocedural optimization passes.  */
@@ -753,6 +752,7 @@ init_optimization_passes (void)
         inline functions to be inlined even at -O0.  This does not
         happen during the first early inline pass.  */
       NEXT_PASS (pass_rebuild_cgraph_edges);
+      NEXT_PASS (pass_inline_parameters);
       NEXT_PASS (pass_early_inline);
       NEXT_PASS (pass_all_early_optimizations);
        {

should really just work.  If it doesn't then there is a bug (somewhere).

The cgraphunit.c change looks ok in this regard, can you explain
the predict.c change?  They do look independent, splitting up
commits at this stage would be nice.

Thanks,
Richard.


>
>
> --
> Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/   FSF Latin America board member
> Free Software Evangelist      Red Hat Brazil Compiler Engineer
>
>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-02-07 13:39                             ` Richard Guenther
@ 2011-02-07 16:01                               ` Jan Hubicka
  2011-02-07 22:38                                 ` Alexandre Oliva
  2011-02-07 22:31                               ` Alexandre Oliva
  2011-02-12 13:45                               ` Alexandre Oliva
  2 siblings, 1 reply; 46+ messages in thread
From: Jan Hubicka @ 2011-02-07 16:01 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Alexandre Oliva, gcc-patches, Jan Hubicka

> The cgraphunit.c change looks ok in this regard, can you explain
> the predict.c change?  They do look independent, splitting up

predict.c change looks wrong to me.  pass_profile is done as part of early
passes.  Only case it can be run on clone of already processed function is
the case where early optimizations would produce an clone and that clone
would be early optimized again.

We clone as part of ipa-sra and partial inlining, but that one expects clone to
not be re-processed and do thinks like calling compute_inline_parameters.

We can re-process them if the result seems benefical, but I can not think
of cases where this maes big difference (changes of partial inlining are
rather simple, with ipa-sra there is some chance that the new function body
would optimize better, but not too grand either)

I will need to go through the thread.  What was reason for splitting lowering
passes to two?  This way we will process the whole program twice that is bit
slower since it leads to poorer locality.

Honza

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-02-07 12:32                           ` Alexandre Oliva
  2011-02-07 13:39                             ` Richard Guenther
@ 2011-02-07 16:03                             ` Jan Hubicka
  2011-02-07 22:35                               ` Alexandre Oliva
  2011-02-07 16:09                             ` Jan Hubicka
  2 siblings, 1 reply; 46+ messages in thread
From: Jan Hubicka @ 2011-02-07 16:03 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: Richard Guenther, gcc-patches, Jan Hubicka

> On Feb  4, 2011, Alexandre Oliva <aoliva@redhat.com> wrote:
> 
> > That isn't enough.  We often attempt to estimate the stack size of a
> > function before we as much as put it in SSA form (let alone compute
> > referenced_vras), while processing other functions.

This should happen only because compute_inline_parameters is scheduled too early.
Moving it just before early_inline pass should do the trick.

Honza

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-02-07 12:32                           ` Alexandre Oliva
  2011-02-07 13:39                             ` Richard Guenther
  2011-02-07 16:03                             ` Jan Hubicka
@ 2011-02-07 16:09                             ` Jan Hubicka
  2011-02-07 22:34                               ` Alexandre Oliva
  2 siblings, 1 reply; 46+ messages in thread
From: Jan Hubicka @ 2011-02-07 16:09 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: Richard Guenther, gcc-patches, Jan Hubicka

> Index: gcc/ipa-inline.c
> ===================================================================
> --- gcc/ipa-inline.c.orig	2011-02-07 09:23:31.554084216 -0200
> +++ gcc/ipa-inline.c	2011-02-07 09:25:20.570774611 -0200
> @@ -1794,7 +1794,7 @@ cgraph_early_inlining (void)
>  
>    cfun->always_inline_functions_inlined = true;
>  
> -  return todo;
> +  return todo | execute_fixup_cfg ();

Fixup_cfg exists to update function body after properties of other functions
it calls was changed (and originally also some EH edges was removed after
inlining, but this was fixed some time ago).
If you go for splitting early otimizatoin queue, you need to schedule fixup_cfg
as first of early lowered passes, or any invocation of verify_cfg would fail
until early inliner is run.
> Index: gcc/cgraphunit.c
> ===================================================================
> --- gcc/cgraphunit.c.orig	2011-02-07 09:23:31.962071817 -0200
> +++ gcc/cgraphunit.c	2011-02-07 09:25:20.590774004 -0200
> @@ -246,7 +246,6 @@ cgraph_process_new_functions (void)
>  	    cgraph_analyze_function (node);
>  	  push_cfun (DECL_STRUCT_FUNCTION (fndecl));
>  	  current_function_decl = fndecl;
> -	  compute_inline_parameters (node);

Hmm, this is because both ipa-sra and ipa-split take care to compute inliner parameters
themselves?
If you go for this, you probably should arrange also new function inserted by -fprofile-generate
to be processed by compute_inline_parameters.  We never inline those, but we should not get
uninitialized fields there since they are used for cost metrics of ipa-cp, too.

Honza

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-02-07 13:39                             ` Richard Guenther
  2011-02-07 16:01                               ` Jan Hubicka
@ 2011-02-07 22:31                               ` Alexandre Oliva
  2011-02-08 11:10                                 ` Richard Guenther
  2011-02-12 13:45                               ` Alexandre Oliva
  2 siblings, 1 reply; 46+ messages in thread
From: Alexandre Oliva @ 2011-02-07 22:31 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc-patches, Jan Hubicka

On Feb  7, 2011, Richard Guenther <richard.guenther@gmail.com> wrote:

> The referenced_var_lookup interface change is ok.

Ok, I'll split it out and check it in.

>    tree old_cur_fun_decl = current_function_decl;
> +  referenced_var_iterator rvi;
> +
> +  gcc_checking_assert (gimple_referenced_vars (cfun));
> +
>    current_function_decl = decl;
>    push_cfun (DECL_STRUCT_FUNCTION (decl));

> looks like only the assert after pushing cfun is required?

Doh!  Thanks.

> Can we get away w/o cfun pushing here now?

Maybe.  I'll have a look.

> Index: gcc/tree-inline.c
> ===================================================================
> --- gcc/tree-inline.c.orig      2011-02-07 09:23:32.407058212 -0200
> +++ gcc/tree-inline.c   2011-02-07 09:25:20.376780499 -0200
> @@ -317,7 +317,11 @@ remap_decl (tree decl, copy_body_data *i
>               || TREE_CODE (t) == RESULT_DECL || TREE_CODE (t) == PARM_DECL))
>         {
>           get_var_ann (t);
> -         add_referenced_var (t);
> +         if (TREE_CODE (decl) != VAR_DECL
> +             || !gimple_referenced_vars (DECL_STRUCT_FUNCTION (id->src_fn))
> +             || referenced_var_lookup (DECL_STRUCT_FUNCTION (id->src_fn),
> +                                       DECL_UID (decl)))
> +           add_referenced_var (t);

> get_var_ann is redundant with add_referenced_var.

Yeah, will drop and unminimize the patch.

> Do we really run into the
> !gimple_referenced_vars case here?

Some earlier version of the patch did.  Maybe not any more.

> @@ -4753,6 +4757,13 @@ copy_decl_for_dup_finish (copy_body_data
>         new function.  */
>      DECL_CONTEXT (copy) = id->dst_fn;

> +  if (TREE_CODE (decl) == VAR_DECL
> +      && gimple_referenced_vars (cfun)
> +      && gimple_referenced_vars (DECL_STRUCT_FUNCTION (id->src_fn))
> +      && referenced_var_lookup (DECL_STRUCT_FUNCTION (id->src_fn),
> +                               DECL_UID (decl)))
> +    add_referenced_var (copy);

> Why only for VAR_DECL decls and not to-var-decl translated PARM_DECLs?

Yeah, PARM_DECLs and RESULT_DECLs get unconditional add_referenced_var
elsewhere.

> I wanted to ask you about

>> That isn't enough.  We often attempt to estimate the stack size of a
>> function before we as much as put it in SSA form (let alone compute
>> referenced_vras), while processing other functions.

> anyway, as I don't see why we do that (nor where).

What I noticed was that the trivial patch that moved
pass_inline_parameters down and asserted gimple_referenced_vars(cfun)
failed the assertion.  I didn't investigate much, I just figured it made
sense for one function to look at stack size estimates of its
potentially-inlined callees to estimate its own, and since the call
graph may contain cycles, we can't ensure we'll always process callee
before caller.  If my “figuring” is right, I don't see how to fix this,
but I'll go back to the drawing board and investigate further.

> The cgraphunit.c change looks ok in this regard, can you explain
> the predict.c change?

We split a function after profile-guessing, and initialize_cfun copied
the PROFILE_GUESSED status from the original function to the split.
Then, when we got to running the various passes over the newly-created
version, we failed the assertion that profile_status != PROFILE_GUESSED
in gimple_predict_edge.  I didn't try to find out why this didn't happen
before the pass rearrangement.  I figured it made sense that the pass
rearrangement enabled some more inlining/versioning, exposing this kind
of situation just like it did expose the need for cleanups after early
inlining to avoid getting SSA verify failures because formerly-throwing
call stmts ended up known not to throw.

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist      Red Hat Brazil Compiler Engineer

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-02-07 16:09                             ` Jan Hubicka
@ 2011-02-07 22:34                               ` Alexandre Oliva
  0 siblings, 0 replies; 46+ messages in thread
From: Alexandre Oliva @ 2011-02-07 22:34 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Richard Guenther, gcc-patches

On Feb  7, 2011, Jan Hubicka <hubicka@ucw.cz> wrote:

> If you go for splitting early otimizatoin queue, you need to schedule
> fixup_cfg as first of early lowered passes, or any invocation of
> verify_cfg would fail until early inliner is run.

That might work as well, but I probably won't investigate further, since
the approach of splitting the passes is undesirable and probably too
destabilizing at this point.

>> Index: gcc/cgraphunit.c
>> ===================================================================
>> --- gcc/cgraphunit.c.orig	2011-02-07 09:23:31.962071817 -0200
>> +++ gcc/cgraphunit.c	2011-02-07 09:25:20.590774004 -0200
>> @@ -246,7 +246,6 @@ cgraph_process_new_functions (void)
>> cgraph_analyze_function (node);
>> push_cfun (DECL_STRUCT_FUNCTION (fndecl));
>> current_function_decl = fndecl;
>> -	  compute_inline_parameters (node);

> Hmm, this is because both ipa-sra and ipa-split take care to compute
> inliner parameters themselves?

No, it's just because the early passes called below now include the
pass that computes inline parameters.

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist      Red Hat Brazil Compiler Engineer

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-02-07 16:03                             ` Jan Hubicka
@ 2011-02-07 22:35                               ` Alexandre Oliva
  2011-02-07 23:58                                 ` Jan Hubicka
  0 siblings, 1 reply; 46+ messages in thread
From: Alexandre Oliva @ 2011-02-07 22:35 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Richard Guenther, gcc-patches

On Feb  7, 2011, Jan Hubicka <hubicka@ucw.cz> wrote:

>> On Feb  4, 2011, Alexandre Oliva <aoliva@redhat.com> wrote:
>> 
>> > That isn't enough.  We often attempt to estimate the stack size of a
>> > function before we as much as put it in SSA form (let alone compute
>> > referenced_vras), while processing other functions.

> This should happen only because compute_inline_parameters is scheduled
> too early.  Moving it just before early_inline pass should do the
> trick.

I tried that.  It didn't work, we still tried to estimate the stack size
of functions that hadn't been put in SSA form and had referenced_vars
computed.

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist      Red Hat Brazil Compiler Engineer

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-02-07 16:01                               ` Jan Hubicka
@ 2011-02-07 22:38                                 ` Alexandre Oliva
  0 siblings, 0 replies; 46+ messages in thread
From: Alexandre Oliva @ 2011-02-07 22:38 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Richard Guenther, gcc-patches

On Feb  7, 2011, Jan Hubicka <hubicka@ucw.cz> wrote:

>> The cgraphunit.c change looks ok in this regard, can you explain
>> the predict.c change?  They do look independent, splitting up

> predict.c change looks wrong to me.  pass_profile is done as part of early
> passes.  Only case it can be run on clone of already processed function is
> the case where early optimizations would produce an clone and that clone
> would be early optimized again.

Maybe it was the split that enabled the second part of the early passes
to run for nearly-generated functions.  So I won't investigate, because
we're going to (try to) go back to a single early pass.

> What was reason for splitting lowering
> passes to two?

It looked like we had to compute referenced_vars for all functions
before computing inline parameters for any, because we could attempt to
estimate the stack size of functions that hadn't gone through
referenced_vars yet.

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist      Red Hat Brazil Compiler Engineer

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-02-07 22:35                               ` Alexandre Oliva
@ 2011-02-07 23:58                                 ` Jan Hubicka
  0 siblings, 0 replies; 46+ messages in thread
From: Jan Hubicka @ 2011-02-07 23:58 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: Jan Hubicka, Richard Guenther, gcc-patches

> On Feb  7, 2011, Jan Hubicka <hubicka@ucw.cz> wrote:
> 
> >> On Feb  4, 2011, Alexandre Oliva <aoliva@redhat.com> wrote:
> >> 
> >> > That isn't enough.  We often attempt to estimate the stack size of a
> >> > function before we as much as put it in SSA form (let alone compute
> >> > referenced_vras), while processing other functions.
> 
> > This should happen only because compute_inline_parameters is scheduled
> > too early.  Moving it just before early_inline pass should do the
> > trick.
> 
> I tried that.  It didn't work, we still tried to estimate the stack size
> of functions that hadn't been put in SSA form and had referenced_vars
> computed.

From where we did so?
Honza
> 
> -- 
> Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/   FSF Latin America board member
> Free Software Evangelist      Red Hat Brazil Compiler Engineer

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-02-07 22:31                               ` Alexandre Oliva
@ 2011-02-08 11:10                                 ` Richard Guenther
  2011-02-08 11:30                                   ` Jan Hubicka
  0 siblings, 1 reply; 46+ messages in thread
From: Richard Guenther @ 2011-02-08 11:10 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: gcc-patches, Jan Hubicka

On Mon, Feb 7, 2011 at 11:30 PM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Feb  7, 2011, Richard Guenther <richard.guenther@gmail.com> wrote:
>
>> The referenced_var_lookup interface change is ok.
>
> Ok, I'll split it out and check it in.
>
>>    tree old_cur_fun_decl = current_function_decl;
>> +  referenced_var_iterator rvi;
>> +
>> +  gcc_checking_assert (gimple_referenced_vars (cfun));
>> +
>>    current_function_decl = decl;
>>    push_cfun (DECL_STRUCT_FUNCTION (decl));
>
>> looks like only the assert after pushing cfun is required?
>
> Doh!  Thanks.
>
>> Can we get away w/o cfun pushing here now?
>
> Maybe.  I'll have a look.
>
>> Index: gcc/tree-inline.c
>> ===================================================================
>> --- gcc/tree-inline.c.orig      2011-02-07 09:23:32.407058212 -0200
>> +++ gcc/tree-inline.c   2011-02-07 09:25:20.376780499 -0200
>> @@ -317,7 +317,11 @@ remap_decl (tree decl, copy_body_data *i
>>               || TREE_CODE (t) == RESULT_DECL || TREE_CODE (t) == PARM_DECL))
>>         {
>>           get_var_ann (t);
>> -         add_referenced_var (t);
>> +         if (TREE_CODE (decl) != VAR_DECL
>> +             || !gimple_referenced_vars (DECL_STRUCT_FUNCTION (id->src_fn))
>> +             || referenced_var_lookup (DECL_STRUCT_FUNCTION (id->src_fn),
>> +                                       DECL_UID (decl)))
>> +           add_referenced_var (t);
>
>> get_var_ann is redundant with add_referenced_var.
>
> Yeah, will drop and unminimize the patch.
>
>> Do we really run into the
>> !gimple_referenced_vars case here?
>
> Some earlier version of the patch did.  Maybe not any more.
>
>> @@ -4753,6 +4757,13 @@ copy_decl_for_dup_finish (copy_body_data
>>         new function.  */
>>      DECL_CONTEXT (copy) = id->dst_fn;
>
>> +  if (TREE_CODE (decl) == VAR_DECL
>> +      && gimple_referenced_vars (cfun)
>> +      && gimple_referenced_vars (DECL_STRUCT_FUNCTION (id->src_fn))
>> +      && referenced_var_lookup (DECL_STRUCT_FUNCTION (id->src_fn),
>> +                               DECL_UID (decl)))
>> +    add_referenced_var (copy);
>
>> Why only for VAR_DECL decls and not to-var-decl translated PARM_DECLs?
>
> Yeah, PARM_DECLs and RESULT_DECLs get unconditional add_referenced_var
> elsewhere.
>
>> I wanted to ask you about
>
>>> That isn't enough.  We often attempt to estimate the stack size of a
>>> function before we as much as put it in SSA form (let alone compute
>>> referenced_vras), while processing other functions.
>
>> anyway, as I don't see why we do that (nor where).
>
> What I noticed was that the trivial patch that moved
> pass_inline_parameters down and asserted gimple_referenced_vars(cfun)
> failed the assertion.  I didn't investigate much, I just figured it made
> sense for one function to look at stack size estimates of its
> potentially-inlined callees to estimate its own, and since the call
> graph may contain cycles, we can't ensure we'll always process callee
> before caller.  If my “figuring” is right, I don't see how to fix this,
> but I'll go back to the drawing board and investigate further.

Yes, we can have cycles and no referenced vars in the callee.  But
we won't inline that callee (as it isn't in SSA form), so we can just
as well skip looking at its estimated stack frame size.

>> The cgraphunit.c change looks ok in this regard, can you explain
>> the predict.c change?
>
> We split a function after profile-guessing, and initialize_cfun copied
> the PROFILE_GUESSED status from the original function to the split.
> Then, when we got to running the various passes over the newly-created
> version, we failed the assertion that profile_status != PROFILE_GUESSED
> in gimple_predict_edge.  I didn't try to find out why this didn't happen
> before the pass rearrangement.  I figured it made sense that the pass
> rearrangement enabled some more inlining/versioning, exposing this kind
> of situation just like it did expose the need for cleanups after early
> inlining to avoid getting SSA verify failures because formerly-throwing
> call stmts ended up known not to throw.

I think you are now feeding more functions into re-optimization
(that's another reason I want to avoid the pass reorg for stage4).

Richard.

>
> --
> Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/   FSF Latin America board member
> Free Software Evangelist      Red Hat Brazil Compiler Engineer
>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-02-08 11:10                                 ` Richard Guenther
@ 2011-02-08 11:30                                   ` Jan Hubicka
  0 siblings, 0 replies; 46+ messages in thread
From: Jan Hubicka @ 2011-02-08 11:30 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Alexandre Oliva, gcc-patches, Jan Hubicka

> Yes, we can have cycles and no referenced vars in the callee.  But
> we won't inline that callee (as it isn't in SSA form), so we can just
> as well skip looking at its estimated stack frame size.

Checking in_ssa_p tests in ipa-inline, I think we check them early enough
(before we check only cgraph properties - whether edge is inlined or inlining
is recursive etc.).

Functions checking inline parameters (check_inline_limits etc.) are done only
after in_ssa_p check is done, so we should be safe here.  (with exception of
disregard inline limits, you can move that check later but any value of that
flag is OK since we won't do anything with the call)

Honza
> 
> >> The cgraphunit.c change looks ok in this regard, can you explain
> >> the predict.c change?
> >
> > We split a function after profile-guessing, and initialize_cfun copied
> > the PROFILE_GUESSED status from the original function to the split.
> > Then, when we got to running the various passes over the newly-created
> > version, we failed the assertion that profile_status != PROFILE_GUESSED
> > in gimple_predict_edge.  I didn't try to find out why this didn't happen
> > before the pass rearrangement.  I figured it made sense that the pass
> > rearrangement enabled some more inlining/versioning, exposing this kind
> > of situation just like it did expose the need for cleanups after early
> > inlining to avoid getting SSA verify failures because formerly-throwing
> > call stmts ended up known not to throw.
> 
> I think you are now feeding more functions into re-optimization
> (that's another reason I want to avoid the pass reorg for stage4).
> 
> Richard.
> 
> >
> > --
> > Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
> > You must be the change you wish to see in the world. -- Gandhi
> > Be Free! -- http://FSFLA.org/   FSF Latin America board member
> > Free Software Evangelist      Red Hat Brazil Compiler Engineer
> >

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-02-07 13:39                             ` Richard Guenther
  2011-02-07 16:01                               ` Jan Hubicka
  2011-02-07 22:31                               ` Alexandre Oliva
@ 2011-02-12 13:45                               ` Alexandre Oliva
  2011-02-14 12:41                                 ` Richard Guenther
                                                   ` (2 more replies)
  2 siblings, 3 replies; 46+ messages in thread
From: Alexandre Oliva @ 2011-02-12 13:45 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc-patches, Jan Hubicka

[-- Attachment #1: Type: text/plain, Size: 2448 bytes --]

On Feb  7, 2011, Richard Guenther <richard.guenther@gmail.com> wrote:

> The referenced_var_lookup interface change is ok.

After much hacking and testing, I have a patchset that works (no
regressions) and that I'm happy with.

I've broken it up into preparation patches with interface changes, the
aactual change, one independent bug fix, some additional improvements
that I regard as slightly risky, but that have survived multiple
bootstrap cycles (-O1 and -O3 in addition to -O2, on i686 and x86_64),
and some additional patches I used for testing and for verifying that
the estimated stack sizes are unchanged from the current method, when
debug info is not enabled (when it is, variables retained in lexical
blocks for debug information only used to inflate the estimate, but now
they no longer do)

On Feb  7, 2011, Jan Hubicka <hubicka@ucw.cz> wrote:

>> we still tried to estimate the stack size of functions that hadn't
>> been put in SSA form and had referenced_vars computed.

>> From where we did so?

SRA, indeed.

On Feb  8, 2011, Richard Guenther <richard.guenther@gmail.com> wrote:

> Yes, we can have cycles and no referenced vars in the callee.  But
> we won't inline that callee (as it isn't in SSA form), so we can just
> as well skip looking at its estimated stack frame size.

Excellent!

On Feb  8, 2011, Jan Hubicka <hubicka@ucw.cz> wrote:

> Functions checking inline parameters (check_inline_limits etc.) are done only
> after in_ssa_p check is done, so we should be safe here.

Yup.

> (with exception of disregard inline limits, you can move that check
> later but any value of that flag is OK since we won't do anything with
> the call)

I hadn't realized what this note was about before now.  I hit the
problem of processing functions in the wrong order, breaking pr42632.c,
which is why I introduced a new pass, compute_inlinable, at the point
the initial compute_inline_params was.  I tried not calling
compute_inlinable from within compute_inline_parameters, but I found out
inlinable changed, or had to be computed separately, or something like
that.  What I didn't try was to replace the explicit call of
compute_inline_parameters in cgraph_process_new_functions with
compute_inlinable.  From the failures I observed, it now occurs to me
that this might be what I was missing.  Anyhow, this patch works, but if
you'd rather I make this other change, I will.


Anyhow, here is the patch set.

Interface changes:


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: referenced-var-lookup-fn-arg.patch --]
[-- Type: text/x-diff, Size: 4315 bytes --]

for  gcc/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	PR debug/47106
	PR debug/47402
	* tree-flow.h (referenced_var_lookup): Add fn parameter.
	Adjust all callers.
	* tree-dfa.c (referenced_var_lookup): Use fn instead of cfun.
	* tree-flow-inline.h: Adjust.
	* gimple-pretty-print.c: Adjust.
	* tree-into-ssa.c: Adjust.
	* tree-ssa.c: Adjust.
	* cfgexpand.c: Adjust.

Index: gcc/tree-flow.h
===================================================================
--- gcc/tree-flow.h.orig	2011-02-10 04:20:23.466643733 -0200
+++ gcc/tree-flow.h	2011-02-10 04:23:07.004150142 -0200
@@ -319,7 +319,7 @@ typedef struct
        !end_referenced_vars_p (&(ITER)); \
        (VAR) = next_referenced_var (&(ITER)))
 
-extern tree referenced_var_lookup (unsigned int);
+extern tree referenced_var_lookup (struct function *, unsigned int);
 extern bool referenced_var_check_and_insert (tree);
 #define num_referenced_vars htab_elements (gimple_referenced_vars (cfun))
 
Index: gcc/tree-dfa.c
===================================================================
--- gcc/tree-dfa.c.orig	2011-02-10 04:20:23.160650315 -0200
+++ gcc/tree-dfa.c	2011-02-10 04:23:06.417162741 -0200
@@ -488,12 +488,12 @@ find_referenced_vars_in (gimple stmt)
    variable.  */
 
 tree
-referenced_var_lookup (unsigned int uid)
+referenced_var_lookup (struct function *fn, unsigned int uid)
 {
   tree h;
   struct tree_decl_minimal in;
   in.uid = uid;
-  h = (tree) htab_find_with_hash (gimple_referenced_vars (cfun), &in, uid);
+  h = (tree) htab_find_with_hash (gimple_referenced_vars (fn), &in, uid);
   return h;
 }
 
Index: gcc/tree-flow-inline.h
===================================================================
--- gcc/tree-flow-inline.h.orig	2011-02-10 04:20:22.943654959 -0200
+++ gcc/tree-flow-inline.h	2011-02-10 04:23:05.289186749 -0200
@@ -103,7 +103,7 @@ next_htab_element (htab_iterator *hti)
 static inline tree
 referenced_var (unsigned int uid)
 {
-  tree var = referenced_var_lookup (uid);
+  tree var = referenced_var_lookup (cfun, uid);
   gcc_assert (var || uid == 0);
   return var;
 }
Index: gcc/gimple-pretty-print.c
===================================================================
--- gcc/gimple-pretty-print.c.orig	2011-02-10 04:20:22.823657516 -0200
+++ gcc/gimple-pretty-print.c	2011-02-10 04:20:27.672553932 -0200
@@ -542,7 +542,7 @@ pp_points_to_solution (pretty_printer *b
       pp_string (buffer, "{ ");
       EXECUTE_IF_SET_IN_BITMAP (pt->vars, 0, i, bi)
 	{
-	  tree var = referenced_var_lookup (i);
+	  tree var = referenced_var_lookup (cfun, i);
 	  if (var)
 	    {
 	      dump_generic_node (buffer, var, 0, dump_flags, false);
Index: gcc/tree-into-ssa.c
===================================================================
--- gcc/tree-into-ssa.c.orig	2011-02-10 04:20:23.046652755 -0200
+++ gcc/tree-into-ssa.c	2011-02-10 04:23:06.064170211 -0200
@@ -1469,7 +1469,7 @@ dump_decl_set (FILE *file, bitmap set)
 
       EXECUTE_IF_SET_IN_BITMAP (set, 0, i, bi)
 	{
-	  tree var = referenced_var_lookup (i);
+	  tree var = referenced_var_lookup (cfun, i);
 	  if (var)
 	    print_generic_expr (file, var, 0);
 	  else
Index: gcc/tree-ssa.c
===================================================================
--- gcc/tree-ssa.c.orig	2011-02-10 04:20:23.302647274 -0200
+++ gcc/tree-ssa.c	2011-02-10 04:23:06.817154171 -0200
@@ -1930,7 +1930,7 @@ maybe_optimize_var (tree var, bitmap add
 
   /* If the variable is not in the list of referenced vars then we
      do not need to touch it nor can we rename it.  */
-  if (!referenced_var_lookup (DECL_UID (var)))
+  if (!referenced_var_lookup (cfun, DECL_UID (var)))
     return false;
 
   if (TREE_ADDRESSABLE (var)
Index: gcc/cfgexpand.c
===================================================================
--- gcc/cfgexpand.c.orig	2011-02-10 04:20:22.688660397 -0200
+++ gcc/cfgexpand.c	2011-02-10 04:23:00.395291314 -0200
@@ -520,7 +520,7 @@ update_alias_info_with_stack_vars (void)
 	     for -O0 where we are preserving even unreferenced variables.  */
 	  gcc_assert (DECL_P (decl)
 		      && (!optimize
-			  || referenced_var_lookup (DECL_UID (decl))));
+			  || referenced_var_lookup (cfun, DECL_UID (decl))));
 	  bitmap_set_bit (part, uid);
 	  *((bitmap *) pointer_map_insert (decls_to_partitions,
 					   (void *)(size_t) uid)) = part;

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #3: for-each-referenced-var-fn-arg.patch --]
[-- Type: text/x-diff, Size: 8257 bytes --]

for  gcc/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	PR debug/47106
	PR debug/47402
	* tree-flow.h (FOR_EACH_REFERENCED_VAR): Add FN argument.
	Adjust all users.  Pass FN to...
	* tree-flow-inline.h (first_referenced_var): ... this.  Add
	fn argument.
	* ipa-struct-reorg.c: Adjust.
	* tree-dfa.c: Adjust.
	* tree-into-ssa.c: Adjust.
	* tree-sra.c: Adjust.
	* tree-ssa-alias.c: Adjust.
	* tree-ssa-live.c: Adjust.
	* tree-ssa.c: Adjust.
	* tree-ssanames.c: Adjust.
	* tree-tailcall.c: Adjust.

Index: gcc/tree-flow.h
===================================================================
--- gcc/tree-flow.h.orig	2011-02-10 04:20:27.617555109 -0200
+++ gcc/tree-flow.h	2011-02-10 04:20:29.087522610 -0200
@@ -314,9 +314,9 @@ typedef struct
    to the hashtable while using this macro.  Doing so may cause it to behave
    erratically.  */
 
-#define FOR_EACH_REFERENCED_VAR(VAR, ITER) \
-  for ((VAR) = first_referenced_var (&(ITER)); \
-       !end_referenced_vars_p (&(ITER)); \
+#define FOR_EACH_REFERENCED_VAR(FN, VAR, ITER)		\
+  for ((VAR) = first_referenced_var ((FN), &(ITER));	\
+       !end_referenced_vars_p (&(ITER));		\
        (VAR) = next_referenced_var (&(ITER)))
 
 extern tree referenced_var_lookup (struct function *, unsigned int);
Index: gcc/tree-flow-inline.h
===================================================================
--- gcc/tree-flow-inline.h.orig	2011-02-10 04:20:27.650554402 -0200
+++ gcc/tree-flow-inline.h	2011-02-10 04:22:33.508865705 -0200
@@ -112,10 +112,10 @@ referenced_var (unsigned int uid)
    referenced_vars hashtable, and return that variable.  */
 
 static inline tree
-first_referenced_var (referenced_var_iterator *iter)
+first_referenced_var (struct function *fn, referenced_var_iterator *iter)
 {
   return (tree) first_htab_element (&iter->hti,
-				    gimple_referenced_vars (cfun));
+				    gimple_referenced_vars (fn));
 }
 
 /* Return true if we have hit the end of the referenced variables ITER is
Index: gcc/ipa-struct-reorg.c
===================================================================
--- gcc/ipa-struct-reorg.c.orig	2011-02-10 04:20:21.457686677 -0200
+++ gcc/ipa-struct-reorg.c	2011-02-10 04:20:29.142522529 -0200
@@ -2712,7 +2712,7 @@ create_new_local_vars (void)
   new_local_vars = htab_create (num_referenced_vars,
 				new_var_hash, new_var_eq, NULL);
 
-  FOR_EACH_REFERENCED_VAR (var, rvi)
+  FOR_EACH_REFERENCED_VAR (cfun, var, rvi)
     {
       if (!is_global_var (var))
 	create_new_var (var, new_local_vars);
Index: gcc/tree-dfa.c
===================================================================
--- gcc/tree-dfa.c.orig	2011-02-10 04:20:27.635554720 -0200
+++ gcc/tree-dfa.c	2011-02-10 04:20:29.157522207 -0200
@@ -218,7 +218,7 @@ dump_referenced_vars (FILE *file)
   fprintf (file, "\nReferenced variables in %s: %u\n\n",
 	   get_name (current_function_decl), (unsigned) num_referenced_vars);
 
-  FOR_EACH_REFERENCED_VAR (var, rvi)
+  FOR_EACH_REFERENCED_VAR (cfun, var, rvi)
     {
       fprintf (file, "Variable: ");
       dump_variable (file, var);
@@ -400,7 +400,7 @@ collect_dfa_stats (struct dfa_stats_d *d
   memset ((void *)dfa_stats_p, 0, sizeof (struct dfa_stats_d));
 
   /* Count all the variable annotations.  */
-  FOR_EACH_REFERENCED_VAR (var, vi)
+  FOR_EACH_REFERENCED_VAR (cfun, var, vi)
     if (var_ann (var))
       dfa_stats_p->num_var_anns++;
 
Index: gcc/tree-into-ssa.c
===================================================================
--- gcc/tree-into-ssa.c.orig	2011-02-10 04:20:27.726552777 -0200
+++ gcc/tree-into-ssa.c	2011-02-10 04:20:29.185521611 -0200
@@ -1156,7 +1156,7 @@ insert_phi_nodes (bitmap_head *dfs)
      differences but no UID ordering differences.  */
 
   vars = BITMAP_ALLOC (NULL);
-  FOR_EACH_REFERENCED_VAR (var, rvi)
+  FOR_EACH_REFERENCED_VAR (cfun, var, rvi)
     {
       struct def_blocks_d *def_map;
 
@@ -1573,7 +1573,7 @@ dump_currdefs (FILE *file)
   tree var;
 
   fprintf (file, "\n\nCurrent reaching definitions\n\n");
-  FOR_EACH_REFERENCED_VAR (var, i)
+  FOR_EACH_REFERENCED_VAR (cfun, var, i)
     if (SYMS_TO_RENAME (cfun) == NULL
 	|| bitmap_bit_p (SYMS_TO_RENAME (cfun), DECL_UID (var)))
       {
@@ -2313,7 +2313,7 @@ init_ssa_renamer (void)
   def_blocks = htab_create (num_referenced_vars, def_blocks_hash,
                             def_blocks_eq, def_blocks_free);
 
-  FOR_EACH_REFERENCED_VAR(var, rvi)
+  FOR_EACH_REFERENCED_VAR (cfun, var, rvi)
     set_current_def (var, NULL_TREE);
 }
 
Index: gcc/tree-sra.c
===================================================================
--- gcc/tree-sra.c.orig	2011-02-10 04:20:21.083694687 -0200
+++ gcc/tree-sra.c	2011-02-10 04:20:29.221520842 -0200
@@ -1540,7 +1540,7 @@ find_var_candidates (void)
   referenced_var_iterator rvi;
   bool ret = false;
 
-  FOR_EACH_REFERENCED_VAR (var, rvi)
+  FOR_EACH_REFERENCED_VAR (cfun, var, rvi)
     {
       if (TREE_CODE (var) != VAR_DECL && TREE_CODE (var) != PARM_DECL)
         continue;
Index: gcc/tree-ssa-alias.c
===================================================================
--- gcc/tree-ssa-alias.c.orig	2011-02-10 04:20:20.624704633 -0200
+++ gcc/tree-ssa-alias.c	2011-02-10 04:20:29.243520372 -0200
@@ -364,7 +364,7 @@ dump_alias_info (FILE *file)
 
   fprintf (file, "Aliased symbols\n\n");
 
-  FOR_EACH_REFERENCED_VAR (var, rvi)
+  FOR_EACH_REFERENCED_VAR (cfun, var, rvi)
     {
       if (may_be_aliased (var))
 	dump_variable (file, var);
Index: gcc/tree-ssa-live.c
===================================================================
--- gcc/tree-ssa-live.c.orig	2011-02-10 04:20:20.850699713 -0200
+++ gcc/tree-ssa-live.c	2011-02-10 04:20:29.261519988 -0200
@@ -705,7 +705,7 @@ remove_unused_locals (void)
   mark_scope_block_unused (DECL_INITIAL (current_function_decl));
 
   /* Assume all locals are unused.  */
-  FOR_EACH_REFERENCED_VAR (t, rvi)
+  FOR_EACH_REFERENCED_VAR (cfun, t, rvi)
     clear_is_used (t);
 
   /* Walk the CFG marking all referenced symbols.  */
@@ -821,7 +821,7 @@ remove_unused_locals (void)
     }
 
   /* Remove unused variables from REFERENCED_VARs.  */
-  FOR_EACH_REFERENCED_VAR (t, rvi)
+  FOR_EACH_REFERENCED_VAR (cfun, t, rvi)
     if (!is_global_var (t)
 	&& TREE_CODE (t) != PARM_DECL
 	&& TREE_CODE (t) != RESULT_DECL
Index: gcc/tree-ssa.c
===================================================================
--- gcc/tree-ssa.c.orig	2011-02-10 04:20:27.751552245 -0200
+++ gcc/tree-ssa.c	2011-02-10 04:20:29.279519603 -0200
@@ -1157,7 +1157,7 @@ delete_tree_ssa (void)
   tree var;
 
   /* Remove annotations from every referenced local variable.  */
-  FOR_EACH_REFERENCED_VAR (var, rvi)
+  FOR_EACH_REFERENCED_VAR (cfun, var, rvi)
     {
       if (is_global_var (var))
 	continue;
Index: gcc/tree-ssanames.c
===================================================================
--- gcc/tree-ssanames.c.orig	2011-02-10 04:20:20.484707352 -0200
+++ gcc/tree-ssanames.c	2011-02-10 04:20:29.292519326 -0200
@@ -340,7 +340,7 @@ release_dead_ssa_names (void)
 
   /* Current defs point to various dead SSA names that in turn point to
      eventually dead variables so a bunch of memory is held live.  */
-  FOR_EACH_REFERENCED_VAR (t, rvi)
+  FOR_EACH_REFERENCED_VAR (cfun, t, rvi)
     set_current_def (t, NULL);
   /* Now release the freelist.  */
   for (t = FREE_SSANAMES (cfun); t; t = next)
Index: gcc/tree-tailcall.c
===================================================================
--- gcc/tree-tailcall.c.orig	2011-02-10 04:20:20.349710447 -0200
+++ gcc/tree-tailcall.c	2011-02-10 04:20:29.306519026 -0200
@@ -481,7 +481,7 @@ find_tail_calls (basic_block bb, struct 
 
   /* Make sure the tail invocation of this function does not refer
      to local variables.  */
-  FOR_EACH_REFERENCED_VAR (var, rvi)
+  FOR_EACH_REFERENCED_VAR (cfun, var, rvi)
     {
       if (TREE_CODE (var) != PARM_DECL
 	  && auto_var_in_fn_p (var, cfun->decl)
@@ -889,7 +889,7 @@ add_virtual_phis (void)
      this, we cannot do much better than to rebuild the ssa form for
      possibly affected virtual ssa names from scratch.  */
 
-  FOR_EACH_REFERENCED_VAR (var, rvi)
+  FOR_EACH_REFERENCED_VAR (cfun, var, rvi)
     {
       if (!is_gimple_reg (var) && gimple_default_def (cfun, var) != NULL_TREE)
 	mark_sym_for_renaming (var);

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #4: estimated-stack-frame-size-cgraph-node-arg.patch --]
[-- Type: text/x-diff, Size: 3391 bytes --]

for  gcc/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	PR debug/47106
	PR debug/47402
	* tree-inline.h (estimated_stack_frame_size): Take cgraph node
	rather than decl.
	* cfgexpand.c (estimated_stack_frame_size): Likewise.
	* ipa-inline.c (compute_inline_parameters): Adjust.

Index: gcc/tree-inline.h
===================================================================
--- gcc/tree-inline.h.orig	2011-02-10 04:20:18.394752112 -0200
+++ gcc/tree-inline.h	2011-02-10 04:20:34.606405625 -0200
@@ -188,6 +188,6 @@ extern tree remap_decl (tree decl, copy_
 extern tree remap_type (tree type, copy_body_data *id);
 extern gimple_seq copy_gimple_seq_and_replace_locals (gimple_seq seq);
 
-extern HOST_WIDE_INT estimated_stack_frame_size (tree);
+extern HOST_WIDE_INT estimated_stack_frame_size (struct cgraph_node *);
 
 #endif /* GCC_TREE_INLINE_H */
Index: gcc/cfgexpand.c
===================================================================
--- gcc/cfgexpand.c.orig	2011-02-10 04:20:27.785551518 -0200
+++ gcc/cfgexpand.c	2011-02-10 04:22:53.918429701 -0200
@@ -1366,24 +1366,25 @@ fini_vars_expansion (void)
   stack_vars_alloc = stack_vars_num = 0;
 }
 
-/* Make a fair guess for the size of the stack frame of the decl
-   passed.  This doesn't have to be exact, the result is only used
-   in the inline heuristics.  So we don't want to run the full stack
-   var packing algorithm (which is quadratic in the number of stack
-   vars).  Instead, we calculate the total size of all stack vars.
-   This turns out to be a pretty fair estimate -- packing of stack
-   vars doesn't happen very often.  */
+/* Make a fair guess for the size of the stack frame of the function
+   in NODE.  This doesn't have to be exact, the result is only used in
+   the inline heuristics.  So we don't want to run the full stack var
+   packing algorithm (which is quadratic in the number of stack vars).
+   Instead, we calculate the total size of all stack vars.  This turns
+   out to be a pretty fair estimate -- packing of stack vars doesn't
+   happen very often.  */
 
 HOST_WIDE_INT
-estimated_stack_frame_size (tree decl)
+estimated_stack_frame_size (struct cgraph_node *node)
 {
   HOST_WIDE_INT size = 0;
   size_t i;
   tree var, outer_block = DECL_INITIAL (current_function_decl);
   unsigned ix;
   tree old_cur_fun_decl = current_function_decl;
-  current_function_decl = decl;
-  push_cfun (DECL_STRUCT_FUNCTION (decl));
+
+  current_function_decl = node->decl;
+  push_cfun (DECL_STRUCT_FUNCTION (node->decl));
 
   init_vars_expansion ();
 
Index: gcc/ipa-inline.c
===================================================================
--- gcc/ipa-inline.c.orig	2011-02-10 04:20:18.579748152 -0200
+++ gcc/ipa-inline.c	2011-02-10 04:22:57.573351625 -0200
@@ -1982,9 +1982,8 @@ compute_inline_parameters (struct cgraph
 
   gcc_assert (!node->global.inlined_to);
 
-  /* Estimate the stack size for the function.  But not at -O0
-     because estimated_stack_frame_size is a quadratic problem.  */
-  self_stack_size = optimize ? estimated_stack_frame_size (node->decl) : 0;
+  /* Estimate the stack size for the function if we're optimizing.  */
+  self_stack_size = optimize ? estimated_stack_frame_size (node) : 0;
   inline_summary (node)->estimated_self_stack_size = self_stack_size;
   node->global.estimated_stack_size = self_stack_size;
   node->global.stack_frame_offset = 0;

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #5: compute-inline-params-ret-void.patch --]
[-- Type: text/x-diff, Size: 1454 bytes --]

for  gcc/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	PR debug/47106
	PR debug/47402
	* cgraph.h (compute_inline_parameters): Return void.
	* ipa-inline.c (compute_inline_parameters): Adjust.

Index: gcc/cgraph.h
===================================================================
--- gcc/cgraph.h.orig	2011-02-10 04:20:17.504771074 -0200
+++ gcc/cgraph.h	2011-02-10 04:21:08.306685881 -0200
@@ -769,7 +769,7 @@ varpool_next_static_initializer (struct 
 
 /* In ipa-inline.c  */
 void cgraph_clone_inlined_nodes (struct cgraph_edge *, bool, bool);
-unsigned int compute_inline_parameters (struct cgraph_node *);
+void compute_inline_parameters (struct cgraph_node *);
 
 
 /* Create a new static variable of type TYPE.  */
Index: gcc/ipa-inline.c
===================================================================
--- gcc/ipa-inline.c.orig	2011-02-10 04:20:34.655404759 -0200
+++ gcc/ipa-inline.c	2011-02-10 04:22:54.129425181 -0200
@@ -1975,7 +1975,7 @@ estimate_function_body_sizes (struct cgr
 }
 
 /* Compute parameters of functions used by inliner.  */
-unsigned int
+void
 compute_inline_parameters (struct cgraph_node *node)
 {
   HOST_WIDE_INT self_stack_size;
@@ -2013,7 +2013,6 @@ compute_inline_parameters (struct cgraph
   /* Inlining characteristics are maintained by the cgraph_mark_inline.  */
   node->global.time = inline_summary (node)->self_time;
   node->global.size = inline_summary (node)->self_size;
-  return 0;
 }
 
 

[-- Attachment #6: Type: text/plain, Size: 83 bytes --]


The patch that fixes the problem, using referenced_vars to estimate
stack sizes.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #7: ipa-inline-stack-growth-referenced-vars-pr47106.patch --]
[-- Type: text/x-diff, Size: 13630 bytes --]

for  gcc/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	PR debug/47106
	PR debug/47402
	* cfgexpand.c (account_used_vars_for_block): Remove.
	(estimated_stack_frame_size): Use referenced vars.
	* tree-inline.c (remap_decl): Only mark VAR_DECLs as referenced
	that were referenced in the original function.  Test src_fn
	rather than cfun.  Drop redundant get_var_ann.
	(setup_one_parameter): Drop redundant get_var_ann.
	(declare_return_variable): Likewise.
	(copy_decl_for_dup_finish): Mark VAR_DECLs referenced in src_fn.
	(copy_arguments_for_versioning): Drop redundant get_var_ann.
	* tree-pass.h (pass_inlinable): Declare.
	* passes.c (init_optimization_passes): Move first
	pass_inline_parameters after pass_referenced_vars.  Add
	pass_inlinable where it was before.
	* ipa-inline.c (compute_inlinable): New, split out of...
	(compute_inline_parameters): ... this.  Quit if referenced_vars
	are not available.
	(compute_inlinable_for_current, pass_inlinable): New.
	(pass_inline_parameters): Require PROP_referenced_vars.
	* cgraphunit.c (cgraph_process_new_functions): Don't run
	compute_inline_parameters explicitly.

for  gcc/testsuite/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	PR debug/47106
	PR debug/47402
	* g++.dg/debug/pr47106.C: New.

Index: gcc/cfgexpand.c
===================================================================
--- gcc/cfgexpand.c.orig	2011-02-11 13:42:37.430636412 -0200
+++ gcc/cfgexpand.c	2011-02-11 13:42:40.067581306 -0200
@@ -1311,30 +1311,6 @@ create_stack_guard (void)
   crtl->stack_protect_guard = guard;
 }
 
-/* A subroutine of expand_used_vars.  Walk down through the BLOCK tree
-   expanding variables.  Those variables that can be put into registers
-   are allocated pseudos; those that can't are put on the stack.
-
-   TOPLEVEL is true if this is the outermost BLOCK.  */
-
-static HOST_WIDE_INT
-account_used_vars_for_block (tree block, bool toplevel)
-{
-  tree t;
-  HOST_WIDE_INT size = 0;
-
-  /* Expand all variables at this level.  */
-  for (t = BLOCK_VARS (block); t ; t = DECL_CHAIN (t))
-    if (var_ann (t) && is_used_p (t))
-      size += expand_one_var (t, toplevel, false);
-
-  /* Expand all variables at containing levels.  */
-  for (t = BLOCK_SUBBLOCKS (block); t ; t = BLOCK_CHAIN (t))
-    size += account_used_vars_for_block (t, false);
-
-  return size;
-}
-
 /* Prepare for expanding variables.  */
 static void
 init_vars_expansion (void)
@@ -1379,23 +1355,17 @@ estimated_stack_frame_size (struct cgrap
 {
   HOST_WIDE_INT size = 0;
   size_t i;
-  tree var, outer_block = DECL_INITIAL (current_function_decl);
-  unsigned ix;
+  tree var;
   tree old_cur_fun_decl = current_function_decl;
+  referenced_var_iterator rvi;
+  struct function *fn = DECL_STRUCT_FUNCTION (node->decl);
 
   current_function_decl = node->decl;
-  push_cfun (DECL_STRUCT_FUNCTION (node->decl));
+  push_cfun (fn);
 
-  init_vars_expansion ();
-
-  FOR_EACH_LOCAL_DECL (cfun, ix, var)
-    {
-      /* TREE_USED marks local variables that do not appear in lexical
-	 blocks.  We don't want to expand those that do twice.  */
-      if (TREE_USED (var))
-        size += expand_one_var (var, true, false);
-    }
-  size += account_used_vars_for_block (outer_block, true);
+  gcc_checking_assert (gimple_referenced_vars (fn));
+  FOR_EACH_REFERENCED_VAR (fn, var, rvi)
+    size += expand_one_var (var, true, false);
 
   if (stack_vars_num > 0)
     {
Index: gcc/tree-inline.c
===================================================================
--- gcc/tree-inline.c.orig	2011-02-11 13:42:38.454614859 -0200
+++ gcc/tree-inline.c	2011-02-11 13:42:40.122579711 -0200
@@ -312,13 +312,17 @@ remap_decl (tree decl, copy_body_data *i
 	    walk_tree (&DECL_QUALIFIER (t), copy_tree_body_r, id, NULL);
 	}
 
-      if (cfun && gimple_in_ssa_p (cfun)
-	  && (TREE_CODE (t) == VAR_DECL
-	      || TREE_CODE (t) == RESULT_DECL || TREE_CODE (t) == PARM_DECL))
-	{
-	  get_var_ann (t);
-	  add_referenced_var (t);
-	}
+      if ((TREE_CODE (t) == VAR_DECL
+	   || TREE_CODE (t) == RESULT_DECL
+	   || TREE_CODE (t) == PARM_DECL)
+	  && id->src_fn && DECL_STRUCT_FUNCTION (id->src_fn)
+	  && gimple_referenced_vars (DECL_STRUCT_FUNCTION (id->src_fn))
+	  /* We don't want to mark as referenced VAR_DECLs that were
+	     not marked as such in the src function.  */
+	  && (TREE_CODE (decl) != VAR_DECL
+	      || referenced_var_lookup (DECL_STRUCT_FUNCTION (id->src_fn),
+					DECL_UID (decl))))
+	add_referenced_var (t);
       return t;
     }
 
@@ -2547,10 +2551,7 @@ setup_one_parameter (copy_body_data *id,
 
   /* We're actually using the newly-created var.  */
   if (gimple_in_ssa_p (cfun) && TREE_CODE (var) == VAR_DECL)
-    {
-      get_var_ann (var);
-      add_referenced_var (var);
-    }
+    add_referenced_var (var);
 
   /* Declare this new variable.  */
   DECL_CHAIN (var) = *vars;
@@ -2857,10 +2858,7 @@ declare_return_variable (copy_body_data 
 
   var = copy_result_decl_to_var (result, id);
   if (gimple_in_ssa_p (cfun))
-    {
-      get_var_ann (var);
-      add_referenced_var (var);
-    }
+    add_referenced_var (var);
 
   DECL_SEEN_IN_BIND_EXPR_P (var) = 1;
 
@@ -2896,10 +2894,7 @@ declare_return_variable (copy_body_data 
     {
       tree temp = create_tmp_var (TREE_TYPE (result), "retvalptr");
       if (gimple_in_ssa_p (id->src_cfun))
-	{
-	  get_var_ann (temp);
-	  add_referenced_var (temp);
-	}
+	add_referenced_var (temp);
       insert_decl_map (id, result, temp);
       /* When RESULT_DECL is in SSA form, we need to use it's default_def
 	 SSA_NAME.  */
@@ -4753,6 +4748,14 @@ copy_decl_for_dup_finish (copy_body_data
        new function.  */
     DECL_CONTEXT (copy) = id->dst_fn;
 
+  if (TREE_CODE (decl) == VAR_DECL
+      /* C++ clones functions during parsing, before
+	 referenced_vars.  */
+      && gimple_referenced_vars (DECL_STRUCT_FUNCTION (id->src_fn))
+      && referenced_var_lookup (DECL_STRUCT_FUNCTION (id->src_fn),
+				DECL_UID (decl)))
+    add_referenced_var (copy);
+
   return copy;
 }
 
@@ -4864,7 +4867,6 @@ copy_arguments_for_versioning (tree orig
 	   as temporary variable later in function, the uses will be
 	   replaced by local variable.  */
 	tree var = copy_decl_to_var (arg, id);
-	get_var_ann (var);
 	add_referenced_var (var);
 	insert_decl_map (id, arg, var);
         /* Declare this new variable.  */
Index: gcc/passes.c
===================================================================
--- gcc/passes.c.orig	2011-02-11 13:42:38.255618853 -0200
+++ gcc/passes.c	2011-02-11 13:42:40.146579235 -0200
@@ -729,7 +729,7 @@ init_optimization_passes (void)
   NEXT_PASS (pass_build_cfg);
   NEXT_PASS (pass_warn_function_return);
   NEXT_PASS (pass_build_cgraph_edges);
-  NEXT_PASS (pass_inline_parameters);
+  NEXT_PASS (pass_inlinable);
   *p = NULL;
 
   /* Interprocedural optimization passes.  */
@@ -747,12 +747,8 @@ init_optimization_passes (void)
       NEXT_PASS (pass_build_ssa);
       NEXT_PASS (pass_lower_vector);
       NEXT_PASS (pass_early_warn_uninitialized);
-      /* Note that it is not strictly necessary to schedule an early
-	 inline pass here.  However, some test cases (e.g.,
-	 g++.dg/other/p334435.C g++.dg/other/i386-1.C) expect extern
-	 inline functions to be inlined even at -O0.  This does not
-	 happen during the first early inline pass.  */
       NEXT_PASS (pass_rebuild_cgraph_edges);
+      NEXT_PASS (pass_inline_parameters);
       NEXT_PASS (pass_early_inline);
       NEXT_PASS (pass_all_early_optimizations);
 	{
Index: gcc/ipa-inline.c
===================================================================
--- gcc/ipa-inline.c.orig	2011-02-11 13:42:37.859627265 -0200
+++ gcc/ipa-inline.c	2011-02-11 13:42:55.655254908 -0200
@@ -1974,21 +1974,10 @@ estimate_function_body_sizes (struct cgr
   inline_summary (node)->size_inlining_benefit = size_inlining_benefit;
 }
 
-/* Compute parameters of functions used by inliner.  */
-void
-compute_inline_parameters (struct cgraph_node *node)
+/* Determine whether node is inlinable.  */
+static void
+compute_inlinable (struct cgraph_node *node)
 {
-  HOST_WIDE_INT self_stack_size;
-
-  gcc_assert (!node->global.inlined_to);
-
-  /* Estimate the stack size for the function if we're optimizing.  */
-  self_stack_size = optimize ? estimated_stack_frame_size (node) : 0;
-  inline_summary (node)->estimated_self_stack_size = self_stack_size;
-  node->global.estimated_stack_size = self_stack_size;
-  node->global.stack_frame_offset = 0;
-
-  /* Can this function be inlined at all?  */
   node->local.inlinable = tree_inlinable_function_p (node->decl);
 
   /* Inlinable functions always can change signature.  */
@@ -1998,7 +1987,7 @@ compute_inline_parameters (struct cgraph
     {
       struct cgraph_edge *e;
 
-      /* Functions calling builtlin_apply can not change signature.  */
+      /* Functions calling builtin_apply can not change signature.  */
       for (e = node->callees; e; e = e->next_callee)
 	if (DECL_BUILT_IN (e->callee->decl)
 	    && DECL_BUILT_IN_CLASS (e->callee->decl) == BUILT_IN_NORMAL
@@ -2006,9 +1995,33 @@ compute_inline_parameters (struct cgraph
 	  break;
       node->local.can_change_signature = !e;
     }
+
   if (node->local.inlinable && !node->local.disregard_inline_limits)
     node->local.disregard_inline_limits
       = DECL_DISREGARD_INLINE_LIMITS (node->decl);
+}
+
+/* Compute parameters of functions used by inliner.  */
+void
+compute_inline_parameters (struct cgraph_node *node)
+{
+  HOST_WIDE_INT self_stack_size;
+
+  gcc_assert (!node->global.inlined_to);
+
+  compute_inlinable (node);
+
+  /* A function won't be considered for inlining before we put it in
+     SSA form and compute referenced_vars.  */
+  if (!gimple_referenced_vars (DECL_STRUCT_FUNCTION (node->decl)))
+    return;
+
+  /* Estimate the stack size for the function if we're optimizing.  */
+  self_stack_size = optimize ? estimated_stack_frame_size (node) : 0;
+  inline_summary (node)->estimated_self_stack_size = self_stack_size;
+  node->global.estimated_stack_size = self_stack_size;
+  node->global.stack_frame_offset = 0;
+
   estimate_function_body_sizes (node);
   /* Inlining characteristics are maintained by the cgraph_mark_inline.  */
   node->global.time = inline_summary (node)->self_time;
@@ -2016,6 +2029,14 @@ compute_inline_parameters (struct cgraph
 }
 
 
+/* Determine whether current_function_decl is inlinable.  */
+static unsigned int
+compute_inlinable_for_current (void)
+{
+  compute_inlinable (cgraph_node (current_function_decl));
+  return 0;
+}
+
 /* Compute parameters of functions used by inliner using
    current_function_decl.  */
 static unsigned int
@@ -2025,6 +2046,25 @@ compute_inline_parameters_for_current (v
   return 0;
 }
 
+struct gimple_opt_pass pass_inlinable =
+{
+ {
+  GIMPLE_PASS,
+  "inlinable",				/* name */
+  NULL,					/* gate */
+  compute_inlinable_for_current,	/* execute */
+  NULL,					/* sub */
+  NULL,					/* next */
+  0,					/* static_pass_number */
+  TV_INLINE_HEURISTICS,			/* tv_id */
+  0,					/* properties_required */
+  0,					/* properties_provided */
+  0,					/* properties_destroyed */
+  0,					/* todo_flags_start */
+  0					/* todo_flags_finish */
+ }
+};
+
 struct gimple_opt_pass pass_inline_parameters =
 {
  {
@@ -2036,7 +2076,7 @@ struct gimple_opt_pass pass_inline_param
   NULL,					/* next */
   0,					/* static_pass_number */
   TV_INLINE_HEURISTICS,			/* tv_id */
-  0,	                                /* properties_required */
+  PROP_referenced_vars,	                /* properties_required */
   0,					/* properties_provided */
   0,					/* properties_destroyed */
   0,					/* todo_flags_start */
Index: gcc/cgraphunit.c
===================================================================
--- gcc/cgraphunit.c.orig	2011-02-11 13:42:38.057623032 -0200
+++ gcc/cgraphunit.c	2011-02-11 13:42:40.211577872 -0200
@@ -246,7 +246,6 @@ cgraph_process_new_functions (void)
 	    cgraph_analyze_function (node);
 	  push_cfun (DECL_STRUCT_FUNCTION (fndecl));
 	  current_function_decl = fndecl;
-	  compute_inline_parameters (node);
 	  if ((cgraph_state == CGRAPH_STATE_IPA_SSA
 	      && !gimple_in_ssa_p (DECL_STRUCT_FUNCTION (fndecl)))
 	      /* When not optimizing, be sure we run early local passes anyway
Index: gcc/testsuite/g++.dg/debug/pr47106.C
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ gcc/testsuite/g++.dg/debug/pr47106.C	2011-02-11 13:42:40.236577347 -0200
@@ -0,0 +1,37 @@
+// { dg-do compile }
+// { dg-options "-O -fpartial-inlining -flto -fconserve-stack -fcompare-debug" }
+
+void end (int, int) __attribute__ ((__noreturn__));
+
+struct S
+{
+  int i;
+  S *s;
+};
+
+inline bool f (S *s)
+{
+  if (!s->s)
+    end (0, 0);
+  return s->s == s;
+}
+
+inline bool
+baz (S s1, S)
+{
+  while (f (&s1));
+}
+
+inline bool
+bar (S s1, S s2, S)
+{
+  baz (s1, s2);
+}
+
+S getS ();
+
+bool
+foo ()
+{
+  bar (getS (), getS (), getS ());
+}
Index: gcc/tree-pass.h
===================================================================
--- gcc/tree-pass.h.orig	2011-02-11 13:42:37.641631793 -0200
+++ gcc/tree-pass.h	2011-02-11 13:42:40.249577075 -0200
@@ -572,6 +572,7 @@ extern struct rtl_opt_pass pass_final;
 extern struct rtl_opt_pass pass_rtl_seqabstr;
 extern struct gimple_opt_pass pass_release_ssa_names;
 extern struct gimple_opt_pass pass_early_inline;
+extern struct gimple_opt_pass pass_inlinable;
 extern struct gimple_opt_pass pass_inline_parameters;
 extern struct gimple_opt_pass pass_all_early_optimizations;
 extern struct gimple_opt_pass pass_update_address_taken;

[-- Attachment #8: Type: text/plain, Size: 188 bytes --]


This patch fixes a latent bug: when estimating function sizes/times, we
use functions that access current_function_decl, without always having
set it up to the function being estimated.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #9: estimate-function-body-sizes-set-cfundecl.patch --]
[-- Type: text/x-diff, Size: 1608 bytes --]

for  gcc/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	PR debug/47106
	PR debug/47402
	* ipa-inline.c (estimate_function_body_sizes): Set
	current_function_decl while visiting stmts.

Index: gcc/ipa-inline.c
===================================================================
--- gcc/ipa-inline.c.orig	2011-02-10 04:21:20.234431070 -0200
+++ gcc/ipa-inline.c	2011-02-10 04:22:42.115681759 -0200
@@ -1901,12 +1901,20 @@ estimate_function_body_sizes (struct cgr
   tree arg;
   int freq;
   tree funtype = TREE_TYPE (node->decl);
+  tree saved_current_function_decl = current_function_decl;
 
   if (dump_file)
     fprintf (dump_file, "Analyzing function body size: %s\n",
 	     cgraph_node_name (node));
 
   gcc_assert (my_function && my_function->cfg);
+  /* ??? estimate_num_insns may indirectly call
+     decl_address_invariant_p which tests current_function_decl.  We
+     should probably somehow arrange for decl_address_ip_invariant_p
+     to be called instead.  Leaving current_function_decl NULL would
+     probably have the same effect, but setting current_function_decl
+     we get a more accurate estimate.  */
+  current_function_decl = node->decl;
   FOR_EACH_BB_FN (bb, my_function)
     {
       freq = compute_call_stmt_bb_frequency (node->decl, bb);
@@ -1937,6 +1945,7 @@ estimate_function_body_sizes (struct cgr
 	  gcc_assert (size >= 0);
 	}
     }
+  current_function_decl = saved_current_function_decl;
   time = (time + CGRAPH_FREQ_BASE / 2) / CGRAPH_FREQ_BASE;
   time_inlining_benefit = ((time_inlining_benefit + CGRAPH_FREQ_BASE)
   			   / (CGRAPH_FREQ_BASE * 2));

[-- Attachment #10: Type: text/plain, Size: 85 bytes --]


These two patches remove unnecessary setting up of cfun and
current_function_decl.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #11: dont-push-cfun-in-estimate-stack-frame-size.patch --]
[-- Type: text/x-diff, Size: 1063 bytes --]

for  gcc/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	PR debug/47106
	PR debug/47402
	* cfgexpand.c (estimated_stack_frame_size): Don't change cfun
	or decl.

Index: gcc/cfgexpand.c
===================================================================
--- gcc/cfgexpand.c.orig	2011-02-10 04:21:20.149432887 -0200
+++ gcc/cfgexpand.c	2011-02-10 04:22:38.177765967 -0200
@@ -1356,13 +1356,9 @@ estimated_stack_frame_size (struct cgrap
   HOST_WIDE_INT size = 0;
   size_t i;
   tree var;
-  tree old_cur_fun_decl = current_function_decl;
   referenced_var_iterator rvi;
   struct function *fn = DECL_STRUCT_FUNCTION (node->decl);
 
-  current_function_decl = node->decl;
-  push_cfun (fn);
-
   gcc_checking_assert (gimple_referenced_vars (fn));
   FOR_EACH_REFERENCED_VAR (fn, var, rvi)
     size += expand_one_var (var, true, false);
@@ -1376,8 +1372,6 @@ estimated_stack_frame_size (struct cgrap
       size += account_stack_vars ();
       fini_vars_expansion ();
     }
-  pop_cfun ();
-  current_function_decl = old_cur_fun_decl;
   return size;
 }
 

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #12: dont-push-cfun-in-analyze-function.patch --]
[-- Type: text/x-diff, Size: 960 bytes --]

for  gcc/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	PR debug/47106
	PR debug/47402
	* ipa-inline.c (analyze_function): Don't change cfun or decl.

Index: gcc/ipa-inline.c
===================================================================
--- gcc/ipa-inline.c.orig	2011-02-11 13:43:19.016764097 -0200
+++ gcc/ipa-inline.c	2011-02-11 13:43:19.317757985 -0200
@@ -2110,17 +2110,11 @@ inline_indirect_intraprocedural_analysis
 static void
 analyze_function (struct cgraph_node *node)
 {
-  push_cfun (DECL_STRUCT_FUNCTION (node->decl));
-  current_function_decl = node->decl;
-
   compute_inline_parameters (node);
   /* FIXME: We should remove the optimize check after we ensure we never run
      IPA passes when not optimizng.  */
   if (flag_indirect_inlining && optimize)
     inline_indirect_intraprocedural_analysis (node);
-
-  current_function_decl = NULL;
-  pop_cfun ();
 }
 
 /* Called when new function is inserted to callgraph late.  */

[-- Attachment #13: Type: text/plain, Size: 732 bytes --]



Are the above ok to install?  All regstrapped on x86_64-linux-gnu and
i686-pc-linux-gnu, and also bootstrapped with -O1 and -O3, besides the
tests described below.


This FTR patch makes sure the previous two patches are safe, at least
inasmuchas they ensure we don't unconditionally dereferenced cfun or
current_function_decl within these functions.  It wasn't enough,
however, to detect the problem in function size/time estimation.  That
required comparing the output of the compiler with and without the
patches that dropped cfun overriding.  There was only one difference in
stage3-like builds, in ada/trans.o, and that was fixed with the patch
that set current_function_decl while estimating function body sizes and
times.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #14: dont-push-cfun-check.patch --]
[-- Type: text/x-diff, Size: 3041 bytes --]

FTR and testing only, not to be installed.

Index: gcc/cfgexpand.c
===================================================================
--- gcc/cfgexpand.c.orig	2011-02-11 13:43:19.174760996 -0200
+++ gcc/cfgexpand.c	2011-02-11 13:43:28.071574474 -0200
@@ -1358,6 +1358,10 @@ estimated_stack_frame_size (struct cgrap
   tree var;
   referenced_var_iterator rvi;
   struct function *fn = DECL_STRUCT_FUNCTION (node->decl);
+  tree save = current_function_decl;
+
+  push_cfun (NULL);
+  current_function_decl = NULL;
 
   gcc_checking_assert (gimple_referenced_vars (fn));
   FOR_EACH_REFERENCED_VAR (fn, var, rvi)
@@ -1372,6 +1376,10 @@ estimated_stack_frame_size (struct cgrap
       size += account_stack_vars ();
       fini_vars_expansion ();
     }
+
+  current_function_decl = save;
+  pop_cfun ();
+
   return size;
 }
 
Index: gcc/ipa-inline.c
===================================================================
--- gcc/ipa-inline.c.orig	2011-02-11 13:43:19.317757985 -0200
+++ gcc/ipa-inline.c	2011-02-11 13:43:28.101573846 -0200
@@ -2015,6 +2015,7 @@ void
 compute_inline_parameters (struct cgraph_node *node)
 {
   HOST_WIDE_INT self_stack_size;
+  tree save = current_function_decl;
 
   gcc_assert (!node->global.inlined_to);
 
@@ -2025,6 +2026,9 @@ compute_inline_parameters (struct cgraph
   if (!gimple_referenced_vars (DECL_STRUCT_FUNCTION (node->decl)))
     return;
 
+  push_cfun (NULL);
+  current_function_decl = NULL;
+
   /* Estimate the stack size for the function if we're optimizing.  */
   self_stack_size = optimize ? estimated_stack_frame_size (node) : 0;
   inline_summary (node)->estimated_self_stack_size = self_stack_size;
@@ -2035,6 +2039,9 @@ compute_inline_parameters (struct cgraph
   /* Inlining characteristics are maintained by the cgraph_mark_inline.  */
   node->global.time = inline_summary (node)->self_time;
   node->global.size = inline_summary (node)->self_size;
+
+  current_function_decl = save;
+  pop_cfun ();
 }
 
 
@@ -2110,11 +2117,17 @@ inline_indirect_intraprocedural_analysis
 static void
 analyze_function (struct cgraph_node *node)
 {
+  push_cfun (NULL);
+  current_function_decl = NULL;
+
   compute_inline_parameters (node);
   /* FIXME: We should remove the optimize check after we ensure we never run
      IPA passes when not optimizng.  */
   if (flag_indirect_inlining && optimize)
     inline_indirect_intraprocedural_analysis (node);
+
+  current_function_decl = NULL;
+  pop_cfun ();
 }
 
 /* Called when new function is inserted to callgraph late.  */
Index: gcc/tree-flow-inline.h
===================================================================
--- gcc/tree-flow-inline.h.orig	2011-02-11 13:42:17.880046107 -0200
+++ gcc/tree-flow-inline.h	2011-02-11 13:43:28.118573489 -0200
@@ -114,6 +114,7 @@ referenced_var (unsigned int uid)
 static inline tree
 first_referenced_var (struct function *fn, referenced_var_iterator *iter)
 {
+  gcc_checking_assert (!cfun || fn == cfun);
   return (tree) first_htab_element (&iter->hti,
 				    gimple_referenced_vars (fn));
 }

[-- Attachment #15: Type: text/plain, Size: 309 bytes --]


Finally, this FTR patch was used to compare the stack size estimations
of the current (trunk) approach and the one introduced with this
patchset.  No messages printed in stage2 host for a bootstrap-debug
build (no -g), indicating no differences; plenty printed during stage3
(with -g), always lower counts.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #16: ipa-inline-stack-growth-referenced-vars-pr47106.patch --]
[-- Type: text/x-diff, Size: 13630 bytes --]

for  gcc/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	PR debug/47106
	PR debug/47402
	* cfgexpand.c (account_used_vars_for_block): Remove.
	(estimated_stack_frame_size): Use referenced vars.
	* tree-inline.c (remap_decl): Only mark VAR_DECLs as referenced
	that were referenced in the original function.  Test src_fn
	rather than cfun.  Drop redundant get_var_ann.
	(setup_one_parameter): Drop redundant get_var_ann.
	(declare_return_variable): Likewise.
	(copy_decl_for_dup_finish): Mark VAR_DECLs referenced in src_fn.
	(copy_arguments_for_versioning): Drop redundant get_var_ann.
	* tree-pass.h (pass_inlinable): Declare.
	* passes.c (init_optimization_passes): Move first
	pass_inline_parameters after pass_referenced_vars.  Add
	pass_inlinable where it was before.
	* ipa-inline.c (compute_inlinable): New, split out of...
	(compute_inline_parameters): ... this.  Quit if referenced_vars
	are not available.
	(compute_inlinable_for_current, pass_inlinable): New.
	(pass_inline_parameters): Require PROP_referenced_vars.
	* cgraphunit.c (cgraph_process_new_functions): Don't run
	compute_inline_parameters explicitly.

for  gcc/testsuite/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	PR debug/47106
	PR debug/47402
	* g++.dg/debug/pr47106.C: New.

Index: gcc/cfgexpand.c
===================================================================
--- gcc/cfgexpand.c.orig	2011-02-11 13:42:37.430636412 -0200
+++ gcc/cfgexpand.c	2011-02-11 13:42:40.067581306 -0200
@@ -1311,30 +1311,6 @@ create_stack_guard (void)
   crtl->stack_protect_guard = guard;
 }
 
-/* A subroutine of expand_used_vars.  Walk down through the BLOCK tree
-   expanding variables.  Those variables that can be put into registers
-   are allocated pseudos; those that can't are put on the stack.
-
-   TOPLEVEL is true if this is the outermost BLOCK.  */
-
-static HOST_WIDE_INT
-account_used_vars_for_block (tree block, bool toplevel)
-{
-  tree t;
-  HOST_WIDE_INT size = 0;
-
-  /* Expand all variables at this level.  */
-  for (t = BLOCK_VARS (block); t ; t = DECL_CHAIN (t))
-    if (var_ann (t) && is_used_p (t))
-      size += expand_one_var (t, toplevel, false);
-
-  /* Expand all variables at containing levels.  */
-  for (t = BLOCK_SUBBLOCKS (block); t ; t = BLOCK_CHAIN (t))
-    size += account_used_vars_for_block (t, false);
-
-  return size;
-}
-
 /* Prepare for expanding variables.  */
 static void
 init_vars_expansion (void)
@@ -1379,23 +1355,17 @@ estimated_stack_frame_size (struct cgrap
 {
   HOST_WIDE_INT size = 0;
   size_t i;
-  tree var, outer_block = DECL_INITIAL (current_function_decl);
-  unsigned ix;
+  tree var;
   tree old_cur_fun_decl = current_function_decl;
+  referenced_var_iterator rvi;
+  struct function *fn = DECL_STRUCT_FUNCTION (node->decl);
 
   current_function_decl = node->decl;
-  push_cfun (DECL_STRUCT_FUNCTION (node->decl));
+  push_cfun (fn);
 
-  init_vars_expansion ();
-
-  FOR_EACH_LOCAL_DECL (cfun, ix, var)
-    {
-      /* TREE_USED marks local variables that do not appear in lexical
-	 blocks.  We don't want to expand those that do twice.  */
-      if (TREE_USED (var))
-        size += expand_one_var (var, true, false);
-    }
-  size += account_used_vars_for_block (outer_block, true);
+  gcc_checking_assert (gimple_referenced_vars (fn));
+  FOR_EACH_REFERENCED_VAR (fn, var, rvi)
+    size += expand_one_var (var, true, false);
 
   if (stack_vars_num > 0)
     {
Index: gcc/tree-inline.c
===================================================================
--- gcc/tree-inline.c.orig	2011-02-11 13:42:38.454614859 -0200
+++ gcc/tree-inline.c	2011-02-11 13:42:40.122579711 -0200
@@ -312,13 +312,17 @@ remap_decl (tree decl, copy_body_data *i
 	    walk_tree (&DECL_QUALIFIER (t), copy_tree_body_r, id, NULL);
 	}
 
-      if (cfun && gimple_in_ssa_p (cfun)
-	  && (TREE_CODE (t) == VAR_DECL
-	      || TREE_CODE (t) == RESULT_DECL || TREE_CODE (t) == PARM_DECL))
-	{
-	  get_var_ann (t);
-	  add_referenced_var (t);
-	}
+      if ((TREE_CODE (t) == VAR_DECL
+	   || TREE_CODE (t) == RESULT_DECL
+	   || TREE_CODE (t) == PARM_DECL)
+	  && id->src_fn && DECL_STRUCT_FUNCTION (id->src_fn)
+	  && gimple_referenced_vars (DECL_STRUCT_FUNCTION (id->src_fn))
+	  /* We don't want to mark as referenced VAR_DECLs that were
+	     not marked as such in the src function.  */
+	  && (TREE_CODE (decl) != VAR_DECL
+	      || referenced_var_lookup (DECL_STRUCT_FUNCTION (id->src_fn),
+					DECL_UID (decl))))
+	add_referenced_var (t);
       return t;
     }
 
@@ -2547,10 +2551,7 @@ setup_one_parameter (copy_body_data *id,
 
   /* We're actually using the newly-created var.  */
   if (gimple_in_ssa_p (cfun) && TREE_CODE (var) == VAR_DECL)
-    {
-      get_var_ann (var);
-      add_referenced_var (var);
-    }
+    add_referenced_var (var);
 
   /* Declare this new variable.  */
   DECL_CHAIN (var) = *vars;
@@ -2857,10 +2858,7 @@ declare_return_variable (copy_body_data 
 
   var = copy_result_decl_to_var (result, id);
   if (gimple_in_ssa_p (cfun))
-    {
-      get_var_ann (var);
-      add_referenced_var (var);
-    }
+    add_referenced_var (var);
 
   DECL_SEEN_IN_BIND_EXPR_P (var) = 1;
 
@@ -2896,10 +2894,7 @@ declare_return_variable (copy_body_data 
     {
       tree temp = create_tmp_var (TREE_TYPE (result), "retvalptr");
       if (gimple_in_ssa_p (id->src_cfun))
-	{
-	  get_var_ann (temp);
-	  add_referenced_var (temp);
-	}
+	add_referenced_var (temp);
       insert_decl_map (id, result, temp);
       /* When RESULT_DECL is in SSA form, we need to use it's default_def
 	 SSA_NAME.  */
@@ -4753,6 +4748,14 @@ copy_decl_for_dup_finish (copy_body_data
        new function.  */
     DECL_CONTEXT (copy) = id->dst_fn;
 
+  if (TREE_CODE (decl) == VAR_DECL
+      /* C++ clones functions during parsing, before
+	 referenced_vars.  */
+      && gimple_referenced_vars (DECL_STRUCT_FUNCTION (id->src_fn))
+      && referenced_var_lookup (DECL_STRUCT_FUNCTION (id->src_fn),
+				DECL_UID (decl)))
+    add_referenced_var (copy);
+
   return copy;
 }
 
@@ -4864,7 +4867,6 @@ copy_arguments_for_versioning (tree orig
 	   as temporary variable later in function, the uses will be
 	   replaced by local variable.  */
 	tree var = copy_decl_to_var (arg, id);
-	get_var_ann (var);
 	add_referenced_var (var);
 	insert_decl_map (id, arg, var);
         /* Declare this new variable.  */
Index: gcc/passes.c
===================================================================
--- gcc/passes.c.orig	2011-02-11 13:42:38.255618853 -0200
+++ gcc/passes.c	2011-02-11 13:42:40.146579235 -0200
@@ -729,7 +729,7 @@ init_optimization_passes (void)
   NEXT_PASS (pass_build_cfg);
   NEXT_PASS (pass_warn_function_return);
   NEXT_PASS (pass_build_cgraph_edges);
-  NEXT_PASS (pass_inline_parameters);
+  NEXT_PASS (pass_inlinable);
   *p = NULL;
 
   /* Interprocedural optimization passes.  */
@@ -747,12 +747,8 @@ init_optimization_passes (void)
       NEXT_PASS (pass_build_ssa);
       NEXT_PASS (pass_lower_vector);
       NEXT_PASS (pass_early_warn_uninitialized);
-      /* Note that it is not strictly necessary to schedule an early
-	 inline pass here.  However, some test cases (e.g.,
-	 g++.dg/other/p334435.C g++.dg/other/i386-1.C) expect extern
-	 inline functions to be inlined even at -O0.  This does not
-	 happen during the first early inline pass.  */
       NEXT_PASS (pass_rebuild_cgraph_edges);
+      NEXT_PASS (pass_inline_parameters);
       NEXT_PASS (pass_early_inline);
       NEXT_PASS (pass_all_early_optimizations);
 	{
Index: gcc/ipa-inline.c
===================================================================
--- gcc/ipa-inline.c.orig	2011-02-11 13:42:37.859627265 -0200
+++ gcc/ipa-inline.c	2011-02-11 13:42:55.655254908 -0200
@@ -1974,21 +1974,10 @@ estimate_function_body_sizes (struct cgr
   inline_summary (node)->size_inlining_benefit = size_inlining_benefit;
 }
 
-/* Compute parameters of functions used by inliner.  */
-void
-compute_inline_parameters (struct cgraph_node *node)
+/* Determine whether node is inlinable.  */
+static void
+compute_inlinable (struct cgraph_node *node)
 {
-  HOST_WIDE_INT self_stack_size;
-
-  gcc_assert (!node->global.inlined_to);
-
-  /* Estimate the stack size for the function if we're optimizing.  */
-  self_stack_size = optimize ? estimated_stack_frame_size (node) : 0;
-  inline_summary (node)->estimated_self_stack_size = self_stack_size;
-  node->global.estimated_stack_size = self_stack_size;
-  node->global.stack_frame_offset = 0;
-
-  /* Can this function be inlined at all?  */
   node->local.inlinable = tree_inlinable_function_p (node->decl);
 
   /* Inlinable functions always can change signature.  */
@@ -1998,7 +1987,7 @@ compute_inline_parameters (struct cgraph
     {
       struct cgraph_edge *e;
 
-      /* Functions calling builtlin_apply can not change signature.  */
+      /* Functions calling builtin_apply can not change signature.  */
       for (e = node->callees; e; e = e->next_callee)
 	if (DECL_BUILT_IN (e->callee->decl)
 	    && DECL_BUILT_IN_CLASS (e->callee->decl) == BUILT_IN_NORMAL
@@ -2006,9 +1995,33 @@ compute_inline_parameters (struct cgraph
 	  break;
       node->local.can_change_signature = !e;
     }
+
   if (node->local.inlinable && !node->local.disregard_inline_limits)
     node->local.disregard_inline_limits
       = DECL_DISREGARD_INLINE_LIMITS (node->decl);
+}
+
+/* Compute parameters of functions used by inliner.  */
+void
+compute_inline_parameters (struct cgraph_node *node)
+{
+  HOST_WIDE_INT self_stack_size;
+
+  gcc_assert (!node->global.inlined_to);
+
+  compute_inlinable (node);
+
+  /* A function won't be considered for inlining before we put it in
+     SSA form and compute referenced_vars.  */
+  if (!gimple_referenced_vars (DECL_STRUCT_FUNCTION (node->decl)))
+    return;
+
+  /* Estimate the stack size for the function if we're optimizing.  */
+  self_stack_size = optimize ? estimated_stack_frame_size (node) : 0;
+  inline_summary (node)->estimated_self_stack_size = self_stack_size;
+  node->global.estimated_stack_size = self_stack_size;
+  node->global.stack_frame_offset = 0;
+
   estimate_function_body_sizes (node);
   /* Inlining characteristics are maintained by the cgraph_mark_inline.  */
   node->global.time = inline_summary (node)->self_time;
@@ -2016,6 +2029,14 @@ compute_inline_parameters (struct cgraph
 }
 
 
+/* Determine whether current_function_decl is inlinable.  */
+static unsigned int
+compute_inlinable_for_current (void)
+{
+  compute_inlinable (cgraph_node (current_function_decl));
+  return 0;
+}
+
 /* Compute parameters of functions used by inliner using
    current_function_decl.  */
 static unsigned int
@@ -2025,6 +2046,25 @@ compute_inline_parameters_for_current (v
   return 0;
 }
 
+struct gimple_opt_pass pass_inlinable =
+{
+ {
+  GIMPLE_PASS,
+  "inlinable",				/* name */
+  NULL,					/* gate */
+  compute_inlinable_for_current,	/* execute */
+  NULL,					/* sub */
+  NULL,					/* next */
+  0,					/* static_pass_number */
+  TV_INLINE_HEURISTICS,			/* tv_id */
+  0,					/* properties_required */
+  0,					/* properties_provided */
+  0,					/* properties_destroyed */
+  0,					/* todo_flags_start */
+  0					/* todo_flags_finish */
+ }
+};
+
 struct gimple_opt_pass pass_inline_parameters =
 {
  {
@@ -2036,7 +2076,7 @@ struct gimple_opt_pass pass_inline_param
   NULL,					/* next */
   0,					/* static_pass_number */
   TV_INLINE_HEURISTICS,			/* tv_id */
-  0,	                                /* properties_required */
+  PROP_referenced_vars,	                /* properties_required */
   0,					/* properties_provided */
   0,					/* properties_destroyed */
   0,					/* todo_flags_start */
Index: gcc/cgraphunit.c
===================================================================
--- gcc/cgraphunit.c.orig	2011-02-11 13:42:38.057623032 -0200
+++ gcc/cgraphunit.c	2011-02-11 13:42:40.211577872 -0200
@@ -246,7 +246,6 @@ cgraph_process_new_functions (void)
 	    cgraph_analyze_function (node);
 	  push_cfun (DECL_STRUCT_FUNCTION (fndecl));
 	  current_function_decl = fndecl;
-	  compute_inline_parameters (node);
 	  if ((cgraph_state == CGRAPH_STATE_IPA_SSA
 	      && !gimple_in_ssa_p (DECL_STRUCT_FUNCTION (fndecl)))
 	      /* When not optimizing, be sure we run early local passes anyway
Index: gcc/testsuite/g++.dg/debug/pr47106.C
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ gcc/testsuite/g++.dg/debug/pr47106.C	2011-02-11 13:42:40.236577347 -0200
@@ -0,0 +1,37 @@
+// { dg-do compile }
+// { dg-options "-O -fpartial-inlining -flto -fconserve-stack -fcompare-debug" }
+
+void end (int, int) __attribute__ ((__noreturn__));
+
+struct S
+{
+  int i;
+  S *s;
+};
+
+inline bool f (S *s)
+{
+  if (!s->s)
+    end (0, 0);
+  return s->s == s;
+}
+
+inline bool
+baz (S s1, S)
+{
+  while (f (&s1));
+}
+
+inline bool
+bar (S s1, S s2, S)
+{
+  baz (s1, s2);
+}
+
+S getS ();
+
+bool
+foo ()
+{
+  bar (getS (), getS (), getS ());
+}
Index: gcc/tree-pass.h
===================================================================
--- gcc/tree-pass.h.orig	2011-02-11 13:42:37.641631793 -0200
+++ gcc/tree-pass.h	2011-02-11 13:42:40.249577075 -0200
@@ -572,6 +572,7 @@ extern struct rtl_opt_pass pass_final;
 extern struct rtl_opt_pass pass_rtl_seqabstr;
 extern struct gimple_opt_pass pass_release_ssa_names;
 extern struct gimple_opt_pass pass_early_inline;
+extern struct gimple_opt_pass pass_inlinable;
 extern struct gimple_opt_pass pass_inline_parameters;
 extern struct gimple_opt_pass pass_all_early_optimizations;
 extern struct gimple_opt_pass pass_update_address_taken;

[-- Attachment #17: Type: text/plain, Size: 258 bytes --]



-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist      Red Hat Brazil Compiler Engineer

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-02-12 13:45                               ` Alexandre Oliva
@ 2011-02-14 12:41                                 ` Richard Guenther
  2011-02-14 20:25                                   ` Alexandre Oliva
  2011-02-17  1:53                                   ` Jan Hubicka
  2011-02-17  9:33                                 ` Jan Hubicka
  2011-02-17 16:41                                 ` Jan Hubicka
  2 siblings, 2 replies; 46+ messages in thread
From: Richard Guenther @ 2011-02-14 12:41 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: gcc-patches, Jan Hubicka

On Sat, Feb 12, 2011 at 2:43 PM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Feb  7, 2011, Richard Guenther <richard.guenther@gmail.com> wrote:
>
>> The referenced_var_lookup interface change is ok.
>
> After much hacking and testing, I have a patchset that works (no
> regressions) and that I'm happy with.
>
> I've broken it up into preparation patches with interface changes, the
> aactual change, one independent bug fix, some additional improvements
> that I regard as slightly risky, but that have survived multiple
> bootstrap cycles (-O1 and -O3 in addition to -O2, on i686 and x86_64),
> and some additional patches I used for testing and for verifying that
> the estimated stack sizes are unchanged from the current method, when
> debug info is not enabled (when it is, variables retained in lexical
> blocks for debug information only used to inflate the estimate, but now
> they no longer do)
>
> On Feb  7, 2011, Jan Hubicka <hubicka@ucw.cz> wrote:
>
>>> we still tried to estimate the stack size of functions that hadn't
>>> been put in SSA form and had referenced_vars computed.
>
>>> From where we did so?
>
> SRA, indeed.
>
> On Feb  8, 2011, Richard Guenther <richard.guenther@gmail.com> wrote:
>
>> Yes, we can have cycles and no referenced vars in the callee.  But
>> we won't inline that callee (as it isn't in SSA form), so we can just
>> as well skip looking at its estimated stack frame size.
>
> Excellent!
>
> On Feb  8, 2011, Jan Hubicka <hubicka@ucw.cz> wrote:
>
>> Functions checking inline parameters (check_inline_limits etc.) are done only
>> after in_ssa_p check is done, so we should be safe here.
>
> Yup.
>
>> (with exception of disregard inline limits, you can move that check
>> later but any value of that flag is OK since we won't do anything with
>> the call)
>
> I hadn't realized what this note was about before now.  I hit the
> problem of processing functions in the wrong order, breaking pr42632.c,
> which is why I introduced a new pass, compute_inlinable, at the point
> the initial compute_inline_params was.  I tried not calling
> compute_inlinable from within compute_inline_parameters, but I found out
> inlinable changed, or had to be computed separately, or something like
> that.  What I didn't try was to replace the explicit call of
> compute_inline_parameters in cgraph_process_new_functions with
> compute_inlinable.  From the failures I observed, it now occurs to me
> that this might be what I was missing.  Anyhow, this patch works, but if
> you'd rather I make this other change, I will.
>
>
> Anyhow, here is the patch set.
>
> Interface changes:
>
>
>
> The patch that fixes the problem, using referenced_vars to estimate
> stack sizes.
>
>
>
> This patch fixes a latent bug: when estimating function sizes/times, we
> use functions that access current_function_decl, without always having
> set it up to the function being estimated.
>
>
>
> These two patches remove unnecessary setting up of cfun and
> current_function_decl.
>
>
>
>
> Are the above ok to install?  All regstrapped on x86_64-linux-gnu and
> i686-pc-linux-gnu, and also bootstrapped with -O1 and -O3, besides the
> tests described below.
>
>
> This FTR patch makes sure the previous two patches are safe, at least
> inasmuchas they ensure we don't unconditionally dereferenced cfun or
> current_function_decl within these functions.  It wasn't enough,
> however, to detect the problem in function size/time estimation.  That
> required comparing the output of the compiler with and without the
> patches that dropped cfun overriding.  There was only one difference in
> stage3-like builds, in ada/trans.o, and that was fixed with the patch
> that set current_function_decl while estimating function body sizes and
> times.
>
>
>
> Finally, this FTR patch was used to compare the stack size estimations
> of the current (trunk) approach and the one introduced with this
> patchset.  No messages printed in stage2 host for a bootstrap-debug
> build (no -g), indicating no differences; plenty printed during stage3
> (with -g), always lower counts.

Thanks for splitting the patch up.

referenced-var-lookup-fn-arg.patch, for-each-referenced-var-fn-arg.patch,
estimated-stack-frame-size-cgraph-node-arg.patch and
compute-inline-params-ret-void.patch are obvious (and ok), you can apply
them independently now.

I'm skipping ipa-inline-stack-growth-referenced-vars-pr47106.patch for detailed
review here.

--

estimate-function-body-sizes-set-cfundecl.patch contains

+  /* ??? estimate_num_insns may indirectly call
+     decl_address_invariant_p which tests current_function_decl.  We
+     should probably somehow arrange for decl_address_ip_invariant_p
+     to be called instead.  Leaving current_function_decl NULL would
+     probably have the same effect, but setting current_function_decl
+     we get a more accurate estimate.  */
+  current_function_decl = node->decl;

which does look like a hack.  Looking at decl_address_invariant_p it
_seems_ that the test
          || DECL_CONTEXT (op) == current_function_decl
          || decl_function_context (op) == current_function_decl)
could simply be
   DECL_CONTEXT (op) && TREE_CODE (DECL_CONTEX (op)) == FUNCTION_DECL

that said, I'm not sure about the necessity of this patch at this stage,
I'd like to defer it (or rather a better solution) to 4.7 if possible.
(you might want to open a bug so we don't forget)

Does dont-push-cfun-in-estimate-stack-frame-size.patch depend on this
patch?  I guess dont-push-cfun-in-estimate-stack-frame-size.patch depends on
ipa-inline-stack-growth-referenced-vars-pr47106.patch at least?
Similar dont-push-cfun-in-analyze-function.patch?

ipa-inline-stack-growth-referenced-vars-pr47106.patch was attached twice,
so I'm not sure about inter-dependencies.  If the do-not-set-cfun patches
pass testing ontop of the obvious cleanup patches they are ok as well.

--

Now onto ipa-inline-stack-growth-referenced-vars-pr47106.patch

it looks ok apart from the new pass_inlinable stuff, that at least sounds
odd and I'd want Honza to have a look here.  The testcase you say
the patch breaks otherwise redefines a _static_ inline function, so
I'd say it's invalid anyway.  I also wonder why computing
->inlineable should have any effect on that testcase ...

Well, if this remaining pass was all that would need to be applied I'd
have a quick look now ... ;)

So, can we get the approved pre-requesties applied?

Thanks for working on this odd part of GCC!
Richard.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-02-14 12:41                                 ` Richard Guenther
@ 2011-02-14 20:25                                   ` Alexandre Oliva
  2011-02-15 11:04                                     ` Richard Guenther
  2011-02-17  1:53                                   ` Jan Hubicka
  1 sibling, 1 reply; 46+ messages in thread
From: Alexandre Oliva @ 2011-02-14 20:25 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc-patches, Jan Hubicka

[-- Attachment #1: Type: text/plain, Size: 2510 bytes --]

On Feb 14, 2011, Richard Guenther <richard.guenther@gmail.com> wrote:

> referenced-var-lookup-fn-arg.patch, for-each-referenced-var-fn-arg.patch,
> estimated-stack-frame-size-cgraph-node-arg.patch and
> compute-inline-params-ret-void.patch are obvious (and ok), you can apply
> them independently now.

Thanks

> which does look like a hack.  Looking at decl_address_invariant_p it
> _seems_ that the test
>           || DECL_CONTEXT (op) == current_function_decl
>           || decl_function_context (op) == current_function_decl)
> could simply be
>    DECL_CONTEXT (op) && TREE_CODE (DECL_CONTEX (op)) == FUNCTION_DECL

Err...  How would this tell whether the decl is local to the current
function?

> that said, I'm not sure about the necessity of this patch at this stage,
> I'd like to defer it (or rather a better solution) to 4.7 if possible.
> (you might want to open a bug so we don't forget)

*nod*

> Does dont-push-cfun-in-estimate-stack-frame-size.patch depend on this
> patch?

Nope.

> I guess dont-push-cfun-in-estimate-stack-frame-size.patch depends on
> ipa-inline-stack-growth-referenced-vars-pr47106.patch at least?

Yup.

> Similar dont-push-cfun-in-analyze-function.patch?

Yep, but the analyze_function change also depends on the
current_function_decl setting within the body size estimation.

> ipa-inline-stack-growth-referenced-vars-pr47106.patch was attached twice,

Oops, sorry, I meant to attach the -compare patch the second time.

> so I'm not sure about inter-dependencies.  If the do-not-set-cfun patches
> pass testing ontop of the obvious cleanup patches they are ok as well.

Ok, I'll retest and install.

> Now onto ipa-inline-stack-growth-referenced-vars-pr47106.patch

> it looks ok apart from the new pass_inlinable stuff, that at least sounds
> odd and I'd want Honza to have a look here.  The testcase you say
> the patch breaks otherwise redefines a _static_ inline function

Err, no, it's 3 separate functions with different names (foo, __foo and
____foo)

> I also wonder why computing -> inlineable should have any effect on
> that testcase ...

Computing inlinable early enables edges to not be marked as
non-inlinable early.  Computing disregard_* early enables cycles to be
broken at the right spots when deciding the order in which we're going
to process functions.  See the comment right before the two occurrences
of disregard in ipa.c:cgraph_postorder().

> So, can we get the approved pre-requesties applied?

Will do shortly.

Thanks for the reviews.



[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: ipa-inline-stack-growth-referenced-vars-pr47106-compare.patch --]
[-- Type: text/x-diff, Size: 3150 bytes --]

FTR, testing and reporting only, not to be installed.

Index: gcc/cfgexpand.c
===================================================================
--- gcc/cfgexpand.c.orig	2011-02-10 09:27:16.004373533 -0200
+++ gcc/cfgexpand.c	2011-02-10 09:29:49.418374663 -0200
@@ -1311,18 +1311,41 @@ create_stack_guard (void)
   crtl->stack_protect_guard = guard;
 }
 
+/* A subroutine of expand_used_vars.  Walk down through the BLOCK tree
+   expanding variables.  Those variables that can be put into registers
+   are allocated pseudos; those that can't are put on the stack.
+
+   TOPLEVEL is true if this is the outermost BLOCK.  */
+
+static HOST_WIDE_INT
+account_used_vars_for_block (tree block, bool toplevel)
+{
+  tree t;
+  HOST_WIDE_INT size = 0;
+
+  /* Expand all variables at this level.  */
+  for (t = BLOCK_VARS (block); t ; t = DECL_CHAIN (t))
+    size += expand_one_var (t, toplevel, false);
+
+  /* Expand all variables at containing levels.  */
+  for (t = BLOCK_SUBBLOCKS (block); t ; t = BLOCK_CHAIN (t))
+    size += account_used_vars_for_block (t, false);
+
+  return size;
+}
+
 /* Prepare for expanding variables.  */
 static void
-init_vars_expansion (void)
+init_vars_expansion (tree decl)
 {
   tree t;
   unsigned ix;
   /* Set TREE_USED on all variables in the local_decls.  */
-  FOR_EACH_LOCAL_DECL (cfun, ix, t)
+  FOR_EACH_LOCAL_DECL (DECL_STRUCT_FUNCTION (decl), ix, t)
     TREE_USED (t) = 1;
 
   /* Clear TREE_USED on all variables associated with a block scope.  */
-  clear_tree_used (DECL_INITIAL (current_function_decl));
+  clear_tree_used (DECL_INITIAL (decl));
 
   /* Initialize local stack smashing state.  */
   has_protected_decls = false;
@@ -1354,6 +1377,8 @@ HOST_WIDE_INT
 estimated_stack_frame_size (struct cgraph_node *node)
 {
   HOST_WIDE_INT size = 0;
+  HOST_WIDE_INT delta;
+  unsigned ix;
   size_t i;
   tree var;
   referenced_var_iterator rvi;
@@ -1377,6 +1402,34 @@ estimated_stack_frame_size (struct cgrap
       fini_vars_expansion ();
     }
 
+  delta = size;
+
+  init_vars_expansion (node->decl);
+
+  FOR_EACH_LOCAL_DECL (fn, ix, var)
+    {
+      /* TREE_USED marks local variables that do not appear in lexical
+	 blocks.  We don't want to expand those that do twice.  */
+      if (TREE_USED (var))
+        delta -= expand_one_var (var, true, false);
+    }
+  delta -= account_used_vars_for_block (DECL_INITIAL (node->decl), true);
+
+  if (stack_vars_num > 0)
+    {
+      /* Fake sorting the stack vars for account_stack_vars ().  */
+      stack_vars_sorted = XNEWVEC (size_t, stack_vars_num);
+      for (i = 0; i < stack_vars_num; ++i)
+	stack_vars_sorted[i] = i;
+      delta -= account_stack_vars ();
+      fini_vars_expansion ();
+    }
+
+  if (delta)
+    inform (input_location,
+	    "guestimated stack size for %q+D changed by %li to %li",
+	    node->decl, (long)delta, (long)size);
+
   current_function_decl = save;
   pop_cfun ();
 
@@ -1400,7 +1453,7 @@ expand_used_vars (void)
     frame_phase = off ? align - off : 0;
   }
 
-  init_vars_expansion ();
+  init_vars_expansion (current_function_decl);
 
   for (i = 0; i < SA.map->num_partitions; i++)
     {

[-- Attachment #3: Type: text/plain, Size: 257 bytes --]


-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist      Red Hat Brazil Compiler Engineer

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-02-14 20:25                                   ` Alexandre Oliva
@ 2011-02-15 11:04                                     ` Richard Guenther
  0 siblings, 0 replies; 46+ messages in thread
From: Richard Guenther @ 2011-02-15 11:04 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: gcc-patches, Jan Hubicka

On Mon, Feb 14, 2011 at 9:00 PM, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Feb 14, 2011, Richard Guenther <richard.guenther@gmail.com> wrote:
>
>> referenced-var-lookup-fn-arg.patch, for-each-referenced-var-fn-arg.patch,
>> estimated-stack-frame-size-cgraph-node-arg.patch and
>> compute-inline-params-ret-void.patch are obvious (and ok), you can apply
>> them independently now.
>
> Thanks
>
>> which does look like a hack.  Looking at decl_address_invariant_p it
>> _seems_ that the test
>>           || DECL_CONTEXT (op) == current_function_decl
>>           || decl_function_context (op) == current_function_decl)
>> could simply be
>>    DECL_CONTEXT (op) && TREE_CODE (DECL_CONTEX (op)) == FUNCTION_DECL
>
> Err...  How would this tell whether the decl is local to the current
> function?
>
>> that said, I'm not sure about the necessity of this patch at this stage,
>> I'd like to defer it (or rather a better solution) to 4.7 if possible.
>> (you might want to open a bug so we don't forget)
>
> *nod*
>
>> Does dont-push-cfun-in-estimate-stack-frame-size.patch depend on this
>> patch?
>
> Nope.
>
>> I guess dont-push-cfun-in-estimate-stack-frame-size.patch depends on
>> ipa-inline-stack-growth-referenced-vars-pr47106.patch at least?
>
> Yup.
>
>> Similar dont-push-cfun-in-analyze-function.patch?
>
> Yep, but the analyze_function change also depends on the
> current_function_decl setting within the body size estimation.
>
>> ipa-inline-stack-growth-referenced-vars-pr47106.patch was attached twice,
>
> Oops, sorry, I meant to attach the -compare patch the second time.
>
>> so I'm not sure about inter-dependencies.  If the do-not-set-cfun patches
>> pass testing ontop of the obvious cleanup patches they are ok as well.
>
> Ok, I'll retest and install.
>
>> Now onto ipa-inline-stack-growth-referenced-vars-pr47106.patch
>
>> it looks ok apart from the new pass_inlinable stuff, that at least sounds
>> odd and I'd want Honza to have a look here.  The testcase you say
>> the patch breaks otherwise redefines a _static_ inline function
>
> Err, no, it's 3 separate functions with different names (foo, __foo and
> ____foo)
>
>> I also wonder why computing -> inlineable should have any effect on
>> that testcase ...
>
> Computing inlinable early enables edges to not be marked as
> non-inlinable early.  Computing disregard_* early enables cycles to be
> broken at the right spots when deciding the order in which we're going
> to process functions.  See the comment right before the two occurrences
> of disregard in ipa.c:cgraph_postorder().

Ah ok, that indeed makes sense.  Computing node->local separately
from inline-limits is a nice cleanup indeed.

Let's wait for Honzas comment and if he's happy with that part the
patch is ok.

Thanks,
Richard.

>> So, can we get the approved pre-requesties applied?
>
> Will do shortly.
>
> Thanks for the reviews.
>
>
>
>
> --
> Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/   FSF Latin America board member
> Free Software Evangelist      Red Hat Brazil Compiler Engineer
>
>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-02-14 12:41                                 ` Richard Guenther
  2011-02-14 20:25                                   ` Alexandre Oliva
@ 2011-02-17  1:53                                   ` Jan Hubicka
  2011-02-17  1:59                                     ` Jan Hubicka
  1 sibling, 1 reply; 46+ messages in thread
From: Jan Hubicka @ 2011-02-17  1:53 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Alexandre Oliva, gcc-patches, Jan Hubicka

> Now onto ipa-inline-stack-growth-referenced-vars-pr47106.patch
> 
> it looks ok apart from the new pass_inlinable stuff, that at least sounds
> odd and I'd want Honza to have a look here.  The testcase you say

It is because early inliner depends on toporder to properly place always_inline functions, right?
I guess it is sane solution.  We might try to avoid recomputing inlinable in inline_parameters
done just before early_inline pass as we have nothing that should change inlinablity.
But in longer run we might want to add some cleanups before early inlining there, so I guess it is
easier to not worry here.

> the patch breaks otherwise redefines a _static_ inline function, so
> I'd say it's invalid anyway.  I also wonder why computing
> ->inlineable should have any effect on that testcase ...

I don't see the testcase redefining anything, it seems to me that toporder is now getting wrong.
I will try to look into this tomorrow.

If this is solved, the patch is OK.
Honza

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-02-17  1:53                                   ` Jan Hubicka
@ 2011-02-17  1:59                                     ` Jan Hubicka
  0 siblings, 0 replies; 46+ messages in thread
From: Jan Hubicka @ 2011-02-17  1:59 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Richard Guenther, Alexandre Oliva, gcc-patches

> > Now onto ipa-inline-stack-growth-referenced-vars-pr47106.patch
> > 
> > it looks ok apart from the new pass_inlinable stuff, that at least sounds
> > odd and I'd want Honza to have a look here.  The testcase you say
> 
> It is because early inliner depends on toporder to properly place always_inline functions, right?
> I guess it is sane solution.  We might try to avoid recomputing inlinable in inline_parameters
> done just before early_inline pass as we have nothing that should change inlinablity.
> But in longer run we might want to add some cleanups before early inlining there, so I guess it is
> easier to not worry here.
Also because we only care abou disregard_inline_limits that is computed based on the DECL flag,
perhaps we can move 
node->local.disregard_inline_limits
      = DECL_DISREGARD_INLINE_LIMITS (node->decl);
into cgraph_analyze_function and do not introduce the new pass. 
We would end up setting the flag on uninlinable always inlines, but those are bogus anyway.

Honza

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-02-12 13:45                               ` Alexandre Oliva
  2011-02-14 12:41                                 ` Richard Guenther
@ 2011-02-17  9:33                                 ` Jan Hubicka
  2011-02-18  3:58                                   ` Jack Howarth
  2011-02-17 16:41                                 ` Jan Hubicka
  2 siblings, 1 reply; 46+ messages in thread
From: Jan Hubicka @ 2011-02-17  9:33 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: Richard Guenther, gcc-patches, Jan Hubicka

Hi,
this is variant of patch I am testing now.
Quick testing on the problematic PR shows no problems and we get around w/o
the extra pass.

Honza

Index: cgraphunit.c
===================================================================
*** cgraphunit.c	(revision 170240)
--- cgraphunit.c	(working copy)
*************** cgraph_analyze_function (struct cgraph_n
*** 783,788 ****
--- 783,793 ----
  
    assign_assembler_name_if_neeeded (node->decl);
  
+   /* disregard_inline_limits affects topological order of the early optimization,
+      so we need to compute it ahead of rest of inline parameters.  */
+   node->local.disregard_inline_limits
+     = DECL_DISREGARD_INLINE_LIMITS (node->decl);
+ 
    /* Make sure to gimplify bodies only once.  During analyzing a
       function we lower it, which will require gimplified nested
       functions, so we can end up here with an already gimplified
Index: testsuite/g++.dg/debug/pr47106.C
===================================================================
*** testsuite/g++.dg/debug/pr47106.C	(revision 0)
--- testsuite/g++.dg/debug/pr47106.C	(revision 0)
***************
*** 0 ****
--- 1,37 ----
+ // { dg-do compile }
+ // { dg-options "-O -fpartial-inlining -flto -fconserve-stack -fcompare-debug" }
+ 
+ void end (int, int) __attribute__ ((__noreturn__));
+ 
+ struct S
+ {
+   int i;
+   S *s;
+ };
+ 
+ inline bool f (S *s)
+ {
+   if (!s->s)
+     end (0, 0);
+   return s->s == s;
+ }
+ 
+ inline bool
+ baz (S s1, S)
+ {
+   while (f (&s1));
+ }
+ 
+ inline bool
+ bar (S s1, S s2, S)
+ {
+   baz (s1, s2);
+ }
+ 
+ S getS ();
+ 
+ bool
+ foo ()
+ {
+   bar (getS (), getS (), getS ());
+ }
Index: ipa-inline.c
===================================================================
*** ipa-inline.c	(revision 170240)
--- ipa-inline.c	(working copy)
*************** compute_inline_parameters (struct cgraph
*** 2006,2014 ****
  	  break;
        node->local.can_change_signature = !e;
      }
-   if (node->local.inlinable && !node->local.disregard_inline_limits)
-     node->local.disregard_inline_limits
-       = DECL_DISREGARD_INLINE_LIMITS (node->decl);
    estimate_function_body_sizes (node);
    /* Inlining characteristics are maintained by the cgraph_mark_inline.  */
    node->global.time = inline_summary (node)->self_time;
--- 2006,2011 ----
Index: cfgexpand.c
===================================================================
*** cfgexpand.c	(revision 170240)
--- cfgexpand.c	(working copy)
*************** create_stack_guard (void)
*** 1311,1340 ****
    crtl->stack_protect_guard = guard;
  }
  
- /* A subroutine of expand_used_vars.  Walk down through the BLOCK tree
-    expanding variables.  Those variables that can be put into registers
-    are allocated pseudos; those that can't are put on the stack.
- 
-    TOPLEVEL is true if this is the outermost BLOCK.  */
- 
- static HOST_WIDE_INT
- account_used_vars_for_block (tree block, bool toplevel)
- {
-   tree t;
-   HOST_WIDE_INT size = 0;
- 
-   /* Expand all variables at this level.  */
-   for (t = BLOCK_VARS (block); t ; t = DECL_CHAIN (t))
-     if (var_ann (t) && is_used_p (t))
-       size += expand_one_var (t, toplevel, false);
- 
-   /* Expand all variables at containing levels.  */
-   for (t = BLOCK_SUBBLOCKS (block); t ; t = BLOCK_CHAIN (t))
-     size += account_used_vars_for_block (t, false);
- 
-   return size;
- }
- 
  /* Prepare for expanding variables.  */
  static void
  init_vars_expansion (void)
--- 1311,1316 ----
*************** estimated_stack_frame_size (struct cgrap
*** 1379,1401 ****
  {
    HOST_WIDE_INT size = 0;
    size_t i;
!   tree var, outer_block = DECL_INITIAL (current_function_decl);
!   unsigned ix;
    tree old_cur_fun_decl = current_function_decl;
  
    current_function_decl = node->decl;
!   push_cfun (DECL_STRUCT_FUNCTION (node->decl));
  
!   init_vars_expansion ();
! 
!   FOR_EACH_LOCAL_DECL (cfun, ix, var)
!     {
!       /* TREE_USED marks local variables that do not appear in lexical
! 	 blocks.  We don't want to expand those that do twice.  */
!       if (TREE_USED (var))
!         size += expand_one_var (var, true, false);
!     }
!   size += account_used_vars_for_block (outer_block, true);
  
    if (stack_vars_num > 0)
      {
--- 1355,1371 ----
  {
    HOST_WIDE_INT size = 0;
    size_t i;
!   tree var;
    tree old_cur_fun_decl = current_function_decl;
+   referenced_var_iterator rvi;
+   struct function *fn = DECL_STRUCT_FUNCTION (node->decl);
  
    current_function_decl = node->decl;
!   push_cfun (fn);
  
!   gcc_checking_assert (gimple_referenced_vars (fn));
!   FOR_EACH_REFERENCED_VAR (fn, var, rvi)
!     size += expand_one_var (var, true, false);
  
    if (stack_vars_num > 0)
      {
Index: tree-inline.c
===================================================================
*** tree-inline.c	(revision 170240)
--- tree-inline.c	(working copy)
*************** remap_decl (tree decl, copy_body_data *i
*** 312,324 ****
  	    walk_tree (&DECL_QUALIFIER (t), copy_tree_body_r, id, NULL);
  	}
  
!       if (cfun && gimple_in_ssa_p (cfun)
! 	  && (TREE_CODE (t) == VAR_DECL
! 	      || TREE_CODE (t) == RESULT_DECL || TREE_CODE (t) == PARM_DECL))
! 	{
! 	  get_var_ann (t);
! 	  add_referenced_var (t);
! 	}
        return t;
      }
  
--- 312,328 ----
  	    walk_tree (&DECL_QUALIFIER (t), copy_tree_body_r, id, NULL);
  	}
  
!       if ((TREE_CODE (t) == VAR_DECL
! 	   || TREE_CODE (t) == RESULT_DECL
! 	   || TREE_CODE (t) == PARM_DECL)
! 	  && id->src_fn && DECL_STRUCT_FUNCTION (id->src_fn)
! 	  && gimple_referenced_vars (DECL_STRUCT_FUNCTION (id->src_fn))
! 	  /* We don't want to mark as referenced VAR_DECLs that were
! 	     not marked as such in the src function.  */
! 	  && (TREE_CODE (decl) != VAR_DECL
! 	      || referenced_var_lookup (DECL_STRUCT_FUNCTION (id->src_fn),
! 					DECL_UID (decl))))
! 	add_referenced_var (t);
        return t;
      }
  
*************** setup_one_parameter (copy_body_data *id,
*** 2547,2556 ****
  
    /* We're actually using the newly-created var.  */
    if (gimple_in_ssa_p (cfun) && TREE_CODE (var) == VAR_DECL)
!     {
!       get_var_ann (var);
!       add_referenced_var (var);
!     }
  
    /* Declare this new variable.  */
    DECL_CHAIN (var) = *vars;
--- 2551,2557 ----
  
    /* We're actually using the newly-created var.  */
    if (gimple_in_ssa_p (cfun) && TREE_CODE (var) == VAR_DECL)
!     add_referenced_var (var);
  
    /* Declare this new variable.  */
    DECL_CHAIN (var) = *vars;
*************** declare_return_variable (copy_body_data 
*** 2857,2866 ****
  
    var = copy_result_decl_to_var (result, id);
    if (gimple_in_ssa_p (cfun))
!     {
!       get_var_ann (var);
!       add_referenced_var (var);
!     }
  
    DECL_SEEN_IN_BIND_EXPR_P (var) = 1;
  
--- 2858,2864 ----
  
    var = copy_result_decl_to_var (result, id);
    if (gimple_in_ssa_p (cfun))
!     add_referenced_var (var);
  
    DECL_SEEN_IN_BIND_EXPR_P (var) = 1;
  
*************** declare_return_variable (copy_body_data 
*** 2896,2905 ****
      {
        tree temp = create_tmp_var (TREE_TYPE (result), "retvalptr");
        if (gimple_in_ssa_p (id->src_cfun))
! 	{
! 	  get_var_ann (temp);
! 	  add_referenced_var (temp);
! 	}
        insert_decl_map (id, result, temp);
        /* When RESULT_DECL is in SSA form, we need to use it's default_def
  	 SSA_NAME.  */
--- 2894,2900 ----
      {
        tree temp = create_tmp_var (TREE_TYPE (result), "retvalptr");
        if (gimple_in_ssa_p (id->src_cfun))
! 	add_referenced_var (temp);
        insert_decl_map (id, result, temp);
        /* When RESULT_DECL is in SSA form, we need to use it's default_def
  	 SSA_NAME.  */
*************** copy_decl_for_dup_finish (copy_body_data
*** 4753,4758 ****
--- 4748,4761 ----
         new function.  */
      DECL_CONTEXT (copy) = id->dst_fn;
  
+   if (TREE_CODE (decl) == VAR_DECL
+       /* C++ clones functions during parsing, before
+ 	 referenced_vars.  */
+       && gimple_referenced_vars (DECL_STRUCT_FUNCTION (id->src_fn))
+       && referenced_var_lookup (DECL_STRUCT_FUNCTION (id->src_fn),
+ 				DECL_UID (decl)))
+     add_referenced_var (copy);
+ 
    return copy;
  }
  
*************** copy_arguments_for_versioning (tree orig
*** 4864,4870 ****
  	   as temporary variable later in function, the uses will be
  	   replaced by local variable.  */
  	tree var = copy_decl_to_var (arg, id);
- 	get_var_ann (var);
  	add_referenced_var (var);
  	insert_decl_map (id, arg, var);
          /* Declare this new variable.  */
--- 4867,4872 ----
Index: passes.c
===================================================================
*** passes.c	(revision 170240)
--- passes.c	(working copy)
*************** init_optimization_passes (void)
*** 729,735 ****
    NEXT_PASS (pass_build_cfg);
    NEXT_PASS (pass_warn_function_return);
    NEXT_PASS (pass_build_cgraph_edges);
-   NEXT_PASS (pass_inline_parameters);
    *p = NULL;
  
    /* Interprocedural optimization passes.  */
--- 729,734 ----
*************** init_optimization_passes (void)
*** 747,758 ****
        NEXT_PASS (pass_build_ssa);
        NEXT_PASS (pass_lower_vector);
        NEXT_PASS (pass_early_warn_uninitialized);
-       /* Note that it is not strictly necessary to schedule an early
- 	 inline pass here.  However, some test cases (e.g.,
- 	 g++.dg/other/p334435.C g++.dg/other/i386-1.C) expect extern
- 	 inline functions to be inlined even at -O0.  This does not
- 	 happen during the first early inline pass.  */
        NEXT_PASS (pass_rebuild_cgraph_edges);
        NEXT_PASS (pass_early_inline);
        NEXT_PASS (pass_all_early_optimizations);
  	{
--- 746,753 ----
        NEXT_PASS (pass_build_ssa);
        NEXT_PASS (pass_lower_vector);
        NEXT_PASS (pass_early_warn_uninitialized);
        NEXT_PASS (pass_rebuild_cgraph_edges);
+       NEXT_PASS (pass_inline_parameters);
        NEXT_PASS (pass_early_inline);
        NEXT_PASS (pass_all_early_optimizations);
  	{

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-02-12 13:45                               ` Alexandre Oliva
  2011-02-14 12:41                                 ` Richard Guenther
  2011-02-17  9:33                                 ` Jan Hubicka
@ 2011-02-17 16:41                                 ` Jan Hubicka
  2011-02-17 18:59                                   ` H.J. Lu
  2 siblings, 1 reply; 46+ messages in thread
From: Jan Hubicka @ 2011-02-17 16:41 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: Richard Guenther, gcc-patches, Jan Hubicka

Hi,
this is variant of patch I comitted after re-testing.
The change relative to previous version I sent is new tree-sra hunk avoiding
call of compute_inline_parameters on functions that are not in SSA yet and
cgraph_process_new_functions calling it in functions that arrive in SSA form
and thus are not early optimized.

Honza

Index: ChangeLog
===================================================================
*** ChangeLog	(revision 170248)
--- ChangeLog	(working copy)
***************
*** 1,3 ****
--- 1,29 ----
+ 2011-02-17  Alexandre Oliva  <aoliva@redhat.com>
+ 	    Jan Hubicka  <jh@suse.cz>
+ 
+ 	PR debug/47106
+ 	PR debug/47402
+ 	* cfgexpand.c (account_used_vars_for_block): Remove.
+ 	(estimated_stack_frame_size): Use referenced vars.
+ 	* tree-inline.c (remap_decl): Only mark VAR_DECLs as referenced
+ 	that were referenced in the original function.  Test src_fn
+ 	rather than cfun.  Drop redundant get_var_ann.
+ 	(setup_one_parameter): Drop redundant get_var_ann.
+ 	(declare_return_variable): Likewise.
+ 	(copy_decl_for_dup_finish): Mark VAR_DECLs referenced in src_fn.
+ 	(copy_arguments_for_versioning): Drop redundant get_var_ann.
+ 	* ipa-inline.c (compute_inline_parameters): Do not compute
+ 	disregard_inline_limits here.
+ 	are not available.
+ 	(compute_inlinable_for_current, pass_inlinable): New.
+ 	(pass_inline_parameters): Require PROP_referenced_vars.
+ 	* cgraphunit.c (cgraph_process_new_functions): Don't run
+ 	compute_inline_parameters explicitly unless function is in
+ 	SSA form.
+ 	(cgraph_analyze_function): Set .disregard_inline_limits.
+ 	* tree-sra.c (convert_callers): Compute inliner parameters
+ 	only for functions already in SSA form.
+ 
  2011-02-17  Joseph Myers  <joseph@codesourcery.com>
  
  	* config/sparc/sparc.h (CPP_ENDIAN_SPEC): Don't handle
Index: cgraphunit.c
===================================================================
*** cgraphunit.c	(revision 170240)
--- cgraphunit.c	(working copy)
*************** cgraph_process_new_functions (void)
*** 246,258 ****
  	    cgraph_analyze_function (node);
  	  push_cfun (DECL_STRUCT_FUNCTION (fndecl));
  	  current_function_decl = fndecl;
- 	  compute_inline_parameters (node);
  	  if ((cgraph_state == CGRAPH_STATE_IPA_SSA
  	      && !gimple_in_ssa_p (DECL_STRUCT_FUNCTION (fndecl)))
  	      /* When not optimizing, be sure we run early local passes anyway
  		 to expand OMP.  */
  	      || !optimize)
  	    execute_pass_list (pass_early_local_passes.pass.sub);
  	  free_dominance_info (CDI_POST_DOMINATORS);
  	  free_dominance_info (CDI_DOMINATORS);
  	  pop_cfun ();
--- 246,259 ----
  	    cgraph_analyze_function (node);
  	  push_cfun (DECL_STRUCT_FUNCTION (fndecl));
  	  current_function_decl = fndecl;
  	  if ((cgraph_state == CGRAPH_STATE_IPA_SSA
  	      && !gimple_in_ssa_p (DECL_STRUCT_FUNCTION (fndecl)))
  	      /* When not optimizing, be sure we run early local passes anyway
  		 to expand OMP.  */
  	      || !optimize)
  	    execute_pass_list (pass_early_local_passes.pass.sub);
+ 	  else
+ 	    compute_inline_parameters (node);
  	  free_dominance_info (CDI_POST_DOMINATORS);
  	  free_dominance_info (CDI_DOMINATORS);
  	  pop_cfun ();
*************** cgraph_analyze_function (struct cgraph_n
*** 783,788 ****
--- 784,794 ----
  
    assign_assembler_name_if_neeeded (node->decl);
  
+   /* disregard_inline_limits affects topological order of the early optimization,
+      so we need to compute it ahead of rest of inline parameters.  */
+   node->local.disregard_inline_limits
+     = DECL_DISREGARD_INLINE_LIMITS (node->decl);
+ 
    /* Make sure to gimplify bodies only once.  During analyzing a
       function we lower it, which will require gimplified nested
       functions, so we can end up here with an already gimplified
Index: testsuite/ChangeLog
===================================================================
*** testsuite/ChangeLog	(revision 170248)
--- testsuite/ChangeLog	(working copy)
***************
*** 1,3 ****
--- 1,10 ----
+ 2011-02-17  Alexandre Oliva  <aoliva@redhat.com>
+ 	    Jan Hubicka  <jh@suse.cz>
+ 
+ 	PR debug/47106
+ 	PR debug/47402
+ 	* g++.dg/debug/pr47106.C: New.
+ 
  2011-02-17  Uros Bizjak  <ubizjak@gmail.com>
  
  	PR target/43653
Index: testsuite/g++.dg/debug/pr47106.C
===================================================================
*** testsuite/g++.dg/debug/pr47106.C	(revision 0)
--- testsuite/g++.dg/debug/pr47106.C	(revision 0)
***************
*** 0 ****
--- 1,37 ----
+ // { dg-do compile }
+ // { dg-options "-O -fpartial-inlining -flto -fconserve-stack -fcompare-debug" }
+ 
+ void end (int, int) __attribute__ ((__noreturn__));
+ 
+ struct S
+ {
+   int i;
+   S *s;
+ };
+ 
+ inline bool f (S *s)
+ {
+   if (!s->s)
+     end (0, 0);
+   return s->s == s;
+ }
+ 
+ inline bool
+ baz (S s1, S)
+ {
+   while (f (&s1));
+ }
+ 
+ inline bool
+ bar (S s1, S s2, S)
+ {
+   baz (s1, s2);
+ }
+ 
+ S getS ();
+ 
+ bool
+ foo ()
+ {
+   bar (getS (), getS (), getS ());
+ }
Index: ipa-inline.c
===================================================================
*** ipa-inline.c	(revision 170240)
--- ipa-inline.c	(working copy)
*************** compute_inline_parameters (struct cgraph
*** 2006,2014 ****
  	  break;
        node->local.can_change_signature = !e;
      }
-   if (node->local.inlinable && !node->local.disregard_inline_limits)
-     node->local.disregard_inline_limits
-       = DECL_DISREGARD_INLINE_LIMITS (node->decl);
    estimate_function_body_sizes (node);
    /* Inlining characteristics are maintained by the cgraph_mark_inline.  */
    node->global.time = inline_summary (node)->self_time;
--- 2006,2011 ----
Index: cfgexpand.c
===================================================================
*** cfgexpand.c	(revision 170240)
--- cfgexpand.c	(working copy)
*************** create_stack_guard (void)
*** 1311,1340 ****
    crtl->stack_protect_guard = guard;
  }
  
- /* A subroutine of expand_used_vars.  Walk down through the BLOCK tree
-    expanding variables.  Those variables that can be put into registers
-    are allocated pseudos; those that can't are put on the stack.
- 
-    TOPLEVEL is true if this is the outermost BLOCK.  */
- 
- static HOST_WIDE_INT
- account_used_vars_for_block (tree block, bool toplevel)
- {
-   tree t;
-   HOST_WIDE_INT size = 0;
- 
-   /* Expand all variables at this level.  */
-   for (t = BLOCK_VARS (block); t ; t = DECL_CHAIN (t))
-     if (var_ann (t) && is_used_p (t))
-       size += expand_one_var (t, toplevel, false);
- 
-   /* Expand all variables at containing levels.  */
-   for (t = BLOCK_SUBBLOCKS (block); t ; t = BLOCK_CHAIN (t))
-     size += account_used_vars_for_block (t, false);
- 
-   return size;
- }
- 
  /* Prepare for expanding variables.  */
  static void
  init_vars_expansion (void)
--- 1311,1316 ----
*************** estimated_stack_frame_size (struct cgrap
*** 1379,1401 ****
  {
    HOST_WIDE_INT size = 0;
    size_t i;
!   tree var, outer_block = DECL_INITIAL (current_function_decl);
!   unsigned ix;
    tree old_cur_fun_decl = current_function_decl;
  
    current_function_decl = node->decl;
!   push_cfun (DECL_STRUCT_FUNCTION (node->decl));
  
!   init_vars_expansion ();
! 
!   FOR_EACH_LOCAL_DECL (cfun, ix, var)
!     {
!       /* TREE_USED marks local variables that do not appear in lexical
! 	 blocks.  We don't want to expand those that do twice.  */
!       if (TREE_USED (var))
!         size += expand_one_var (var, true, false);
!     }
!   size += account_used_vars_for_block (outer_block, true);
  
    if (stack_vars_num > 0)
      {
--- 1355,1371 ----
  {
    HOST_WIDE_INT size = 0;
    size_t i;
!   tree var;
    tree old_cur_fun_decl = current_function_decl;
+   referenced_var_iterator rvi;
+   struct function *fn = DECL_STRUCT_FUNCTION (node->decl);
  
    current_function_decl = node->decl;
!   push_cfun (fn);
  
!   gcc_checking_assert (gimple_referenced_vars (fn));
!   FOR_EACH_REFERENCED_VAR (fn, var, rvi)
!     size += expand_one_var (var, true, false);
  
    if (stack_vars_num > 0)
      {
Index: tree-sra.c
===================================================================
*** tree-sra.c	(revision 170240)
--- tree-sra.c	(working copy)
*************** convert_callers (struct cgraph_node *nod
*** 4362,4368 ****
      }
  
    for (cs = node->callers; cs; cs = cs->next_caller)
!     if (bitmap_set_bit (recomputed_callers, cs->caller->uid))
        compute_inline_parameters (cs->caller);
    BITMAP_FREE (recomputed_callers);
  
--- 4362,4369 ----
      }
  
    for (cs = node->callers; cs; cs = cs->next_caller)
!     if (bitmap_set_bit (recomputed_callers, cs->caller->uid)
! 	&& gimple_in_ssa_p (DECL_STRUCT_FUNCTION (cs->caller->decl)))
        compute_inline_parameters (cs->caller);
    BITMAP_FREE (recomputed_callers);
  
Index: tree-inline.c
===================================================================
*** tree-inline.c	(revision 170240)
--- tree-inline.c	(working copy)
*************** remap_decl (tree decl, copy_body_data *i
*** 312,324 ****
  	    walk_tree (&DECL_QUALIFIER (t), copy_tree_body_r, id, NULL);
  	}
  
!       if (cfun && gimple_in_ssa_p (cfun)
! 	  && (TREE_CODE (t) == VAR_DECL
! 	      || TREE_CODE (t) == RESULT_DECL || TREE_CODE (t) == PARM_DECL))
! 	{
! 	  get_var_ann (t);
! 	  add_referenced_var (t);
! 	}
        return t;
      }
  
--- 312,328 ----
  	    walk_tree (&DECL_QUALIFIER (t), copy_tree_body_r, id, NULL);
  	}
  
!       if ((TREE_CODE (t) == VAR_DECL
! 	   || TREE_CODE (t) == RESULT_DECL
! 	   || TREE_CODE (t) == PARM_DECL)
! 	  && id->src_fn && DECL_STRUCT_FUNCTION (id->src_fn)
! 	  && gimple_referenced_vars (DECL_STRUCT_FUNCTION (id->src_fn))
! 	  /* We don't want to mark as referenced VAR_DECLs that were
! 	     not marked as such in the src function.  */
! 	  && (TREE_CODE (decl) != VAR_DECL
! 	      || referenced_var_lookup (DECL_STRUCT_FUNCTION (id->src_fn),
! 					DECL_UID (decl))))
! 	add_referenced_var (t);
        return t;
      }
  
*************** setup_one_parameter (copy_body_data *id,
*** 2547,2556 ****
  
    /* We're actually using the newly-created var.  */
    if (gimple_in_ssa_p (cfun) && TREE_CODE (var) == VAR_DECL)
!     {
!       get_var_ann (var);
!       add_referenced_var (var);
!     }
  
    /* Declare this new variable.  */
    DECL_CHAIN (var) = *vars;
--- 2551,2557 ----
  
    /* We're actually using the newly-created var.  */
    if (gimple_in_ssa_p (cfun) && TREE_CODE (var) == VAR_DECL)
!     add_referenced_var (var);
  
    /* Declare this new variable.  */
    DECL_CHAIN (var) = *vars;
*************** declare_return_variable (copy_body_data 
*** 2857,2866 ****
  
    var = copy_result_decl_to_var (result, id);
    if (gimple_in_ssa_p (cfun))
!     {
!       get_var_ann (var);
!       add_referenced_var (var);
!     }
  
    DECL_SEEN_IN_BIND_EXPR_P (var) = 1;
  
--- 2858,2864 ----
  
    var = copy_result_decl_to_var (result, id);
    if (gimple_in_ssa_p (cfun))
!     add_referenced_var (var);
  
    DECL_SEEN_IN_BIND_EXPR_P (var) = 1;
  
*************** declare_return_variable (copy_body_data 
*** 2896,2905 ****
      {
        tree temp = create_tmp_var (TREE_TYPE (result), "retvalptr");
        if (gimple_in_ssa_p (id->src_cfun))
! 	{
! 	  get_var_ann (temp);
! 	  add_referenced_var (temp);
! 	}
        insert_decl_map (id, result, temp);
        /* When RESULT_DECL is in SSA form, we need to use it's default_def
  	 SSA_NAME.  */
--- 2894,2900 ----
      {
        tree temp = create_tmp_var (TREE_TYPE (result), "retvalptr");
        if (gimple_in_ssa_p (id->src_cfun))
! 	add_referenced_var (temp);
        insert_decl_map (id, result, temp);
        /* When RESULT_DECL is in SSA form, we need to use it's default_def
  	 SSA_NAME.  */
*************** copy_decl_for_dup_finish (copy_body_data
*** 4753,4758 ****
--- 4748,4761 ----
         new function.  */
      DECL_CONTEXT (copy) = id->dst_fn;
  
+   if (TREE_CODE (decl) == VAR_DECL
+       /* C++ clones functions during parsing, before
+ 	 referenced_vars.  */
+       && gimple_referenced_vars (DECL_STRUCT_FUNCTION (id->src_fn))
+       && referenced_var_lookup (DECL_STRUCT_FUNCTION (id->src_fn),
+ 				DECL_UID (decl)))
+     add_referenced_var (copy);
+ 
    return copy;
  }
  
*************** copy_arguments_for_versioning (tree orig
*** 4864,4870 ****
  	   as temporary variable later in function, the uses will be
  	   replaced by local variable.  */
  	tree var = copy_decl_to_var (arg, id);
- 	get_var_ann (var);
  	add_referenced_var (var);
  	insert_decl_map (id, arg, var);
          /* Declare this new variable.  */
--- 4867,4872 ----
Index: passes.c
===================================================================
*** passes.c	(revision 170240)
--- passes.c	(working copy)
*************** init_optimization_passes (void)
*** 729,735 ****
    NEXT_PASS (pass_build_cfg);
    NEXT_PASS (pass_warn_function_return);
    NEXT_PASS (pass_build_cgraph_edges);
-   NEXT_PASS (pass_inline_parameters);
    *p = NULL;
  
    /* Interprocedural optimization passes.  */
--- 729,734 ----
*************** init_optimization_passes (void)
*** 747,758 ****
        NEXT_PASS (pass_build_ssa);
        NEXT_PASS (pass_lower_vector);
        NEXT_PASS (pass_early_warn_uninitialized);
-       /* Note that it is not strictly necessary to schedule an early
- 	 inline pass here.  However, some test cases (e.g.,
- 	 g++.dg/other/p334435.C g++.dg/other/i386-1.C) expect extern
- 	 inline functions to be inlined even at -O0.  This does not
- 	 happen during the first early inline pass.  */
        NEXT_PASS (pass_rebuild_cgraph_edges);
        NEXT_PASS (pass_early_inline);
        NEXT_PASS (pass_all_early_optimizations);
  	{
--- 746,753 ----
        NEXT_PASS (pass_build_ssa);
        NEXT_PASS (pass_lower_vector);
        NEXT_PASS (pass_early_warn_uninitialized);
        NEXT_PASS (pass_rebuild_cgraph_edges);
+       NEXT_PASS (pass_inline_parameters);
        NEXT_PASS (pass_early_inline);
        NEXT_PASS (pass_all_early_optimizations);
  	{

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-02-17 16:41                                 ` Jan Hubicka
@ 2011-02-17 18:59                                   ` H.J. Lu
  2011-02-18 23:41                                     ` Jan Hubicka
  0 siblings, 1 reply; 46+ messages in thread
From: H.J. Lu @ 2011-02-17 18:59 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Alexandre Oliva, Richard Guenther, gcc-patches

On Thu, Feb 17, 2011 at 8:21 AM, Jan Hubicka <hubicka@ucw.cz> wrote:
> Hi,
> this is variant of patch I comitted after re-testing.
> The change relative to previous version I sent is new tree-sra hunk avoiding
> call of compute_inline_parameters on functions that are not in SSA yet and
> cgraph_process_new_functions calling it in functions that arrive in SSA form
> and thus are not early optimized.
>
> Honza
>
> Index: ChangeLog
> ===================================================================
> *** ChangeLog   (revision 170248)
> --- ChangeLog   (working copy)
> ***************
> *** 1,3 ****
> --- 1,29 ----
> + 2011-02-17  Alexandre Oliva  <aoliva@redhat.com>
> +           Jan Hubicka  <jh@suse.cz>
> +
> +       PR debug/47106
> +       PR debug/47402
> +       * cfgexpand.c (account_used_vars_for_block): Remove.
> +       (estimated_stack_frame_size): Use referenced vars.
> +       * tree-inline.c (remap_decl): Only mark VAR_DECLs as referenced
> +       that were referenced in the original function.  Test src_fn
> +       rather than cfun.  Drop redundant get_var_ann.
> +       (setup_one_parameter): Drop redundant get_var_ann.
> +       (declare_return_variable): Likewise.
> +       (copy_decl_for_dup_finish): Mark VAR_DECLs referenced in src_fn.
> +       (copy_arguments_for_versioning): Drop redundant get_var_ann.
> +       * ipa-inline.c (compute_inline_parameters): Do not compute
> +       disregard_inline_limits here.
> +       are not available.
> +       (compute_inlinable_for_current, pass_inlinable): New.
> +       (pass_inline_parameters): Require PROP_referenced_vars.
> +       * cgraphunit.c (cgraph_process_new_functions): Don't run
> +       compute_inline_parameters explicitly unless function is in
> +       SSA form.
> +       (cgraph_analyze_function): Set .disregard_inline_limits.
> +       * tree-sra.c (convert_callers): Compute inliner parameters
> +       only for functions already in SSA form.
> +

This caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47788


-- 
H.J.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-02-17  9:33                                 ` Jan Hubicka
@ 2011-02-18  3:58                                   ` Jack Howarth
  2011-02-18  9:48                                     ` Jakub Jelinek
  0 siblings, 1 reply; 46+ messages in thread
From: Jack Howarth @ 2011-02-18  3:58 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Alexandre Oliva, Richard Guenther, gcc-patches

On Thu, Feb 17, 2011 at 09:35:56AM +0100, Jan Hubicka wrote:
> Hi,
> this is variant of patch I am testing now.
> Quick testing on the problematic PR shows no problems and we get around w/o
> the extra pass.
> 
> Honza

Honza,
    It seems that this patch may have introduced the regressions....

FAIL: gcc.dg/lto/20081115 c_lto_20081115_0.o-c_lto_20081115_2.o execute -O0 -flto -flto-partition=1to1
FAIL: gcc.dg/lto/20081118 c_lto_20081118_0.o-c_lto_20081118_2.o execute -O0 -flto -flto-partition=1to1
FAIL: gcc.dg/lto/20081201-1 c_lto_20081201-1_0.o-c_lto_20081201-1_2.o execute -O0 -flto -flto-partition=1to1

looking at http://gcc.gnu.org/ml/gcc-testresults/2011-02/msg01897.html compared to
http://gcc.gnu.org/ml/gcc-testresults/2011-02/msg01911.html. I also see this on x86_64-apple-darwin10.
             Jack

> 
> Index: cgraphunit.c
> ===================================================================
> *** cgraphunit.c	(revision 170240)
> --- cgraphunit.c	(working copy)
> *************** cgraph_analyze_function (struct cgraph_n
> *** 783,788 ****
> --- 783,793 ----
>   
>     assign_assembler_name_if_neeeded (node->decl);
>   
> +   /* disregard_inline_limits affects topological order of the early optimization,
> +      so we need to compute it ahead of rest of inline parameters.  */
> +   node->local.disregard_inline_limits
> +     = DECL_DISREGARD_INLINE_LIMITS (node->decl);
> + 
>     /* Make sure to gimplify bodies only once.  During analyzing a
>        function we lower it, which will require gimplified nested
>        functions, so we can end up here with an already gimplified
> Index: testsuite/g++.dg/debug/pr47106.C
> ===================================================================
> *** testsuite/g++.dg/debug/pr47106.C	(revision 0)
> --- testsuite/g++.dg/debug/pr47106.C	(revision 0)
> ***************
> *** 0 ****
> --- 1,37 ----
> + // { dg-do compile }
> + // { dg-options "-O -fpartial-inlining -flto -fconserve-stack -fcompare-debug" }
> + 
> + void end (int, int) __attribute__ ((__noreturn__));
> + 
> + struct S
> + {
> +   int i;
> +   S *s;
> + };
> + 
> + inline bool f (S *s)
> + {
> +   if (!s->s)
> +     end (0, 0);
> +   return s->s == s;
> + }
> + 
> + inline bool
> + baz (S s1, S)
> + {
> +   while (f (&s1));
> + }
> + 
> + inline bool
> + bar (S s1, S s2, S)
> + {
> +   baz (s1, s2);
> + }
> + 
> + S getS ();
> + 
> + bool
> + foo ()
> + {
> +   bar (getS (), getS (), getS ());
> + }
> Index: ipa-inline.c
> ===================================================================
> *** ipa-inline.c	(revision 170240)
> --- ipa-inline.c	(working copy)
> *************** compute_inline_parameters (struct cgraph
> *** 2006,2014 ****
>   	  break;
>         node->local.can_change_signature = !e;
>       }
> -   if (node->local.inlinable && !node->local.disregard_inline_limits)
> -     node->local.disregard_inline_limits
> -       = DECL_DISREGARD_INLINE_LIMITS (node->decl);
>     estimate_function_body_sizes (node);
>     /* Inlining characteristics are maintained by the cgraph_mark_inline.  */
>     node->global.time = inline_summary (node)->self_time;
> --- 2006,2011 ----
> Index: cfgexpand.c
> ===================================================================
> *** cfgexpand.c	(revision 170240)
> --- cfgexpand.c	(working copy)
> *************** create_stack_guard (void)
> *** 1311,1340 ****
>     crtl->stack_protect_guard = guard;
>   }
>   
> - /* A subroutine of expand_used_vars.  Walk down through the BLOCK tree
> -    expanding variables.  Those variables that can be put into registers
> -    are allocated pseudos; those that can't are put on the stack.
> - 
> -    TOPLEVEL is true if this is the outermost BLOCK.  */
> - 
> - static HOST_WIDE_INT
> - account_used_vars_for_block (tree block, bool toplevel)
> - {
> -   tree t;
> -   HOST_WIDE_INT size = 0;
> - 
> -   /* Expand all variables at this level.  */
> -   for (t = BLOCK_VARS (block); t ; t = DECL_CHAIN (t))
> -     if (var_ann (t) && is_used_p (t))
> -       size += expand_one_var (t, toplevel, false);
> - 
> -   /* Expand all variables at containing levels.  */
> -   for (t = BLOCK_SUBBLOCKS (block); t ; t = BLOCK_CHAIN (t))
> -     size += account_used_vars_for_block (t, false);
> - 
> -   return size;
> - }
> - 
>   /* Prepare for expanding variables.  */
>   static void
>   init_vars_expansion (void)
> --- 1311,1316 ----
> *************** estimated_stack_frame_size (struct cgrap
> *** 1379,1401 ****
>   {
>     HOST_WIDE_INT size = 0;
>     size_t i;
> !   tree var, outer_block = DECL_INITIAL (current_function_decl);
> !   unsigned ix;
>     tree old_cur_fun_decl = current_function_decl;
>   
>     current_function_decl = node->decl;
> !   push_cfun (DECL_STRUCT_FUNCTION (node->decl));
>   
> !   init_vars_expansion ();
> ! 
> !   FOR_EACH_LOCAL_DECL (cfun, ix, var)
> !     {
> !       /* TREE_USED marks local variables that do not appear in lexical
> ! 	 blocks.  We don't want to expand those that do twice.  */
> !       if (TREE_USED (var))
> !         size += expand_one_var (var, true, false);
> !     }
> !   size += account_used_vars_for_block (outer_block, true);
>   
>     if (stack_vars_num > 0)
>       {
> --- 1355,1371 ----
>   {
>     HOST_WIDE_INT size = 0;
>     size_t i;
> !   tree var;
>     tree old_cur_fun_decl = current_function_decl;
> +   referenced_var_iterator rvi;
> +   struct function *fn = DECL_STRUCT_FUNCTION (node->decl);
>   
>     current_function_decl = node->decl;
> !   push_cfun (fn);
>   
> !   gcc_checking_assert (gimple_referenced_vars (fn));
> !   FOR_EACH_REFERENCED_VAR (fn, var, rvi)
> !     size += expand_one_var (var, true, false);
>   
>     if (stack_vars_num > 0)
>       {
> Index: tree-inline.c
> ===================================================================
> *** tree-inline.c	(revision 170240)
> --- tree-inline.c	(working copy)
> *************** remap_decl (tree decl, copy_body_data *i
> *** 312,324 ****
>   	    walk_tree (&DECL_QUALIFIER (t), copy_tree_body_r, id, NULL);
>   	}
>   
> !       if (cfun && gimple_in_ssa_p (cfun)
> ! 	  && (TREE_CODE (t) == VAR_DECL
> ! 	      || TREE_CODE (t) == RESULT_DECL || TREE_CODE (t) == PARM_DECL))
> ! 	{
> ! 	  get_var_ann (t);
> ! 	  add_referenced_var (t);
> ! 	}
>         return t;
>       }
>   
> --- 312,328 ----
>   	    walk_tree (&DECL_QUALIFIER (t), copy_tree_body_r, id, NULL);
>   	}
>   
> !       if ((TREE_CODE (t) == VAR_DECL
> ! 	   || TREE_CODE (t) == RESULT_DECL
> ! 	   || TREE_CODE (t) == PARM_DECL)
> ! 	  && id->src_fn && DECL_STRUCT_FUNCTION (id->src_fn)
> ! 	  && gimple_referenced_vars (DECL_STRUCT_FUNCTION (id->src_fn))
> ! 	  /* We don't want to mark as referenced VAR_DECLs that were
> ! 	     not marked as such in the src function.  */
> ! 	  && (TREE_CODE (decl) != VAR_DECL
> ! 	      || referenced_var_lookup (DECL_STRUCT_FUNCTION (id->src_fn),
> ! 					DECL_UID (decl))))
> ! 	add_referenced_var (t);
>         return t;
>       }
>   
> *************** setup_one_parameter (copy_body_data *id,
> *** 2547,2556 ****
>   
>     /* We're actually using the newly-created var.  */
>     if (gimple_in_ssa_p (cfun) && TREE_CODE (var) == VAR_DECL)
> !     {
> !       get_var_ann (var);
> !       add_referenced_var (var);
> !     }
>   
>     /* Declare this new variable.  */
>     DECL_CHAIN (var) = *vars;
> --- 2551,2557 ----
>   
>     /* We're actually using the newly-created var.  */
>     if (gimple_in_ssa_p (cfun) && TREE_CODE (var) == VAR_DECL)
> !     add_referenced_var (var);
>   
>     /* Declare this new variable.  */
>     DECL_CHAIN (var) = *vars;
> *************** declare_return_variable (copy_body_data 
> *** 2857,2866 ****
>   
>     var = copy_result_decl_to_var (result, id);
>     if (gimple_in_ssa_p (cfun))
> !     {
> !       get_var_ann (var);
> !       add_referenced_var (var);
> !     }
>   
>     DECL_SEEN_IN_BIND_EXPR_P (var) = 1;
>   
> --- 2858,2864 ----
>   
>     var = copy_result_decl_to_var (result, id);
>     if (gimple_in_ssa_p (cfun))
> !     add_referenced_var (var);
>   
>     DECL_SEEN_IN_BIND_EXPR_P (var) = 1;
>   
> *************** declare_return_variable (copy_body_data 
> *** 2896,2905 ****
>       {
>         tree temp = create_tmp_var (TREE_TYPE (result), "retvalptr");
>         if (gimple_in_ssa_p (id->src_cfun))
> ! 	{
> ! 	  get_var_ann (temp);
> ! 	  add_referenced_var (temp);
> ! 	}
>         insert_decl_map (id, result, temp);
>         /* When RESULT_DECL is in SSA form, we need to use it's default_def
>   	 SSA_NAME.  */
> --- 2894,2900 ----
>       {
>         tree temp = create_tmp_var (TREE_TYPE (result), "retvalptr");
>         if (gimple_in_ssa_p (id->src_cfun))
> ! 	add_referenced_var (temp);
>         insert_decl_map (id, result, temp);
>         /* When RESULT_DECL is in SSA form, we need to use it's default_def
>   	 SSA_NAME.  */
> *************** copy_decl_for_dup_finish (copy_body_data
> *** 4753,4758 ****
> --- 4748,4761 ----
>          new function.  */
>       DECL_CONTEXT (copy) = id->dst_fn;
>   
> +   if (TREE_CODE (decl) == VAR_DECL
> +       /* C++ clones functions during parsing, before
> + 	 referenced_vars.  */
> +       && gimple_referenced_vars (DECL_STRUCT_FUNCTION (id->src_fn))
> +       && referenced_var_lookup (DECL_STRUCT_FUNCTION (id->src_fn),
> + 				DECL_UID (decl)))
> +     add_referenced_var (copy);
> + 
>     return copy;
>   }
>   
> *************** copy_arguments_for_versioning (tree orig
> *** 4864,4870 ****
>   	   as temporary variable later in function, the uses will be
>   	   replaced by local variable.  */
>   	tree var = copy_decl_to_var (arg, id);
> - 	get_var_ann (var);
>   	add_referenced_var (var);
>   	insert_decl_map (id, arg, var);
>           /* Declare this new variable.  */
> --- 4867,4872 ----
> Index: passes.c
> ===================================================================
> *** passes.c	(revision 170240)
> --- passes.c	(working copy)
> *************** init_optimization_passes (void)
> *** 729,735 ****
>     NEXT_PASS (pass_build_cfg);
>     NEXT_PASS (pass_warn_function_return);
>     NEXT_PASS (pass_build_cgraph_edges);
> -   NEXT_PASS (pass_inline_parameters);
>     *p = NULL;
>   
>     /* Interprocedural optimization passes.  */
> --- 729,734 ----
> *************** init_optimization_passes (void)
> *** 747,758 ****
>         NEXT_PASS (pass_build_ssa);
>         NEXT_PASS (pass_lower_vector);
>         NEXT_PASS (pass_early_warn_uninitialized);
> -       /* Note that it is not strictly necessary to schedule an early
> - 	 inline pass here.  However, some test cases (e.g.,
> - 	 g++.dg/other/p334435.C g++.dg/other/i386-1.C) expect extern
> - 	 inline functions to be inlined even at -O0.  This does not
> - 	 happen during the first early inline pass.  */
>         NEXT_PASS (pass_rebuild_cgraph_edges);
>         NEXT_PASS (pass_early_inline);
>         NEXT_PASS (pass_all_early_optimizations);
>   	{
> --- 746,753 ----
>         NEXT_PASS (pass_build_ssa);
>         NEXT_PASS (pass_lower_vector);
>         NEXT_PASS (pass_early_warn_uninitialized);
>         NEXT_PASS (pass_rebuild_cgraph_edges);
> +       NEXT_PASS (pass_inline_parameters);
>         NEXT_PASS (pass_early_inline);
>         NEXT_PASS (pass_all_early_optimizations);
>   	{

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-02-18  3:58                                   ` Jack Howarth
@ 2011-02-18  9:48                                     ` Jakub Jelinek
  0 siblings, 0 replies; 46+ messages in thread
From: Jakub Jelinek @ 2011-02-18  9:48 UTC (permalink / raw)
  To: Jack Howarth; +Cc: Jan Hubicka, gcc-patches

On Thu, Feb 17, 2011 at 10:27:23PM -0500, Jack Howarth wrote:
> On Thu, Feb 17, 2011 at 09:35:56AM +0100, Jan Hubicka wrote:
>     It seems that this patch may have introduced the regressions....
> 
> FAIL: gcc.dg/lto/20081115 c_lto_20081115_0.o-c_lto_20081115_2.o execute -O0 -flto -flto-partition=1to1
> FAIL: gcc.dg/lto/20081118 c_lto_20081118_0.o-c_lto_20081118_2.o execute -O0 -flto -flto-partition=1to1
> FAIL: gcc.dg/lto/20081201-1 c_lto_20081201-1_0.o-c_lto_20081201-1_2.o execute -O0 -flto -flto-partition=1to1
> 
> looking at http://gcc.gnu.org/ml/gcc-testresults/2011-02/msg01897.html compared to
> http://gcc.gnu.org/ml/gcc-testresults/2011-02/msg01911.html. I also see this on x86_64-apple-darwin10.

That's http://gcc.gnu.org/PR47788, already reported.

	Jakub

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PR debug/47106] account used vars only once
  2011-02-17 18:59                                   ` H.J. Lu
@ 2011-02-18 23:41                                     ` Jan Hubicka
  0 siblings, 0 replies; 46+ messages in thread
From: Jan Hubicka @ 2011-02-18 23:41 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Jan Hubicka, Alexandre Oliva, Richard Guenther, gcc-patches

> 
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47788

The problem is that we now inline extern inlines at -O0, while previously we
didn't.  For some reason only LTO tests check this.

Since GCC 3.4 or so C frontend (but not C++) declare extern inlines
always_inline by setting DECl_DISREGARD_INLINE_LIMITS.  However the functions
are not inlined at -O0 because tree_inlinable_function_p returns false there.
It checks for always_inline attribute but ignores disregard_inline_limits.

tree_inlinable_function_p is not really good place to handle this, but we will
clean this up for 4.7.  This patch restores the previous behaviour

I am testing it x86_64-linux (this time with LTO) and will commit it once
testing passes.

Honza

Index: ipa-inline.c
===================================================================
--- ipa-inline.c	(revision 170249)
+++ ipa-inline.c	(working copy)
@@ -1990,6 +1990,8 @@ compute_inline_parameters (struct cgraph
 
   /* Can this function be inlined at all?  */
   node->local.inlinable = tree_inlinable_function_p (node->decl);
+  if (!node->local.inlinable)
+    node->local.disregard_inline_limits = 0;
 
   /* Inlinable functions always can change signature.  */
   if (node->local.inlinable)

^ permalink raw reply	[flat|nested] 46+ messages in thread

end of thread, other threads:[~2011-02-18 23:09 UTC | newest]

Thread overview: 46+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-01-06 11:20 [PR debug/47106] account used vars only once Alexandre Oliva
2011-01-10 15:24 ` Richard Guenther
2011-01-10 15:50   ` Jan Hubicka
2011-01-10 16:31     ` Richard Guenther
2011-01-10 16:58       ` Jan Hubicka
2011-01-20 22:01   ` Alexandre Oliva
2011-01-21 10:51     ` Richard Guenther
2011-01-21 19:01       ` H.J. Lu
2011-01-21 22:13         ` Alexandre Oliva
2011-01-21 23:27     ` Alexandre Oliva
2011-01-31 12:24       ` Alexandre Oliva
2011-01-31 14:01         ` Richard Guenther
2011-02-01 22:25           ` Alexandre Oliva
2011-02-01 22:40             ` Richard Guenther
2011-02-02  0:42               ` Alexandre Oliva
2011-02-02 11:30                 ` Richard Guenther
2011-02-02 19:35                   ` Alexandre Oliva
2011-02-03  5:51                     ` Alexandre Oliva
2011-02-03 10:19                       ` Richard Guenther
2011-02-03 13:25                         ` Jan Hubicka
2011-02-04  7:49                         ` Alexandre Oliva
2011-02-07 12:32                           ` Alexandre Oliva
2011-02-07 13:39                             ` Richard Guenther
2011-02-07 16:01                               ` Jan Hubicka
2011-02-07 22:38                                 ` Alexandre Oliva
2011-02-07 22:31                               ` Alexandre Oliva
2011-02-08 11:10                                 ` Richard Guenther
2011-02-08 11:30                                   ` Jan Hubicka
2011-02-12 13:45                               ` Alexandre Oliva
2011-02-14 12:41                                 ` Richard Guenther
2011-02-14 20:25                                   ` Alexandre Oliva
2011-02-15 11:04                                     ` Richard Guenther
2011-02-17  1:53                                   ` Jan Hubicka
2011-02-17  1:59                                     ` Jan Hubicka
2011-02-17  9:33                                 ` Jan Hubicka
2011-02-18  3:58                                   ` Jack Howarth
2011-02-18  9:48                                     ` Jakub Jelinek
2011-02-17 16:41                                 ` Jan Hubicka
2011-02-17 18:59                                   ` H.J. Lu
2011-02-18 23:41                                     ` Jan Hubicka
2011-02-07 16:03                             ` Jan Hubicka
2011-02-07 22:35                               ` Alexandre Oliva
2011-02-07 23:58                                 ` Jan Hubicka
2011-02-07 16:09                             ` Jan Hubicka
2011-02-07 22:34                               ` Alexandre Oliva
2011-02-02  6:34           ` Alexandre Oliva

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).