From mboxrd@z Thu Jan  1 00:00:00 1970
From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/44563] GCC uses a lot of RAM when compiling a large number of functions
Date: Mon, 09 Mar 2015 15:36:00 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44563

--- Comment #18 from Richard Biener ---
(In reply to Richard Biener from comment #17)
> (In reply to Richard Biener from comment #16)
> > callgrind shows cgraph_edge_hasher quite high in the profile (via
> > redirect_all_calls).  I suppose that, since the large main is a single
> > basic block, walking all its statements over and over is quite bad.
> > Also, hash_pointer isn't inlined!?  Ah - it's external, in libiberty's
> > hashtab.c ... - it should transition to using/inheriting from
> > pointer_hash.
> >
> > cgraph_edge *
> > cgraph_node::get_edge (gimple call_stmt)
> > {
> >   cgraph_edge *e, *e2;
> >   int n = 0;
> >
> >   if (call_site_hash)
> >     return call_site_hash->find_with_hash (call_stmt,
> >                                            htab_hash_pointer (call_stmt));
> >
>
> Btw, for 10000 calls (smaller testcase) we get 100 000 000 calls to
> cgraph_edge::redirect_call_stmt_to_callee () (that's from 40000
> redirect_all_calls calls, which is from 10000 optimize_inline_calls calls).
>
> Ah - we do this also for the ENTRY/EXIT block!
>
> Index: gcc/tree-inline.c
> ===================================================================
> --- gcc/tree-inline.c   (revision 221278)
> +++ gcc/tree-inline.c   (working copy)
> @@ -2802,11 +2802,13 @@ copy_cfg_body (copy_body_data * id, gcov
>        if (need_debug_cleanup
>            && bb->index != ENTRY_BLOCK
>            && bb->index != EXIT_BLOCK)
> -        maybe_move_debug_stmts_to_successors (id, (basic_block) bb->aux);
> -      /* Update call edge destinations.  This can not be done before loop
> -         info is updated, because we may split basic blocks.  */
> -      if (id->transform_call_graph_edges == CB_CGE_DUPLICATE)
> -        redirect_all_calls (id, (basic_block)bb->aux);
> +        {
> +          maybe_move_debug_stmts_to_successors (id, (basic_block) bb->aux);
> +          /* Update call edge destinations.  This can not be done before
> +             loop info is updated, because we may split basic blocks.  */
> +          if (id->transform_call_graph_edges == CB_CGE_DUPLICATE)
> +            redirect_all_calls (id, (basic_block)bb->aux);
> +        }
>        ((basic_block)bb->aux)->aux = NULL;
>        bb->aux = NULL;
>      }
>
> makes sense?

Fails to bootstrap :/  But it would improve the testcase to only have the
inline heuristic issue.
/space/rguenther/src/svn/trunk/libstdc++-v3/libsupc++/pbase_type_info.cc: In
member function 'virtual bool __cxxabiv1::__pbase_type_info::__do_catch(const
std::type_info*, void**, unsigned int) const':
/space/rguenther/src/svn/trunk/libstdc++-v3/libsupc++/pbase_type_info.cc:32:6:
error: reference to dead statement
 bool __pbase_type_info::
      ^
# .MEM = VDEF <.MEM>
_30 = OBJ_TYPE_REF(_28;(const struct __pbase_type_info)this_3(D)->6)
    (this_3(D), thr_type_5(D), thr_obj_9(D), outer_29);
_ZNK10__cxxabiv117__pbase_type_info10__do_catchEPKSt9type_infoPPvj/74 (virtual
bool __cxxabiv1::__pbase_type_info::__do_catch(const std::type_info*, void**,
unsigned int) const) @0x2aaaac8d3ab8
  Type: function definition analyzed
  Visibility: externally_visible public visibility_specified virtual
  Address is taken.
  References: _ZNK10__cxxabiv117__pbase_type_info15__pointer_catchEPKS0_PPvj/34
    (addr) (speculative)
  Referring: _ZTVN10__cxxabiv117__pbase_type_infoE/77 (addr)
  Availability: overwritable
  First run: 0
  Function flags: body
  Called by:
  Calls: strcmp/85 (0.39 per call) __cxa_bad_typeid/83 (can throw external)
    strcmp/85 (0.61 per call)
  Indirect call (0.11 per call) (can throw external)
  Polymorphic indirect call of type const struct __pbase_type_info
    token:6 (speculative) (0.03 per call) (can throw external) of param:0
  Outer type (dynamic): struct __pbase_type_info (or a derived type) offset 0
/space/rguenther/src/svn/trunk/libstdc++-v3/libsupc++/pbase_type_info.cc:32:6:
internal compiler error: verify_cgraph_node failed
0xa8eebc cgraph_node::verify_node()
        /space/rguenther/src/svn/trunk/gcc/cgraph.c:3115
0xa8473f symtab_node::verify()
        /space/rguenther/src/svn/trunk/gcc/symtab.c:1103
0x1025861 optimize_inline_calls(tree_node*)
        /space/rguenther/src/svn/trunk/gcc/tree-inline.c:4938

> > The estimate_calls_size_and_time portion is much smaller.
> >
> > cleanup-cfg's main portion is split_bb_on_noreturn_calls.
From gcc-bugs-return-479823-listarch-gcc-bugs=gcc.gnu.org@gcc.gnu.org Mon Mar 09 15:55:22 2015
From: "derodat at adacore dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug debug/53927] wrong value for DW_AT_static_link
Date: Mon, 09 Mar 2015 15:55:00 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53927

--- Comment #21 from Pierre-Marie de Rodat ---
(In reply to Eric Botcazou from comment #18)
> I think this is worth investigating though because it's conceptually
> much simpler than adding yet another indirection.  And we should
> concentrate on -O0 (and -Og); we don't really care about what happens
> with aggressive optimization.

Understood and agreed.  Nevertheless...

> I guess the question is: can we arrange to have a constant offset
> between the frame base and the FRAME object, "constant" meaning valid
> for every function but possibly target-dependent?
I started to hack into cfgexpand.c and dwarf2out.c, but I realized this is not
possible in the general case.  Consider the following example:

#include <alloca.h>

int
nestee (void)
{
  int a __attribute__((aligned(64))) = 1234;
  void nested (int b) { a = b; }
  nested (2345);
  return a;
}

int
call_nestee (int n)
{
  int *v = alloca (sizeof (int) * n);
  v[0] = nestee ();
  return v[0];
}

int
main (void)
{
  call_nestee (1);
  call_nestee (8);
  return 0;
}

With a GCC 5.0 built from fairly recent sources, I get the following CFA
information:

00000090 000000000000002c 00000064 FDE cie=00000030
pc=00000000004004ac..00000000004004eb
  DW_CFA_advance_loc: 5 to 00000000004004b1
  DW_CFA_def_cfa: r10 (r10) ofs 0
  DW_CFA_advance_loc: 9 to 00000000004004ba
  DW_CFA_expression: r6 (rbp) (DW_OP_breg6 (rbp): 0)
  DW_CFA_advance_loc: 5 to 00000000004004bf
  DW_CFA_def_cfa_expression (DW_OP_breg6 (rbp): -8; DW_OP_deref)
  DW_CFA_advance_loc: 38 to 00000000004004e5

And now here is what I get under GDB:

$ gdb -n -q -ex 'b nestee' ./dyn_frame
Reading symbols from ./dyn_frame...done.
Breakpoint 1 at 0x4004c3: file dyn_frame.c, line 6.
(gdb) r
[...]
Breakpoint 1, nestee () at dyn_frame.c:6
6         int a __attribute__((aligned(64))) = 1234;
(gdb) p $pc
$1 = (void (*)()) 0x4004c3
(gdb) x/1xg $rbp - 8
0x7fffffffdf28: 0x00007fffffffdf60
(gdb) p/x (char *) 0x00007fffffffdf60 - (char *) &a
$2 = 0xa0

... so for this frame, the CFA and the FRAME object are 0xa0 bytes from each
other.  Now let's resume to see the next "nestee" frame:

(gdb) c
Continuing.

Breakpoint 1, nestee () at dyn_frame.c:6
6         int a __attribute__((aligned(64))) = 1234;
(gdb) p $pc
$3 = (void (*)()) 0x4004c3
(gdb) x/1xg $rbp - 8
0x7fffffffdf28: 0x00007fffffffdf50
(gdb) p/x (char *) 0x00007fffffffdf50 - (char *) &a
$4 = 0x90

The offset between the CFA and the FRAME object is now 0x90 bytes.  So because
of alignment constraints, I think we cannot assume we can have a constant
offset (not even a function-dependent one).
This offset is dynamic, and the only way to compute it is to use the frame's
context: here, nestee's saved registers, which we do not have access to in
DWARF when computing the static link attribute.