From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 20857 invoked by alias); 22 Sep 2003 00:17:13 -0000 Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Archive: List-Post: List-Help: Sender: gcc-owner@gcc.gnu.org Received: (qmail 20844 invoked from network); 22 Sep 2003 00:17:12 -0000 Received: from unknown (HELO mail.alinoe.com) (62.195.88.134) by sources.redhat.com with SMTP; 22 Sep 2003 00:17:12 -0000 Received: (qmail 24628 invoked by uid 500); 22 Sep 2003 00:17:10 -0000 Date: Mon, 22 Sep 2003 02:31:00 -0000 From: Carlo Wood To: gcc@gcc.gnu.org Subject: RFC: debug line info & inlining; patch proposal and request for comments. Message-ID: <20030922001710.GA24248@alinoe.com> Mail-Followup-To: Carlo Wood , gcc@gcc.gnu.org Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.1i X-SW-Source: 2003-09/txt/msg00944.txt.bz2 Problem description ------------------- The debug source-file/line-number information (location) in relation with inlined functions is not perfect. The main problem in fixing this, hence this RFC, is that I am not sure what test case would cover all possibilities. The test case that I have in mind is given below. If something is missing then I really need to know that or else the patch might fix one thing but break another! Consider the following test case (a C++ program): ~>grep -n ^ troep.cc 1:extern int a; 2:extern int h(); 3: 4:inline int 5:i(int v, int w) 6:{ 7: a += v + w; 8:} 9: 10:inline void 11:g(int x, 12: int y = h(), 13: int z = i(12345, h())) 14:{ 15: int q[4] = { 2, 3, 5, 7 }; 16: a += x + y + z + (int)&q; 17:} 18: 19:void f() 20:{ 21: g(h()); 22:} This generates (current CVS + patch of PR 12319) the following assembly code for function f(), compiled as: # g++-cvs-3.4 -c troep.cc -g -finline -save-temps -v // My comments here: _Z1fv: .LFB4: .file 1 "troep.c" .loc 1 20 0 // Prologue of f() at line 20 pushl %ebp .LCFI0: movl %esp, %ebp .LCFI1: subl $56, %esp .LCFI2: .LBB2: .LBB3: .loc 1 14 0 // Line 14 for inlined g() call _Z1hv // Call that is actually done at line 21 movl %eax, -12(%ebp) // Assignment to x. call _Z1hv // Call that is actually done at line 12 movl %eax, -16(%ebp) // Assignment to y. .LBB4: .loc 1 6 0 // Line 6 for inlined i() movl $12345, -24(%ebp) // Assignment to v. call _Z1hv // Call that is actually done at line 13 movl %eax, -28(%ebp) // Assignment to w. .LBB5: .loc 1 7 0 // Actually inlined code of i(), at line 7. movl -28(%ebp), %eax // > addl -24(%ebp), %eax // > a += v + w addl %eax, a // > .LBE5: .LBE4: .loc 1 6 0 // Back to line 6 - prologue of i() actually at line 8. movl -32(%ebp), %eax // Undefined return value of i() movl %eax, -20(%ebp) // Assignment to z, actually at line 13 .LBB6: .LBB7: .loc 1 15 0 // Wow, a correctc linenumber for a change. movl $2, -56(%ebp) // int q[4] = { 2, 3, 4, 5 }; movl $3, -52(%ebp) movl $5, -48(%ebp) movl $7, -44(%ebp) .loc 1 16 0 // Line 16: a += x + y + z + (int)&q; movl -16(%ebp), %eax // y addl -12(%ebp), %eax // x movl %eax, %edx // x + y addl -20(%ebp), %edx // + z leal -56(%ebp), %eax // (int)&q leal (%edx,%eax), %eax // a += addl %eax, a .LBE7: .LBE6: .LBE3: .LBE2: .loc 1 22 0 // Prologue of f() at line 22. leave ret As you can see, all line numbers are wrong that refer to initialization of inlined function parameters. I think I can hardly screw this up MORE then already is the case, but maybe I am overlooking a construction that WOULD be screwed up changing this. For the given test case, the most important thing to get right imho are the call's (to h()). These calls leave their return address on the stack and are therefore a direct target for debuggers to be resolved to a source location; this needs to be fixed. The actual assignment to the function parameters is less important I think: in order to get that right you'd need to "swap" between line numbers too often, only increasing the the ammount of debug info rather unnecessarily imho. All of this (the part that I intend to fix the debug line number information for) is emitted by a single call to expand_expr() in expr.c, for an expression of type 'expr_with_file_location' that recursively contains all the initialization (calls) of the function parameters of an inlined function. The problem arises from this recursion: the expr_with_file_location is only a container of everything of the inlined function, including the initialization and any calls that are done as part of the parameter initialization. But, the implementation of this expr_"with_file_location" is that it emits that location at the start. As a result all of the initialization, including any calls needed, are assigned the line number of the prologue of the inlined functions definition - totally incorrect. Proposal for new behaviour -------------------------- My proposal for a new behaviour is that the above assembly code would contain the following line numbers: _Z1fv: .LFB4: .file 1 "troep.c" .loc 1 20 0 // Prologue of f() at line 20 pushl %ebp .LCFI0: movl %esp, %ebp .LCFI1: subl $56, %esp .LCFI2: .LBB2: .LBB3: .loc 1 21 0 // Line 21 call _Z1hv // Call that is actually done at line 21 movl %eax, -12(%ebp) // Assignment to x. .loc 1 12 0 // Line 12 call _Z1hv // Call that is actually done at line 12 movl %eax, -16(%ebp) // Assignment to y. .LBB4: .loc 1 6 0 // Line 6 for inlined i() movl $12345, -24(%ebp) // Assignment to v. .loc 1 13 // Line 13 call _Z1hv // Call that is actually done at line 13 movl %eax, -28(%ebp) // Assignment to w. .LBB5: .loc 1 7 0 // Actually inlined code of i(), at line 7. movl -28(%ebp), %eax // > addl -24(%ebp), %eax // > a += v + w addl %eax, a // > .LBE5: .LBE4: .loc 1 8 0 // Prologue of i() actually at line 8. movl -32(%ebp), %eax // Undefined return value of i() movl %eax, -20(%ebp) // Assignment to z, actually at line 13 .LBB6: .LBB7: .loc 1 15 0 // Line 15 movl $2, -56(%ebp) // int q[4] = { 2, 3, 4, 5 }; movl $3, -52(%ebp) movl $5, -48(%ebp) movl $7, -44(%ebp) .loc 1 16 0 // Line 16: a += x + y + z + (int)&q; movl -16(%ebp), %eax // y addl -12(%ebp), %eax // x movl %eax, %edx // x + y addl -20(%ebp), %edx // + z leal -56(%ebp), %eax // (int)&q leal (%edx,%eax), %eax // a += addl %eax, a .LBE7: .LBE6: .LBE3: .LBE2: .loc 1 22 0 // Prologue of f() at line 22. leave ret In other words: all of the call's correct, and the assignments to function parameters rather fuzzy. Comments please? -- Carlo Wood