* [Bug default/25199] New: Drop variable declarations if definition is present
@ 2019-01-01  0:00 vries at gcc dot gnu.org
From: vries at gcc dot gnu.org @ 2019-01-01  0:00 UTC
  To: dwz

https://sourceware.org/bugzilla/show_bug.cgi?id=25199

            Bug ID: 25199
           Summary: Drop variable declarations if definition is present
           Product: dwz
           Version: unspecified
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: default
          Assignee: nobody at sourceware dot org
          Reporter: vries at gcc dot gnu.org
                CC: dwz at sourceware dot org
  Target Milestone: ---

I. 

Consider the following test-case:
...
$ cat test.h
extern int var;
extern void foo (void);
$ cat test.c
#include "test.h"

int
main (void)
{
  var = 2;
  foo ();
  return 0;
}
$ cat test2.c
#include "test.h"

int var;

void
foo (void)
{
  var = 3;
}
...

When compiled with gcc and debug info:
...
$ gcc test.c test2.c -g
...

We end up with two declarations, at 0xf4 and 0x151, as well as a definition
at 0x163:
...
 <0><d2>: Abbrev Number: 1 (DW_TAG_compile_unit)
    <d8>   DW_AT_name        : (indirect string, offset: 0x1dc): test.c
 <1><f4>: Abbrev Number: 2 (DW_TAG_variable)
    <f5>   DW_AT_name        : var
    <f9>   DW_AT_decl_file   : 2
    <fa>   DW_AT_decl_line   : 1
    <fb>   DW_AT_type        : <0xff>
    <ff>   DW_AT_external    : 1
    <ff>   DW_AT_declaration : 1
 <0><12f>: Abbrev Number: 1 (DW_TAG_compile_unit)
    <135>   DW_AT_name        : (indirect string, offset: 0x23e): test2.c
 <1><151>: Abbrev Number: 2 (DW_TAG_variable)
    <152>   DW_AT_name        : var
    <156>   DW_AT_decl_file   : 2
    <157>   DW_AT_decl_line   : 1
    <158>   DW_AT_type        : <0x15c>
    <15c>   DW_AT_external    : 1
    <15c>   DW_AT_declaration : 1
 <1><163>: Abbrev Number: 4 (DW_TAG_variable)
    <164>   DW_AT_specification: <0x151>
    <168>   DW_AT_decl_file   : 1
    <169>   DW_AT_decl_line   : 3
    <16a>   DW_AT_location    : 9 byte block: 3 2c 10 60 0 0 0 0 0 (DW_OP_addr: 60102c)
...

II.

Using dwz (forcing the transformation with --devel-ignore-size):
...
$ dwz a.out -o 1 --devel-ignore-size
...
we get the declarations duplicate-eliminated and moved into a partial unit
0x51, which itself is imported by a partial unit 0x88, which is imported into
both compilation units, allowing declaration 0x5a to be used by definition
0x1ae:
...
 <0><51>: Abbrev Number: 2 (DW_TAG_partial_unit)
 <1><5a>: Abbrev Number: 1 (DW_TAG_variable)
    <5b>   DW_AT_name        : var
    <5f>   DW_AT_decl_file   : 2
    <60>   DW_AT_decl_line   : 1
    <61>   DW_AT_type        : <0x14>
    <65>   DW_AT_external    : 1
    <65>   DW_AT_declaration : 1
 <0><88>: Abbrev Number: 32 (DW_TAG_partial_unit)
 <1><8e>: Abbrev Number: 5 (DW_TAG_imported_unit)
    <8f>   DW_AT_import      : <0x51>   [Abbrev Number: 2]
 <0><14c>: Abbrev Number: 17 (DW_TAG_compile_unit)
    <152>   DW_AT_name        : (indirect string, offset: 0x1dc): test.c
 <1><167>: Abbrev Number: 5 (DW_TAG_imported_unit)
    <168>   DW_AT_import      : <0x88>  [Abbrev Number: 32]
 <0><18e>: Abbrev Number: 17 (DW_TAG_compile_unit)
    <194>   DW_AT_name        : (indirect string, offset: 0x23e): test2.c
 <1><1a9>: Abbrev Number: 5 (DW_TAG_imported_unit)
    <1aa>   DW_AT_import      : <0x88>  [Abbrev Number: 32]
 <1><1ae>: Abbrev Number: 29 (DW_TAG_variable)
    <1af>   DW_AT_specification: <0x5a>
    <1b3>   DW_AT_decl_file   : 1
    <1b4>   DW_AT_decl_line   : 3
    <1b5>   DW_AT_location    : 9 byte block: 3 2c 10 60 0 0 0 0 0 (DW_OP_addr: 60102c)
...

III.

When compiled with clang and debug info:
...
$ clang test.c test2.c -g -o b.out
...

we'll just get one definition, and no declarations:
...
 <0><11d>: Abbrev Number: 1 (DW_TAG_compile_unit)
    <124>   DW_AT_name        : (indirect string, offset: 0x21c): test2.c
 <1><13c>: Abbrev Number: 2 (DW_TAG_variable)
    <13d>   DW_AT_name        : (indirect string, offset: 0x224): var
    <141>   DW_AT_type        : <0x151>
    <145>   DW_AT_external    : 1
    <145>   DW_AT_decl_file   : 1
    <146>   DW_AT_decl_line   : 3
    <147>   DW_AT_location    : 9 byte block: 3 2c 10 60 0 0 0 0 0 (DW_OP_addr: 60102c)
...

IV.

The difference is gdb-visible.  With clang we only get:
...
$ gdb b.out -batch -ex "l var"
1       #include "test.h"
2
3       int var;
4
5       void
6       foo (void)
7       {
8         var = 3;
9       }
...

whereas with gcc we get, depending on the context, either just the declaration,
or the declaration and the definition:
...
$ gdb a.out -batch -ex "l var" 
1       extern int var;
2       extern void foo (void);
$ gdb a.out -batch -ex "b foo" -ex r -ex "l var" 
Breakpoint 1 at 0x4004b5: file test2.c, line 8.

Breakpoint 1, foo () at test2.c:8
8         var = 3;
file: "test.h", line number: 1, symbol: "var"
1       extern int var;
2       extern void foo (void);
file: "test2.c", line number: 3, symbol: "var"
1       #include "test.h"
2
3       int var;
4
5       void
6       foo (void)
7       {
8         var = 3;
9       }
...

While the gcc output is a more accurate representation of the sources, the
question is, how useful is it to show:
- the declaration instead of the definition.  It might actually be more useful
  to show the definition, since that might also show the initializer.
- the declaration in addition to the definition.

V.

So, why is the declaration emitted by gcc, and not by clang? Is there a
positive side-effect from emitting the declaration?

There is indeed. Say we compile without debug info for test2.c:
...
$ gcc test.c -g -c
$ gcc test2.c -c
$ gcc test.o test2.o -g -o c.out
...

There will be no debug info describing the definition of var, but gdb
combines the declaration of var (to get the type) with the minimal symbol
info (to get the location), and manages to print the value of the variable:
...
$ gdb c.out -batch -ex "p var"
$1 = 0
...

Doing the same with clang gives us:
...
$ clang test.c -g -c
$ clang test2.c -c
$ clang test.o test2.o -g -o d.out
$ gdb d.out -batch -ex "p var"
'var' has unknown type; cast it to its declared type
...

VI.

So, there's a reason why gcc emits declarations in the debug info of objects,
but if the corresponding definition is present in the debug info of the
executable, it seems unnecessary to hold on to the declarations.

So, a simple way to optimize this is to just drop the unneeded declaration
(the one at 0xf4), and drop the source location info (DW_AT_decl_file and
DW_AT_decl_line) from the remaining declaration:
...
 <0><d2>: Abbrev Number: 1 (DW_TAG_compile_unit)
    <d8>   DW_AT_name        : (indirect string, offset: 0x1dc): test.c
 <0><12f>: Abbrev Number: 1 (DW_TAG_compile_unit)
    <135>   DW_AT_name        : (indirect string, offset: 0x23e): test2.c
 <1><151>: Abbrev Number: 2 (DW_TAG_variable)
    <152>   DW_AT_name        : var
    <158>   DW_AT_type        : <0x15c>
    <15c>   DW_AT_external    : 1
    <15c>   DW_AT_declaration : 1
 <1><163>: Abbrev Number: 4 (DW_TAG_variable)
    <164>   DW_AT_specification: <0x151>
    <168>   DW_AT_decl_file   : 1
    <169>   DW_AT_decl_line   : 3
    <16a>   DW_AT_location    : 9 byte block: 3 2c 10 60 0 0 0 0 0 (DW_OP_addr: 60102c)
...
which does not require creation of a partial unit and import statements as in
II.
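
The first step above can be sketched roughly as follows.  This is only an
illustration: struct var_die and prune_declarations are hypothetical names,
not dwz's actual data structures, and the sketch assumes all DIEs are
file-scope variables and that nothing else refers to the dropped declarations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of a DW_TAG_variable DIE (not dwz's own).  */
struct var_die
{
  unsigned offset;        /* DIE offset in .debug_info.  */
  const char *name;       /* DW_AT_name, NULL if absent.  */
  bool external;          /* DW_AT_external.  */
  bool declaration;       /* DW_AT_declaration.  */
  unsigned specification; /* DW_AT_specification target, 0 if absent.  */
  bool has_decl_coords;   /* DW_AT_decl_file/DW_AT_decl_line present.  */
  bool drop;              /* Output: this DIE can be removed.  */
};

/* Return the DIE at OFFSET, or NULL if there is none.  */
static struct var_die *
find_die (struct var_die *dies, size_t n, unsigned offset)
{
  for (size_t i = 0; i < n; i++)
    if (dies[i].offset == offset)
      return &dies[i];
  return NULL;
}

/* For each definition that refers to a declaration via DW_AT_specification,
   keep that declaration (minus its decl_file/decl_line) and mark every other
   external declaration with the same name for removal.
   Returns the number of DIEs marked.  */
size_t
prune_declarations (struct var_die *dies, size_t n)
{
  size_t dropped = 0;
  assert (dies != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    {
      struct var_die *def = &dies[i];
      if (def->declaration || def->specification == 0)
        continue;
      struct var_die *kept = find_die (dies, n, def->specification);
      if (kept == NULL || kept->name == NULL)
        continue;
      /* The definition carries its own decl_file/decl_line.  */
      kept->has_decl_coords = false;
      for (size_t j = 0; j < n; j++)
        {
          struct var_die *d = &dies[j];
          if (d != kept && !d->drop && d->declaration && d->external
              && d->name != NULL && strcmp (d->name, kept->name) == 0)
            {
              d->drop = true;
              dropped++;
            }
        }
    }
  return dropped;
}
```

Fed the three variable DIEs from I. (0xf4, 0x151 and 0x163), this marks only
0xf4 for removal and clears the decl_file/decl_line flag on 0x151.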

A further step could be to merge the definition and the remaining declaration
into just a definition:
...
 <0><d2>: Abbrev Number: 1 (DW_TAG_compile_unit)
    <d8>   DW_AT_name        : (indirect string, offset: 0x1dc): test.c
 <0><12f>: Abbrev Number: 1 (DW_TAG_compile_unit)
    <135>   DW_AT_name        : (indirect string, offset: 0x23e): test2.c
 <1><163>: Abbrev Number: 4 (DW_TAG_variable)
    <152>   DW_AT_name        : var
    <158>   DW_AT_type        : <0x15c>
    <15c>   DW_AT_external    : 1
    <168>   DW_AT_decl_file   : 1
    <169>   DW_AT_decl_line   : 3
    <16a>   DW_AT_location    : 9 byte block: 3 2c 10 60 0 0 0 0 0 (DW_OP_addr: 60102c)
...
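
The merge step could look something like the sketch below.  Again, struct
var_attrs and merge_definition are made-up names for illustration, and the
sketch assumes the declaration is referenced only by this one definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flattened attribute set of a DW_TAG_variable DIE.  */
struct var_attrs
{
  const char *name;       /* DW_AT_name, NULL if absent.  */
  unsigned type;          /* DW_AT_type reference, 0 if absent.  */
  bool external;          /* DW_AT_external.  */
  bool declaration;       /* DW_AT_declaration.  */
  unsigned specification; /* DW_AT_specification target, 0 if absent.  */
  unsigned decl_file;     /* DW_AT_decl_file, 0 if absent.  */
  unsigned decl_line;     /* DW_AT_decl_line, 0 if absent.  */
  bool has_location;      /* DW_AT_location present.  */
  unsigned long location; /* Operand of DW_OP_addr.  */
};

/* Fold declaration DECL into the definition DEF that refers to it through
   DW_AT_specification, and return the resulting stand-alone definition.  */
struct var_attrs
merge_definition (const struct var_attrs *decl, const struct var_attrs *def)
{
  assert (decl != NULL && def != NULL);
  assert (decl->declaration && !def->declaration);

  /* Start from the definition: it owns decl_file, decl_line and location.  */
  struct var_attrs merged = *def;

  /* Pull in what the definition only had indirectly via the declaration.  */
  if (merged.name == NULL)
    merged.name = decl->name;
  if (merged.type == 0)
    merged.type = decl->type;
  merged.external = def->external || decl->external;

  /* The declaration goes away, so the reference to it must go as well.  */
  merged.specification = 0;
  return merged;
}
```

For the declaration at 0x151 and the definition at 0x163, the result has the
name, type and external flag of the former, and the decl_file, decl_line and
location of the latter, matching the dump above.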

VII.

As a corner case, consider a test-case like this, where a variable is
declared with different types in different CUs:
...
$ cat t.h
extern KIND var;
$ cat t.c
#define KIND int
#include "t.h"
extern void foo (void);
int
main (void)
{
  foo ();
  return 0;
}
$ cat t2.c
#define KIND float
#include "t.h"

KIND var;

void
foo (void)
{
  var = 1.1;
}
...

When compiling with clang, we'll only ever see an int var in gdb.

But when compiling with gcc, we'll either see an int var or a float var in gdb.

The proposed optimization could be conservative, and only take effect if the
declaration and definition have the same type.

Or it might be aggressive, and just drop the declarations regardless of the
type, reasoning that this type of code is a violation of aliasing rules (though
that would probably have to be limited to C99+ languages).
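
The two policies can be written down as a small predicate.  This is a sketch
under assumptions: enum drop_policy, struct var_info and may_drop_declaration
are hypothetical names, and type_id stands for some canonical type identity
that is comparable across CUs (raw DW_AT_type offsets are not).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* The two policies discussed above.  */
enum drop_policy
{
  DROP_CONSERVATIVE, /* Only drop if the types match.  */
  DROP_AGGRESSIVE    /* Drop regardless of the type.  */
};

/* Hypothetical summary of a variable DIE.  */
struct var_info
{
  const char *name; /* DW_AT_name.  */
  unsigned type_id; /* Assumed canonical type identity, comparable across CUs.  */
};

/* Return true if declaration DECL may be dropped in favour of definition
   DEF under POLICY.  */
bool
may_drop_declaration (const struct var_info *decl,
                      const struct var_info *def,
                      enum drop_policy policy)
{
  assert (decl != NULL && def != NULL);
  assert (decl->name != NULL && def->name != NULL);

  /* A declaration of some other variable is never affected.  */
  if (strcmp (decl->name, def->name) != 0)
    return false;
  if (policy == DROP_AGGRESSIVE)
    return true;
  return decl->type_id == def->type_id;
}
```

With the t.c/t2.c example, the int declaration and the float definition get
different type ids, so the conservative policy keeps the declaration and the
aggressive one drops it.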
