public inbox for gdb-prs@sourceware.org
* [Bug symtab/25755] New: Means to not keep decls in symtab
@ 2020-04-01  8:13 vries at gcc dot gnu.org
  2020-04-01 13:29 ` [Bug symtab/25755] " vries at gcc dot gnu.org
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: vries at gcc dot gnu.org @ 2020-04-01  8:13 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=25755

            Bug ID: 25755
           Summary: Means to not keep decls in symtab
           Product: gdb
           Version: HEAD
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: symtab
          Assignee: unassigned at sourceware dot org
          Reporter: vries at gcc dot gnu.org
  Target Milestone: ---

I.

Consider the following test-case using source files test.c:
...
extern int aaa;

int
main (void)
{
  return 0;
}
...
and test2.c:
...
int aaa = 33;
...

If we compile with debug info, we can print the value of aaa using the proper
type with both gdb and lldb:
...
$ gcc test.c test2.c -g
$ gdb -batch a.out -ex "print aaa"
$1 = 33
$ lldb -batch a.out -o "print aaa"
(lldb) target create "a.out"
Current executable set to 'a.out' (x86_64).
(lldb) print aaa
(int) $0 = 33
...


II.

If we compile without debug info, we can print the value with gdb provided we
cast to the proper type:
...
$ gcc test.c test2.c
$ gdb -batch a.out -ex "print aaa"
'aaa' has unknown type; cast it to its declared type
$ gdb -batch a.out -ex "print (int)aaa"
$1 = 33
...
and with lldb we get a typeless value:
...
$ lldb -batch a.out -o "print aaa"
(lldb) target create "a.out"
Current executable set to 'a.out' (x86_64).
(lldb) print aaa
(void *) $0 = 0x0000000000000021
...
and can also cast it to a type (which seems to require long int rather than
int):
...
$ lldb -batch a.out -o "print (int)aaa"
(lldb) target create "a.out"
Current executable set to 'a.out' (x86_64).
(lldb) print (int)aaa
error: warning: got name from symbols: aaa
error: cast from pointer to smaller type 'int' loses information
$ lldb -batch a.out -o "print (long int)aaa"
(lldb) target create "a.out"
Current executable set to 'a.out' (x86_64).
(lldb) print (long int)aaa
(long) $0 = 33
...


III.

Now consider compiling with debug info only for test.c:
...
$ gcc -c test.c -g; gcc -c test2.c; gcc test.o test2.o -g
...

Gdb manages to print the value with its type:
...
$ gdb -batch a.out -ex "print aaa"
$1 = 33
...
while lldb falls back on the typeless print:
...
(lldb) target create "a.out"
Current executable set to 'a.out' (x86_64).
(lldb) print aaa
(void *) $0 = 0x0000000000000021
...

This is a feature of gdb (which apparently at least this version of lldb
doesn't have), where gdb keeps track of variable declarations:
...
Blockvector:

block #000, object at 0x555570721f70, 1 syms/buckets in 0x400497..0x4004a2
 int aaa; unresolved
 int main(void); block object 0x555570721e60, 0x400497..0x4004a2 section .text
  block #001, object at 0x555570721ec0 under 0x555570721f70, 1 syms/buckets in 0x400497..0x4004a2
   typedef int int;
    block #002, object at 0x555570721e60 under 0x555570721ec0, 0 syms/buckets in 0x400497..0x4004a2, function main
...
and combines those with addresses found in minimal symbol info:
...
$ nm a.out | grep aaa
0000000000601028 D aaa
...
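
The combination described in this section can be modelled roughly as follows.
This is a toy sketch in C, not gdb's implementation: every toy_ name is
hypothetical, and the only idea it illustrates is that the type comes from the
declaration while the address comes from the minimal symbol of the same name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for a minimal symbol: a name and an address, as nm prints.  */
struct toy_msymbol
{
  const char *name;
  uint64_t address;
};

/* Toy stand-in for a declaration-only debug symbol: it has a name and a
   type, but no location of its own.  */
struct toy_decl
{
  const char *name;
  const char *type_name;
};

/* Look up NAME in MSYMS (N entries); return its entry, or NULL if absent.  */
static const struct toy_msymbol *
toy_lookup_msymbol (const struct toy_msymbol *msyms, size_t n,
                    const char *name)
{
  for (size_t i = 0; i < n; i++)
    if (strcmp (msyms[i].name, name) == 0)
      return &msyms[i];
  return NULL;
}

/* Resolve DECL by name against MSYMS.  On success store the address in
   *ADDR and return 1; return 0 if no minimal symbol has that name.  */
static int
toy_resolve_decl (const struct toy_decl *decl,
                  const struct toy_msymbol *msyms, size_t n,
                  uint64_t *addr)
{
  assert (decl != NULL && addr != NULL);
  const struct toy_msymbol *msym
    = toy_lookup_msymbol (msyms, n, decl->name);
  if (msym == NULL)
    return 0;
  *addr = msym->address;
  return 1;
}
```

With a decl { "aaa", "int" } and a minimal symbol { "aaa", 0x601028 }, the
sketch yields the address from nm, and the value can then be read using the
declared type.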


IV.

This is a nice feature, but comes with a few issues.

There's a recently fixed issue where a decl using an incomplete type shadows
the def using the complete type (fixed in commit 93e55f0a03 "[gdb/symtab]
Prefer var def over decl"). [ And there's an open issue to fix this better:
PR25260 - "Handle decl before def more robustly"  (
https://sourceware.org/bugzilla/show_bug.cgi?id=25260 ). ]

And there's the open issue PR 24985 - "Cannot print value of global variable
because decl in one CU shadows def in other"  (
https://sourceware.org/bugzilla/show_bug.cgi?id=24985 ).

Furthermore, it costs memory to keep track of the decls, while that is not
always useful.


V.

Consider a simpler test-case, test3.c:
...
extern int aaa;

int aaa;

int
main (void)
{
  return 0;
}
...
compiled with debug info, with an older gcc:
...
$ gcc-4.8 -g test3.c
...

There's just one DIE describing the variable:
...
 <1><118>: Abbrev Number: 4 (DW_TAG_variable)
    <119>   DW_AT_name        : aaa
    <11d>   DW_AT_decl_file   : 1
    <11e>   DW_AT_decl_line   : 3
    <11f>   DW_AT_type        : <0x111>
    <123>   DW_AT_external    : 1
    <123>   DW_AT_location    : 9 byte block: 3 2c 10 60 0 0 0 0 0 (DW_OP_addr: 60102c)
...

But with a more recent gcc (7.5.0), we have a def and a decl:
...
 <1><f4>: Abbrev Number: 2 (DW_TAG_variable)
    <f5>   DW_AT_name        : aaa
    <f9>   DW_AT_decl_file   : 1
    <fa>   DW_AT_decl_line   : 1
    <fb>   DW_AT_type        : <0xff>
    <ff>   DW_AT_external    : 1
    <ff>   DW_AT_declaration : 1
 <1><106>: Abbrev Number: 4 (DW_TAG_variable)
    <107>   DW_AT_specification: <0xf4>
    <10b>   DW_AT_decl_line   : 3
    <10c>   DW_AT_location    : 9 byte block: 3 2c 10 60 0 0 0 0 0 (DW_OP_addr: 60102c)
...

This more accurately describes the source, but gdb makes a symbol for both the
def and the decl:
...
Blockvector:

block #000, object at 0x560017e71f40, 1 syms/buckets in 0x400497..0x4004a2
 int aaa; unresolved
 int aaa; static at 0x60102c section .bss
 int main(void); block object 0x560017e71e30, 0x400497..0x4004a2 section .text
  block #001, object at 0x560017e71e90 under 0x560017e71f40, 1 syms/buckets in 0x400497..0x4004a2
   typedef int int;
    block #002, object at 0x560017e71e30 under 0x560017e71e90, 0 syms/buckets in 0x400497..0x4004a2, function main
...
which is not useful at all.
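
To make the redundancy concrete, here is a toy sketch in C (not gdb code; all
toy_ names and the struct layout are hypothetical). It treats a declaration
DIE as redundant when another DIE in the same list completes it through
DW_AT_specification, which is the situation in the gcc 7.5.0 dump above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the few DIE attributes that matter here.  */
struct toy_die
{
  uint64_t offset;          /* offset of this DIE */
  int is_declaration;       /* nonzero if DW_AT_declaration is present */
  int has_specification;    /* nonzero if DW_AT_specification is present */
  uint64_t specification;   /* offset that DW_AT_specification refers to */
};

/* Return 1 if the DIE at DECL_IDX is a declaration that some other DIE in
   DIES (N entries) refers to via DW_AT_specification, 0 otherwise.  */
static int
toy_decl_is_redundant (const struct toy_die *dies, size_t n,
                       size_t decl_idx)
{
  assert (dies != NULL && decl_idx < n);
  if (!dies[decl_idx].is_declaration)
    return 0;
  for (size_t i = 0; i < n; i++)
    if (i != decl_idx
        && dies[i].has_specification
        && dies[i].specification == dies[decl_idx].offset)
      return 1;
  return 0;
}
```

For the two DIEs above (0xf4 with DW_AT_declaration, 0x106 with
DW_AT_specification pointing at 0xf4), the sketch flags 0xf4 as redundant and
leaves 0x106 alone.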


VI.

This situation is further aggravated by -flto, which for a test-case test4.c:
...
int aaa;

int
main (void)
{
  return 0;
}
... 
compiled like this:
...
$ gcc-8 -O0 test4.c -g -flto -flto-partition=none -ffat-lto-objects
...
generates a def and a decl:
...
 <0><d2>: Abbrev Number: 1 (DW_TAG_compile_unit)
    <d8>   DW_AT_name        : <artificial>
 <1><110>: Abbrev Number: 4 (DW_TAG_variable)
    <111>   DW_AT_abstract_origin: <0x13d>
    <115>   DW_AT_location    : 9 byte block: 3 2c 10 60 0 0 0 0 0 (DW_OP_addr: 60102c)
 <0><12b>: Abbrev Number: 1 (DW_TAG_compile_unit)
    <131>   DW_AT_name        : test4.c
 <1><13d>: Abbrev Number: 2 (DW_TAG_variable)
    <13e>   DW_AT_name        : aaa
    <142>   DW_AT_decl_file   : 1
    <143>   DW_AT_decl_line   : 1
    <144>   DW_AT_decl_column : 5
    <145>   DW_AT_type        : <0x149>
    <149>   DW_AT_external    : 1
...
even though there's no separate decl in the file, and gdb again keeps two
entries in the symbol tables:
...
Symtab for file test4.c

Blockvector:

block #000, object at 0x555e7afafb70, 1 syms/buckets in 0x0..0x0
 int aaa; unresolved
  block #001, object at 0x555e7afafac0 under 0x555e7afafb70, 1 syms/buckets in 0x0..0x0
   typedef int int; 


Symtab for file <artificial>

Blockvector:

block #000, object at 0x555e7afaf7d0, 1 syms/buckets in 0x400492..0x40049e
 int main(void); block object 0x555e7afaf6c0, 0x400492..0x40049e section .text
 int aaa; static at 0x60102c section .bss
  block #001, object at 0x555e7afaf770 under 0x555e7afaf7d0, 0 syms/buckets in 0x400492..0x40049e
    block #002, object at 0x555e7afaf6c0 under 0x555e7afaf770, 0 syms/buckets in 0x400492..0x40049e, function main
...


VII.

To conclude: the feature has some known issues and costs memory, and that cost
may get worse with recent gcc and LTO executables. It would therefore be good
to have a means to switch off the feature (by not keeping the decls in the
symbol table), say:
...
(gdb) maint set symbol-store-decls off
...

This would allow us to:
- more easily identify problems related to the feature
- work around such problems
- assess memory impact of feature
- more fairly compare memory usage with lldb versions that do not
  support this feature.
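
As a rough illustration of what such a switch would do, here is a toy model in
C. It is not gdb code: toy_store_decls is a hypothetical stand-in for the
proposed setting, and the symbol table is reduced to a flat array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the proposed setting: 1 keeps decls (the
   current behaviour), 0 drops them.  */
static int toy_store_decls = 1;

/* Toy symbol: a name plus a flag saying whether it is only a decl.  */
struct toy_symbol
{
  const char *name;
  int is_declaration;
};

/* Copy the N symbols in IN to OUT, skipping declarations when the setting
   is off.  OUT must have room for N entries.  Return the number stored.  */
static size_t
toy_build_symtab (const struct toy_symbol *in, size_t n,
                  struct toy_symbol *out)
{
  assert (in != NULL && out != NULL);
  size_t stored = 0;
  for (size_t i = 0; i < n; i++)
    {
      if (in[i].is_declaration && !toy_store_decls)
        continue;
      out[stored++] = in[i];
    }
  return stored;
}
```

Running the same input with the flag on and off gives the two symbol counts
to compare, which is the kind of measurement the list above asks for.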

-- 
You are receiving this mail because:
You are on the CC list for the bug.


* [Bug symtab/25755] Means to not keep decls in symtab
  2020-04-01  8:13 [Bug symtab/25755] New: Means to not keep decls in symtab vries at gcc dot gnu.org
@ 2020-04-01 13:29 ` vries at gcc dot gnu.org
  2020-04-02 12:01 ` vries at gcc dot gnu.org
  2020-04-02 23:14 ` vries at gcc dot gnu.org
  2 siblings, 0 replies; 4+ messages in thread
From: vries at gcc dot gnu.org @ 2020-04-01 13:29 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=25755

--- Comment #1 from Tom de Vries <vries at gcc dot gnu.org> ---
(In reply to Tom de Vries from comment #0)

> V.
> 
> Consider a simpler test-case, test3.c:
> ...
> extern int aaa;
> 
> int aaa;
> 
> int
> main (void)
> {
>   return 0;
> }
> ...
> compiled with debug info, with an older gcc:
> ...
> $ gcc-4.8 -g test3.c
> ...
> 
> There's just one DIE describing the variable:
> ...
>  <1><118>: Abbrev Number: 4 (DW_TAG_variable)
>     <119>   DW_AT_name        : aaa
>     <11d>   DW_AT_decl_file   : 1
>     <11e>   DW_AT_decl_line   : 3
>     <11f>   DW_AT_type        : <0x111>
>     <123>   DW_AT_external    : 1
>     <123>   DW_AT_location    : 9 byte block: 3 2c 10 60 0 0 0 0 0     
> (DW_OP_addr: 60102c)
> ...
> 
> But with a more recent gcc (7.5.0), we have a def and a decl:
> ...
>  <1><f4>: Abbrev Number: 2 (DW_TAG_variable)
>     <f5>   DW_AT_name        : aaa
>     <f9>   DW_AT_decl_file   : 1
>     <fa>   DW_AT_decl_line   : 1
>     <fb>   DW_AT_type        : <0xff>
>     <ff>   DW_AT_external    : 1
>     <ff>   DW_AT_declaration : 1
>  <1><106>: Abbrev Number: 4 (DW_TAG_variable)
>     <107>   DW_AT_specification: <0xf4>
>     <10b>   DW_AT_decl_line   : 3
>     <10c>   DW_AT_location    : 9 byte block: 3 2c 10 60 0 0 0 0 0     
> (DW_OP_addr: 60102c)
> ...
> 
> This more accurately describes the source, but gdb makes a symbol for both
> the def and the decl:
> ...
> Blockvector:
> 
> block #000, object at 0x560017e71f40, 1 syms/buckets in 0x400497..0x4004a2
>  int aaa; unresolved
>  int aaa; static at 0x60102c section .bss
>  int main(void); block object 0x560017e71e30, 0x400497..0x4004a2 section
> .text
>   block #001, object at 0x560017e71e90 under 0x560017e71f40, 1 syms/buckets
> in 0x400497..0x4004a2
>    typedef int int; 
>     block #002, object at 0x560017e71e30 under 0x560017e71e90, 0
> syms/buckets in 0x400497..0x4004a2, function main
> ...
> which is not useful at all.
> 
> 
> VI.
> 
> This situation is further aggravated by -flto, which for a test-case test4.c:
> ...
> int aaa;
> 
> int
> main (void)
> {
>   return 0;
> }
> ... 
> compiled like this:
> ...
> $ gcc-8 -O0 test4.c -g -flto -flto-partition=none -ffat-lto-objects
> ...
> generates a def and a decl:
> ...
>  <0><d2>: Abbrev Number: 1 (DW_TAG_compile_unit)
>     <d8>   DW_AT_name        : <artificial>
>  <1><110>: Abbrev Number: 4 (DW_TAG_variable)
>     <111>   DW_AT_abstract_origin: <0x13d>
>     <115>   DW_AT_location    : 9 byte block: 3 2c 10 60 0 0 0 0 0     
> (DW_OP_addr: 60102c)
>  <0><12b>: Abbrev Number: 1 (DW_TAG_compile_unit)
>     <131>   DW_AT_name        : test4.c
>  <1><13d>: Abbrev Number: 2 (DW_TAG_variable)
>     <13e>   DW_AT_name        : aaa
>     <142>   DW_AT_decl_file   : 1
>     <143>   DW_AT_decl_line   : 1
>     <144>   DW_AT_decl_column : 5
>     <145>   DW_AT_type        : <0x149>
>     <149>   DW_AT_external    : 1
> ...
> even though there's no separate decl in the file, and gdb again keeps two
> entries in the symbol tables:
> ...
> Symtab for file test4.c
> 
> Blockvector:
> 
> block #000, object at 0x555e7afafb70, 1 syms/buckets in 0x0..0x0
>  int aaa; unresolved
>   block #001, object at 0x555e7afafac0 under 0x555e7afafb70, 1 syms/buckets
> in 0x0..0x0
>    typedef int int; 
> 
> 
> Symtab for file <artificial>
> 
> Blockvector:
> 
> block #000, object at 0x555e7afaf7d0, 1 syms/buckets in 0x400492..0x40049e
>  int main(void); block object 0x555e7afaf6c0, 0x400492..0x40049e section
> .text
>  int aaa; static at 0x60102c section .bss
>   block #001, object at 0x555e7afaf770 under 0x555e7afaf7d0, 0 syms/buckets
> in 0x400492..0x40049e
>     block #002, object at 0x555e7afaf6c0 under 0x555e7afaf770, 0
> syms/buckets in 0x400492..0x40049e, function main
> ...
> 

I've filed a PR to ignore these useless symbols: PR25759 - "Remove useless
decls from symtab".


* [Bug symtab/25755] Means to not keep decls in symtab
  2020-04-01  8:13 [Bug symtab/25755] New: Means to not keep decls in symtab vries at gcc dot gnu.org
  2020-04-01 13:29 ` [Bug symtab/25755] " vries at gcc dot gnu.org
@ 2020-04-02 12:01 ` vries at gcc dot gnu.org
  2020-04-02 23:14 ` vries at gcc dot gnu.org
  2 siblings, 0 replies; 4+ messages in thread
From: vries at gcc dot gnu.org @ 2020-04-02 12:01 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=25755

--- Comment #2 from Tom de Vries <vries at gcc dot gnu.org> ---
Yet another issue related to this feature: PR25764 - "LOC_UNRESOLVED symbol
missing from partial symtab".

I'm starting to wonder if switching this feature off by default would be a
large inconvenience for gdb users.


* [Bug symtab/25755] Means to not keep decls in symtab
  2020-04-01  8:13 [Bug symtab/25755] New: Means to not keep decls in symtab vries at gcc dot gnu.org
  2020-04-01 13:29 ` [Bug symtab/25755] " vries at gcc dot gnu.org
  2020-04-02 12:01 ` vries at gcc dot gnu.org
@ 2020-04-02 23:14 ` vries at gcc dot gnu.org
  2 siblings, 0 replies; 4+ messages in thread
From: vries at gcc dot gnu.org @ 2020-04-02 23:14 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=25755

--- Comment #3 from Tom de Vries <vries at gcc dot gnu.org> ---
I tried out this patch and ran the testsuite:
...
diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index f94c66b4f1..3d13e00554 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -20267,6 +20267,8 @@ new_symbol (struct die_info *die, struct type *type, struct dwarf2_cu *cu,
                       ? cu->get_builder ()->get_global_symbols ()
                       : cu->list_in_scope);

+                 suppress_add = 1;
+
                  SYMBOL_ACLASS_INDEX (sym) = LOC_UNRESOLVED;
                }
              else if (!die_is_declaration (die, cu))
...

These were the new fails:
...
FAIL: gdb.base/symbol-alias.exp: p g_var_s_alias
FAIL: gdb.dwarf2/dw2-bad-unresolved.exp: ptype var
FAIL: gdb.dwarf2/dw2-bad-unresolved.exp: print var
FAIL: gdb.dwarf2/dw2-cu-size.exp: ptype noloc
FAIL: gdb.dwarf2/dw2-linkage-name-trust.exp: p c.membername
FAIL: gdb.dwarf2/dw2-linkage-name-trust.exp: p c.membername ()
FAIL: gdb.dwarf2/dw2-noloc.exp: no-run: print file_extern_locno_resolvable
FAIL: gdb.dwarf2/dw2-noloc.exp: no-run: ptype file_extern_locno_resolvable
FAIL: gdb.dwarf2/dw2-noloc.exp: in-main: print file_extern_locno_resolvable
FAIL: gdb.dwarf2/dw2-noloc.exp: in-main: ptype file_extern_locno_resolvable
FAIL: gdb.dwarf2/dw2-noloc.exp: print main_extern_locno_resolvable
FAIL: gdb.dwarf2/dw2-noloc.exp: ptype main_extern_locno_resolvable
FAIL: gdb.dwarf2/dw2-unresolved.exp: print/d var
FAIL: gdb.dwarf2/opaque-type-lookup.exp: p variable_a
FAIL: gdb.dwarf2/opaque-type-lookup.exp: p variable_b
...
