public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* Garbage collection bugs
@ 2019-01-09  9:46 Joern Wolfgang Rennecke
  2019-01-09 10:46 ` Arseny Solokha
  2019-01-09 11:48 ` Richard Biener
  0 siblings, 2 replies; 5+ messages in thread
From: Joern Wolfgang Rennecke @ 2019-01-09  9:46 UTC (permalink / raw)
  To: gcc@gcc.gnu.org >> GCC Development

[-- Attachment #1: Type: text/plain, Size: 8538 bytes --]

We've been running builds/regression tests for GCC 8.2 configured with 
--enable-checking=all, and have observed some failures related to 
garbage collection.

First problem:

The g++.dg/pr85039-2.C tests (I've looked in detail at -std=c++98, but 
-std=c++11 and -std=c++14 appear to follow the same pattern) see gcc 
garbage-collecting a live vector.  A subsequent access to the vector 
with vec_quick_push causes a segmentation fault, as m_vecpfx.m_num is 
0xa5a5a5a5 . The vec data is also being freed / poisoned. The vector in 
question is an auto-variable of cp_parser_parenthesized_expression_list, 
which is declared as: vec<tree, va_gc> *expression_list;

According to doc/gty/texi: "you should reference all your data from 
static or external @code{GTY}-ed variables, and it is advised to call 
@code{ggc_collect} with a shallow call stack."

In this case, cgraph_node::finalize_function calls the garage collector,
as we are finishing a member function of a struct. gdb shows a backtrace 
of 34 frames, which is not really much as far as C++ parsing goes. The 
caller of finalize_function is expand_or_defer_fn, which uses the 
expression "function_depth > 1" to compute the no_collect paramter to 
finalize_function.
cp_parser_parenthesized_expression_list is in frame 21 of the backtrace 
at this point.
So, if we consider this shallow, cp_parser_parenthesized_expression_list 
either has to refrain from using a vector with garbage-collected 
allocation, or it has to make the pointer reachable from a GC root - at 
least if function_depth <= 1.
Is the attached patch the right approach?

When looking at regression test results for gcc version 9.0.0 20181028 
(experimental), the excess errors test for g++.dg/pr85039-2.C seems to 
pass, yet I can see no definite reason in the source code why that is 
so.  I tried running the test by hand in order to check if maybe the 
patch for PR c++/84455 plays a role,
but running the test by hand, it crashes again, and gdb shows the 
telltale a5 pattern in a pointer register.
#0  vec<tree_node*, va_gc, vl_embed>::quick_push (obj=<optimized out>,
     this=0x7ffff05ece60)
     at 
/data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/vec.h:974
#1  vec_safe_push<tree_node*, va_gc> (obj=<optimized out>,
     v=@0x7fffffffd038: 0x7ffff05ece60)
     at 
/data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/vec.h:766
#2  cp_parser_parenthesized_expression_list (
     parser=parser@entry=0x7ffff7ff83f0,
     is_attribute_list=is_attribute_list@entry=0, cast_p=cast_p@entry=false,
     allow_expansion_p=allow_expansion_p@entry=true,
     non_constant_p=non_constant_p@entry=0x7fffffffd103,
     close_paren_loc=close_paren_loc@entry=0x0, wrap_locations_p=false)
     at 
/data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:7803
#3  0x00000000006e910d in cp_parser_initializer (
     parser=parser@entry=0x7ffff7ff83f0,
     is_direct_init=is_direct_init@entry=0x7fffffffd102,
     non_constant_p=non_constant_p@entry=0x7fffffffd103,
     subexpression_p=subexpression_p@entry=false)
     at 
/data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:22009
#4  0x000000000070954e in cp_parser_init_declarator (
     parser=parser@entry=0x7ffff7ff83f0,
     decl_specifiers=decl_specifiers@entry=0x7fffffffd1c0,
     checks=checks@entry=0x0,
function_definition_allowed_p=function_definition_allowed_p@entry=true,
     member_p=member_p@entry=false, declares_class_or_enum=<optimized out>,
     function_definition_p=0x7fffffffd250, maybe_range_for_decl=0x0,
     init_loc=0x7fffffffd1ac, auto_result=0x7fffffffd2e0)
     at 
/data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:19827
#5  0x0000000000711c5d in cp_parser_simple_declaration 
(parser=0x7ffff7ff83f0,
     function_definition_allowed_p=<optimized out>, 
maybe_range_for_decl=0x0)
     at 
/data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:13179
#6  0x0000000000717bb5 in cp_parser_declaration (parser=0x7ffff7ff83f0)
     at 
/data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:12876
#7  0x000000000071837d in cp_parser_translation_unit (parser=0x7ffff7ff83f0)
     at 
/data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:4631
#8  c_parse_file ()
     at 
/data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:39108
#9  0x0000000000868db1 in c_common_parse_file ()
     at 
/data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/c-family/c-opts.c:1150
#10 0x0000000000e0aaaf in compile_file ()
     at 
/data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/toplev.c:455
#11 0x000000000059248a in do_compile ()
     at 
/data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/toplev.c:2172
#12 toplev::main (this=this@entry=0x7fffffffd54e, argc=<optimized out>,

     argc@entry=100, argv=<optimized out>, argv@entry=0x7fffffffd648)
     at 
/data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/toplev.c:2307
#13 0x0000000000594b5b in main (argc=100, argv=0x7fffffffd648)
     at 
/data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/main.c:39
(gdb) info reg rdx
rdx            0xa5a5a5a5       2779096485

    0x00000000006e84eb 
<cp_parser_parenthesized_expression_list(cp_parser*, int, bool, bool, 
bool*, location_t*, bool)+283>:     je     0x6e8608 
<cp_parser_parenthesized_expression_list(cp_parser*, int, bool, bool, 
bool*, location_t*, bool)+568>
    0x00000000006e84f1 
<cp_parser_parenthesized_expression_list(cp_parser*, int, bool, bool, 
bool*, location_t*, bool)+289>:     lea    0x1(%rdx),%esi
    0x00000000006e84f4 
<cp_parser_parenthesized_expression_list(cp_parser*, int, bool, bool, 
bool*, location_t*, bool)+292>:     mov    %esi,0x4(%rax)
=> 0x00000000006e84f7 
<cp_parser_parenthesized_expression_list(cp_parser*, int, bool, bool, 
bool*, location_t*, bool)+295>:     mov %rcx,0x8(%rax,%rdx,8)
    0x00000000006e84fc 
<cp_parser_parenthesized_expression_list(cp_parser*, int, bool, bool, 
bool*, location_t*, bool)+300>:     cmp %rcx,0x16cd67d(%rip)        # 
0x1db5b80 <global_trees>


Second problem:

We see: FAIL: gcc.dg/plugin/start_unit-test-1.c 
-fplugin=./start_unit_plugin.so (test for excess errors)
Excess errors:
fake_var not a VAR_DECL

The message comes from plugin; the source code is in 
testsuite/gcc.dg/plugin/start_unit_plugin.c . The plugin holds a tree 
value in a static variable, yet fails to make it referenced by any 
garbage collection root tables.
When configuring with --enable-checking=all, a garbage colection comes 
along and
poisons the fake_var allocated by the plugin.
gty.texi advises:
   Plugins can add additional root tables.  Run the @code{gengtype}
   utility in plugin mode as @code{gengtype -P pluginout.h @var{source-dir}
   @var{file-list} @var{plugin*.c}} with your plugin files
   @var{plugin*.c} using @code{GTY} to generate the @var{pluginout.h} file.
   The GCC build tree is needed to be present in that mode.
However, can we use the build directory in the test suite, considering that
it's supposed to be usable also for installed compilers?
I suppose, for gcc core types like tree we should be able to prepare a
header file beforehand.


Third problem:

Both in 8.2 and 9.0.0 configureed with --enable-checking-all, we see:
FAIL: gcc.dg/ipa/ipa-pta-1.c scan-ipa-dump pta2 "bar.arg0 = { a }"
FAIL: gcc.dg/ipa/ipa-pta-1.c scan-ipa-dump pta2 "bar.arg1 = { a }"

Looking closer at the problem in 8.2,  see that large parts of the dump 
file show signs of garbage-collector poisoned memory.
Looking at "Points-to sets", the information is reached from a static 
variable in
tree-ssa-structalias.c that lacks a GTY marker:
static vec<varinfo_t> varmap;

The "name" field in struct variable_info sometimes contains a string 
literal, and sometimes a ggc-allocated string.
I'm not sure if ggc can handle this, or if we need to change the code so 
that we make these either all ggc-allocated strings, or add a 
discriminator so we can tell which kind of string there is so that we 
can tell the garbage collector to ignore string literals.
At any rate, We can't make the "vec<varinfo_t>" as-is into a GC root.
Should that be changed into a va_gc vector pointer, and all the code that
uses it adjusted accordingly, or would it be better to use a new variable
as GC root and fake the references with a GTY((user)) for the new variable
that employs a marking function that marks the elements inside varmap?


[-- Attachment #2: expression-list-preserve-patch.txt --]
[-- Type: text/plain, Size: 1496 bytes --]

2018-12-27  Joern Rennecke  <joern.rennecke@riscy-ip.com>

	* cp/parser.c: (expression_list_ref): New GTY variable.
	(cp_parser_parenthesized_expression_list): Make it reference
	expression_list.

Index: gcc/cp/parser.c
===================================================================
--- gcc/cp/parser.c	(revision 5644)
+++ gcc/cp/parser.c	(revision 5645)
@@ -7649,6 +7649,14 @@ cp_parser_postfix_dot_deref_expression (
   return postfix_expression;
 }
 
+/* When processing a member function, cgraph_node::finalize_function can
+   call ggc_collect (like if called by expand_or_defer_fn from a struct
+   member function with (function_depth <= 1)).
+   Prevent expression_list in cp_parser_parenthesized_expression_list to
+   be garbage collected then by referencing it from a GC root.  */
+typedef vec<tree, va_gc> *tree_gc_vec;
+static GTY(()) vec<tree_gc_vec, va_gc> *expression_list_ref;
+
 /* Parse a parenthesized expression-list.
 
    expression-list:
@@ -7705,6 +7713,7 @@ cp_parser_parenthesized_expression_list
     return NULL;
 
   expression_list = make_tree_vector ();
+  vec_safe_push (expression_list_ref, expression_list);
 
   /* Within a parenthesized expression, a `>' token is always
      the greater-than operator.  */
@@ -7797,6 +7806,8 @@ cp_parser_parenthesized_expression_list
 	cp_lexer_consume_token (parser->lexer);
       }
 
+  expression_list_ref->pop ();
+
   if (close_paren_loc)
     *close_paren_loc = cp_lexer_peek_token (parser->lexer)->location;
 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Garbage collection bugs
  2019-01-09  9:46 Garbage collection bugs Joern Wolfgang Rennecke
@ 2019-01-09 10:46 ` Arseny Solokha
  2019-01-09 11:48 ` Richard Biener
  1 sibling, 0 replies; 5+ messages in thread
From: Arseny Solokha @ 2019-01-09 10:46 UTC (permalink / raw)
  To: Joern Wolfgang Rennecke, gcc@gcc.gnu.org >> GCC Development

> First problem:
> 
> The g++.dg/pr85039-2.C tests (I've looked in detail at -std=c++98, but
> -std=c++11 and -std=c++14 appear to follow the same pattern) see gcc
> garbage-collecting a live vector.  A subsequent access to the vector with
> vec_quick_push causes a segmentation fault, as m_vecpfx.m_num is 0xa5a5a5a5 .
> The vec data is also being freed / poisoned. The vector in question is an
> auto-variable of cp_parser_parenthesized_expression_list, which is declared as:
> vec<tree, va_gc> *expression_list;

It looks like PR88180 to me.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Garbage collection bugs
  2019-01-09  9:46 Garbage collection bugs Joern Wolfgang Rennecke
  2019-01-09 10:46 ` Arseny Solokha
@ 2019-01-09 11:48 ` Richard Biener
  2019-01-09 11:54   ` Richard Biener
  1 sibling, 1 reply; 5+ messages in thread
From: Richard Biener @ 2019-01-09 11:48 UTC (permalink / raw)
  To: Joern Wolfgang Rennecke; +Cc: gcc@gcc.gnu.org >> GCC Development

On Wed, Jan 9, 2019 at 10:46 AM Joern Wolfgang Rennecke
<joern.rennecke@riscy-ip.com> wrote:
>
> We've been running builds/regression tests for GCC 8.2 configured with
> --enable-checking=all, and have observed some failures related to
> garbage collection.
>
> First problem:
>
> The g++.dg/pr85039-2.C tests (I've looked in detail at -std=c++98, but
> -std=c++11 and -std=c++14 appear to follow the same pattern) see gcc
> garbage-collecting a live vector.  A subsequent access to the vector
> with vec_quick_push causes a segmentation fault, as m_vecpfx.m_num is
> 0xa5a5a5a5 . The vec data is also being freed / poisoned. The vector in
> question is an auto-variable of cp_parser_parenthesized_expression_list,
> which is declared as: vec<tree, va_gc> *expression_list;
>
> According to doc/gty/texi: "you should reference all your data from
> static or external @code{GTY}-ed variables, and it is advised to call
> @code{ggc_collect} with a shallow call stack."
>
> In this case, cgraph_node::finalize_function calls the garage collector,
> as we are finishing a member function of a struct. gdb shows a backtrace
> of 34 frames, which is not really much as far as C++ parsing goes. The
> caller of finalize_function is expand_or_defer_fn, which uses the
> expression "function_depth > 1" to compute the no_collect paramter to
> finalize_function.
> cp_parser_parenthesized_expression_list is in frame 21 of the backtrace
> at this point.
> So, if we consider this shallow, cp_parser_parenthesized_expression_list
> either has to refrain from using a vector with garbage-collected
> allocation, or it has to make the pointer reachable from a GC root - at
> least if function_depth <= 1.
> Is the attached patch the right approach?
>
> When looking at regression test results for gcc version 9.0.0 20181028
> (experimental), the excess errors test for g++.dg/pr85039-2.C seems to
> pass, yet I can see no definite reason in the source code why that is
> so.  I tried running the test by hand in order to check if maybe the
> patch for PR c++/84455 plays a role,
> but running the test by hand, it crashes again, and gdb shows the
> telltale a5 pattern in a pointer register.
> #0  vec<tree_node*, va_gc, vl_embed>::quick_push (obj=<optimized out>,
>      this=0x7ffff05ece60)
>      at
> /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/vec.h:974
> #1  vec_safe_push<tree_node*, va_gc> (obj=<optimized out>,
>      v=@0x7fffffffd038: 0x7ffff05ece60)
>      at
> /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/vec.h:766
> #2  cp_parser_parenthesized_expression_list (
>      parser=parser@entry=0x7ffff7ff83f0,
>      is_attribute_list=is_attribute_list@entry=0, cast_p=cast_p@entry=false,
>      allow_expansion_p=allow_expansion_p@entry=true,
>      non_constant_p=non_constant_p@entry=0x7fffffffd103,
>      close_paren_loc=close_paren_loc@entry=0x0, wrap_locations_p=false)
>      at
> /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:7803
> #3  0x00000000006e910d in cp_parser_initializer (
>      parser=parser@entry=0x7ffff7ff83f0,
>      is_direct_init=is_direct_init@entry=0x7fffffffd102,
>      non_constant_p=non_constant_p@entry=0x7fffffffd103,
>      subexpression_p=subexpression_p@entry=false)
>      at
> /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:22009
> #4  0x000000000070954e in cp_parser_init_declarator (
>      parser=parser@entry=0x7ffff7ff83f0,
>      decl_specifiers=decl_specifiers@entry=0x7fffffffd1c0,
>      checks=checks@entry=0x0,
> function_definition_allowed_p=function_definition_allowed_p@entry=true,
>      member_p=member_p@entry=false, declares_class_or_enum=<optimized out>,
>      function_definition_p=0x7fffffffd250, maybe_range_for_decl=0x0,
>      init_loc=0x7fffffffd1ac, auto_result=0x7fffffffd2e0)
>      at
> /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:19827
> #5  0x0000000000711c5d in cp_parser_simple_declaration
> (parser=0x7ffff7ff83f0,
>      function_definition_allowed_p=<optimized out>,
> maybe_range_for_decl=0x0)
>      at
> /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:13179
> #6  0x0000000000717bb5 in cp_parser_declaration (parser=0x7ffff7ff83f0)
>      at
> /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:12876
> #7  0x000000000071837d in cp_parser_translation_unit (parser=0x7ffff7ff83f0)
>      at
> /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:4631
> #8  c_parse_file ()
>      at
> /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:39108
> #9  0x0000000000868db1 in c_common_parse_file ()
>      at
> /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/c-family/c-opts.c:1150
> #10 0x0000000000e0aaaf in compile_file ()
>      at
> /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/toplev.c:455
> #11 0x000000000059248a in do_compile ()
>      at
> /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/toplev.c:2172
> #12 toplev::main (this=this@entry=0x7fffffffd54e, argc=<optimized out>,
>
>      argc@entry=100, argv=<optimized out>, argv@entry=0x7fffffffd648)
>      at
> /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/toplev.c:2307
> #13 0x0000000000594b5b in main (argc=100, argv=0x7fffffffd648)
>      at
> /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/main.c:39
> (gdb) info reg rdx
> rdx            0xa5a5a5a5       2779096485
>
>     0x00000000006e84eb
> <cp_parser_parenthesized_expression_list(cp_parser*, int, bool, bool,
> bool*, location_t*, bool)+283>:     je     0x6e8608
> <cp_parser_parenthesized_expression_list(cp_parser*, int, bool, bool,
> bool*, location_t*, bool)+568>
>     0x00000000006e84f1
> <cp_parser_parenthesized_expression_list(cp_parser*, int, bool, bool,
> bool*, location_t*, bool)+289>:     lea    0x1(%rdx),%esi
>     0x00000000006e84f4
> <cp_parser_parenthesized_expression_list(cp_parser*, int, bool, bool,
> bool*, location_t*, bool)+292>:     mov    %esi,0x4(%rax)
> => 0x00000000006e84f7
> <cp_parser_parenthesized_expression_list(cp_parser*, int, bool, bool,
> bool*, location_t*, bool)+295>:     mov %rcx,0x8(%rax,%rdx,8)
>     0x00000000006e84fc
> <cp_parser_parenthesized_expression_list(cp_parser*, int, bool, bool,
> bool*, location_t*, bool)+300>:     cmp %rcx,0x16cd67d(%rip)        #
> 0x1db5b80 <global_trees>
>
>
> Second problem:
>
> We see: FAIL: gcc.dg/plugin/start_unit-test-1.c
> -fplugin=./start_unit_plugin.so (test for excess errors)
> Excess errors:
> fake_var not a VAR_DECL
>
> The message comes from plugin; the source code is in
> testsuite/gcc.dg/plugin/start_unit_plugin.c . The plugin holds a tree
> value in a static variable, yet fails to make it referenced by any
> garbage collection root tables.
> When configuring with --enable-checking=all, a garbage colection comes
> along and
> poisons the fake_var allocated by the plugin.
> gty.texi advises:
>    Plugins can add additional root tables.  Run the @code{gengtype}
>    utility in plugin mode as @code{gengtype -P pluginout.h @var{source-dir}
>    @var{file-list} @var{plugin*.c}} with your plugin files
>    @var{plugin*.c} using @code{GTY} to generate the @var{pluginout.h} file.
>    The GCC build tree is needed to be present in that mode.
> However, can we use the build directory in the test suite, considering that
> it's supposed to be usable also for installed compilers?
> I suppose, for gcc core types like tree we should be able to prepare a
> header file beforehand.
>
>
> Third problem:
>
> Both in 8.2 and 9.0.0 configureed with --enable-checking-all, we see:
> FAIL: gcc.dg/ipa/ipa-pta-1.c scan-ipa-dump pta2 "bar.arg0 = { a }"
> FAIL: gcc.dg/ipa/ipa-pta-1.c scan-ipa-dump pta2 "bar.arg1 = { a }"
>
> Looking closer at the problem in 8.2,  see that large parts of the dump
> file show signs of garbage-collector poisoned memory.
> Looking at "Points-to sets", the information is reached from a static
> variable in
> tree-ssa-structalias.c that lacks a GTY marker:
> static vec<varinfo_t> varmap;

Hmm, it shouldn't be live across pass invocations.  Where does it collect
when ipa_pta_execute is in the call stack?

> The "name" field in struct variable_info sometimes contains a string
> literal, and sometimes a ggc-allocated string.
> I'm not sure if ggc can handle this, or if we need to change the code so
> that we make these either all ggc-allocated strings, or add a
> discriminator so we can tell which kind of string there is so that we
> can tell the garbage collector to ignore string literals.
> At any rate, We can't make the "vec<varinfo_t>" as-is into a GC root.
> Should that be changed into a va_gc vector pointer, and all the code that
> uses it adjusted accordingly, or would it be better to use a new variable
> as GC root and fake the references with a GTY((user)) for the new variable
> that employs a marking function that marks the elements inside varmap?
>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Garbage collection bugs
  2019-01-09 11:48 ` Richard Biener
@ 2019-01-09 11:54   ` Richard Biener
  2019-01-09 11:59     ` Jan Hubicka
  0 siblings, 1 reply; 5+ messages in thread
From: Richard Biener @ 2019-01-09 11:54 UTC (permalink / raw)
  To: Joern Wolfgang Rennecke, Jan Hubicka
  Cc: gcc@gcc.gnu.org >> GCC Development

On Wed, Jan 9, 2019 at 12:48 PM Richard Biener
<richard.guenther@gmail.com> wrote:
>
> On Wed, Jan 9, 2019 at 10:46 AM Joern Wolfgang Rennecke
> <joern.rennecke@riscy-ip.com> wrote:
> >
> > We've been running builds/regression tests for GCC 8.2 configured with
> > --enable-checking=all, and have observed some failures related to
> > garbage collection.
> >
> > First problem:
> >
> > The g++.dg/pr85039-2.C tests (I've looked in detail at -std=c++98, but
> > -std=c++11 and -std=c++14 appear to follow the same pattern) see gcc
> > garbage-collecting a live vector.  A subsequent access to the vector
> > with vec_quick_push causes a segmentation fault, as m_vecpfx.m_num is
> > 0xa5a5a5a5 . The vec data is also being freed / poisoned. The vector in
> > question is an auto-variable of cp_parser_parenthesized_expression_list,
> > which is declared as: vec<tree, va_gc> *expression_list;
> >
> > According to doc/gty/texi: "you should reference all your data from
> > static or external @code{GTY}-ed variables, and it is advised to call
> > @code{ggc_collect} with a shallow call stack."
> >
> > In this case, cgraph_node::finalize_function calls the garage collector,
> > as we are finishing a member function of a struct. gdb shows a backtrace
> > of 34 frames, which is not really much as far as C++ parsing goes. The
> > caller of finalize_function is expand_or_defer_fn, which uses the
> > expression "function_depth > 1" to compute the no_collect paramter to
> > finalize_function.
> > cp_parser_parenthesized_expression_list is in frame 21 of the backtrace
> > at this point.
> > So, if we consider this shallow, cp_parser_parenthesized_expression_list
> > either has to refrain from using a vector with garbage-collected
> > allocation, or it has to make the pointer reachable from a GC root - at
> > least if function_depth <= 1.
> > Is the attached patch the right approach?
> >
> > When looking at regression test results for gcc version 9.0.0 20181028
> > (experimental), the excess errors test for g++.dg/pr85039-2.C seems to
> > pass, yet I can see no definite reason in the source code why that is
> > so.  I tried running the test by hand in order to check if maybe the
> > patch for PR c++/84455 plays a role,
> > but running the test by hand, it crashes again, and gdb shows the
> > telltale a5 pattern in a pointer register.
> > #0  vec<tree_node*, va_gc, vl_embed>::quick_push (obj=<optimized out>,
> >      this=0x7ffff05ece60)
> >      at
> > /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/vec.h:974
> > #1  vec_safe_push<tree_node*, va_gc> (obj=<optimized out>,
> >      v=@0x7fffffffd038: 0x7ffff05ece60)
> >      at
> > /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/vec.h:766
> > #2  cp_parser_parenthesized_expression_list (
> >      parser=parser@entry=0x7ffff7ff83f0,
> >      is_attribute_list=is_attribute_list@entry=0, cast_p=cast_p@entry=false,
> >      allow_expansion_p=allow_expansion_p@entry=true,
> >      non_constant_p=non_constant_p@entry=0x7fffffffd103,
> >      close_paren_loc=close_paren_loc@entry=0x0, wrap_locations_p=false)
> >      at
> > /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:7803
> > #3  0x00000000006e910d in cp_parser_initializer (
> >      parser=parser@entry=0x7ffff7ff83f0,
> >      is_direct_init=is_direct_init@entry=0x7fffffffd102,
> >      non_constant_p=non_constant_p@entry=0x7fffffffd103,
> >      subexpression_p=subexpression_p@entry=false)
> >      at
> > /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:22009
> > #4  0x000000000070954e in cp_parser_init_declarator (
> >      parser=parser@entry=0x7ffff7ff83f0,
> >      decl_specifiers=decl_specifiers@entry=0x7fffffffd1c0,
> >      checks=checks@entry=0x0,
> > function_definition_allowed_p=function_definition_allowed_p@entry=true,
> >      member_p=member_p@entry=false, declares_class_or_enum=<optimized out>,
> >      function_definition_p=0x7fffffffd250, maybe_range_for_decl=0x0,
> >      init_loc=0x7fffffffd1ac, auto_result=0x7fffffffd2e0)
> >      at
> > /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:19827
> > #5  0x0000000000711c5d in cp_parser_simple_declaration
> > (parser=0x7ffff7ff83f0,
> >      function_definition_allowed_p=<optimized out>,
> > maybe_range_for_decl=0x0)
> >      at
> > /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:13179
> > #6  0x0000000000717bb5 in cp_parser_declaration (parser=0x7ffff7ff83f0)
> >      at
> > /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:12876
> > #7  0x000000000071837d in cp_parser_translation_unit (parser=0x7ffff7ff83f0)
> >      at
> > /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:4631
> > #8  c_parse_file ()
> >      at
> > /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:39108
> > #9  0x0000000000868db1 in c_common_parse_file ()
> >      at
> > /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/c-family/c-opts.c:1150
> > #10 0x0000000000e0aaaf in compile_file ()
> >      at
> > /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/toplev.c:455
> > #11 0x000000000059248a in do_compile ()
> >      at
> > /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/toplev.c:2172
> > #12 toplev::main (this=this@entry=0x7fffffffd54e, argc=<optimized out>,
> >
> >      argc@entry=100, argv=<optimized out>, argv@entry=0x7fffffffd648)
> >      at
> > /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/toplev.c:2307
> > #13 0x0000000000594b5b in main (argc=100, argv=0x7fffffffd648)
> >      at
> > /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/main.c:39
> > (gdb) info reg rdx
> > rdx            0xa5a5a5a5       2779096485
> >
> >     0x00000000006e84eb
> > <cp_parser_parenthesized_expression_list(cp_parser*, int, bool, bool,
> > bool*, location_t*, bool)+283>:     je     0x6e8608
> > <cp_parser_parenthesized_expression_list(cp_parser*, int, bool, bool,
> > bool*, location_t*, bool)+568>
> >     0x00000000006e84f1
> > <cp_parser_parenthesized_expression_list(cp_parser*, int, bool, bool,
> > bool*, location_t*, bool)+289>:     lea    0x1(%rdx),%esi
> >     0x00000000006e84f4
> > <cp_parser_parenthesized_expression_list(cp_parser*, int, bool, bool,
> > bool*, location_t*, bool)+292>:     mov    %esi,0x4(%rax)
> > => 0x00000000006e84f7
> > <cp_parser_parenthesized_expression_list(cp_parser*, int, bool, bool,
> > bool*, location_t*, bool)+295>:     mov %rcx,0x8(%rax,%rdx,8)
> >     0x00000000006e84fc
> > <cp_parser_parenthesized_expression_list(cp_parser*, int, bool, bool,
> > bool*, location_t*, bool)+300>:     cmp %rcx,0x16cd67d(%rip)        #
> > 0x1db5b80 <global_trees>
> >
> >
> > Second problem:
> >
> > We see: FAIL: gcc.dg/plugin/start_unit-test-1.c
> > -fplugin=./start_unit_plugin.so (test for excess errors)
> > Excess errors:
> > fake_var not a VAR_DECL
> >
> > The message comes from plugin; the source code is in
> > testsuite/gcc.dg/plugin/start_unit_plugin.c . The plugin holds a tree
> > value in a static variable, yet fails to make it referenced by any
> > garbage collection root tables.
> > When configuring with --enable-checking=all, a garbage colection comes
> > along and
> > poisons the fake_var allocated by the plugin.
> > gty.texi advises:
> >    Plugins can add additional root tables.  Run the @code{gengtype}
> >    utility in plugin mode as @code{gengtype -P pluginout.h @var{source-dir}
> >    @var{file-list} @var{plugin*.c}} with your plugin files
> >    @var{plugin*.c} using @code{GTY} to generate the @var{pluginout.h} file.
> >    The GCC build tree is needed to be present in that mode.
> > However, can we use the build directory in the test suite, considering that
> > it's supposed to be usable also for installed compilers?
> > I suppose, for gcc core types like tree we should be able to prepare a
> > header file beforehand.
> >
> >
> > Third problem:
> >
> > Both in 8.2 and 9.0.0 configureed with --enable-checking-all, we see:
> > FAIL: gcc.dg/ipa/ipa-pta-1.c scan-ipa-dump pta2 "bar.arg0 = { a }"
> > FAIL: gcc.dg/ipa/ipa-pta-1.c scan-ipa-dump pta2 "bar.arg1 = { a }"
> >
> > Looking closer at the problem in 8.2,  see that large parts of the dump
> > file show signs of garbage-collector poisoned memory.
> > Looking at "Points-to sets", the information is reached from a static
> > variable in
> > tree-ssa-structalias.c that lacks a GTY marker:
> > static vec<varinfo_t> varmap;
>
> Hmm, it shouldn't be live across pass invocations.  Where does it collect
> when ipa_pta_execute is in the call stack?

Ah, ipa_pta_execute calls ->get_body () which does

  if (ipa_transforms_to_apply.exists ())
    {
...
      push_cfun (DECL_STRUCT_FUNCTION (decl));
      execute_all_ipa_transforms ();

which ends up in execute_one_ipa_transform_pass which does

  /* Signal this is a suitable GC collection point.  */
  if (!(todo_after & TODO_do_not_ggc_collect))
    ggc_collect ();

that's of course bogus.  Honza - this get_body () behavior is broken, one
"fix" would be to never collect in execute_one_ipa_transform_pass or
somehow pass down a flag not to.

Can you take care of this?

Richard.

> > The "name" field in struct variable_info sometimes contains a string
> > literal, and sometimes a ggc-allocated string.
> > I'm not sure if ggc can handle this, or if we need to change the code so
> > that we make these either all ggc-allocated strings, or add a
> > discriminator so we can tell which kind of string there is so that we
> > can tell the garbage collector to ignore string literals.
> > At any rate, We can't make the "vec<varinfo_t>" as-is into a GC root.
> > Should that be changed into a va_gc vector pointer, and all the code that
> > uses it adjusted accordingly, or would it be better to use a new variable
> > as GC root and fake the references with a GTY((user)) for the new variable
> > that employs a marking function that marks the elements inside varmap?
> >

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Garbage collection bugs
  2019-01-09 11:54   ` Richard Biener
@ 2019-01-09 11:59     ` Jan Hubicka
  0 siblings, 0 replies; 5+ messages in thread
From: Jan Hubicka @ 2019-01-09 11:59 UTC (permalink / raw)
  To: Richard Biener
  Cc: Joern Wolfgang Rennecke, gcc@gcc.gnu.org >> GCC Development

> On Wed, Jan 9, 2019 at 12:48 PM Richard Biener
> <richard.guenther@gmail.com> wrote:
> >
> > On Wed, Jan 9, 2019 at 10:46 AM Joern Wolfgang Rennecke
> > <joern.rennecke@riscy-ip.com> wrote:
> > >
> > > We've been running builds/regression tests for GCC 8.2 configured with
> > > --enable-checking=all, and have observed some failures related to
> > > garbage collection.
> > >
> > > First problem:
> > >
> > > The g++.dg/pr85039-2.C tests (I've looked in detail at -std=c++98, but
> > > -std=c++11 and -std=c++14 appear to follow the same pattern) see gcc
> > > garbage-collecting a live vector.  A subsequent access to the vector
> > > with vec_quick_push causes a segmentation fault, as m_vecpfx.m_num is
> > > 0xa5a5a5a5 . The vec data is also being freed / poisoned. The vector in
> > > question is an auto-variable of cp_parser_parenthesized_expression_list,
> > > which is declared as: vec<tree, va_gc> *expression_list;
> > >
> > > According to doc/gty/texi: "you should reference all your data from
> > > static or external @code{GTY}-ed variables, and it is advised to call
> > > @code{ggc_collect} with a shallow call stack."
> > >
> > > In this case, cgraph_node::finalize_function calls the garage collector,
> > > as we are finishing a member function of a struct. gdb shows a backtrace
> > > of 34 frames, which is not really much as far as C++ parsing goes. The
> > > caller of finalize_function is expand_or_defer_fn, which uses the
> > > expression "function_depth > 1" to compute the no_collect paramter to
> > > finalize_function.
> > > cp_parser_parenthesized_expression_list is in frame 21 of the backtrace
> > > at this point.
> > > So, if we consider this shallow, cp_parser_parenthesized_expression_list
> > > either has to refrain from using a vector with garbage-collected
> > > allocation, or it has to make the pointer reachable from a GC root - at
> > > least if function_depth <= 1.
> > > Is the attached patch the right approach?
> > >
> > > When looking at regression test results for gcc version 9.0.0 20181028
> > > (experimental), the excess errors test for g++.dg/pr85039-2.C seems to
> > > pass, yet I can see no definite reason in the source code why that is
> > > so.  I tried running the test by hand in order to check if maybe the
> > > patch for PR c++/84455 plays a role,
> > > but running the test by hand, it crashes again, and gdb shows the
> > > telltale a5 pattern in a pointer register.
> > > #0  vec<tree_node*, va_gc, vl_embed>::quick_push (obj=<optimized out>,
> > >      this=0x7ffff05ece60)
> > >      at
> > > /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/vec.h:974
> > > #1  vec_safe_push<tree_node*, va_gc> (obj=<optimized out>,
> > >      v=@0x7fffffffd038: 0x7ffff05ece60)
> > >      at
> > > /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/vec.h:766
> > > #2  cp_parser_parenthesized_expression_list (
> > >      parser=parser@entry=0x7ffff7ff83f0,
> > >      is_attribute_list=is_attribute_list@entry=0, cast_p=cast_p@entry=false,
> > >      allow_expansion_p=allow_expansion_p@entry=true,
> > >      non_constant_p=non_constant_p@entry=0x7fffffffd103,
> > >      close_paren_loc=close_paren_loc@entry=0x0, wrap_locations_p=false)
> > >      at
> > > /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:7803
> > > #3  0x00000000006e910d in cp_parser_initializer (
> > >      parser=parser@entry=0x7ffff7ff83f0,
> > >      is_direct_init=is_direct_init@entry=0x7fffffffd102,
> > >      non_constant_p=non_constant_p@entry=0x7fffffffd103,
> > >      subexpression_p=subexpression_p@entry=false)
> > >      at
> > > /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:22009
> > > #4  0x000000000070954e in cp_parser_init_declarator (
> > >      parser=parser@entry=0x7ffff7ff83f0,
> > >      decl_specifiers=decl_specifiers@entry=0x7fffffffd1c0,
> > >      checks=checks@entry=0x0,
> > > function_definition_allowed_p=function_definition_allowed_p@entry=true,
> > >      member_p=member_p@entry=false, declares_class_or_enum=<optimized out>,
> > >      function_definition_p=0x7fffffffd250, maybe_range_for_decl=0x0,
> > >      init_loc=0x7fffffffd1ac, auto_result=0x7fffffffd2e0)
> > >      at
> > > /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:19827
> > > #5  0x0000000000711c5d in cp_parser_simple_declaration
> > > (parser=0x7ffff7ff83f0,
> > >      function_definition_allowed_p=<optimized out>,
> > > maybe_range_for_decl=0x0)
> > >      at
> > > /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:13179
> > > #6  0x0000000000717bb5 in cp_parser_declaration (parser=0x7ffff7ff83f0)
> > >      at
> > > /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:12876
> > > #7  0x000000000071837d in cp_parser_translation_unit (parser=0x7ffff7ff83f0)
> > >      at
> > > /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:4631
> > > #8  c_parse_file ()
> > >      at
> > > /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:39108
> > > #9  0x0000000000868db1 in c_common_parse_file ()
> > >      at
> > > /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/c-family/c-opts.c:1150
> > > #10 0x0000000000e0aaaf in compile_file ()
> > >      at
> > > /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/toplev.c:455
> > > #11 0x000000000059248a in do_compile ()
> > >      at
> > > /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/toplev.c:2172
> > > #12 toplev::main (this=this@entry=0x7fffffffd54e, argc=<optimized out>,
> > >
> > >      argc@entry=100, argv=<optimized out>, argv@entry=0x7fffffffd648)
> > >      at
> > > /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/toplev.c:2307
> > > #13 0x0000000000594b5b in main (argc=100, argv=0x7fffffffd648)
> > >      at
> > > /data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/main.c:39
> > > (gdb) info reg rdx
> > > rdx            0xa5a5a5a5       2779096485
> > >
> > >     0x00000000006e84eb
> > > <cp_parser_parenthesized_expression_list(cp_parser*, int, bool, bool,
> > > bool*, location_t*, bool)+283>:     je     0x6e8608
> > > <cp_parser_parenthesized_expression_list(cp_parser*, int, bool, bool,
> > > bool*, location_t*, bool)+568>
> > >     0x00000000006e84f1
> > > <cp_parser_parenthesized_expression_list(cp_parser*, int, bool, bool,
> > > bool*, location_t*, bool)+289>:     lea    0x1(%rdx),%esi
> > >     0x00000000006e84f4
> > > <cp_parser_parenthesized_expression_list(cp_parser*, int, bool, bool,
> > > bool*, location_t*, bool)+292>:     mov    %esi,0x4(%rax)
> > > => 0x00000000006e84f7
> > > <cp_parser_parenthesized_expression_list(cp_parser*, int, bool, bool,
> > > bool*, location_t*, bool)+295>:     mov %rcx,0x8(%rax,%rdx,8)
> > >     0x00000000006e84fc
> > > <cp_parser_parenthesized_expression_list(cp_parser*, int, bool, bool,
> > > bool*, location_t*, bool)+300>:     cmp %rcx,0x16cd67d(%rip)        #
> > > 0x1db5b80 <global_trees>
> > >
> > >
> > > Second problem:
> > >
> > > We see: FAIL: gcc.dg/plugin/start_unit-test-1.c
> > > -fplugin=./start_unit_plugin.so (test for excess errors)
> > > Excess errors:
> > > fake_var not a VAR_DECL
> > >
> > > The message comes from plugin; the source code is in
> > > testsuite/gcc.dg/plugin/start_unit_plugin.c . The plugin holds a tree
> > > value in a static variable, yet fails to make it referenced by any
> > > garbage collection root tables.
> > > When configuring with --enable-checking=all, a garbage colection comes
> > > along and
> > > poisons the fake_var allocated by the plugin.
> > > gty.texi advises:
> > >    Plugins can add additional root tables.  Run the @code{gengtype}
> > >    utility in plugin mode as @code{gengtype -P pluginout.h @var{source-dir}
> > >    @var{file-list} @var{plugin*.c}} with your plugin files
> > >    @var{plugin*.c} using @code{GTY} to generate the @var{pluginout.h} file.
> > >    The GCC build tree is needed to be present in that mode.
> > > However, can we use the build directory in the test suite, considering that
> > > it's supposed to be usable also for installed compilers?
> > > I suppose, for gcc core types like tree we should be able to prepare a
> > > header file beforehand.
> > >
> > >
> > > Third problem:
> > >
> > > Both in 8.2 and 9.0.0 configureed with --enable-checking-all, we see:
> > > FAIL: gcc.dg/ipa/ipa-pta-1.c scan-ipa-dump pta2 "bar.arg0 = { a }"
> > > FAIL: gcc.dg/ipa/ipa-pta-1.c scan-ipa-dump pta2 "bar.arg1 = { a }"
> > >
> > > Looking closer at the problem in 8.2,  see that large parts of the dump
> > > file show signs of garbage-collector poisoned memory.
> > > Looking at "Points-to sets", the information is reached from a static
> > > variable in
> > > tree-ssa-structalias.c that lacks a GTY marker:
> > > static vec<varinfo_t> varmap;
> >
> > Hmm, it shouldn't be live across pass invocations.  Where does it collect
> > when ipa_pta_execute is in the call stack?
> 
> Ah, ipa_pta_execute calls ->get_body () which does
> 
>   if (ipa_transforms_to_apply.exists ())
>     {
> ...
>       push_cfun (DECL_STRUCT_FUNCTION (decl));
>       execute_all_ipa_transforms ();
> 
> which ends up in execute_one_ipa_transform_pass which does
> 
>   /* Signal this is a suitable GC collection point.  */
>   if (!(todo_after & TODO_do_not_ggc_collect))
>     ggc_collect ();
> 
> that's of course bogus.  Honza - this get_body () behavior is broken, one
> "fix" would be to never collect in execute_one_ipa_transform_pass or
> somehow pass down a flag not to.
> 
> Can you take care of this?
Yes, i will take a look. I think we want to be able to get bodies lazily
at random points in compilation, so we do not want to collect after
transforms. I will prepare patch.

Honza
> 
> Richard.
> 
> > > The "name" field in struct variable_info sometimes contains a string
> > > literal, and sometimes a ggc-allocated string.
> > > I'm not sure if ggc can handle this, or if we need to change the code so
> > > that we make these either all ggc-allocated strings, or add a
> > > discriminator so we can tell which kind of string there is so that we
> > > can tell the garbage collector to ignore string literals.
> > > At any rate, We can't make the "vec<varinfo_t>" as-is into a GC root.
> > > Should that be changed into a va_gc vector pointer, and all the code that
> > > uses it adjusted accordingly, or would it be better to use a new variable
> > > as GC root and fake the references with a GTY((user)) for the new variable
> > > that employs a marking function that marks the elements inside varmap?
> > >

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-01-09 11:59 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-09  9:46 Garbage collection bugs Joern Wolfgang Rennecke
2019-01-09 10:46 ` Arseny Solokha
2019-01-09 11:48 ` Richard Biener
2019-01-09 11:54   ` Richard Biener
2019-01-09 11:59     ` Jan Hubicka

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).