public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* Garbage collection bugs
@ 2019-01-09  9:46 Joern Wolfgang Rennecke
  2019-01-09 10:46 ` Arseny Solokha
  2019-01-09 11:48 ` Richard Biener
  0 siblings, 2 replies; 5+ messages in thread
From: Joern Wolfgang Rennecke @ 2019-01-09  9:46 UTC (permalink / raw)
  To: gcc@gcc.gnu.org >> GCC Development

[-- Attachment #1: Type: text/plain, Size: 8538 bytes --]

We've been running builds/regression tests for GCC 8.2 configured with 
--enable-checking=all, and have observed some failures related to 
garbage collection.

First problem:

The g++.dg/pr85039-2.C tests (I've looked in detail at -std=c++98, but 
-std=c++11 and -std=c++14 appear to follow the same pattern) see gcc 
garbage-collecting a live vector.  A subsequent access to the vector 
with vec_quick_push causes a segmentation fault, as m_vecpfx.m_num is 
0xa5a5a5a5 . The vec data is also being freed / poisoned. The vector in 
question is an auto-variable of cp_parser_parenthesized_expression_list, 
which is declared as: vec<tree, va_gc> *expression_list;

According to doc/gty/texi: "you should reference all your data from 
static or external @code{GTY}-ed variables, and it is advised to call 
@code{ggc_collect} with a shallow call stack."

In this case, cgraph_node::finalize_function calls the garage collector,
as we are finishing a member function of a struct. gdb shows a backtrace 
of 34 frames, which is not really much as far as C++ parsing goes. The 
caller of finalize_function is expand_or_defer_fn, which uses the 
expression "function_depth > 1" to compute the no_collect paramter to 
finalize_function.
cp_parser_parenthesized_expression_list is in frame 21 of the backtrace 
at this point.
So, if we consider this shallow, cp_parser_parenthesized_expression_list 
either has to refrain from using a vector with garbage-collected 
allocation, or it has to make the pointer reachable from a GC root - at 
least if function_depth <= 1.
Is the attached patch the right approach?

When looking at regression test results for gcc version 9.0.0 20181028 
(experimental), the excess errors test for g++.dg/pr85039-2.C seems to 
pass, yet I can see no definite reason in the source code why that is 
so.  I tried running the test by hand in order to check if maybe the 
patch for PR c++/84455 plays a role,
but running the test by hand, it crashes again, and gdb shows the 
telltale a5 pattern in a pointer register.
#0  vec<tree_node*, va_gc, vl_embed>::quick_push (obj=<optimized out>,
     this=0x7ffff05ece60)
     at 
/data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/vec.h:974
#1  vec_safe_push<tree_node*, va_gc> (obj=<optimized out>,
     v=@0x7fffffffd038: 0x7ffff05ece60)
     at 
/data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/vec.h:766
#2  cp_parser_parenthesized_expression_list (
     parser=parser@entry=0x7ffff7ff83f0,
     is_attribute_list=is_attribute_list@entry=0, cast_p=cast_p@entry=false,
     allow_expansion_p=allow_expansion_p@entry=true,
     non_constant_p=non_constant_p@entry=0x7fffffffd103,
     close_paren_loc=close_paren_loc@entry=0x0, wrap_locations_p=false)
     at 
/data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:7803
#3  0x00000000006e910d in cp_parser_initializer (
     parser=parser@entry=0x7ffff7ff83f0,
     is_direct_init=is_direct_init@entry=0x7fffffffd102,
     non_constant_p=non_constant_p@entry=0x7fffffffd103,
     subexpression_p=subexpression_p@entry=false)
     at 
/data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:22009
#4  0x000000000070954e in cp_parser_init_declarator (
     parser=parser@entry=0x7ffff7ff83f0,
     decl_specifiers=decl_specifiers@entry=0x7fffffffd1c0,
     checks=checks@entry=0x0,
function_definition_allowed_p=function_definition_allowed_p@entry=true,
     member_p=member_p@entry=false, declares_class_or_enum=<optimized out>,
     function_definition_p=0x7fffffffd250, maybe_range_for_decl=0x0,
     init_loc=0x7fffffffd1ac, auto_result=0x7fffffffd2e0)
     at 
/data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:19827
#5  0x0000000000711c5d in cp_parser_simple_declaration 
(parser=0x7ffff7ff83f0,
     function_definition_allowed_p=<optimized out>, 
maybe_range_for_decl=0x0)
     at 
/data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:13179
#6  0x0000000000717bb5 in cp_parser_declaration (parser=0x7ffff7ff83f0)
     at 
/data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:12876
#7  0x000000000071837d in cp_parser_translation_unit (parser=0x7ffff7ff83f0)
     at 
/data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:4631
#8  c_parse_file ()
     at 
/data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/cp/parser.c:39108
#9  0x0000000000868db1 in c_common_parse_file ()
     at 
/data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/c-family/c-opts.c:1150
#10 0x0000000000e0aaaf in compile_file ()
     at 
/data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/toplev.c:455
#11 0x000000000059248a in do_compile ()
     at 
/data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/toplev.c:2172
#12 toplev::main (this=this@entry=0x7fffffffd54e, argc=<optimized out>,

     argc@entry=100, argv=<optimized out>, argv@entry=0x7fffffffd648)
     at 
/data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/toplev.c:2307
#13 0x0000000000594b5b in main (argc=100, argv=0x7fffffffd648)
     at 
/data/hudson/jobs/gcc-9.0.0-linux/workspace/gcc/build/../gcc/gcc/main.c:39
(gdb) info reg rdx
rdx            0xa5a5a5a5       2779096485

    0x00000000006e84eb 
<cp_parser_parenthesized_expression_list(cp_parser*, int, bool, bool, 
bool*, location_t*, bool)+283>:     je     0x6e8608 
<cp_parser_parenthesized_expression_list(cp_parser*, int, bool, bool, 
bool*, location_t*, bool)+568>
    0x00000000006e84f1 
<cp_parser_parenthesized_expression_list(cp_parser*, int, bool, bool, 
bool*, location_t*, bool)+289>:     lea    0x1(%rdx),%esi
    0x00000000006e84f4 
<cp_parser_parenthesized_expression_list(cp_parser*, int, bool, bool, 
bool*, location_t*, bool)+292>:     mov    %esi,0x4(%rax)
=> 0x00000000006e84f7 
<cp_parser_parenthesized_expression_list(cp_parser*, int, bool, bool, 
bool*, location_t*, bool)+295>:     mov %rcx,0x8(%rax,%rdx,8)
    0x00000000006e84fc 
<cp_parser_parenthesized_expression_list(cp_parser*, int, bool, bool, 
bool*, location_t*, bool)+300>:     cmp %rcx,0x16cd67d(%rip)        # 
0x1db5b80 <global_trees>


Second problem:

We see: FAIL: gcc.dg/plugin/start_unit-test-1.c 
-fplugin=./start_unit_plugin.so (test for excess errors)
Excess errors:
fake_var not a VAR_DECL

The message comes from plugin; the source code is in 
testsuite/gcc.dg/plugin/start_unit_plugin.c . The plugin holds a tree 
value in a static variable, yet fails to make it referenced by any 
garbage collection root tables.
When configuring with --enable-checking=all, a garbage colection comes 
along and
poisons the fake_var allocated by the plugin.
gty.texi advises:
   Plugins can add additional root tables.  Run the @code{gengtype}
   utility in plugin mode as @code{gengtype -P pluginout.h @var{source-dir}
   @var{file-list} @var{plugin*.c}} with your plugin files
   @var{plugin*.c} using @code{GTY} to generate the @var{pluginout.h} file.
   The GCC build tree is needed to be present in that mode.
However, can we use the build directory in the test suite, considering that
it's supposed to be usable also for installed compilers?
I suppose, for gcc core types like tree we should be able to prepare a
header file beforehand.


Third problem:

Both in 8.2 and 9.0.0 configureed with --enable-checking-all, we see:
FAIL: gcc.dg/ipa/ipa-pta-1.c scan-ipa-dump pta2 "bar.arg0 = { a }"
FAIL: gcc.dg/ipa/ipa-pta-1.c scan-ipa-dump pta2 "bar.arg1 = { a }"

Looking closer at the problem in 8.2,  see that large parts of the dump 
file show signs of garbage-collector poisoned memory.
Looking at "Points-to sets", the information is reached from a static 
variable in
tree-ssa-structalias.c that lacks a GTY marker:
static vec<varinfo_t> varmap;

The "name" field in struct variable_info sometimes contains a string 
literal, and sometimes a ggc-allocated string.
I'm not sure if ggc can handle this, or if we need to change the code so 
that we make these either all ggc-allocated strings, or add a 
discriminator so we can tell which kind of string there is so that we 
can tell the garbage collector to ignore string literals.
At any rate, We can't make the "vec<varinfo_t>" as-is into a GC root.
Should that be changed into a va_gc vector pointer, and all the code that
uses it adjusted accordingly, or would it be better to use a new variable
as GC root and fake the references with a GTY((user)) for the new variable
that employs a marking function that marks the elements inside varmap?


[-- Attachment #2: expression-list-preserve-patch.txt --]
[-- Type: text/plain, Size: 1496 bytes --]

2018-12-27  Joern Rennecke  <joern.rennecke@riscy-ip.com>

	* cp/parser.c: (expression_list_ref): New GTY variable.
	(cp_parser_parenthesized_expression_list): Make it reference
	expression_list.

Index: gcc/cp/parser.c
===================================================================
--- gcc/cp/parser.c	(revision 5644)
+++ gcc/cp/parser.c	(revision 5645)
@@ -7649,6 +7649,14 @@ cp_parser_postfix_dot_deref_expression (
   return postfix_expression;
 }
 
+/* When processing a member function, cgraph_node::finalize_function can
+   call ggc_collect (like if called by expand_or_defer_fn from a struct
+   member function with (function_depth <= 1)).
+   Prevent expression_list in cp_parser_parenthesized_expression_list to
+   be garbage collected then by referencing it from a GC root.  */
+typedef vec<tree, va_gc> *tree_gc_vec;
+static GTY(()) vec<tree_gc_vec, va_gc> *expression_list_ref;
+
 /* Parse a parenthesized expression-list.
 
    expression-list:
@@ -7705,6 +7713,7 @@ cp_parser_parenthesized_expression_list
     return NULL;
 
   expression_list = make_tree_vector ();
+  vec_safe_push (expression_list_ref, expression_list);
 
   /* Within a parenthesized expression, a `>' token is always
      the greater-than operator.  */
@@ -7797,6 +7806,8 @@ cp_parser_parenthesized_expression_list
 	cp_lexer_consume_token (parser->lexer);
       }
 
+  expression_list_ref->pop ();
+
   if (close_paren_loc)
     *close_paren_loc = cp_lexer_peek_token (parser->lexer)->location;
 

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-01-09 11:59 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-09  9:46 Garbage collection bugs Joern Wolfgang Rennecke
2019-01-09 10:46 ` Arseny Solokha
2019-01-09 11:48 ` Richard Biener
2019-01-09 11:54   ` Richard Biener
2019-01-09 11:59     ` Jan Hubicka

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).