From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 10862 invoked by alias); 13 Nov 2002 02:36:02 -0000
Mailing-List: contact gcc-prs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive:
List-Post:
List-Help:
Sender: gcc-prs-owner@gcc.gnu.org
Received: (qmail 10848 invoked by uid 71); 13 Nov 2002 02:36:01 -0000
Date: Tue, 19 Nov 2002 18:25:00 -0000
Message-ID: <20021113023601.10847.qmail@sources.redhat.com>
To: nobody@gcc.gnu.org
Cc: gcc-prs@gcc.gnu.org,
From: Zack Weinberg
Subject: Re: c++/8511: (hopefully) reproducible cc1plus SIGSEGV.
Reply-To: Zack Weinberg
X-SW-Source: 2002-11/txt/msg00610.txt.bz2
List-Id:

The following reply was made to PR c++/8511; it has been noted by GNATS.

From: Zack Weinberg
To: Wolfgang Wieser, mark@codesourcery.com
Cc: Volker Reichelt, gcc-gnats@gcc.gnu.org, gcc-bugs@gcc.gnu.org
Subject: Re: c++/8511: (hopefully) reproducible cc1plus SIGSEGV.
Date: Tue, 12 Nov 2002 18:29:22 -0800

On Tue, Nov 12, 2002 at 10:25:10PM +0100, Wolfgang Wieser wrote:

> Ah - still: Doing abort() instead of exit(1) on ICE would make it easier
> debuggable. (Or am I wrong again? - Okay using a breakpoint...)

Use of exit() happens to be the easiest way to prevent users from
getting 100MB core dumps (which they will then try to mail to
gcc-bugs) when ICEs happen.

> > > (Neither type nor val are NULL.)
> >
> > There's not enough information here to know what went wrong. Probably
> > TREE_TYPE (val) was an invalid pointer.
> >
> How can I tell...?

(gdb) p val->common.type->common

will dump out enough information to tell you if it's a valid pointer
to a tree. (TREE_TYPE (val) expands to val->common.type. You can
find this out by reading tree.h. Yeah, it's a pain.)

> > Yeah. That means the garbage collector ate a piece of live data.
> > These are a pain to debug -- even slight changes in the input will
> > make the problem vanish.
> >
> That's _exactly_ what I am experiencing!
> Even if I remove some lines far away which seemingly do
> not have anything to do with the location of the SIGSEGV, the SIGSEGV
> goes away (turns into ordinary ICE).
...
> > We need you to give us a preprocessed source file. Using your
> > installation, issue this command:
> >
> I'll provide you with preprocessed code.

Using the file you attached, I can now reproduce the crash. It turns
out not to be a GC bug, but an access-beyond-end-of-array bug.

tsubst() [in cp/pt.c] is called with this TEMPLATE_PARM_INDEX
expression:

  index 1 level 1 orig_level 1>

and this 'args' structure:

  > elt 1 elt 1 >>

tsubst is to return the element of the args structure that the
template_parm_index expression refers to. Here's the catch: 'index'
values are 0-based, but 'level' values are 1-based, so it winds up
trying to access elt 1 of elt 0 of that tree_vec. Which, as you can
see, does not exist.

This should have been caught by bounds checking code in the
TREE_VEC_ELT macro (since ENABLE_TREE_CHECKING is on), but, well,
there is no bounds checking code there. So tsubst happily reads the
word one beyond the end of the inner tree_vec, which the garbage
collector has helpfully set to the 'poison' value 0xa5a5a5a5. That
then gets plugged into the structure returned from tsubst. The crash
happens significantly later when other code tries to dereference the
poison value as a pointer.

I think the right fix for tsubst() is this patch:

===================================================================
Index: cp/pt.c
--- cp/pt.c	9 Nov 2002 11:53:16 -0000	1.630
+++ cp/pt.c	13 Nov 2002 02:21:05 -0000
@@ -6539,7 +6539,8 @@ tsubst (t, args, complain, in_decl)
 	    tree arg = NULL_TREE;
 
 	    levels = TMPL_ARGS_DEPTH (args);
-	    if (level <= levels)
+	    if (level <= levels
+		&& idx < NUM_TMPL_ARGS (TMPL_ARGS_LEVEL (args, level)))
 	      arg = TMPL_ARG (args, level, idx);
 
 	    if (arg == error_mark_node)

That prevents the invalid access.
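
To make the indexing mismatch concrete, here is a minimal sketch of the
same guard using stand-in types rather than GCC's real tree structures.
The names struct vec, struct args, and lookup_arg are hypothetical and
exist only for this illustration; the one thing taken from the
discussion above is the convention that 'level' is 1-based while the
index is 0-based, and that both must be range-checked before the
element is read.

```c
#include <stddef.h>

/* Hypothetical stand-in for an inner tree_vec: a length plus elements. */
struct vec {
    size_t len;
    const int *elts;
};

/* Hypothetical stand-in for the nested 'args' structure:
   one vec per template level.  levels[0] holds level 1. */
struct args {
    size_t depth;
    const struct vec *levels;
};

/* Return a pointer to the argument at (level, idx), or NULL when either
   coordinate is out of range.  'level' is 1-based, 'idx' is 0-based,
   mirroring the convention described above.  */
static const int *
lookup_arg (const struct args *a, size_t level, size_t idx)
{
    const struct vec *v;

    /* Outer check: corresponds to "level <= levels" in the patch.  */
    if (level < 1 || level > a->depth)
        return NULL;

    v = &a->levels[level - 1];

    /* Inner check: corresponds to the condition the patch adds.  */
    if (idx >= v->len)
        return NULL;

    return &v->elts[idx];
}
```

With a single level that holds a single element, asking for level 1,
index 1 is refused by the inner check instead of reading one word past
the end of the inner vector, which is the access that went wrong here.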

Your test case then carries on to crash in c_expand_expr, which is
the other bug that we already know about, and Volker found a reduced
test case for.

I'm cc:ing Mark for comments, he's a lot more familiar with this part
of the compiler than I am. I'm a bit concerned that this does not
happen when unrelated parts of the code are changed; the original data
corruption could be even earlier. We also want to add bounds checking
to TREE_VEC_ELT.

I note that the first thing the patched compiler says about this code
is

val/internals.hpp: In function `void
   internal_vect::mult_mv(internal_vect::vector&, const
   internal_vect::matrix&, const internal_vect::vector&) [with int r = 4,
   int c = 4, int N = 3]':
val/vector.hpp:50:   instantiated from `vect::Vector
   vect::operator*(const vect::Matrix&, const vect::Vector&) [with int R = 4,
   int C = 4]'
spline.cpp:102:   instantiated from here
val/internals.hpp:84: internal compiler error: in c_expand_expr, at
   c-common.c: 4319

If Volker's right that the code is invalid, this should be considered
a more serious case of ice-on-invalid than one where an error message
came up first.

> [I hope it was okay to CC gcc lists when attaching spline.ii.gz.]

Yes, that was fine.

zw