[PATCH 0/2] Demangler crash handler

public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed

* [PATCH 0/2] Demangler crash handler
@ 2014-05-09 10:07 Gary Benson
  2014-05-09 10:09 ` [PATCH 1/2] " Gary Benson
                   ` (2 more replies)
  0 siblings, 3 replies; 50+ messages in thread
From: Gary Benson @ 2014-05-09 10:07 UTC (permalink / raw)
  To: gdb-patches

Hi all,

A number of bugs have been filed recently because of segmentation
faults in the demangler.  While such crashes are a problem for all
demangler consumers, they are particularly nasty for GDB because they
prevent the user from debugging their program at all.

This patch series arranges for GDB to catch segmentation faults in the
demangler and recover from them gracefully.  A warning is printed the
first time a fault occurs.  Example sessions with and without these
patches are included below.

None of the wrapped code uses cleanups, so each caught failure will
leak a small amount of memory.  This is undesirable but I think the
benefits here outweigh this drawback.

Ok to commit?

Thanks,
Gary

-- 
http://gbenson.net/

The current situation:

  GNU gdb (GDB) 7.7.50.20140507-cvs
  Copyright (C) 2014 Free Software Foundation, Inc.
  License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
  This is free software: you are free to change and redistribute it.
  There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
  and "show warranty" for details.
  This GDB was configured as "x86_64-unknown-linux-gnu".
  Type "show configuration" for configuration details.
  For bug reporting instructions, please see:
  <http://www.gnu.org/software/gdb/bugs/>.
  Find the GDB manual and other documentation resources online at:
  <http://www.gnu.org/software/gdb/documentation/>.
  For help, type "help".
  Type "apropos word" to search for commands related to "word"...
  (gdb) file test
  Reading symbols from test...Segmentation fault (core dumped)

With these patches:

  GNU gdb (GDB) 7.7.50.20140508-cvs
  Copyright (C) 2014 Free Software Foundation, Inc.
  License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
  This is free software: you are free to change and redistribute it.
  There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
  and "show warranty" for details.
  This GDB was configured as "x86_64-unknown-linux-gnu".
  Type "show configuration" for configuration details.
  For bug reporting instructions, please see:
  <http://www.gnu.org/software/gdb/bugs/>.
  Find the GDB manual and other documentation resources online at:
  <http://www.gnu.org/software/gdb/documentation/>.
  For help, type "help".
  Type "apropos word" to search for commands related to "word"...
  (gdb) file test
  Reading symbols from test...
  warning: internal error: demangler failed with signal 11
  Unable to demangle '_QueueNotification_QueueController__$4PPPPPPPM_A_INotice___Z'
  This is a bug, please report it to the GDB maintainers.
  done.
  (gdb) b puts
  Breakpoint 1 at 0x400488
  (gdb) r
  Starting program: /home/gary/work/archer/demangle-crashcatcher/src/test
  warning: Skipping deprecated .gdb_index section in /usr/lib/debug/lib64/ld-2.12.so.debug.
  Do "set use-deprecated-index-sections on" before the file is read
  to use the section anyway.

  Breakpoint 1, _IO_puts (str=0x4006d8 "hello") at ioputs.c:35
  35	{
  (gdb) bt
  #0  _IO_puts (str=0x4006d8 "hello") at ioputs.c:35
  #1  0x00000000004005b2 in hello () at test.cc:6
  #2  0x00000000004005bd in _QueueNotification_QueueController__$4PPPPPPPM_A_INotice___Z () at test.cc:12
  #3  0x00000000004005d3 in main (argc=1, argv=0x7fffffffe618) at test.cc:18

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH 1/2] Demangler crash handler
  2014-05-09 10:07 [PATCH 0/2] Demangler crash handler Gary Benson
@ 2014-05-09 10:09 ` Gary Benson
  2014-05-09 10:10 ` [PATCH 2/2] " Gary Benson
  2014-05-09 11:20 ` [PATCH 0/2] " Mark Kettenis
  2 siblings, 0 replies; 50+ messages in thread
From: Gary Benson @ 2014-05-09 10:09 UTC (permalink / raw)
  To: gdb-patches

This patch replaces a call to cplus_demangle with a call to
gdb_demangle.  This change was included in an RFC from last
March [1] but omitted from the eventual commit.

[1] https://sourceware.org/ml/gdb-patches/2013-03/msg00235.html

2014-05-09  Gary Benson  <gbenson@redhat.com>

	* gnu-v2-abi.c (gnuv2_value_rtti_type): Use gdb_demangle.

diff --git a/gdb/gnu-v2-abi.c b/gdb/gnu-v2-abi.c
index 4a488be..3ca01c3 100644
--- a/gdb/gnu-v2-abi.c
+++ b/gdb/gnu-v2-abi.c
@@ -251,7 +251,7 @@ gnuv2_value_rtti_type (struct value *v, int *full, int *top, int *using_enc)
     return NULL;
 
   /* If we just skip the prefix, we get screwed by namespaces.  */
-  demangled_name=cplus_demangle(linkage_name,DMGL_PARAMS|DMGL_ANSI);
+  demangled_name=gdb_demangle(linkage_name,DMGL_PARAMS|DMGL_ANSI);
   p = strchr (demangled_name, ' ');
   if (p)
     *p = '\0';

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH 2/2] Demangler crash handler
  2014-05-09 10:07 [PATCH 0/2] Demangler crash handler Gary Benson
  2014-05-09 10:09 ` [PATCH 1/2] " Gary Benson
@ 2014-05-09 10:10 ` Gary Benson
  2014-05-09 11:20 ` [PATCH 0/2] " Mark Kettenis
  2 siblings, 0 replies; 50+ messages in thread
From: Gary Benson @ 2014-05-09 10:10 UTC (permalink / raw)
  To: gdb-patches

This patch wraps calls to the demangler with a segmentation fault
handler.  A warning is printed the first time a segmentation fault
is caught.

gdb/
2014-05-09  Gary Benson  <gbenson@redhat.com>

	PR backtrace/16817
	* cp-support.c (signal.h): New include.
	(gdb_demangle_signal_handler): New function.
	(gdb_demangle): Install the above signal handler before calling
	bfd_demangle, and restore the original signal handler afterwards.

gdb/testsuite/
2014-05-09  Gary Benson  <gbenson@redhat.com>

	PR backtrace/16817
	* gdb.base/pr16817.exp: New file.

diff --git a/gdb/cp-support.c b/gdb/cp-support.c
index 91533e8..f1a0d49 100644
--- a/gdb/cp-support.c
+++ b/gdb/cp-support.c
@@ -36,6 +36,7 @@
 #include "value.h"
 #include "cp-abi.h"
 #include "language.h"
+#include <signal.h>
 
 #include "safe-ctype.h"
 
@@ -1505,12 +1506,71 @@ cp_lookup_rtti_type (const char *name, struct block *block)
   return rtti_type;
 }
 
+/* Signal handler for gdb_demangle.  */
+
+static void
+gdb_demangle_signal_handler (int signo)
+{
+  throw_error (GENERIC_ERROR, _("demangler failed with signal %d"),
+	       signo);
+}
+
 /* A wrapper for bfd_demangle.  */
 
 char *
 gdb_demangle (const char *name, int options)
 {
-  return bfd_demangle (NULL, name, options);
+  char *result = NULL;
+
+#ifdef SIGSEGV
+  volatile struct gdb_exception except;
+
+#if defined (HAVE_SIGACTION) && defined (SA_RESTART)
+  struct sigaction sa, old_sa;
+
+  sa.sa_handler = gdb_demangle_signal_handler;
+  sigemptyset (&sa.sa_mask);
+  sa.sa_flags = 0;
+  sigaction (SIGSEGV, &sa, &old_sa);
+#else
+  void (*ofunc) ();
+
+  ofunc = (void (*)()) signal (SIGSEGV, gdb_demangle_signal_handler);
+#endif
+#endif
+
+  TRY_CATCH (except, RETURN_MASK_ALL)
+    {
+      result = bfd_demangle (NULL, name, options);
+    }
+
+#ifdef SIGSEGV
+#if defined (HAVE_SIGACTION) && defined (SA_RESTART)
+  sigaction (SIGSEGV, &old_sa, NULL);
+#else
+  signal (SIGSEGV, ofunc);
+#endif
+#endif
+
+  if (except.reason < 0)
+    {
+      static int warning_printed = 0;
+
+      if (!warning_printed)
+	{
+	  warning ("internal error: %s\n"
+		   "Unable to demangle '%s'\n"
+		   "This is a bug, "
+		   "please report it to the GDB maintainers.",
+		   except.message, name);
+
+	  warning_printed = 1;
+	}
+
+      result = NULL;
+    }
+
+  return result;
 }
 
 /* Don't allow just "maintenance cplus".  */
diff --git a/gdb/testsuite/gdb.base/pr16817.exp b/gdb/testsuite/gdb.base/pr16817.exp
new file mode 100644
index 0000000..fff022c
--- /dev/null
+++ b/gdb/testsuite/gdb.base/pr16817.exp
@@ -0,0 +1,25 @@
+# Copyright 2014 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Test that a segmentation fault in the demangler does not cause GDB
+# to crash.  If the demangler becomes able to demangle the below
+# symbol then another symbol will need to be found for this test.
+
+set symbol "_QueueNotification_QueueController__\$4PPPPPPPM_A_INotice___Z"
+
+gdb_exit
+gdb_start
+gdb_test_no_output "set lang c++"
+gdb_test "maint demangle $symbol" ".*demangler failed.*"

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-09 10:07 [PATCH 0/2] Demangler crash handler Gary Benson
  2014-05-09 10:09 ` [PATCH 1/2] " Gary Benson
  2014-05-09 10:10 ` [PATCH 2/2] " Gary Benson
@ 2014-05-09 11:20 ` Mark Kettenis
  2014-05-09 15:33   ` Gary Benson
  2014-05-10 20:55   ` Florian Weimer
  2 siblings, 2 replies; 50+ messages in thread
From: Mark Kettenis @ 2014-05-09 11:20 UTC (permalink / raw)
  To: gbenson; +Cc: gdb-patches

> Date: Fri, 9 May 2014 11:06:57 +0100
> From: Gary Benson <gbenson@redhat.com>
> 
> Hi all,
> 
> A number of bugs have been filed recently because of segmentation
> faults in the demangler.  While such crashes are a problem for all
> demangler consumers, they are particularly nasty for GDB because they
> prevent the user from debugging their program at all.
> 
> This patch series arranges for GDB to catch segmentation faults in the
> demangler and recover from them gracefully.  A warning is printed the
> first time a fault occurs.  Example sessions with and without these
> patches are included below.
> 
> None of the wrapped code uses cleanups, so each caught failure will
> leak a small amount of memory.  This is undesirable but I think the
> benefits here outweigh this drawback.
> 
> Ok to commit?

No.  It's this skind of duct-tape that will make sure that bugs in the
demangler won't get fixed.  Apart from removing the incentive to fix
the bugs, these SIGSEGV signal handlers make actually fixing the bugs
harder as you won't have core dumps.

Besides, any signal handler that does more than just setting a flag is
probably broken.  Did you verify that you only call async-signal-safe
functions in the signal handler code path?

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-09 11:20 ` [PATCH 0/2] " Mark Kettenis
@ 2014-05-09 15:33   ` Gary Benson
  2014-05-11  5:17     ` Doug Evans
  2014-05-11 20:23     ` Mark Kettenis
  2014-05-10 20:55   ` Florian Weimer
  1 sibling, 2 replies; 50+ messages in thread
From: Gary Benson @ 2014-05-09 15:33 UTC (permalink / raw)
  To: Mark Kettenis; +Cc: gdb-patches

Mark Kettenis wrote:
> > A number of bugs have been filed recently because of segmentation
> > faults in the demangler.  While such crashes are a problem for all
> > demangler consumers, they are particularly nasty for GDB because
> > they prevent the user from debugging their program at all.
> > 
> > This patch series arranges for GDB to catch segmentation faults
> > in the demangler and recover from them gracefully.  A warning is
> > printed the first time a fault occurs.  Example sessions with and
> > without these patches are included below.
> > 
> > None of the wrapped code uses cleanups, so each caught failure
> > will leak a small amount of memory.  This is undesirable but I
> > think the benefits here outweigh this drawback.
> > 
> > Ok to commit?
> 
> No.  It's this skind of duct-tape that will make sure that bugs in
> the demangler won't get fixed.  Apart from removing the incentive to
> fix the bugs, these SIGSEGV signal handlers make actually fixing the
> bugs harder as you won't have core dumps.

I would normally agree with you 100% on this issue Mark, but in this
case I think a handler is justified.  If the demangler crashes because
of a symbol in the users program then the user cannot debug their
program at all.  If the demangler were simple and well understood then
that would be fine but it's not: the demangler is complex, the
specification it's following is complex, and everything's complicated
further because you can't allocate heap and you have to roll your own
data structures.  The reality is that the libiberty demangler is a
breeding ground for segfaults, and GDB needs to be able to deal with
this.

It's true that you don't get core dumps with this patch, but what you
do get in return is a printed warning that includes the symbol that
caused the crash.  That's all you need in most cases.  The five recent
demangler crashes (14963, 16593, 16752, 16817 and 16845) all required
digging by either the reporter or a GDB developer to uncover the
failing symbol.  Printing the offending symbol means this work is
already done.

If the lack of core dumps is a showstopper for you then I can
update the patch to allow disabling the handler with
"maint set handle-demangler-crashes 0" or some similar thing.

> Besides, any signal handler that does more than just setting a flag
> is probably broken.  Did you verify that you only call async-signal-
> safe functions in the signal handler code path?

I didn't think this was necessary as to my knowledge SIGSEGV is only
ever emitted synchronously.  If it is an issue then the patch could
be reworked to use (sig)longjmp as included below.

Thanks,
Gary


diff --git a/gdb/cp-support.c b/gdb/cp-support.c
index 91533e8..5e79fb4 100644
--- a/gdb/cp-support.c
+++ b/gdb/cp-support.c
@@ -36,6 +36,7 @@
 #include "value.h"
 #include "cp-abi.h"
 #include "language.h"
+#include <signal.h>
 
 #include "safe-ctype.h"
 
@@ -1505,12 +1506,89 @@ cp_lookup_rtti_type (const char *name, struct block *block)
   return rtti_type;
 }
 
+#ifdef SIGSEGV
+
+/* PortabiWrap set/long jmp so that it's more portable.  */
+
+#if defined(HAVE_SIGSETJMP)
+#define SIGJMP_BUF		sigjmp_buf
+#define SIGSETJMP(buf)		sigsetjmp((buf), 1)
+#define SIGLONGJMP(buf,val)	siglongjmp((buf), (val))
+#else
+#define SIGJMP_BUF		jmp_buf
+#define SIGSETJMP(buf)		setjmp(buf)
+#define SIGLONGJMP(buf,val)	longjmp((buf), (val))
+#endif
+
+/* Stack context and environment for demangler crash recovery.  */
+
+static SIGJMP_BUF gdb_demangle_jmp_buf;
+
+/* Signal handler for gdb_demangle.  */
+
+static void
+gdb_demangle_signal_handler (int signo)
+{
+  SIGLONGJMP (gdb_demangle_jmp_buf, signo);
+}
+
+#endif
+
 /* A wrapper for bfd_demangle.  */
 
 char *
 gdb_demangle (const char *name, int options)
 {
-  return bfd_demangle (NULL, name, options);
+  char *result = NULL;
+  int crash_signal = 0;
+
+#ifdef SIGSEGV
+#if defined (HAVE_SIGACTION) && defined (SA_RESTART)
+  struct sigaction sa, old_sa;
+
+  sa.sa_handler = gdb_demangle_signal_handler;
+  sigemptyset (&sa.sa_mask);
+  sa.sa_flags = 0;
+  sigaction (SIGSEGV, &sa, &old_sa);
+#else
+  void (*ofunc) ();
+
+  ofunc = (void (*)()) signal (SIGSEGV, gdb_demangle_signal_handler);
+#endif
+
+  crash_signal = SIGSETJMP (gdb_demangle_jmp_buf);
+#endif
+
+  if (crash_signal == 0)
+    result = bfd_demangle (NULL, name, options);
+
+#ifdef SIGSEGV
+#if defined (HAVE_SIGACTION) && defined (SA_RESTART)
+  sigaction (SIGSEGV, &old_sa, NULL);
+#else
+  signal (SIGSEGV, ofunc);
+#endif
+#endif
+
+  if (crash_signal != 0)
+    {
+      static int warning_printed = 0;
+
+      if (!warning_printed)
+	{
+	  warning ("internal error: demangler failed with signal %d\n"
+		   "Unable to demangle '%s'\n"
+		   "This is a bug, "
+		   "please report it to the GDB maintainers.",
+		   crash_signal, name);
+
+	  warning_printed = 1;
+	}
+
+      result = NULL;
+    }
+
+  return result;
 }
 
 /* Don't allow just "maintenance cplus".  */

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-09 11:20 ` [PATCH 0/2] " Mark Kettenis
  2014-05-09 15:33   ` Gary Benson
@ 2014-05-10 20:55   ` Florian Weimer
  2014-05-11  5:10     ` Doug Evans
                       ` (2 more replies)
  1 sibling, 3 replies; 50+ messages in thread
From: Florian Weimer @ 2014-05-10 20:55 UTC (permalink / raw)
  To: Mark Kettenis; +Cc: gbenson, gdb-patches

* Mark Kettenis:

> No.  It's this skind of duct-tape that will make sure that bugs in the
> demangler won't get fixed.  Apart from removing the incentive to fix
> the bugs, these SIGSEGV signal handlers make actually fixing the bugs
> harder as you won't have core dumps.

I find this approach extremely odd as well.

> Besides, any signal handler that does more than just setting a flag is
> probably broken.  Did you verify that you only call async-signal-safe
> functions in the signal handler code path?

At least it doesn't make things much worse because the program is
already in a funny state when the signal is generated.

It would be more reliable to run the demangler in a separate process.

What's so difficult about fixing the demangler?

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-10 20:55   ` Florian Weimer
@ 2014-05-11  5:10     ` Doug Evans
  2014-05-13 10:22     ` Gary Benson
  2014-05-14 14:18     ` Pedro Alves
  2 siblings, 0 replies; 50+ messages in thread
From: Doug Evans @ 2014-05-11  5:10 UTC (permalink / raw)
  To: Florian Weimer; +Cc: Mark Kettenis, gbenson, gdb-patches

On Sat, May 10, 2014 at 1:55 PM, Florian Weimer <fw@deneb.enyo.de> wrote:
> * Mark Kettenis:
>
>> No.  It's this skind of duct-tape that will make sure that bugs in the
>> demangler won't get fixed.  Apart from removing the incentive to fix
>> the bugs, these SIGSEGV signal handlers make actually fixing the bugs
>> harder as you won't have core dumps.
>
> I find this approach extremely odd as well.
>
>> Besides, any signal handler that does more than just setting a flag is
>> probably broken.  Did you verify that you only call async-signal-safe
>> functions in the signal handler code path?
>
> At least it doesn't make things much worse because the program is
> already in a funny state when the signal is generated.
>
> It would be more reliable to run the demangler in a separate process.

gdb can demangle a lot of symbols.
I'd be worried about the performance.

> What's so difficult about fixing the demangler?

It's not a question of not fixing the demangler.
It's a question of improving the gdb user experience.
gdb segfaults in the middle of a debugging session ARE BAD.
gdb's just a tool for solving problems, I'd rather it be robust.

Note that gdb has for a long time given the user the option of
continuing the debugging session after an internal error, instead of
just going straight to a core dump.
Demangler segv handling is no different, and if people feel the
problem is sufficient to warrant taking this step (and assuming any
technical problems are solvable), I'm ok with it.
[Whether there are intractable technical problems here I will leave to
a separate thread/subthread.  Offhand I don't know.]

As for worries about demangler bugs being less often fixed ... it's
not clear to me that would be the case.
How often would we get actionable segfault reports vs reports that
specifically say "the demangler is busted, and here's the string that
triggers the problem" ?

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-09 15:33   ` Gary Benson
@ 2014-05-11  5:17     ` Doug Evans
  2014-05-13 10:20       ` Gary Benson
  2014-05-11 20:23     ` Mark Kettenis
  1 sibling, 1 reply; 50+ messages in thread
From: Doug Evans @ 2014-05-11  5:17 UTC (permalink / raw)
  To: Gary Benson; +Cc: Mark Kettenis, gdb-patches

On Fri, May 9, 2014 at 8:33 AM, Gary Benson <gbenson@redhat.com> wrote:
> [...]
> +  if (crash_signal != 0)
> +    {
> +      static int warning_printed = 0;
> +
> +      if (!warning_printed)
> +       {
> +         warning ("internal error: demangler failed with signal %d\n"
> +                  "Unable to demangle '%s'\n"
> +                  "This is a bug, "
> +                  "please report it to the GDB maintainers.",
> +                  crash_signal, name);
> +
> +         warning_printed = 1;
> +       }
> +
> +      result = NULL;
> +    }
> +
> +  return result;

Hi.

Applying "Consistency Is Good" to this patch,
I wonder if we should do something similar to what we do for internal errors.
I'm not sure I would use the same flag (grep for
internal_problem_modes and friends in utils.c), but
OTOH I wouldn't want a proliferation of options for controlling each
particular kind of "crash".

What do you think?

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-09 15:33   ` Gary Benson
  2014-05-11  5:17     ` Doug Evans
@ 2014-05-11 20:23     ` Mark Kettenis
  2014-05-13 10:21       ` Gary Benson
  1 sibling, 1 reply; 50+ messages in thread
From: Mark Kettenis @ 2014-05-11 20:23 UTC (permalink / raw)
  To: gbenson; +Cc: gdb-patches

> Date: Fri, 9 May 2014 16:33:06 +0100
> From: Gary Benson <gbenson@redhat.com>
> 
> Mark Kettenis wrote:
> > > A number of bugs have been filed recently because of segmentation
> > > faults in the demangler.  While such crashes are a problem for all
> > > demangler consumers, they are particularly nasty for GDB because
> > > they prevent the user from debugging their program at all.
> > > 
> > > This patch series arranges for GDB to catch segmentation faults
> > > in the demangler and recover from them gracefully.  A warning is
> > > printed the first time a fault occurs.  Example sessions with and
> > > without these patches are included below.
> > > 
> > > None of the wrapped code uses cleanups, so each caught failure
> > > will leak a small amount of memory.  This is undesirable but I
> > > think the benefits here outweigh this drawback.
> > > 
> > > Ok to commit?
> > 
> > No.  It's this skind of duct-tape that will make sure that bugs in
> > the demangler won't get fixed.  Apart from removing the incentive to
> > fix the bugs, these SIGSEGV signal handlers make actually fixing the
> > bugs harder as you won't have core dumps.
> 
> I would normally agree with you 100% on this issue Mark, but in this
> case I think a handler is justified.  If the demangler crashes because
> of a symbol in the users program then the user cannot debug their
> program at all.  If the demangler were simple and well understood then
> that would be fine but it's not: the demangler is complex, the
> specification it's following is complex, and everything's complicated
> further because you can't allocate heap and you have to roll your own
> data structures.  The reality is that the libiberty demangler is a
> breeding ground for segfaults, and GDB needs to be able to deal with
> this.

There are entire subsystems in GDB that are a breeding ground for
segfaults.  Should we therefore wrap evrything?

It is obvious that the demangler is a breeding ground for segmentation
faults.  It uses strcpy, strcat and sprintf.  So it's probably full of
buffer overflows.  I bet that if those are fixed, the SIGSEGVs are
gone.

Note that only some of those buffer overflows will generate a SIGSEGV.
Others will corrupt random memory.  And you can't patch those up with
a signal handler.

> It's true that you don't get core dumps with this patch, but what you
> do get in return is a printed warning that includes the symbol that
> caused the crash.  That's all you need in most cases.  The five recent
> demangler crashes (14963, 16593, 16752, 16817 and 16845) all required
> digging by either the reporter or a GDB developer to uncover the
> failing symbol.  Printing the offending symbol means this work is
> already done.
> 
> If the lack of core dumps is a showstopper for you then I can
> update the patch to allow disabling the handler with
> "maint set handle-demangler-crashes 0" or some similar thing.

Not acceptable.  Unless you make it the default...

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-11  5:17     ` Doug Evans
@ 2014-05-13 10:20       ` Gary Benson
  2014-05-13 19:29         ` Tom Tromey
  2014-05-13 19:39         ` Tom Tromey
  0 siblings, 2 replies; 50+ messages in thread
From: Gary Benson @ 2014-05-13 10:20 UTC (permalink / raw)
  To: Doug Evans; +Cc: Mark Kettenis, gdb-patches

Doug Evans wrote:
> On Fri, May 9, 2014 at 8:33 AM, Gary Benson <gbenson@redhat.com> wrote:
> > [...]
> > +  if (crash_signal != 0)
> > +    {
> > +      static int warning_printed = 0;
> > +
> > +      if (!warning_printed)
> > +       {
> > +         warning ("internal error: demangler failed with signal %d\n"
> > +                  "Unable to demangle '%s'\n"
> > +                  "This is a bug, "
> > +                  "please report it to the GDB maintainers.",
> > +                  crash_signal, name);
> > +
> > +         warning_printed = 1;
> > +       }
> > +
> > +      result = NULL;
> > +    }
> > +
> > +  return result;
> 
> Applying "Consistency Is Good" to this patch, I wonder if we should
> do something similar to what we do for internal errors.  I'm not
> sure I would use the same flag (grep for internal_problem_modes and
> friends in utils.c), but OTOH I wouldn't want a proliferation of
> options for controlling each particular kind of "crash".
> 
> What do you think?

I did not realize internal_error had the ability to continue over an
error.  Rather than doing something similar to internal_error I can
just update the patch to use it.  I actually prefer this to using
regular warning as the user has to explicity continue over the error.
They cannot miss that something is wrong and that they are now in
uncharted territory, but at the same time they have to option to
try and continue using GDB to debug their program.

I attached an updated patch that uses internal_warning below.  It
might be desirable to create a new warning (demangler_warning) that
would work like internal_warning but not offer the possibility of
saving a not-particularly-useful core dump.  That's a refinement
I'm not sure is necessary, however: this isn't everyday code.

As an aside, would it be worth printing an extra line or two from
internal_vproblem asking the user to report the bug and directing
them to the bug tracker and/or the mailing lists?

Thanks,
Gary

--
diff --git a/gdb/cp-support.c b/gdb/cp-support.c
index 91533e8..a99c827 100644
--- a/gdb/cp-support.c
+++ b/gdb/cp-support.c
@@ -36,6 +36,7 @@
 #include "value.h"
 #include "cp-abi.h"
 #include "language.h"
+#include <signal.h>
 
 #include "safe-ctype.h"
 
@@ -1505,12 +1506,88 @@ cp_lookup_rtti_type (const char *name, struct block *block)
   return rtti_type;
 }
 
+#ifdef SIGSEGV
+
+/* PortabiWrap set/long jmp so that it's more portable.  */
+
+#if defined(HAVE_SIGSETJMP)
+#define SIGJMP_BUF		sigjmp_buf
+#define SIGSETJMP(buf)		sigsetjmp((buf), 1)
+#define SIGLONGJMP(buf,val)	siglongjmp((buf), (val))
+#else
+#define SIGJMP_BUF		jmp_buf
+#define SIGSETJMP(buf)		setjmp(buf)
+#define SIGLONGJMP(buf,val)	longjmp((buf), (val))
+#endif
+
+/* Stack context and environment for demangler crash recovery.  */
+
+static SIGJMP_BUF gdb_demangle_jmp_buf;
+
+/* Signal handler for gdb_demangle.  */
+
+static void
+gdb_demangle_signal_handler (int signo)
+{
+  SIGLONGJMP (gdb_demangle_jmp_buf, signo);
+}
+
+#endif
+
 /* A wrapper for bfd_demangle.  */
 
 char *
 gdb_demangle (const char *name, int options)
 {
-  return bfd_demangle (NULL, name, options);
+  char *result = NULL;
+  int crash_signal = 0;
+
+#ifdef SIGSEGV
+#if defined (HAVE_SIGACTION) && defined (SA_RESTART)
+  struct sigaction sa, old_sa;
+
+  sa.sa_handler = gdb_demangle_signal_handler;
+  sigemptyset (&sa.sa_mask);
+  sa.sa_flags = 0;
+  sigaction (SIGSEGV, &sa, &old_sa);
+#else
+  void (*ofunc) ();
+
+  ofunc = (void (*)()) signal (SIGSEGV, gdb_demangle_signal_handler);
+#endif
+
+  crash_signal = SIGSETJMP (gdb_demangle_jmp_buf);
+#endif
+
+  if (crash_signal == 0)
+    result = bfd_demangle (NULL, name, options);
+
+#ifdef SIGSEGV
+#if defined (HAVE_SIGACTION) && defined (SA_RESTART)
+  sigaction (SIGSEGV, &old_sa, NULL);
+#else
+  signal (SIGSEGV, ofunc);
+#endif
+#endif
+
+  if (crash_signal != 0)
+    {
+      static int error_reported = 0;
+
+      if (!error_reported)
+	{
+	  internal_warning (__FILE__, __LINE__,
+			    _("unable to demangle '%s' "
+			      "(demangler failed with signal %d)"),
+			    name, crash_signal);
+
+	  error_reported = 1;
+	}
+
+      result = NULL;
+    }
+
+  return result;
 }
 
 /* Don't allow just "maintenance cplus".  */

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-11 20:23     ` Mark Kettenis
@ 2014-05-13 10:21       ` Gary Benson
  2014-05-13 16:05         ` Pedro Alves
  0 siblings, 1 reply; 50+ messages in thread
From: Gary Benson @ 2014-05-13 10:21 UTC (permalink / raw)
  To: Mark Kettenis; +Cc: gdb-patches

Mark Kettenis wrote:
> There are entire subsystems in GDB that are a breeding ground for
> segfaults.  Should we therefore wrap evrything?

This is a straw man.  I'm not proposing we wrap all subsystems.
I'm proposing we wrap one specific subsystem that has caused
problems in the past and is likely to continue to do so into
the forseeable future.

I'd argue that the demangler warrants special consideration as
in one sense it's not a subsystem of GDB but rather a (somewhat
unloved?) side-project of GCC that we borrow.  The situation
we find ourselves in is this:

 * GDB is more likely to see demangler crashes than libstdc++

   GDB demangles all symbols in every file it loads, always.  In
   libstdc++ the demangler is only called in error situations,
   and then only to demangle the few symbols that appear in the
   backtrace.

   So: we get the bug reports, and have to triage them, even
   though the code is not "ours".   

 * Bugs have a more serious affect on GDB than on libstdc++

   Currently a demangler crash will cause GDB to segfault, usually at
   startup.  A demangler crash in libstdc++ is not such a big deal as
   the application was likely crashing anyway.

   So: bugs that are high-priority for us are low-priority for the
   "main" demangler authors, and we end up having to fix them.

 * Demangler patches often get waved through with minimal scrutiny

   The few people who really know the demangler are busy with other
   things, and the above two points mean demangler issues are low-
   priority for them.

   So: bugs are going to keep on coming our way.

This situation is why I feel the demangler merits special handling.

> It is obvious that the demangler is a breeding ground for
> segmentation faults.  It uses strcpy, strcat and sprintf.  So it's
> probably full of buffer overflows.  I bet that if those are fixed,
> the SIGSEGVs are gone.

That's not been the case for the crashes I recently fixed.  The
demangler parses the mangled symbol into a tree that it then walks to
generate the result.  All pretty standard so far, but in certain cases
sections of the tree can be (re)entered from other, non-local parts of
the tree.  Both crashes I fixed involved a stack overflow caused by a
loop in the tree caused by a reentry.

> Note that only some of those buffer overflows will generate a
> SIGSEGV.  Others will corrupt random memory.  And you can't patch
> those up with a signal handler.

Agreed.  I'm not pretending this will solve the underlying issues,
and I'm not making an excuse to not fix demangler crashes.  I was
merely trying to make a bad situation less bad by a) allowing the
possibility that the user could continue to use GDB to debug their
program, and b) helping the user make a more meaningful bug report.

I agree that there might be cases where this patch could turn a
demangler failure into some other difficult-to-debug crash that would
waste developer time.  But developer time is being wasted now: every
demangler filed requires the reporter or (more likely) one of us to
trail through the core file to identify that the issue is a demangler
bug and then pull out the symbol that caused the crash.  This patch
does that work.

In my response to Doug's mail [1] I updated the patch to use
internal_warning.  The user is interrupted (so they cannot miss the
warning) and the questions they are asked make it plain that by
continuing over the error they are now living on the edge.  Does
that minimise the possibility of us wasting time chasing memory
corruption errors enough for you?

> > It's true that you don't get core dumps with this patch, but what
> > you do get in return is a printed warning that includes the symbol
> > that caused the crash.  That's all you need in most cases.  The
> > five recent demangler crashes (14963, 16593, 16752, 16817 and
> > 16845) all required digging by either the reporter or a GDB
> > developer to uncover the failing symbol.  Printing the offending
> > symbol means this work is already done.
> > 
> > If the lack of core dumps is a showstopper for you then I can
> > update the patch to allow disabling the handler with "maint set
> > handle-demangler-crashes 0" or some similar thing.
> 
> Not acceptable.  Unless you make it the default...

Disabled by default wouldn't stop us having to triage the bugs that
are filed.

Would an enabled-by-default crash catcher using internal_warning as
above work for you?

Thanks,
Gary

-- 
[1] https://sourceware.org/ml/gdb-patches/2014-05/msg00151.html

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-10 20:55   ` Florian Weimer
  2014-05-11  5:10     ` Doug Evans
@ 2014-05-13 10:22     ` Gary Benson
  2014-05-13 18:22       ` Florian Weimer
  2014-05-14 14:18     ` Pedro Alves
  2 siblings, 1 reply; 50+ messages in thread
From: Gary Benson @ 2014-05-13 10:22 UTC (permalink / raw)
  To: Florian Weimer; +Cc: Mark Kettenis, gdb-patches

Florian Weimer wrote:
> * Mark Kettenis:
> > Besides, any signal handler that does more than just setting a
> > flag is probably broken.  Did you verify that you only call
> > async-signal-safe functions in the signal handler code path?
> 
> At least it doesn't make things much worse because the program is
> already in a funny state when the signal is generated.

Yes.  I'm not aiming for perfection, I was merely looking for simple
ways to get some protection from the kind of crashes I've observed.

> It would be more reliable to run the demangler in a separate
> process.

Agreed.  There are two issues though.  First is performance:
GDB can demangle a lot of symbols--starting GDB on LibreOffice
demangles more than 369,000 symbols on my machine, and typing
"b <Tab><Tab>" demangles 2,740,000 more--so the IPC overhead
has to be seriously low.

The second issue is my time.  I could spare the time to write
a simple, low-LOC patch like this, but if the only acceptable
solution is the demangler in a separate process then that has
to be way down my priority list--especially given that that
solution would likely be rejected on performance grounds.
I'm pretty sure there are people on this list for whom any
loss of performance would be unacceptable.

> What's so difficult about fixing the demangler?

It's not so difficult really, but fixes take time to diagnose,
reproduce, fix, test, review, etc.  During that interval the user
cannot use GDB to debug their program.  This patch is about
allowing the user the possibility of getting their work done.

Cheers,
Gary

-- 
http://gbenson.net/

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-13 10:21       ` Gary Benson
@ 2014-05-13 16:05         ` Pedro Alves
  2014-05-15 13:24           ` Gary Benson
  0 siblings, 1 reply; 50+ messages in thread
From: Pedro Alves @ 2014-05-13 16:05 UTC (permalink / raw)
  To: Gary Benson, Mark Kettenis; +Cc: gdb-patches

On 05/13/14 11:21, Gary Benson wrote:
> I'd argue that the demangler warrants special consideration as
> in one sense it's not a subsystem of GDB but rather a (somewhat
> unloved?) side-project of GCC that we borrow.  The situation
> we find ourselves in is this:
> 
>  * GDB is more likely to see demangler crashes than libstdc++

True.

> 
>    GDB demangles all symbols in every file it loads, always.  In
>    libstdc++ the demangler is only called in error situations,
>    and then only to demangle the few symbols that appear in the
>    backtrace.
> 
>    So: we get the bug reports, and have to triage them, even
>    though the code is not "ours".   

I really can't agree with the "not ours" sentiment.  Not if we think
in terms of the GNU toolchain.  IMO, we should foremost think of
ourselves as development toolchain toolsmiths over gdb hackers.

Thought experiment #1: I hereby declare that the demangler maintainers
are GDB hackers.  In order to get one's code in the demangler,
one has to get it past those specific (and very busy) GDB hackers.

Thought experiment #2: I'm going to import and fork the demangler
into gdb/ directly, and declare that from here on, we get to fix it
ourselves.

Either case doesn't seem to make a difference to me.  Except that
with #2, we'd end with an incompetent demangler maintainer. :-)

>  * Bugs have a more serious affect on GDB than on libstdc++

True.

> 
>    Currently a demangler crash will cause GDB to segfault, usually at
>    startup.  A demangler crash in libstdc++ is not such a big deal as
>    the application was likely crashing anyway.
> 
>    So: bugs that are high-priority for us are low-priority for the
>    "main" demangler authors, and we end up having to fix them.

True.

So...  Since this subsystem is so important, should we write our
own demangler from scratch then?  Would that result in a better
outcome?  Or, can we perhaps extend the demangler a little
to make it more robust itself?

Is there something that could be done to demangler's testsuite to
make it more effective in catching bugs that GDB would catch?
(I mean other than throwing more symbols at it.
Though a fuzzy tester that throws pseudo-random symbols at it
would be a nice project on its own.)

> 
>  * Demangler patches often get waved through with minimal scrutiny

That does sound like a problem.  Can we work with the gcc folks to
somehow prevent this from happening?  E.g., perhaps we could ask them
to CC all demangler patches to the gdb-patches@ list as well, like
supposedly done for some other shared files.

> 
>    The few people who really know the demangler are busy with other
>    things, and the above two points mean demangler issues are low-
>    priority for them.

It's not clear to me whether the issues are with demangling itself
being complex, or with the current implementation.  In any case,
this doesn't sound like a problem on their end, but on ours.
But if the demangler was "ours", who would be doing the
fixing anyway?

-- 
Pedro Alves

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-13 10:22     ` Gary Benson
@ 2014-05-13 18:22       ` Florian Weimer
  2014-05-13 18:42         ` Pedro Alves
  2014-05-13 19:22         ` Gary Benson
  0 siblings, 2 replies; 50+ messages in thread
From: Florian Weimer @ 2014-05-13 18:22 UTC (permalink / raw)
  To: Gary Benson; +Cc: Mark Kettenis, gdb-patches

* Gary Benson:

>> It would be more reliable to run the demangler in a separate
>> process.
>
> Agreed.  There are two issues though.  First is performance:
> GDB can demangle a lot of symbols--starting GDB on LibreOffice
> demangles more than 369,000 symbols on my machine, and typing
> "b <Tab><Tab>" demangles 2,740,000 more--so the IPC overhead
> has to be seriously low.

Is most of the demangling of the bulk kind?  These cases are easy in
the sense that a ping-pong between the two processes is avoidable.

More worrisome would be the occasional demangling as part of other
tasks, where the context switches might become very visible.

>> What's so difficult about fixing the demangler?
>
> It's not so difficult really, but fixes take time to diagnose,
> reproduce, fix, test, review, etc.

I was referring to a more permanent fix, not just patching the bug of
the day as users encounter it.

> During that interval the user cannot use GDB to debug their program.
> This patch is about allowing the user the possibility of getting
> their work done.

I suppose I could run the demangler on all symbols in Fedora and
downstream and see what breaks.  Would that help?

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-13 18:22       ` Florian Weimer
@ 2014-05-13 18:42         ` Pedro Alves
  2014-05-13 19:16           ` Gary Benson
  2014-05-13 19:20           ` Florian Weimer
  2014-05-13 19:22         ` Gary Benson
  1 sibling, 2 replies; 50+ messages in thread
From: Pedro Alves @ 2014-05-13 18:42 UTC (permalink / raw)
  To: Florian Weimer, Gary Benson; +Cc: Mark Kettenis, gdb-patches

On 05/13/14 19:22, Florian Weimer wrote:

> I suppose I could run the demangler on all symbols in Fedora and
> downstream and see what breaks.  Would that help?

That'd be interesting.

Or you could just pass that list through c++filt,
the binutils program, which also uses the same demangler.
That might be faster, though obviously won't catch gdb-specific
bugs.

Perhaps we could also have the list of symbols accessible
somewhere, so that anyone could try it without having to
build/install the world?

-- 
Pedro Alves

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-13 18:42         ` Pedro Alves
@ 2014-05-13 19:16           ` Gary Benson
  2014-05-13 19:19             ` Pedro Alves
  2014-05-13 19:20           ` Florian Weimer
  1 sibling, 1 reply; 50+ messages in thread
From: Gary Benson @ 2014-05-13 19:16 UTC (permalink / raw)
  To: Pedro Alves; +Cc: Florian Weimer, Mark Kettenis, gdb-patches

Pedro Alves wrote:
> On 05/13/14 19:22, Florian Weimer wrote:
> > I suppose I could run the demangler on all symbols in Fedora and
> > downstream and see what breaks.  Would that help?
> 
> That'd be interesting.

Yes, very much so.

> Or you could just pass that list through c++filt, the binutils
> program, which also uses the same demangler.  That might be
> faster, though obviously won't catch gdb-specific bugs.

c++filt would be enough to catch the crashes we've seen recently.
It would be best to have c++filt built with the latest libiberty
from GCC upstream, as two fixes have gone in recently.

> Perhaps we could also have the list of symbols accessible somewhere,
> so that anyone could try it without having to build/install the
> world?

Ah, yes, it'd be great to have an install-the-world symbols list
somewhere to try things with.

Cheers,
Gary

-- 
http://gbenson.net/

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-13 19:16           ` Gary Benson
@ 2014-05-13 19:19             ` Pedro Alves
  2014-05-14  9:11               ` Gary Benson
  0 siblings, 1 reply; 50+ messages in thread
From: Pedro Alves @ 2014-05-13 19:19 UTC (permalink / raw)
  To: Gary Benson; +Cc: Florian Weimer, Mark Kettenis, gdb-patches

On 05/13/14 20:16, Gary Benson wrote:
> Pedro Alves wrote:

>> Or you could just pass that list through c++filt, the binutils
>> program, which also uses the same demangler.  That might be
>> faster, though obviously won't catch gdb-specific bugs.
> 
> c++filt would be enough to catch the crashes we've seen recently.
> It would be best to have c++filt built with the latest libiberty
> from GCC upstream, as two fixes have gone in recently.

The fix should have been merged to the binutils-gdb repo too, then
it's just a matter of building binutils ToT.
Has it not been yet?  We need to make sure that's done.

-- 
Pedro Alves

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-13 18:42         ` Pedro Alves
  2014-05-13 19:16           ` Gary Benson
@ 2014-05-13 19:20           ` Florian Weimer
  2014-05-13 19:22             ` Pedro Alves
  1 sibling, 1 reply; 50+ messages in thread
From: Florian Weimer @ 2014-05-13 19:20 UTC (permalink / raw)
  To: Pedro Alves; +Cc: Gary Benson, Mark Kettenis, gdb-patches

* Pedro Alves:

> On 05/13/14 19:22, Florian Weimer wrote:
>
>> I suppose I could run the demangler on all symbols in Fedora and
>> downstream and see what breaks.  Would that help?
>
> That'd be interesting.
>
> Or you could just pass that list through c++filt,
> the binutils program, which also uses the same demangler.
> That might be faster, though obviously won't catch gdb-specific
> bugs.

I did this with the ELF symbols in Fedora 20 and rawhide, but didn't
get any crashes.  I haven't extracted DWARF symbols yet, so I can test
that immediately.

> Perhaps we could also have the list of symbols accessible
> somewhere, so that anyone could try it without having to
> build/install the world?

It's already in a PostgreSQL database.  Send me a note if you want
access.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-13 19:20           ` Florian Weimer
@ 2014-05-13 19:22             ` Pedro Alves
  0 siblings, 0 replies; 50+ messages in thread
From: Pedro Alves @ 2014-05-13 19:22 UTC (permalink / raw)
  To: Florian Weimer; +Cc: Gary Benson, Mark Kettenis, gdb-patches

On 05/13/14 20:20, Florian Weimer wrote:
> * Pedro Alves:

>> Perhaps we could also have the list of symbols accessible
>> somewhere, so that anyone could try it without having to
>> build/install the world?
> 
> It's already in a PostgreSQL database.  Send me a note if you want
> access.

Can we make that easier to get at?  Say, upload a text file
somewhere public?  Ideally we'd make it super easy to get, so
that we could point anyone who is doing demangler changes at it.

-- 
Pedro Alves

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-13 18:22       ` Florian Weimer
  2014-05-13 18:42         ` Pedro Alves
@ 2014-05-13 19:22         ` Gary Benson
  2014-05-13 19:36           ` Tom Tromey
  1 sibling, 1 reply; 50+ messages in thread
From: Gary Benson @ 2014-05-13 19:22 UTC (permalink / raw)
  To: Florian Weimer; +Cc: Mark Kettenis, gdb-patches

Florian Weimer wrote:
> * Gary Benson:
> > > It would be more reliable to run the demangler in a separate
> > > process.
> >
> > Agreed.  There are two issues though.  First is performance:
> > GDB can demangle a lot of symbols--starting GDB on LibreOffice
> > demangles more than 369,000 symbols on my machine, and typing
> > "b <Tab><Tab>" demangles 2,740,000 more--so the IPC overhead
> > has to be seriously low.
> 
> Is most of the demangling of the bulk kind?  These cases are easy in
> the sense that a ping-pong between the two processes is avoidable.
> 
> More worrisome would be the occasional demangling as part of other
> tasks, where the context switches might become very visible.

It probably is bulk, though I don't know for sure.  You could probably
rearrange GDB to demangle a whole object file or CU's symbols at once.

> > > What's so difficult about fixing the demangler?
> >
> > It's not so difficult really, but fixes take time to diagnose,
> > reproduce, fix, test, review, etc.
> 
> I was referring to a more permanent fix, not just patching the bug
> of the day as users encounter it.

Ah, sorry.  The main difficulty for me is that I don't fully
understand the mangling spec.  The people who really know it backwards
are the people who wrote the libiberty demangler in the first place :)
That, and time.  This demangler work is not my main job at the moment.
It just happens that I (foolishly?) fixed a bug on it last year so now
I'm Demangler Man until someone else falls into the trap.

Cheers,
Gary

-- 
http://gbenson.net/

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-13 10:20       ` Gary Benson
@ 2014-05-13 19:29         ` Tom Tromey
  2014-05-14 13:07           ` Gary Benson
  2014-05-13 19:39         ` Tom Tromey
  1 sibling, 1 reply; 50+ messages in thread
From: Tom Tromey @ 2014-05-13 19:29 UTC (permalink / raw)
  To: Gary Benson; +Cc: Mark Kettenis, gdb-patches

Gary> I attached an updated patch that uses internal_warning below.  It
Gary> As an aside, would it be worth printing an extra line or two from
Gary> internal_vproblem asking the user to report the bug and directing
Gary> them to the bug tracker and/or the mailing lists?

I think that would be a good idea.

Tom

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-13 19:22         ` Gary Benson
@ 2014-05-13 19:36           ` Tom Tromey
  2014-05-14  9:13             ` Gary Benson
  0 siblings, 1 reply; 50+ messages in thread
From: Tom Tromey @ 2014-05-13 19:36 UTC (permalink / raw)
  To: Gary Benson; +Cc: Florian Weimer, Mark Kettenis, gdb-patches

Florian> More worrisome would be the occasional demangling as part of
Florian> other tasks, where the context switches might become very
Florian> visible.

Gary> It probably is bulk, though I don't know for sure.  You could probably
Gary> rearrange GDB to demangle a whole object file or CU's symbols at once.

It could perhaps be done but I think it would be non-trivial.
Usually in gdb whatever is calling the demangler intends to use the
result right away.

Tom

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-13 10:20       ` Gary Benson
  2014-05-13 19:29         ` Tom Tromey
@ 2014-05-13 19:39         ` Tom Tromey
  2014-05-14  9:15           ` Gary Benson
  1 sibling, 1 reply; 50+ messages in thread
From: Tom Tromey @ 2014-05-13 19:39 UTC (permalink / raw)
  To: Gary Benson; +Cc: gdb-patches

>>>>> "Gary" == Gary Benson <gbenson@redhat.com> writes:

Gary>  char *
Gary>  gdb_demangle (const char *name, int options)
Gary>  {
[...]
Gary> +  sigaction (SIGSEGV, &sa, &old_sa);
[...]
Gary> +  crash_signal = SIGSETJMP (gdb_demangle_jmp_buf);
[...]
Gary> +  sigaction (SIGSEGV, &old_sa, NULL);

This adds two calls to sigaction and a call to sigsetjmp to every
demangling invocation.

I wonder whether the performance cost of this is noticeable; and if so,
how large the effect is.

If it is too large we could perhaps arrange to do the sigaction calls
just once and see if that helps.

Tom

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-13 19:19             ` Pedro Alves
@ 2014-05-14  9:11               ` Gary Benson
  0 siblings, 0 replies; 50+ messages in thread
From: Gary Benson @ 2014-05-14  9:11 UTC (permalink / raw)
  To: Pedro Alves; +Cc: Florian Weimer, Mark Kettenis, gdb-patches

Pedro Alves wrote:
> On 05/13/14 20:16, Gary Benson wrote:
> > Pedro Alves wrote:
> > > Or you could just pass that list through c++filt, the binutils
> > > program, which also uses the same demangler.  That might be
> > > faster, though obviously won't catch gdb-specific bugs.
> > 
> > c++filt would be enough to catch the crashes we've seen recently.
> > It would be best to have c++filt built with the latest libiberty
> > from GCC upstream, as two fixes have gone in recently.
> 
> The fix should have been merged to the binutils-gdb repo too, then
> it's just a matter of building binutils ToT.

Ah, I didn't realise that's where c++filt came from.

> Has it not been yet?  We need to make sure that's done.

One fix has been merged, the other hasn't been committed to GCC yet.
I only just noticed the latter, but I've pinged it so hopefully it
can go in today.  That one's not so important though, it's a failure
to demangle a symbol that's using a completely different mangling
scheme.

Cheers,
Gary

-- 
http://gbenson.net/

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-13 19:36           ` Tom Tromey
@ 2014-05-14  9:13             ` Gary Benson
  0 siblings, 0 replies; 50+ messages in thread
From: Gary Benson @ 2014-05-14  9:13 UTC (permalink / raw)
  To: Tom Tromey; +Cc: Florian Weimer, Mark Kettenis, gdb-patches

Tom Tromey wrote:
> Florian> More worrisome would be the occasional demangling as part
> Florian> of other tasks, where the context switches might become
> Florian> very visible.
> 
> Gary> It probably is bulk, though I don't know for sure.  You could
> Gary> probably rearrange GDB to demangle a whole object file or CU's
> Gary> symbols at once.
> 
> It could perhaps be done but I think it would be non-trivial.
> Usually in gdb whatever is calling the demangler intends to use
> the result right away.

Yeah, it wouldn't be trivial, but it's probably doable.

Cheers,
Gary

-- 
http://gbenson.net/

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-13 19:39         ` Tom Tromey
@ 2014-05-14  9:15           ` Gary Benson
  0 siblings, 0 replies; 50+ messages in thread
From: Gary Benson @ 2014-05-14  9:15 UTC (permalink / raw)
  To: Tom Tromey; +Cc: gdb-patches

Tom Tromey wrote:
> >>>>> "Gary" == Gary Benson <gbenson@redhat.com> writes:
> 
> Gary>  char *
> Gary>  gdb_demangle (const char *name, int options)
> Gary>  {
> [...]
> Gary> +  sigaction (SIGSEGV, &sa, &old_sa);
> [...]
> Gary> +  crash_signal = SIGSETJMP (gdb_demangle_jmp_buf);
> [...]
> Gary> +  sigaction (SIGSEGV, &old_sa, NULL);
> 
> This adds two calls to sigaction and a call to sigsetjmp to every
> demangling invocation.
> 
> I wonder whether the performance cost of this is noticeable; and if
> so, how large the effect is.

I tested this before I mailed the patch: it's not noticable.

FWIW I did time gdb -nx -batch \
                    /usr/lib64/libreoffice/program/soffice.bin \
		    -ex "start" -ex "complete b" > /dev/null

Cheers,
Gary

-- 
http://gbenson.net/

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-13 19:29         ` Tom Tromey
@ 2014-05-14 13:07           ` Gary Benson
  0 siblings, 0 replies; 50+ messages in thread
From: Gary Benson @ 2014-05-14 13:07 UTC (permalink / raw)
  To: gdb-patches

Tom Tromey wrote:
> Gary> I attached an updated patch that uses internal_warning below.
> Gary> It As an aside, would it be worth printing an extra line or
> Gary> two from internal_vproblem asking the user to report the bug
> Gary> and directing them to the bug tracker and/or the mailing
> Gary> lists?
> 
> I think that would be a good idea.

https://sourceware.org/ml/gdb-patches/2014-05/msg00198.html

Cheers,
Gary

-- 
http://gbenson.net/

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-10 20:55   ` Florian Weimer
  2014-05-11  5:10     ` Doug Evans
  2014-05-13 10:22     ` Gary Benson
@ 2014-05-14 14:18     ` Pedro Alves
  2014-05-14 16:08       ` Andrew Burgess
                         ` (2 more replies)
  2 siblings, 3 replies; 50+ messages in thread
From: Pedro Alves @ 2014-05-14 14:18 UTC (permalink / raw)
  To: Florian Weimer, Mark Kettenis; +Cc: gbenson, gdb-patches

On 05/10/2014 09:55 PM, Florian Weimer wrote:
> * Mark Kettenis:
> 
>> No.  It's this skind of duct-tape that will make sure that bugs in the
>> demangler won't get fixed.  Apart from removing the incentive to fix
>> the bugs, these SIGSEGV signal handlers make actually fixing the bugs
>> harder as you won't have core dumps.
> 
> I find this approach extremely odd as well.

I have to admit I'm not super keen on using signals for this either.
For one, not all bugs trigger segmentation faults.  Then stealing
a signal handler always has multi-threading considerations.  E.g.,
gdb Python code could well spawn a thread that happens to call
something that wants its own SIGSEGV handler...  Signal handlers
are per-process, not per-thread.

How about we instead add a new hook to the demangler interface,
that allows registering a callback that has the prototype of
gdb's internal_error?

 extern void internal_error (const char *file, int line, const char *, ...)
      ATTRIBUTE_NORETURN ATTRIBUTE_PRINTF (3, 4);

That's in the same spirit as bfd_set_error_handler, but for
internal errors.  The default implementation could just abort.

 /* This is a function pointer to the routine which should handle BFD
    error messages.  It is called when a BFD routine encounters an
    error for which it wants to print a message.  Going through a
    function pointer permits a program linked against BFD to intercept
    the messages and deal with them itself.  */

 bfd_error_handler_type _bfd_error_handler = _bfd_default_error_handler;

 bfd_error_handler_type
 bfd_set_error_handler (bfd_error_handler_type pnew)
 {
   bfd_error_handler_type pold;

   pold = _bfd_error_handler;
   _bfd_error_handler = pnew;
   return pold;
 }

GDB would install a hook that would end up in internal_error, or
warning, or whatever we decide is best.

As bonus, we get the file/line number that triggered the bug.

If libgcc/libstd++ don't want any sort of overhead, then we could make the
assertions be nops under #if defined(IN_LIBGCC2) || defined(IN_GLIBCPP_V3)
That is, the same conditions __cxa_demangle is under.

Then we'd add a demangle_assert macro to the demangler, similar to
gdb_assert, that calls that hook if the assertion fails.  And then
we could sprinkle the demangler with assertions.

I think that'd be easy to do, and I'd think it's much cleaner
and robust.

-- 
Pedro Alves

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-14 14:18     ` Pedro Alves
@ 2014-05-14 16:08       ` Andrew Burgess
  2014-05-14 18:32         ` Pedro Alves
  2014-05-15 13:27       ` Gary Benson
  2014-05-20 17:05       ` Tom Tromey
  2 siblings, 1 reply; 50+ messages in thread
From: Andrew Burgess @ 2014-05-14 16:08 UTC (permalink / raw)
  To: gdb-patches

On 14/05/2014 3:01 PM, Pedro Alves wrote:
> On 05/10/2014 09:55 PM, Florian Weimer wrote:
>> * Mark Kettenis:
>>
>>> No.  It's this skind of duct-tape that will make sure that bugs in the
>>> demangler won't get fixed.  Apart from removing the incentive to fix
>>> the bugs, these SIGSEGV signal handlers make actually fixing the bugs
>>> harder as you won't have core dumps.
>>
>> I find this approach extremely odd as well.
> 
> I have to admit I'm not super keen on using signals for this either.
> For one, not all bugs trigger segmentation faults.  Then stealing
> a signal handler always has multi-threading considerations.  E.g.,
> gdb Python code could well spawn a thread that happens to call
> something that wants its own SIGSEGV handler...  Signal handlers
> are per-process, not per-thread.
> 
> How about we instead add a new hook to the demangler interface,
> that allows registering a callback that has the prototype of
> gdb's internal_error?

I thought that if the demangler couldn't demangle a symbol you
just got back NULL indicating no demangle was possible.

Given that, it's not clear to me where you'd want to use the error
handler, if you know something can't be demangled then you'd return
NULL, but if some feature wasn't implemented yet then surely you're
still better returning NULL than using the error handler, at least
that way the user of the demangler will continue using the mangled
version of the symbol.

I'm not arguing _for_ catching SEGV, I just think that an error handler
only helps with known bad states, the problem is that I think in all
known bad states the demangler should just return NULL, it's the
unknown bad states that are an issue here.

Thanks,
Andrew

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-14 16:08       ` Andrew Burgess
@ 2014-05-14 18:32         ` Pedro Alves
  2014-05-15 13:25           ` Gary Benson
  0 siblings, 1 reply; 50+ messages in thread
From: Pedro Alves @ 2014-05-14 18:32 UTC (permalink / raw)
  To: Andrew Burgess, gdb-patches

On 05/14/2014 05:08 PM, Andrew Burgess wrote:
> On 14/05/2014 3:01 PM, Pedro Alves wrote:
>> On 05/10/2014 09:55 PM, Florian Weimer wrote:
>>> * Mark Kettenis:
>>>
>>>> No.  It's this skind of duct-tape that will make sure that bugs in the
>>>> demangler won't get fixed.  Apart from removing the incentive to fix
>>>> the bugs, these SIGSEGV signal handlers make actually fixing the bugs
>>>> harder as you won't have core dumps.
>>>
>>> I find this approach extremely odd as well.
>>
>> I have to admit I'm not super keen on using signals for this either.
>> For one, not all bugs trigger segmentation faults.  Then stealing
>> a signal handler always has multi-threading considerations.  E.g.,
>> gdb Python code could well spawn a thread that happens to call
>> something that wants its own SIGSEGV handler...  Signal handlers
>> are per-process, not per-thread.
>>
>> How about we instead add a new hook to the demangler interface,
>> that allows registering a callback that has the prototype of
>> gdb's internal_error?
> 
> I thought that if the demangler couldn't demangle a symbol you
> just got back NULL indicating no demangle was possible.

Well, that's fine, and I think that it's a matter that can
be changed independently of the scheme used to detect bad state
in the demangled.  For instance, we can have GDB's
demangler_internal_error callback throw a normal error,
and then catch it from within gdb_demangle, and have that return
NULL.

> 
> Given that, it's not clear to me where you'd want to use the error
> handler, if you know something can't be demangled then you'd return
> NULL, but if some feature wasn't implemented yet then surely you're
> still better returning NULL than using the error handler, at least
> that way the user of the demangler will continue using the mangled
> version of the symbol.
> 
> I'm not arguing _for_ catching SEGV, I just think that an error handler
> only helps with known bad states, the problem is that I think in all
> known bad states the demangler should just return NULL, it's the
> unknown bad states that are an issue here.

Well, the idea is about protecting against really bad state,
not unimplemented features.  Such a mechanism would be used
just like gdb's assertions.  E.g.,

#define d_assert(expr)                                                      \
  ((void) ((expr) ? 0 :                                                       \
           (d_assert_fail (#expr, __FILE__, __LINE__, FUNCTION_NAME), 0)))

and then:

 d_assert (...->index >= 0);
 d_assert (...->count >= 0);
 d_assert (len >= 0);

 d_assert (ptr != NULL)

 d_assert (!bad_recursion);

etc.  That seems much easier and natural to write then a bunch
of error-return style handling, which may require changing
function's prototypes.

Having the libgcc/libstdc++ versions abort on broken state
(but not on bad symbols!) is I think just fine.  We should
really prevent that with better testing, e.g., the
demangle-the-world testing, and/or fuzzy testing.

So I could see even the hook disappearing and the demangler
sigsetjmp/siglongjmp itself internally in the entry point
GDB uses (but not on libstdc++'s) and then returning NULL on
broken state.  That'd avoid adding a hook that effectively won't
ever go away, even if in reality it might be or become unnecessary.

I do wonder whether the demangler quality issue isn't being
blown out of proportion though.  I think further investments
in better testing/coverage would be much better and important
than all this bug swallowing...  I think the pay off of e.g.,
running through all symbols in a distro is higher, as we're
likely to catch earlier.  Yes, it's not mutually exclusive,
but in my mind, having something like that done routinely
effectively ups the quality assurance by a large margin.

-- 
Pedro Alves

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-13 16:05         ` Pedro Alves
@ 2014-05-15 13:24           ` Gary Benson
  2014-05-15 14:07             ` Pedro Alves
  2014-05-16 11:06             ` Pedro Alves
  0 siblings, 2 replies; 50+ messages in thread
From: Gary Benson @ 2014-05-15 13:24 UTC (permalink / raw)
  To: Pedro Alves; +Cc: Mark Kettenis, gdb-patches

Pedro Alves wrote:
> On 05/13/14 11:21, Gary Benson wrote:
> >    GDB demangles all symbols in every file it loads, always.  In
> >    libstdc++ the demangler is only called in error situations,
> >    and then only to demangle the few symbols that appear in the
> >    backtrace.
> > 
> >    So: we get the bug reports, and have to triage them, even
> >    though the code is not "ours".   
> 
> I really can't agree with the "not ours" sentiment.  Not if we think
> in terms of the GNU toolchain.  IMO, we should foremost think of
> ourselves as development toolchain toolsmiths over gdb hackers.
> 
> Thought experiment #1: I hereby declare that the demangler
> maintainers are GDB hackers.  In order to get one's code in the
> demangler, one has to get it past those specific (and very busy)
> GDB hackers.
> 
> Thought experiment #2: I'm going to import and fork the demangler
> into gdb/ directly, and declare that from here on, we get to fix it
> ourselves.
> 
> Either case doesn't seem to make a difference to me.  Except that
> with #2, we'd end with an incompetent demangler maintainer. :-)

I don't think anybody here really wants to be the demangler
maintainer.  Trouble is, I'm not sure anybody in GCC land
wants to be it either!

I wasn't really trying to disown myself from libiberty ownership.
I guess what I was trying to say is the bulk of libiberty
contributions come from GCC, but that GCC is not a heavy consumer
of libiberty.  We're eating their dogfood :)

> >  * Bugs have a more serious affect on GDB than on libstdc++
> 
> True.
> 
> >    Currently a demangler crash will cause GDB to segfault, usually
> >    at startup.  A demangler crash in libstdc++ is not such a big
> >    deal as the application was likely crashing anyway.
> > 
> >    So: bugs that are high-priority for us are low-priority for the
> >    "main" demangler authors, and we end up having to fix them.
> 
> True.
> 
> So...  Since this subsystem is so important, should we write our own
> demangler from scratch then?  Would that result in a better outcome?

If there was somebody here with a deep knowledge of C++ and a deep
knowledge of all the different mangling schemes we need to support
then maybe.  I'm not that person though.  And it's not a small
project:

  $ wc -l libiberty/*dem*
    6386 libiberty/cp-demangle.c
     174 libiberty/cp-demangle.h
     232 libiberty/cp-demint.c
    4869 libiberty/cplus-dem.c
   11661 total

If we're going to spend a lot of time on something we should spend it
improving the existing demangler rather than trying to roll our own.
The existing demangler is *good*, it just maybe hasn't had quite the
beating that it gets from a GDB run on a big app using all kinds of
the latest and greatest C++ features.

> Or, can we perhaps extend the demangler a little to make it more
> robust itself?

I'm not sure.  There's nothing obvious I can think of.  I see you
wrote another email with some ideas, I'll reply to that separately.

> Is there something that could be done to demangler's testsuite to
> make it more effective in catching bugs that GDB would catch?
> (I mean other than throwing more symbols at it.

Again, I'm not sure, unless you were to break it open and unit test
the various components.  It'd need careful refactoring to allow this
without breaking the API.

> Though a fuzzy tester that throws pseudo-random symbols at it would
> be a nice project on its own.)

I have a fuzzer for it.  <http://gbenson.net/?p=422>.  Depressingly
it gets a segfault in seconds every time.  There seem to be at least
three different issues.

> > * Demangler patches often get waved through with minimal scrutiny
> 
> That does sound like a problem.  Can we work with the gcc folks to
> somehow prevent this from happening?  E.g., perhaps we could ask
> them to CC all demangler patches to the gdb-patches@ list as well,
> like supposedly done for some other shared files.

Maybe, I'm not sure who you'd ask though.  All mail to gcc-patches
with "mangl" in the subject ends up in my inbox, so the stuff is at
least getting extra scrutiny from me :)  Unless of course the subject
is something useless like "PR 12345" (a pet hate of mine!)

> It's not clear to me whether the issues are with demangling itself
> being complex, or with the current implementation.

A bit of both I think.

Cheers,
Gary

-- 
http://gbenson.net/

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-14 18:32         ` Pedro Alves
@ 2014-05-15 13:25           ` Gary Benson
  2014-05-15 16:01             ` Pedro Alves
  0 siblings, 1 reply; 50+ messages in thread
From: Gary Benson @ 2014-05-15 13:25 UTC (permalink / raw)
  To: Pedro Alves; +Cc: Andrew Burgess, gdb-patches

Pedro Alves wrote:
> On 05/14/2014 05:08 PM, Andrew Burgess wrote:
> > On 14/05/2014 3:01 PM, Pedro Alves wrote:
> > > How about we instead add a new hook to the demangler interface,
> > > that allows registering a callback that has the prototype of
> > > gdb's internal_error?
> > 
> > I thought that if the demangler couldn't demangle a symbol you
> > just got back NULL indicating no demangle was possible.
> 
> Well, that's fine, and I think that it's a matter that can be
> changed independently of the scheme used to detect bad state
> in the demangled.  For instance, we can have GDB's
> demangler_internal_error callback throw a normal error, and
> then catch it from within gdb_demangle, and have that return
> NULL.

The demangler already has a system in place to handle early
termination.  It's not gdb_demangle that needs to return NULL:
it's the demangler's API to return NULL if the symbol cannot be
demangled.

It's not clear to me what benefit a second system for early
termination would add, or how you would decide which system
to use for any given error.

> ...the idea is about protecting against really bad state, not
> unimplemented features.

The thing is, everything's fine once you've *detected* the bad
state: you can handle it however you want.  Which, using the
demangler's current convention, is either "handle the state" or
"d_print_error (&dpi); return;".  The whole point of the segfault
catcher was to cope with undetected bad state.

I'm not saying we should not try and fix more places where bad
state is undetected.  My point was that no matter how much work
you put in you still can never say "ok, we got all the bugs now".
That's what the segfault catcher was for.

> We should really prevent that with better testing, e.g., the
> demangle-the-world testing, and/or fuzzy testing.

Agreed.  If nothing else comes out of this discussion but more
testing then it's all been worthwhile.

Thanks,
Gary

-- 
http://gbenson.net/

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-14 14:18     ` Pedro Alves
  2014-05-14 16:08       ` Andrew Burgess
@ 2014-05-15 13:27       ` Gary Benson
  2014-05-20 17:05       ` Tom Tromey
  2 siblings, 0 replies; 50+ messages in thread
From: Gary Benson @ 2014-05-15 13:27 UTC (permalink / raw)
  To: Pedro Alves; +Cc: Florian Weimer, Mark Kettenis, gdb-patches

Pedro Alves wrote:
> ...stealing a signal handler always has multi-threading
> considerations.  E.g., gdb Python code could well spawn a thread
> that happens to call something that wants its own SIGSEGV handler...
> Signal handlers are per-process, not per-thread.

I had not considered this.  Do you know of any specific libraries
that do it?  The only times I've seen SIGSEGV handlers in the wild
have been in VMs and other language-level stuff.

Thanks,
Gary

-- 
http://gbenson.net/

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-15 13:24           ` Gary Benson
@ 2014-05-15 14:07             ` Pedro Alves
  2014-05-15 14:28               ` Gary Benson
  2014-05-16 11:06             ` Pedro Alves
  1 sibling, 1 reply; 50+ messages in thread
From: Pedro Alves @ 2014-05-15 14:07 UTC (permalink / raw)
  To: Gary Benson; +Cc: Mark Kettenis, gdb-patches

On 05/15/2014 02:24 PM, Gary Benson wrote:

>> Though a fuzzy tester that throws pseudo-random symbols at it would
>> be a nice project on its own.)
> 
> I have a fuzzer for it.  <http://gbenson.net/?p=422>.

Nice!

I took a peek.  I suggest wrapping it in a SIGSEGV handler so
the it prints the symbol that crashes.  :-)  That way one can
leave it running for a few hours, without needing to have core
dumps enabled.

> Depressingly it gets a segfault in seconds every time.  There seem to
> be at least three different issues.

-- 
Pedro Alves

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-15 14:07             ` Pedro Alves
@ 2014-05-15 14:28               ` Gary Benson
  2014-05-15 15:25                 ` Pedro Alves
  0 siblings, 1 reply; 50+ messages in thread
From: Gary Benson @ 2014-05-15 14:28 UTC (permalink / raw)
  To: Pedro Alves; +Cc: Mark Kettenis, gdb-patches

Pedro Alves wrote:
> On 05/15/2014 02:24 PM, Gary Benson wrote:
> > > Though a fuzzy tester that throws pseudo-random symbols at it
> > > would be a nice project on its own.)
> > 
> > I have a fuzzer for it.  <http://gbenson.net/?p=422>.
> 
> Nice!
> 
> I took a peek.  I suggest wrapping it in a SIGSEGV handler so
> the it prints the symbol that crashes.  :-)  That way one can
> leave it running for a few hours, without needing to have core
> dumps enabled.

I could do that :)

The way I imagined using it right now would be to run to a fault,
then fix that fault.  The rate it finds crashes you'd get 1000 an
hour easily, but they might all be the same three bugs.

I'm working on some other stuff now, but I might make it my task
to fix demangler bugs here and there between projects.  I pretty
much have a handle on it now I think.

Cheers,
Gary

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-15 14:28               ` Gary Benson
@ 2014-05-15 15:25                 ` Pedro Alves
  0 siblings, 0 replies; 50+ messages in thread
From: Pedro Alves @ 2014-05-15 15:25 UTC (permalink / raw)
  To: Gary Benson; +Cc: Mark Kettenis, gdb-patches

On 05/15/2014 03:28 PM, Gary Benson wrote:

> I'm working on some other stuff now, but I might make it my task
> to fix demangler bugs here and there between projects.  I pretty
> much have a handle on it now I think.

Now that's the spirit!

Before you'll know it you'll be our goto demangler guy.

Oh wait...  You already are.  :-)

Thanks,
-- 
Pedro Alves

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-15 13:25           ` Gary Benson
@ 2014-05-15 16:01             ` Pedro Alves
  0 siblings, 0 replies; 50+ messages in thread
From: Pedro Alves @ 2014-05-15 16:01 UTC (permalink / raw)
  To: Gary Benson; +Cc: Andrew Burgess, gdb-patches

On 05/15/2014 02:25 PM, Gary Benson wrote:
> It's not clear to me what benefit a second system for early
> termination would add, or how you would decide which system
> to use for any given error.

Just to cross this off for the record.  The idea would be to use
early termination for bad input, and d_assert for asserting
internal consistency is sane.  Much like in GDB we don't use
gdb_assert for bad input.

But better coverage/testing beats this by a wide margin,
in my book.

Speaking of coverage, I wonder what gcov says...

-- 
Pedro Alves

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-15 13:24           ` Gary Benson
  2014-05-15 14:07             ` Pedro Alves
@ 2014-05-16 11:06             ` Pedro Alves
  1 sibling, 0 replies; 50+ messages in thread
From: Pedro Alves @ 2014-05-16 11:06 UTC (permalink / raw)
  To: Gary Benson; +Cc: Mark Kettenis, gdb-patches

On 05/15/2014 02:24 PM, Gary Benson wrote:
>>> > > * Demangler patches often get waved through with minimal scrutiny
>> > 
>> > That does sound like a problem.  Can we work with the gcc folks to
>> > somehow prevent this from happening?  E.g., perhaps we could ask
>> > them to CC all demangler patches to the gdb-patches@ list as well,
>> > like supposedly done for some other shared files.
> Maybe, I'm not sure who you'd ask though.  All mail to gcc-patches
> with "mangl" in the subject ends up in my inbox, so the stuff is at
> least getting extra scrutiny from me :)  Unless of course the subject
> is something useless like "PR 12345" (a pet hate of mine!)

We could point the current libiberty/demanger maintainers at this
discussion, and see what they think of that.  Or gcc-patches@.  Or both.
If they agree, we could document it in src/MAINTAINERS,  like e.g.,
it's mentioned for top level files:

Makefile.*; configure; configure.ac; src-release
        Any global maintainer can approve changes to these
        files, but they should be aware that they need to
        be kept in sync with their counterparts in the GCC
        repository.  Also please notify the following of
        any committed patches:
                binutils@sourceware.org
                gdb-patches@sourceware.org

It might be cleaner if the demangler was split into its own
directory, IMO.  Say libdemangler.  I don't really know whether
it depends on much in libiberty -- it's just a text transform.
But that's probably not going to happen -- some measurable effort
there for not that much gain.  :-)

-- 
Pedro Alves

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-14 14:18     ` Pedro Alves
  2014-05-14 16:08       ` Andrew Burgess
  2014-05-15 13:27       ` Gary Benson
@ 2014-05-20 17:05       ` Tom Tromey
  2014-05-20 18:40         ` Stan Shebs
  2014-05-22 14:09         ` Gary Benson
  2 siblings, 2 replies; 50+ messages in thread
From: Tom Tromey @ 2014-05-20 17:05 UTC (permalink / raw)
  To: Pedro Alves; +Cc: Florian Weimer, Mark Kettenis, gbenson, gdb-patches

>>>>> "Pedro" == Pedro Alves <palves@redhat.com> writes:

Pedro> I have to admit I'm not super keen on using signals for this either.

Pedro> For one, not all bugs trigger segmentation faults.

That is true, but the goal of the patch is to cheaply improve gdb's
behavior in some failure modes, not to solve every problem.

I think this is warranted due to known properties of the demangler.
First, it is complicated.  Second, it is hard to test well.  Third,
there's been a history of new demangler features being rolled out with
insufficient testing, and we can reasonably expect that to continue.
Fourth, the bugs in question have a very severe effect on gdb users --
you simply cannot debug -- whereas the effect on other users of the
demangler is slight (this is why I think we can expect to see more
demangler bugs of a similar nature).

Pedro> Then stealing a signal handler always has multi-threading
Pedro> considerations.  E.g., gdb Python code could well spawn a thread
Pedro> that happens to call something that wants its own SIGSEGV
Pedro> handler...  Signal handlers are per-process, not per-thread.

That is true in theory but I think it is unlikely in practice.  And,
should it happen -- well, the onus is on folks writing extensions not to
mess things up.  That's the nature of the beast.  And, sure, it is
messy, particularly if we ever upstream "import gdb", but even so,
signals are just fraught and this is not an ordinary enough usage to
justify preventing gdb from doing it.

Pedro> Then we'd add a demangle_assert macro to the demangler, similar to
Pedro> gdb_assert, that calls that hook if the assertion fails.  And then
Pedro> we could sprinkle the demangler with assertions.

Pedro> I think that'd be easy to do, and I'd think it's much cleaner
Pedro> and robust.

This would be an improvement but it isn't really under consideration.
The demangler isn't the most important thing we're working on, and
nobody is going to spend the time adding assertions to it.  And, even if
they did, the crash handler would still useful, just hopefully used
somewhat less.  This is because bugs happen even when there are many
assertions in place.

The choice is really between SEGV catching and "somebody else down the
road fixes more demangler bugs".

Tom

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-20 17:05       ` Tom Tromey
@ 2014-05-20 18:40         ` Stan Shebs
  2014-05-20 19:36           ` Tom Tromey
  2014-05-22 13:18           ` Gary Benson
  2014-05-22 14:09         ` Gary Benson
  1 sibling, 2 replies; 50+ messages in thread
From: Stan Shebs @ 2014-05-20 18:40 UTC (permalink / raw)
  To: gdb-patches

On 5/20/14, 10:05 AM, Tom Tromey wrote:
>>>>>> "Pedro" == Pedro Alves <palves@redhat.com> writes:
> 
> Pedro> I have to admit I'm not super keen on using signals for this either.
> 
> Pedro> For one, not all bugs trigger segmentation faults.
> 
> That is true, but the goal of the patch is to cheaply improve gdb's
> behavior in some failure modes, not to solve every problem.
> 
> I think this is warranted due to known properties of the demangler.
> First, it is complicated.  Second, it is hard to test well.  Third,
> there's been a history of new demangler features being rolled out with
> insufficient testing, and we can reasonably expect that to continue.
> Fourth, the bugs in question have a very severe effect on gdb users --
> you simply cannot debug -- whereas the effect on other users of the
> demangler is slight (this is why I think we can expect to see more
> demangler bugs of a similar nature).

After reading all the discussion, I'm tending to disfavor catching
demangler segfaults.

My memory may be playing tricks on me, but once upon a time it seemed
like the demangler was the most reliable part of the mixed bag that was
C++ debugging - segfaults were pretty much unheard of.  So it's a little
strange to me that it's now become so troublesome that it needs to be
wrapped, or has been suggested, to be run in a different process(!), and
it reinforces Mark K's original point about signal catchers masking more
serious problems.

Complicated or not, the demangler is one of the most algorithmically
predictable components of GDB, and it is very easy to test
comprehensively; no races, no arcane target dependencies, textual
input and output.  So if it's becoming unreliable, perhaps there are
process flaws that we should be addressing.

Stan
stan@codesourcery.com

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-20 18:40         ` Stan Shebs
@ 2014-05-20 19:36           ` Tom Tromey
  2014-05-20 20:23             ` Joel Brobecker
  2014-05-22 13:18           ` Gary Benson
  1 sibling, 1 reply; 50+ messages in thread
From: Tom Tromey @ 2014-05-20 19:36 UTC (permalink / raw)
  To: Stan Shebs; +Cc: gdb-patches

Stan> So it's a little strange to me that it's now become so troublesome
Stan> that it needs to be wrapped, or has been suggested, to be run in a
Stan> different process(!), and it reinforces Mark K's original point
Stan> about signal catchers masking more serious problems.

The typical failure mode here is that gdb crashes when you start it.
You cannot debug your program, and there is no reasonable workaround.
(There is an unreasonable one: edit your binary to remove all mentions
of the offending symbol.)

What problem might be masked that is more serious than this?

Stan> Complicated or not, the demangler is one of the most algorithmically
Stan> predictable components of GDB, and it is very easy to test
Stan> comprehensively; no races, no arcane target dependencies, textual
Stan> input and output.  So if it's becoming unreliable, perhaps there are
Stan> process flaws that we should be addressing.

I agree, that would be a good thing to do.

However, even if the process flaws are fixed, it remains that gdb is
unusually exposed to demangler bugs, and that the patch in question is
not likely to cause harm.  We know this because this technique is
commonplace in the managed runtime community.

And, supposing that the demangler is thoroughly de-bugged -- the patch
in question will still not cause any harm.

That is, addressing the process is not exclusive with this patch.

I realize there is an attractive quality to the "suffering is good for
us" idea.  It appeals to my inner spartan.  However, I disagree very
strongly with this.

As a thought experiment, consider applying this same logic to
internal_error.  The same considerations apply to making it simply
abort.  Yet, it would be a mistake to make this change -- because in the
end, a user's debug session is more important than any particular gdb
bug.

BTW you can try it out thanks to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61233

Compile this with g++ and try it with a recent gdb.

    extern "C"
    void _Z7ZipWithI7QStringS0_5QListZN4oral6detail16AdaptCreateTableI7AccountEES0_RKNS3_16CachedFieldsDataEEUlRKS0_SA_E_ET1_IDTclfp1_cvT__EcvT0__EEEERKT1_ISC_ERKT1_ISD_ET2_ ()
    {
    }

    int main()
    {
      return 0;
    }

Ouch:

    barimba. ./gdb/gdb --quiet /tmp/q
    Reading symbols from /tmp/q...Segmentation fault (core dumped)

Tom

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-20 19:36           ` Tom Tromey
@ 2014-05-20 20:23             ` Joel Brobecker
  2014-05-22 12:56               ` Gary Benson
  0 siblings, 1 reply; 50+ messages in thread
From: Joel Brobecker @ 2014-05-20 20:23 UTC (permalink / raw)
  To: Tom Tromey; +Cc: Stan Shebs, gdb-patches

> I realize there is an attractive quality to the "suffering is good for
> us" idea.  It appeals to my inner spartan.  However, I disagree very
> strongly with this.

Having been on the receiving end of this kind of logic, I agree
with Tom - quite strongly too, in fact. As a user, I do not want
to be held hostage, especially when there is no workaround. If
the proposed solution brings no noticeable harm for our users
in the situation where things are working as expected, I think
we should consider it. And to help mitigating the fears that
we would be hiding bugs, we can perhaps find a middle-ground;
for instance, making sure that we print a really verbose error
message.

-- 
Joel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-20 20:23             ` Joel Brobecker
@ 2014-05-22 12:56               ` Gary Benson
  2014-05-22 13:09                 ` Joel Brobecker
  2014-05-22 14:13                 ` Pedro Alves
  0 siblings, 2 replies; 50+ messages in thread
From: Gary Benson @ 2014-05-22 12:56 UTC (permalink / raw)
  To: Joel Brobecker
  Cc: Tom Tromey, Stan Shebs, gdb-patches, Florian Weimer,
	Mark Kettenis, Pedro Alves

Joel Brobecker wrote:
> Tom Tromey wrote:
> > I realize there is an attractive quality to the "suffering is
> > good for us" idea.  It appeals to my inner spartan.  However,
> > I disagree very strongly with this.
> 
> Having been on the receiving end of this kind of logic, I agree
> with Tom - quite strongly too, in fact. As a user, I do not want
> to be held hostage, especially when there is no workaround. If
> the proposed solution brings no noticeable harm for our users
> in the situation where things are working as expected, I think
> we should consider it. And to help mitigating the fears that
> we would be hiding bugs, we can perhaps find a middle-ground;
> for instance, making sure that we print a really verbose error
> message.

I'm definitely not trying to hide bugs; if anything I'm trying to
make them more reportable.  FWIW users would see this:

  (gdb) set lang c++
  (gdb) maint demangle _Z1-Av23*;cG~Wo2Vu
  /home/gary/work/archer/demangle-crashcatcher/src/gdb/cp-support.c:1590:
  internal-warning: unable to demangle '_Z1-Av23*;cG~Wo2Vu' (demangler
  failed with signal 11)
  A problem internal to GDB has been detected,
  further debugging may prove unreliable.
  Quit this debugging session? (y or n)

A separate patch [1] I've posted augments this with the lines:

  This is a bug, please report it.  For instructions, see:
  <http://www.gnu.org/software/gdb/bugs/>.

The point is to make it easier for users to file straightforward bug
reports *with reproducers* rather than the opaque "GDB crashed at
startup" bugs we've been getting at the moment that people (by which
I mean Keith) have had to spend time triaging.  And, at the same time,
for the user to have the option to attempt to continue using GDB to
debug their program.  I realise that people may feel that the user
*should* then fix GDB, but not everyone has the time or the ability
or the inclination.  I don't want the workaround for this to become
"try LLDB".

Cheers,
Gary

-- 
[1] https://sourceware.org/ml/gdb-patches/2014-05/msg00198.html

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-22 12:56               ` Gary Benson
@ 2014-05-22 13:09                 ` Joel Brobecker
  2014-05-22 14:13                 ` Pedro Alves
  1 sibling, 0 replies; 50+ messages in thread
From: Joel Brobecker @ 2014-05-22 13:09 UTC (permalink / raw)
  To: Gary Benson
  Cc: Tom Tromey, Stan Shebs, gdb-patches, Florian Weimer,
	Mark Kettenis, Pedro Alves

> > Having been on the receiving end of this kind of logic, I agree
> > with Tom - quite strongly too, in fact. As a user, I do not want
> > to be held hostage, especially when there is no workaround. If
> > the proposed solution brings no noticeable harm for our users
> > in the situation where things are working as expected, I think
> > we should consider it. And to help mitigating the fears that
> > we would be hiding bugs, we can perhaps find a middle-ground;
> > for instance, making sure that we print a really verbose error
> > message.
> 
> I'm definitely not trying to hide bugs; if anything I'm trying to
> make them more reportable.  FWIW users would see this:

Sorry, Gary. You're right. I should have said something like
"reducing the likeliness of getting bugs fixed because the pressure
to get them fixed would be less".

> The point is to make it easier for users to file straightforward bug
> reports *with reproducers* rather than the opaque "GDB crashed at
> startup" bugs we've been getting at the moment that people (by which
> I mean Keith) have had to spend time triaging.  And, at the same time,
> for the user to have the option to attempt to continue using GDB to
> debug their program.  I realise that people may feel that the user
> *should* then fix GDB, but not everyone has the time or the ability
> or the inclination.  I don't want the workaround for this to become
> "try LLDB".

I think that the fact that this makes it easier for the user to
report the problem is also another excellent point in favor of
the suggested solution.

-- 
Joel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-20 18:40         ` Stan Shebs
  2014-05-20 19:36           ` Tom Tromey
@ 2014-05-22 13:18           ` Gary Benson
  1 sibling, 0 replies; 50+ messages in thread
From: Gary Benson @ 2014-05-22 13:18 UTC (permalink / raw)
  To: Stan Shebs
  Cc: Florian Weimer, Mark Kettenis, Pedro Alves, Tom Tromey, gdb-patches

Stan Shebs wrote:
> My memory may be playing tricks on me, but once upon a time it
> seemed like the demangler was the most reliable part of the mixed
> bag that was C++ debugging - segfaults were pretty much unheard of.
> So it's a little strange to me that it's now become so troublesome
> that it needs to be wrapped, or has been suggested, to be run in a
> different process(!), and it reinforces Mark K's original point
> about signal catchers masking more serious problems.

Apart from the ones the fuzzer found, the recent crashes [1] all seem
to have C++11 features, specifically, lambdas or lvalue references or
both.  My suspicion is that these bugs are being shaken out as C++11
code becomes more widespread and/or people are doing more complicated
things with it.

> Complicated or not, the demangler is one of the most algorithmically
> predictable components of GDB, and it is very easy to test
> comprehensively; no races, no arcane target dependencies, textual
> input and output.

For sure, but somebody needs to write those tests, and that somebody
needs 1) a deep knowledge of C++, including C++11, 2) a deep knowledge
of the mangling scheme, and 3) the time and inclination to sit down
and compile an extensive test suite.  I don't know anybody with three
out of three.

Thanks,
Gary

-- 
[1] https://github.com/gbenson/binutils-gdb/blob/demangler/segfault-test.tests

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-20 17:05       ` Tom Tromey
  2014-05-20 18:40         ` Stan Shebs
@ 2014-05-22 14:09         ` Gary Benson
  2014-05-22 14:40           ` Mark Kettenis
  1 sibling, 1 reply; 50+ messages in thread
From: Gary Benson @ 2014-05-22 14:09 UTC (permalink / raw)
  To: Tom Tromey; +Cc: Pedro Alves, Florian Weimer, Mark Kettenis, gdb-patches

Tom Tromey wrote:
> Pedro> Then stealing a signal handler always has multi-threading
> Pedro> considerations.  E.g., gdb Python code could well spawn a
> Pedro> thread that happens to call something that wants its own
> Pedro> SIGSEGV handler...  Signal handlers are per-process, not
> Pedro> per-thread.
> 
> That is true in theory but I think it is unlikely in practice.  And,
> should it happen -- well, the onus is on folks writing extensions
> not to mess things up.  That's the nature of the beast.  And, sure,
> it is messy, particularly if we ever upstream "import gdb", but even
> so, signals are just fraught and this is not an ordinary enough
> usage to justify preventing gdb from doing it.

GDB installs handlers for INT, TERM, QUIT, HUP, FPE, WINCH, CONT,
TTOU, TRAP, ALRM and TSTP, and some other platform-specific ones
I didn't recognise.  Is there anything that means SIGSEGV should
be treated differently to all these other signals?

> The choice is really between SEGV catching and "somebody else
> down the road fixes more demangler bugs".

The demangler bugs will get fixed one way or another.  The choice is:
do we allow users to continue to use GDB while the bug they've hit is
fixed, or, do we make them wait?  In the expectation that they will
put their own work aside while they fix GDB instead?

Thanks,
Gary

-- 
http://gbenson.net/

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-22 12:56               ` Gary Benson
  2014-05-22 13:09                 ` Joel Brobecker
@ 2014-05-22 14:13                 ` Pedro Alves
  2014-05-22 15:57                   ` Gary Benson
  1 sibling, 1 reply; 50+ messages in thread
From: Pedro Alves @ 2014-05-22 14:13 UTC (permalink / raw)
  To: Gary Benson, Joel Brobecker
  Cc: Tom Tromey, Stan Shebs, gdb-patches, Florian Weimer, Mark Kettenis

On 05/22/2014 01:56 PM, Gary Benson wrote:

> The point is to make it easier for users to file straightforward bug
> reports *with reproducers* rather than the opaque "GDB crashed at
> startup" bugs we've been getting at the moment that people (by which
> I mean Keith) have had to spend time triaging.  And, at the same time,
> for the user to have the option to attempt to continue using GDB to
> debug their program.  I realise that people may feel that the user
> *should* then fix GDB, but not everyone has the time or the ability
> or the inclination.  I don't want the workaround for this to become
> "try LLDB".

I still think you guys are blowing this way out of proportion.

I don't believe anyone here likes that GDB crashes.  But this demangler
is being turned into a scarecrow, for no good reason, IMO.  This is
not how we handle all others sorts of crashes, even for other libraries
in the tree (libbfd, libopcodes, etc.).

Here:

  https://sourceware.org/bugzilla/buglist.cgi?bug_status=UNCONFIRMED&bug_status=NEW&bug_status=ASSIGNED&bug_status=SUSPENDED&bug_status=WAITING&bug_status=REOPENED&list_id=14858&product=gdb&query_format=advanced&short_desc=crash&short_desc_type=allwordssubstr

That shows 68 open GDB bugs with "crash" in the summary.  Some of them
are likewise crashes that make GDB not even start.  Are you going to
propose wrapping all of GDB's modules for those?  E.g., some are (and
I'm sure there are more) in the dwarf reader code.  Are we wrapping
that to dump the dwarf?  Seems like only one, maybe two are still
relevant demangler crashes, even.  And I don't see a huge number of
users adding themselves to CC on those bugs.

Looking at Tom's example crash from gcc 61233, the crash is due to
stack exhaustion due to seemingly infinite recursion.  Even if the
main bug is not fixed, I'd think it'd be easy to add a recursion
limit to the demangler, and thus prevent the crash, until whatever
c++11 feature that's still not well developed is handled properly.

I just can't/won't believe there are that many unindentified
bugs in the demangler.  If that were true, we'd hear _much_ more
about them.

And I bet that that if we took all the time we already collectively
spent these past weeks discussing this already and used it to
write that recursion-limiter patch, it'd be upstream already.  :-)

It's clear to me that we need to do something
demangler-development-process-wise to prevent these sorts
of bugs from manifesting only late in GDB, and I think we should
definitely get in contact with the GCC folks about it.

I had another idea to aid demangler testing with g++.
I had suggested before to run the demangler through the binaries
produced by g++'s testsuite.  The whole point here is to validate the
mangling as close as possible to the generator.
The alternative idea is to teach g++ itself to try to demangle any
symbol it generates/mangles (off by default, under a --param switch
or some such).  A sort of making-it-very easy-to-eat-your-own-dog-food
methodology.
We could then just run g++'s testsuite with that enabled.  No other
infrastructure would be necessary.  And, we could ask g++ maintainers
to do that once in a while, and perhaps run it through whatever usual
set of complicated C++ codebases they usually test g++ with.

-- 
Pedro Alves

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-22 14:09         ` Gary Benson
@ 2014-05-22 14:40           ` Mark Kettenis
  2014-05-22 20:42             ` Gary Benson
  0 siblings, 1 reply; 50+ messages in thread
From: Mark Kettenis @ 2014-05-22 14:40 UTC (permalink / raw)
  To: gbenson; +Cc: tromey, palves, fw, mark.kettenis, gdb-patches

> Date: Thu, 22 May 2014 15:09:04 +0100
> From: Gary Benson <gbenson@redhat.com>
> 
> Tom Tromey wrote:
> > Pedro> Then stealing a signal handler always has multi-threading
> > Pedro> considerations.  E.g., gdb Python code could well spawn a
> > Pedro> thread that happens to call something that wants its own
> > Pedro> SIGSEGV handler...  Signal handlers are per-process, not
> > Pedro> per-thread.
> > 
> > That is true in theory but I think it is unlikely in practice.  And,
> > should it happen -- well, the onus is on folks writing extensions
> > not to mess things up.  That's the nature of the beast.  And, sure,
> > it is messy, particularly if we ever upstream "import gdb", but even
> > so, signals are just fraught and this is not an ordinary enough
> > usage to justify preventing gdb from doing it.
> 
> GDB installs handlers for INT, TERM, QUIT, HUP, FPE, WINCH, CONT,
> TTOU, TRAP, ALRM and TSTP, and some other platform-specific ones
> I didn't recognise.  Is there anything that means SIGSEGV should
> be treated differently to all these other signals?

From that list SIGFPE is probably a bogosity.  I don't think the
SIGFPE handler will do the right thing on many OSes and architectures
supported by GDB, since it is unspecified whether the trapping
instruction will be re-executed upon return from the signal handler.
I'd argue that the SIGFPE handler is just as unhelpful as the SIGSEGV
handler you're proposing.  Luckily, we don't seem to have a lot of
division-by-zero bugs in the code base.

> > The choice is really between SEGV catching and "somebody else
> > down the road fixes more demangler bugs".
> 
> The demangler bugs will get fixed one way or another.  The choice is:
> do we allow users to continue to use GDB while the bug they've hit is
> fixed, or, do we make them wait?  In the expectation that they will
> put their own work aside while they fix GDB instead?

Unless there is a way to force a core dump (like internal_error()
offers) with the state at the point of the SIGSEGV in it, yes, we need
to make them wait or fix it themselves.

I'd really like to avoid adding a SIGSEGV handler altogether.  But I'm
willing to compromise if the signal handler offers to opportunity to
create a core dump.  Now doing so in a signal-safe way will be a bit
tricky of course.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-22 14:13                 ` Pedro Alves
@ 2014-05-22 15:57                   ` Gary Benson
  0 siblings, 0 replies; 50+ messages in thread
From: Gary Benson @ 2014-05-22 15:57 UTC (permalink / raw)
  To: Pedro Alves
  Cc: Joel Brobecker, Tom Tromey, Stan Shebs, gdb-patches,
	Florian Weimer, Mark Kettenis

Pedro Alves wrote:
> I don't believe anyone here likes that GDB crashes.  But this
> demangler is being turned into a scarecrow, for no good reason,
> IMO.  This is not how we handle all others sorts of crashes,
> even for other libraries in the tree (libbfd, libopcodes, etc.).
> 
> Here:
> 
>   https://sourceware.org/bugzilla/buglist.cgi?bug_status=UNCONFIRMED&bug_status=NEW&bug_status=ASSIGNED&bug_status=SUSPENDED&bug_status=WAITING&bug_status=REOPENED&list_id=14858&product=gdb&query_format=advanced&short_desc=crash&short_desc_type=allwordssubstr
> 
> That shows 68 open GDB bugs with "crash" in the summary.  Some of
> them are likewise crashes that make GDB not even start.  Are you
> going to propose wrapping all of GDB's modules for those?  E.g.,
> some are (and I'm sure there are more) in the dwarf reader code.
> Are we wrapping that to dump the dwarf?

Mark Kettenis raised a similar point earlier in the thread [1] and
I replied [2].  This is a straw man.  It's irrelevant whether other
subsystems of GDB also have crash bugs filed against them or not.
I'm not proposing to wrap all or even any other subsystems of GDB.
I'm proposing to wrap this one specific subsystem which regularly has
bugs filed against it, bugs which completely prevent users from being
able to debug their programs with GDB.

> Seems like only one, maybe two are still relevant demangler crashes,
> even.

This is because I fixed all the open ones before proposing the wrapper.

> And I don't see a huge number of users adding themselves to CC on
> those bugs.

Every user matters.

> Looking at Tom's example crash from gcc 61233, the crash is due to
> stack exhaustion due to seemingly infinite recursion.  Even if the
> main bug is not fixed, I'd think it'd be easy to add a recursion
> limit to the demangler, and thus prevent the crash, until whatever
> c++11 feature that's still not well developed is handled properly.

A recursion-limiter would catch 61233, and some of the others, but
certainly not all of the bugs that were filed.  One failure was
because the demangler was trying to dereference 0x3.  And OTOH only
one of the bugs the fuzzer caught was an infinite recursion, the
others look like reading the wrong entry from a union.

> I just can't/won't believe there are that many unindentified
> bugs in the demangler.  If that were true, we'd hear _much_ more
> about them.

We _are_ hearing about them.  There's not been much traffic here, on
this list, because Keith has been quietly triaging every segfault and
I've been quietly fixing the bugs.  That all happens in GCC land.

> It's clear to me that we need to do something
> demangler-development-process-wise to prevent these sorts of bugs
> from manifesting only late in GDB, and I think we should definitely
> get in contact with the GCC folks about it.

Sure, but what do we say?  "Hey, you guys, it'd be great if you all
did some more testing on your code."  I can imagine the response.  I
don't have the expertise to write the kind of tests that are needed.

> I had another idea to aid demangler testing with g++.
> I had suggested before to run the demangler through the binaries
> produced by g++'s testsuite.  The whole point here is to validate
> the mangling as close as possible to the generator.
> The alternative idea is to teach g++ itself to try to demangle any
> symbol it generates/mangles (off by default, under a --param switch
> or some such).  A sort of making-it-very-easy-to-eat-your-own-dog-
> food methodology.
> We could then just run g++'s testsuite with that enabled.  No other
> infrastructure would be necessary.  And, we could ask g++
> maintainers to do that once in a while, and perhaps run it through
> whatever usual set of complicated C++ codebases they usually test
> g++ with.

That is a good idea (and something I could do myself) but you of all
people know I'm not free to spend the time it would take, so it will
remain a good idea indefinitely unless someone steps up to do the
work.

In any event, whatever future testing and/or bugfixing is performed,
the crash catcher still has merit.  I never plan on crashing my car,
but I still wear a seatbelt.

Cheers,
Gary

-- 
[1] https://sourceware.org/ml/gdb-patches/2014-05/msg00129.html
[2] https://sourceware.org/ml/gdb-patches/2014-05/msg00152.html

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] Demangler crash handler
  2014-05-22 14:40           ` Mark Kettenis
@ 2014-05-22 20:42             ` Gary Benson
  0 siblings, 0 replies; 50+ messages in thread
From: Gary Benson @ 2014-05-22 20:42 UTC (permalink / raw)
  To: Mark Kettenis; +Cc: tromey, palves, fw, gdb-patches

Mark Kettenis wrote:
> > Date: Thu, 22 May 2014 15:09:04 +0100
> > From: Gary Benson <gbenson@redhat.com>
> > 
> > Tom Tromey wrote:
> > > Pedro> Then stealing a signal handler always has multi-threading
> > > Pedro> considerations.  E.g., gdb Python code could well spawn a
> > > Pedro> thread that happens to call something that wants its own
> > > Pedro> SIGSEGV handler...  Signal handlers are per-process, not
> > > Pedro> per-thread.
> > > 
> > > That is true in theory but I think it is unlikely in practice.
> > > And, should it happen -- well, the onus is on folks writing
> > > extensions not to mess things up.  That's the nature of the
> > > beast.  And, sure, it is messy, particularly if we ever upstream
> > > "import gdb", but even so, signals are just fraught and this is
> > > not an ordinary enough usage to justify preventing gdb from
> > > doing it.
> > 
> > GDB installs handlers for INT, TERM, QUIT, HUP, FPE, WINCH, CONT,
> > TTOU, TRAP, ALRM and TSTP, and some other platform-specific ones
> > I didn't recognise.  Is there anything that means SIGSEGV should
> > be treated differently to all these other signals?
> 
> From that list SIGFPE is probably a bogosity.  I don't think the
> SIGFPE handler will do the right thing on many OSes and
> architectures supported by GDB, since it is unspecified whether the
> trapping instruction will be re-executed upon return from the signal
> handler.  I'd argue that the SIGFPE handler is just as unhelpful as
> the SIGSEGV handler you're proposing.  Luckily, we don't seem to
> have a lot of division-by-zero bugs in the code base.

Fair enough.

> > > The choice is really between SEGV catching and "somebody else
> > > down the road fixes more demangler bugs".
> > 
> > The demangler bugs will get fixed one way or another.  The choice
> > is: do we allow users to continue to use GDB while the bug they've
> > hit is fixed, or, do we make them wait?  In the expectation that
> > they will put their own work aside while they fix GDB instead?
> 
> Unless there is a way to force a core dump (like internal_error()
> offers) with the state at the point of the SIGSEGV in it, yes, we
> need to make them wait or fix it themselves.
> 
> I'd really like to avoid adding a SIGSEGV handler altogether.  But
> I'm willing to compromise if the signal handler offers to
> opportunity to create a core dump.  Now doing so in a signal-safe
> way will be a bit tricky of course.

Thank you for your offer to compromise.  I appreciate that you want to
avoid a SIGSEGV handler, but I don't know of any other way to regain
control of the process after the fault.

Getting core files...

With a disabled-by-default SIGSEGV handler it's easy: there will be a
core file from the initial crash.  With enabled-by-default you'd have
to request the user disable the handler and repeat whatever they did.
This does admit the possibility that there may one day be a bug that
only ever occurred once but was caught by the handler so no core file
could ever be created.

I could put something like this in the signal handler to get a core
file with the correct state from an enabled-by-default handler:

  #ifdef HAVE_WORKING_FORK
    static int core_dumped = 0;

    if (!core_dumped)
      {
        if (fork () == 0)
          dump_core ();

        core_dumped = 1;
      }
  #endif

I could even wrap the entire signal handler and its installation with
#ifdef HAVE_WORKING_FORK, so no attempt was made to handle the signal
if a core file could not be made.

fork is async-signal-safe.  setrlimit and abort are not.  I don't
know what happens after the fork--will the child process will be in
the signal handler or not?--but if this is a possible way forward
for you then I will do some research into this.

I don't think you could *ask* the user if they wanted a core file.
No printing code is safe.  With the code snippet above the user
would get a core file regardless, but that's no different from the
current situation.

As an aside, while calling non-signal-safe functions technically
results in undefined behaviour, I understand the common failure modes
are segfault and deadlock. In either case the signal handler will
still be on the stack, with the demangler failure just below ready to
debug, though for the deadlock case the user would have to intervene
to create the corefile, for example by attaching another GDB and
aborting the inferior manually.

I have to say that in my experience fixing these bugs the core dump
(when provided) has only been of use to discover the symbol that
caused the failure.  At that point I've switched to live debugging;
there's some pretty useful debug code in the demangler but you need
to compile with -DCP_DEMANGLE_DEBUG to use it.

Thanks,
Gary

-- 
http://gbenson.net/

^ permalink raw reply	[flat|nested] 50+ messages in thread

end of thread, other threads:[~2014-05-22 20:42 UTC | newest]

Thread overview: 50+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-05-09 10:07 [PATCH 0/2] Demangler crash handler Gary Benson
2014-05-09 10:09 ` [PATCH 1/2] " Gary Benson
2014-05-09 10:10 ` [PATCH 2/2] " Gary Benson
2014-05-09 11:20 ` [PATCH 0/2] " Mark Kettenis
2014-05-09 15:33   ` Gary Benson
2014-05-11  5:17     ` Doug Evans
2014-05-13 10:20       ` Gary Benson
2014-05-13 19:29         ` Tom Tromey
2014-05-14 13:07           ` Gary Benson
2014-05-13 19:39         ` Tom Tromey
2014-05-14  9:15           ` Gary Benson
2014-05-11 20:23     ` Mark Kettenis
2014-05-13 10:21       ` Gary Benson
2014-05-13 16:05         ` Pedro Alves
2014-05-15 13:24           ` Gary Benson
2014-05-15 14:07             ` Pedro Alves
2014-05-15 14:28               ` Gary Benson
2014-05-15 15:25                 ` Pedro Alves
2014-05-16 11:06             ` Pedro Alves
2014-05-10 20:55   ` Florian Weimer
2014-05-11  5:10     ` Doug Evans
2014-05-13 10:22     ` Gary Benson
2014-05-13 18:22       ` Florian Weimer
2014-05-13 18:42         ` Pedro Alves
2014-05-13 19:16           ` Gary Benson
2014-05-13 19:19             ` Pedro Alves
2014-05-14  9:11               ` Gary Benson
2014-05-13 19:20           ` Florian Weimer
2014-05-13 19:22             ` Pedro Alves
2014-05-13 19:22         ` Gary Benson
2014-05-13 19:36           ` Tom Tromey
2014-05-14  9:13             ` Gary Benson
2014-05-14 14:18     ` Pedro Alves
2014-05-14 16:08       ` Andrew Burgess
2014-05-14 18:32         ` Pedro Alves
2014-05-15 13:25           ` Gary Benson
2014-05-15 16:01             ` Pedro Alves
2014-05-15 13:27       ` Gary Benson
2014-05-20 17:05       ` Tom Tromey
2014-05-20 18:40         ` Stan Shebs
2014-05-20 19:36           ` Tom Tromey
2014-05-20 20:23             ` Joel Brobecker
2014-05-22 12:56               ` Gary Benson
2014-05-22 13:09                 ` Joel Brobecker
2014-05-22 14:13                 ` Pedro Alves
2014-05-22 15:57                   ` Gary Benson
2014-05-22 13:18           ` Gary Benson
2014-05-22 14:09         ` Gary Benson
2014-05-22 14:40           ` Mark Kettenis
2014-05-22 20:42             ` Gary Benson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).