[Bug gdb/15573] New: Decode fatal signals to show faulting address, access type, etc.

public inbox for gdb-prs@sourceware.org

* [Bug gdb/15573] New: Decode fatal signals to show faulting address, access type, etc.
@ 2013-06-04 18:37 luto at mit dot edu
  2013-06-05  9:08 ` [Bug gdb/15573] " palves at redhat dot com
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: luto at mit dot edu @ 2013-06-04 18:37 UTC (permalink / raw)
  To: gdb-prs

http://sourceware.org/bugzilla/show_bug.cgi?id=15573

            Bug ID: 15573
           Summary: Decode fatal signals to show faulting address, access
                    type, etc.
           Product: gdb
           Version: 7.5
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: gdb
          Assignee: unassigned at sourceware dot org
          Reporter: luto at mit dot edu

I have a (buggy) program that segfaulted while running in gdb.  gdb said:

Program received signal SIGSEGV, Segmentation fault.

followed by a stacktrace.  If I weren't using gdb, my program's signal handler
would have run and displayed a far more useful error message:

Caught fatal signal: Segmentation fault (Address not mapped to object [0x28])
Dying due to fatal signal Segmentation fault in pid 14030 / tid 14030
The error was "not mapped" at address 28. The CPU reported page not present
reading from 28.

This is on x86_64.  That information comes from psiginfo (the first line) and a
custom decoder that reads SEGV_MAPERR as "not mapped" and pulls the address 28
from siginfo (where it first appears) and from cr2 (where it appears again).
The "page not present" part is the low bit of the CPU's page-fault error code
(from ucontext); if that bit is set, the fault is a "protection violation"
instead.  The "reading from" part is really quite handy when debugging; it
distinguishes read faults from write faults.  The alternatives are "executing
from" and "writing to".

Having gdb decode this information would save a lot of time tracking down bugs.
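
As a rough illustration of the decoding described above, here is a minimal
sketch of the error-code part as a standalone function.  The name
DescribePageFault and the kPf* constants are made up for this example; only the
bit meanings (bit 0 = protection violation, bit 1 = write, bit 4 = instruction
fetch) come from the x86 page-fault error code.

```cpp
#include <cstdint>
#include <string>

// Bits of the x86 page-fault error code (see the Intel or AMD manual).
// The constant names are hypothetical, chosen for this sketch only.
constexpr std::uint64_t kPfPresent = 0x1;   // set: protection violation; clear: page not present
constexpr std::uint64_t kPfWrite   = 0x2;   // set: the access was a write
constexpr std::uint64_t kPfInstr   = 0x10;  // set: the access was an instruction fetch

// Hypothetical helper: turn a page-fault error code into the two phrases
// used in the example output above, e.g. "page not present reading from".
std::string DescribePageFault(std::uint64_t err)
{
    const char *reason = (err & kPfPresent) ? "protection violation"
                                            : "page not present";
    const char *access = (err & kPfInstr) ? "executing from"
                       : (err & kPfWrite) ? "writing to"
                                          : "reading from";
    return std::string(reason) + ' ' + access;
}
```

An error code with bits 0, 1 and 4 all clear, as in the fault reported above,
comes out as "page not present reading from".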

-- 
You are receiving this mail because:
You are on the CC list for the bug.


* [Bug gdb/15573] Decode fatal signals to show faulting address, access type, etc.
  2013-06-04 18:37 [Bug gdb/15573] New: Decode fatal signals to show faulting address, access type, etc luto at mit dot edu
@ 2013-06-05  9:08 ` palves at redhat dot com
  2013-06-05  9:44 ` palves at redhat dot com
  2013-06-15  4:44 ` luto at mit dot edu
  2 siblings, 0 replies; 4+ messages in thread
From: palves at redhat dot com @ 2013-06-05  9:08 UTC (permalink / raw)
  To: gdb-prs

http://sourceware.org/bugzilla/show_bug.cgi?id=15573

Pedro Alves <palves at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |palves at redhat dot com

--- Comment #1 from Pedro Alves <palves at redhat dot com> ---
Note you can get at the siginfo with "p $_siginfo".

cr2 is in mcontext_t, which is in ucontext.  I don't know how GDB could get at
that info before the signal is actually delivered.


* [Bug gdb/15573] Decode fatal signals to show faulting address, access type, etc.
  2013-06-04 18:37 [Bug gdb/15573] New: Decode fatal signals to show faulting address, access type, etc luto at mit dot edu
  2013-06-05  9:08 ` [Bug gdb/15573] " palves at redhat dot com
@ 2013-06-05  9:44 ` palves at redhat dot com
  2013-06-15  4:44 ` luto at mit dot edu
  2 siblings, 0 replies; 4+ messages in thread
From: palves at redhat dot com @ 2013-06-05  9:44 UTC (permalink / raw)
  To: gdb-prs

http://sourceware.org/bugzilla/show_bug.cgi?id=15573

--- Comment #2 from Pedro Alves <palves at redhat dot com> ---
BTW, OOC, is the code for that signal handler of yours something you could
share?


* [Bug gdb/15573] Decode fatal signals to show faulting address, access type, etc.
  2013-06-04 18:37 [Bug gdb/15573] New: Decode fatal signals to show faulting address, access type, etc luto at mit dot edu
  2013-06-05  9:08 ` [Bug gdb/15573] " palves at redhat dot com
  2013-06-05  9:44 ` palves at redhat dot com
@ 2013-06-15  4:44 ` luto at mit dot edu
  2 siblings, 0 replies; 4+ messages in thread
From: luto at mit dot edu @ 2013-06-15  4:44 UTC (permalink / raw)
  To: gdb-prs

http://sourceware.org/bugzilla/show_bug.cgi?id=15573

--- Comment #3 from Andy Lutomirski <luto at mit dot edu> ---
It looks more or less like this.  I can provide some kind of license if it'll
be useful.

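#include <signal.h>       // siginfo_t, psiginfo, struct sigcontext, SI_* codes
#include <stdint.h>       // uintptr_t
#include <stdio.h>        // snprintf
#include <string.h>       // strsignal
#include <sys/syscall.h>  // SYS_gettid
#include <ucontext.h>     // ucontext_t
#include <unistd.h>       // getpid, syscall
#include <iostream>       // std::cerr

static void HandleFatalSignal(int sig, siginfo_t *info, void *context)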
{
    /*
     * This is x86-specific and insanely poorly (wrongly?) documented.
     * I figured it out by reading the kernel source.  --luto
     */
    ucontext_t *uc = (ucontext_t *)context;
    struct sigcontext *sc = (struct sigcontext *)&uc->uc_mcontext;

    psiginfo(info, "Caught fatal signal");

    std::cerr << "Dying due to fatal signal " << strsignal(sig)
              << " in pid " << getpid() << " / tid "
              << syscall(SYS_gettid) << std::endl;

    char causebuf[128];
    snprintf(causebuf, sizeof(causebuf), "code %d", info->si_code);

    const char *cause = causebuf;
    if (info->si_code == SI_USER)
        cause = "kill/raise";
    else if (info->si_code == SI_KERNEL)
        cause = "generic error from kernel";
    else if (info->si_code == SI_QUEUE)
        cause = "sigqueue";
    else if (info->si_code == SI_TKILL)
        cause = "tkill/tgkill";

    if (sig == SIGSEGV || sig == SIGBUS) {
        if (sig == SIGSEGV && info->si_code == SEGV_MAPERR)
            cause = "not mapped";
        else if (sig == SIGSEGV && info->si_code == SEGV_ACCERR)
            cause = "access error";
        else if (sig == SIGBUS && info->si_code == BUS_ADRALN)
            cause = "alignment error";
        else if (sig == SIGBUS && info->si_code == BUS_ADRERR)
            cause = "bad physical address";
        else if (sig == SIGBUS && info->si_code == BUS_OBJERR)
            cause = "object error";
        /* damnit, glibc
        else if (sig == SIGBUS && info->si_code == BUS_MCEERR_AR)
            cause = "mce; action required";
        else if (sig == SIGBUS && info->si_code == BUS_MCEERR_AO)
            cause = "mce; action optional";
        */

        void *cr2 = (void *)sc->cr2;

        // Decode the CPU error code (see Intel or AMD manual)
        const char *hw_reason = (sc->err & 1)
            ? "protection violation"
            : "page not present";
        const char *access_type;
        if (sc->err & 0x10)
            access_type = "executing from";
        else if (sc->err & 0x2)
            access_type = "writing to";
        else
            access_type = "reading from";

        std::cerr << "The error was \"" << cause << "\" at address "
                  << (void *)info->si_addr << ". The CPU reported "
                  << hw_reason << ' ' << access_type << ' '
                  << cr2 << '.' << std::endl;
    } else if (sig == SIGTRAP) {
        if (info->si_code == TRAP_BRKPT)
            cause = "breakpoint";
        else if (info->si_code == TRAP_TRACE)
            cause = "trace trap";
        /* damnit, glibc
        else if (info->si_code == TRAP_BRANCH)
            cause = "process taken branch trap";  // whatever that is...
        else if (info->si_code == TRAP_HWBKPT)
            cause = "hw breakpoint/watchpoint";
        */

        std::cerr << "The error was " << cause << std::endl;
    } else if (sig == SIGILL) {
        if (info->si_code == ILL_ILLOPC)
            cause = "illegal opcode";
        else if (info->si_code == ILL_ILLOPN)
            cause = "illegal operand";
        else if (info->si_code == ILL_ILLADR)
            cause = "illegal addressing mode";
        else if (info->si_code == ILL_ILLTRP)
            cause = "illegal trap";
        else if (info->si_code == ILL_PRVOPC)
            cause = "privileged opcode";
        else if (info->si_code == ILL_PRVREG)
            cause = "privileged register";  // not on x86...
        else if (info->si_code == ILL_COPROC)
            cause = "coprocessor error";  // yay '80s
        else if (info->si_code == ILL_BADSTK)
            cause = "internal stack error";

        std::cerr << "The error was " << cause << std::endl;
    } else {
        // TODO: We could also decode SIGFPE.

        std::cerr << "The error was " << cause << std::endl;
    }

#define SC(x) " " #x " = " << (void *)(uintptr_t)sc->x
    std::cerr << "Signal context:" << SC(rip) << '\n'
              << SC(rax) << SC(rbx) << SC(rcx) << SC(rdx) << '\n'
              << SC(rsi) << SC(rdi) << SC(rbp) << SC(rsp) << '\n'
              << SC(r8) << SC(r9) << SC(r10) << SC(r11) << '\n'
              << SC(r12) << SC(r13) << SC(r14) << SC(r15) << '\n'
              << SC(eflags) << SC(cs) << SC(gs) << SC(fs) << std::endl;
#undef SC
}
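
The handler above takes the three-argument (int, siginfo_t *, void *) form, so
it has to be registered with sigaction and the SA_SIGINFO flag.  The thread
does not show that step; the following is only a sketch of how it might look.
InstallSiginfoHandler, RecordSignal and g_last_signal are hypothetical names
introduced for this example and are not part of the original program.

```cpp
#include <signal.h>
#include <cstring>

// Hypothetical stand-in for a real handler such as HandleFatalSignal:
// it only records which signal arrived.
volatile sig_atomic_t g_last_signal = 0;

void RecordSignal(int sig, siginfo_t *, void *)
{
    g_last_signal = sig;
}

// Hypothetical helper: register a three-argument handler for `sig`.
// SA_SIGINFO tells the kernel to pass siginfo_t and the ucontext pointer.
// Returns true if sigaction succeeded.
bool InstallSiginfoHandler(int sig, void (*handler)(int, siginfo_t *, void *))
{
    struct sigaction sa;
    std::memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(sig, &sa, nullptr) == 0;
}
```

A program would call this once per signal of interest at startup, for example
InstallSiginfoHandler(SIGSEGV, HandleFatalSignal).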


