public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
* [PATCH v3 1/1] Don't rewind PC for GHC generated frames
  2018-02-04  0:07 [PATCH v3 0/1] Don't rewind PC for GHC generated frames Bartosz Nitka
@ 2018-02-04  0:07 ` Bartosz Nitka
  2018-02-12 12:35   ` Bartosz Nitka
  2018-02-12 15:44   ` Simon Marchi
  2018-02-12 13:50 ` [PATCH v3 0/1] " Yao Qi
  1 sibling, 2 replies; 10+ messages in thread
From: Bartosz Nitka @ 2018-02-04  0:07 UTC (permalink / raw)
  To: gdb-patches; +Cc: Bartosz Nitka

GHC - the Haskell compiler generates code that violates one of
GDB's assumptions.

GDB assumes that the address in a frame was generated by the call
instruction and that it is the address right after the call
instruction (I'm paraphrasing the comment in get_frame_address_in_block).
So to get an address in the same block as the call instruction, one
has to substract 1. This is doubly beneficial because some functions
are "noreturn" and don't have further instructions after call,
so GDB would be looking at gibberish.

GHC generates completely different code. It uses jumps instead of call
and manages the stack itself. Furthermore every piece of code is
preceeded by some metadata called the Info Table. If we substract
from the program counter it ends up pointing to the Info Table,
which is undesirable.
GHC has a workaround for this [1] that works most of the time, it
basically lies in the DWARF data and extends the function one byte
backwards. That helps with making unwinding succeed most of the time,
but then the address is also used for looking up symbols and
they can't be resolved.

This change disables program counter rewinding for
compilation units generated by GHC.

Some additional context can be found here [2].

The impact.

Let's take an example from [2].
Here's the example Haskell program (fib.hs):

  fib :: Int -> Int
  fib 0 = 0
  fib 1 = 1
  fib n = fib (n-1) + fib (n-2)
  main :: IO ()
  main = print $ fib 20

If we run it with current GDB master we get:

  (gdb) br fib.hs:3
  Breakpoint 1 at 0x405588: file fib.hs, line 3.
  (gdb) r
  Breakpoint 1, Main_zdwfib_info () at fib.hs:3
  3       fib 1 = 1
  (gdb) bt
  #0  Main_zdwfib_info () at fib.hs:3
  #1  0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
  #2  0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
  #3  0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
  #4  0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
  #5  0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
  #6  0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
  #7  0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
  #8  0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
  #9  0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
  #10 0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
  #11 0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
  #12 0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
  #13 0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
  #14 0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
  #15 0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
  #16 0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
  #17 0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
  #18 0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
  #19 0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
  #20 0x00000000004056d8 in Main_main2_info () at fib.hs:4
  #21 0x000000000040bab0 in base_GHCziIOziHandleziText_zdwwriteBlocks_info () at libraries/base/GHC/IO/Handle/Text.hs:595
  #22 0x0000000000488bf0 in ?? () at rts/Exception.cmm:332
  Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Here's an analogous session after this patch:

  (gdb) br fib.hs:3
  Breakpoint 1 at 0x405588: file fib.hs, line 3.
  (gdb) r
  Breakpoint 1, Main_zdwfib_info () at fib.hs:3
  3       fib 1 = 1
  (gdb) bt
  #0  Main_zdwfib_info () at fib.hs:3
  #1  Main_zdwfib_info () at fib.hs:4
  #2  Main_zdwfib_info () at fib.hs:4
  #3  Main_zdwfib_info () at fib.hs:4
  #4  Main_zdwfib_info () at fib.hs:4
  #5  Main_zdwfib_info () at fib.hs:4
  #6  Main_zdwfib_info () at fib.hs:4
  #7  Main_zdwfib_info () at fib.hs:4
  #8  Main_zdwfib_info () at fib.hs:4
  #9  Main_zdwfib_info () at fib.hs:4
  #10 Main_zdwfib_info () at fib.hs:4
  #11 Main_zdwfib_info () at fib.hs:4
  #12 Main_zdwfib_info () at fib.hs:4
  #13 Main_zdwfib_info () at fib.hs:4
  #14 Main_zdwfib_info () at fib.hs:4
  #15 Main_zdwfib_info () at fib.hs:4
  #16 Main_zdwfib_info () at fib.hs:4
  #17 Main_zdwfib_info () at fib.hs:4
  #18 Main_zdwfib_info () at fib.hs:4
  #19 Main_zdwfib_info () at fib.hs:4
  #20 0x00000000004056d8 in Main_main2_info () at fib.hs:4
  #21 base_GHCziIOziHandleziText_zdwwriteBlocks_info () at libraries/base/GHC/IO/Handle/Text.hs:582
  #22 stg_catch_frame_info () at rts/Exception.cmm:370
  #23 stg_stop_thread_info () at rts/StgStartup.cmm:42
  #24 0x00000000004a62ab in c2MH_str ()
  #25 0x00010000006c94c0 in ?? ()
  #26 0x00000000006e5e60 in ?? ()
  #27 0x00007fffffffdae8 in ?? ()
  #28 0x00007fffffffdcb0 in ?? ()
  #29 0x00000000006c94c0 in Main_main3_closure ()
  #30 0x00007fffffffdbd0 in ?? ()
  #31 0x0000000000405480 in ?? ()
  #32 0x00007fffffffdcb0 in ?? ()
  #33 0x0000000000000000 in ?? ()

There are a couple of things to note here.

First is that the unwinding got further, even a bit too far. It should
have ended on frame #23 on stg_stop_thread_info. That's the first thing
pushed on the stack when running a ligthweight thread. I'm not sure yet
how GHC is supposed to signal that it's the last thing on the stack and
if it's trying to do that.

Second thing is that we get the correct line number for frame #22 *and*
the symbol gets successfully resolved. Before the patch we were
substracting one, making the symbol lookup fail. I think that's also
what makes GDB give a cleaner output on frames #0 - #19.

Note that to get the same results as me, you have to compile the
compiler with debugging symbols as described on [2].

[1] https://phabricator.haskell.org/diffusion/GHC/browse/master/compiler/nativeGen/Dwarf/Types.hs;e9ae0cae9eb6a340473b339b5711ae76c6bdd045$399-417
[2] https://ghc.haskell.org/trac/ghc/wiki/DWARF

gdb/ChangeLog:

       * dwarf2read.c (process_full_comp_unit): Populate
       * producer_is_ghc.
       * frame.c (get_frame_address_in_block): Don't rewind the program
       counter for code generated by GHC.
       * symtab.h (struct compunit_symtab): Add producer_is_ghc.
---
 gdb/ChangeLog    | 7 +++++++
 gdb/dwarf2read.c | 4 ++++
 gdb/frame.c      | 9 ++++++++-
 gdb/symtab.h     | 3 +++
 4 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/gdb/ChangeLog b/gdb/ChangeLog
index 3ce980c8c3..6b35bf34b6 100644
--- a/gdb/ChangeLog
+++ b/gdb/ChangeLog
@@ -1,3 +1,10 @@
+2018-02-01  Bartosz Nitka  <niteria@gmail.com>
+
+	* dwarf2read.c (process_full_comp_unit): Populate producer_is_ghc.
+	* frame.c (get_frame_address_in_block): Don't rewind the program
+	counter for code generated by GHC.
+	* symtab.h (struct compunit_symtab): Add producer_is_ghc.
+
 2018-02-01  Yao Qi  <yao.qi@linaro.org>
 
 	* arm-tdep.c (arm_record_data_proc_misc_ld_str): Rewrite it.
diff --git a/gdb/dwarf2read.c b/gdb/dwarf2read.c
index 51d0f39f75..2516c48741 100644
--- a/gdb/dwarf2read.c
+++ b/gdb/dwarf2read.c
@@ -10501,6 +10501,10 @@ process_full_comp_unit (struct dwarf2_per_cu_data *per_cu,
 	cust->epilogue_unwind_valid = 1;
 
       cust->call_site_htab = cu->call_site_htab;
+
+      if (startswith (cu->producer,
+            "The Glorious Glasgow Haskell Compilation System"))
+        cust->producer_is_ghc = 1;
     }
 
   if (dwarf2_per_objfile->using_index)
diff --git a/gdb/frame.c b/gdb/frame.c
index 1384ecca4f..9ff0dcb130 100644
--- a/gdb/frame.c
+++ b/gdb/frame.c
@@ -2458,7 +2458,14 @@ get_frame_address_in_block (struct frame_info *this_frame)
       && (get_frame_type (this_frame) == NORMAL_FRAME
 	  || get_frame_type (this_frame) == TAILCALL_FRAME
 	  || get_frame_type (this_frame) == INLINE_FRAME))
-    return pc - 1;
+    {
+      /* GHC intermixes metadata (info tables) with code, going back is
+         guaranteed to land us in the metadata.  */
+      struct compunit_symtab *cust = find_pc_compunit_symtab (pc);
+      if (cust != NULL && cust->producer_is_ghc)
+        return pc;
+      return pc - 1;
+    }
 
   return pc;
 }
diff --git a/gdb/symtab.h b/gdb/symtab.h
index f9d52e7697..c164e5ba5f 100644
--- a/gdb/symtab.h
+++ b/gdb/symtab.h
@@ -1432,6 +1432,9 @@ struct compunit_symtab
      instruction).  This is supported by GCC since 4.5.0.  */
   unsigned int epilogue_unwind_valid : 1;
 
+  /* This CU was produced by Glasgow Haskell Compiler */
+  unsigned int producer_is_ghc : 1;
+
   /* struct call_site entries for this compilation unit or NULL.  */
   htab_t call_site_htab;
 
-- 
2.14.1

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v3 0/1] Don't rewind PC for GHC generated frames
@ 2018-02-04  0:07 Bartosz Nitka
  2018-02-04  0:07 ` [PATCH v3 1/1] " Bartosz Nitka
  2018-02-12 13:50 ` [PATCH v3 0/1] " Yao Qi
  0 siblings, 2 replies; 10+ messages in thread
From: Bartosz Nitka @ 2018-02-04  0:07 UTC (permalink / raw)
  To: gdb-patches; +Cc: Bartosz Nitka

Hi Simon and everyone else,

Thank you Simon for taking a look at my patch.
The only thing that changed between v2 and v3 is the 
commit log.

The main improvement is an example that shows the impact
of this change. The example is fairly small, on bigger ones the
effect is bigger. I've been using GDB for debugging Haskell 
code and often you have to resort to manually unwinding the 
stack, which can be time consuming and isn't friendly to beginners.

I've also corrected the claim that substracting 1 would 
somehow land us at the beginning of the call instruction
and reworded a couple of sentences to make it read better.

As a side note, Haskell code can be pretty dense and a lot
of functionality can fit on one line. GHC already puts column
numbers in the debugging information, but from what I can tell
GDB doesn't expose that in any way. 
Would that be a worthwhile addition to GDB?

Thanks,
Bartosz

Bartosz Nitka (1):
  Don't rewind PC for GHC generated frames

 gdb/ChangeLog    | 7 +++++++
 gdb/dwarf2read.c | 4 ++++
 gdb/frame.c      | 9 ++++++++-
 gdb/symtab.h     | 3 +++
 4 files changed, 22 insertions(+), 1 deletion(-)

-- 
2.14.1

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 1/1] Don't rewind PC for GHC generated frames
  2018-02-04  0:07 ` [PATCH v3 1/1] " Bartosz Nitka
@ 2018-02-12 12:35   ` Bartosz Nitka
  2018-02-12 15:44   ` Simon Marchi
  1 sibling, 0 replies; 10+ messages in thread
From: Bartosz Nitka @ 2018-02-12 12:35 UTC (permalink / raw)
  To: gdb-patches, simon.marchi

Hi,

Is there anyway I can help to move this forward?

Thanks,
Bartosz

2018-02-04 0:06 GMT+00:00 Bartosz Nitka <niteria@gmail.com>:
> GHC - the Haskell compiler generates code that violates one of
> GDB's assumptions.
>
> GDB assumes that the address in a frame was generated by the call
> instruction and that it is the address right after the call
> instruction (I'm paraphrasing the comment in get_frame_address_in_block).
> So to get an address in the same block as the call instruction, one
> has to substract 1. This is doubly beneficial because some functions
> are "noreturn" and don't have further instructions after call,
> so GDB would be looking at gibberish.
>
> GHC generates completely different code. It uses jumps instead of call
> and manages the stack itself. Furthermore every piece of code is
> preceeded by some metadata called the Info Table. If we substract
> from the program counter it ends up pointing to the Info Table,
> which is undesirable.
> GHC has a workaround for this [1] that works most of the time, it
> basically lies in the DWARF data and extends the function one byte
> backwards. That helps with making unwinding succeed most of the time,
> but then the address is also used for looking up symbols and
> they can't be resolved.
>
> This change disables program counter rewinding for
> compilation units generated by GHC.
>
> Some additional context can be found here [2].
>
> The impact.
>
> Let's take an example from [2].
> Here's the example Haskell program (fib.hs):
>
>   fib :: Int -> Int
>   fib 0 = 0
>   fib 1 = 1
>   fib n = fib (n-1) + fib (n-2)
>   main :: IO ()
>   main = print $ fib 20
>
> If we run it with current GDB master we get:
>
>   (gdb) br fib.hs:3
>   Breakpoint 1 at 0x405588: file fib.hs, line 3.
>   (gdb) r
>   Breakpoint 1, Main_zdwfib_info () at fib.hs:3
>   3       fib 1 = 1
>   (gdb) bt
>   #0  Main_zdwfib_info () at fib.hs:3
>   #1  0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
>   #2  0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
>   #3  0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
>   #4  0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
>   #5  0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
>   #6  0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
>   #7  0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
>   #8  0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
>   #9  0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
>   #10 0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
>   #11 0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
>   #12 0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
>   #13 0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
>   #14 0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
>   #15 0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
>   #16 0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
>   #17 0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
>   #18 0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
>   #19 0x00000000004055c8 in Main_zdwfib_info () at fib.hs:4
>   #20 0x00000000004056d8 in Main_main2_info () at fib.hs:4
>   #21 0x000000000040bab0 in base_GHCziIOziHandleziText_zdwwriteBlocks_info () at libraries/base/GHC/IO/Handle/Text.hs:595
>   #22 0x0000000000488bf0 in ?? () at rts/Exception.cmm:332
>   Backtrace stopped: previous frame identical to this frame (corrupt stack?)
>
> Here's an analogous session after this patch:
>
>   (gdb) br fib.hs:3
>   Breakpoint 1 at 0x405588: file fib.hs, line 3.
>   (gdb) r
>   Breakpoint 1, Main_zdwfib_info () at fib.hs:3
>   3       fib 1 = 1
>   (gdb) bt
>   #0  Main_zdwfib_info () at fib.hs:3
>   #1  Main_zdwfib_info () at fib.hs:4
>   #2  Main_zdwfib_info () at fib.hs:4
>   #3  Main_zdwfib_info () at fib.hs:4
>   #4  Main_zdwfib_info () at fib.hs:4
>   #5  Main_zdwfib_info () at fib.hs:4
>   #6  Main_zdwfib_info () at fib.hs:4
>   #7  Main_zdwfib_info () at fib.hs:4
>   #8  Main_zdwfib_info () at fib.hs:4
>   #9  Main_zdwfib_info () at fib.hs:4
>   #10 Main_zdwfib_info () at fib.hs:4
>   #11 Main_zdwfib_info () at fib.hs:4
>   #12 Main_zdwfib_info () at fib.hs:4
>   #13 Main_zdwfib_info () at fib.hs:4
>   #14 Main_zdwfib_info () at fib.hs:4
>   #15 Main_zdwfib_info () at fib.hs:4
>   #16 Main_zdwfib_info () at fib.hs:4
>   #17 Main_zdwfib_info () at fib.hs:4
>   #18 Main_zdwfib_info () at fib.hs:4
>   #19 Main_zdwfib_info () at fib.hs:4
>   #20 0x00000000004056d8 in Main_main2_info () at fib.hs:4
>   #21 base_GHCziIOziHandleziText_zdwwriteBlocks_info () at libraries/base/GHC/IO/Handle/Text.hs:582
>   #22 stg_catch_frame_info () at rts/Exception.cmm:370
>   #23 stg_stop_thread_info () at rts/StgStartup.cmm:42
>   #24 0x00000000004a62ab in c2MH_str ()
>   #25 0x00010000006c94c0 in ?? ()
>   #26 0x00000000006e5e60 in ?? ()
>   #27 0x00007fffffffdae8 in ?? ()
>   #28 0x00007fffffffdcb0 in ?? ()
>   #29 0x00000000006c94c0 in Main_main3_closure ()
>   #30 0x00007fffffffdbd0 in ?? ()
>   #31 0x0000000000405480 in ?? ()
>   #32 0x00007fffffffdcb0 in ?? ()
>   #33 0x0000000000000000 in ?? ()
>
> There are a couple of things to note here.
>
> First is that the unwinding got further, even a bit too far. It should
> have ended on frame #23 on stg_stop_thread_info. That's the first thing
> pushed on the stack when running a ligthweight thread. I'm not sure yet
> how GHC is supposed to signal that it's the last thing on the stack and
> if it's trying to do that.
>
> Second thing is that we get the correct line number for frame #22 *and*
> the symbol gets successfully resolved. Before the patch we were
> substracting one, making the symbol lookup fail. I think that's also
> what makes GDB give a cleaner output on frames #0 - #19.
>
> Note that to get the same results as me, you have to compile the
> compiler with debugging symbols as described on [2].
>
> [1] https://phabricator.haskell.org/diffusion/GHC/browse/master/compiler/nativeGen/Dwarf/Types.hs;e9ae0cae9eb6a340473b339b5711ae76c6bdd045$399-417
> [2] https://ghc.haskell.org/trac/ghc/wiki/DWARF
>
> gdb/ChangeLog:
>
>        * dwarf2read.c (process_full_comp_unit): Populate
>        * producer_is_ghc.
>        * frame.c (get_frame_address_in_block): Don't rewind the program
>        counter for code generated by GHC.
>        * symtab.h (struct compunit_symtab): Add producer_is_ghc.
> ---
>  gdb/ChangeLog    | 7 +++++++
>  gdb/dwarf2read.c | 4 ++++
>  gdb/frame.c      | 9 ++++++++-
>  gdb/symtab.h     | 3 +++
>  4 files changed, 22 insertions(+), 1 deletion(-)
>
> diff --git a/gdb/ChangeLog b/gdb/ChangeLog
> index 3ce980c8c3..6b35bf34b6 100644
> --- a/gdb/ChangeLog
> +++ b/gdb/ChangeLog
> @@ -1,3 +1,10 @@
> +2018-02-01  Bartosz Nitka  <niteria@gmail.com>
> +
> +       * dwarf2read.c (process_full_comp_unit): Populate producer_is_ghc.
> +       * frame.c (get_frame_address_in_block): Don't rewind the program
> +       counter for code generated by GHC.
> +       * symtab.h (struct compunit_symtab): Add producer_is_ghc.
> +
>  2018-02-01  Yao Qi  <yao.qi@linaro.org>
>
>         * arm-tdep.c (arm_record_data_proc_misc_ld_str): Rewrite it.
> diff --git a/gdb/dwarf2read.c b/gdb/dwarf2read.c
> index 51d0f39f75..2516c48741 100644
> --- a/gdb/dwarf2read.c
> +++ b/gdb/dwarf2read.c
> @@ -10501,6 +10501,10 @@ process_full_comp_unit (struct dwarf2_per_cu_data *per_cu,
>         cust->epilogue_unwind_valid = 1;
>
>        cust->call_site_htab = cu->call_site_htab;
> +
> +      if (startswith (cu->producer,
> +            "The Glorious Glasgow Haskell Compilation System"))
> +        cust->producer_is_ghc = 1;
>      }
>
>    if (dwarf2_per_objfile->using_index)
> diff --git a/gdb/frame.c b/gdb/frame.c
> index 1384ecca4f..9ff0dcb130 100644
> --- a/gdb/frame.c
> +++ b/gdb/frame.c
> @@ -2458,7 +2458,14 @@ get_frame_address_in_block (struct frame_info *this_frame)
>        && (get_frame_type (this_frame) == NORMAL_FRAME
>           || get_frame_type (this_frame) == TAILCALL_FRAME
>           || get_frame_type (this_frame) == INLINE_FRAME))
> -    return pc - 1;
> +    {
> +      /* GHC intermixes metadata (info tables) with code, going back is
> +         guaranteed to land us in the metadata.  */
> +      struct compunit_symtab *cust = find_pc_compunit_symtab (pc);
> +      if (cust != NULL && cust->producer_is_ghc)
> +        return pc;
> +      return pc - 1;
> +    }
>
>    return pc;
>  }
> diff --git a/gdb/symtab.h b/gdb/symtab.h
> index f9d52e7697..c164e5ba5f 100644
> --- a/gdb/symtab.h
> +++ b/gdb/symtab.h
> @@ -1432,6 +1432,9 @@ struct compunit_symtab
>       instruction).  This is supported by GCC since 4.5.0.  */
>    unsigned int epilogue_unwind_valid : 1;
>
> +  /* This CU was produced by Glasgow Haskell Compiler */
> +  unsigned int producer_is_ghc : 1;
> +
>    /* struct call_site entries for this compilation unit or NULL.  */
>    htab_t call_site_htab;
>
> --
> 2.14.1
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 0/1] Don't rewind PC for GHC generated frames
  2018-02-04  0:07 [PATCH v3 0/1] Don't rewind PC for GHC generated frames Bartosz Nitka
  2018-02-04  0:07 ` [PATCH v3 1/1] " Bartosz Nitka
@ 2018-02-12 13:50 ` Yao Qi
  2018-02-12 14:14   ` Bartosz Nitka
  1 sibling, 1 reply; 10+ messages in thread
From: Yao Qi @ 2018-02-12 13:50 UTC (permalink / raw)
  To: Bartosz Nitka; +Cc: GDB Patches

On Sun, Feb 4, 2018 at 12:06 AM, Bartosz Nitka <niteria@gmail.com> wrote:
> As a side note, Haskell code can be pretty dense and a lot
> of functionality can fit on one line. GHC already puts column
> numbers in the debugging information, but from what I can tell
> GDB doesn't expose that in any way.
> Would that be a worthwhile addition to GDB?

If what you meant is DW_LNS_set_column, then, other compilers
also generates it, and GDB doesn't use it.  Maybe, GDB can use
it in TUI mode, to blink on the right line and column, or GDB tells
MI front ends the column number so they can highlight accordingly.
What do you want to add to GDB?

-- 
Yao (齐尧)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 0/1] Don't rewind PC for GHC generated frames
  2018-02-12 13:50 ` [PATCH v3 0/1] " Yao Qi
@ 2018-02-12 14:14   ` Bartosz Nitka
  2018-02-12 14:42     ` Yao Qi
  0 siblings, 1 reply; 10+ messages in thread
From: Bartosz Nitka @ 2018-02-12 14:14 UTC (permalink / raw)
  To: Yao Qi; +Cc: GDB Patches

2018-02-12 13:50 GMT+00:00 Yao Qi <qiyaoltc@gmail.com>:

> If what you meant is DW_LNS_set_column, then, other compilers
> also generates it, and GDB doesn't use it.

I'm not familiar with DW_LNS_set_column, perhaps GHC can be made to use it.
The current encoding is described in [1]. Based on my experimentation [2] lldb
seems to understand it. Note
    LineEntry: [0x00000000004056f7-0x0000000000405720):
/data/users/bnitka/tmp/fib.hs:3:9
there. The second number (9) is the column number.

> Maybe, GDB can use
> it in TUI mode, to blink on the right line and column, or GDB tells
> MI front ends the column number so they can highlight accordingly.

I'm not a regular user of TUI mode or other front ends, so this
wouldn't be the priority for me.

> What do you want to add to GDB?

Using the example from the summary. Currently gdb shows:

   (gdb) bt
   #0  Main_zdwfib_info () at fib.hs:3
   #1  Main_zdwfib_info () at fib.hs:4
   #2  Main_zdwfib_info () at fib.hs:4
   #3  Main_zdwfib_info () at fib.hs:4

I'd like it to show something like:

   (gdb) bt
   #0  Main_zdwfib_info () at fib.hs:3:9
   #1  Main_zdwfib_info () at fib.hs:4:20
   #2  Main_zdwfib_info () at fib.hs:4:20
   #3  Main_zdwfib_info () at fib.hs:4:20

Similarly, currently when you do:

  (gdb) info line *f1_info
  Line 4 of "fib.hs" starts at address...

I'd like to get:

  Line 4, column 20 of "fib.hs" starts at address...

The ability to break on line + column would also be welcome, but it's
not that important to me.

[1] https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/debug-info.html?highlight=dw_tag_ghc_src_note#dw-tag-ghc-src-note
[2] https://ghc.haskell.org/trac/ghc/wiki/DWARF#lldb

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 0/1] Don't rewind PC for GHC generated frames
  2018-02-12 14:14   ` Bartosz Nitka
@ 2018-02-12 14:42     ` Yao Qi
  2018-02-19 13:27       ` Bartosz Nitka
  0 siblings, 1 reply; 10+ messages in thread
From: Yao Qi @ 2018-02-12 14:42 UTC (permalink / raw)
  To: Bartosz Nitka; +Cc: GDB Patches

On Mon, Feb 12, 2018 at 2:14 PM, Bartosz Nitka <niteria@gmail.com> wrote:
> 2018-02-12 13:50 GMT+00:00 Yao Qi <qiyaoltc@gmail.com>:
>
>> If what you meant is DW_LNS_set_column, then, other compilers
>> also generates it, and GDB doesn't use it.
>
> I'm not familiar with DW_LNS_set_column, perhaps GHC can be made to use it.
> The current encoding is described in [1]. Based on my experimentation [2] lldb
> seems to understand it. Note
>     LineEntry: [0x00000000004056f7-0x0000000000405720):
> /data/users/bnitka/tmp/fib.hs:3:9
> there. The second number (9) is the column number.

GHC is completely new to me, so I don't understand the reason
DW_TAG_ghc_src_note was added to describe line/column.  To me,
it is .debug_line's responsibility.

>
>> What do you want to add to GDB?
>
> Using the example from the summary. Currently gdb shows:
>
>    (gdb) bt
>    #0  Main_zdwfib_info () at fib.hs:3
>    #1  Main_zdwfib_info () at fib.hs:4
>    #2  Main_zdwfib_info () at fib.hs:4
>    #3  Main_zdwfib_info () at fib.hs:4
>
> I'd like it to show something like:
>
>    (gdb) bt
>    #0  Main_zdwfib_info () at fib.hs:3:9
>    #1  Main_zdwfib_info () at fib.hs:4:20
>    #2  Main_zdwfib_info () at fib.hs:4:20
>    #3  Main_zdwfib_info () at fib.hs:4:20
>
> Similarly, currently when you do:
>
>   (gdb) info line *f1_info
>   Line 4 of "fib.hs" starts at address...
>
> I'd like to get:
>
>   Line 4, column 20 of "fib.hs" starts at address...
>

Yes, they are good improvements to GDB, IMO.

FAOD, it is good to me that we can show and use column
information in GDB.  If the column information is from
DW_LNS_set_column, that is absolutely fine to me, because
DW_LNS_set_column is in standard dwarf spec.  However,
if column information is from DW_TAG_ghc_src_note, we want
to support it, it should be discussed as "Haskell/GHC support in
GDB".  It is great if you can describe the reason that
DW_TAG_ghc_src_note was invented rather than using existing
dwarf line program to describe line/column information.

-- 
Yao (齐尧)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 1/1] Don't rewind PC for GHC generated frames
  2018-02-04  0:07 ` [PATCH v3 1/1] " Bartosz Nitka
  2018-02-12 12:35   ` Bartosz Nitka
@ 2018-02-12 15:44   ` Simon Marchi
  2018-02-19 13:10     ` Bartosz Nitka
  1 sibling, 1 reply; 10+ messages in thread
From: Simon Marchi @ 2018-02-12 15:44 UTC (permalink / raw)
  To: Bartosz Nitka; +Cc: gdb-patches

Hi Bartosz,

Thanks for the additional info in the commit message, it's very useful.

I hoped there would be some comments from others.  In particular, is 
anybody able to tell if adding a call to find_pc_compunit_symtab in 
get_frame_address_in_block is a performance concern?  How frequently is 
get_frame_address_in_block called, and how costly is 
find_pc_compunit_symtab to call?

I noted two small comments, no need to submit another version just for 
that, we can fix it before pushing.  However, please tell me if you are 
fine with my suggestion below.

> diff --git a/gdb/frame.c b/gdb/frame.c
> index 1384ecca4f..9ff0dcb130 100644
> --- a/gdb/frame.c
> +++ b/gdb/frame.c
> @@ -2458,7 +2458,14 @@ get_frame_address_in_block (struct frame_info
> *this_frame)
>        && (get_frame_type (this_frame) == NORMAL_FRAME
>  	  || get_frame_type (this_frame) == TAILCALL_FRAME
>  	  || get_frame_type (this_frame) == INLINE_FRAME))
> -    return pc - 1;
> +    {
> +      /* GHC intermixes metadata (info tables) with code, going back 
> is
> +         guaranteed to land us in the metadata.  */

I think the main reason why we don't want to subtract one is that the 
compiler generates return addresses that refer to the jump instruction 
that made the "call", and not to the instruction after, as with the call 
instruction.  I think the comment should reflect that.  What about:

/* GHC (Glasgow Haskell Compiler) uses jump and manages its own stack 
rather than
    using the call instruction.  The return address it generates is that 
of the
    jump instruction, not the following instruction.  It is therefore not
    necessary to go back one byte.  It would not be desirable anyway, 
because it
    intermixes metadat (info tables) with code, so going back would land 
us in the
    metadata.  */

> +      struct compunit_symtab *cust = find_pc_compunit_symtab (pc);
> +      if (cust != NULL && cust->producer_is_ghc)
> +        return pc;
> +      return pc - 1;
> +    }
> 
>    return pc;
>  }
> diff --git a/gdb/symtab.h b/gdb/symtab.h
> index f9d52e7697..c164e5ba5f 100644
> --- a/gdb/symtab.h
> +++ b/gdb/symtab.h
> @@ -1432,6 +1432,9 @@ struct compunit_symtab
>       instruction).  This is supported by GCC since 4.5.0.  */
>    unsigned int epilogue_unwind_valid : 1;
> 
> +  /* This CU was produced by Glasgow Haskell Compiler */

The comment should end with a period and two spaces.

Thanks,

Simon

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 1/1] Don't rewind PC for GHC generated frames
  2018-02-12 15:44   ` Simon Marchi
@ 2018-02-19 13:10     ` Bartosz Nitka
  2018-02-19 17:02       ` Simon Marchi
  0 siblings, 1 reply; 10+ messages in thread
From: Bartosz Nitka @ 2018-02-19 13:10 UTC (permalink / raw)
  To: Simon Marchi; +Cc: GDB Patches

Hi Simon,

Thanks for taking a look at the updated version.

> I hoped there would be some comments from others.

Is there anyone we could ask directly?

> In particular, is anybody
> able to tell if adding a call to find_pc_compunit_symtab in
> get_frame_address_in_block is a performance concern?  How frequently is
> get_frame_address_in_block called, and how costly is find_pc_compunit_symtab
> to call?

This is a valid concern.
When I originally made this change, I was focusing on making dwarf2_frame_cache
work. dwarf2_frame_cache calls get_frame_address_in_block and then calls
dwarf2_frame_find_quirks. dwarf2_frame_find_quirks calls
find_pc_compunit_symtab.
That makes me think that for this code path there should be no difference.

That's only one code path, I haven't analyzed others. If performance
turned out to be a
problem, then undoing the effect of get_frame_address_in_block in
dwarf2_frame_cache
would get me 90% of what I care about currently (unwinding).

That said, there must be a way to resolve this in a data driven way.
Are there any pre-existing benchmarks I can run?
Is there a scenario that I can benchmark to resolve the questions above?

> However, please tell me if you are fine with
> my suggestion below.

Yes, that's a good change. Thanks! If it helps move things along I'm
happy to submit
an updated version.

Thanks,
Bartosz

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 0/1] Don't rewind PC for GHC generated frames
  2018-02-12 14:42     ` Yao Qi
@ 2018-02-19 13:27       ` Bartosz Nitka
  0 siblings, 0 replies; 10+ messages in thread
From: Bartosz Nitka @ 2018-02-19 13:27 UTC (permalink / raw)
  To: Yao Qi; +Cc: GDB Patches

Hi Yao,

Thanks for taking a look.
I agree that GHC should use DW_LNS_set_column and I will try to see
what I can do about that.
My understanding is that DW_TAG_ghc_src_note was introduced because we
don't get enough
fidelity with DW_LNS_set_column. I think it's because we have ranges
for expression locations, and
sometimes the start points overlap, but I'm a bit hazy on the details.
I want to try DW_LNS_set_column and see how far I get with that.

One thing I don't understand is why lldb seems to understand
DW_TAG_ghc_src_note. I need to investigate a bit more.

Thanks,
Bartosz

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 1/1] Don't rewind PC for GHC generated frames
  2018-02-19 13:10     ` Bartosz Nitka
@ 2018-02-19 17:02       ` Simon Marchi
  0 siblings, 0 replies; 10+ messages in thread
From: Simon Marchi @ 2018-02-19 17:02 UTC (permalink / raw)
  To: Bartosz Nitka; +Cc: GDB Patches

On 2018-02-19 08:10 AM, Bartosz Nitka wrote:
> Hi Simon,
> 
> Thanks for taking a look at the updated version.
> 
>> I hoped there would be some comments from others.
> 
> Is there anyone we could ask directly?

I don't have anybody in particular in mind.

>> In particular, is anybody
>> able to tell if adding a call to find_pc_compunit_symtab in
>> get_frame_address_in_block is a performance concern?  How frequently is
>> get_frame_address_in_block called, and how costly is find_pc_compunit_symtab
>> to call?
> 
> This is a valid concern.
> When I originally made this change, I was focusing on making dwarf2_frame_cache
> work. dwarf2_frame_cache calls get_frame_address_in_block and then calls
> dwarf2_frame_find_quirks. dwarf2_frame_find_quirks calls
> find_pc_compunit_symtab.
> That makes me think that for this code path there should be no difference.
> 
> That's only one code path, I haven't analyzed others. If performance
> turned out to be a
> problem, then undoing the effect of get_frame_address_in_block in
> dwarf2_frame_cache
> would get me 90% of what I care about currently (unwinding).
> 
> That said, there must be a way to resolve this in a data driven way.
> Are there any pre-existing benchmarks I can run?
> Is there a scenario that I can benchmark to resolve the questions above?

You can try running the tests in testsuite/gdb.perf (see the README in there).
These tests stress GDB in some very specific ways, so they may not catch
all performance regressions in general, but it might still be a good idea to run
them anyway.  Can you do that?

>> However, please tell me if you are fine with
>> my suggestion below.
> 
> Yes, that's a good change. Thanks! If it helps move things along I'm
> happy to submit
> an updated version.

I don't think it will be necessary, I can adjust the patch before pushing it.

Simon

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2018-02-19 17:02 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-02-04  0:07 [PATCH v3 0/1] Don't rewind PC for GHC generated frames Bartosz Nitka
2018-02-04  0:07 ` [PATCH v3 1/1] " Bartosz Nitka
2018-02-12 12:35   ` Bartosz Nitka
2018-02-12 15:44   ` Simon Marchi
2018-02-19 13:10     ` Bartosz Nitka
2018-02-19 17:02       ` Simon Marchi
2018-02-12 13:50 ` [PATCH v3 0/1] " Yao Qi
2018-02-12 14:14   ` Bartosz Nitka
2018-02-12 14:42     ` Yao Qi
2018-02-19 13:27       ` Bartosz Nitka

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).