From: Lancelot SIX
To: gdb-patches@sourceware.org
Cc: Lancelot SIX
Subject: [PATCH] gdb: riscv_scan_prologue: handle LD instruction
Date: Wed, 4 Aug 2021 21:49:35 +0000
Message-Id: <20210804214935.704303-1-lsix@lancelotsix.com>

While working on the testsuite, I ended up noticing that GDB fails to
produce a full backtrace from a thread waiting in pthread_join.
When selecting the waiting thread and using the 'bt' command, the
following result can be observed:

    (gdb) bt
    #0  0x0000003ff7fccd20 in __futex_abstimed_wait_common64 () from /lib/riscv64-linux-gnu/libpthread.so.0
    #1  0x0000003ff7fc43da in __pthread_clockjoin_ex () from /lib/riscv64-linux-gnu/libpthread.so.0
    Backtrace stopped: frame did not save the PC

On my platform, I do not have debug symbols for glibc, so I need to
rely on prologue analysis in order to unwind the stack.

Here is what the function prologue looks like:

    (gdb) disassemble __pthread_clockjoin_ex
    Dump of assembler code for function __pthread_clockjoin_ex:
       0x0000003ff7fc42de <+0>:     addi    sp,sp,-144
       0x0000003ff7fc42e0 <+2>:     sd      s5,88(sp)
       0x0000003ff7fc42e2 <+4>:     auipc   s5,0xd
       0x0000003ff7fc42e6 <+8>:     ld      s5,-2(s5) # 0x3ff7fd12e0
       0x0000003ff7fc42ea <+12>:    ld      a5,0(s5)
       0x0000003ff7fc42ee <+16>:    sd      ra,136(sp)
       0x0000003ff7fc42f0 <+18>:    sd      s0,128(sp)
       0x0000003ff7fc42f2 <+20>:    sd      s1,120(sp)
       0x0000003ff7fc42f4 <+22>:    sd      s2,112(sp)
       0x0000003ff7fc42f6 <+24>:    sd      s3,104(sp)
       0x0000003ff7fc42f8 <+26>:    sd      s4,96(sp)
       0x0000003ff7fc42fa <+28>:    sd      s6,80(sp)
       0x0000003ff7fc42fc <+30>:    sd      s7,72(sp)
       0x0000003ff7fc42fe <+32>:    sd      s8,64(sp)
       0x0000003ff7fc4300 <+34>:    sd      s9,56(sp)
       0x0000003ff7fc4302 <+36>:    sd      a5,40(sp)

As far as prologue analysis is concerned, the most interesting part is
done at address 0x0000003ff7fc42ee (<+16>): 'sd ra,136(sp)'.  This
stores the RA (return address) register on the stack, which is the
information we are looking for in order to identify the caller.

In the current implementation of the prologue scanner, GDB stops when
hitting 0x0000003ff7fc42e6 (<+8>) because it does not know what to do
with the 'ld' instruction.  GDB thinks it has reached the end of the
prologue, but it has not yet reached the important part, which explains
GDB's inability to unwind past this point.
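
To illustrate the failure mode, here is a minimal standalone C++ sketch.
It is not GDB code: the Op, Insn, scan_finds_saved_ra and sample_prologue
names are made up for this example, and the scanner is reduced to the one
property that matters here, namely that it gives up at the first
instruction it does not understand.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, heavily simplified instruction kinds.  This is not
// GDB's riscv_insn class, only an illustration.
enum class Op { ADDI_SP, SD, AUIPC, LD, OTHER };

struct Insn
{
  Op op;
  int rs2;   // Register being stored (only meaningful for SD).
  long imm;  // Immediate / stack offset (unused by the scan below).
};

constexpr int RA = 1;  // x1 is the return address register in the RISC-V ABI.

// Walk the prologue and stop at the first instruction the scanner does
// not understand.  Return true if a store of RA was seen before stopping.
bool
scan_finds_saved_ra (const std::vector<Insn> &prologue, bool understands_ld)
{
  for (const Insn &insn : prologue)
    {
      switch (insn.op)
        {
        case Op::ADDI_SP:
        case Op::AUIPC:
          break;
        case Op::SD:
          if (insn.rs2 == RA)
            return true;
          break;
        case Op::LD:
          if (!understands_ld)
            return false;  // Scanner believes the prologue ends here.
          break;
        case Op::OTHER:
          return false;
        }
    }
  return false;
}

// First instructions of the __pthread_clockjoin_ex prologue quoted above.
std::vector<Insn>
sample_prologue ()
{
  return {
    {Op::ADDI_SP, 0, -144},  // addi sp,sp,-144
    {Op::SD, 21, 88},        // sd s5,88(sp)   (s5 is x21)
    {Op::AUIPC, 0, 0},       // auipc s5,0xd
    {Op::LD, 0, -2},         // ld s5,-2(s5)
    {Op::LD, 0, 0},          // ld a5,0(s5)
    {Op::SD, RA, 136},       // sd ra,136(sp)
  };
}
```

Calling scan_finds_saved_ra (sample_prologue (), false) models a scanner
that does not know 'ld' and returns false, while passing true models a
scanner that steps over 'ld' and returns true because it reaches
'sd ra,136(sp)'.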

The section of the prologue starting at <+4> until <+12> is used to
load the stack canary[1], which will then be placed on the stack at
<+36> at the end of the prologue.

In order to have the prologue properly handled, this commit proposes to
add support for the ld instruction in the RISC-V prologue scanner.
I guess riscv32 would use lw in such a situation, but I have not tested
that and have therefore not included it in this patch.

With this patch applied, gdb is now able to unwind past pthread_join:

    (gdb) bt
    #0  0x0000003ff7fccd20 in __futex_abstimed_wait_common64 () from /lib/riscv64-linux-gnu/libpthread.so.0
    #1  0x0000003ff7fc43da in __pthread_clockjoin_ex () from /lib/riscv64-linux-gnu/libpthread.so.0
    #2  0x0000002aaaaaa88e in bar() ()
    #3  0x0000002aaaaaa8c4 in foo() ()
    #4  0x0000002aaaaaa8da in main ()

I have had a look to see if I could reproduce this easily, but in my
simple testcases using '-fstack-protector-all', the canary is loaded
after the RA register is saved.  I do not have a reliable way of
generating a prologue similar to the problematic one, so I forged one
instead.

The testsuite has been run on riscv64 ubuntu 21.01 with no regression
observed.

[1] https://en.wikipedia.org/wiki/Buffer_overflow_protection#Canaries
---
 gdb/riscv-tdep.c                              | 23 +++++++
 .../riscv64-unwind-prologue-with-ld-foo.s     | 64 +++++++++++++++++++
 .../riscv64-unwind-prologue-with-ld.c         | 30 +++++++++
 .../riscv64-unwind-prologue-with-ld.exp       | 45 +++++++++++++
 4 files changed, 162 insertions(+)
 create mode 100644 gdb/testsuite/gdb.arch/riscv64-unwind-prologue-with-ld-foo.s
 create mode 100644 gdb/testsuite/gdb.arch/riscv64-unwind-prologue-with-ld.c
 create mode 100644 gdb/testsuite/gdb.arch/riscv64-unwind-prologue-with-ld.exp

diff --git a/gdb/riscv-tdep.c b/gdb/riscv-tdep.c
index b5b0d2d79de..8ce619ca805 100644
--- a/gdb/riscv-tdep.c
+++ b/gdb/riscv-tdep.c
@@ -1409,6 +1409,7 @@ class riscv_insn
       LUI,
       SD,
       SW,
+      LD,
       /* These are needed for software breakpoint support.  */
       JAL,
       JALR,
@@ -1519,6 +1520,15 @@ class riscv_insn
     m_imm.s = EXTRACT_CITYPE_IMM (ival);
   }
 
+  /* Helper for DECODE, decode 16-bit compressed CL-type instruction.  */
+  void decode_cl_type_insn (enum opcode opcode, ULONGEST ival)
+  {
+    m_opcode = opcode;
+    m_rd = decode_register_index_short (ival, OP_SH_CRS2S);
+    m_rs1 = decode_register_index_short (ival, OP_SH_CRS1S);
+    m_imm.s = EXTRACT_CLTYPE_IMM (ival);
+  }
+
   /* Helper for DECODE, decode 32-bit S-type instruction.  */
   void decode_s_type_insn (enum opcode opcode, ULONGEST ival)
   {
@@ -1715,6 +1725,8 @@ riscv_insn::decode (struct gdbarch *gdbarch, CORE_ADDR pc)
         decode_r_type_insn (SC, ival);
       else if (is_ecall_insn (ival))
         decode_i_type_insn (ECALL, ival);
+      else if (is_ld_insn (ival))
+        decode_i_type_insn (LD, ival);
       else
         /* None of the other fields are valid in this case.  */
         m_opcode = OTHER;
@@ -1783,6 +1795,8 @@ riscv_insn::decode (struct gdbarch *gdbarch, CORE_ADDR pc)
         decode_cb_type_insn (BEQ, ival);
       else if (is_c_bnez_insn (ival))
         decode_cb_type_insn (BNE, ival);
+      else if (is_c_ld_insn (ival))
+        decode_cl_type_insn (LD, ival);
       else
         /* None of the other fields of INSN are valid in this case.  */
         m_opcode = OTHER;
@@ -1931,6 +1945,15 @@ riscv_scan_prologue (struct gdbarch *gdbarch,
           gdb_assert (insn.rs2 () < RISCV_NUM_INTEGER_REGS);
           regs[insn.rd ()] = pv_add (regs[insn.rs1 ()], regs[insn.rs2 ()]);
         }
+      else if (insn.opcode () == riscv_insn::LD)
+        {
+          /* Handle: ld reg, offset(rs1).  */
+          gdb_assert (insn.rd () < RISCV_NUM_INTEGER_REGS);
+          gdb_assert (insn.rs1 () < RISCV_NUM_INTEGER_REGS);
+          regs[insn.rd ()]
+            = stack.fetch (pv_add_constant (regs[insn.rs1 ()],
+                                            insn.imm_signed ()), 8);
+        }
       else
         {
           end_prologue_addr = cur_pc;
diff --git a/gdb/testsuite/gdb.arch/riscv64-unwind-prologue-with-ld-foo.s b/gdb/testsuite/gdb.arch/riscv64-unwind-prologue-with-ld-foo.s
new file mode 100644
index 00000000000..8666412457a
--- /dev/null
+++ b/gdb/testsuite/gdb.arch/riscv64-unwind-prologue-with-ld-foo.s
@@ -0,0 +1,64 @@
+/* Copyright 2021 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see .  */
+
+/* This testcase contains a function where the 'ld' instruction is used
+   in the prologue before the RA register has been saved on the stack.
+
+   This mimics a pattern observed in the __pthread_clockjoin_ex function
+   in libpthread.so.0 (from glibc-2.33-0ubuntu5) where a canary value is
+   loaded and placed on the stack in order to detect stack smashing.
+
+   The skeleton for this file was generated using the following command:
+
+     gcc -x c -S -c -o - - <. */
+
+/* See riscv64-unwind-prologue-with-ld-foo.s for implementation.  */
+extern int foo (void);
+
+int
+bar ()
+{
+  return 0;
+}
+
+int
+main ()
+{
+  return foo ();
+}
+
diff --git a/gdb/testsuite/gdb.arch/riscv64-unwind-prologue-with-ld.exp b/gdb/testsuite/gdb.arch/riscv64-unwind-prologue-with-ld.exp
new file mode 100644
index 00000000000..d3f3058ddff
--- /dev/null
+++ b/gdb/testsuite/gdb.arch/riscv64-unwind-prologue-with-ld.exp
@@ -0,0 +1,45 @@
+# Copyright 2021 Free Software Foundation, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+
+# This tests GDB's ability to use the RISC-V prologue scanner in order to
+# unwind through a function that uses the 'ld' instruction in its prologue.
+
+if {![istarget "riscv64-*-*"]} {
+    verbose "Skipping ${gdb_test_file_name}."
+    return
+}
+
+standard_testfile riscv64-unwind-prologue-with-ld.c \
+                  riscv64-unwind-prologue-with-ld-foo.s
+if {[prepare_for_testing "failed to prepare" $testfile \
+                         "$srcfile $srcfile2" nodebug]} {
+    return -1
+}
+
+if ![runto_main] then {
+    fail "can't run to main"
+    return 0
+}
+
+gdb_breakpoint "bar"
+gdb_continue_to_breakpoint "bar"
+gdb_test "bt" \
+    [multi_line \
+         "#0\[ \t\]*$hex in bar \\\(\\\)" \
+         "#1\[ \t\]*$hex in foo \\\(\\\)" \
+         "#2\[ \t\]*$hex in main \\\(\\\)"] \
+    "Backtrace to the main frame"
+gdb_test "finish" "foo \\\(\\\)" "finish bar"
+gdb_test "finish" "main \\\(\\\)" "finish foo"
-- 
2.30.2