[PATCH v3 0/2] Fix a couple undefined behaviors flagged by UBSan

public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed

* [PATCH v3 0/2] Fix a couple undefined behaviors flagged by UBSan
@ 2022-04-07 20:06 Pedro Alves
  2022-04-07 20:06 ` [PATCH v3 1/2] Fix undefined behavior in the Fortran, Go and Pascal number parsers Pedro Alves
  2022-04-07 20:06 ` [PATCH v3 2/2] gdb: Avoid undefined shifts, fix Go shifts Pedro Alves
  0 siblings, 2 replies; 8+ messages in thread
From: Pedro Alves @ 2022-04-07 20:06 UTC (permalink / raw)
  To: gdb-patches

New in v3:
  - The scheme patch has been merged meanwhile.

  - There's a new patch to fix "print 0xffffffffffffffff" in a couple
    languages.  This also triggered undefined behavior.  Discovered
    while extending the bit shift testcase to run to on all languages.

  - The patch that fixes bit shifts in all languages was improved
    substantially:

    - The testcase now exercises all languages that have shift
      operators, instead of just C.

    - The testcase now uses with_test_prefix.

    - properly handles the case where the rhs of a shift is
      0xffffffffffffffff (max 64-bit), and only warning that being
      negative if it really is negative.

    - error out on negative shifts for Go, warn on other languages.

    - don't error or warn on too-large shift for Go, as that's
      actually defined in Go.

    - Right shifts of a too-large count now result in -1 if the result
      is signed, instead of 0.

Running the testsuite against GDBserver with a GDB built with
--enable-ubsan, I noticed a few problems.  This mini-series fixes a
couple.

Pedro Alves (2):
  Fix undefined behavior in the Fortran, Go and Pascal number parsers
  gdb: Avoid undefined shifts, fix Go shifts

 gdb/f-exp.y                                   |   8 +-
 gdb/go-exp.y                                  |  10 +-
 gdb/p-exp.y                                   |  12 +-
 .../gdb.base/all-architectures.exp.tcl        |  32 +-
 gdb/testsuite/gdb.base/bitshift.exp           | 368 ++++++++++++++++++
 gdb/testsuite/gdb.base/parse_number.exp       |  53 +++
 gdb/testsuite/lib/gdb.exp                     |  20 +
 gdb/valarith.c                                | 103 ++++-
 8 files changed, 560 insertions(+), 46 deletions(-)
 create mode 100644 gdb/testsuite/gdb.base/bitshift.exp
 create mode 100644 gdb/testsuite/gdb.base/parse_number.exp


base-commit: 591cc9fbbfd6d51131c0f1d4a92e7893edcc7a28
-- 
2.26.2


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v3 1/2] Fix undefined behavior in the Fortran, Go and Pascal number parsers
  2022-04-07 20:06 [PATCH v3 0/2] Fix a couple undefined behaviors flagged by UBSan Pedro Alves
@ 2022-04-07 20:06 ` Pedro Alves
  2022-04-08 14:44   ` Tom Tromey
  2022-04-07 20:06 ` [PATCH v3 2/2] gdb: Avoid undefined shifts, fix Go shifts Pedro Alves
  1 sibling, 1 reply; 8+ messages in thread
From: Pedro Alves @ 2022-04-07 20:06 UTC (permalink / raw)
  To: gdb-patches

This commit ports these two fixes to the C parser:

  commit ebf13736b42af47c9907b5157c8e80c78dbe00e1
  CommitDate: Thu Sep 4 21:46:28 2014 +0100

      parse_number("0") reads uninitialized memory

  commit 20562150d8a894bc91657c843ee88c508188e32e
  CommitDate: Wed Oct 3 15:19:06 2018 -0600

      Avoid undefined behavior in parse_number

... to the Fortran, Go, and Fortran number parsers, fixing the same
problems there.

Also add a new testcase that exercises printing 0xffffffffffffffff
(max 64-bit) in all languages, which crashes a GDB built with UBsan
without the fix.

I moved get_set_option_choices out of all-architectures.exp.tcl to
common code to be able to extract all the supported languages.  I did
a tweak to it to generalize it a bit -- you now have to pass down the
"set" part of the command as well.  This is so that the proc can be
used with "maintenance set" commands as well in future.

Change-Id: I8e8f2fdc1e8407f63d923c26fd55d98148b9e16a
---
 gdb/f-exp.y                                   |  8 +--
 gdb/go-exp.y                                  | 10 ++--
 gdb/p-exp.y                                   | 12 ++---
 .../gdb.base/all-architectures.exp.tcl        | 32 +++--------
 gdb/testsuite/gdb.base/parse_number.exp       | 53 +++++++++++++++++++
 gdb/testsuite/lib/gdb.exp                     | 20 +++++++
 6 files changed, 93 insertions(+), 42 deletions(-)
 create mode 100644 gdb/testsuite/gdb.base/parse_number.exp

diff --git a/gdb/f-exp.y b/gdb/f-exp.y
index 9cba30f6837..c5876e13e8f 100644
--- a/gdb/f-exp.y
+++ b/gdb/f-exp.y
@@ -834,8 +834,8 @@ static int
 parse_number (struct parser_state *par_state,
 	      const char *p, int len, int parsed_float, YYSTYPE *putithere)
 {
-  LONGEST n = 0;
-  LONGEST prevn = 0;
+  ULONGEST n = 0;
+  ULONGEST prevn = 0;
   int c;
   int base = input_radix;
   int unsigned_p = 0;
@@ -866,7 +866,7 @@ parse_number (struct parser_state *par_state,
     }
 
   /* Handle base-switching prefixes 0x, 0t, 0d, 0 */
-  if (p[0] == '0')
+  if (p[0] == '0' && len > 1)
     switch (p[1])
       {
       case 'x':
@@ -926,7 +926,7 @@ parse_number (struct parser_state *par_state,
       /* If range checking enabled, portably test for unsigned overflow.  */
       if (RANGE_CHECK && n != 0)
 	{
-	  if ((unsigned_p && (unsigned)prevn >= (unsigned)n))
+	  if ((unsigned_p && prevn >= n))
 	    range_error (_("Overflow on numeric constant."));
 	}
       prevn = n;
diff --git a/gdb/go-exp.y b/gdb/go-exp.y
index 456920456cd..07943b1c8c3 100644
--- a/gdb/go-exp.y
+++ b/gdb/go-exp.y
@@ -649,8 +649,8 @@ parse_number (struct parser_state *par_state,
 {
   /* FIXME: Shouldn't these be unsigned?  We don't deal with negative values
      here, and we do kind of silly things like cast to unsigned.  */
-  LONGEST n = 0;
-  LONGEST prevn = 0;
+  ULONGEST n = 0;
+  ULONGEST prevn = 0;
   ULONGEST un;
 
   int i = 0;
@@ -702,7 +702,7 @@ parse_number (struct parser_state *par_state,
     }
 
   /* Handle base-switching prefixes 0x, 0t, 0d, 0.  */
-  if (p[0] == '0')
+  if (p[0] == '0' && len > 1)
     switch (p[1])
       {
       case 'x':
@@ -790,7 +790,7 @@ parse_number (struct parser_state *par_state,
 	 on 0x123456789 when LONGEST is 32 bits.  */
       if (c != 'l' && c != 'u' && n != 0)
 	{
-	  if ((unsigned_p && (ULONGEST) prevn >= (ULONGEST) n))
+	  if ((unsigned_p && prevn >= n))
 	    error (_("Numeric constant too large."));
 	}
       prevn = n;
@@ -808,7 +808,7 @@ parse_number (struct parser_state *par_state,
      the case where it is we just always shift the value more than
      once, with fewer bits each time.  */
 
-  un = (ULONGEST)n >> 2;
+  un = n >> 2;
   if (long_p == 0
       && (un >> (gdbarch_int_bit (par_state->gdbarch ()) - 2)) == 0)
     {
diff --git a/gdb/p-exp.y b/gdb/p-exp.y
index 024a335453a..df09e392e8b 100644
--- a/gdb/p-exp.y
+++ b/gdb/p-exp.y
@@ -802,10 +802,8 @@ static int
 parse_number (struct parser_state *par_state,
 	      const char *p, int len, int parsed_float, YYSTYPE *putithere)
 {
-  /* FIXME: Shouldn't these be unsigned?  We don't deal with negative values
-     here, and we do kind of silly things like cast to unsigned.  */
-  LONGEST n = 0;
-  LONGEST prevn = 0;
+  ULONGEST n = 0;
+  ULONGEST prevn = 0;
   ULONGEST un;
 
   int i = 0;
@@ -854,7 +852,7 @@ parse_number (struct parser_state *par_state,
     }
 
   /* Handle base-switching prefixes 0x, 0t, 0d, 0.  */
-  if (p[0] == '0')
+  if (p[0] == '0' && len > 1)
     switch (p[1])
       {
       case 'x':
@@ -932,7 +930,7 @@ parse_number (struct parser_state *par_state,
 	 on 0x123456789 when LONGEST is 32 bits.  */
       if (c != 'l' && c != 'u' && n != 0)
 	{
-	  if ((unsigned_p && (ULONGEST) prevn >= (ULONGEST) n))
+	  if (unsigned_p && prevn >= n)
 	    error (_("Numeric constant too large."));
 	}
       prevn = n;
@@ -950,7 +948,7 @@ parse_number (struct parser_state *par_state,
      the case where it is we just always shift the value more than
      once, with fewer bits each time.  */
 
-  un = (ULONGEST)n >> 2;
+  un = n >> 2;
   if (long_p == 0
       && (un >> (gdbarch_int_bit (par_state->gdbarch ()) - 2)) == 0)
     {
diff --git a/gdb/testsuite/gdb.base/all-architectures.exp.tcl b/gdb/testsuite/gdb.base/all-architectures.exp.tcl
index 94b5efe80f3..d6679a1e9a1 100644
--- a/gdb/testsuite/gdb.base/all-architectures.exp.tcl
+++ b/gdb/testsuite/gdb.base/all-architectures.exp.tcl
@@ -91,45 +91,25 @@ proc gdb_test_internal {cmd pattern {message ""}} {
 gdb_test_internal "set max-completions unlimited" \
     "^set max-completions unlimited"
 
-# Return a list of all the accepted values of "set WHAT".
-
-proc get_set_option_choices {what} {
-    global gdb_prompt
-
-    set values {}
-
-    set test "complete set $what"
-    gdb_test_multiple "complete set $what " "$test" {
-	-re "set $what (\[^\r\n\]+)\r\n" {
-	    lappend values $expect_out(1,string)
-	    exp_continue
-	}
-	-re "$gdb_prompt " {
-	    internal_pass $test
-	}
-    }
-    return $values
-}
-
-set supported_archs [get_set_option_choices "architecture"]
+set supported_archs [get_set_option_choices "set architecture"]
 # There should be at least one more than "auto".
 gdb_assert {[llength $supported_archs] > 1} "at least one architecture"
 
-set supported_osabis [get_set_option_choices "osabi"]
+set supported_osabis [get_set_option_choices "set osabi"]
 # There should be at least one more than "auto" and "default".
 gdb_assert {[llength $supported_osabis] > 2} "at least one osabi"
 
 if {[lsearch $supported_archs "mips"] >= 0} {
-    set supported_mipsfpu [get_set_option_choices "mipsfpu"]
-    set supported_mips_abi [get_set_option_choices "mips abi"]
+    set supported_mipsfpu [get_set_option_choices "set mipsfpu"]
+    set supported_mips_abi [get_set_option_choices "set mips abi"]
 
     gdb_assert {[llength $supported_mipsfpu] != 0} "at least one mipsfpu"
     gdb_assert {[llength $supported_mips_abi] != 0} "at least one mips abi"
 }
 
 if {[lsearch $supported_archs "arm"] >= 0} {
-    set supported_arm_fpu [get_set_option_choices "arm fpu"]
-    set supported_arm_abi [get_set_option_choices "arm abi"]
+    set supported_arm_fpu [get_set_option_choices "set arm fpu"]
+    set supported_arm_abi [get_set_option_choices "set arm abi"]
 
     gdb_assert {[llength $supported_arm_fpu] != 0} "at least one arm fpu"
     gdb_assert {[llength $supported_arm_abi] != 0} "at least one arm abi"
diff --git a/gdb/testsuite/gdb.base/parse_number.exp b/gdb/testsuite/gdb.base/parse_number.exp
new file mode 100644
index 00000000000..2cf0594bf1b
--- /dev/null
+++ b/gdb/testsuite/gdb.base/parse_number.exp
@@ -0,0 +1,53 @@
+# Copyright 2022 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Test parsing numbers.  Several language parsers had the same bug
+# around parsing large 64-bit numbers, hitting undefined behavior, and
+# thus crashing a GDB built with UBsan.  This testcase goes over all
+# languages exercising printing the max 64-bit number, making sure
+# that GDB doesn't crash.
+
+proc test_parse_numbers {} {
+    clean_restart
+
+    set all_languages [get_set_option_choices "set language"]
+    foreach_with_prefix lang $all_languages {
+	gdb_test "set language $lang"
+
+	set val "0xffffffffffffffff"
+	if {$lang == "ada"} {
+	    gdb_test "p/x $val" "Integer literal out of range"
+	} elseif {$lang == "fortran"} {
+	    gdb_test "p/x $val" " = 0xffffffff"
+	    gdb_test "ptype $val" " = unsigned int"
+	} elseif {$lang == "modula-2"} {
+	    gdb_test "p/x 0FFFFFFFFFFFFFFFFH" "Overflow on numeric constant\\."
+	} elseif {$lang == "unknown"} {
+	    gdb_test "p/x $val" \
+		"expression parsing not implemented for language \"Unknown\""
+	} else {
+	    gdb_test "p/x $val" " = $val"
+	    if {$lang == "d"} {
+		gdb_test "ptype $val" " = ulong"
+	    } elseif {$lang == "rust"} {
+		gdb_test "ptype $val" " = i64"
+	    } else {
+		gdb_test "ptype $val" " = unsigned long long"
+	    }
+	}
+    }
+}
+
+test_parse_numbers
diff --git a/gdb/testsuite/lib/gdb.exp b/gdb/testsuite/lib/gdb.exp
index 0b242b64992..cd8e10f98b7 100644
--- a/gdb/testsuite/lib/gdb.exp
+++ b/gdb/testsuite/lib/gdb.exp
@@ -8471,5 +8471,25 @@ gdb_caching_proc has_hw_wp_support {
     return $has_hw_wp_support
 }
 
+# Return a list of all the accepted values of the set command SET_CMD.
+
+proc get_set_option_choices {set_cmd} {
+    global gdb_prompt
+
+    set values {}
+
+    set test "complete $set_cmd"
+    gdb_test_multiple "complete $set_cmd " "$test" {
+	-re "$set_cmd (\[^\r\n\]+)\r\n" {
+	    lappend values $expect_out(1,string)
+	    exp_continue
+	}
+	-re "$gdb_prompt " {
+	    pass $test
+	}
+    }
+    return $values
+}
+
 # Always load compatibility stuff.
 load_lib future.exp
-- 
2.26.2


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v3 2/2] gdb: Avoid undefined shifts, fix Go shifts
  2022-04-07 20:06 [PATCH v3 0/2] Fix a couple undefined behaviors flagged by UBSan Pedro Alves
  2022-04-07 20:06 ` [PATCH v3 1/2] Fix undefined behavior in the Fortran, Go and Pascal number parsers Pedro Alves
@ 2022-04-07 20:06 ` Pedro Alves
  2022-04-08 14:55   ` Tom Tromey
  1 sibling, 1 reply; 8+ messages in thread
From: Pedro Alves @ 2022-04-07 20:06 UTC (permalink / raw)
  To: gdb-patches

I noticed that a build of GDB with GCC + --enable-ubsan, testing
against GDBserver showed this GDB crash:

  (gdb) PASS: gdb.trace/trace-condition.exp: trace: 0x00abababcdcdcdcd << 46 == 0x7373400000000000: advance to trace begin
  tstart
  ../../src/gdb/valarith.c:1365:15: runtime error: left shift of 48320975398096333 by 46 places cannot be represented in type 'long int'
  ERROR: GDB process no longer exists
  GDB process exited with wait status 269549 exp9 0 1
  UNRESOLVED: gdb.trace/trace-condition.exp: trace: 0x00abababcdcdcdcd << 46 == 0x7373400000000000: start trace experiment

The problem is that, "0x00abababcdcdcdcd << 46" is an undefined signed
left shift, because the result is not representable in the type of the
lhs, which is signed.  This actually became defined in C++20, and if
you compile with "g++ -std=c++20 -Wall", you'll see that GCC no longer
warns about it, while it warns if you specify prior language versions.

While at it, there are a couple other situations that are undefined
(and are still undefined in C++20) and result in GDB dying: shifting
by a negative ammount, or by >= than the bit size of the promoted lhs.
For the latter, GDB shifts using (U)LONGEST internally, so you have to
shift by >= 64 bits to see it:

 $ gdb --batch -q -ex "p 1 << -1"
 ../../src/gdb/valarith.c:1365:15: runtime error: shift exponent -1 is negative
 $ # gdb exited

 $ gdb --batch -q -ex "p 1 << 64"
 ../../src/gdb/valarith.c:1365:15: runtime error: shift exponent 64 is too large for 64-bit type 'long int'
 $ # gdb exited

Also, right shifting a negative value is implementation-defined
(before C++20, after which it is defined).  For this, I chose to
change nothing in GDB other than adding tests, as I don't really know
whether we need to do anything.  AFAIK, most implementations do an
arithmetic right shift, and it may be we don't support any host or
target that behaves differently.  Plus, this becomes defined in C++20
exactly as arithmetic right shift.

Compilers don't error out on such shifts, at best they warn, so I
think GDB should just continue doing the shifts anyhow too.

Thus:

- Adjust scalar_binop to avoid the undefined paths, either by adding
  explicit result paths, or by casting the lhs of the left shift to
  unsigned, as appropriate.

  For the shifts by a too-large count, I made the result be what you'd
  get if you split the large count in a series of smaller shifts.
  Thus:

     Left shift, positive or negative lhs:

       V << 64
	 =>  V << 16 << 16 << 16 << 16
	   => 0

     Right shift, positive lhs:

       Vpos >> 64 ()
	 =>  Vpos >> 16 >> 16 >> 16 >> 16
	   => 0

     Right shift, negative lhs:

       Vneg >> 64
	 =>  Vneg >> 16 >> 16 >> 16 >> 16
	   => -1

  This is actually Go's semantics (the compiler really emits
  instructions to make it so that you get 0 or -1 if you have a
  too-large shift).  So for that language GDB does the shift and
  nothing else.  For other C-like languages where such a shift is
  undefined, GDB warns in addition to performing the shift.

  For shift by a negative count, for Go, this is a hard error.  For
  other languages, since their compilers only warn, I made GDB warn
  too.  The semantics I chose (we're free to pick them since this is
  undefined behavior) is as-if you had shifted by the count cast to
  unsigned, thus as if you had shifted by a too-large count, thus the
  same as the previous scenario illustrated above.

  Examples:

    (gdb) set language go
    (gdb) p 1 << 100
    $1 = 0
    (gdb) p -1 << 100
    $2 = 0
    (gdb) p 1 >> 100
    $3 = 0
    (gdb) p -1 >> 100
    $4 = -1
    (gdb) p -2 >> 100
    $5 = -1
    (gdb) p 1 << -1
    left shift count is negative

    (gdb) set language c
    (gdb) p -2 >> 100
    warning: right shift count >= width of type
    $6 = -1
    (gdb) p -2 << 100
    warning: left shift count >= width of type
    $7 = 0
    (gdb) p 1 << -1
    warning: left shift count is negative
    $8 = 0
    (gdb) p -1 >> -1
    warning: right shift count is negative
    $9 = -1

- The warnings' texts are the same as what GCC prints.

- Add comprehensive tests in a new gdb.base/bitshift.exp testcase, so
  that we exercise all these scenarios.

Change-Id: I8bcd5fa02de3114b7ababc03e65702d86ec8d45d
---
 gdb/testsuite/gdb.base/bitshift.exp | 368 ++++++++++++++++++++++++++++
 gdb/valarith.c                      | 103 +++++++-
 2 files changed, 467 insertions(+), 4 deletions(-)
 create mode 100644 gdb/testsuite/gdb.base/bitshift.exp

diff --git a/gdb/testsuite/gdb.base/bitshift.exp b/gdb/testsuite/gdb.base/bitshift.exp
new file mode 100644
index 00000000000..6308478d848
--- /dev/null
+++ b/gdb/testsuite/gdb.base/bitshift.exp
@@ -0,0 +1,368 @@
+# Copyright 2022 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Test left and right bit shifting, in all languages that have such
+# operator.
+
+clean_restart
+
+# Test a print command that prints out RESULT_RE.  If WARNING_OR_ERROR
+# is non-empty, it is expected that for languages other that Go, GDB
+# prints this warning before the print result.  For Go, this is an
+# expected error.  If WARNING_OR_ERROR is empty, it is expected that
+# GDB prints no text other than the print result.
+proc test_shift {lang cmd result_re {warning_or_error ""}} {
+    set cmd_re [string_to_regexp $cmd]
+
+    if {$lang == "go"} {
+	if {$warning_or_error != ""} {
+	    set error_re "[string_to_regexp $warning_or_error]"
+	    gdb_test_multiple $cmd "" {
+		-re -wrap "^$cmd_re\r\n$error_re" {
+		    pass $gdb_test_name
+		}
+	    }
+	} else {
+	    gdb_test_multiple $cmd "" {
+		-re -wrap "^$cmd_re\r\n\\$$::decimal$result_re" {
+		    pass $gdb_test_name
+		}
+	    }
+	}
+    } else {
+	if {$warning_or_error != ""} {
+	    set warning_re "warning: [string_to_regexp $warning_or_error]\r\n"
+	} else {
+	    set warning_re ""
+	}
+
+	gdb_test_multiple $cmd "" {
+	    -re -wrap "^$cmd_re\r\n$warning_re\\$$::decimal$result_re" {
+		pass $gdb_test_name
+	    }
+	}
+    }
+}
+
+# Some warnings/errors GDB outputs.
+set rs_negative_shift_count "right shift count is negative"
+set rs_too_large_shift_count "right shift count >= width of type"
+set ls_negative_shift_count "left shift count is negative"
+set ls_too_large_shift_count "left shift count >= width of type"
+
+# Test a left shift that results in a too-large shift count warning in
+# all languages except Go.
+proc test_lshift_tl {lang cmd result_re} {
+    if {$lang != "go"} {
+	test_shift $lang $cmd $result_re $::ls_too_large_shift_count
+    } else {
+	test_shift $lang $cmd $result_re
+    }
+}
+
+# Test a right shift that results in a too-large shift count warning
+# in all languages except Go.
+proc test_rshift_tl {lang cmd result_re} {
+    if {$lang != "go"} {
+	test_shift $lang $cmd $result_re $::rs_too_large_shift_count
+    } else {
+	test_shift $lang $cmd $result_re
+    }
+}
+
+# Return VAL, and integer value converted/cast to the right type for
+# LANG.  SIGNED indicates whether the type should be signed or
+# unsigned.  BITS indicates the bit width of the type.  E.g., signed=0
+# and bits=32 results in:
+#   Go            =>  "uint($VAL)"
+#   D             =>  "cast(uint) $VAL"
+#   Rust          =>  "$VAL as i32"
+#   C/C++/others  =>  "(unsigned int) $VAL"
+proc make_val_cast {lang signed bits val} {
+    if {$lang == "go"} {
+	if {$signed} {
+	    set sign_prefix ""
+	} else {
+	    set sign_prefix "u"
+	}
+	return "${sign_prefix}int${bits}($val)"
+    } elseif {$lang == "d"} {
+	if {$signed} {
+	    set sign_prefix ""
+	} else {
+	    set sign_prefix "u"
+	}
+	if {$bits == 8} {
+	    set type "byte"
+	} elseif {$bits == 16} {
+	    set type "short"
+	} elseif {$bits == 32} {
+	    set type "int"
+	} elseif {$bits == 64} {
+	    set type "long"
+	} else {
+	    error "$lang: unsupported bits"
+	}
+	return "cast(${sign_prefix}$type) $val"
+    } elseif {$lang == "rust"} {
+	if {$signed} {
+	    set sign_prefix "i"
+	} else {
+	    set sign_prefix "u"
+	}
+	return "$val as ${sign_prefix}$bits"
+    } else {
+	# C-like cast.
+	if {$signed} {
+	    set sign_prefix ""
+	} else {
+	    set sign_prefix "un"
+	}
+	if {$bits == 8} {
+	    set type "char"
+	} elseif {$bits == 16} {
+	    set type "short"
+	} elseif {$bits == 32} {
+	    set type "int"
+	} elseif {$bits == 64} {
+	    if {$lang == "opencl"} {
+		set type "long"
+	    } else {
+		set type "long long"
+	    }
+	} else {
+	    error "$lang: unsupported bits"
+	}
+	return "(${sign_prefix}signed $type) $val"
+    }
+}
+
+# Generate make_int8 ... make_uint64 convenience procs, wrappers
+# around make_val_cast.
+foreach signed {0 1} {
+    if {$signed} {
+	set sign_prefix ""
+    } else {
+	set sign_prefix "u"
+    }
+    foreach bits {8 16 32 64} {
+	proc make_${sign_prefix}int${bits} {lang val} \
+	    "make_val_cast \$lang $signed $bits \$val"
+    }
+}
+
+# Test bitshifting, particularly with negative shift counts and
+# too-large-for-type shift counts.  Exercises all C-like-ish
+# languages.
+proc test_shifts {} {
+    global ls_negative_shift_count rs_negative_shift_count
+
+    # Extract the set of all supported languages.  We try all except
+    # languages we know wouldn't work.  We do this instead of
+    # hardcoding the set of languages that we know work, so that if
+    # GDB gains a new language, it is automatically exercised.
+    set supported_langs [get_set_option_choices "set language"]
+
+    foreach_with_prefix lang $supported_langs {
+	set skip_langs {
+	    "unknown" "ada" "modula-2" "pascal" "fortran"
+	}
+	if {[lsearch -exact $skip_langs $lang] >= 0} {
+	    continue
+	}
+
+	gdb_test "set language $lang"
+
+	# Make sure a signed left shift that overflows, i.e., whose
+	# result isn't representable in the signed type of the lhs,
+	# which is actually undefined, doesn't crash GDB when is it
+	# built with UBSan.
+
+	with_test_prefix "lsh overflow" {
+	    test_shift $lang "print /x 0x0fffffffffffffff << 8" \
+		" = 0xffffffffffffff00"
+	    test_shift $lang "print /x 0x0fffffff << 8" \
+		" = 0xffffff00"
+
+	    # Make sure the result is still signed when the lhs was
+	    # signed.
+	    test_shift $lang "print 0x0fffffffffffffff << 8" " = -256"
+	    test_shift $lang "print 0x0fffffff << 8" " = -256"
+	}
+
+	# 8-bit and 16-bit are promoted to int.
+	with_test_prefix "8-bit, promoted" {
+	    foreach lhs \
+		[list \
+		     [make_int8 $lang 0x0f] \
+		     [make_uint8 $lang 0x0f]] \
+	    {
+		test_shift $lang "print /x $lhs << 8" " = 0xf00"
+		test_shift $lang "print $lhs << 8" " = 3840"
+	    }
+	}
+	with_test_prefix "16-bit, promoted" {
+	    foreach lhs \
+		[list \
+		     [make_int16 $lang 0x0fff] \
+		     [make_uint16 $lang 0x0fff]] \
+	    {
+		test_shift $lang "print /x $lhs << 8" " = 0xfff00"
+		test_shift $lang "print $lhs << 8" " = 1048320"
+	    }
+	}
+
+	# Similarly, test shifting with both negative and too-large
+	# rhs.  Both cases are undefined, but GDB lets them go through
+	# anyhow, similarly to how compilers don't error out.  Try
+	# both signed and unsigned lhs.
+
+	# 8-bit lhs, signed and unsigned.  These get promoted to
+	# 32-bit int.
+	with_test_prefix "8-bit, invalid" {
+	    foreach lhs \
+		[list \
+		     [make_int8 $lang 0x7f] \
+		     [make_uint8 $lang 0xff]] \
+	    {
+		test_shift $lang "print $lhs << -1" " = 0" \
+		    $ls_negative_shift_count
+		test_shift $lang "print $lhs >> -1" " = 0" \
+		    $rs_negative_shift_count
+
+		test_shift $lang "print/x $lhs << 8" " = 0x(7|f)f00"
+		test_shift $lang "print/x $lhs >> 8" " = 0x0"
+
+		test_lshift_tl $lang "print $lhs << 32" " = 0"
+		test_rshift_tl $lang "print $lhs >> 32" " = 0"
+		test_lshift_tl $lang "print $lhs << 33" " = 0"
+		test_rshift_tl $lang "print $lhs >> 33" " = 0"
+	    }
+	}
+
+	# 16-bit lhs, signed and unsigned.  These get promoted to 32-bit int.
+	with_test_prefix "16-bit, invalid" {
+	    foreach {lhs res} \
+		[list \
+		     [make_int16 $lang 0x7fff] 0x7fff \
+		     [make_uint16 $lang 0xffff] 0xffff] \
+	    {
+		test_shift $lang "print $lhs << -1" " = 0" \
+		    $ls_negative_shift_count
+		test_shift $lang "print $lhs >> -1" " = 0" \
+		    $rs_negative_shift_count
+
+		# Confirm shifting by 0 doesn't warn.
+		test_shift $lang "print/x $lhs << 0" " = $res"
+		test_shift $lang "print/x $lhs >> 0" " = $res"
+
+		# These don't overflow due to promotion.
+		test_shift $lang "print/x $lhs << 16" " = 0x(7|f)fff0000"
+		test_shift $lang "print/x $lhs >> 16" " = 0x0"
+
+		test_lshift_tl $lang "print $lhs << 32" " = 0"
+		test_rshift_tl $lang "print $lhs >> 32" " = 0"
+		test_lshift_tl $lang "print $lhs << 33" " = 0"
+		test_rshift_tl $lang "print $lhs >> 33" " = 0"
+	    }
+	}
+
+	# 32-bit lhs, signed and unsigned.
+	with_test_prefix "32-bit, invalid" {
+	    foreach {lhs res} \
+		[list \
+		     [make_int32 $lang 0x7fffffff] 0x7fffffff \
+		     [make_uint32 $lang 0xffffffff] 0xffffffff] \
+	    {
+		test_shift $lang "print $lhs << -1" " = 0" \
+		    $ls_negative_shift_count
+		test_shift $lang "print $lhs >> -1" " = 0" \
+		    $rs_negative_shift_count
+
+		# Confirm shifting by 0 doesn't warn.
+		test_shift $lang "print/x $lhs << 0" " = $res"
+		test_shift $lang "print/x $lhs >> 0" " = $res"
+
+		test_lshift_tl $lang "print $lhs << 32" " = 0"
+		test_rshift_tl $lang "print $lhs >> 32" " = 0"
+
+		test_lshift_tl $lang "print $lhs << 33" " = 0"
+		test_rshift_tl $lang "print $lhs >> 33" " = 0"
+	    }
+	}
+
+	# 64-bit lhs, signed and unsigned.
+	with_test_prefix "64-bit, invalid" {
+	    foreach {lhs res} \
+		[list \
+		     [make_int64 $lang 0x7fffffffffffffff] \
+		     0x7fffffffffffffff \
+		     \
+		     [make_uint64 $lang 0xffffffffffffffff] \
+		     0xffffffffffffffff] \
+	    {
+		test_shift $lang "print $lhs << -1" " = 0" \
+		    $ls_negative_shift_count
+		test_shift $lang "print $lhs >> -1" " = 0" \
+		    $rs_negative_shift_count
+
+		# Confirm shifting by 0 doesn't warn.
+		test_shift $lang "print/x $lhs << 0" " = $res"
+		test_shift $lang "print/x $lhs >> 0" " = $res"
+
+		test_lshift_tl $lang "print $lhs << 64" " = 0"
+		test_rshift_tl $lang "print $lhs >> 64" " = 0"
+
+		test_lshift_tl $lang "print $lhs << 65" " = 0"
+		test_rshift_tl $lang "print $lhs >> 65" " = 0"
+	    }
+	}
+
+	# Right shift a negative number by a negative amount.
+	with_test_prefix "neg lhs/rhs" {
+	    test_shift $lang "print -1 >> -1" " = -1" $rs_negative_shift_count
+	    test_shift $lang "print -4 >> -2" " = -1" $rs_negative_shift_count
+	}
+
+	# Check right shifting a negative value.  For C++, this is
+	# implementation-defined, up until C++20.  In most
+	# implementations, this performs an arithmetic right shift, so
+	# that the result remains negative.  Currently, GDB does
+	# whatever the host's compiler does.  If that turns out wrong
+	# for some host/target, then GDB should be taught to ask the
+	# target gdbarch what to do.
+	with_test_prefix "rsh neg lhs" {
+	    test_shift $lang "print -1 >> 0" " = -1"
+	    test_shift $lang "print -1 >> 1" " = -1"
+	    test_shift $lang "print -8 >> 1" " = -4"
+	    test_shift $lang "print [make_int64 $lang -8] >> 1" " = -4"
+	}
+
+	# Make sure a 64-bit unsigned with high bit set isn't confused
+	# for a negative shift count in the warning messages.
+	with_test_prefix "max-uint64" {
+	    test_lshift_tl $lang \
+		"print 1 << [make_uint64 $lang 0xffffffffffffffff]" " = 0"
+	    test_rshift_tl $lang \
+		"print 1 >> [make_uint64 $lang 0xffffffffffffffff]" " = 0"
+	    test_lshift_tl $lang \
+		"print -1 << [make_uint64 $lang 0xffffffffffffffff]" " = 0"
+	    test_rshift_tl $lang \
+		"print -1 >> [make_uint64 $lang 0xffffffffffffffff]" " = -1"
+	}
+    }
+}
+
+test_shifts
diff --git a/gdb/valarith.c b/gdb/valarith.c
index 36d30f161f6..6210267826e 100644
--- a/gdb/valarith.c
+++ b/gdb/valarith.c
@@ -1070,6 +1070,66 @@ complex_binop (struct value *arg1, struct value *arg2, enum exp_opcode op)
   return value_literal_complex (result_real, result_imag, result_type);
 }
 
+/* Return the type's length in bits.  */
+
+static int
+type_length_bits (type *type)
+{
+  int unit_size = gdbarch_addressable_memory_unit_size (type->arch ());
+  return unit_size * 8 * TYPE_LENGTH (type);
+}
+
+/* Check whether the RHS value of a shift is valid in C/C++ semantics.
+   SHIFT_COUNT is the shift amount, SHIFT_COUNT_TYPE is the type of
+   the shift count value, used to determine whether the type is
+   signed, and RESULT_TYPE is the result type.  This is used to avoid
+   both negative and too-large shift amounts, which are undefined, and
+   would crash a GDB built with UBSan.  Depending on the current
+   language, if the shift is not valid, this either warns and returns
+   false, or errors out.  Returns true if valid.  */
+
+static bool
+check_valid_shift_count (int op, type *result_type,
+			 type *shift_count_type, ULONGEST shift_count)
+{
+  if (!shift_count_type->is_unsigned () && (LONGEST) shift_count < 0)
+    {
+      auto error_or_warning = [] (const char *msg)
+      {
+	/* Shifts by a negative amount are always an error in Go.  Other
+	   languages are more permissive and their compilers just warn or
+	   have modes to disable the errors.  */
+	if (current_language->la_language == language_go)
+	  error (("%s"), msg);
+	else
+	  warning (("%s"), msg);
+      };
+
+      if (op == BINOP_RSH)
+	error_or_warning (_("right shift count is negative"));
+      else
+	error_or_warning (_("left shift count is negative"));
+      return false;
+    }
+
+  if (shift_count >= type_length_bits (result_type))
+    {
+      /* In Go, shifting by large amounts is defined.  Be silent and
+	 still return false, as the caller's error path does the right
+	 thing for Go.  */
+      if (current_language->la_language != language_go)
+	{
+	  if (op == BINOP_RSH)
+	    warning (_("right shift count >= width of type"));
+	  else
+	    warning (_("left shift count >= width of type"));
+	}
+      return false;
+    }
+
+  return true;
+}
+
 /* Perform a binary operation on two operands which have reasonable
    representations as integers or floats.  This includes booleans,
    characters, integers, or floats.
@@ -1233,11 +1293,17 @@ scalar_binop (struct value *arg1, struct value *arg2, enum exp_opcode op)
 	      break;
 
 	    case BINOP_LSH:
-	      v = v1 << v2;
+	      if (!check_valid_shift_count (op, result_type, type2, v2))
+		v = 0;
+	      else
+		v = v1 << v2;
 	      break;
 
 	    case BINOP_RSH:
-	      v = v1 >> v2;
+	      if (!check_valid_shift_count (op, result_type, type2, v2))
+		v = 0;
+	      else
+		v = v1 >> v2;
 	      break;
 
 	    case BINOP_BITWISE_AND:
@@ -1362,11 +1428,40 @@ scalar_binop (struct value *arg1, struct value *arg2, enum exp_opcode op)
 	      break;
 
 	    case BINOP_LSH:
-	      v = v1 << v2;
+	      if (!check_valid_shift_count (op, result_type, type2, v2))
+		v = 0;
+	      else
+		{
+		  /* Cast to unsigned to avoid undefined behavior on
+		     signed shift overflow (unless C++20 or later),
+		     which would crash GDB when built with UBSan.
+		     Note we don't warn on left signed shift overflow,
+		     because starting with C++20, that is actually
+		     defined behavior.  Also, note GDB assumes 2's
+		     complement throughout.  */
+		  v = (ULONGEST) v1 << v2;
+		}
 	      break;
 
 	    case BINOP_RSH:
-	      v = v1 >> v2;
+	      if (!check_valid_shift_count (op, result_type, type2, v2))
+		{
+		  /* Pretend the too-large shift was decomposed in a
+		     number of smaller shifts.  An arithmetic signed
+		     right shift of a negative number always yields -1
+		     with such semantics.  This is the right thing to
+		     do for Go, and we might as well do it for
+		     languages where it is undefined.  Also, pretend a
+		     shift by a negative number was a shift by the
+		     negative number cast to unsigned, which is the
+		     same as shifting by a too-large number.  */
+		  if (v1 < 0)
+		    v = -1;
+		  else
+		    v = 0;
+		}
+	      else
+		v = v1 >> v2;
 	      break;
 
 	    case BINOP_BITWISE_AND:
-- 
2.26.2


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 1/2] Fix undefined behavior in the Fortran, Go and Pascal number parsers
  2022-04-07 20:06 ` [PATCH v3 1/2] Fix undefined behavior in the Fortran, Go and Pascal number parsers Pedro Alves
@ 2022-04-08 14:44   ` Tom Tromey
  2022-04-08 14:58     ` Pedro Alves
  0 siblings, 1 reply; 8+ messages in thread
From: Tom Tromey @ 2022-04-08 14:44 UTC (permalink / raw)
  To: Pedro Alves; +Cc: gdb-patches

>>>>> "Pedro" == Pedro Alves <pedro@palves.net> writes:

Pedro> This commit ports these two fixes to the C parser:

Pedro> ... to the Fortran, Go, and Fortran number parsers, fixing the same
Pedro> problems there.

If the code is all the same (I didn't check), maybe it should just be
merged into a single spot.  It's fine by me if you don't want to, though.

Pedro> diff --git a/gdb/go-exp.y b/gdb/go-exp.y
Pedro> index 456920456cd..07943b1c8c3 100644
Pedro> --- a/gdb/go-exp.y
Pedro> +++ b/gdb/go-exp.y
Pedro> @@ -649,8 +649,8 @@ parse_number (struct parser_state *par_state,
Pedro>  {
Pedro>    /* FIXME: Shouldn't these be unsigned?  We don't deal with negative values
Pedro>       here, and we do kind of silly things like cast to unsigned.  */
Pedro> -  LONGEST n = 0;
Pedro> -  LONGEST prevn = 0;
Pedro> +  ULONGEST n = 0;
Pedro> +  ULONGEST prevn = 0;

The comment can be removed now.  I think you just forgot this spot since
you did remove it in p-exp.y.

Tom

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 2/2] gdb: Avoid undefined shifts, fix Go shifts
  2022-04-07 20:06 ` [PATCH v3 2/2] gdb: Avoid undefined shifts, fix Go shifts Pedro Alves
@ 2022-04-08 14:55   ` Tom Tromey
  2022-04-08 15:20     ` Pedro Alves
  0 siblings, 1 reply; 8+ messages in thread
From: Tom Tromey @ 2022-04-08 14:55 UTC (permalink / raw)
  To: Pedro Alves; +Cc: gdb-patches

>>>>> "Pedro" == Pedro Alves <pedro@palves.net> writes:

Pedro> Thus:

Pedro> - Adjust scalar_binop to avoid the undefined paths, either by adding
Pedro>   explicit result paths, or by casting the lhs of the left shift to
Pedro>   unsigned, as appropriate.
[...]

This looks good to me.  Thank you.

Tom

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 1/2] Fix undefined behavior in the Fortran, Go and Pascal number parsers
  2022-04-08 14:44   ` Tom Tromey
@ 2022-04-08 14:58     ` Pedro Alves
  2022-04-08 15:22       ` Tom Tromey
  0 siblings, 1 reply; 8+ messages in thread
From: Pedro Alves @ 2022-04-08 14:58 UTC (permalink / raw)
  To: Tom Tromey; +Cc: gdb-patches

On 2022-04-08 15:44, Tom Tromey wrote:
>>>>>> "Pedro" == Pedro Alves <pedro@palves.net> writes:
> 
> Pedro> This commit ports these two fixes to the C parser:
> 
> Pedro> ... to the Fortran, Go, and Fortran number parsers, fixing the same
> Pedro> problems there.
> 
> If the code is all the same (I didn't check), maybe it should just be
> merged into a single spot.  It's fine by me if you don't want to, though.

A lot of it looks the same on the surface, but there are important differences.
For example, the builtin types are different, like:

          putithere->typed_val_float.type
            = builtin_go_types->builtin_float32;

each parser's YYSTYPE is going to be a different type, and then different languages
parse different suffixes or features.  E.g., not all support complex numbers.

go-exp.y has this comment:

/* FIXME(dje): IWBN to use c-exp.y's parse_number if we could.
   That will require moving the guts into a function that we both call
   as our YYSTYPE is different than c-exp.y's  */

C++ templates nowadays may help here for the YYSTYPE type, not sure.
I had looked at bit into this, but quickly decided it would be a lot more effort
than I'd wanted to spend here.

> 
> Pedro> diff --git a/gdb/go-exp.y b/gdb/go-exp.y
> Pedro> index 456920456cd..07943b1c8c3 100644
> Pedro> --- a/gdb/go-exp.y
> Pedro> +++ b/gdb/go-exp.y
> Pedro> @@ -649,8 +649,8 @@ parse_number (struct parser_state *par_state,
> Pedro>  {
> Pedro>    /* FIXME: Shouldn't these be unsigned?  We don't deal with negative values
> Pedro>       here, and we do kind of silly things like cast to unsigned.  */
> Pedro> -  LONGEST n = 0;
> Pedro> -  LONGEST prevn = 0;
> Pedro> +  ULONGEST n = 0;
> Pedro> +  ULONGEST prevn = 0;
> 
> The comment can be removed now.  I think you just forgot this spot since
> you did remove it in p-exp.y.

Thanks.  I'll fix this and merge this one.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 2/2] gdb: Avoid undefined shifts, fix Go shifts
  2022-04-08 14:55   ` Tom Tromey
@ 2022-04-08 15:20     ` Pedro Alves
  0 siblings, 0 replies; 8+ messages in thread
From: Pedro Alves @ 2022-04-08 15:20 UTC (permalink / raw)
  To: Tom Tromey; +Cc: gdb-patches

On 2022-04-08 15:55, Tom Tromey wrote:
>>>>>> "Pedro" == Pedro Alves <pedro@palves.net> writes:
> 
> Pedro> Thus:
> 
> Pedro> - Adjust scalar_binop to avoid the undefined paths, either by adding
> Pedro>   explicit result paths, or by casting the lhs of the left shift to
> Pedro>   unsigned, as appropriate.
> [...]
> 
> This looks good to me.  Thank you.

Thank you.  Now merged.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 1/2] Fix undefined behavior in the Fortran, Go and Pascal number parsers
  2022-04-08 14:58     ` Pedro Alves
@ 2022-04-08 15:22       ` Tom Tromey
  0 siblings, 0 replies; 8+ messages in thread
From: Tom Tromey @ 2022-04-08 15:22 UTC (permalink / raw)
  To: Pedro Alves; +Cc: Tom Tromey, gdb-patches

Pedro> each parser's YYSTYPE is going to be a different type, and then different languages
Pedro> parse different suffixes or features.  E.g., not all support complex numbers.

Aha, thanks.

Pedro> /* FIXME(dje): IWBN to use c-exp.y's parse_number if we could.
Pedro>    That will require moving the guts into a function that we both call
Pedro>    as our YYSTYPE is different than c-exp.y's  */

Pedro> C++ templates nowadays may help here for the YYSTYPE type, not sure.
Pedro> I had looked at bit into this, but quickly decided it would be a lot more effort
Pedro> than I'd wanted to spend here.

I tend to think a number-parsing function shouldn't need to know
anything about YYSTYPE.  However I don't think it's important to bother
with these.  A lot of these parsers are on a best-effort basis already.

Tom

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2022-04-08 15:22 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-07 20:06 [PATCH v3 0/2] Fix a couple undefined behaviors flagged by UBSan Pedro Alves
2022-04-07 20:06 ` [PATCH v3 1/2] Fix undefined behavior in the Fortran, Go and Pascal number parsers Pedro Alves
2022-04-08 14:44   ` Tom Tromey
2022-04-08 14:58     ` Pedro Alves
2022-04-08 15:22       ` Tom Tromey
2022-04-07 20:06 ` [PATCH v3 2/2] gdb: Avoid undefined shifts, fix Go shifts Pedro Alves
2022-04-08 14:55   ` Tom Tromey
2022-04-08 15:20     ` Pedro Alves

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).