[PATCH 1/8] [gdb/testsuite] Test more values in gdb.base/parse

public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed

* [PATCH 1/8] [gdb/testsuite] Test more values in gdb.base/parse_numbers.exp
@ 2022-05-23 11:05 Tom de Vries
  2022-05-23 11:05 ` [PATCH 2/8] [gdb/c] Fix type of 2147483648 and literal truncation Tom de Vries
                   ` (7 more replies)
  0 siblings, 8 replies; 11+ messages in thread
From: Tom de Vries @ 2022-05-23 11:05 UTC (permalink / raw)
  To: gdb-patches

Currently we only test value 0xffffffffffffffff in test-case
gdb.base/parse_numbers.exp.

Test more interesting values, both in decimal and hex format, as well as
negative decimals for language modula-2.

This results in an increase in total tests from 15572 to 847448 (55 times
more tests).

Balance out the increase in runtime by reducing the number of architectures
tested: only test one architecture per sizeof longlong/long/int/short
combination, while keeping the possibility intact to run with all
architectures (through setting a variable in the test-case)

Results in slight reduction of total tests: 15572 -> 13853.

Document interesting cases in the expected results:
- wrapping from unsigned to signed
- truncation
- PR16377: using unsigned types to represent decimal constants in C

Running the test-case with a gdb build with -fsanitize=undefined, we trigger
two UB errors in the modula-2 parser, filed as PR29163.

Tested on x86_64-linux with --enable-targets=all.
---
 gdb/testsuite/gdb.base/parse_number.exp | 302 ++++++++++++++++++++----
 1 file changed, 260 insertions(+), 42 deletions(-)

diff --git a/gdb/testsuite/gdb.base/parse_number.exp b/gdb/testsuite/gdb.base/parse_number.exp
index 197e27a8e9e..7c259e0a8a0 100644
--- a/gdb/testsuite/gdb.base/parse_number.exp
+++ b/gdb/testsuite/gdb.base/parse_number.exp
@@ -16,13 +16,179 @@
 # Format hex value VAL for language LANG.
 
 proc hex_for_lang { lang val } {
-    set val [regsub ^0x $val ""]
+    set neg_p [regexp ^- $val]
+    set val [regsub ^-?0x $val ""]
     if { $lang == "modula-2" } {
        set val 0[string toupper $val]H
     } else {
        set val 0x$val
     }
-    return $val
+    if { $neg_p } {
+	return -$val
+    } else {
+	return $val
+    }
+}
+
+# Determine whether N fits in type with TYPE_BITS and TYPE_SIGNEDNESS.
+
+proc fits_in_type { n type_bits type_signedness } {
+    if { $type_signedness == "s" } {
+	set type_signed_p 1
+    } elseif { $type_signedness == "u" } {
+	set type_signed_p 0
+    } else {
+	error "unreachable"
+    }
+
+    if { $n < 0 && !$type_signed_p } {
+	# Can't fit a negative number in an unsigned type.
+	return 0
+    }
+
+    if { $n < 0} {
+	set n_sign -1
+	set n [expr -$n]
+    } else {
+	set n_sign 1
+    }
+
+    set smax [expr 1 << ($type_bits - 1)];
+    if  { $n_sign == -1 } {
+	# Negative number, signed type.
+	return [expr ($n <= $smax)]
+    } elseif { $n_sign == 1 && $type_signed_p } {
+	# Positive number, signed type.
+	return [expr ($n < $smax)]
+    } elseif { $n_sign == 1 && !$type_signed_p } {
+	# Positive number, unsigned type.
+	return [expr ($n >> $type_bits) == 0]
+    } else {
+	error "unreachable"
+    }
+}
+
+# Parse number N for LANG, and return a list of expected type and value.
+
+proc parse_number { lang n } {
+    global re_overflow
+
+    set hex_p [regexp ^-?0x $n]
+
+    global hex decimal
+    if { $hex_p } {
+	set any $hex
+    } else {
+	set any $decimal
+    }
+
+    global sizeof_long_long sizeof_long sizeof_int
+    set long_long_bits [expr $sizeof_long_long * 8]
+    set long_bits [expr $sizeof_long * 8]
+    set int_bits [expr $sizeof_int * 8]
+
+    if { $lang == "rust" } {
+	if { [fits_in_type $n 32 s] } {
+	    return [list "i32" $n]
+	} elseif { [fits_in_type $n 64 s] } {
+	    return [list "i64" $n]
+	} elseif { [fits_in_type $n 64 u] } {
+	    # Note: Interprets MAX_U64 as -1.
+	    return [list "i64" $n]
+	} else {
+	    # Overflow.
+	    # Some truncated value, should be re_overflow.
+	    return [list i64 $any]
+	}
+    } elseif { $lang == "d" } {
+	if { [fits_in_type $n 32 s] } {
+	    return [list int $n]
+	} elseif { [fits_in_type $n 32 u] } {
+	    if { $hex_p } {
+		return [list uint $n]
+	    } else {
+		return [list long $n]
+	    }
+	} elseif { [fits_in_type $n 64 s] } {
+	    return [list long $n]
+	} elseif { [fits_in_type $n 64 u] } {
+	    return [list ulong $n]
+	} else {
+	    # Overflow.
+	    return [list $re_overflow $re_overflow]
+	}
+    } elseif { $lang == "ada" } {
+	if { [fits_in_type $n $int_bits s] } {
+	    return [list "<$sizeof_int-byte integer>" $n]
+	} elseif { [fits_in_type $n $long_bits s] } {
+	    return [list "<$sizeof_long-byte integer>" $n]
+	} elseif { [fits_in_type $n $long_bits u] } {
+	    return [list "<$sizeof_long-byte integer>" $n]
+	} elseif { [fits_in_type $n $long_long_bits s] } {
+	    return [list "<$sizeof_long_long-byte integer>" $n]
+	} elseif { [fits_in_type $n $long_long_bits u] } {
+	    # Note: Interprets ULLONG_MAX as -1.
+	    return [list "<$sizeof_long_long-byte integer>" $n]
+	} else {
+	    # Overflow.
+	    # Some truncated value or re_overflow, should be re_overflow.
+	    return [list "($re_overflow|<$decimal-byte integer>)" \
+			($re_overflow|$any)]
+	}
+    } elseif { $lang == "modula-2" } {
+	if { [string equal $n -0] } {
+	    # Note: 0 is CARDINAL, but -0 is an INTEGER.
+	    return [list "INTEGER" 0]
+	}
+	if { $n < 0 && [fits_in_type $n $int_bits s] } {
+	    return [list "INTEGER" $n]
+	} elseif { [fits_in_type $n $int_bits u] } {
+	    return [list "CARDINAL" $n]
+	} else {
+	    # Overflow.
+	    # Some truncated value or re_overflow, should be re_overflow.
+	    return [list ($re_overflow|CARDINAL|INTEGER) ($re_overflow|$any)]
+	}
+    } elseif { $lang == "fortran" } {
+	if { [fits_in_type $n $int_bits s] } {
+	    return [list int $n]
+	} elseif { [fits_in_type $n $int_bits u] } {
+	    return [list "unsigned int" $n]
+	} elseif { [fits_in_type $n $long_bits s] } {
+	    return [list long $n]
+	} elseif { [fits_in_type $n $long_bits u] } {
+	    return [list "unsigned long" $n]
+	} else {
+	    # Overflow.
+	    # Some truncated value or re_overflow, should be re_overflow.
+	    return [list "((unsigned )?(int|long)|$re_overflow)" \
+			($any|$re_overflow)]
+	}
+    } else {
+	# This is wrong for c-like languages.  For the decimal case, we
+	# shouldn't use unsigned.
+	# See PR 16377.
+	if { [fits_in_type $n $int_bits s] } {
+	    return [list int $n]
+	} elseif { [fits_in_type $n $int_bits u] } {
+	    return [list "unsigned int" $n]
+	} elseif { [fits_in_type $n $long_bits s] } {
+	    return [list long $n]
+	} elseif { [fits_in_type $n $long_bits u] } {
+	    return [list "unsigned long" $n]
+	} elseif { [fits_in_type $n $long_long_bits s] } {
+	    return [list "long long" $n]
+	} elseif { [fits_in_type $n $long_long_bits u] } {
+	    return [list "unsigned long long" $n]
+	} else {
+	    # Overflow.
+	    # Some truncated value or re_overflow, should be re_overflow.
+	    return [list "((unsigned )?(int|long)|$re_overflow)" \
+			($any|$re_overflow)]
+	}
+    }
+
+    error "unreachable"
 }
 
 # Test parsing numbers.  Several language parsers had the same bug
@@ -32,6 +198,10 @@ proc hex_for_lang { lang val } {
 # that GDB doesn't crash.  ARCH is the architecture to test with.
 
 proc test_parse_numbers {arch} {
+    global full_arch_testing
+    global tested_archs
+    global verbose
+
     set arch_re [string_to_regexp $arch]
     gdb_test "set architecture $arch" "The target architecture is set to \"$arch_re\"."
 
@@ -41,24 +211,21 @@ proc test_parse_numbers {arch} {
     # Figure out type sizes before matching patterns in the upcoming
     # tests.
 
+    global sizeof_long_long sizeof_long sizeof_int sizeof_short
     set sizeof_long_long [get_sizeof "long long" -1]
     set sizeof_long [get_sizeof "long" -1]
     set sizeof_int [get_sizeof "int" -1]
+    set sizeof_short [get_sizeof "short" -1]
 
-    if {$sizeof_long_long == 8 && $sizeof_long == 8} {
-	set 8B_type "unsigned long"
-	set fortran_type "unsigned long"
-	set fortran_value "0xffffffffffffffff"
-    } elseif {$sizeof_long_long == 8 && $sizeof_long == 4 && $sizeof_int == 4} {
-	set 8B_type "unsigned long long"
-	set fortran_type "unsigned int"
-	set fortran_value "0xffffffff"
-    } elseif {$sizeof_long == 4 && $sizeof_int == 2} {
-	set 8B_type "unsigned long long"
-	set fortran_type "unsigned long"
-	set fortran_value "0xffffffff"
-    } else {
-	error "missing case for long long = $sizeof_long_long, long = $sizeof_long, int = $sizeof_int"
+    if { ! $full_arch_testing } {
+	set arch_id \
+	    [list $sizeof_long_long $sizeof_long $sizeof_long $sizeof_int \
+		 $sizeof_short]
+	if { [lsearch $tested_archs $arch_id] == -1 } {
+	    lappend tested_archs $arch_id
+	} else {
+	    return
+	}
     }
 
     foreach_with_prefix lang $::all_languages {
@@ -72,34 +239,78 @@ proc test_parse_numbers {arch} {
 
 	gdb_test_no_output "set language $lang"
 
-	set val "0xffffffffffffffff"
-	set val [hex_for_lang $lang $val]
-	if {$lang == "fortran"} {
-	    gdb_test "p/x $val" " = $fortran_value"
-	    gdb_test "ptype $val" " = $fortran_type"
-	} elseif {$lang == "modula-2"} {
-	    gdb_test "p/x $val" "Overflow on numeric constant\\."
+	global re_overflow
+	if { $lang == "modula-2" || $lang == "fortran" } {
+	    set re_overflow "Overflow on numeric constant\\."
+	} elseif { $lang == "ada" } {
+	    set re_overflow "Integer literal out of range"
 	} else {
-	    # D and Rust define their own built-in 64-bit types, and
-	    # are thus always able to parse/print 64-bit values.
-	    if {$sizeof_long_long == 4 && $lang != "d" && $lang != "rust"} {
-		set out "0xffffffff"
-	    } else {
-		set out $val
-	    }
-	    gdb_test "p/x $val" " = $out"
-	    if {$lang == "ada"} {
-		if {$sizeof_long_long == 4} {
-		    gdb_test "ptype $val" " = <4-byte integer>"
-		} else {
-		    gdb_test "ptype $val" " = <8-byte integer>"
+	    set re_overflow "Numeric constant too large\\."
+	}
+
+	set basevals {
+	    0xffffffffffffffff
+	    0x7fffffffffffffff
+	    0xffffffff
+	    0x7fffffff
+	    0xffff
+	    0x7fff
+	    0xff
+	    0x7f
+	    0x0
+	}
+
+	if { $lang == "modula-2" } {
+	    # Modula-2 is the only language that changes the type of an
+	    # integral literal based on whether it's prefixed with "-",
+	    # so test both scenarios.
+	    set prefixes { "" "-" }
+	} else {
+	    # For all the other languages, we'd just be testing the
+	    # parsing twice, so just test the basic scenario of no prefix.
+	    set prefixes { "" }
+	}
+
+	foreach_with_prefix prefix $prefixes {
+	    foreach baseval $basevals {
+		foreach offset { -2 -1 0 1 2 } {
+		    set dec_val [expr $baseval + $offset]
+		    set hex_val [format "0x%llx" $dec_val]
+		    if { $dec_val < 0 } {
+			continue
+		    }
+
+		    set dec_val $prefix$dec_val
+		    lassign [parse_number $lang $dec_val] type out
+		    if { $verbose >= 1 } { verbose -log "EXPECTED: $out" 2 }
+		    if { $prefix == "" } {
+			gdb_test "p/u $dec_val" "$out"
+		    } else {
+			gdb_test "p/d $dec_val" "$out"
+		    }
+		    if { $verbose >= 1 } { verbose -log "EXPECTED: $type" 2 }
+		    gdb_test "ptype $dec_val" "$type"
+
+		    if { $prefix == "-" } {
+			# Printing with /x below means negative numbers are
+			# converted to unsigned representation.  We could
+			# support this by updating the expected patterns.
+			# Possibly, we could print with /u and /d instead of
+			# /x here as well (which would also require updating
+			# expected patterns).
+			# For now, this doesn't seem worth the trouble,
+			# so skip.
+			continue
+		    }
+
+		    set hex_val $prefix$hex_val
+		    lassign [parse_number $lang $hex_val] type out
+		    set hex_val [hex_for_lang $lang $hex_val]
+		    if { $verbose >= 1 } { verbose -log "EXPECTED: $out" 2 }
+		    gdb_test "p/x $hex_val" "$out"
+		    if { $verbose >= 1 } { verbose -log "EXPECTED: $type" 2 }
+		    gdb_test "ptype $hex_val" "$type"
 		}
-	    } elseif {$lang == "d"} {
-		gdb_test "ptype $val" " = ulong"
-	    } elseif {$lang == "rust"} {
-		gdb_test "ptype $val" " = i64"
-	    } else {
-		gdb_test "ptype $val" " = $8B_type"
 	    }
 	}
     }
@@ -119,6 +330,13 @@ gdb_assert {[llength $supported_archs] > 1} "at least one architecture"
 
 set all_languages [get_set_option_choices "set language"]
 
+# If 1, test each arch.  If 0, test one arch for each sizeof
+# short/int/long/longlong configuration.
+# For a build with --enable-targets=all, full_arch_testing == 0 takes 15s,
+# while full_arch_testing == 1 takes 9m20s.
+set full_arch_testing 0
+
+set tested_archs {}
 foreach_with_prefix arch $supported_archs {
     if {$arch == "auto"} {
 	# Avoid duplicate testing.

base-commit: cb0d58bf4d274cfb1ae11b75bd2b3ba81c8d371d
-- 
2.35.3


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 2/8] [gdb/c] Fix type of 2147483648 and literal truncation
  2022-05-23 11:05 [PATCH 1/8] [gdb/testsuite] Test more values in gdb.base/parse_numbers.exp Tom de Vries
@ 2022-05-23 11:05 ` Tom de Vries
  2022-05-23 11:05 ` [PATCH 3/8] [gdb/fortran] Fix " Tom de Vries
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Tom de Vries @ 2022-05-23 11:05 UTC (permalink / raw)
  To: gdb-patches

[ Assuming arch i386:x86-64, sizeof (int) == 4,
sizeof (long) == sizeof (long long) == 8. ]

Currently we have (decimal for 0x80000000):
...
(gdb) ptype 2147483648
type = unsigned int
...

According to C language rules, unsigned types cannot be used for decimal
constants, so the type should be long instead (reported in PR16377).

Fix this by making sure the type of 2147483648 is long.

The next interesting case is (decimal for 0x8000000000000000):
...
(gdb) ptype 9223372036854775808
type = unsigned long
...

According to the same rules, unsigned long is incorrect.

Current gcc uses __int128 as type, which is allowed, but we don't have that
available in gdb, so the strict response here would be erroring out with
overflow.

Older gcc without __int128 support, as well as clang use an unsigned type, but with
a warning.  Interestingly, clang uses "unsigned long long" while gcc uses
"unsigned long", which seems the better choice.

Given that the compilers allow this as a convience, do the same in gdb
and keep type "unsigned long", and make this explicit in parser and test-case.

Furthermore, make sure we error out on overflow instead of truncating in all
cases.

Tested on x86_64-linux with --enable-targets=all.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=16377
---
 gdb/c-exp.y                             | 111 ++++++++++--------------
 gdb/parse.c                             |  37 ++++++++
 gdb/parser-defs.h                       |   2 +
 gdb/testsuite/gdb.base/parse_number.exp |  56 ++++++++++--
 4 files changed, 130 insertions(+), 76 deletions(-)

diff --git a/gdb/c-exp.y b/gdb/c-exp.y
index 72f8dd32d93..61a61fcba09 100644
--- a/gdb/c-exp.y
+++ b/gdb/c-exp.y
@@ -1929,7 +1929,6 @@ parse_number (struct parser_state *par_state,
 {
   ULONGEST n = 0;
   ULONGEST prevn = 0;
-  ULONGEST un;
 
   int i = 0;
   int c;
@@ -1945,9 +1944,6 @@ parse_number (struct parser_state *par_state,
   /* We have found a "L" or "U" (or "i") suffix.  */
   int found_suffix = 0;
 
-  ULONGEST high_bit;
-  struct type *signed_type;
-  struct type *unsigned_type;
   char *p;
 
   p = (char *) alloca (len);
@@ -2095,18 +2091,12 @@ parse_number (struct parser_state *par_state,
       if (i >= base)
 	return ERROR;		/* Invalid digit in this base */
 
-      /* Portably test for overflow (only works for nonzero values, so make
-	 a second check for zero).  FIXME: Can't we just make n and prevn
-	 unsigned and avoid this?  */
-      if (c != 'l' && c != 'u' && c != 'i' && (prevn >= n) && n != 0)
-	unsigned_p = 1;		/* Try something unsigned */
-
-      /* Portably test for unsigned overflow.
-	 FIXME: This check is wrong; for example it doesn't find overflow
-	 on 0x123456789 when LONGEST is 32 bits.  */
-      if (c != 'l' && c != 'u' && c != 'i' && n != 0)
-	{	
-	  if (unsigned_p && prevn >= n)
+      if (c != 'l' && c != 'u' && c != 'i')
+	{
+	  /* Test for overflow.  */
+	  if (prevn == 0 && n == 0)
+	    ;
+	  else if (prevn >= n)
 	    error (_("Numeric constant too large."));
 	}
       prevn = n;
@@ -2123,58 +2113,45 @@ parse_number (struct parser_state *par_state,
      or gdbarch_long_bit will be that big, sometimes not.  To deal with
      the case where it is we just always shift the value more than
      once, with fewer bits each time.  */
-
-  un = n >> 2;
-  if (long_p == 0
-      && (un >> (gdbarch_int_bit (par_state->gdbarch ()) - 2)) == 0)
-    {
-      high_bit
-	= ((ULONGEST)1) << (gdbarch_int_bit (par_state->gdbarch ()) - 1);
-
-      /* A large decimal (not hex or octal) constant (between INT_MAX
-	 and UINT_MAX) is a long or unsigned long, according to ANSI,
-	 never an unsigned int, but this code treats it as unsigned
-	 int.  This probably should be fixed.  GCC gives a warning on
-	 such constants.  */
-
-      unsigned_type = parse_type (par_state)->builtin_unsigned_int;
-      signed_type = parse_type (par_state)->builtin_int;
-    }
-  else if (long_p <= 1
-	   && (un >> (gdbarch_long_bit (par_state->gdbarch ()) - 2)) == 0)
-    {
-      high_bit
-	= ((ULONGEST)1) << (gdbarch_long_bit (par_state->gdbarch ()) - 1);
-      unsigned_type = parse_type (par_state)->builtin_unsigned_long;
-      signed_type = parse_type (par_state)->builtin_long;
-    }
+  int int_bits = gdbarch_int_bit (par_state->gdbarch ());
+  int long_bits = gdbarch_long_bit (par_state->gdbarch ());
+  int long_long_bits = gdbarch_long_long_bit (par_state->gdbarch ());
+  bool have_signed
+    /* No 'u' suffix.  */
+    = !unsigned_p;
+  bool have_unsigned
+    = ((/* 'u' suffix.  */
+	unsigned_p)
+       || (/* Not a decimal.  */
+	   base != 10)
+       || (/* Allowed as a convenience, in case decimal doesn't fit in largest
+	      signed type.  */
+	   !fits_in_type (1, n, long_long_bits, true)));
+  bool have_int
+    /* No 'l' or 'll' suffix.  */
+    = long_p == 0;
+  bool have_long
+    /* No 'll' suffix.  */
+    = long_p <= 1;
+  if (have_int && have_signed && fits_in_type (1, n, int_bits, true))
+    putithere->typed_val_int.type = parse_type (par_state)->builtin_int;
+  else if (have_int && have_unsigned && fits_in_type (1, n, int_bits, false))
+    putithere->typed_val_int.type
+      = parse_type (par_state)->builtin_unsigned_int;
+  else if (have_long && have_signed && fits_in_type (1, n, long_bits, true))
+    putithere->typed_val_int.type = parse_type (par_state)->builtin_long;
+  else if (have_long && have_unsigned && fits_in_type (1, n, long_bits, false))
+    putithere->typed_val_int.type
+      = parse_type (par_state)->builtin_unsigned_long;
+  else if (have_signed && fits_in_type (1, n, long_long_bits, true))
+    putithere->typed_val_int.type
+      = parse_type (par_state)->builtin_long_long;
+  else if (have_unsigned && fits_in_type (1, n, long_long_bits, false))
+    putithere->typed_val_int.type
+      = parse_type (par_state)->builtin_unsigned_long_long;
   else
-    {
-      int shift;
-      if (sizeof (ULONGEST) * HOST_CHAR_BIT
-	  < gdbarch_long_long_bit (par_state->gdbarch ()))
-	/* A long long does not fit in a LONGEST.  */
-	shift = (sizeof (ULONGEST) * HOST_CHAR_BIT - 1);
-      else
-	shift = (gdbarch_long_long_bit (par_state->gdbarch ()) - 1);
-      high_bit = (ULONGEST) 1 << shift;
-      unsigned_type = parse_type (par_state)->builtin_unsigned_long_long;
-      signed_type = parse_type (par_state)->builtin_long_long;
-    }
-
-   putithere->typed_val_int.val = n;
-
-   /* If the high bit of the worked out type is set then this number
-      has to be unsigned. */
-
-   if (unsigned_p || (n & high_bit))
-     {
-       putithere->typed_val_int.type = unsigned_type;
-     }
-   else
-     {
-       putithere->typed_val_int.type = signed_type;
-     }
+    error (_("Numeric constant too large."));
+  putithere->typed_val_int.val = n;
 
    if (imaginary_p)
      putithere->typed_val_int.type
diff --git a/gdb/parse.c b/gdb/parse.c
index fb308be0d7c..aa12f6fe012 100644
--- a/gdb/parse.c
+++ b/gdb/parse.c
@@ -611,6 +611,43 @@ parse_float (const char *p, int len,
 {
   return target_float_from_string (data, type, std::string (p, len));
 }
+
+/* Return true if the number N_SIGN * N fits in a type with TYPE_BITS and
+   TYPE_SIGNED_P.  N_SIGNED is either 1 or -1.  */
+
+bool
+fits_in_type (int n_sign, ULONGEST n, int type_bits, bool type_signed_p)
+{
+  /* Normalize -0.  */
+  if (n == 0 && n_sign == -1)
+    n_sign = 1;
+
+  if (n_sign == -1 && !type_signed_p)
+    /* Can't fit a negative number in an unsigned type.  */
+    return false;
+
+  if (type_bits > sizeof (ULONGEST) * 8)
+    return true;
+
+  ULONGEST smax = (ULONGEST)1 << (type_bits - 1);
+  if (n_sign == -1)
+    {
+      /* Negative number, signed type.  */
+      return (n <= smax);
+    }
+  else if (n_sign == 1 && type_signed_p)
+    {
+      /* Positive number, signed type.  */
+      return (n < smax);
+    }
+  else if (n_sign == 1 && !type_signed_p)
+    {
+      /* Positive number, unsigned type.  */
+      return ((n >> 1) >> (type_bits - 1)) == 0;
+    }
+  else
+    gdb_assert_not_reached ("");
+}
 \f
 /* This function avoids direct calls to fprintf 
    in the parser generated debug code.  */
diff --git a/gdb/parser-defs.h b/gdb/parser-defs.h
index 3be7d6c839f..ade3039ebb0 100644
--- a/gdb/parser-defs.h
+++ b/gdb/parser-defs.h
@@ -412,6 +412,8 @@ extern std::string copy_name (struct stoken);
 
 extern bool parse_float (const char *p, int len,
 			 const struct type *type, gdb_byte *data);
+extern bool fits_in_type (int n_sign, ULONGEST n, int type_bits,
+			  bool type_signed_p);
 \f
 
 /* Function used to avoid direct calls to fprintf
diff --git a/gdb/testsuite/gdb.base/parse_number.exp b/gdb/testsuite/gdb.base/parse_number.exp
index 7c259e0a8a0..638ea342384 100644
--- a/gdb/testsuite/gdb.base/parse_number.exp
+++ b/gdb/testsuite/gdb.base/parse_number.exp
@@ -68,6 +68,22 @@ proc fits_in_type { n type_bits type_signedness } {
     }
 }
 
+# Return 1 if LANG is a c-like language, in the sense that it uses the same
+# parser.
+
+proc c_like { lang } {
+    set res 0
+    switch $lang {
+	c
+	- c++
+	- asm
+	- objective-c
+	- opencl
+	- minimal {set res 1}
+    }
+    return $res
+}
+
 # Parse number N for LANG, and return a list of expected type and value.
 
 proc parse_number { lang n } {
@@ -165,26 +181,48 @@ proc parse_number { lang n } {
 			($any|$re_overflow)]
 	}
     } else {
-	# This is wrong for c-like languages.  For the decimal case, we
-	# shouldn't use unsigned.
-	# See PR 16377.
+	if { [c_like $lang] } {
+	    if { $hex_p } {
+		# C Hex.
+		set have_unsigned 1
+	    } else {
+		# C Decimal.  Unsigned not allowed according.
+		if { [fits_in_type $n $long_long_bits s] } {
+		    # Fits in largest signed type.
+		    set have_unsigned 0
+		} else {
+		    # Doesn't fit in largest signed type, so ill-formed, but
+		    # allow unsigned as a convenience, as compilers do (though
+		    # with a warning).
+		    set have_unsigned 1
+		}
+	    }
+	} else {
+	    # Non-C.
+	    set have_unsigned 1
+	}
+
 	if { [fits_in_type $n $int_bits s] } {
 	    return [list int $n]
-	} elseif { [fits_in_type $n $int_bits u] } {
+	} elseif { $have_unsigned && [fits_in_type $n $int_bits u] } {
 	    return [list "unsigned int" $n]
 	} elseif { [fits_in_type $n $long_bits s] } {
 	    return [list long $n]
-	} elseif { [fits_in_type $n $long_bits u] } {
+	} elseif { $have_unsigned && [fits_in_type $n $long_bits u] } {
 	    return [list "unsigned long" $n]
 	} elseif { [fits_in_type $n $long_long_bits s] } {
 	    return [list "long long" $n]
-	} elseif { [fits_in_type $n $long_long_bits u] } {
+	} elseif { $have_unsigned && [fits_in_type $n $long_long_bits u] } {
 	    return [list "unsigned long long" $n]
 	} else {
 	    # Overflow.
-	    # Some truncated value or re_overflow, should be re_overflow.
-	    return [list "((unsigned )?(int|long)|$re_overflow)" \
-			($any|$re_overflow)]
+	    if { [c_like $lang] } {
+		return [list $re_overflow $re_overflow]
+	    } else {
+		# Some truncated value or re_overflow, should be re_overflow.
+		return [list "((unsigned )?(int|long)|$re_overflow)" \
+			    ($any|$re_overflow)]
+	    }
 	}
     }
 
-- 
2.35.3


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 3/8] [gdb/fortran] Fix literal truncation
  2022-05-23 11:05 [PATCH 1/8] [gdb/testsuite] Test more values in gdb.base/parse_numbers.exp Tom de Vries
  2022-05-23 11:05 ` [PATCH 2/8] [gdb/c] Fix type of 2147483648 and literal truncation Tom de Vries
@ 2022-05-23 11:05 ` Tom de Vries
  2022-05-23 11:05 ` [PATCH 4/8] [gdb/go] " Tom de Vries
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Tom de Vries @ 2022-05-23 11:05 UTC (permalink / raw)
  To: gdb-patches

As mentioned in commit 5b758627a18 ("Make gdb.base/parse_number.exp test all
architectures"):
...
    There might be a bug that 32-bit fortran truncates 64-bit values to
    32-bit, given "p/x 0xffffffffffffffff" returns "0xffffffff".
...

More concretely, we have:
...
$ for arch in i386:x86-64 i386; do \
    gdb -q -batch -ex "set arch $arch" -ex "set lang fortran" \
      -ex "p /x 0xffffffffffffffff"; \
  done
The target architecture is set to "i386:x86-64".
$1 = 0xffffffffffffffff
The target architecture is set to "i386".
$1 = 0xffffffff
...

Fix this by adding a range check in parse_number in gdb/f-exp.y.

Furthermore, make sure we error out on overflow instead of truncating in all
other cases.

Tested on x86_64-linux.
---
 gdb/f-exp.y                             | 31 ++++++++++++-------------
 gdb/testsuite/gdb.base/parse_number.exp |  4 +---
 2 files changed, 16 insertions(+), 19 deletions(-)

diff --git a/gdb/f-exp.y b/gdb/f-exp.y
index 90cc2c65c7b..62641083850 100644
--- a/gdb/f-exp.y
+++ b/gdb/f-exp.y
@@ -1076,16 +1076,11 @@ parse_number (struct parser_state *par_state,
 	  n *= base;
 	  n += i;
 	}
-      /* Portably test for overflow (only works for nonzero values, so make
-	 a second check for zero).  */
-      if ((prevn >= n) && n != 0)
-	unsigned_p=1;		/* Try something unsigned */
-      /* If range checking enabled, portably test for unsigned overflow.  */
-      if (RANGE_CHECK && n != 0)
-	{
-	  if ((unsigned_p && prevn >= n))
-	    range_error (_("Overflow on numeric constant."));
-	}
+      /* Test for overflow.  */
+      if (prevn == 0 && n == 0)
+	;
+      else if (RANGE_CHECK && prevn >= n)
+	range_error (_("Overflow on numeric constant."));
       prevn = n;
     }
   
@@ -1100,7 +1095,8 @@ parse_number (struct parser_state *par_state,
      but too many compilers warn about that, when ints and longs
      are the same size.  So we shift it twice, with fewer bits
      each time, for the same result.  */
-  
+
+  int bits_available;
   if ((gdbarch_int_bit (par_state->gdbarch ())
        != gdbarch_long_bit (par_state->gdbarch ())
        && ((n >> 2)
@@ -1108,19 +1104,22 @@ parse_number (struct parser_state *par_state,
 							    shift warning */
       || long_p)
     {
-      high_bit = ((ULONGEST)1)
-      << (gdbarch_long_bit (par_state->gdbarch ())-1);
+      bits_available = gdbarch_long_bit (par_state->gdbarch ());
       unsigned_type = parse_type (par_state)->builtin_unsigned_long;
       signed_type = parse_type (par_state)->builtin_long;
-    }
+  }
   else 
     {
-      high_bit =
-	((ULONGEST)1) << (gdbarch_int_bit (par_state->gdbarch ()) - 1);
+      bits_available = gdbarch_int_bit (par_state->gdbarch ());
       unsigned_type = parse_type (par_state)->builtin_unsigned_int;
       signed_type = parse_type (par_state)->builtin_int;
     }    
+  high_bit = ((ULONGEST)1) << (bits_available - 1);
   
+  if (RANGE_CHECK
+      && ((n >> 2) >> (bits_available - 2)))
+    range_error (_("Overflow on numeric constant."));
+
   putithere->typed_val.val = n;
   
   /* If the high bit of the worked out type is set then this number
diff --git a/gdb/testsuite/gdb.base/parse_number.exp b/gdb/testsuite/gdb.base/parse_number.exp
index 638ea342384..87554ccf995 100644
--- a/gdb/testsuite/gdb.base/parse_number.exp
+++ b/gdb/testsuite/gdb.base/parse_number.exp
@@ -176,9 +176,7 @@ proc parse_number { lang n } {
 	    return [list "unsigned long" $n]
 	} else {
 	    # Overflow.
-	    # Some truncated value or re_overflow, should be re_overflow.
-	    return [list "((unsigned )?(int|long)|$re_overflow)" \
-			($any|$re_overflow)]
+	    return [list $re_overflow $re_overflow]
 	}
     } else {
 	if { [c_like $lang] } {
-- 
2.35.3


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 4/8] [gdb/go] Fix literal truncation
  2022-05-23 11:05 [PATCH 1/8] [gdb/testsuite] Test more values in gdb.base/parse_numbers.exp Tom de Vries
  2022-05-23 11:05 ` [PATCH 2/8] [gdb/c] Fix type of 2147483648 and literal truncation Tom de Vries
  2022-05-23 11:05 ` [PATCH 3/8] [gdb/fortran] Fix " Tom de Vries
@ 2022-05-23 11:05 ` Tom de Vries
  2022-05-23 11:05 ` [PATCH 5/8] [gdb/pascal] " Tom de Vries
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Tom de Vries @ 2022-05-23 11:05 UTC (permalink / raw)
  To: gdb-patches

Make sure we error out on overflow instead of truncating in all cases.

The current implementation of parse_number contains a comment about PR16377,
but that's related to C-like languages.  In absence of information of whether
the same fix is needed for go, take the conservative approach and keep
behaviour for decimals unchanged.

Tested on x86_64-linux, with a build with --enable-targets=all.
---
 gdb/go-exp.y                            | 95 ++++++++-----------------
 gdb/testsuite/gdb.base/parse_number.exp |  2 +-
 2 files changed, 30 insertions(+), 67 deletions(-)

diff --git a/gdb/go-exp.y b/gdb/go-exp.y
index 1b9b6ea7e90..f786ec20171 100644
--- a/gdb/go-exp.y
+++ b/gdb/go-exp.y
@@ -649,7 +649,6 @@ parse_number (struct parser_state *par_state,
 {
   ULONGEST n = 0;
   ULONGEST prevn = 0;
-  ULONGEST un;
 
   int i = 0;
   int c;
@@ -662,10 +661,6 @@ parse_number (struct parser_state *par_state,
   /* We have found a "L" or "U" suffix.  */
   int found_suffix = 0;
 
-  ULONGEST high_bit;
-  struct type *signed_type;
-  struct type *unsigned_type;
-
   if (parsed_float)
     {
       const struct builtin_go_type *builtin_go_types
@@ -777,18 +772,12 @@ parse_number (struct parser_state *par_state,
       if (i >= base)
 	return ERROR;		/* Invalid digit in this base.  */
 
-      /* Portably test for overflow (only works for nonzero values, so make
-	 a second check for zero).  FIXME: Can't we just make n and prevn
-	 unsigned and avoid this?  */
-      if (c != 'l' && c != 'u' && (prevn >= n) && n != 0)
-	unsigned_p = 1;		/* Try something unsigned.  */
-
-      /* Portably test for unsigned overflow.
-	 FIXME: This check is wrong; for example it doesn't find overflow
-	 on 0x123456789 when LONGEST is 32 bits.  */
-      if (c != 'l' && c != 'u' && n != 0)
+      if (c != 'l' && c != 'u')
 	{
-	  if ((unsigned_p && prevn >= n))
+	  /* Test for overflow.  */
+	  if (n == 0 && prevn == 0)
+	    ;
+	  else if (prevn >= n)
 	    error (_("Numeric constant too large."));
 	}
       prevn = n;
@@ -806,57 +795,31 @@ parse_number (struct parser_state *par_state,
      the case where it is we just always shift the value more than
      once, with fewer bits each time.  */
 
-  un = n >> 2;
-  if (long_p == 0
-      && (un >> (gdbarch_int_bit (par_state->gdbarch ()) - 2)) == 0)
-    {
-      high_bit
-	= ((ULONGEST)1) << (gdbarch_int_bit (par_state->gdbarch ()) - 1);
-
-      /* A large decimal (not hex or octal) constant (between INT_MAX
-	 and UINT_MAX) is a long or unsigned long, according to ANSI,
-	 never an unsigned int, but this code treats it as unsigned
-	 int.  This probably should be fixed.  GCC gives a warning on
-	 such constants.  */
-
-      unsigned_type = parse_type (par_state)->builtin_unsigned_int;
-      signed_type = parse_type (par_state)->builtin_int;
-    }
-  else if (long_p <= 1
-	   && (un >> (gdbarch_long_bit (par_state->gdbarch ()) - 2)) == 0)
-    {
-      high_bit
-	= ((ULONGEST)1) << (gdbarch_long_bit (par_state->gdbarch ()) - 1);
-      unsigned_type = parse_type (par_state)->builtin_unsigned_long;
-      signed_type = parse_type (par_state)->builtin_long;
-    }
+  int int_bits = gdbarch_int_bit (par_state->gdbarch ());
+  int long_bits = gdbarch_long_bit (par_state->gdbarch ());
+  int long_long_bits = gdbarch_long_long_bit (par_state->gdbarch ());
+  bool have_signed = !unsigned_p;
+  bool have_int = long_p == 0;
+  bool have_long = long_p <= 1;
+  if (have_int && have_signed && fits_in_type (1, n, int_bits, true))
+    putithere->typed_val_int.type = parse_type (par_state)->builtin_int;
+  else if (have_int && fits_in_type (1, n, int_bits, false))
+    putithere->typed_val_int.type
+      = parse_type (par_state)->builtin_unsigned_int;
+  else if (have_long && have_signed && fits_in_type (1, n, long_bits, true))
+    putithere->typed_val_int.type = parse_type (par_state)->builtin_long;
+  else if (have_long && fits_in_type (1, n, long_bits, false))
+    putithere->typed_val_int.type
+      = parse_type (par_state)->builtin_unsigned_long;
+  else if (have_signed && fits_in_type (1, n, long_long_bits, true))
+    putithere->typed_val_int.type
+      = parse_type (par_state)->builtin_long_long;
+  else if (fits_in_type (1, n, long_long_bits, false))
+    putithere->typed_val_int.type
+      = parse_type (par_state)->builtin_unsigned_long_long;
   else
-    {
-      int shift;
-      if (sizeof (ULONGEST) * HOST_CHAR_BIT
-	  < gdbarch_long_long_bit (par_state->gdbarch ()))
-	/* A long long does not fit in a LONGEST.  */
-	shift = (sizeof (ULONGEST) * HOST_CHAR_BIT - 1);
-      else
-	shift = (gdbarch_long_long_bit (par_state->gdbarch ()) - 1);
-      high_bit = (ULONGEST) 1 << shift;
-      unsigned_type = parse_type (par_state)->builtin_unsigned_long_long;
-      signed_type = parse_type (par_state)->builtin_long_long;
-    }
-
-   putithere->typed_val_int.val = n;
-
-   /* If the high bit of the worked out type is set then this number
-      has to be unsigned.  */
-
-   if (unsigned_p || (n & high_bit))
-     {
-       putithere->typed_val_int.type = unsigned_type;
-     }
-   else
-     {
-       putithere->typed_val_int.type = signed_type;
-     }
+    error (_("Numeric constant too large."));
+  putithere->typed_val_int.val = n;
 
    return INT;
 }
diff --git a/gdb/testsuite/gdb.base/parse_number.exp b/gdb/testsuite/gdb.base/parse_number.exp
index 87554ccf995..bedb4d64c5a 100644
--- a/gdb/testsuite/gdb.base/parse_number.exp
+++ b/gdb/testsuite/gdb.base/parse_number.exp
@@ -214,7 +214,7 @@ proc parse_number { lang n } {
 	    return [list "unsigned long long" $n]
 	} else {
 	    # Overflow.
-	    if { [c_like $lang] } {
+	    if { [c_like $lang] || $lang == "go" } {
 		return [list $re_overflow $re_overflow]
 	    } else {
 		# Some truncated value or re_overflow, should be re_overflow.
-- 
2.35.3


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 5/8] [gdb/pascal] Fix literal truncation
  2022-05-23 11:05 [PATCH 1/8] [gdb/testsuite] Test more values in gdb.base/parse_numbers.exp Tom de Vries
                   ` (2 preceding siblings ...)
  2022-05-23 11:05 ` [PATCH 4/8] [gdb/go] " Tom de Vries
@ 2022-05-23 11:05 ` Tom de Vries
  2022-05-23 11:05 ` [PATCH 6/8] [gdb/rust] " Tom de Vries
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Tom de Vries @ 2022-05-23 11:05 UTC (permalink / raw)
  To: gdb-patches

Make sure we error out on overflow instead of truncating in all cases.

The current implementation of parse_number contains a comment about PR16377,
but that's related to C-like languages.  In absence of information of whether
the same fix is needed for pascal, take the conservative approach and keep
behaviour for decimals unchanged.

Tested on x86_64-linux, with a build with --enable-targets=all.
---
 gdb/p-exp.y                             | 95 ++++++++-----------------
 gdb/testsuite/gdb.base/parse_number.exp |  8 +--
 2 files changed, 30 insertions(+), 73 deletions(-)

diff --git a/gdb/p-exp.y b/gdb/p-exp.y
index 7c88df65e69..2df41b80ec7 100644
--- a/gdb/p-exp.y
+++ b/gdb/p-exp.y
@@ -804,7 +804,6 @@ parse_number (struct parser_state *par_state,
 {
   ULONGEST n = 0;
   ULONGEST prevn = 0;
-  ULONGEST un;
 
   int i = 0;
   int c;
@@ -817,10 +816,6 @@ parse_number (struct parser_state *par_state,
   /* We have found a "L" or "U" suffix.  */
   int found_suffix = 0;
 
-  ULONGEST high_bit;
-  struct type *signed_type;
-  struct type *unsigned_type;
-
   if (parsed_float)
     {
       /* Handle suffixes: 'f' for float, 'l' for long double.
@@ -919,18 +914,12 @@ parse_number (struct parser_state *par_state,
       if (i >= base)
 	return ERROR;		/* Invalid digit in this base.  */
 
-      /* Portably test for overflow (only works for nonzero values, so make
-	 a second check for zero).  FIXME: Can't we just make n and prevn
-	 unsigned and avoid this?  */
-      if (c != 'l' && c != 'u' && (prevn >= n) && n != 0)
-	unsigned_p = 1;		/* Try something unsigned.  */
-
-      /* Portably test for unsigned overflow.
-	 FIXME: This check is wrong; for example it doesn't find overflow
-	 on 0x123456789 when LONGEST is 32 bits.  */
-      if (c != 'l' && c != 'u' && n != 0)
+      if (c != 'l' && c != 'u')
 	{
-	  if (unsigned_p && prevn >= n)
+	  /* Test for overflow.  */
+	  if (prevn == 0 && n == 0)
+	    ;
+	  else if (prevn >= n)
 	    error (_("Numeric constant too large."));
 	}
       prevn = n;
@@ -948,57 +937,31 @@ parse_number (struct parser_state *par_state,
      the case where it is we just always shift the value more than
      once, with fewer bits each time.  */
 
-  un = n >> 2;
-  if (long_p == 0
-      && (un >> (gdbarch_int_bit (par_state->gdbarch ()) - 2)) == 0)
-    {
-      high_bit
-	= ((ULONGEST)1) << (gdbarch_int_bit (par_state->gdbarch ()) - 1);
-
-      /* A large decimal (not hex or octal) constant (between INT_MAX
-	 and UINT_MAX) is a long or unsigned long, according to ANSI,
-	 never an unsigned int, but this code treats it as unsigned
-	 int.  This probably should be fixed.  GCC gives a warning on
-	 such constants.  */
-
-      unsigned_type = parse_type (par_state)->builtin_unsigned_int;
-      signed_type = parse_type (par_state)->builtin_int;
-    }
-  else if (long_p <= 1
-	   && (un >> (gdbarch_long_bit (par_state->gdbarch ()) - 2)) == 0)
-    {
-      high_bit
-	= ((ULONGEST)1) << (gdbarch_long_bit (par_state->gdbarch ()) - 1);
-      unsigned_type = parse_type (par_state)->builtin_unsigned_long;
-      signed_type = parse_type (par_state)->builtin_long;
-    }
+  int int_bits = gdbarch_int_bit (par_state->gdbarch ());
+  int long_bits = gdbarch_long_bit (par_state->gdbarch ());
+  int long_long_bits = gdbarch_long_long_bit (par_state->gdbarch ());
+  bool have_signed = !unsigned_p;
+  bool have_int = long_p == 0;
+  bool have_long = long_p <= 1;
+  if (have_int && have_signed && fits_in_type (1, n, int_bits, true))
+    putithere->typed_val_int.type = parse_type (par_state)->builtin_int;
+  else if (have_int && fits_in_type (1, n, int_bits, false))
+    putithere->typed_val_int.type
+      = parse_type (par_state)->builtin_unsigned_int;
+  else if (have_long && have_signed && fits_in_type (1, n, long_bits, true))
+    putithere->typed_val_int.type = parse_type (par_state)->builtin_long;
+  else if (have_long && fits_in_type (1, n, long_bits, false))
+    putithere->typed_val_int.type
+      = parse_type (par_state)->builtin_unsigned_long;
+  else if (have_signed && fits_in_type (1, n, long_long_bits, true))
+    putithere->typed_val_int.type
+      = parse_type (par_state)->builtin_long_long;
+  else if (fits_in_type (1, n, long_long_bits, false))
+    putithere->typed_val_int.type
+      = parse_type (par_state)->builtin_unsigned_long_long;
   else
-    {
-      int shift;
-      if (sizeof (ULONGEST) * HOST_CHAR_BIT
-	  < gdbarch_long_long_bit (par_state->gdbarch ()))
-	/* A long long does not fit in a LONGEST.  */
-	shift = (sizeof (ULONGEST) * HOST_CHAR_BIT - 1);
-      else
-	shift = (gdbarch_long_long_bit (par_state->gdbarch ()) - 1);
-      high_bit = (ULONGEST) 1 << shift;
-      unsigned_type = parse_type (par_state)->builtin_unsigned_long_long;
-      signed_type = parse_type (par_state)->builtin_long_long;
-    }
-
-   putithere->typed_val_int.val = n;
-
-   /* If the high bit of the worked out type is set then this number
-      has to be unsigned.  */
-
-   if (unsigned_p || (n & high_bit))
-     {
-       putithere->typed_val_int.type = unsigned_type;
-     }
-   else
-     {
-       putithere->typed_val_int.type = signed_type;
-     }
+    error (_("Numeric constant too large."));
+  putithere->typed_val_int.val = n;
 
    return INT;
 }
diff --git a/gdb/testsuite/gdb.base/parse_number.exp b/gdb/testsuite/gdb.base/parse_number.exp
index bedb4d64c5a..f9782115b7c 100644
--- a/gdb/testsuite/gdb.base/parse_number.exp
+++ b/gdb/testsuite/gdb.base/parse_number.exp
@@ -214,13 +214,7 @@ proc parse_number { lang n } {
 	    return [list "unsigned long long" $n]
 	} else {
 	    # Overflow.
-	    if { [c_like $lang] || $lang == "go" } {
-		return [list $re_overflow $re_overflow]
-	    } else {
-		# Some truncated value or re_overflow, should be re_overflow.
-		return [list "((unsigned )?(int|long)|$re_overflow)" \
-			    ($any|$re_overflow)]
-	    }
+	    return [list $re_overflow $re_overflow]
 	}
     }
 
-- 
2.35.3


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 6/8] [gdb/rust] Fix literal truncation
  2022-05-23 11:05 [PATCH 1/8] [gdb/testsuite] Test more values in gdb.base/parse_numbers.exp Tom de Vries
                   ` (3 preceding siblings ...)
  2022-05-23 11:05 ` [PATCH 5/8] [gdb/pascal] " Tom de Vries
@ 2022-05-23 11:05 ` Tom de Vries
  2022-05-26 17:17   ` Tom Tromey
  2022-05-23 11:05 ` [PATCH 7/8] [gdb/m2] Fix UB and " Tom de Vries
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 11+ messages in thread
From: Tom de Vries @ 2022-05-23 11:05 UTC (permalink / raw)
  To: gdb-patches

Make sure we error out on overflow instead of truncating in all cases.

I've used as overflow string: "Integer literal is too large", based
on what I found at
<rust-lang/rust>/src/test/ui/parser/int-literal-too-large-span.rs
but perhaps someone has a better idea.

Tested on x86_64-linux, with a build with --enable-targets=all.
---
 gdb/rust-parse.c                        | 5 ++++-
 gdb/testsuite/gdb.base/parse_number.exp | 5 +++--
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/gdb/rust-parse.c b/gdb/rust-parse.c
index 7d7d882872c..836f49108f8 100644
--- a/gdb/rust-parse.c
+++ b/gdb/rust-parse.c
@@ -1024,7 +1024,10 @@ rust_parser::lex_number ()
 	    }
 	}
 
-      value = strtoulst (number.c_str () + offset, NULL, radix);
+      const char *trailer;
+      value = strtoulst (number.c_str () + offset, &trailer, radix);
+      if (*trailer != '\0')
+	error ("Integer literal is too large");
       if (implicit_i32 && value >= ((uint64_t) 1) << 31)
 	type = get_type ("i64");
 
diff --git a/gdb/testsuite/gdb.base/parse_number.exp b/gdb/testsuite/gdb.base/parse_number.exp
index f9782115b7c..4189ccaf92c 100644
--- a/gdb/testsuite/gdb.base/parse_number.exp
+++ b/gdb/testsuite/gdb.base/parse_number.exp
@@ -113,8 +113,7 @@ proc parse_number { lang n } {
 	    return [list "i64" $n]
 	} else {
 	    # Overflow.
-	    # Some truncated value, should be re_overflow.
-	    return [list i64 $any]
+	    return [list $re_overflow $re_overflow]
 	}
     } elseif { $lang == "d" } {
 	if { [fits_in_type $n 32 s] } {
@@ -274,6 +273,8 @@ proc test_parse_numbers {arch} {
 	    set re_overflow "Overflow on numeric constant\\."
 	} elseif { $lang == "ada" } {
 	    set re_overflow "Integer literal out of range"
+	} elseif { $lang == "rust" } {
+	    set re_overflow "Integer literal is too large"
 	} else {
 	    set re_overflow "Numeric constant too large\\."
 	}
-- 
2.35.3


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 7/8] [gdb/m2] Fix UB and literal truncation
  2022-05-23 11:05 [PATCH 1/8] [gdb/testsuite] Test more values in gdb.base/parse_numbers.exp Tom de Vries
                   ` (4 preceding siblings ...)
  2022-05-23 11:05 ` [PATCH 6/8] [gdb/rust] " Tom de Vries
@ 2022-05-23 11:05 ` Tom de Vries
  2022-05-23 11:05 ` [PATCH 8/8] [gdb/ada] Fix " Tom de Vries
  2022-06-04 11:20 ` [PATCH 1/8] [gdb/testsuite] Test more values in gdb.base/parse_numbers.exp Tom de Vries
  7 siblings, 0 replies; 11+ messages in thread
From: Tom de Vries @ 2022-05-23 11:05 UTC (permalink / raw)
  To: gdb-patches

Rewrite parse_number to use ULONGEST instead of LONGEST, to fix UB errors as
mentioned in PR29163.

Furthermore, make sure we error out on overflow instead of truncating in all
cases.

Tested on x86_64-linux, with a build with --enable-targets=all.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29163
---
 gdb/m2-exp.y                            | 47 ++++++++++++-------------
 gdb/testsuite/gdb.base/parse_number.exp |  3 +-
 2 files changed, 24 insertions(+), 26 deletions(-)

diff --git a/gdb/m2-exp.y b/gdb/m2-exp.y
index 85bac11b8fb..d3e917bb8d7 100644
--- a/gdb/m2-exp.y
+++ b/gdb/m2-exp.y
@@ -582,12 +582,11 @@ static int
 parse_number (int olen)
 {
   const char *p = pstate->lexptr;
-  LONGEST n = 0;
-  LONGEST prevn = 0;
+  ULONGEST n = 0;
+  ULONGEST prevn = 0;
   int c,i,ischar=0;
   int base = input_radix;
   int len = olen;
-  int unsigned_p = number_sign == 1 ? 1 : 0;
 
   if(p[len-1] == 'H')
   {
@@ -639,16 +638,11 @@ parse_number (int olen)
       n+=i;
       if(i >= base)
 	 return ERROR;
-      if(!unsigned_p && number_sign == 1 && (prevn >= n))
-	 unsigned_p=1;		/* Try something unsigned */
-      /* Don't do the range check if n==i and i==0, since that special
-	 case will give an overflow error.  */
-      if(RANGE_CHECK && n!=i && i)
-      {
-	 if((unsigned_p && (unsigned)prevn >= (unsigned)n) ||
-	    ((!unsigned_p && number_sign==-1) && -prevn <= -n))
-	    range_error (_("Overflow on numeric constant."));
-      }
+      if (n == 0 && prevn == 0)
+	;
+      else if (RANGE_CHECK && prevn >= n)
+	range_error (_("Overflow on numeric constant."));
+
 	 prevn=n;
     }
 
@@ -661,17 +655,22 @@ parse_number (int olen)
      yylval.ulval = n;
      return CHAR;
   }
-  else if ( unsigned_p && number_sign == 1)
-  {
-     yylval.ulval = n;
-     return UINT;
-  }
-  else if((unsigned_p && (n<0))) {
-     range_error (_("Overflow on numeric constant -- number too large."));
-     /* But, this can return if range_check == range_warn.  */
-  }
-  yylval.lval = n;
-  return INT;
+
+  int int_bits = gdbarch_int_bit (pstate->gdbarch ());
+  bool have_signed = number_sign == -1;
+  bool have_unsigned = number_sign == 1;
+  if (have_signed && fits_in_type (number_sign, n, int_bits, true))
+    {
+      yylval.lval = n;
+      return INT;
+    }
+  else if (have_unsigned && fits_in_type (number_sign, n, int_bits, false))
+    {
+      yylval.ulval = n;
+      return UINT;
+    }
+  else
+    error (_("Overflow on numeric constant."));
 }
 
 
diff --git a/gdb/testsuite/gdb.base/parse_number.exp b/gdb/testsuite/gdb.base/parse_number.exp
index 4189ccaf92c..6e0091278a9 100644
--- a/gdb/testsuite/gdb.base/parse_number.exp
+++ b/gdb/testsuite/gdb.base/parse_number.exp
@@ -161,8 +161,7 @@ proc parse_number { lang n } {
 	    return [list "CARDINAL" $n]
 	} else {
 	    # Overflow.
-	    # Some truncated value or re_overflow, should be re_overflow.
-	    return [list ($re_overflow|CARDINAL|INTEGER) ($re_overflow|$any)]
+	    return [list $re_overflow $re_overflow]
 	}
     } elseif { $lang == "fortran" } {
 	if { [fits_in_type $n $int_bits s] } {
-- 
2.35.3


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 8/8] [gdb/ada] Fix literal truncation
  2022-05-23 11:05 [PATCH 1/8] [gdb/testsuite] Test more values in gdb.base/parse_numbers.exp Tom de Vries
                   ` (5 preceding siblings ...)
  2022-05-23 11:05 ` [PATCH 7/8] [gdb/m2] Fix UB and " Tom de Vries
@ 2022-05-23 11:05 ` Tom de Vries
  2022-06-04 11:20 ` [PATCH 1/8] [gdb/testsuite] Test more values in gdb.base/parse_numbers.exp Tom de Vries
  7 siblings, 0 replies; 11+ messages in thread
From: Tom de Vries @ 2022-05-23 11:05 UTC (permalink / raw)
  To: gdb-patches

Make sure we error out on overflow instead of truncating in all cases.

Tested on x86_64-linux, with a build with --enable-targets=all.
---
 gdb/ada-lex.l                           | 27 +++++++++++++++++++++----
 gdb/testsuite/gdb.base/parse_number.exp |  4 +---
 2 files changed, 24 insertions(+), 7 deletions(-)

diff --git a/gdb/ada-lex.l b/gdb/ada-lex.l
index 33a08eaa93b..002eb811e41 100644
--- a/gdb/ada-lex.l
+++ b/gdb/ada-lex.l
@@ -466,12 +466,16 @@ processInt (struct parser_state *par_state, const char *base0,
   if (mpz_cmp (result.val, maxval.val) > 0)
     error (_("Integer literal out of range"));
 
+  int int_bits = gdbarch_int_bit (par_state->gdbarch ());
+  int long_bits = gdbarch_long_bit (par_state->gdbarch ());
+  int long_long_bits = gdbarch_long_long_bit (par_state->gdbarch ());
+
   ULONGEST value = result.as_integer<ULONGEST> ();
-  if ((value >> (gdbarch_int_bit (par_state->gdbarch ())-1)) == 0)
+  if (fits_in_type (1, value, int_bits, true))
     yylval.typed_val.type = type_int (par_state);
-  else if ((value >> (gdbarch_long_bit (par_state->gdbarch ())-1)) == 0)
+  else if (fits_in_type (1, value, long_bits, true))
     yylval.typed_val.type = type_long (par_state);
-  else if (((value >> (gdbarch_long_bit (par_state->gdbarch ())-1)) >> 1) == 0)
+  else if (fits_in_type (1, value, long_bits, false))
     {
       /* We have a number representable as an unsigned integer quantity.
          For consistency with the C treatment, we will treat it as an
@@ -490,8 +494,23 @@ processInt (struct parser_state *par_state, const char *base0,
 	yylval.typed_val.val = (LONGEST) value;
       return INT;
     }
-  else
+  else if (fits_in_type (1, value, long_long_bits, true))
     yylval.typed_val.type = type_long_long (par_state);
+  else if (fits_in_type (1, value, long_long_bits, false))
+    {
+      /* Note: Interprets ULLONG_MAX as -1.  */
+      yylval.typed_val.type = type_long_long (par_state);
+      /* See unsigned long case above.  */
+      if (value & LONGEST_SIGN)
+	yylval.typed_val.val =
+	  (LONGEST) (value & ~LONGEST_SIGN)
+	  - (LONGEST_SIGN>>1) - (LONGEST_SIGN>>1);
+      else
+	yylval.typed_val.val = (LONGEST) value;
+      return INT;
+    }
+  else
+    error (_("Integer literal out of range"));
 
   yylval.typed_val.val = value;
   return INT;
diff --git a/gdb/testsuite/gdb.base/parse_number.exp b/gdb/testsuite/gdb.base/parse_number.exp
index 6e0091278a9..70b0ad065d7 100644
--- a/gdb/testsuite/gdb.base/parse_number.exp
+++ b/gdb/testsuite/gdb.base/parse_number.exp
@@ -146,9 +146,7 @@ proc parse_number { lang n } {
 	    return [list "<$sizeof_long_long-byte integer>" $n]
 	} else {
 	    # Overflow.
-	    # Some truncated value or re_overflow, should be re_overflow.
-	    return [list "($re_overflow|<$decimal-byte integer>)" \
-			($re_overflow|$any)]
+	    return [list $re_overflow $re_overflow]
 	}
     } elseif { $lang == "modula-2" } {
 	if { [string equal $n -0] } {
-- 
2.35.3


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 6/8] [gdb/rust] Fix literal truncation
  2022-05-23 11:05 ` [PATCH 6/8] [gdb/rust] " Tom de Vries
@ 2022-05-26 17:17   ` Tom Tromey
  2022-06-07  9:29     ` Tom de Vries
  0 siblings, 1 reply; 11+ messages in thread
From: Tom Tromey @ 2022-05-26 17:17 UTC (permalink / raw)
  To: Tom de Vries via Gdb-patches

>>>>> "Tom" == Tom de Vries via Gdb-patches <gdb-patches@sourceware.org> writes:

Tom> Make sure we error out on overflow instead of truncating in all cases.
Tom> I've used as overflow string: "Integer literal is too large", based
Tom> on what I found at
Tom> <rust-lang/rust>/src/test/ui/parser/int-literal-too-large-span.rs
Tom> but perhaps someone has a better idea.

Tom> -      value = strtoulst (number.c_str () + offset, NULL, radix);
Tom> +      const char *trailer;
Tom> +      value = strtoulst (number.c_str () + offset, &trailer, radix);
Tom> +      if (*trailer != '\0')
Tom> +	error ("Integer literal is too large");

This seems fine, though I think it's normal to use _() around the
argument to error.

Tom> +	} elseif { $lang == "rust" } {
Tom> +	    set re_overflow "Integer literal is too large"

I don't mind if this is there, but normally the Rust lexer uses unit
tests.  They are in rust-parse.c, see rust_lex_tests.

Tom

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/8] [gdb/testsuite] Test more values in gdb.base/parse_numbers.exp
  2022-05-23 11:05 [PATCH 1/8] [gdb/testsuite] Test more values in gdb.base/parse_numbers.exp Tom de Vries
                   ` (6 preceding siblings ...)
  2022-05-23 11:05 ` [PATCH 8/8] [gdb/ada] Fix " Tom de Vries
@ 2022-06-04 11:20 ` Tom de Vries
  7 siblings, 0 replies; 11+ messages in thread
From: Tom de Vries @ 2022-06-04 11:20 UTC (permalink / raw)
  To: gdb-patches

On 5/23/22 13:05, Tom de Vries via Gdb-patches wrote:

I've pushed this series.

Thanks,
- Tom

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 6/8] [gdb/rust] Fix literal truncation
  2022-05-26 17:17   ` Tom Tromey
@ 2022-06-07  9:29     ` Tom de Vries
  0 siblings, 0 replies; 11+ messages in thread
From: Tom de Vries @ 2022-06-07  9:29 UTC (permalink / raw)
  To: Tom Tromey, Tom de Vries via Gdb-patches

[-- Attachment #1: Type: text/plain, Size: 1354 bytes --]

On 5/26/22 19:17, Tom Tromey wrote:
>>>>>> "Tom" == Tom de Vries via Gdb-patches <gdb-patches@sourceware.org> writes:
> 
> Tom> Make sure we error out on overflow instead of truncating in all cases.
> Tom> I've used as overflow string: "Integer literal is too large", based
> Tom> on what I found at
> Tom> <rust-lang/rust>/src/test/ui/parser/int-literal-too-large-span.rs
> Tom> but perhaps someone has a better idea.
> 
> Tom> -      value = strtoulst (number.c_str () + offset, NULL, radix);
> Tom> +      const char *trailer;
> Tom> +      value = strtoulst (number.c_str () + offset, &trailer, radix);
> Tom> +      if (*trailer != '\0')
> Tom> +	error ("Integer literal is too large");
> 
> This seems fine, though I think it's normal to use _() around the
> argument to error.
> 

Indeed, and I forgot to follow up on that before commit.  Fixed in this 
commit.

> Tom> +	} elseif { $lang == "rust" } {
> Tom> +	    set re_overflow "Integer literal is too large"
> 
> I don't mind if this is there, but normally the Rust lexer uses unit
> tests.  They are in rust-parse.c, see rust_lex_tests.

Ack, and I did wonder whether it would be a good idea to port the entire 
test-case to the unit tests, since it might help with the test duration 
and enable us to test for all archs again.

For now though, I'm leaving things as they are.

Thanks,
- Tom

[-- Attachment #2: 0001-gdb-rust-Add-missing-_-for-error-call.patch --]
[-- Type: text/x-patch, Size: 786 bytes --]

[gdb/rust] Add missing _() for error call

In commit 1390b65a1b9 ("[gdb/rust] Fix literal truncation") I forgot to add
_() around a string using in an error call.

Fix this by adding the missing _().

Tested on x86_64-linux.

---
 gdb/rust-parse.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gdb/rust-parse.c b/gdb/rust-parse.c
index 836f49108f8..f5eb63ec1e3 100644
--- a/gdb/rust-parse.c
+++ b/gdb/rust-parse.c
@@ -1027,7 +1027,7 @@ rust_parser::lex_number ()
       const char *trailer;
       value = strtoulst (number.c_str () + offset, &trailer, radix);
       if (*trailer != '\0')
-	error ("Integer literal is too large");
+	error (_("Integer literal is too large"));
       if (implicit_i32 && value >= ((uint64_t) 1) << 31)
 	type = get_type ("i64");
 

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2022-06-07  9:29 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-23 11:05 [PATCH 1/8] [gdb/testsuite] Test more values in gdb.base/parse_numbers.exp Tom de Vries
2022-05-23 11:05 ` [PATCH 2/8] [gdb/c] Fix type of 2147483648 and literal truncation Tom de Vries
2022-05-23 11:05 ` [PATCH 3/8] [gdb/fortran] Fix " Tom de Vries
2022-05-23 11:05 ` [PATCH 4/8] [gdb/go] " Tom de Vries
2022-05-23 11:05 ` [PATCH 5/8] [gdb/pascal] " Tom de Vries
2022-05-23 11:05 ` [PATCH 6/8] [gdb/rust] " Tom de Vries
2022-05-26 17:17   ` Tom Tromey
2022-06-07  9:29     ` Tom de Vries
2022-05-23 11:05 ` [PATCH 7/8] [gdb/m2] Fix UB and " Tom de Vries
2022-05-23 11:05 ` [PATCH 8/8] [gdb/ada] Fix " Tom de Vries
2022-06-04 11:20 ` [PATCH 1/8] [gdb/testsuite] Test more values in gdb.base/parse_numbers.exp Tom de Vries

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).