[PATCH 0/4] improve range checking for certain constants on x86

public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed

* [PATCH 0/4] improve range checking for certain constants on x86
@ 2023-05-05 13:00 Jan Beulich
  2023-05-05 13:01 ` [PATCH 1/4] x86: tighten extend-to-32bit-address conditions Jan Beulich
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Jan Beulich @ 2023-05-05 13:00 UTC (permalink / raw)
  To: Binutils; +Cc: H.J. Lu

Unfortunately I haven't found a way to do so without some common code
changes, which in turn required adjustments to a few other targets.

1: x86: tighten extend-to-32bit-address conditions
2: gas: maintain O_constant signedness in more cases
3: gas: invoke md_optimize_expr() also for unary expressions
4: x86: further adjust extend-to-32bit-address conditions

Jan

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 1/4] x86: tighten extend-to-32bit-address conditions
  2023-05-05 13:00 [PATCH 0/4] improve range checking for certain constants on x86 Jan Beulich
@ 2023-05-05 13:01 ` Jan Beulich
  2023-05-05 13:01 ` [PATCH 2/4] gas: maintain O_constant signedness in more cases Jan Beulich
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Jan Beulich @ 2023-05-05 13:01 UTC (permalink / raw)
  To: Binutils; +Cc: H.J. Lu

In a442cac5084e ("ix86: wrap constants") I made the truncation condition
too relaxed: Any indication of a mode that's possible with BFD64 only
should avoid the truncation. Therefore, like in the other two cases of
calls to extend_to_32bit_address(), also check whether we're generating
a 64-bit object.

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -11684,7 +11684,7 @@ i386_finalize_immediate (segT exp_seg AT
 
       /* If not 64bit, sign/zero extend val, to account for wraparound
 	 when !BFD64.  */
-      if (flag_code != CODE_64BIT)
+      if (flag_code != CODE_64BIT && !object_64bit)
 	exp->X_add_number = extend_to_32bit_address (exp->X_add_number);
     }
 #if (defined (OBJ_AOUT) || defined (OBJ_MAYBE_AOUT))
@@ -11976,7 +11976,7 @@ i386_finalize_displacement (segT exp_seg
 
 	 If not 64bit, sign/zero extend val, to account for wraparound
 	 when !BFD64.  */
-      if (flag_code != CODE_64BIT)
+      if (flag_code != CODE_64BIT && !object_64bit)
 	exp->X_add_number = extend_to_32bit_address (exp->X_add_number);
     }
 


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 2/4] gas: maintain O_constant signedness in more cases
  2023-05-05 13:00 [PATCH 0/4] improve range checking for certain constants on x86 Jan Beulich
  2023-05-05 13:01 ` [PATCH 1/4] x86: tighten extend-to-32bit-address conditions Jan Beulich
@ 2023-05-05 13:01 ` Jan Beulich
  2023-05-05 13:03 ` [PATCH 3/4] gas: invoke md_optimize_expr() also for unary expressions Jan Beulich
  2023-05-05 13:04 ` [PATCH 4/4] x86: further adjust extend-to-32bit-address conditions Jan Beulich
  3 siblings, 0 replies; 5+ messages in thread
From: Jan Beulich @ 2023-05-05 13:01 UTC (permalink / raw)
  To: Binutils

Unary '~' doesn't really produce an unsigned result. Neither does
subtraction (unless taking operand values into consideration). And an
abstract operator applied to two operands which aren't both unsigned
can't be assumed to yield an unsigned result; exceptions are
- shifts, where only signedness of the left hand operand matters,
- comparisons, which - unlike unary '!' - produce signed results (they
  deliver 0 or ~0, as opposed to '!', which yields 0 or 1),
- logical operators (yielding 0 or 1 and hence treated like unary '!').

While doing this (specifically while extending the all/quad testcase),
update .quad and .8byte documentation: With 64-bit architectures now
being common, it is highly inappropriate to state that these directives
unconditionally require bignums.
---
Similarly X_extrabit looks to be poorly maintained when folding
expressions ("Optimize common cases."); only binary '+' / '-' (and unary
operators) are updating it. A sub-aspect here is that
subtract_from_result(), when not passed an RHS X_extrabit, is (unlike
add_to_result()) passed 0 instead of whether the RHS value is negative
(there are two such cases). With X_unsigned properly maintained I wonder
though whether we really need X_extrabit.

Considering that octa.d has only two variants of expectations, I've
omitted the 3rd and 4th in quad2.d as well.

As to documentation: .{2,4,8}byte are documented to insert values
unaligned, but cons_worker() doesn't treat them any different than
.long etc. In particular, if defined, md_cons_align() is invoked. Is
this a doc error or a bug?

--- a/gas/doc/as.texi
+++ b/gas/doc/as.texi
@@ -4631,7 +4631,7 @@ Some machine configurations provide addi
 @end ifclear
 * 2byte::                       @code{.2byte @var{expressions}}
 * 4byte::                       @code{.4byte @var{expressions}}
-* 8byte::                       @code{.8byte @var{bignums}}
+* 8byte::                       @code{.8byte @var{expressions}}
 * Deprecated::                  Deprecated Directives
 @end menu
 
@@ -6579,14 +6579,16 @@ as in the @code{.section} (@pxref{Sectio
 @end ifset
 
 @node Quad
-@section @code{.quad @var{bignums}}
+@section @code{.quad @var{expressions}}
 
 @cindex @code{quad} directive
-@code{.quad} expects zero or more bignums, separated by commas.  For
-each bignum, it emits
 @ifclear bignum-16
-an 8-byte integer.  If the bignum won't fit in 8 bytes, it prints a
-warning message; and just takes the lowest order 8 bytes of the bignum.
+For 64-bit architectures, or more generally with any GAS configured to support
+64-bit target virtual addresses, this is like @samp{.int}, but emitting 64-bit
+quantities.  Otherwise @code{.quad} expects zero or more bignums, separated by
+commas.  For each item, it emits an 8-byte integer.  If a bignum won't fit in
+8 bytes, a warning message is printed and just the lowest order 8 bytes of the
+bignum are taken.
 @cindex eight-byte integer
 @cindex integer, 8-byte
 
@@ -6594,8 +6596,11 @@ The term ``quad'' comes from contexts in
 hence @emph{quad}-word for 8 bytes.
 @end ifclear
 @ifset bignum-16
-a 16-byte integer.  If the bignum won't fit in 16 bytes, it prints a
-warning message; and just takes the lowest order 16 bytes of the bignum.
+@code{.quad} expects zero or more bignums, separated by commas.  For
+each bignum, it emits a 16-byte integer.  If the bignum won't fit in 16
+bytes, it prints a warning message; and just takes the lowest order 16
+bytes of the bignum.
+@xref{Octa,,@code{.octa @var{bignums}}}.
 @cindex sixteen-byte integer
 @cindex integer, 16-byte
 @end ifset
@@ -7693,8 +7698,11 @@ long values into the output.
 @cindex eight-byte integer
 @cindex integer, 8-byte
 
-Like the @option{.2byte} directive, except that it inserts unaligned, eight
-byte long bignum values into the output.
+For 64-bit architectures, or more generally with any GAS configured to support
+64-bit target virtual addresses, this is like the @option{.2byte} directive,
+except that it inserts unaligned, eight byte long values into the output.
+Otherwise, like @ref{Quad,,@code{.quad @var{expressions}}}, it expects zero or
+more bignums, separated by commas.
 
 @node Deprecated
 @section Deprecated Directives
--- a/gas/expr.c
+++ b/gas/expr.c
@@ -1056,6 +1056,7 @@ operand (expressionS *expressionP, enum
 	      {
 		expressionP->X_add_number = ~ expressionP->X_add_number;
 		expressionP->X_extrabit ^= 1;
+		expressionP->X_unsigned = 0;
 	      }
 	    else if (c == '!')
 	      {
@@ -1816,6 +1817,7 @@ expr (int rankarg,		/* Larger # is highe
   while (op_left != O_illegal && op_rank[(int) op_left] > rank)
     {
       segT rightseg;
+      bool is_unsigned;
       offsetT frag_off;
 
       input_line_pointer += op_chars;	/* -> after operator.  */
@@ -1883,6 +1885,8 @@ expr (int rankarg,		/* Larger # is highe
 	  right.X_op_symbol = NULL;
 	}
 
+      is_unsigned = resultP->X_unsigned && right.X_unsigned;
+
       if (mode == expr_defer
 	  && ((resultP->X_add_symbol != NULL
 	       && S_IS_FORWARD_REF (resultP->X_add_symbol))
@@ -1895,7 +1899,7 @@ expr (int rankarg,		/* Larger # is highe
       if (md_optimize_expr (resultP, op_left, &right))
 	{
 	  /* Skip.  */
-	  ;
+	  is_unsigned = resultP->X_unsigned;
 	}
       else
 #endif
@@ -1928,12 +1932,14 @@ expr (int rankarg,		/* Larger # is highe
 	  add_to_result (resultP, symval_diff, symval_diff < 0);
 	  resultP->X_op = O_constant;
 	  resultP->X_add_symbol = 0;
+	  is_unsigned = false;
 	}
       else if (op_left == O_subtract && right.X_op == O_constant
 	       && (md_register_arithmetic || resultP->X_op != O_register))
 	{
 	  /* X - constant.  */
 	  subtract_from_result (resultP, right.X_add_number, right.X_extrabit);
+	  is_unsigned = false;
 	}
       else if (op_left == O_add && resultP->X_op == O_constant
 	       && (md_register_arithmetic || right.X_op != O_register))
@@ -1988,6 +1994,7 @@ expr (int rankarg,		/* Larger # is highe
 	      else
 		resultP->X_add_number
 		  = (valueT) resultP->X_add_number >> (valueT) v;
+	      is_unsigned = resultP->X_unsigned;
 	      break;
 	    case O_bit_inclusive_or:	resultP->X_add_number |= v; break;
 	    case O_bit_or_not:		resultP->X_add_number |= ~v; break;
@@ -1998,36 +2005,45 @@ expr (int rankarg,		/* Larger # is highe
 		 here.  */
 	    case O_subtract:
 	      subtract_from_result (resultP, v, 0);
+	      is_unsigned = false;
 	      break;
 	    case O_eq:
 	      resultP->X_add_number =
 		resultP->X_add_number == v ? ~ (offsetT) 0 : 0;
+	      is_unsigned = false;
 	      break;
 	    case O_ne:
 	      resultP->X_add_number =
 		resultP->X_add_number != v ? ~ (offsetT) 0 : 0;
+	      is_unsigned = false;
 	      break;
 	    case O_lt:
 	      resultP->X_add_number =
 		resultP->X_add_number <  v ? ~ (offsetT) 0 : 0;
+	      is_unsigned = false;
 	      break;
 	    case O_le:
 	      resultP->X_add_number =
 		resultP->X_add_number <= v ? ~ (offsetT) 0 : 0;
+	      is_unsigned = false;
 	      break;
 	    case O_ge:
 	      resultP->X_add_number =
 		resultP->X_add_number >= v ? ~ (offsetT) 0 : 0;
+	      is_unsigned = false;
 	      break;
 	    case O_gt:
 	      resultP->X_add_number =
 		resultP->X_add_number >  v ? ~ (offsetT) 0 : 0;
+	      is_unsigned = false;
 	      break;
 	    case O_logical_and:
 	      resultP->X_add_number = resultP->X_add_number && v;
+	      is_unsigned = true;
 	      break;
 	    case O_logical_or:
 	      resultP->X_add_number = resultP->X_add_number || v;
+	      is_unsigned = true;
 	      break;
 	    }
 	}
@@ -2065,10 +2081,11 @@ expr (int rankarg,		/* Larger # is highe
 	  resultP->X_op_symbol = make_expr_symbol (&right);
 	  resultP->X_op = op_left;
 	  resultP->X_add_number = 0;
-	  resultP->X_unsigned = 1;
 	  resultP->X_extrabit = 0;
 	}
 
+      resultP->X_unsigned = is_unsigned;
+
       if (retval != rightseg)
 	{
 	  if (retval == undefined_section)
--- a/gas/testsuite/gas/all/gas.exp
+++ b/gas/testsuite/gas/all/gas.exp
@@ -427,6 +427,11 @@ if { ![istarget "tic4x*-*-*"] && ![istar
 
 run_dump_test quad
 
+# ~ isn't an operator on PDP-11
+if { ![istarget "pdp11-*-*"] } {
+    run_dump_test quad2
+}
+
 # poor little PDP-11 can't handle 16-byte values
 if { ![istarget "pdp11-*-*"] } {
     run_dump_test octa
--- a/gas/testsuite/gas/all/octa.d
+++ b/gas/testsuite/gas/all/octa.d
@@ -6,3 +6,11 @@
 Contents of section .data:
  [^ ]* (ffff3344 55667788 99aabbcc ddeeffff|ffffeedd ccbbaa99 88776655 4433ffff) .*
  [^ ]* (00003444 55667788 99aabbcc ddeeffff|ffffeedd ccbbaa99 88776655 44340000) .*
+ [^ ]* (00000080 00000000 00000000 00000000|00000000 00000000 00000000 80000000) .*
+ [^ ]* (ffffffff 00000000 00000000 00000000|00000000 00000000 00000000 ffffffff) .*
+ [^ ]* (00000080 ffffffff ffffffff ffffffff|ffffffff ffffffff ffffffff 80000000) .*
+ [^ ]* (01000000 ffffffff ffffffff ffffffff|ffffffff ffffffff ffffffff 00000001) .*
+ [^ ]* (ffffff7f ffffffff ffffffff ffffffff|ffffffff ffffffff ffffffff 7fffffff) .*
+ [^ ]* (00000000 ffffffff ffffffff ffffffff|ffffffff ffffffff ffffffff 00000000) .*
+ [^ ]* (00000080 ffffffff ffffffff ffffffff|ffffffff ffffffff ffffffff 80000000) .*
+ [^ ]* (01000000 ffffffff ffffffff ffffffff|ffffffff ffffffff ffffffff 00000001) .*
--- a/gas/testsuite/gas/all/octa.s
+++ b/gas/testsuite/gas/all/octa.s
@@ -1,3 +1,11 @@
  .data
  .octa ~0x112233445566778899aabbcc0000
  .octa -347510587133311339321256747728896
+ .octa 0x80000000
+ .octa 0xffffffff
+ .octa -0x80000000
+ .octa -0xffffffff
+ .octa ~0x80000000
+ .octa ~0xffffffff
+ .octa 0 - 0x80000000
+ .octa 0 - 0xffffffff
--- a/gas/testsuite/gas/all/quad.d
+++ b/gas/testsuite/gas/all/quad.d
@@ -8,5 +8,6 @@ Contents of section (\.data|\$DATA\$):
  00.. (00000000 87654321 00000000 ffffffff|21436587 00000000 ffffffff 00000000|00000000 65872143 00000000 ffffffff|43218765 00000000 ffffffff 00000000) .*
  00.. (ffffffff 89abcdf0 ffffffff 80000000|f0cdab89 ffffffff 00000080 ffffffff|ffffffff ab89f0cd ffffffff 00800000|cdf089ab ffffffff 00008000 ffffffff) .*
  00.. (ffffffff 789abcdf ffffffff 00000001|dfbc9a78 ffffffff 01000000 ffffffff|ffffffff 9a78dfbc ffffffff 00000100|bcdf789a ffffffff 00010000 ffffffff) .*
+ 00.. (ffffffff 80000000 ffffffff 00000001|00000080 ffffffff 01000000 ffffffff|ffffffff 00800000 ffffffff 00000100|00008000 ffffffff 00010000 ffffffff) .*
  00.. (01234567 89abcdef fedcba98 76543211|efcdab89 67452301 11325476 98badcfe|23016745 ab89efcd dcfe98ba 54761132|cdef89ab 45670123 32117654 ba98fedc) .*
 #pass
--- a/gas/testsuite/gas/all/quad.s
+++ b/gas/testsuite/gas/all/quad.s
@@ -8,5 +8,8 @@
 	.quad	-0x87654321
 	.quad	-0xffffffff
 
+	.quad	0 - 0x80000000
+	.quad	0 - 0xffffffff
+
 	.quad	0x123456789abcdef
 	.quad	-0x123456789abcdef
--- /dev/null
+++ b/gas/testsuite/gas/all/quad2.d
@@ -0,0 +1,8 @@
+#objdump : -s -j .data -j "\$DATA\$"
+#name : .quad binary-not tests
+
+.*: .*
+
+Contents of section (\.data|\$DATA\$):
+ 0000 (ffffffff 7fffffff ffffffff 00000000|ffffff7f ffffffff 00000000 ffffffff) .*
+#pass
--- /dev/null
+++ b/gas/testsuite/gas/all/quad2.s
@@ -0,0 +1,3 @@
+	.data
+	.quad	~0x80000000
+	.quad	~0xffffffff


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 3/4] gas: invoke md_optimize_expr() also for unary expressions
  2023-05-05 13:00 [PATCH 0/4] improve range checking for certain constants on x86 Jan Beulich
  2023-05-05 13:01 ` [PATCH 1/4] x86: tighten extend-to-32bit-address conditions Jan Beulich
  2023-05-05 13:01 ` [PATCH 2/4] gas: maintain O_constant signedness in more cases Jan Beulich
@ 2023-05-05 13:03 ` Jan Beulich
  2023-05-05 13:04 ` [PATCH 4/4] x86: further adjust extend-to-32bit-address conditions Jan Beulich
  3 siblings, 0 replies; 5+ messages in thread
From: Jan Beulich @ 2023-05-05 13:03 UTC (permalink / raw)
  To: Binutils; +Cc: Alan Modra, Peter Bergner, Geoff Keating, Dave Anglin

Give backends a chance to see these, just as they can see binary ones.
Most of those which use this hook already cope with NULL being passed
for the left operand (typically because of checking the operator first).
Adjust the two which don't.

Take the opportunity and also document the hook.

--- a/gas/config/tc-hppa.c
+++ b/gas/config/tc-hppa.c
@@ -6354,6 +6354,7 @@ hppa_force_reg_syms_absolute (expression
 			      expressionS *rightP)
 {
   if (fudge_reg_expressions
+      && resultP
       && rightP->X_op == O_register
       && resultP->X_op == O_register)
     {
--- a/gas/config/tc-ppc.c
+++ b/gas/config/tc-ppc.c
@@ -945,9 +945,9 @@ ppc_optimize_expr (expressionS *left, op
     }
 
   /* Accept the above plus <cr bit>, and <cr bit> plus the above.  */
-  if (right->X_op == O_register
+  if (op == O_add
       && left->X_op == O_register
-      && op == O_add
+      && right->X_op == O_register
       && ((right->X_md == PPC_OPERAND_CR_BIT
 	   && left->X_md == (PPC_OPERAND_CR_REG | PPC_OPERAND_CR_BIT))
 	  || (right->X_md == (PPC_OPERAND_CR_REG | PPC_OPERAND_CR_BIT)
@@ -959,7 +959,7 @@ ppc_optimize_expr (expressionS *left, op
     }
 
   /* Accept reg +/- constant.  */
-  if (left->X_op == O_register
+  if (left && left->X_op == O_register
       && !((op == O_add || op == O_subtract) && right->X_op == O_constant))
     as_warn (_("invalid register expression"));
 
--- a/gas/doc/internals.texi
+++ b/gas/doc/internals.texi
@@ -1047,6 +1047,14 @@ pointer, for any expression that can not
 is called, @code{input_line_pointer} will point to the start of the
 expression.
 
+@item md_optimize_expr
+@cindex md_optimize_expr
+GAS will call this function before trying to carry out certain operations,
+like the adding of two constants.  The function is passed the left-hand
+operand, an @code{expressionS} pointer, the operator, an @code{operatorT}
+value, and the right-hand operand, again an @code{expressionS} pointer.  For
+unary expressions NULL is passed as first argument.
+
 @item md_register_arithmetic
 @cindex md_register_arithmetic
 If this macro is defined and evaluates to zero then GAS will not fold
--- a/gas/expr.c
+++ b/gas/expr.c
@@ -764,6 +764,7 @@ operand (expressionS *expressionP, enum
   symbolS *symbolP;	/* Points to symbol.  */
   char *name;		/* Points to name of symbol.  */
   segT segment;
+  operatorT op = O_absent; /* For unary operators.  */
 
   /* All integers are regarded as unsigned unless they are negated.
      This is because the only thing which cares whether a number is
@@ -1029,19 +1030,33 @@ operand (expressionS *expressionP, enum
       /* '~' is permitted to start a label on the Delta.  */
       if (is_name_beginner (c))
 	goto isname;
-      /* Fall through.  */
+      op = O_bit_not;
+      goto unary;
+
     case '!':
+      op = O_logical_not;
+      goto unary;
+
     case '-':
+      op = O_uminus;
+      /* Fall through.  */
     case '+':
       {
-#ifdef md_operator
       unary:
-#endif
 	operand (expressionP, mode);
+
+#ifdef md_optimize_expr
+	if (md_optimize_expr (NULL, op, expressionP))
+	{
+	  /* Skip.  */
+	  ;
+	}
+	else
+#endif
 	if (expressionP->X_op == O_constant)
 	  {
 	    /* input_line_pointer -> char after operand.  */
-	    if (c == '-')
+	    if (op == O_uminus)
 	      {
 		expressionP->X_add_number
 		  = - (addressT) expressionP->X_add_number;
@@ -1052,13 +1067,13 @@ operand (expressionS *expressionP, enum
 		if (expressionP->X_add_number)
 		  expressionP->X_extrabit ^= 1;
 	      }
-	    else if (c == '~' || c == '"')
+	    else if (op == O_bit_not)
 	      {
 		expressionP->X_add_number = ~ expressionP->X_add_number;
 		expressionP->X_extrabit ^= 1;
 		expressionP->X_unsigned = 0;
 	      }
-	    else if (c == '!')
+	    else if (op == O_logical_not)
 	      {
 		expressionP->X_add_number = ! expressionP->X_add_number;
 		expressionP->X_unsigned = 1;
@@ -1067,7 +1082,7 @@ operand (expressionS *expressionP, enum
 	  }
 	else if (expressionP->X_op == O_big
 		 && expressionP->X_add_number <= 0
-		 && c == '-'
+		 && op == O_uminus
 		 && (generic_floating_point_number.sign == '+'
 		     || generic_floating_point_number.sign == 'P'))
 	  {
@@ -1082,7 +1097,7 @@ operand (expressionS *expressionP, enum
 	  {
 	    int i;
 
-	    if (c == '~' || c == '-')
+	    if (op == O_uminus || op == O_bit_not)
 	      {
 		for (i = 0; i < expressionP->X_add_number; ++i)
 		  generic_bignum[i] = ~generic_bignum[i];
@@ -1095,7 +1110,7 @@ operand (expressionS *expressionP, enum
 		      generic_bignum[i] = ~(LITTLENUM_TYPE) 0;
 		  }
 
-		if (c == '-')
+		if (op == O_uminus)
 		  for (i = 0; i < expressionP->X_add_number; ++i)
 		    {
 		      generic_bignum[i] += 1;
@@ -1103,7 +1118,7 @@ operand (expressionS *expressionP, enum
 			break;
 		    }
 	      }
-	    else if (c == '!')
+	    else if (op == O_logical_not)
 	      {
 		for (i = 0; i < expressionP->X_add_number; ++i)
 		  if (generic_bignum[i] != 0)
@@ -1117,15 +1132,10 @@ operand (expressionS *expressionP, enum
 	else if (expressionP->X_op != O_illegal
 		 && expressionP->X_op != O_absent)
 	  {
-	    if (c != '+')
+	    if (op != O_absent)
 	      {
 		expressionP->X_add_symbol = make_expr_symbol (expressionP);
-		if (c == '-')
-		  expressionP->X_op = O_uminus;
-		else if (c == '~' || c == '"')
-		  expressionP->X_op = O_bit_not;
-		else
-		  expressionP->X_op = O_logical_not;
+		expressionP->X_op = op;
 		expressionP->X_add_number = 0;
 	      }
 	    else if (!md_register_arithmetic && expressionP->X_op == O_register)
@@ -1288,8 +1298,7 @@ operand (expressionS *expressionP, enum
 
 #ifdef md_operator
 	  {
-	    operatorT op = md_operator (name, 1, &c);
-
+	    op = md_operator (name, 1, &c);
 	    switch (op)
 	      {
 	      case O_uminus:


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 4/4] x86: further adjust extend-to-32bit-address conditions
  2023-05-05 13:00 [PATCH 0/4] improve range checking for certain constants on x86 Jan Beulich
                   ` (2 preceding siblings ...)
  2023-05-05 13:03 ` [PATCH 3/4] gas: invoke md_optimize_expr() also for unary expressions Jan Beulich
@ 2023-05-05 13:04 ` Jan Beulich
  3 siblings, 0 replies; 5+ messages in thread
From: Jan Beulich @ 2023-05-05 13:04 UTC (permalink / raw)
  To: Binutils; +Cc: H.J. Lu

While a442cac5084e ("ix86: wrap constants") helped address a number of
inconsistencies between BFD64 and !BFD64 builds, it has also resulted in
certain bogus uses of constants to no longer be warned about. Leverage
the md_optimize_expr() hook to adjust when to actually truncate
expressions to 32 bits - any involvement of binary expressions (which
would be evaluated in 32 bits only when !BFD64) signals the need for
doing so. Plain constants (or ones merely subject to unary operators)
should remain un-truncated - they would be handled as bignums when
!BFD64, and hence are okay to permit.

To compensate
- slightly extend optimize_imm() (to be honest I never understood why
  the code being added - or something similar - wasn't there in the
  first place),
- adjust expectations of the disp-imm-32 testcase (there are now
  warnings, as there should be for any code which won't build [warning-
  free] when !BFD64, and Disp8/Imm8 are no longer used in the warned
  about cases).
---
The no-longer-emitting-Disp8/-Imm8 could probably be avoided, by
fiddling with optimize_{disp,imm}(). If that's wanted, I think it would
still want to be in a separate patch.

What still differs between BFD64 vs !BFD64 are cases like
	.byte ~0xffffffffffffff00
	.word ~0xffffffffffff0000
	.long ~0xffffffff00000000
	.quad ~0xffffffff00000000
where the latter warns about bignum truncation. The draft patch I have
for this unfortunately causes quite a bit of testsuite fallout, so I'll
have to do more investigation there.

Along these lines
	.long 0xffffffff+2
	.long 0xffffffff*2
differ in diagnostics (and I consider it ambiguous whether a warning is
appropriate here with BFD64), while
	.quad 0xffffffff+2
	.quad 0xffffffff*2
are uniformly quiet but emit different data.

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -827,6 +827,14 @@ i386_cpu_flags cpu_arch_isa_flags;
    larger than a byte offset.  */
 static bool no_cond_jump_promotion = false;
 
+/* This will be set from an expression parser hook if there's any
+   applicable operator involved in an expression.  */
+static enum {
+  expr_operator_none,
+  expr_operator_present,
+  expr_large_value,
+} expr_mode;
+
 /* Encode SSE instructions with VEX prefix.  */
 static unsigned int sse2avx;
 
@@ -6016,6 +6024,8 @@ optimize_imm (void)
     }
   else if ((flag_code == CODE_16BIT) ^ (i.prefix[DATA_PREFIX] != 0))
     guess_suffix = WORD_MNEM_SUFFIX;
+  else if (flag_code != CODE_64BIT || !(i.prefix[REX_PREFIX] & REX_W))
+    guess_suffix = LONG_MNEM_SUFFIX;
 
   for (op = i.operands; --op >= 0;)
     if (operand_type_check (i.types[op], imm))
@@ -10508,6 +10518,7 @@ x86_cons (expressionS *exp, int size)
 
   intel_syntax = -intel_syntax;
   exp->X_md = 0;
+  expr_mode = expr_operator_none;
 
 #if ((defined (OBJ_ELF) || defined (OBJ_MAYBE_ELF)) \
       && !defined (LEX_AT)) \
@@ -10567,7 +10578,8 @@ x86_cons (expressionS *exp, int size)
     i386_intel_simplify (exp);
 
   /* If not 64bit, massage value, to account for wraparound when !BFD64.  */
-  if (size == 4 && exp->X_op == O_constant && !object_64bit)
+  if (size <= 4 && expr_mode == expr_operator_present
+      && exp->X_op == O_constant && !object_64bit)
     exp->X_add_number = extend_to_32bit_address (exp->X_add_number);
 
   return got_reloc;
@@ -11626,6 +11638,7 @@ i386_immediate (char *imm_start)
   if (gotfree_input_line)
     input_line_pointer = gotfree_input_line;
 
+  expr_mode = expr_operator_none;
   exp_seg = expression (exp);
 
   /* For .insn immediates there may be a size specifier.  */
@@ -11684,7 +11697,8 @@ i386_finalize_immediate (segT exp_seg AT
 
       /* If not 64bit, sign/zero extend val, to account for wraparound
 	 when !BFD64.  */
-      if (flag_code != CODE_64BIT && !object_64bit)
+      if (expr_mode == expr_operator_present
+	  && flag_code != CODE_64BIT && !object_64bit)
 	exp->X_add_number = extend_to_32bit_address (exp->X_add_number);
     }
 #if (defined (OBJ_AOUT) || defined (OBJ_MAYBE_AOUT))
@@ -11906,6 +11920,7 @@ i386_displacement (char *disp_start, cha
   if (gotfree_input_line)
     input_line_pointer = gotfree_input_line;
 
+  expr_mode = expr_operator_none;
   exp_seg = expression (exp);
 
   SKIP_WHITESPACE ();
@@ -11976,7 +11991,8 @@ i386_finalize_displacement (segT exp_seg
 
 	 If not 64bit, sign/zero extend val, to account for wraparound
 	 when !BFD64.  */
-      if (flag_code != CODE_64BIT && !object_64bit)
+      if (expr_mode == expr_operator_present
+	  && flag_code != CODE_64BIT && !object_64bit)
 	exp->X_add_number = extend_to_32bit_address (exp->X_add_number);
     }
 
@@ -13928,6 +13944,41 @@ md_operand (expressionS *e)
     }
 }
 
+#ifdef BFD64
+/* To maintain consistency with !BFD64 builds of gas record, whether any
+   (binary) operator was involved in an expression.  As expressions are
+   evaluated in only 32 bits when !BFD64, we use this to decide whether to
+   truncate results.  */
+bool i386_record_operator (operatorT op,
+			   const expressionS *left,
+			   const expressionS *right)
+{
+  if (op == O_absent)
+    return false;
+
+  if (!left)
+    {
+      /* Since the expression parser applies unary operators fine to bignum
+	 operands, we don't need to be concerned of respective operands not
+	 fitting in 32 bits.  */
+      if (right->X_op == O_constant && right->X_unsigned
+	  && !fits_in_unsigned_long (right->X_add_number))
+	return false;
+    }
+  /* This isn't entirely right: The pattern can also result when constant
+     expressions are folded (e.g. 0xffffffff + 1).  */
+  else if ((left->X_op == O_constant && left->X_unsigned
+	    && !fits_in_unsigned_long (left->X_add_number))
+	   || (right->X_op == O_constant && right->X_unsigned
+	       && !fits_in_unsigned_long (right->X_add_number)))
+    expr_mode = expr_large_value;
+
+  if (expr_mode != expr_large_value)
+    expr_mode = expr_operator_present;
+
+  return false;
+}
+#endif
 \f
 #if defined (OBJ_ELF) || defined (OBJ_MAYBE_ELF)
 const char *md_shortopts = "kVQ:sqnO::";
--- a/gas/config/tc-i386-intel.c
+++ b/gas/config/tc-i386-intel.c
@@ -633,6 +633,7 @@ i386_intel_operand (char *operand_string
   input_line_pointer = buf = xstrdup (operand_string);
 
   intel_syntax = -1;
+  expr_mode = expr_operator_none;
   memset (&exp, 0, sizeof(exp));
   exp_seg = expression (&exp);
   ret = i386_intel_simplify (&exp);
--- a/gas/config/tc-i386.h
+++ b/gas/config/tc-i386.h
@@ -188,6 +188,12 @@ extern operatorT i386_operator (const ch
 extern int i386_need_index_operator (void);
 #define md_need_index_operator i386_need_index_operator
 
+#ifdef BFD64
+extern bool i386_record_operator
+  (operatorT, const expressionS *, const expressionS *);
+#define md_optimize_expr(l, o, r) i386_record_operator (o, l, r)
+#endif
+
 #define md_register_arithmetic 0
 
 extern const struct relax_type md_relax_table[];
--- /dev/null
+++ b/gas/testsuite/gas/i386/cst-diag.l
@@ -0,0 +1,43 @@
+.*: Assembler messages:
+.*:3: Warning: .*
+.*:4: Warning: .*
+.*:5: (Error|Warning): .*
+.*:6: (Error|Warning): .*
+.*:8: Warning: .*
+.*:9: Warning: .*
+.*:10: (Error|Warning): .*
+.*:11: (Error|Warning): .*
+.*:13: Warning: .*
+.*:14: Warning: .*
+.*:15: (Error|Warning): .*
+.*:16: (Error|Warning): .*
+.*:18: Warning: .*
+.*:19: (Error|Warning): .*
+.*:20: (Error|Warning): .*
+.*:22: (Error|Warning): .*
+.*:23: (Error|Warning): .*
+.*:24: (Error|Warning): .*
+.*:25: (Error|Warning): .*
+.*:30: Warning: .*
+.*:31: Warning: .*
+.*:35: Warning: .*
+.*:36: Warning: .*
+.*:40: Warning: .*
+.*:41: Warning: .*
+.*:46: Warning: .*
+.*:47: Warning: .*
+.*:48: Warning: .*
+.*:50: Warning: .*
+.*:51: Warning: .*
+.*:52: Warning: .*
+.*:55: Warning: .*
+.*:56: Warning: .*
+.*:57: Warning: .*
+.*:59: Warning: .*
+.*:60: Warning: .*
+.*:61: Warning: .*
+.*:64: Warning: .*
+.*:65: Warning: .*
+.*:66: Warning: .*
+GAS LISTING .*
+#pass
--- /dev/null
+++ b/gas/testsuite/gas/i386/cst-diag.s
@@ -0,0 +1,79 @@
+	.text
+const:
+	add	$0x101, %cl
+	add	$0x10001, %cx
+	add	$0x100000001, %ecx
+	add	0x100000001, %ecx
+
+	add	$0x100, %cl
+	add	$0x10000, %cx
+	add	$0x100000000, %ecx
+	add	0x100000000, %ecx
+
+	add	$-0x101, %cl
+	add	$-0x10001, %cx
+	add	$-0x100000001, %ecx
+	add	-0x100000001, %ecx
+
+	add	$-0x100, %cl
+	add	$-0x10000, %cx
+	add	$-0x100000000, %ecx
+
+	add	$0xffffffffffffff00, %cl
+	add	$0xffffffffffff0000, %cx
+	add	$0xffffffff00000000, %ecx
+	add	0xffffffff00000000, %ecx
+
+	# The next two might as well not have a disagnostic issued, but if
+	# there is one (as is the case now), then it should be independent
+	# of BFD64.
+	and	$~0xff, %cl
+	and	$~0xffff, %cx
+	and	$~0xffffffff, %ecx
+	and	~0xffffffff, %ecx
+
+	and	$0xff+2, %cl
+	and	$0xffff+2, %cx
+	and	$0xffffffff+2, %ecx
+	and	0xffffffff+2, %ecx
+
+	and	$0xff*2, %cl
+	and	$0xffff*2, %cx
+	and	$0xffffffff*2, %ecx
+	and	0xffffffff*2, %ecx
+
+	.data
+	.byte 0x101
+	.byte -0x100
+	.byte 0xffffffffffffff00
+#	.byte ~0xffffffffffffff00
+	.byte ~0xff
+	.byte 0xff+2
+	.byte 0xff*2
+
+	.p2align 4
+	.word 0x10001
+	.word -0x10000
+	.word 0xffffffffffff0000
+#	.word ~0xffffffffffff0000
+	.word ~0xffff
+	.word 0xffff+2
+	.word 0xffff*2
+
+	.p2align 4
+	.long 0x100000001
+	.long -0x100000000
+	.long 0xffffffff00000000
+#	.long ~0xffffffff00000000
+	.long ~0xffffffff
+#	.long 0xffffffff+2
+#	.long 0xffffffff*2
+
+	.p2align 4
+	.quad 0x100000001
+	.quad -0x100000000
+	.quad 0xffffffff00000000
+#	.quad ~0xffffffff00000000
+	.quad ~0xffffffff
+#	.quad 0xffffffff+2
+#	.quad 0xffffffff*2
--- a/gas/testsuite/gas/i386/disp-imm-32.d
+++ b/gas/testsuite/gas/i386/disp-imm-32.d
@@ -1,5 +1,6 @@
 #objdump: -dw
 #name: i386 displacements / immediates (32-bit)
+#warning_output: disp-imm-32.e
 
 .*: +file format .*
 
@@ -15,7 +16,7 @@ Disassembly of section .text:
 [ 	]*[a-f0-9]+:	8b 40 ff             	mov    -0x1\(%eax\),%eax
 [ 	]*[a-f0-9]+:	62 f1 7c 48 28 40 ff 	vmovaps -0x40\(%eax\),%zmm0
 [ 	]*[a-f0-9]+:	83 c1 ff             	add    \$0xffffffff,%ecx
-[ 	]*[a-f0-9]+:	8b 40 01             	mov    0x1\(%eax\),%eax
-[ 	]*[a-f0-9]+:	62 f1 7c 48 28 40 01 	vmovaps 0x40\(%eax\),%zmm0
-[ 	]*[a-f0-9]+:	83 c1 01             	add    \$0x1,%ecx
+[ 	]*[a-f0-9]+:	8b (40 01 +|80 01 00 00 00)    	mov    0x1\(%eax\),%eax
+[ 	]*[a-f0-9]+:	62 f1 7c 48 28 (40 01|80 40 00 00 00) 	vmovaps 0x40\(%eax\),%zmm0
+[ 	]*[a-f0-9]+:	(83 c1 01 +|81 c1 01 00 00 00)    	add    \$0x1,%ecx
 #pass
--- /dev/null
+++ b/gas/testsuite/gas/i386/disp-imm-32.e
@@ -0,0 +1,4 @@
+.*: Assembler messages:
+.*:15: Warning: .* shortened to 0x1
+.*:16: Warning: .* shortened to 0x40
+.*:17: Warning: .* shortened to 0x1
--- a/gas/testsuite/gas/i386/i386.exp
+++ b/gas/testsuite/gas/i386/i386.exp
@@ -100,6 +100,7 @@ if [gas_32_check] then {
     run_dump_test "suffix-intel"
     run_list_test "suffix-bad"
     run_dump_test "immed32"
+    run_list_test "cst-diag" "-al"
     run_dump_test "equ"
     run_list_test "equ-2" "-al"
     run_list_test "equ-bad"


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-05-05 13:04 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-05 13:00 [PATCH 0/4] improve range checking for certain constants on x86 Jan Beulich
2023-05-05 13:01 ` [PATCH 1/4] x86: tighten extend-to-32bit-address conditions Jan Beulich
2023-05-05 13:01 ` [PATCH 2/4] gas: maintain O_constant signedness in more cases Jan Beulich
2023-05-05 13:03 ` [PATCH 3/4] gas: invoke md_optimize_expr() also for unary expressions Jan Beulich
2023-05-05 13:04 ` [PATCH 4/4] x86: further adjust extend-to-32bit-address conditions Jan Beulich

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).