public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH, i386]: Rewrite LEA handling (was:Re: PATCH [10/n] X32: Support x32 LEA insns)
@ 2011-07-25  2:04 Uros Bizjak
  2011-07-25  3:23 ` H.J. Lu
  0 siblings, 1 reply; 10+ messages in thread
From: Uros Bizjak @ 2011-07-25  2:04 UTC (permalink / raw)
  To: H.J. Lu; +Cc: gcc-patches, Richard Henderson, jh

[-- Attachment #1: Type: text/plain, Size: 2961 bytes --]

On Sat, Jul 23, 2011 at 3:57 PM, H.J. Lu <hjl.tools@gmail.com> wrote:

>>> This patch adds x32 LEA insn support.  The main issue is
>>>
>>> gen_lowpart (Pmode, operands[1]);
>>>
>>> doesn't work on symbol.  This patch avoids it.
>>>
>>> Also we shouldn't generate 32bit store with x32 PIC source.
>>>
>>> Any comments?

You are not fixing the core of the problem... this is why you need so
much hacks and kludges at various places (some w.r.t. -fPIC already
existed, see the patch). Above, you correctly identified the problem,
so let's avoid gen_lowpart on SImode operands by not calling it
anymore.

Attached patch effectively rewrites LEA handling. The trick is, that
instead of using Pmode operations in addresses, we use either SImode
or DImode operations to calculate the address on 64bit targets. Up to
now, address calculations strictly used Pmode, so SImode on 32bit
targets and DImode on 64bit targets. Recent patches to
ix86_decompose_address and ix86_legitimate_address_p relaxed this
requirement.

Attached patch changes LEA patterns and LEA splitters to accept
addresses, calculated with either SImode or DImode operations.This
means, that on x64 targets, we don't use gen_lowpart on SImode
operands anymore. Since symbol references on x32 are in SImode, this
solves the problem. The patch also avoids generating SImode subregs of
DImode addresses and DImode zero_extends of SImode addresses, since
LEA insn does this for us automatically.

Please also note the change to ix86_print_operand_address. To avoid
addr32 prefixes, we can force registers in DImode on 64bit targets
without any problems. On x32, we can investigate, if this change
avoids unnecessary LEAs (for PR 49781, patched gcc genrates 6 vs. 8).
Also, we can investigate the effect of addr32 on benchmarks.

Patched gcc also fixes all testcases from PR 47381.

2011-07-24  Uros Bizjak  <ubizjak@gmail.com>

	PR target/47381
	* config/i386/i386.md (*lea_1): Use SWI48 mode iterator.
	(*lea_1_zext): New insn pattern.
	(add->lea splitter): Check operand modes in insn constraint.  Extend
	operands less than SImode wide to SImode.
	(add->lea zext splitter): Do not extend operands to DImode.
	(*lea_general_1): Handle only QImode and HImode operands.
	(*lea_general_2): Ditto.
	(*lea_general_3): Ditto.
	(*lea_general_1_zext): Remove.
	(*lea_general_2_zext): Ditto.
	(*lea_general_3_zext): Ditto.
	(*lea_general_4): Check operand modes in insn constraint.  Extend
	operands less than SImode wide to SImode.
	(ashift->lea splitter): Ditto.
	* config/i386/i386.md (ix86_print_operand_address): Print address
	registers with 'q' modifier on 64bit targets.

Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu
{,-m32} with no regressions. H.J., can you please test it on x32?

BTW: -fPIC is not yet implemented on trunk and still fails there with
an (unrelated) error, I didn't check x32 branch.

Uros.

[-- Attachment #2: p.diff.txt --]
[-- Type: text/plain, Size: 16295 bytes --]

Index: i386.md
===================================================================
--- i386.md	(revision 176713)
+++ i386.md	(working copy)
@@ -5425,13 +5425,22 @@
    (set_attr "mode" "QI")])
 
 (define_insn "*lea_1"
-  [(set (match_operand:P 0 "register_operand" "=r")
-	(match_operand:P 1 "no_seg_address_operand" "p"))]
+  [(set (match_operand:SWI48 0 "register_operand" "=r")
+	(match_operand:SWI48 1 "no_seg_address_operand" "p"))]
   ""
   "lea{<imodesuffix>}\t{%a1, %0|%0, %a1}"
   [(set_attr "type" "lea")
    (set_attr "mode" "<MODE>")])
 
+(define_insn "*lea_1_zext"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(zero_extend:DI
+	  (match_operand:SI 1 "no_seg_address_operand" "p")))]
+  "TARGET_64BIT"
+  "lea{l}\t{%a1, %k0|%k0, %a1}"
+  [(set_attr "type" "lea")
+   (set_attr "mode" "SI")])
+
 (define_insn "*lea_2"
   [(set (match_operand:SI 0 "register_operand" "=r")
 	(subreg:SI (match_operand:DI 1 "no_seg_address_operand" "p") 0))]
@@ -5794,39 +5803,36 @@
         (const_string "none")))
    (set_attr "mode" "QI")])
 
-;; Convert lea to the lea pattern to avoid flags dependency.
+;; Convert add to the lea pattern to avoid flags dependency.
 (define_split
   [(set (match_operand 0 "register_operand" "")
 	(plus (match_operand 1 "register_operand" "")
               (match_operand 2 "nonmemory_operand" "")))
    (clobber (reg:CC FLAGS_REG))]
-  "reload_completed && ix86_lea_for_add_ok (insn, operands)" 
+  "GET_MODE (operands[0]) == GET_MODE (operands[1])
+   && (GET_MODE (operands[0]) == GET_MODE (operands[2])
+       || GET_MODE (operands[2]) == VOIDmode)
+   && reload_completed && ix86_lea_for_add_ok (insn, operands)" 
   [(const_int 0)]
 {
-  rtx pat;
   enum machine_mode mode = GET_MODE (operands[0]);
-
-  /* In -fPIC mode the constructs like (const (unspec [symbol_ref]))
-     may confuse gen_lowpart.  */
-  if (mode != Pmode)
-    {
-      operands[1] = gen_lowpart (Pmode, operands[1]);
-      operands[2] = gen_lowpart (Pmode, operands[2]);
-    }
-
-  pat = gen_rtx_PLUS (Pmode, operands[1], operands[2]);
+  rtx pat;
 
   if (GET_MODE_SIZE (mode) < GET_MODE_SIZE (SImode))
-    operands[0] = gen_lowpart (SImode, operands[0]);
+    { 
+      mode = SImode; 
+      operands[0] = gen_lowpart (mode, operands[0]);
+      operands[1] = gen_lowpart (mode, operands[1]);
+      operands[2] = gen_lowpart (mode, operands[2]);
+    }
 
-  if (TARGET_64BIT && mode != Pmode)
-    pat = gen_rtx_SUBREG (SImode, pat, 0);
+  pat = gen_rtx_PLUS (mode, operands[1], operands[2]);
 
   emit_insn (gen_rtx_SET (VOIDmode, operands[0], pat));
   DONE;
 })
 
-;; Convert lea to the lea pattern to avoid flags dependency.
+;; Convert add to the lea pattern to avoid flags dependency.
 ;; ??? This pattern handles immediate operands that do not satisfy immediate
 ;; operand predicate (TARGET_LEGITIMATE_CONSTANT_P) in the previous pattern.
 (define_split
@@ -5839,7 +5845,7 @@
   [(set (match_dup 0)
 	(plus:DI (match_dup 1) (match_dup 2)))])
 
-;; Convert lea to the lea pattern to avoid flags dependency.
+;; Convert add to the lea pattern to avoid flags dependency.
 (define_split
   [(set (match_operand:DI 0 "register_operand" "")
 	(zero_extend:DI
@@ -5849,11 +5855,7 @@
   "TARGET_64BIT && reload_completed
    && ix86_lea_for_add_ok (insn, operands)"
   [(set (match_dup 0)
-	(zero_extend:DI (subreg:SI (plus:DI (match_dup 1) (match_dup 2)) 0)))]
-{
-  operands[1] = gen_lowpart (DImode, operands[1]);
-  operands[2] = gen_lowpart (DImode, operands[2]);
-})
+	(zero_extend:DI (plus:SI (match_dup 1) (match_dup 2))))])
 
 (define_insn "*add<mode>_2"
   [(set (reg FLAGS_REG)
@@ -6233,7 +6235,7 @@
   [(set_attr "type" "alu")
    (set_attr "mode" "QI")])
 
-;; The lea patterns for non-Pmodes needs to be matched by
+;; The lea patterns for modes less than 32 bits need to be matched by
 ;; several insns converted to real lea by splitters.
 
 (define_insn_and_split "*lea_general_1"
@@ -6241,8 +6243,7 @@
 	(plus (plus (match_operand 1 "index_register_operand" "l")
 		    (match_operand 2 "register_operand" "r"))
 	      (match_operand 3 "immediate_operand" "i")))]
-  "(GET_MODE (operands[0]) == QImode || GET_MODE (operands[0]) == HImode
-    || (TARGET_64BIT && GET_MODE (operands[0]) == SImode))
+  "(GET_MODE (operands[0]) == QImode || GET_MODE (operands[0]) == HImode)
    && (!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun))
    && GET_MODE (operands[0]) == GET_MODE (operands[1])
    && GET_MODE (operands[0]) == GET_MODE (operands[2])
@@ -6252,53 +6253,30 @@
   "&& reload_completed"
   [(const_int 0)]
 {
+  enum machine_mode mode = SImode;
   rtx pat;
-  operands[0] = gen_lowpart (SImode, operands[0]);
-  operands[1] = gen_lowpart (Pmode, operands[1]);
-  operands[2] = gen_lowpart (Pmode, operands[2]);
-  operands[3] = gen_lowpart (Pmode, operands[3]);
-  pat = gen_rtx_PLUS (Pmode, gen_rtx_PLUS (Pmode, operands[1], operands[2]),
+
+  operands[0] = gen_lowpart (mode, operands[0]);
+  operands[1] = gen_lowpart (mode, operands[1]);
+  operands[2] = gen_lowpart (mode, operands[2]);
+  operands[3] = gen_lowpart (mode, operands[3]);
+
+  pat = gen_rtx_PLUS (mode, gen_rtx_PLUS (mode, operands[1], operands[2]),
   		      operands[3]);
-  if (Pmode != SImode)
-    pat = gen_rtx_SUBREG (SImode, pat, 0);
+
   emit_insn (gen_rtx_SET (VOIDmode, operands[0], pat));
   DONE;
 }
   [(set_attr "type" "lea")
    (set_attr "mode" "SI")])
 
-(define_insn_and_split "*lea_general_1_zext"
-  [(set (match_operand:DI 0 "register_operand" "=r")
-	(zero_extend:DI
-	  (plus:SI (plus:SI
-		     (match_operand:SI 1 "index_register_operand" "l")
-		     (match_operand:SI 2 "register_operand" "r"))
-		   (match_operand:SI 3 "immediate_operand" "i"))))]
-  "TARGET_64BIT"
-  "#"
-  "&& reload_completed"
-  [(set (match_dup 0)
-	(zero_extend:DI (subreg:SI (plus:DI (plus:DI (match_dup 1)
-						     (match_dup 2))
-					    (match_dup 3)) 0)))]
-{
-  operands[1] = gen_lowpart (Pmode, operands[1]);
-  operands[2] = gen_lowpart (Pmode, operands[2]);
-  operands[3] = gen_lowpart (Pmode, operands[3]);
-}
-  [(set_attr "type" "lea")
-   (set_attr "mode" "SI")])
-
 (define_insn_and_split "*lea_general_2"
   [(set (match_operand 0 "register_operand" "=r")
 	(plus (mult (match_operand 1 "index_register_operand" "l")
 		    (match_operand 2 "const248_operand" "i"))
 	      (match_operand 3 "nonmemory_operand" "ri")))]
-  "(GET_MODE (operands[0]) == QImode || GET_MODE (operands[0]) == HImode
-    || (TARGET_64BIT && GET_MODE (operands[0]) == SImode))
-   && (!TARGET_PARTIAL_REG_STALL
-       || GET_MODE (operands[0]) == SImode
-       || optimize_function_for_size_p (cfun))
+  "(GET_MODE (operands[0]) == QImode || GET_MODE (operands[0]) == HImode)
+   && (!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun))
    && GET_MODE (operands[0]) == GET_MODE (operands[1])
    && (GET_MODE (operands[0]) == GET_MODE (operands[3])
        || GET_MODE (operands[3]) == VOIDmode)"
@@ -6306,110 +6284,68 @@
   "&& reload_completed"
   [(const_int 0)]
 {
+  enum machine_mode mode = SImode;
   rtx pat;
-  operands[0] = gen_lowpart (SImode, operands[0]);
-  operands[1] = gen_lowpart (Pmode, operands[1]);
-  operands[3] = gen_lowpart (Pmode, operands[3]);
-  pat = gen_rtx_PLUS (Pmode, gen_rtx_MULT (Pmode, operands[1], operands[2]),
-  		      operands[3]);
-  if (Pmode != SImode)
-    pat = gen_rtx_SUBREG (SImode, pat, 0);
+
+  operands[0] = gen_lowpart (mode, operands[0]);
+  operands[1] = gen_lowpart (mode, operands[1]);
+  operands[3] = gen_lowpart (mode, operands[3]);
+
+  pat = gen_rtx_PLUS (mode, gen_rtx_MULT (mode, operands[1], operands[2]),
+		      operands[3]);
+
   emit_insn (gen_rtx_SET (VOIDmode, operands[0], pat));
   DONE;
 }
   [(set_attr "type" "lea")
    (set_attr "mode" "SI")])
 
-(define_insn_and_split "*lea_general_2_zext"
-  [(set (match_operand:DI 0 "register_operand" "=r")
-	(zero_extend:DI
-	  (plus:SI (mult:SI
-		     (match_operand:SI 1 "index_register_operand" "l")
-		     (match_operand:SI 2 "const248_operand" "n"))
-		   (match_operand:SI 3 "nonmemory_operand" "ri"))))]
-  "TARGET_64BIT"
-  "#"
-  "&& reload_completed"
-  [(set (match_dup 0)
-	(zero_extend:DI (subreg:SI (plus:DI (mult:DI (match_dup 1)
-						     (match_dup 2))
-					    (match_dup 3)) 0)))]
-{
-  operands[1] = gen_lowpart (Pmode, operands[1]);
-  operands[3] = gen_lowpart (Pmode, operands[3]);
-}
-  [(set_attr "type" "lea")
-   (set_attr "mode" "SI")])
-
 (define_insn_and_split "*lea_general_3"
   [(set (match_operand 0 "register_operand" "=r")
 	(plus (plus (mult (match_operand 1 "index_register_operand" "l")
 			  (match_operand 2 "const248_operand" "i"))
 		    (match_operand 3 "register_operand" "r"))
 	      (match_operand 4 "immediate_operand" "i")))]
-  "(GET_MODE (operands[0]) == QImode || GET_MODE (operands[0]) == HImode
-    || (TARGET_64BIT && GET_MODE (operands[0]) == SImode))
-   && (!TARGET_PARTIAL_REG_STALL
-       || GET_MODE (operands[0]) == SImode
-       || optimize_function_for_size_p (cfun))
+  "(GET_MODE (operands[0]) == QImode || GET_MODE (operands[0]) == HImode)
+   && (!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun))
    && GET_MODE (operands[0]) == GET_MODE (operands[1])
    && GET_MODE (operands[0]) == GET_MODE (operands[3])"
   "#"
   "&& reload_completed"
   [(const_int 0)]
 {
+  enum machine_mode mode = SImode;
   rtx pat;
-  operands[0] = gen_lowpart (SImode, operands[0]);
-  operands[1] = gen_lowpart (Pmode, operands[1]);
-  operands[3] = gen_lowpart (Pmode, operands[3]);
-  operands[4] = gen_lowpart (Pmode, operands[4]);
-  pat = gen_rtx_PLUS (Pmode,
-  		      gen_rtx_PLUS (Pmode, gen_rtx_MULT (Pmode, operands[1],
-		      					 operands[2]),
+
+  operands[0] = gen_lowpart (mode, operands[0]);
+  operands[1] = gen_lowpart (mode, operands[1]);
+  operands[3] = gen_lowpart (mode, operands[3]);
+  operands[4] = gen_lowpart (mode, operands[4]);
+
+  pat = gen_rtx_PLUS (mode,
+  		      gen_rtx_PLUS (mode,
+				    gen_rtx_MULT (mode, operands[1],
+		      					operands[2]),
 				    operands[3]),
   		      operands[4]);
-  if (Pmode != SImode)
-    pat = gen_rtx_SUBREG (SImode, pat, 0);
+
   emit_insn (gen_rtx_SET (VOIDmode, operands[0], pat));
   DONE;
 }
   [(set_attr "type" "lea")
    (set_attr "mode" "SI")])
 
-(define_insn_and_split "*lea_general_3_zext"
-  [(set (match_operand:DI 0 "register_operand" "=r")
-	(zero_extend:DI
-	  (plus:SI (plus:SI
-		     (mult:SI
-		       (match_operand:SI 1 "index_register_operand" "l")
-		       (match_operand:SI 2 "const248_operand" "n"))
-		     (match_operand:SI 3 "register_operand" "r"))
-		   (match_operand:SI 4 "immediate_operand" "i"))))]
-  "TARGET_64BIT"
-  "#"
-  "&& reload_completed"
-  [(set (match_dup 0)
-	(zero_extend:DI (subreg:SI (plus:DI (plus:DI (mult:DI (match_dup 1)
-							      (match_dup 2))
-						     (match_dup 3))
-					    (match_dup 4)) 0)))]
-{
-  operands[1] = gen_lowpart (Pmode, operands[1]);
-  operands[3] = gen_lowpart (Pmode, operands[3]);
-  operands[4] = gen_lowpart (Pmode, operands[4]);
-}
-  [(set_attr "type" "lea")
-   (set_attr "mode" "SI")])
-
 (define_insn_and_split "*lea_general_4"
-  [(set (match_operand:SWI 0 "register_operand" "=r")
-	(any_or:SWI (ashift:SWI (match_operand:SWI 1 "index_register_operand" "l")
-				(match_operand:SWI 2 "const_int_operand" "n"))
-		    (match_operand 3 "const_int_operand" "n")))]
-  "(<MODE>mode == DImode
-    || <MODE>mode == SImode
-    || !TARGET_PARTIAL_REG_STALL
-    || optimize_function_for_size_p (cfun))
+  [(set (match_operand 0 "register_operand" "=r")
+	(any_or (ashift
+		  (match_operand 1 "index_register_operand" "l")
+		  (match_operand 2 "const_int_operand" "n"))
+		(match_operand 3 "const_int_operand" "n")))]
+  "(((GET_MODE (operands[0]) == QImode || GET_MODE (operands[0]) == HImode)
+      && (!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun)))
+    || GET_MODE (operands[0]) == SImode
+    || (TARGET_64BIT && GET_MODE (operands[0]) == DImode))
+   && GET_MODE (operands[0]) == GET_MODE (operands[1])
    && ((unsigned HOST_WIDE_INT) INTVAL (operands[2])) - 1 < 3
    && ((unsigned HOST_WIDE_INT) INTVAL (operands[3])
        < ((unsigned HOST_WIDE_INT) 1 << INTVAL (operands[2])))"
@@ -6417,23 +6353,29 @@
   "&& reload_completed"
   [(const_int 0)]
 {
+  enum machine_mode mode = GET_MODE (operands[0]);
   rtx pat;
-  if (<MODE>mode != DImode)
-    operands[0] = gen_lowpart (SImode, operands[0]);
-  operands[1] = gen_lowpart (Pmode, operands[1]);
+
+  if (GET_MODE_SIZE (mode) < GET_MODE_SIZE (SImode))
+    { 
+      mode = SImode; 
+      operands[0] = gen_lowpart (mode, operands[0]);
+      operands[1] = gen_lowpart (mode, operands[1]);
+    }
+
   operands[2] = GEN_INT (1 << INTVAL (operands[2]));
-  pat = plus_constant (gen_rtx_MULT (Pmode, operands[1], operands[2]),
+
+  pat = plus_constant (gen_rtx_MULT (mode, operands[1], operands[2]),
 		       INTVAL (operands[3]));
-  if (Pmode != SImode && <MODE>mode != DImode)
-    pat = gen_rtx_SUBREG (SImode, pat, 0);
+
   emit_insn (gen_rtx_SET (VOIDmode, operands[0], pat));
   DONE;
 }
   [(set_attr "type" "lea")
    (set (attr "mode")
-      (if_then_else (eq (symbol_ref "<MODE>mode == DImode") (const_int 0))
-	(const_string "SI")
-	(const_string "DI")))])
+      (if_then_else (match_operand:DI 0 "" "")
+	(const_string "DI")
+	(const_string "SI")))])
 \f
 ;; Subtract instructions
 
@@ -9395,36 +9337,36 @@
        (const_string "*")))
    (set_attr "mode" "QI")])
 
-;; Convert lea to the lea pattern to avoid flags dependency.
+;; Convert ashift to the lea pattern to avoid flags dependency.
 (define_split
   [(set (match_operand 0 "register_operand" "")
 	(ashift (match_operand 1 "index_register_operand" "")
                 (match_operand:QI 2 "const_int_operand" "")))
    (clobber (reg:CC FLAGS_REG))]
-  "reload_completed
+  "GET_MODE (operands[0]) == GET_MODE (operands[1])
+   && reload_completed
    && true_regnum (operands[0]) != true_regnum (operands[1])"
   [(const_int 0)]
 {
-  rtx pat;
   enum machine_mode mode = GET_MODE (operands[0]);
-
-  if (mode != Pmode)
-    operands[1] = gen_lowpart (Pmode, operands[1]);
-  operands[2] = gen_int_mode (1 << INTVAL (operands[2]), Pmode);
-
-  pat = gen_rtx_MULT (Pmode, operands[1], operands[2]);
+  rtx pat;
 
   if (GET_MODE_SIZE (mode) < GET_MODE_SIZE (SImode))
-    operands[0] = gen_lowpart (SImode, operands[0]);
+    { 
+      mode = SImode; 
+      operands[0] = gen_lowpart (mode, operands[0]);
+      operands[1] = gen_lowpart (mode, operands[1]);
+    }
+
+  operands[2] = gen_int_mode (1 << INTVAL (operands[2]), mode);
 
-  if (TARGET_64BIT && mode != Pmode)
-    pat = gen_rtx_SUBREG (SImode, pat, 0);
+  pat = gen_rtx_MULT (mode, operands[1], operands[2]);
 
   emit_insn (gen_rtx_SET (VOIDmode, operands[0], pat));
   DONE;
 })
 
-;; Convert lea to the lea pattern to avoid flags dependency.
+;; Convert ashift to the lea pattern to avoid flags dependency.
 (define_split
   [(set (match_operand:DI 0 "register_operand" "")
 	(zero_extend:DI
Index: i386.c
===================================================================
--- i386.c	(revision 176713)
+++ i386.c	(working copy)
@@ -14122,6 +14122,9 @@ ix86_print_operand_address (FILE *file, 
     }
   else
     {
+      /* Use DImode registers in address on 64bit target.  */
+      int code = TARGET_64BIT ? 'q' : 0;
+
       if (ASSEMBLER_DIALECT == ASM_ATT)
 	{
 	  if (disp)
@@ -14136,11 +14139,11 @@ ix86_print_operand_address (FILE *file, 
 
 	  putc ('(', file);
 	  if (base)
-	    print_reg (base, 0, file);
+	    print_reg (base, code, file);
 	  if (index)
 	    {
 	      putc (',', file);
-	      print_reg (index, 0, file);
+	      print_reg (index, code, file);
 	      if (scale != 1)
 		fprintf (file, ",%d", scale);
 	    }
@@ -14175,7 +14178,7 @@ ix86_print_operand_address (FILE *file, 
 	  putc ('[', file);
 	  if (base)
 	    {
-	      print_reg (base, 0, file);
+	      print_reg (base, code, file);
 	      if (offset)
 		{
 		  if (INTVAL (offset) >= 0)
@@ -14191,7 +14194,7 @@ ix86_print_operand_address (FILE *file, 
 	  if (index)
 	    {
 	      putc ('+', file);
-	      print_reg (index, 0, file);
+	      print_reg (index, code, file);
 	      if (scale != 1)
 		fprintf (file, "*%d", scale);
 	    }

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH, i386]: Rewrite LEA handling (was:Re: PATCH [10/n] X32: Support x32 LEA insns)
  2011-07-25  2:04 [PATCH, i386]: Rewrite LEA handling (was:Re: PATCH [10/n] X32: Support x32 LEA insns) Uros Bizjak
@ 2011-07-25  3:23 ` H.J. Lu
  2011-07-25 13:24   ` Uros Bizjak
  0 siblings, 1 reply; 10+ messages in thread
From: H.J. Lu @ 2011-07-25  3:23 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: gcc-patches, Richard Henderson, jh

On Sun, Jul 24, 2011 at 2:34 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Sat, Jul 23, 2011 at 3:57 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>
>>>> This patch adds x32 LEA insn support.  The main issue is
>>>>
>>>> gen_lowpart (Pmode, operands[1]);
>>>>
>>>> doesn't work on symbol.  This patch avoids it.
>>>>
>>>> Also we shouldn't generate 32bit store with x32 PIC source.
>>>>
>>>> Any comments?
>
> You are not fixing the core of the problem... this is why you need so
> much hacks and kludges at various places (some w.r.t. -fPIC already
> existed, see the patch). Above, you correctly identified the problem,
> so let's avoid gen_lowpart on SImode operands by not calling it
> anymore.
>
> Attached patch effectively rewrites LEA handling. The trick is, that
> instead of using Pmode operations in addresses, we use either SImode
> or DImode operations to calculate the address on 64bit targets. Up to
> now, address calculations strictly used Pmode, so SImode on 32bit
> targets and DImode on 64bit targets. Recent patches to
> ix86_decompose_address and ix86_legitimate_address_p relaxed this
> requirement.
>
> Attached patch changes LEA patterns and LEA splitters to accept
> addresses, calculated with either SImode or DImode operations.This
> means, that on x64 targets, we don't use gen_lowpart on SImode
> operands anymore. Since symbol references on x32 are in SImode, this
> solves the problem. The patch also avoids generating SImode subregs of
> DImode addresses and DImode zero_extends of SImode addresses, since
> LEA insn does this for us automatically.
>
> Please also note the change to ix86_print_operand_address. To avoid
> addr32 prefixes, we can force registers in DImode on 64bit targets
> without any problems. On x32, we can investigate, if this change
> avoids unnecessary LEAs (for PR 49781, patched gcc genrates 6 vs. 8).

The testcase won't compile since PIC doesn't work:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49833

> Also, we can investigate the effect of addr32 on benchmarks.

We should get x32 working on trunk first.  We will improve its
performance later.

> Patched gcc also fixes all testcases from PR 47381.
>
> 2011-07-24  Uros Bizjak  <ubizjak@gmail.com>
>
>        PR target/47381
>        * config/i386/i386.md (*lea_1): Use SWI48 mode iterator.
>        (*lea_1_zext): New insn pattern.
>        (add->lea splitter): Check operand modes in insn constraint.  Extend
>        operands less than SImode wide to SImode.
>        (add->lea zext splitter): Do not extend operands to DImode.
>        (*lea_general_1): Handle only QImode and HImode operands.
>        (*lea_general_2): Ditto.
>        (*lea_general_3): Ditto.
>        (*lea_general_1_zext): Remove.
>        (*lea_general_2_zext): Ditto.
>        (*lea_general_3_zext): Ditto.
>        (*lea_general_4): Check operand modes in insn constraint.  Extend
>        operands less than SImode wide to SImode.
>        (ashift->lea splitter): Ditto.
>        * config/i386/i386.md (ix86_print_operand_address): Print address
>        registers with 'q' modifier on 64bit targets.
>
> Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu
> {,-m32} with no regressions. H.J., can you please test it on x32?

On x32, it failed:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49832

> BTW: -fPIC is not yet implemented on trunk and still fails there with
> an (unrelated) error, I didn't check x32 branch.
>

This could be:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49833

Thanks.


-- 
H.J.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH, i386]: Rewrite LEA handling (was:Re: PATCH [10/n] X32: Support x32 LEA insns)
  2011-07-25  3:23 ` H.J. Lu
@ 2011-07-25 13:24   ` Uros Bizjak
  2011-07-25 14:03     ` H.J. Lu
  0 siblings, 1 reply; 10+ messages in thread
From: Uros Bizjak @ 2011-07-25 13:24 UTC (permalink / raw)
  To: H.J. Lu; +Cc: gcc-patches, Richard Henderson, jh

[-- Attachment #1: Type: text/plain, Size: 3762 bytes --]

On Mon, Jul 25, 2011 at 3:58 AM, H.J. Lu <hjl.tools@gmail.com> wrote:

>> You are not fixing the core of the problem... this is why you need so
>> much hacks and kludges at various places (some w.r.t. -fPIC already
>> existed, see the patch). Above, you correctly identified the problem,
>> so let's avoid gen_lowpart on SImode operands by not calling it
>> anymore.
>>
>> Attached patch effectively rewrites LEA handling. The trick is, that
>> instead of using Pmode operations in addresses, we use either SImode
>> or DImode operations to calculate the address on 64bit targets. Up to
>> now, address calculations strictly used Pmode, so SImode on 32bit
>> targets and DImode on 64bit targets. Recent patches to
>> ix86_decompose_address and ix86_legitimate_address_p relaxed this
>> requirement.
>>
>> Attached patch changes LEA patterns and LEA splitters to accept
>> addresses, calculated with either SImode or DImode operations.This
>> means, that on x64 targets, we don't use gen_lowpart on SImode
>> operands anymore. Since symbol references on x32 are in SImode, this
>> solves the problem. The patch also avoids generating SImode subregs of
>> DImode addresses and DImode zero_extends of SImode addresses, since
>> LEA insn does this for us automatically.
>>
>> Please also note the change to ix86_print_operand_address. To avoid
>> addr32 prefixes, we can force registers in DImode on 64bit targets
>> without any problems. On x32, we can investigate, if this change
>> avoids unnecessary LEAs (for PR 49781, patched gcc genrates 6 vs. 8).
>
> The testcase won't compile since PIC doesn't work:

Well, I did say that -fPIC did not work.

>> Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu
>> {,-m32} with no regressions. H.J., can you please test it on x32?
>
> On x32, it failed:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49832
>
>> BTW: -fPIC is not yet implemented on trunk and still fails there with
>> an (unrelated) error, I didn't check x32 branch.
>>
>
> This could be:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49833

Attached patch implements -fpic handling for x32. In x32 mode, we now
use x86_64_general_operand and corresponding "e" constraints for adds
in SImode, since it looks that invalid addresses can only be generated
through adds. This avoids a whole bunch of new predicates and
constraints.

2011-07-25  Uros Bizjak  <ubizjak@gmail.com>

	PR target/47381
	PR target/49832
	PR target/49833
	* config/i386/i386.md (add_operand): New mode attribute.
	(*movdi_internal_rex64): Remove mode from pic_32bit_operand check.
	(*movsi_internal): Ditto.  Use "e" constraint in alternative 2.
	(*lea_1): Use SWI48 mode iterator.
	(*lea_1_zext): New insn pattern.
	(add<mode>3): Use <add_operand> predicate for operand 2.
	(*add<mode>_1): Use <add_operand> predicate for operand 2.  Use "le"
	constraint for alternative 2.
	(addsi_1_zext): Use addsi_operand predicate for operand 2.  Use "le"
	constraint for alternative 2.
	(add->lea splitter): Check operand modes in insn constraint.  Extend
	operands less than SImode wide to SImode.
	(add->lea zext splitter): Do not extend operands to DImode.
	(*lea_general_1): Handle only QImode and HImode operands.
	(*lea_general_2): Ditto.
	(*lea_general_3): Ditto.
	(*lea_general_1_zext): Remove.
	(*lea_general_2_zext): Ditto.
	(*lea_general_3_zext): Ditto.
	(*lea_general_4): Check operand modes in insn constraint.  Extend
	operands less than SImode wide to SImode.
	(ashift->lea splitter): Ditto.
	* config/i386/i386.c (ix86_print_operand_address): Print address
	registers with 'q' modifier on 64bit targets.
	* config/i386/predicates.md (pic_32bit_opreand): Define as special
	predicate.  Reject non-SI and non-DI modes.
	(addsi_operand): New predicate.

Uros.

[-- Attachment #2: p.diff.txt --]
[-- Type: text/plain, Size: 19634 bytes --]

Index: i386.md
===================================================================
--- i386.md	(revision 176733)
+++ i386.md	(working copy)
@@ -901,6 +901,14 @@
 	 (SI "nonmemory_operand")
 	 (DI "x86_64_nonmemory_operand")])
 
+;; Operand predicate for adds.
+(define_mode_attr add_operand
+	[(QI "general_operand")
+	 (HI "general_operand")
+	 (SI "addsi_operand")
+	 (DI "x86_64_general_operand")
+	 (TI "x86_64_general_operand")])
+
 ;; Operand predicate for shifts.
 (define_mode_attr shift_operand
 	[(QI "nonimmediate_operand")
@@ -2039,7 +2047,7 @@
 	      (const_string "ssemov")
 	    (eq_attr "alternative" "16,17")
 	      (const_string "ssecvt")
- 	    (match_operand:DI 1 "pic_32bit_operand" "")
+ 	    (match_operand 1 "pic_32bit_operand" "")
 	      (const_string "lea")
 	   ]
 	   (const_string "imov")))
@@ -2184,7 +2192,7 @@
   [(set (match_operand:SI 0 "nonimmediate_operand"
 			"=r,m ,*y,*y,?rm,?*y,*x,*x,?r ,m ,?*Yi,*x")
 	(match_operand:SI 1 "general_operand"
-			"g ,ri,C ,*y,*y ,rm ,C ,*x,*Yi,*x,r   ,m "))]
+			"g ,re,C ,*y,*y ,rm ,C ,*x,*Yi,*x,r   ,m "))]
   "!(MEM_P (operands[0]) && MEM_P (operands[1]))"
 {
   switch (get_attr_type (insn))
@@ -2232,7 +2240,7 @@
 	      (const_string "sselog1")
 	    (eq_attr "alternative" "7,8,9,10,11")
 	      (const_string "ssemov")
- 	    (match_operand:DI 1 "pic_32bit_operand" "")
+ 	    (match_operand 1 "pic_32bit_operand" "")
 	      (const_string "lea")
 	   ]
 	   (const_string "imov")))
@@ -5371,7 +5379,7 @@
 (define_expand "add<mode>3"
   [(set (match_operand:SDWIM 0 "nonimmediate_operand" "")
 	(plus:SDWIM (match_operand:SDWIM 1 "nonimmediate_operand" "")
-		    (match_operand:SDWIM 2 "<general_operand>" "")))]
+		    (match_operand:SDWIM 2 "<add_operand>" "")))]
   ""
   "ix86_expand_binary_operator (PLUS, <MODE>mode, operands); DONE;")
 
@@ -5425,13 +5433,22 @@
    (set_attr "mode" "QI")])
 
 (define_insn "*lea_1"
-  [(set (match_operand:P 0 "register_operand" "=r")
-	(match_operand:P 1 "no_seg_address_operand" "p"))]
+  [(set (match_operand:SWI48 0 "register_operand" "=r")
+	(match_operand:SWI48 1 "no_seg_address_operand" "p"))]
   ""
   "lea{<imodesuffix>}\t{%a1, %0|%0, %a1}"
   [(set_attr "type" "lea")
    (set_attr "mode" "<MODE>")])
 
+(define_insn "*lea_1_zext"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(zero_extend:DI
+	  (match_operand:SI 1 "no_seg_address_operand" "p")))]
+  "TARGET_64BIT"
+  "lea{l}\t{%a1, %k0|%k0, %a1}"
+  [(set_attr "type" "lea")
+   (set_attr "mode" "SI")])
+
 (define_insn "*lea_2"
   [(set (match_operand:SI 0 "register_operand" "=r")
 	(subreg:SI (match_operand:DI 1 "no_seg_address_operand" "p") 0))]
@@ -5453,7 +5470,7 @@
   [(set (match_operand:SWI48 0 "nonimmediate_operand" "=r,rm,r,r")
 	(plus:SWI48
 	  (match_operand:SWI48 1 "nonimmediate_operand" "%0,0,r,r")
-	  (match_operand:SWI48 2 "<general_operand>" "<g>,r<i>,0,l<i>")))
+	  (match_operand:SWI48 2 "<add_operand>" "<g>,r<i>,0,le")))
    (clobber (reg:CC FLAGS_REG))]
   "ix86_binary_operator_ok (PLUS, <MODE>mode, operands)"
 {
@@ -5512,7 +5529,7 @@
   [(set (match_operand:DI 0 "register_operand" "=r,r,r")
 	(zero_extend:DI
 	  (plus:SI (match_operand:SI 1 "nonimmediate_operand" "%0,r,r")
-		   (match_operand:SI 2 "general_operand" "g,0,li"))))
+		   (match_operand:SI 2 "addsi_operand" "g,0,le"))))
    (clobber (reg:CC FLAGS_REG))]
   "TARGET_64BIT && ix86_binary_operator_ok (PLUS, SImode, operands)"
 {
@@ -5794,39 +5811,36 @@
         (const_string "none")))
    (set_attr "mode" "QI")])
 
-;; Convert lea to the lea pattern to avoid flags dependency.
+;; Convert add to the lea pattern to avoid flags dependency.
 (define_split
   [(set (match_operand 0 "register_operand" "")
 	(plus (match_operand 1 "register_operand" "")
               (match_operand 2 "nonmemory_operand" "")))
    (clobber (reg:CC FLAGS_REG))]
-  "reload_completed && ix86_lea_for_add_ok (insn, operands)" 
+  "GET_MODE (operands[0]) == GET_MODE (operands[1])
+   && (GET_MODE (operands[0]) == GET_MODE (operands[2])
+       || GET_MODE (operands[2]) == VOIDmode)
+   && reload_completed && ix86_lea_for_add_ok (insn, operands)" 
   [(const_int 0)]
 {
-  rtx pat;
   enum machine_mode mode = GET_MODE (operands[0]);
-
-  /* In -fPIC mode the constructs like (const (unspec [symbol_ref]))
-     may confuse gen_lowpart.  */
-  if (mode != Pmode)
-    {
-      operands[1] = gen_lowpart (Pmode, operands[1]);
-      operands[2] = gen_lowpart (Pmode, operands[2]);
-    }
-
-  pat = gen_rtx_PLUS (Pmode, operands[1], operands[2]);
+  rtx pat;
 
   if (GET_MODE_SIZE (mode) < GET_MODE_SIZE (SImode))
-    operands[0] = gen_lowpart (SImode, operands[0]);
+    { 
+      mode = SImode; 
+      operands[0] = gen_lowpart (mode, operands[0]);
+      operands[1] = gen_lowpart (mode, operands[1]);
+      operands[2] = gen_lowpart (mode, operands[2]);
+    }
 
-  if (TARGET_64BIT && mode != Pmode)
-    pat = gen_rtx_SUBREG (SImode, pat, 0);
+  pat = gen_rtx_PLUS (mode, operands[1], operands[2]);
 
   emit_insn (gen_rtx_SET (VOIDmode, operands[0], pat));
   DONE;
 })
 
-;; Convert lea to the lea pattern to avoid flags dependency.
+;; Convert add to the lea pattern to avoid flags dependency.
 ;; ??? This pattern handles immediate operands that do not satisfy immediate
 ;; operand predicate (TARGET_LEGITIMATE_CONSTANT_P) in the previous pattern.
 (define_split
@@ -5839,7 +5853,7 @@
   [(set (match_dup 0)
 	(plus:DI (match_dup 1) (match_dup 2)))])
 
-;; Convert lea to the lea pattern to avoid flags dependency.
+;; Convert add to the lea pattern to avoid flags dependency.
 (define_split
   [(set (match_operand:DI 0 "register_operand" "")
 	(zero_extend:DI
@@ -5849,11 +5863,7 @@
   "TARGET_64BIT && reload_completed
    && ix86_lea_for_add_ok (insn, operands)"
   [(set (match_dup 0)
-	(zero_extend:DI (subreg:SI (plus:DI (match_dup 1) (match_dup 2)) 0)))]
-{
-  operands[1] = gen_lowpart (DImode, operands[1]);
-  operands[2] = gen_lowpart (DImode, operands[2]);
-})
+	(zero_extend:DI (plus:SI (match_dup 1) (match_dup 2))))])
 
 (define_insn "*add<mode>_2"
   [(set (reg FLAGS_REG)
@@ -6233,7 +6243,7 @@
   [(set_attr "type" "alu")
    (set_attr "mode" "QI")])
 
-;; The lea patterns for non-Pmodes needs to be matched by
+;; The lea patterns for modes less than 32 bits need to be matched by
 ;; several insns converted to real lea by splitters.
 
 (define_insn_and_split "*lea_general_1"
@@ -6241,8 +6251,7 @@
 	(plus (plus (match_operand 1 "index_register_operand" "l")
 		    (match_operand 2 "register_operand" "r"))
 	      (match_operand 3 "immediate_operand" "i")))]
-  "(GET_MODE (operands[0]) == QImode || GET_MODE (operands[0]) == HImode
-    || (TARGET_64BIT && GET_MODE (operands[0]) == SImode))
+  "(GET_MODE (operands[0]) == QImode || GET_MODE (operands[0]) == HImode)
    && (!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun))
    && GET_MODE (operands[0]) == GET_MODE (operands[1])
    && GET_MODE (operands[0]) == GET_MODE (operands[2])
@@ -6252,53 +6261,30 @@
   "&& reload_completed"
   [(const_int 0)]
 {
+  enum machine_mode mode = SImode;
   rtx pat;
-  operands[0] = gen_lowpart (SImode, operands[0]);
-  operands[1] = gen_lowpart (Pmode, operands[1]);
-  operands[2] = gen_lowpart (Pmode, operands[2]);
-  operands[3] = gen_lowpart (Pmode, operands[3]);
-  pat = gen_rtx_PLUS (Pmode, gen_rtx_PLUS (Pmode, operands[1], operands[2]),
+
+  operands[0] = gen_lowpart (mode, operands[0]);
+  operands[1] = gen_lowpart (mode, operands[1]);
+  operands[2] = gen_lowpart (mode, operands[2]);
+  operands[3] = gen_lowpart (mode, operands[3]);
+
+  pat = gen_rtx_PLUS (mode, gen_rtx_PLUS (mode, operands[1], operands[2]),
   		      operands[3]);
-  if (Pmode != SImode)
-    pat = gen_rtx_SUBREG (SImode, pat, 0);
+
   emit_insn (gen_rtx_SET (VOIDmode, operands[0], pat));
   DONE;
 }
   [(set_attr "type" "lea")
    (set_attr "mode" "SI")])
 
-(define_insn_and_split "*lea_general_1_zext"
-  [(set (match_operand:DI 0 "register_operand" "=r")
-	(zero_extend:DI
-	  (plus:SI (plus:SI
-		     (match_operand:SI 1 "index_register_operand" "l")
-		     (match_operand:SI 2 "register_operand" "r"))
-		   (match_operand:SI 3 "immediate_operand" "i"))))]
-  "TARGET_64BIT"
-  "#"
-  "&& reload_completed"
-  [(set (match_dup 0)
-	(zero_extend:DI (subreg:SI (plus:DI (plus:DI (match_dup 1)
-						     (match_dup 2))
-					    (match_dup 3)) 0)))]
-{
-  operands[1] = gen_lowpart (Pmode, operands[1]);
-  operands[2] = gen_lowpart (Pmode, operands[2]);
-  operands[3] = gen_lowpart (Pmode, operands[3]);
-}
-  [(set_attr "type" "lea")
-   (set_attr "mode" "SI")])
-
 (define_insn_and_split "*lea_general_2"
   [(set (match_operand 0 "register_operand" "=r")
 	(plus (mult (match_operand 1 "index_register_operand" "l")
 		    (match_operand 2 "const248_operand" "i"))
 	      (match_operand 3 "nonmemory_operand" "ri")))]
-  "(GET_MODE (operands[0]) == QImode || GET_MODE (operands[0]) == HImode
-    || (TARGET_64BIT && GET_MODE (operands[0]) == SImode))
-   && (!TARGET_PARTIAL_REG_STALL
-       || GET_MODE (operands[0]) == SImode
-       || optimize_function_for_size_p (cfun))
+  "(GET_MODE (operands[0]) == QImode || GET_MODE (operands[0]) == HImode)
+   && (!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun))
    && GET_MODE (operands[0]) == GET_MODE (operands[1])
    && (GET_MODE (operands[0]) == GET_MODE (operands[3])
        || GET_MODE (operands[3]) == VOIDmode)"
@@ -6306,110 +6292,68 @@
   "&& reload_completed"
   [(const_int 0)]
 {
+  enum machine_mode mode = SImode;
   rtx pat;
-  operands[0] = gen_lowpart (SImode, operands[0]);
-  operands[1] = gen_lowpart (Pmode, operands[1]);
-  operands[3] = gen_lowpart (Pmode, operands[3]);
-  pat = gen_rtx_PLUS (Pmode, gen_rtx_MULT (Pmode, operands[1], operands[2]),
-  		      operands[3]);
-  if (Pmode != SImode)
-    pat = gen_rtx_SUBREG (SImode, pat, 0);
+
+  operands[0] = gen_lowpart (mode, operands[0]);
+  operands[1] = gen_lowpart (mode, operands[1]);
+  operands[3] = gen_lowpart (mode, operands[3]);
+
+  pat = gen_rtx_PLUS (mode, gen_rtx_MULT (mode, operands[1], operands[2]),
+		      operands[3]);
+
   emit_insn (gen_rtx_SET (VOIDmode, operands[0], pat));
   DONE;
 }
   [(set_attr "type" "lea")
    (set_attr "mode" "SI")])
 
-(define_insn_and_split "*lea_general_2_zext"
-  [(set (match_operand:DI 0 "register_operand" "=r")
-	(zero_extend:DI
-	  (plus:SI (mult:SI
-		     (match_operand:SI 1 "index_register_operand" "l")
-		     (match_operand:SI 2 "const248_operand" "n"))
-		   (match_operand:SI 3 "nonmemory_operand" "ri"))))]
-  "TARGET_64BIT"
-  "#"
-  "&& reload_completed"
-  [(set (match_dup 0)
-	(zero_extend:DI (subreg:SI (plus:DI (mult:DI (match_dup 1)
-						     (match_dup 2))
-					    (match_dup 3)) 0)))]
-{
-  operands[1] = gen_lowpart (Pmode, operands[1]);
-  operands[3] = gen_lowpart (Pmode, operands[3]);
-}
-  [(set_attr "type" "lea")
-   (set_attr "mode" "SI")])
-
 (define_insn_and_split "*lea_general_3"
   [(set (match_operand 0 "register_operand" "=r")
 	(plus (plus (mult (match_operand 1 "index_register_operand" "l")
 			  (match_operand 2 "const248_operand" "i"))
 		    (match_operand 3 "register_operand" "r"))
 	      (match_operand 4 "immediate_operand" "i")))]
-  "(GET_MODE (operands[0]) == QImode || GET_MODE (operands[0]) == HImode
-    || (TARGET_64BIT && GET_MODE (operands[0]) == SImode))
-   && (!TARGET_PARTIAL_REG_STALL
-       || GET_MODE (operands[0]) == SImode
-       || optimize_function_for_size_p (cfun))
+  "(GET_MODE (operands[0]) == QImode || GET_MODE (operands[0]) == HImode)
+   && (!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun))
    && GET_MODE (operands[0]) == GET_MODE (operands[1])
    && GET_MODE (operands[0]) == GET_MODE (operands[3])"
   "#"
   "&& reload_completed"
   [(const_int 0)]
 {
+  enum machine_mode mode = SImode;
   rtx pat;
-  operands[0] = gen_lowpart (SImode, operands[0]);
-  operands[1] = gen_lowpart (Pmode, operands[1]);
-  operands[3] = gen_lowpart (Pmode, operands[3]);
-  operands[4] = gen_lowpart (Pmode, operands[4]);
-  pat = gen_rtx_PLUS (Pmode,
-  		      gen_rtx_PLUS (Pmode, gen_rtx_MULT (Pmode, operands[1],
-		      					 operands[2]),
+
+  operands[0] = gen_lowpart (mode, operands[0]);
+  operands[1] = gen_lowpart (mode, operands[1]);
+  operands[3] = gen_lowpart (mode, operands[3]);
+  operands[4] = gen_lowpart (mode, operands[4]);
+
+  pat = gen_rtx_PLUS (mode,
+  		      gen_rtx_PLUS (mode,
+				    gen_rtx_MULT (mode, operands[1],
+		      					operands[2]),
 				    operands[3]),
   		      operands[4]);
-  if (Pmode != SImode)
-    pat = gen_rtx_SUBREG (SImode, pat, 0);
+
   emit_insn (gen_rtx_SET (VOIDmode, operands[0], pat));
   DONE;
 }
   [(set_attr "type" "lea")
    (set_attr "mode" "SI")])
 
-(define_insn_and_split "*lea_general_3_zext"
-  [(set (match_operand:DI 0 "register_operand" "=r")
-	(zero_extend:DI
-	  (plus:SI (plus:SI
-		     (mult:SI
-		       (match_operand:SI 1 "index_register_operand" "l")
-		       (match_operand:SI 2 "const248_operand" "n"))
-		     (match_operand:SI 3 "register_operand" "r"))
-		   (match_operand:SI 4 "immediate_operand" "i"))))]
-  "TARGET_64BIT"
-  "#"
-  "&& reload_completed"
-  [(set (match_dup 0)
-	(zero_extend:DI (subreg:SI (plus:DI (plus:DI (mult:DI (match_dup 1)
-							      (match_dup 2))
-						     (match_dup 3))
-					    (match_dup 4)) 0)))]
-{
-  operands[1] = gen_lowpart (Pmode, operands[1]);
-  operands[3] = gen_lowpart (Pmode, operands[3]);
-  operands[4] = gen_lowpart (Pmode, operands[4]);
-}
-  [(set_attr "type" "lea")
-   (set_attr "mode" "SI")])
-
 (define_insn_and_split "*lea_general_4"
-  [(set (match_operand:SWI 0 "register_operand" "=r")
-	(any_or:SWI (ashift:SWI (match_operand:SWI 1 "index_register_operand" "l")
-				(match_operand:SWI 2 "const_int_operand" "n"))
-		    (match_operand 3 "const_int_operand" "n")))]
-  "(<MODE>mode == DImode
-    || <MODE>mode == SImode
-    || !TARGET_PARTIAL_REG_STALL
-    || optimize_function_for_size_p (cfun))
+  [(set (match_operand 0 "register_operand" "=r")
+	(any_or (ashift
+		  (match_operand 1 "index_register_operand" "l")
+		  (match_operand 2 "const_int_operand" "n"))
+		(match_operand 3 "const_int_operand" "n")))]
+  "(((GET_MODE (operands[0]) == QImode || GET_MODE (operands[0]) == HImode)
+      && (!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun)))
+    || GET_MODE (operands[0]) == SImode
+    || (TARGET_64BIT && GET_MODE (operands[0]) == DImode))
+   && GET_MODE (operands[0]) == GET_MODE (operands[1])
    && ((unsigned HOST_WIDE_INT) INTVAL (operands[2])) - 1 < 3
    && ((unsigned HOST_WIDE_INT) INTVAL (operands[3])
        < ((unsigned HOST_WIDE_INT) 1 << INTVAL (operands[2])))"
@@ -6417,23 +6361,29 @@
   "&& reload_completed"
   [(const_int 0)]
 {
+  enum machine_mode mode = GET_MODE (operands[0]);
   rtx pat;
-  if (<MODE>mode != DImode)
-    operands[0] = gen_lowpart (SImode, operands[0]);
-  operands[1] = gen_lowpart (Pmode, operands[1]);
+
+  if (GET_MODE_SIZE (mode) < GET_MODE_SIZE (SImode))
+    { 
+      mode = SImode; 
+      operands[0] = gen_lowpart (mode, operands[0]);
+      operands[1] = gen_lowpart (mode, operands[1]);
+    }
+
   operands[2] = GEN_INT (1 << INTVAL (operands[2]));
-  pat = plus_constant (gen_rtx_MULT (Pmode, operands[1], operands[2]),
+
+  pat = plus_constant (gen_rtx_MULT (mode, operands[1], operands[2]),
 		       INTVAL (operands[3]));
-  if (Pmode != SImode && <MODE>mode != DImode)
-    pat = gen_rtx_SUBREG (SImode, pat, 0);
+
   emit_insn (gen_rtx_SET (VOIDmode, operands[0], pat));
   DONE;
 }
   [(set_attr "type" "lea")
    (set (attr "mode")
-      (if_then_else (eq (symbol_ref "<MODE>mode == DImode") (const_int 0))
-	(const_string "SI")
-	(const_string "DI")))])
+      (if_then_else (match_operand:DI 0 "" "")
+	(const_string "DI")
+	(const_string "SI")))])
 \f
 ;; Subtract instructions
 
@@ -9395,36 +9345,36 @@
        (const_string "*")))
    (set_attr "mode" "QI")])
 
-;; Convert lea to the lea pattern to avoid flags dependency.
+;; Convert ashift to the lea pattern to avoid flags dependency.
 (define_split
   [(set (match_operand 0 "register_operand" "")
 	(ashift (match_operand 1 "index_register_operand" "")
                 (match_operand:QI 2 "const_int_operand" "")))
    (clobber (reg:CC FLAGS_REG))]
-  "reload_completed
+  "GET_MODE (operands[0]) == GET_MODE (operands[1])
+   && reload_completed
    && true_regnum (operands[0]) != true_regnum (operands[1])"
   [(const_int 0)]
 {
-  rtx pat;
   enum machine_mode mode = GET_MODE (operands[0]);
-
-  if (mode != Pmode)
-    operands[1] = gen_lowpart (Pmode, operands[1]);
-  operands[2] = gen_int_mode (1 << INTVAL (operands[2]), Pmode);
-
-  pat = gen_rtx_MULT (Pmode, operands[1], operands[2]);
+  rtx pat;
 
   if (GET_MODE_SIZE (mode) < GET_MODE_SIZE (SImode))
-    operands[0] = gen_lowpart (SImode, operands[0]);
+    { 
+      mode = SImode; 
+      operands[0] = gen_lowpart (mode, operands[0]);
+      operands[1] = gen_lowpart (mode, operands[1]);
+    }
+
+  operands[2] = gen_int_mode (1 << INTVAL (operands[2]), mode);
 
-  if (TARGET_64BIT && mode != Pmode)
-    pat = gen_rtx_SUBREG (SImode, pat, 0);
+  pat = gen_rtx_MULT (mode, operands[1], operands[2]);
 
   emit_insn (gen_rtx_SET (VOIDmode, operands[0], pat));
   DONE;
 })
 
-;; Convert lea to the lea pattern to avoid flags dependency.
+;; Convert ashift to the lea pattern to avoid flags dependency.
 (define_split
   [(set (match_operand:DI 0 "register_operand" "")
 	(zero_extend:DI
Index: predicates.md
===================================================================
--- predicates.md	(revision 176733)
+++ predicates.md	(working copy)
@@ -366,9 +366,13 @@
 
 ;; Return true when operand is PIC expression that can be computed by lea
 ;; operation.
-(define_predicate "pic_32bit_operand"
+(define_special_predicate "pic_32bit_operand"
   (match_code "const,symbol_ref,label_ref")
 {
+  if (GET_MODE (op) != SImode
+      && GET_MODE (op) != DImode)
+    return false;
+
   if (!flag_pic)
     return false;
   /* Rule out relocations that translate into 64bit constants.  */
@@ -1188,3 +1192,9 @@
       return false;
   return true;
 })
+
+;; Test for a valid add operand for SImode.
+(define_predicate "addsi_operand"
+  (if_then_else (match_test "TARGET_X32")
+    (match_operand 0 "x86_64_general_operand")
+    (match_operand 0 "general_operand")))
Index: i386.c
===================================================================
--- i386.c	(revision 176733)
+++ i386.c	(working copy)
@@ -14122,6 +14122,9 @@ ix86_print_operand_address (FILE *file, 
     }
   else
     {
+      /* Use DImode registers in address on 64bit target.  */
+      int code = TARGET_64BIT ? 'q' : 0;
+
       if (ASSEMBLER_DIALECT == ASM_ATT)
 	{
 	  if (disp)
@@ -14136,11 +14139,11 @@ ix86_print_operand_address (FILE *file, 
 
 	  putc ('(', file);
 	  if (base)
-	    print_reg (base, 0, file);
+	    print_reg (base, code, file);
 	  if (index)
 	    {
 	      putc (',', file);
-	      print_reg (index, 0, file);
+	      print_reg (index, code, file);
 	      if (scale != 1)
 		fprintf (file, ",%d", scale);
 	    }
@@ -14175,7 +14178,7 @@ ix86_print_operand_address (FILE *file, 
 	  putc ('[', file);
 	  if (base)
 	    {
-	      print_reg (base, 0, file);
+	      print_reg (base, code, file);
 	      if (offset)
 		{
 		  if (INTVAL (offset) >= 0)
@@ -14191,7 +14194,7 @@ ix86_print_operand_address (FILE *file, 
 	  if (index)
 	    {
 	      putc ('+', file);
-	      print_reg (index, 0, file);
+	      print_reg (index, code, file);
 	      if (scale != 1)
 		fprintf (file, "*%d", scale);
 	    }

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH, i386]: Rewrite LEA handling (was:Re: PATCH [10/n] X32: Support x32 LEA insns)
  2011-07-25 13:24   ` Uros Bizjak
@ 2011-07-25 14:03     ` H.J. Lu
  2011-07-25 14:08       ` Uros Bizjak
  2011-07-25 21:16       ` [PATCH, i386, take 2]: " Uros Bizjak
  0 siblings, 2 replies; 10+ messages in thread
From: H.J. Lu @ 2011-07-25 14:03 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: gcc-patches, Richard Henderson, jh

On Mon, Jul 25, 2011 at 5:33 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Mon, Jul 25, 2011 at 3:58 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>
>>> You are not fixing the core of the problem... this is why you need so
>>> much hacks and kludges at various places (some w.r.t. -fPIC already
>>> existed, see the patch). Above, you correctly identified the problem,
>>> so let's avoid gen_lowpart on SImode operands by not calling it
>>> anymore.
>>>
>>> Attached patch effectively rewrites LEA handling. The trick is, that
>>> instead of using Pmode operations in addresses, we use either SImode
>>> or DImode operations to calculate the address on 64bit targets. Up to
>>> now, address calculations strictly used Pmode, so SImode on 32bit
>>> targets and DImode on 64bit targets. Recent patches to
>>> ix86_decompose_address and ix86_legitimate_address_p relaxed this
>>> requirement.
>>>
>>> Attached patch changes LEA patterns and LEA splitters to accept
>>> addresses, calculated with either SImode or DImode operations.This
>>> means, that on x64 targets, we don't use gen_lowpart on SImode
>>> operands anymore. Since symbol references on x32 are in SImode, this
>>> solves the problem. The patch also avoids generating SImode subregs of
>>> DImode addresses and DImode zero_extends of SImode addresses, since
>>> LEA insn does this for us automatically.
>>>
>>> Please also note the change to ix86_print_operand_address. To avoid
>>> addr32 prefixes, we can force registers in DImode on 64bit targets
>>> without any problems. On x32, we can investigate, if this change
>>> avoids unnecessary LEAs (for PR 49781, patched gcc genrates 6 vs. 8).
>>
>> The testcase won't compile since PIC doesn't work:
>
> Well, I did say that -fPIC did not work.
>
>>> Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu
>>> {,-m32} with no regressions. H.J., can you please test it on x32?
>>
>> On x32, it failed:
>>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49832
>>
>>> BTW: -fPIC is not yet implemented on trunk and still fails there with
>>> an (unrelated) error, I didn't check x32 branch.
>>>
>>
>> This could be:
>>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49833
>
> Attached patch implements -fpic handling for x32. In x32 mode, we now
> use x86_64_general_operand and corresponding "e" constraints for adds
> in SImode, since it looks that invalid addresses can only be generated
> through adds. This avoids a whole bunch of new predicates and
> constraints.
>
> 2011-07-25  Uros Bizjak  <ubizjak@gmail.com>
>
>        PR target/47381
>        PR target/49832
>        PR target/49833
>        * config/i386/i386.md (add_operand): New mode attribute.
>        (*movdi_internal_rex64): Remove mode from pic_32bit_operand check.
>        (*movsi_internal): Ditto.  Use "e" constraint in alternative 2.
>        (*lea_1): Use SWI48 mode iterator.
>        (*lea_1_zext): New insn pattern.
>        (add<mode>3): Use <add_operand> predicate for operand 2.
>        (*add<mode>_1): Use <add_operand> predicate for operand 2.  Use "le"
>        constraint for alternative 2.
>        (addsi_1_zext): Use addsi_operand predicate for operand 2.  Use "le"
>        constraint for alternative 2.
>        (add->lea splitter): Check operand modes in insn constraint.  Extend
>        operands less than SImode wide to SImode.
>        (add->lea zext splitter): Do not extend operands to DImode.
>        (*lea_general_1): Handle only QImode and HImode operands.
>        (*lea_general_2): Ditto.
>        (*lea_general_3): Ditto.
>        (*lea_general_1_zext): Remove.
>        (*lea_general_2_zext): Ditto.
>        (*lea_general_3_zext): Ditto.
>        (*lea_general_4): Check operand modes in insn constraint.  Extend
>        operands less than SImode wide to SImode.
>        (ashift->lea splitter): Ditto.
>        * config/i386/i386.c (ix86_print_operand_address): Print address
>        registers with 'q' modifier on 64bit targets.
>        * config/i386/predicates.md (pic_32bit_opreand): Define as special
>        predicate.  Reject non-SI and non-DI modes.
>        (addsi_operand): New predicate.
>
> Uros.
>

X32 glibc is miscompiled:

CPP='/export/build/gnu/gcc-x32/release/usr/gcc-4.7.0-x32/bin/gcc -mx32
 -E -x c-header'
/export/build/gnu/glibc-x32/build-x86_64-linux/elf/ld-linux-x32.so.2
--library-path /export/build/gnu/glibc-x32/build-x86_64-linux:/export/build/gnu/glibc-x32/build-x86_64-linux/math:/export/build/gnu/glibc-x32/build-x86_64-linux/elf:/export/build/gnu/glibc-x32/build-x86_64-linux/dlfcn:/export/build/gnu/glibc-x32/build-x86_64-linux/nss:/export/build/gnu/glibc-x32/build-x86_64-linux/nis:/export/build/gnu/glibc-x32/build-x86_64-linux/rt:/export/build/gnu/glibc-x32/build-x86_64-linux/resolv:/export/build/gnu/glibc-x32/build-x86_64-linux/crypt:/export/build/gnu/glibc-x32/build-x86_64-linux/nptl
/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcgen -Y
../scripts -h rpcsvc/yppasswd.x -o
/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcsvc/yppasswd.T
make[5]: *** [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcsvc/yppasswd.stmp]
Segmentation fault (core dumped)

Some LEA patterns are wrong for x32.  I will investigate.


-- 
H.J.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH, i386]: Rewrite LEA handling (was:Re: PATCH [10/n] X32: Support x32 LEA insns)
  2011-07-25 14:03     ` H.J. Lu
@ 2011-07-25 14:08       ` Uros Bizjak
  2011-07-25 14:20         ` H.J. Lu
  2011-07-25 21:16       ` [PATCH, i386, take 2]: " Uros Bizjak
  1 sibling, 1 reply; 10+ messages in thread
From: Uros Bizjak @ 2011-07-25 14:08 UTC (permalink / raw)
  To: H.J. Lu; +Cc: gcc-patches, Richard Henderson, jh

On Mon, Jul 25, 2011 at 3:30 PM, H.J. Lu <hjl.tools@gmail.com> wrote:

>> Attached patch implements -fpic handling for x32. In x32 mode, we now
>> use x86_64_general_operand and corresponding "e" constraints for adds
>> in SImode, since it looks that invalid addresses can only be generated
>> through adds. This avoids a whole bunch of new predicates and
>> constraints.

> X32 glibc is miscompiled:
>
> CPP='/export/build/gnu/gcc-x32/release/usr/gcc-4.7.0-x32/bin/gcc -mx32
>  -E -x c-header'
> /export/build/gnu/glibc-x32/build-x86_64-linux/elf/ld-linux-x32.so.2
> --library-path /export/build/gnu/glibc-x32/build-x86_64-linux:/export/build/gnu/glibc-x32/build-x86_64-linux/math:/export/build/gnu/glibc-x32/build-x86_64-linux/elf:/export/build/gnu/glibc-x32/build-x86_64-linux/dlfcn:/export/build/gnu/glibc-x32/build-x86_64-linux/nss:/export/build/gnu/glibc-x32/build-x86_64-linux/nis:/export/build/gnu/glibc-x32/build-x86_64-linux/rt:/export/build/gnu/glibc-x32/build-x86_64-linux/resolv:/export/build/gnu/glibc-x32/build-x86_64-linux/crypt:/export/build/gnu/glibc-x32/build-x86_64-linux/nptl
> /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcgen -Y
> ../scripts -h rpcsvc/yppasswd.x -o
> /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcsvc/yppasswd.T
> make[5]: *** [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcsvc/yppasswd.stmp]
> Segmentation fault (core dumped)
>
> Some LEA patterns are wrong for x32.  I will investigate.

What about x32 GCC testsuite?

Uros.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH, i386]: Rewrite LEA handling (was:Re: PATCH [10/n] X32: Support x32 LEA insns)
  2011-07-25 14:08       ` Uros Bizjak
@ 2011-07-25 14:20         ` H.J. Lu
  0 siblings, 0 replies; 10+ messages in thread
From: H.J. Lu @ 2011-07-25 14:20 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: gcc-patches, Richard Henderson, jh

On Mon, Jul 25, 2011 at 6:43 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Mon, Jul 25, 2011 at 3:30 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>
>>> Attached patch implements -fpic handling for x32. In x32 mode, we now
>>> use x86_64_general_operand and corresponding "e" constraints for adds
>>> in SImode, since it looks that invalid addresses can only be generated
>>> through adds. This avoids a whole bunch of new predicates and
>>> constraints.
>
>> X32 glibc is miscompiled:
>>
>> CPP='/export/build/gnu/gcc-x32/release/usr/gcc-4.7.0-x32/bin/gcc -mx32
>>  -E -x c-header'
>> /export/build/gnu/glibc-x32/build-x86_64-linux/elf/ld-linux-x32.so.2
>> --library-path /export/build/gnu/glibc-x32/build-x86_64-linux:/export/build/gnu/glibc-x32/build-x86_64-linux/math:/export/build/gnu/glibc-x32/build-x86_64-linux/elf:/export/build/gnu/glibc-x32/build-x86_64-linux/dlfcn:/export/build/gnu/glibc-x32/build-x86_64-linux/nss:/export/build/gnu/glibc-x32/build-x86_64-linux/nis:/export/build/gnu/glibc-x32/build-x86_64-linux/rt:/export/build/gnu/glibc-x32/build-x86_64-linux/resolv:/export/build/gnu/glibc-x32/build-x86_64-linux/crypt:/export/build/gnu/glibc-x32/build-x86_64-linux/nptl
>> /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcgen -Y
>> ../scripts -h rpcsvc/yppasswd.x -o
>> /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcsvc/yppasswd.T
>> make[5]: *** [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcsvc/yppasswd.stmp]
>> Segmentation fault (core dumped)
>>
>> Some LEA patterns are wrong for x32.  I will investigate.
>
> What about x32 GCC testsuite?
>

GCC testsuite is clean on x32.



-- 
H.J.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH, i386, take 2]: Rewrite LEA handling (was:Re: PATCH [10/n] X32: Support x32 LEA insns)
  2011-07-25 14:03     ` H.J. Lu
  2011-07-25 14:08       ` Uros Bizjak
@ 2011-07-25 21:16       ` Uros Bizjak
  2011-07-25 21:47         ` H.J. Lu
  1 sibling, 1 reply; 10+ messages in thread
From: Uros Bizjak @ 2011-07-25 21:16 UTC (permalink / raw)
  To: H.J. Lu; +Cc: gcc-patches, Richard Henderson, jh

[-- Attachment #1: Type: text/plain, Size: 3932 bytes --]

On Mon, Jul 25, 2011 at 3:30 PM, H.J. Lu <hjl.tools@gmail.com> wrote:

>> Attached patch implements -fpic handling for x32. In x32 mode, we now
>> use x86_64_general_operand and corresponding "e" constraints for adds
>> in SImode, since it looks that invalid addresses can only be generated
>> through adds. This avoids a whole bunch of new predicates and
>> constraints.

> X32 glibc is miscompiled:
>
> CPP='/export/build/gnu/gcc-x32/release/usr/gcc-4.7.0-x32/bin/gcc -mx32
>  -E -x c-header'
> /export/build/gnu/glibc-x32/build-x86_64-linux/elf/ld-linux-x32.so.2
> --library-path /export/build/gnu/glibc-x32/build-x86_64-linux:/export/build/gnu/glibc-x32/build-x86_64-linux/math:/export/build/gnu/glibc-x32/build-x86_64-linux/elf:/export/build/gnu/glibc-x32/build-x86_64-linux/dlfcn:/export/build/gnu/glibc-x32/build-x86_64-linux/nss:/export/build/gnu/glibc-x32/build-x86_64-linux/nis:/export/build/gnu/glibc-x32/build-x86_64-linux/rt:/export/build/gnu/glibc-x32/build-x86_64-linux/resolv:/export/build/gnu/glibc-x32/build-x86_64-linux/crypt:/export/build/gnu/glibc-x32/build-x86_64-linux/nptl
> /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcgen -Y
> ../scripts -h rpcsvc/yppasswd.x -o
> /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcsvc/yppasswd.T
> make[5]: *** [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcsvc/yppasswd.stmp]
> Segmentation fault (core dumped)
>
> Some LEA patterns are wrong for x32.  I will investigate.

We have to prevent symbols from entering general_operand predicated
SImode operands. Fortunatelly, x86_64_general_operand works OK for
x32, while both for i686 and x86_64 are unaffected due to early bypass
(i686) and due to the fact that all symbols are DImode (x86_64).

2011-07-25  Uros Bizjak  <ubizjak@gmail.com>
	    H.J. Lu  <hongjiu.lu@intel.com>

	PR target/47381
	PR target/49832
	PR target/49833
	* config/i386/i386.md (i): Change SImode attribute to "e".
	(g): Change SImode attribute to "rme".
	(di): Change SImode attribute to "nF".
	(general_operand): Change SImode attribute to x86_64_general_operand.
	(general_szext_operand): Change SImode attribute to
	x86_64_szext_general_operand.
	(immediate_operand): Change SImode attribute to
	x86_64_immediate_operand-
	(*movdi_internal_rex64): Remove mode from pic_32bit_operand check.
	(*movsi_internal): Ditto.  Use "e" constraint in alternative 2.
	(*lea_1): Use SWI48 mode iterator.
	(*lea_1_zext): New insn pattern.
	(*add<mode>1): Use x86_64_general_operand predicate for operand 2.
	Update operand constraints.
	(addsi_1_zext): Ditto.
	(*add<mode>2): Ditto.
	(*addsi_3_zext): Ditto.
	(*subsi_1_zext): Ditto.
	(*subsi_2_zext): Ditto.
	(*subsi_3_zext): Ditto.
	(*addsi3_carry_zext): Ditto.
	(*<plusminus_insn>si3_zext_cc_overflow): Ditto.
	(*mulsi3_1_zext): Ditto.
	(*andsi_1): Ditto.
	(*andsi_1_zext): Ditto.
	(*andsi_2_zext): Ditto.
	(*<any_or:code>si_1_zext): Ditto.
	(*<any_or:code>si_2_zext): Ditto.
	(*test<mode>_1): Use <general_operand> predicate for operand 1.
	(*and<mode>_2): Ditto.
	(add->lea splitter): Check operand modes in insn constraint.  Extend
	operands less than SImode wide to SImode.
	(add->lea zext splitter): Do not extend input operands to DImode.
	(*lea_general_1): Handle only QImode and HImode operands.
	(*lea_general_2): Ditto.
	(*lea_general_3): Ditto.
	(*lea_general_1_zext): Remove.
	(*lea_general_2_zext): Ditto.
	(*lea_general_3_zext): Ditto.
	(*lea_general_4): Check operand modes in insn constraint.  Extend
	operands less than SImode wide to SImode.
	(ashift->lea splitter): Ditto.
	* config/i386/i386.c (ix86_print_operand_address): Print address
	registers with 'q' modifier on 64bit targets.
	* config/i386/predicates.md (pic_32bit_opreand): Define as special
	predicate.  Reject non-SI and non-DI modes.

Bootstrapped and regression ested on x86_64-pc-linux-gnu {,-m32}.

Uros.

[-- Attachment #2: p.diff.txt --]
[-- Type: text/plain, Size: 26320 bytes --]

Index: config/i386/i386.md
===================================================================
--- config/i386/i386.md	(revision 176748)
+++ config/i386/i386.md	(working copy)
@@ -861,13 +861,13 @@
 (define_mode_attr r [(QI "q") (HI "r") (SI "r") (DI "r")])
 
 ;; Immediate operand constraint for integer modes.
-(define_mode_attr i [(QI "n") (HI "n") (SI "i") (DI "e")])
+(define_mode_attr i [(QI "n") (HI "n") (SI "e") (DI "e")])
 
 ;; General operand constraint for word modes.
-(define_mode_attr g [(QI "qmn") (HI "rmn") (SI "g") (DI "rme")])
+(define_mode_attr g [(QI "qmn") (HI "rmn") (SI "rme") (DI "rme")])
 
 ;; Immediate operand constraint for double integer modes.
-(define_mode_attr di [(SI "iF") (DI "e")])
+(define_mode_attr di [(SI "nF") (DI "e")])
 
 ;; Immediate operand constraint for shifts.
 (define_mode_attr S [(QI "I") (HI "I") (SI "I") (DI "J") (TI "O")])
@@ -876,7 +876,7 @@
 (define_mode_attr general_operand
 	[(QI "general_operand")
 	 (HI "general_operand")
-	 (SI "general_operand")
+	 (SI "x86_64_general_operand")
 	 (DI "x86_64_general_operand")
 	 (TI "x86_64_general_operand")])
 
@@ -884,14 +884,14 @@
 (define_mode_attr general_szext_operand
 	[(QI "general_operand")
 	 (HI "general_operand")
-	 (SI "general_operand")
+	 (SI "x86_64_szext_general_operand")
 	 (DI "x86_64_szext_general_operand")])
 
 ;; Immediate operand predicate for integer modes.
 (define_mode_attr immediate_operand
 	[(QI "immediate_operand")
 	 (HI "immediate_operand")
-	 (SI "immediate_operand")
+	 (SI "x86_64_immediate_operand")
 	 (DI "x86_64_immediate_operand")])
 
 ;; Nonmemory operand predicate for integer modes.
@@ -2039,7 +2039,7 @@
 	      (const_string "ssemov")
 	    (eq_attr "alternative" "16,17")
 	      (const_string "ssecvt")
- 	    (match_operand:DI 1 "pic_32bit_operand" "")
+ 	    (match_operand 1 "pic_32bit_operand" "")
 	      (const_string "lea")
 	   ]
 	   (const_string "imov")))
@@ -2184,7 +2184,7 @@
   [(set (match_operand:SI 0 "nonimmediate_operand"
 			"=r,m ,*y,*y,?rm,?*y,*x,*x,?r ,m ,?*Yi,*x")
 	(match_operand:SI 1 "general_operand"
-			"g ,ri,C ,*y,*y ,rm ,C ,*x,*Yi,*x,r   ,m "))]
+			"g ,re,C ,*y,*y ,rm ,C ,*x,*Yi,*x,r   ,m "))]
   "!(MEM_P (operands[0]) && MEM_P (operands[1]))"
 {
   switch (get_attr_type (insn))
@@ -2232,7 +2232,7 @@
 	      (const_string "sselog1")
 	    (eq_attr "alternative" "7,8,9,10,11")
 	      (const_string "ssemov")
- 	    (match_operand:DI 1 "pic_32bit_operand" "")
+ 	    (match_operand 1 "pic_32bit_operand" "")
 	      (const_string "lea")
 	   ]
 	   (const_string "imov")))
@@ -5425,13 +5425,22 @@
    (set_attr "mode" "QI")])
 
 (define_insn "*lea_1"
-  [(set (match_operand:P 0 "register_operand" "=r")
-	(match_operand:P 1 "no_seg_address_operand" "p"))]
+  [(set (match_operand:SWI48 0 "register_operand" "=r")
+	(match_operand:SWI48 1 "no_seg_address_operand" "p"))]
   ""
   "lea{<imodesuffix>}\t{%a1, %0|%0, %a1}"
   [(set_attr "type" "lea")
    (set_attr "mode" "<MODE>")])
 
+(define_insn "*lea_1_zext"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(zero_extend:DI
+	  (match_operand:SI 1 "no_seg_address_operand" "p")))]
+  "TARGET_64BIT"
+  "lea{l}\t{%a1, %k0|%k0, %a1}"
+  [(set_attr "type" "lea")
+   (set_attr "mode" "SI")])
+
 (define_insn "*lea_2"
   [(set (match_operand:SI 0 "register_operand" "=r")
 	(subreg:SI (match_operand:DI 1 "no_seg_address_operand" "p") 0))]
@@ -5453,7 +5462,7 @@
   [(set (match_operand:SWI48 0 "nonimmediate_operand" "=r,rm,r,r")
 	(plus:SWI48
 	  (match_operand:SWI48 1 "nonimmediate_operand" "%0,0,r,r")
-	  (match_operand:SWI48 2 "<general_operand>" "<g>,r<i>,0,l<i>")))
+	  (match_operand:SWI48 2 "x86_64_general_operand" "rme,re,0,le")))
    (clobber (reg:CC FLAGS_REG))]
   "ix86_binary_operator_ok (PLUS, <MODE>mode, operands)"
 {
@@ -5512,7 +5521,7 @@
   [(set (match_operand:DI 0 "register_operand" "=r,r,r")
 	(zero_extend:DI
 	  (plus:SI (match_operand:SI 1 "nonimmediate_operand" "%0,r,r")
-		   (match_operand:SI 2 "general_operand" "g,0,li"))))
+		   (match_operand:SI 2 "x86_64_general_operand" "rme,0,le"))))
    (clobber (reg:CC FLAGS_REG))]
   "TARGET_64BIT && ix86_binary_operator_ok (PLUS, SImode, operands)"
 {
@@ -5794,39 +5803,36 @@
         (const_string "none")))
    (set_attr "mode" "QI")])
 
-;; Convert lea to the lea pattern to avoid flags dependency.
+;; Convert add to the lea pattern to avoid flags dependency.
 (define_split
   [(set (match_operand 0 "register_operand" "")
 	(plus (match_operand 1 "register_operand" "")
               (match_operand 2 "nonmemory_operand" "")))
    (clobber (reg:CC FLAGS_REG))]
-  "reload_completed && ix86_lea_for_add_ok (insn, operands)" 
+  "GET_MODE (operands[0]) == GET_MODE (operands[1])
+   && (GET_MODE (operands[0]) == GET_MODE (operands[2])
+       || GET_MODE (operands[2]) == VOIDmode)
+   && reload_completed && ix86_lea_for_add_ok (insn, operands)" 
   [(const_int 0)]
 {
-  rtx pat;
   enum machine_mode mode = GET_MODE (operands[0]);
-
-  /* In -fPIC mode the constructs like (const (unspec [symbol_ref]))
-     may confuse gen_lowpart.  */
-  if (mode != Pmode)
-    {
-      operands[1] = gen_lowpart (Pmode, operands[1]);
-      operands[2] = gen_lowpart (Pmode, operands[2]);
-    }
-
-  pat = gen_rtx_PLUS (Pmode, operands[1], operands[2]);
+  rtx pat;
 
   if (GET_MODE_SIZE (mode) < GET_MODE_SIZE (SImode))
-    operands[0] = gen_lowpart (SImode, operands[0]);
+    { 
+      mode = SImode; 
+      operands[0] = gen_lowpart (mode, operands[0]);
+      operands[1] = gen_lowpart (mode, operands[1]);
+      operands[2] = gen_lowpart (mode, operands[2]);
+    }
 
-  if (TARGET_64BIT && mode != Pmode)
-    pat = gen_rtx_SUBREG (SImode, pat, 0);
+  pat = gen_rtx_PLUS (mode, operands[1], operands[2]);
 
   emit_insn (gen_rtx_SET (VOIDmode, operands[0], pat));
   DONE;
 })
 
-;; Convert lea to the lea pattern to avoid flags dependency.
+;; Convert add to the lea pattern to avoid flags dependency.
 ;; ??? This pattern handles immediate operands that do not satisfy immediate
 ;; operand predicate (TARGET_LEGITIMATE_CONSTANT_P) in the previous pattern.
 (define_split
@@ -5839,7 +5845,7 @@
   [(set (match_dup 0)
 	(plus:DI (match_dup 1) (match_dup 2)))])
 
-;; Convert lea to the lea pattern to avoid flags dependency.
+;; Convert add to the lea pattern to avoid flags dependency.
 (define_split
   [(set (match_operand:DI 0 "register_operand" "")
 	(zero_extend:DI
@@ -5849,11 +5855,7 @@
   "TARGET_64BIT && reload_completed
    && ix86_lea_for_add_ok (insn, operands)"
   [(set (match_dup 0)
-	(zero_extend:DI (subreg:SI (plus:DI (match_dup 1) (match_dup 2)) 0)))]
-{
-  operands[1] = gen_lowpart (DImode, operands[1]);
-  operands[2] = gen_lowpart (DImode, operands[2]);
-})
+	(zero_extend:DI (plus:SI (match_dup 1) (match_dup 2))))])
 
 (define_insn "*add<mode>_2"
   [(set (reg FLAGS_REG)
@@ -5901,7 +5903,7 @@
   [(set (reg FLAGS_REG)
 	(compare
 	  (plus:SI (match_operand:SI 1 "nonimmediate_operand" "%0")
-		   (match_operand:SI 2 "general_operand" "g"))
+		   (match_operand:SI 2 "x86_64_general_operand" "rme"))
 	  (const_int 0)))
    (set (match_operand:DI 0 "register_operand" "=r")
 	(zero_extend:DI (plus:SI (match_dup 1) (match_dup 2))))]
@@ -5979,7 +5981,7 @@
 (define_insn "*addsi_3_zext"
   [(set (reg FLAGS_REG)
 	(compare
-	  (neg:SI (match_operand:SI 2 "general_operand" "g"))
+	  (neg:SI (match_operand:SI 2 "x86_64_general_operand" "rme"))
 	  (match_operand:SI 1 "nonimmediate_operand" "%0")))
    (set (match_operand:DI 0 "register_operand" "=r")
 	(zero_extend:DI (plus:SI (match_dup 1) (match_dup 2))))]
@@ -6233,7 +6235,7 @@
   [(set_attr "type" "alu")
    (set_attr "mode" "QI")])
 
-;; The lea patterns for non-Pmodes needs to be matched by
+;; The lea patterns for modes less than 32 bits need to be matched by
 ;; several insns converted to real lea by splitters.
 
 (define_insn_and_split "*lea_general_1"
@@ -6241,8 +6243,7 @@
 	(plus (plus (match_operand 1 "index_register_operand" "l")
 		    (match_operand 2 "register_operand" "r"))
 	      (match_operand 3 "immediate_operand" "i")))]
-  "(GET_MODE (operands[0]) == QImode || GET_MODE (operands[0]) == HImode
-    || (TARGET_64BIT && GET_MODE (operands[0]) == SImode))
+  "(GET_MODE (operands[0]) == QImode || GET_MODE (operands[0]) == HImode)
    && (!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun))
    && GET_MODE (operands[0]) == GET_MODE (operands[1])
    && GET_MODE (operands[0]) == GET_MODE (operands[2])
@@ -6252,53 +6253,30 @@
   "&& reload_completed"
   [(const_int 0)]
 {
+  enum machine_mode mode = SImode;
   rtx pat;
-  operands[0] = gen_lowpart (SImode, operands[0]);
-  operands[1] = gen_lowpart (Pmode, operands[1]);
-  operands[2] = gen_lowpart (Pmode, operands[2]);
-  operands[3] = gen_lowpart (Pmode, operands[3]);
-  pat = gen_rtx_PLUS (Pmode, gen_rtx_PLUS (Pmode, operands[1], operands[2]),
+
+  operands[0] = gen_lowpart (mode, operands[0]);
+  operands[1] = gen_lowpart (mode, operands[1]);
+  operands[2] = gen_lowpart (mode, operands[2]);
+  operands[3] = gen_lowpart (mode, operands[3]);
+
+  pat = gen_rtx_PLUS (mode, gen_rtx_PLUS (mode, operands[1], operands[2]),
   		      operands[3]);
-  if (Pmode != SImode)
-    pat = gen_rtx_SUBREG (SImode, pat, 0);
+
   emit_insn (gen_rtx_SET (VOIDmode, operands[0], pat));
   DONE;
 }
   [(set_attr "type" "lea")
    (set_attr "mode" "SI")])
 
-(define_insn_and_split "*lea_general_1_zext"
-  [(set (match_operand:DI 0 "register_operand" "=r")
-	(zero_extend:DI
-	  (plus:SI (plus:SI
-		     (match_operand:SI 1 "index_register_operand" "l")
-		     (match_operand:SI 2 "register_operand" "r"))
-		   (match_operand:SI 3 "immediate_operand" "i"))))]
-  "TARGET_64BIT"
-  "#"
-  "&& reload_completed"
-  [(set (match_dup 0)
-	(zero_extend:DI (subreg:SI (plus:DI (plus:DI (match_dup 1)
-						     (match_dup 2))
-					    (match_dup 3)) 0)))]
-{
-  operands[1] = gen_lowpart (Pmode, operands[1]);
-  operands[2] = gen_lowpart (Pmode, operands[2]);
-  operands[3] = gen_lowpart (Pmode, operands[3]);
-}
-  [(set_attr "type" "lea")
-   (set_attr "mode" "SI")])
-
 (define_insn_and_split "*lea_general_2"
   [(set (match_operand 0 "register_operand" "=r")
 	(plus (mult (match_operand 1 "index_register_operand" "l")
 		    (match_operand 2 "const248_operand" "i"))
 	      (match_operand 3 "nonmemory_operand" "ri")))]
-  "(GET_MODE (operands[0]) == QImode || GET_MODE (operands[0]) == HImode
-    || (TARGET_64BIT && GET_MODE (operands[0]) == SImode))
-   && (!TARGET_PARTIAL_REG_STALL
-       || GET_MODE (operands[0]) == SImode
-       || optimize_function_for_size_p (cfun))
+  "(GET_MODE (operands[0]) == QImode || GET_MODE (operands[0]) == HImode)
+   && (!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun))
    && GET_MODE (operands[0]) == GET_MODE (operands[1])
    && (GET_MODE (operands[0]) == GET_MODE (operands[3])
        || GET_MODE (operands[3]) == VOIDmode)"
@@ -6306,110 +6284,68 @@
   "&& reload_completed"
   [(const_int 0)]
 {
+  enum machine_mode mode = SImode;
   rtx pat;
-  operands[0] = gen_lowpart (SImode, operands[0]);
-  operands[1] = gen_lowpart (Pmode, operands[1]);
-  operands[3] = gen_lowpart (Pmode, operands[3]);
-  pat = gen_rtx_PLUS (Pmode, gen_rtx_MULT (Pmode, operands[1], operands[2]),
-  		      operands[3]);
-  if (Pmode != SImode)
-    pat = gen_rtx_SUBREG (SImode, pat, 0);
+
+  operands[0] = gen_lowpart (mode, operands[0]);
+  operands[1] = gen_lowpart (mode, operands[1]);
+  operands[3] = gen_lowpart (mode, operands[3]);
+
+  pat = gen_rtx_PLUS (mode, gen_rtx_MULT (mode, operands[1], operands[2]),
+		      operands[3]);
+
   emit_insn (gen_rtx_SET (VOIDmode, operands[0], pat));
   DONE;
 }
   [(set_attr "type" "lea")
    (set_attr "mode" "SI")])
 
-(define_insn_and_split "*lea_general_2_zext"
-  [(set (match_operand:DI 0 "register_operand" "=r")
-	(zero_extend:DI
-	  (plus:SI (mult:SI
-		     (match_operand:SI 1 "index_register_operand" "l")
-		     (match_operand:SI 2 "const248_operand" "n"))
-		   (match_operand:SI 3 "nonmemory_operand" "ri"))))]
-  "TARGET_64BIT"
-  "#"
-  "&& reload_completed"
-  [(set (match_dup 0)
-	(zero_extend:DI (subreg:SI (plus:DI (mult:DI (match_dup 1)
-						     (match_dup 2))
-					    (match_dup 3)) 0)))]
-{
-  operands[1] = gen_lowpart (Pmode, operands[1]);
-  operands[3] = gen_lowpart (Pmode, operands[3]);
-}
-  [(set_attr "type" "lea")
-   (set_attr "mode" "SI")])
-
 (define_insn_and_split "*lea_general_3"
   [(set (match_operand 0 "register_operand" "=r")
 	(plus (plus (mult (match_operand 1 "index_register_operand" "l")
 			  (match_operand 2 "const248_operand" "i"))
 		    (match_operand 3 "register_operand" "r"))
 	      (match_operand 4 "immediate_operand" "i")))]
-  "(GET_MODE (operands[0]) == QImode || GET_MODE (operands[0]) == HImode
-    || (TARGET_64BIT && GET_MODE (operands[0]) == SImode))
-   && (!TARGET_PARTIAL_REG_STALL
-       || GET_MODE (operands[0]) == SImode
-       || optimize_function_for_size_p (cfun))
+  "(GET_MODE (operands[0]) == QImode || GET_MODE (operands[0]) == HImode)
+   && (!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun))
    && GET_MODE (operands[0]) == GET_MODE (operands[1])
    && GET_MODE (operands[0]) == GET_MODE (operands[3])"
   "#"
   "&& reload_completed"
   [(const_int 0)]
 {
+  enum machine_mode mode = SImode;
   rtx pat;
-  operands[0] = gen_lowpart (SImode, operands[0]);
-  operands[1] = gen_lowpart (Pmode, operands[1]);
-  operands[3] = gen_lowpart (Pmode, operands[3]);
-  operands[4] = gen_lowpart (Pmode, operands[4]);
-  pat = gen_rtx_PLUS (Pmode,
-  		      gen_rtx_PLUS (Pmode, gen_rtx_MULT (Pmode, operands[1],
-		      					 operands[2]),
+
+  operands[0] = gen_lowpart (mode, operands[0]);
+  operands[1] = gen_lowpart (mode, operands[1]);
+  operands[3] = gen_lowpart (mode, operands[3]);
+  operands[4] = gen_lowpart (mode, operands[4]);
+
+  pat = gen_rtx_PLUS (mode,
+  		      gen_rtx_PLUS (mode,
+				    gen_rtx_MULT (mode, operands[1],
+		      					operands[2]),
 				    operands[3]),
   		      operands[4]);
-  if (Pmode != SImode)
-    pat = gen_rtx_SUBREG (SImode, pat, 0);
+
   emit_insn (gen_rtx_SET (VOIDmode, operands[0], pat));
   DONE;
 }
   [(set_attr "type" "lea")
    (set_attr "mode" "SI")])
 
-(define_insn_and_split "*lea_general_3_zext"
-  [(set (match_operand:DI 0 "register_operand" "=r")
-	(zero_extend:DI
-	  (plus:SI (plus:SI
-		     (mult:SI
-		       (match_operand:SI 1 "index_register_operand" "l")
-		       (match_operand:SI 2 "const248_operand" "n"))
-		     (match_operand:SI 3 "register_operand" "r"))
-		   (match_operand:SI 4 "immediate_operand" "i"))))]
-  "TARGET_64BIT"
-  "#"
-  "&& reload_completed"
-  [(set (match_dup 0)
-	(zero_extend:DI (subreg:SI (plus:DI (plus:DI (mult:DI (match_dup 1)
-							      (match_dup 2))
-						     (match_dup 3))
-					    (match_dup 4)) 0)))]
-{
-  operands[1] = gen_lowpart (Pmode, operands[1]);
-  operands[3] = gen_lowpart (Pmode, operands[3]);
-  operands[4] = gen_lowpart (Pmode, operands[4]);
-}
-  [(set_attr "type" "lea")
-   (set_attr "mode" "SI")])
-
 (define_insn_and_split "*lea_general_4"
-  [(set (match_operand:SWI 0 "register_operand" "=r")
-	(any_or:SWI (ashift:SWI (match_operand:SWI 1 "index_register_operand" "l")
-				(match_operand:SWI 2 "const_int_operand" "n"))
-		    (match_operand 3 "const_int_operand" "n")))]
-  "(<MODE>mode == DImode
-    || <MODE>mode == SImode
-    || !TARGET_PARTIAL_REG_STALL
-    || optimize_function_for_size_p (cfun))
+  [(set (match_operand 0 "register_operand" "=r")
+	(any_or (ashift
+		  (match_operand 1 "index_register_operand" "l")
+		  (match_operand 2 "const_int_operand" "n"))
+		(match_operand 3 "const_int_operand" "n")))]
+  "(((GET_MODE (operands[0]) == QImode || GET_MODE (operands[0]) == HImode)
+      && (!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun)))
+    || GET_MODE (operands[0]) == SImode
+    || (TARGET_64BIT && GET_MODE (operands[0]) == DImode))
+   && GET_MODE (operands[0]) == GET_MODE (operands[1])
    && ((unsigned HOST_WIDE_INT) INTVAL (operands[2])) - 1 < 3
    && ((unsigned HOST_WIDE_INT) INTVAL (operands[3])
        < ((unsigned HOST_WIDE_INT) 1 << INTVAL (operands[2])))"
@@ -6417,23 +6353,29 @@
   "&& reload_completed"
   [(const_int 0)]
 {
+  enum machine_mode mode = GET_MODE (operands[0]);
   rtx pat;
-  if (<MODE>mode != DImode)
-    operands[0] = gen_lowpart (SImode, operands[0]);
-  operands[1] = gen_lowpart (Pmode, operands[1]);
+
+  if (GET_MODE_SIZE (mode) < GET_MODE_SIZE (SImode))
+    { 
+      mode = SImode; 
+      operands[0] = gen_lowpart (mode, operands[0]);
+      operands[1] = gen_lowpart (mode, operands[1]);
+    }
+
   operands[2] = GEN_INT (1 << INTVAL (operands[2]));
-  pat = plus_constant (gen_rtx_MULT (Pmode, operands[1], operands[2]),
+
+  pat = plus_constant (gen_rtx_MULT (mode, operands[1], operands[2]),
 		       INTVAL (operands[3]));
-  if (Pmode != SImode && <MODE>mode != DImode)
-    pat = gen_rtx_SUBREG (SImode, pat, 0);
+
   emit_insn (gen_rtx_SET (VOIDmode, operands[0], pat));
   DONE;
 }
   [(set_attr "type" "lea")
    (set (attr "mode")
-      (if_then_else (eq (symbol_ref "<MODE>mode == DImode") (const_int 0))
-	(const_string "SI")
-	(const_string "DI")))])
+      (if_then_else (match_operand:DI 0 "" "")
+	(const_string "DI")
+	(const_string "SI")))])
 \f
 ;; Subtract instructions
 
@@ -6481,7 +6423,7 @@
   [(set (match_operand:DI 0 "register_operand" "=r")
 	(zero_extend:DI
 	  (minus:SI (match_operand:SI 1 "register_operand" "0")
-		    (match_operand:SI 2 "general_operand" "g"))))
+		    (match_operand:SI 2 "x86_64_general_operand" "rme"))))
    (clobber (reg:CC FLAGS_REG))]
   "TARGET_64BIT && ix86_binary_operator_ok (MINUS, SImode, operands)"
   "sub{l}\t{%2, %k0|%k0, %2}"
@@ -6518,7 +6460,7 @@
   [(set (reg FLAGS_REG)
 	(compare
 	  (minus:SI (match_operand:SI 1 "register_operand" "0")
-		    (match_operand:SI 2 "general_operand" "g"))
+		    (match_operand:SI 2 "x86_64_general_operand" "rme"))
 	  (const_int 0)))
    (set (match_operand:DI 0 "register_operand" "=r")
 	(zero_extend:DI
@@ -6545,7 +6487,7 @@
 (define_insn "*subsi_3_zext"
   [(set (reg FLAGS_REG)
 	(compare (match_operand:SI 1 "register_operand" "0")
-		 (match_operand:SI 2 "general_operand" "g")))
+		 (match_operand:SI 2 "x86_64_general_operand" "rme")))
    (set (match_operand:DI 0 "register_operand" "=r")
 	(zero_extend:DI
 	  (minus:SI (match_dup 1)
@@ -6592,7 +6534,7 @@
 	  (plus:SI (match_operand:SI 1 "nonimmediate_operand" "%0")
 		   (plus:SI (match_operator 3 "ix86_carry_flag_operator"
 			     [(reg FLAGS_REG) (const_int 0)])
-			    (match_operand:SI 2 "general_operand" "g")))))
+			    (match_operand:SI 2 "x86_64_general_operand" "rme")))))
    (clobber (reg:CC FLAGS_REG))]
   "TARGET_64BIT && ix86_binary_operator_ok (PLUS, SImode, operands)"
   "adc{l}\t{%2, %k0|%k0, %2}"
@@ -6607,7 +6549,7 @@
 	  (minus:SI (match_operand:SI 1 "register_operand" "0")
 		    (plus:SI (match_operator 3 "ix86_carry_flag_operator"
 			      [(reg FLAGS_REG) (const_int 0)])
-			     (match_operand:SI 2 "general_operand" "g")))))
+			     (match_operand:SI 2 "x86_64_general_operand" "rme")))))
    (clobber (reg:CC FLAGS_REG))]
   "TARGET_64BIT && ix86_binary_operator_ok (MINUS, SImode, operands)"
   "sbb{l}\t{%2, %k0|%k0, %2}"
@@ -6661,7 +6603,7 @@
 	(compare:CCC
 	  (plusminus:SI
 	    (match_operand:SI 1 "nonimmediate_operand" "<comm>0")
-	    (match_operand:SI 2 "general_operand" "g"))
+	    (match_operand:SI 2 "x86_64_general_operand" "rme"))
 	  (match_dup 1)))
    (set (match_operand:DI 0 "register_operand" "=r")
 	(zero_extend:DI (plusminus:SI (match_dup 1) (match_dup 2))))]
@@ -6748,7 +6690,7 @@
   [(set (match_operand:DI 0 "register_operand" "=r,r,r")
 	(zero_extend:DI
 	  (mult:SI (match_operand:SI 1 "nonimmediate_operand" "%rm,rm,0")
-		   (match_operand:SI 2 "general_operand" "K,i,mr"))))
+		   (match_operand:SI 2 "x86_64_general_operand" "K,e,mr"))))
    (clobber (reg:CC FLAGS_REG))]
   "TARGET_64BIT
    && !(MEM_P (operands[1]) && MEM_P (operands[2]))"
@@ -7440,7 +7382,7 @@
 	(compare
 	 (and:SWI124
 	  (match_operand:SWI124 0 "nonimmediate_operand" "%!*a,<r>,<r>m")
-	  (match_operand:SWI124 1 "general_operand" "<i>,<i>,<r><i>"))
+	  (match_operand:SWI124 1 "<general_operand>" "<i>,<i>,<r><i>"))
 	 (const_int 0)))]
   "ix86_match_ccmode (insn, CCNOmode)
    && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
@@ -7730,7 +7672,7 @@
 (define_insn "*andsi_1"
   [(set (match_operand:SI 0 "nonimmediate_operand" "=rm,r,r")
 	(and:SI (match_operand:SI 1 "nonimmediate_operand" "%0,0,qm")
-		(match_operand:SI 2 "general_operand" "ri,rm,L")))
+		(match_operand:SI 2 "x86_64_general_operand" "re,rm,L")))
    (clobber (reg:CC FLAGS_REG))]
   "ix86_binary_operator_ok (AND, SImode, operands)"
 {
@@ -7777,7 +7719,7 @@
   [(set (match_operand:DI 0 "register_operand" "=r")
 	(zero_extend:DI
 	  (and:SI (match_operand:SI 1 "nonimmediate_operand" "%0")
-		  (match_operand:SI 2 "general_operand" "g"))))
+		  (match_operand:SI 2 "x86_64_general_operand" "rme"))))
    (clobber (reg:CC FLAGS_REG))]
   "TARGET_64BIT && ix86_binary_operator_ok (AND, SImode, operands)"
   "and{l}\t{%2, %k0|%k0, %2}"
@@ -7925,7 +7867,7 @@
   [(set (reg FLAGS_REG)
 	(compare (and:SWI124
 		  (match_operand:SWI124 1 "nonimmediate_operand" "%0,0")
-		  (match_operand:SWI124 2 "general_operand" "<g>,<r><i>"))
+		  (match_operand:SWI124 2 "<general_operand>" "<g>,<r><i>"))
 		 (const_int 0)))
    (set (match_operand:SWI124 0 "nonimmediate_operand" "=<r>,<r>m")
 	(and:SWI124 (match_dup 1) (match_dup 2)))]
@@ -7940,7 +7882,7 @@
   [(set (reg FLAGS_REG)
 	(compare (and:SI
 		  (match_operand:SI 1 "nonimmediate_operand" "%0")
-		  (match_operand:SI 2 "general_operand" "g"))
+		  (match_operand:SI 2 "x86_64_general_operand" "rme"))
 		 (const_int 0)))
    (set (match_operand:DI 0 "register_operand" "=r")
 	(zero_extend:DI (and:SI (match_dup 1) (match_dup 2))))]
@@ -8157,7 +8099,7 @@
   [(set (match_operand:DI 0 "register_operand" "=r")
 	(zero_extend:DI
 	 (any_or:SI (match_operand:SI 1 "nonimmediate_operand" "%0")
-		    (match_operand:SI 2 "general_operand" "g"))))
+		    (match_operand:SI 2 "x86_64_general_operand" "rme"))))
    (clobber (reg:CC FLAGS_REG))]
   "TARGET_64BIT && ix86_binary_operator_ok (<CODE>, SImode, operands)"
   "<logic>{l}\t{%2, %k0|%k0, %2}"
@@ -8205,7 +8147,7 @@
 (define_insn "*<code>si_2_zext"
   [(set (reg FLAGS_REG)
 	(compare (any_or:SI (match_operand:SI 1 "nonimmediate_operand" "%0")
-			    (match_operand:SI 2 "general_operand" "g"))
+			    (match_operand:SI 2 "x86_64_general_operand" "rme"))
 		 (const_int 0)))
    (set (match_operand:DI 0 "register_operand" "=r")
 	(zero_extend:DI (any_or:SI (match_dup 1) (match_dup 2))))]
@@ -9395,36 +9337,36 @@
        (const_string "*")))
    (set_attr "mode" "QI")])
 
-;; Convert lea to the lea pattern to avoid flags dependency.
+;; Convert ashift to the lea pattern to avoid flags dependency.
 (define_split
   [(set (match_operand 0 "register_operand" "")
 	(ashift (match_operand 1 "index_register_operand" "")
                 (match_operand:QI 2 "const_int_operand" "")))
    (clobber (reg:CC FLAGS_REG))]
-  "reload_completed
+  "GET_MODE (operands[0]) == GET_MODE (operands[1])
+   && reload_completed
    && true_regnum (operands[0]) != true_regnum (operands[1])"
   [(const_int 0)]
 {
-  rtx pat;
   enum machine_mode mode = GET_MODE (operands[0]);
-
-  if (mode != Pmode)
-    operands[1] = gen_lowpart (Pmode, operands[1]);
-  operands[2] = gen_int_mode (1 << INTVAL (operands[2]), Pmode);
-
-  pat = gen_rtx_MULT (Pmode, operands[1], operands[2]);
+  rtx pat;
 
   if (GET_MODE_SIZE (mode) < GET_MODE_SIZE (SImode))
-    operands[0] = gen_lowpart (SImode, operands[0]);
+    { 
+      mode = SImode; 
+      operands[0] = gen_lowpart (mode, operands[0]);
+      operands[1] = gen_lowpart (mode, operands[1]);
+    }
+
+  operands[2] = gen_int_mode (1 << INTVAL (operands[2]), mode);
 
-  if (TARGET_64BIT && mode != Pmode)
-    pat = gen_rtx_SUBREG (SImode, pat, 0);
+  pat = gen_rtx_MULT (mode, operands[1], operands[2]);
 
   emit_insn (gen_rtx_SET (VOIDmode, operands[0], pat));
   DONE;
 })
 
-;; Convert lea to the lea pattern to avoid flags dependency.
+;; Convert ashift to the lea pattern to avoid flags dependency.
 (define_split
   [(set (match_operand:DI 0 "register_operand" "")
 	(zero_extend:DI
Index: config/i386/predicates.md
===================================================================
--- config/i386/predicates.md	(revision 176748)
+++ config/i386/predicates.md	(working copy)
@@ -366,9 +366,13 @@
 
 ;; Return true when operand is PIC expression that can be computed by lea
 ;; operation.
-(define_predicate "pic_32bit_operand"
+(define_special_predicate "pic_32bit_operand"
   (match_code "const,symbol_ref,label_ref")
 {
+  if (GET_MODE (op) != SImode
+      && GET_MODE (op) != DImode)
+    return false;
+
   if (!flag_pic)
     return false;
   /* Rule out relocations that translate into 64bit constants.  */
Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c	(revision 176748)
+++ config/i386/i386.c	(working copy)
@@ -14122,6 +14122,9 @@ ix86_print_operand_address (FILE *file, 
     }
   else
     {
+      /* Use DImode registers in address on 64bit target.  */
+      int code = TARGET_64BIT ? 'q' : 0;
+
       if (ASSEMBLER_DIALECT == ASM_ATT)
 	{
 	  if (disp)
@@ -14136,11 +14139,11 @@ ix86_print_operand_address (FILE *file, 
 
 	  putc ('(', file);
 	  if (base)
-	    print_reg (base, 0, file);
+	    print_reg (base, code, file);
 	  if (index)
 	    {
 	      putc (',', file);
-	      print_reg (index, 0, file);
+	      print_reg (index, code, file);
 	      if (scale != 1)
 		fprintf (file, ",%d", scale);
 	    }
@@ -14175,7 +14178,7 @@ ix86_print_operand_address (FILE *file, 
 	  putc ('[', file);
 	  if (base)
 	    {
-	      print_reg (base, 0, file);
+	      print_reg (base, code, file);
 	      if (offset)
 		{
 		  if (INTVAL (offset) >= 0)
@@ -14191,7 +14194,7 @@ ix86_print_operand_address (FILE *file, 
 	  if (index)
 	    {
 	      putc ('+', file);
-	      print_reg (index, 0, file);
+	      print_reg (index, code, file);
 	      if (scale != 1)
 		fprintf (file, "*%d", scale);
 	    }

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH, i386, take 2]: Rewrite LEA handling (was:Re: PATCH [10/n] X32: Support x32 LEA insns)
  2011-07-25 21:16       ` [PATCH, i386, take 2]: " Uros Bizjak
@ 2011-07-25 21:47         ` H.J. Lu
  2011-07-25 21:52           ` Uros Bizjak
  0 siblings, 1 reply; 10+ messages in thread
From: H.J. Lu @ 2011-07-25 21:47 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: gcc-patches, Richard Henderson, jh

On Mon, Jul 25, 2011 at 12:59 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Mon, Jul 25, 2011 at 3:30 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>
>>> Attached patch implements -fpic handling for x32. In x32 mode, we now
>>> use x86_64_general_operand and corresponding "e" constraints for adds
>>> in SImode, since it looks that invalid addresses can only be generated
>>> through adds. This avoids a whole bunch of new predicates and
>>> constraints.
>
>> X32 glibc is miscompiled:
>>
>> CPP='/export/build/gnu/gcc-x32/release/usr/gcc-4.7.0-x32/bin/gcc -mx32
>>  -E -x c-header'
>> /export/build/gnu/glibc-x32/build-x86_64-linux/elf/ld-linux-x32.so.2
>> --library-path /export/build/gnu/glibc-x32/build-x86_64-linux:/export/build/gnu/glibc-x32/build-x86_64-linux/math:/export/build/gnu/glibc-x32/build-x86_64-linux/elf:/export/build/gnu/glibc-x32/build-x86_64-linux/dlfcn:/export/build/gnu/glibc-x32/build-x86_64-linux/nss:/export/build/gnu/glibc-x32/build-x86_64-linux/nis:/export/build/gnu/glibc-x32/build-x86_64-linux/rt:/export/build/gnu/glibc-x32/build-x86_64-linux/resolv:/export/build/gnu/glibc-x32/build-x86_64-linux/crypt:/export/build/gnu/glibc-x32/build-x86_64-linux/nptl
>> /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcgen -Y
>> ../scripts -h rpcsvc/yppasswd.x -o
>> /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcsvc/yppasswd.T
>> make[5]: *** [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcsvc/yppasswd.stmp]
>> Segmentation fault (core dumped)
>>
>> Some LEA patterns are wrong for x32.  I will investigate.
>
> We have to prevent symbols from entering general_operand predicated
> SImode operands. Fortunatelly, x86_64_general_operand works OK for
> x32, while both for i686 and x86_64 are unaffected due to early bypass
> (i686) and due to the fact that all symbols are DImode (x86_64).
>
> 2011-07-25  Uros Bizjak  <ubizjak@gmail.com>
>            H.J. Lu  <hongjiu.lu@intel.com>
>
>        PR target/47381
>        PR target/49832
>        PR target/49833
>        * config/i386/i386.md (i): Change SImode attribute to "e".
>        (g): Change SImode attribute to "rme".
>        (di): Change SImode attribute to "nF".
>        (general_operand): Change SImode attribute to x86_64_general_operand.
>        (general_szext_operand): Change SImode attribute to
>        x86_64_szext_general_operand.
>        (immediate_operand): Change SImode attribute to
>        x86_64_immediate_operand-
>        (*movdi_internal_rex64): Remove mode from pic_32bit_operand check.
>        (*movsi_internal): Ditto.  Use "e" constraint in alternative 2.
>        (*lea_1): Use SWI48 mode iterator.
>        (*lea_1_zext): New insn pattern.
>        (*add<mode>1): Use x86_64_general_operand predicate for operand 2.
>        Update operand constraints.
>        (addsi_1_zext): Ditto.
>        (*add<mode>2): Ditto.
>        (*addsi_3_zext): Ditto.
>        (*subsi_1_zext): Ditto.
>        (*subsi_2_zext): Ditto.
>        (*subsi_3_zext): Ditto.
>        (*addsi3_carry_zext): Ditto.
>        (*<plusminus_insn>si3_zext_cc_overflow): Ditto.
>        (*mulsi3_1_zext): Ditto.
>        (*andsi_1): Ditto.
>        (*andsi_1_zext): Ditto.
>        (*andsi_2_zext): Ditto.
>        (*<any_or:code>si_1_zext): Ditto.
>        (*<any_or:code>si_2_zext): Ditto.
>        (*test<mode>_1): Use <general_operand> predicate for operand 1.
>        (*and<mode>_2): Ditto.
>        (add->lea splitter): Check operand modes in insn constraint.  Extend
>        operands less than SImode wide to SImode.
>        (add->lea zext splitter): Do not extend input operands to DImode.
>        (*lea_general_1): Handle only QImode and HImode operands.
>        (*lea_general_2): Ditto.
>        (*lea_general_3): Ditto.
>        (*lea_general_1_zext): Remove.
>        (*lea_general_2_zext): Ditto.
>        (*lea_general_3_zext): Ditto.
>        (*lea_general_4): Check operand modes in insn constraint.  Extend
>        operands less than SImode wide to SImode.
>        (ashift->lea splitter): Ditto.
>        * config/i386/i386.c (ix86_print_operand_address): Print address
>        registers with 'q' modifier on 64bit targets.
>        * config/i386/predicates.md (pic_32bit_opreand): Define as special
>        predicate.  Reject non-SI and non-DI modes.
>
> Bootstrapped and regression ested on x86_64-pc-linux-gnu {,-m32}.
>
> Uros.
>

GCC and glibc testsuites are clean on x32.  Can you check it in?

Thanks.


-- 
H.J.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH, i386, take 2]: Rewrite LEA handling (was:Re: PATCH [10/n] X32: Support x32 LEA insns)
  2011-07-25 21:47         ` H.J. Lu
@ 2011-07-25 21:52           ` Uros Bizjak
  2011-07-26 13:26             ` Uros Bizjak
  0 siblings, 1 reply; 10+ messages in thread
From: Uros Bizjak @ 2011-07-25 21:52 UTC (permalink / raw)
  To: H.J. Lu; +Cc: gcc-patches, Richard Henderson, jh

On Mon, Jul 25, 2011 at 11:05 PM, H.J. Lu <hjl.tools@gmail.com> wrote:

>>>> Attached patch implements -fpic handling for x32. In x32 mode, we now
>>>> use x86_64_general_operand and corresponding "e" constraints for adds
>>>> in SImode, since it looks that invalid addresses can only be generated
>>>> through adds. This avoids a whole bunch of new predicates and
>>>> constraints.
>>
>>> X32 glibc is miscompiled:
>>>
>>> CPP='/export/build/gnu/gcc-x32/release/usr/gcc-4.7.0-x32/bin/gcc -mx32
>>>  -E -x c-header'
>>> /export/build/gnu/glibc-x32/build-x86_64-linux/elf/ld-linux-x32.so.2
>>> --library-path /export/build/gnu/glibc-x32/build-x86_64-linux:/export/build/gnu/glibc-x32/build-x86_64-linux/math:/export/build/gnu/glibc-x32/build-x86_64-linux/elf:/export/build/gnu/glibc-x32/build-x86_64-linux/dlfcn:/export/build/gnu/glibc-x32/build-x86_64-linux/nss:/export/build/gnu/glibc-x32/build-x86_64-linux/nis:/export/build/gnu/glibc-x32/build-x86_64-linux/rt:/export/build/gnu/glibc-x32/build-x86_64-linux/resolv:/export/build/gnu/glibc-x32/build-x86_64-linux/crypt:/export/build/gnu/glibc-x32/build-x86_64-linux/nptl
>>> /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcgen -Y
>>> ../scripts -h rpcsvc/yppasswd.x -o
>>> /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcsvc/yppasswd.T
>>> make[5]: *** [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcsvc/yppasswd.stmp]
>>> Segmentation fault (core dumped)
>>>
>>> Some LEA patterns are wrong for x32.  I will investigate.
>>
>> We have to prevent symbols from entering general_operand predicated
>> SImode operands. Fortunatelly, x86_64_general_operand works OK for
>> x32, while both for i686 and x86_64 are unaffected due to early bypass
>> (i686) and due to the fact that all symbols are DImode (x86_64).
>>
>> 2011-07-25  Uros Bizjak  <ubizjak@gmail.com>
>>            H.J. Lu  <hongjiu.lu@intel.com>
>>
>>        PR target/47381
>>        PR target/49832
>>        PR target/49833
>>        * config/i386/i386.md (i): Change SImode attribute to "e".
>>        (g): Change SImode attribute to "rme".
>>        (di): Change SImode attribute to "nF".
>>        (general_operand): Change SImode attribute to x86_64_general_operand.
>>        (general_szext_operand): Change SImode attribute to
>>        x86_64_szext_general_operand.
>>        (immediate_operand): Change SImode attribute to
>>        x86_64_immediate_operand-
>>        (*movdi_internal_rex64): Remove mode from pic_32bit_operand check.
>>        (*movsi_internal): Ditto.  Use "e" constraint in alternative 2.
>>        (*lea_1): Use SWI48 mode iterator.
>>        (*lea_1_zext): New insn pattern.
>>        (*add<mode>1): Use x86_64_general_operand predicate for operand 2.
>>        Update operand constraints.
>>        (addsi_1_zext): Ditto.
>>        (*add<mode>2): Ditto.
>>        (*addsi_3_zext): Ditto.
>>        (*subsi_1_zext): Ditto.
>>        (*subsi_2_zext): Ditto.
>>        (*subsi_3_zext): Ditto.
>>        (*addsi3_carry_zext): Ditto.
>>        (*<plusminus_insn>si3_zext_cc_overflow): Ditto.
>>        (*mulsi3_1_zext): Ditto.
>>        (*andsi_1): Ditto.
>>        (*andsi_1_zext): Ditto.
>>        (*andsi_2_zext): Ditto.
>>        (*<any_or:code>si_1_zext): Ditto.
>>        (*<any_or:code>si_2_zext): Ditto.
>>        (*test<mode>_1): Use <general_operand> predicate for operand 1.
>>        (*and<mode>_2): Ditto.
>>        (add->lea splitter): Check operand modes in insn constraint.  Extend
>>        operands less than SImode wide to SImode.
>>        (add->lea zext splitter): Do not extend input operands to DImode.
>>        (*lea_general_1): Handle only QImode and HImode operands.
>>        (*lea_general_2): Ditto.
>>        (*lea_general_3): Ditto.
>>        (*lea_general_1_zext): Remove.
>>        (*lea_general_2_zext): Ditto.
>>        (*lea_general_3_zext): Ditto.
>>        (*lea_general_4): Check operand modes in insn constraint.  Extend
>>        operands less than SImode wide to SImode.
>>        (ashift->lea splitter): Ditto.
>>        * config/i386/i386.c (ix86_print_operand_address): Print address
>>        registers with 'q' modifier on 64bit targets.
>>        * config/i386/predicates.md (pic_32bit_opreand): Define as special
>>        predicate.  Reject non-SI and non-DI modes.
>>
>> Bootstrapped and regression ested on x86_64-pc-linux-gnu {,-m32}.

> GCC and glibc testsuites are clean on x32.  Can you check it in?

I will do this tomorrow, if anybody has some comment on the patch.

Thanks,
Uros.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH, i386, take 2]: Rewrite LEA handling (was:Re: PATCH [10/n] X32: Support x32 LEA insns)
  2011-07-25 21:52           ` Uros Bizjak
@ 2011-07-26 13:26             ` Uros Bizjak
  0 siblings, 0 replies; 10+ messages in thread
From: Uros Bizjak @ 2011-07-26 13:26 UTC (permalink / raw)
  To: H.J. Lu; +Cc: gcc-patches, Richard Henderson, jh

[-- Attachment #1: Type: text/plain, Size: 4440 bytes --]

On Mon, Jul 25, 2011 at 11:09 PM, Uros Bizjak <ubizjak@gmail.com> wrote:

>>>>> Attached patch implements -fpic handling for x32. In x32 mode, we now
>>>>> use x86_64_general_operand and corresponding "e" constraints for adds
>>>>> in SImode, since it looks that invalid addresses can only be generated
>>>>> through adds. This avoids a whole bunch of new predicates and
>>>>> constraints.
>>>
>>>> X32 glibc is miscompiled:
>>>>
>>>> CPP='/export/build/gnu/gcc-x32/release/usr/gcc-4.7.0-x32/bin/gcc -mx32
>>>>  -E -x c-header'
>>>> /export/build/gnu/glibc-x32/build-x86_64-linux/elf/ld-linux-x32.so.2
>>>> --library-path /export/build/gnu/glibc-x32/build-x86_64-linux:/export/build/gnu/glibc-x32/build-x86_64-linux/math:/export/build/gnu/glibc-x32/build-x86_64-linux/elf:/export/build/gnu/glibc-x32/build-x86_64-linux/dlfcn:/export/build/gnu/glibc-x32/build-x86_64-linux/nss:/export/build/gnu/glibc-x32/build-x86_64-linux/nis:/export/build/gnu/glibc-x32/build-x86_64-linux/rt:/export/build/gnu/glibc-x32/build-x86_64-linux/resolv:/export/build/gnu/glibc-x32/build-x86_64-linux/crypt:/export/build/gnu/glibc-x32/build-x86_64-linux/nptl
>>>> /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcgen -Y
>>>> ../scripts -h rpcsvc/yppasswd.x -o
>>>> /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcsvc/yppasswd.T
>>>> make[5]: *** [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcsvc/yppasswd.stmp]
>>>> Segmentation fault (core dumped)
>>>>
>>>> Some LEA patterns are wrong for x32.  I will investigate.
>>>
>>> We have to prevent symbols from entering general_operand predicated
>>> SImode operands. Fortunatelly, x86_64_general_operand works OK for
>>> x32, while both for i686 and x86_64 are unaffected due to early bypass
>>> (i686) and due to the fact that all symbols are DImode (x86_64).
>
>> GCC and glibc testsuites are clean on x32.  Can you check it in?
>
> I will do this tomorrow, if anybody has some comment on the patch.

Committed slightly updated version:

2011-07-26  Uros Bizjak  <ubizjak@gmail.com>
	    H.J. Lu  <hongjiu.lu@intel.com>

	PR target/47381
	PR target/49832
	PR target/49833
	* config/i386/i386.md (i): Change SImode attribute to "e".
	(g): Change SImode attribute to "rme".
	(di): Change SImode attribute to "nF".
	(general_operand): Change SImode attribute to x86_64_general_operand.
	(general_szext_operand): Change SImode attribute to
	x86_64_szext_general_operand.
	(immediate_operand): Change SImode attribute to
	x86_64_immediate_operand.
	(nonmemory_operand): Change SImode attribute to
	x86_64_nonmemory_operand.
	(*movdi_internal_rex64): Remove mode from pic_32bit_operand check.
	(*movsi_internal): Ditto.  Use "e" constraint in alternative 2.
	(*lea_1): Use SWI48 mode iterator.
	(*lea_1_zext): New insn pattern.
	(testsi_ccno_1): Use x86_64_nonmemory_operand predicate for operand 2.
	(*bt<mode>): Ditto.
	(*add<mode>1): Use x86_64_general_operand predicate for operand 2.
	Update operand constraints.
	(addsi_1_zext): Ditto.
	(*add<mode>2): Ditto.
	(*addsi_3_zext): Ditto.
	(*subsi_1_zext): Ditto.
	(*subsi_2_zext): Ditto.
	(*subsi_3_zext): Ditto.
	(*addsi3_carry_zext): Ditto.
	(*<plusminus_insn>si3_zext_cc_overflow): Ditto.
	(*mulsi3_1_zext): Ditto.
	(*andsi_1): Ditto.
	(*andsi_1_zext): Ditto.
	(*andsi_2_zext): Ditto.
	(*<any_or:code>si_1_zext): Ditto.
	(*<any_or:code>si_2_zext): Ditto.
	(*test<mode>_1): Use <general_operand> predicate for operand 1.
	(*and<mode>_2): Ditto.
	(mov<mode>cc): Use  <general_operand> predicate for operands 1 and 2.
	(add->lea splitter): Check operand modes in insn constraint.  Extend
	operands less than SImode wide to SImode.
	(add->lea zext splitter): Do not extend input operands to DImode.
	(*lea_general_1): Handle only QImode and HImode operands.
	(*lea_general_2): Ditto.
	(*lea_general_3): Ditto.
	(*lea_general_1_zext): Remove.
	(*lea_general_2_zext): Ditto.
	(*lea_general_3_zext): Ditto.
	(*lea_general_4): Check operand modes in insn constraint.  Extend
	operands less than SImode wide to SImode.
	(ashift->lea splitter): Ditto.
	* config/i386/i386.c (ix86_print_operand_address): Print address
	registers with 'q' modifier on 64bit targets.
	* config/i386/predicates.md (pic_32bit_opreand): Define as special
	predicate.  Reject non-SI and non-DI modes.

Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline SVN.

Uros.

[-- Attachment #2: p.diff.txt --]
[-- Type: text/plain, Size: 27666 bytes --]

Index: i386.md
===================================================================
--- i386.md	(revision 176782)
+++ i386.md	(working copy)
@@ -861,13 +861,13 @@
 (define_mode_attr r [(QI "q") (HI "r") (SI "r") (DI "r")])
 
 ;; Immediate operand constraint for integer modes.
-(define_mode_attr i [(QI "n") (HI "n") (SI "i") (DI "e")])
+(define_mode_attr i [(QI "n") (HI "n") (SI "e") (DI "e")])
 
 ;; General operand constraint for word modes.
-(define_mode_attr g [(QI "qmn") (HI "rmn") (SI "g") (DI "rme")])
+(define_mode_attr g [(QI "qmn") (HI "rmn") (SI "rme") (DI "rme")])
 
 ;; Immediate operand constraint for double integer modes.
-(define_mode_attr di [(SI "iF") (DI "e")])
+(define_mode_attr di [(SI "nF") (DI "e")])
 
 ;; Immediate operand constraint for shifts.
 (define_mode_attr S [(QI "I") (HI "I") (SI "I") (DI "J") (TI "O")])
@@ -876,7 +876,7 @@
 (define_mode_attr general_operand
 	[(QI "general_operand")
 	 (HI "general_operand")
-	 (SI "general_operand")
+	 (SI "x86_64_general_operand")
 	 (DI "x86_64_general_operand")
 	 (TI "x86_64_general_operand")])
 
@@ -884,21 +884,21 @@
 (define_mode_attr general_szext_operand
 	[(QI "general_operand")
 	 (HI "general_operand")
-	 (SI "general_operand")
+	 (SI "x86_64_szext_general_operand")
 	 (DI "x86_64_szext_general_operand")])
 
 ;; Immediate operand predicate for integer modes.
 (define_mode_attr immediate_operand
 	[(QI "immediate_operand")
 	 (HI "immediate_operand")
-	 (SI "immediate_operand")
+	 (SI "x86_64_immediate_operand")
 	 (DI "x86_64_immediate_operand")])
 
 ;; Nonmemory operand predicate for integer modes.
 (define_mode_attr nonmemory_operand
 	[(QI "nonmemory_operand")
 	 (HI "nonmemory_operand")
-	 (SI "nonmemory_operand")
+	 (SI "x86_64_nonmemory_operand")
 	 (DI "x86_64_nonmemory_operand")])
 
 ;; Operand predicate for shifts.
@@ -2039,7 +2039,7 @@
 	      (const_string "ssemov")
 	    (eq_attr "alternative" "16,17")
 	      (const_string "ssecvt")
- 	    (match_operand:DI 1 "pic_32bit_operand" "")
+ 	    (match_operand 1 "pic_32bit_operand" "")
 	      (const_string "lea")
 	   ]
 	   (const_string "imov")))
@@ -2184,7 +2184,7 @@
   [(set (match_operand:SI 0 "nonimmediate_operand"
 			"=r,m ,*y,*y,?rm,?*y,*x,*x,?r ,m ,?*Yi,*x")
 	(match_operand:SI 1 "general_operand"
-			"g ,ri,C ,*y,*y ,rm ,C ,*x,*Yi,*x,r   ,m "))]
+			"g ,re,C ,*y,*y ,rm ,C ,*x,*Yi,*x,r   ,m "))]
   "!(MEM_P (operands[0]) && MEM_P (operands[1]))"
 {
   switch (get_attr_type (insn))
@@ -2232,7 +2232,7 @@
 	      (const_string "sselog1")
 	    (eq_attr "alternative" "7,8,9,10,11")
 	      (const_string "ssemov")
- 	    (match_operand:DI 1 "pic_32bit_operand" "")
+ 	    (match_operand 1 "pic_32bit_operand" "")
 	      (const_string "lea")
 	   ]
 	   (const_string "imov")))
@@ -5425,13 +5425,22 @@
    (set_attr "mode" "QI")])
 
 (define_insn "*lea_1"
-  [(set (match_operand:P 0 "register_operand" "=r")
-	(match_operand:P 1 "no_seg_address_operand" "p"))]
+  [(set (match_operand:SWI48 0 "register_operand" "=r")
+	(match_operand:SWI48 1 "no_seg_address_operand" "p"))]
   ""
   "lea{<imodesuffix>}\t{%a1, %0|%0, %a1}"
   [(set_attr "type" "lea")
    (set_attr "mode" "<MODE>")])
 
+(define_insn "*lea_1_zext"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(zero_extend:DI
+	  (match_operand:SI 1 "no_seg_address_operand" "p")))]
+  "TARGET_64BIT"
+  "lea{l}\t{%a1, %k0|%k0, %a1}"
+  [(set_attr "type" "lea")
+   (set_attr "mode" "SI")])
+
 (define_insn "*lea_2"
   [(set (match_operand:SI 0 "register_operand" "=r")
 	(subreg:SI (match_operand:DI 1 "no_seg_address_operand" "p") 0))]
@@ -5453,7 +5462,7 @@
   [(set (match_operand:SWI48 0 "nonimmediate_operand" "=r,rm,r,r")
 	(plus:SWI48
 	  (match_operand:SWI48 1 "nonimmediate_operand" "%0,0,r,r")
-	  (match_operand:SWI48 2 "<general_operand>" "<g>,r<i>,0,l<i>")))
+	  (match_operand:SWI48 2 "x86_64_general_operand" "rme,re,0,le")))
    (clobber (reg:CC FLAGS_REG))]
   "ix86_binary_operator_ok (PLUS, <MODE>mode, operands)"
 {
@@ -5512,7 +5521,7 @@
   [(set (match_operand:DI 0 "register_operand" "=r,r,r")
 	(zero_extend:DI
 	  (plus:SI (match_operand:SI 1 "nonimmediate_operand" "%0,r,r")
-		   (match_operand:SI 2 "general_operand" "g,0,li"))))
+		   (match_operand:SI 2 "x86_64_general_operand" "rme,0,le"))))
    (clobber (reg:CC FLAGS_REG))]
   "TARGET_64BIT && ix86_binary_operator_ok (PLUS, SImode, operands)"
 {
@@ -5794,39 +5803,36 @@
         (const_string "none")))
    (set_attr "mode" "QI")])
 
-;; Convert lea to the lea pattern to avoid flags dependency.
+;; Convert add to the lea pattern to avoid flags dependency.
 (define_split
   [(set (match_operand 0 "register_operand" "")
 	(plus (match_operand 1 "register_operand" "")
               (match_operand 2 "nonmemory_operand" "")))
    (clobber (reg:CC FLAGS_REG))]
-  "reload_completed && ix86_lea_for_add_ok (insn, operands)" 
+  "GET_MODE (operands[0]) == GET_MODE (operands[1])
+   && (GET_MODE (operands[0]) == GET_MODE (operands[2])
+       || GET_MODE (operands[2]) == VOIDmode)
+   && reload_completed && ix86_lea_for_add_ok (insn, operands)" 
   [(const_int 0)]
 {
-  rtx pat;
   enum machine_mode mode = GET_MODE (operands[0]);
-
-  /* In -fPIC mode the constructs like (const (unspec [symbol_ref]))
-     may confuse gen_lowpart.  */
-  if (mode != Pmode)
-    {
-      operands[1] = gen_lowpart (Pmode, operands[1]);
-      operands[2] = gen_lowpart (Pmode, operands[2]);
-    }
-
-  pat = gen_rtx_PLUS (Pmode, operands[1], operands[2]);
+  rtx pat;
 
   if (GET_MODE_SIZE (mode) < GET_MODE_SIZE (SImode))
-    operands[0] = gen_lowpart (SImode, operands[0]);
+    { 
+      mode = SImode; 
+      operands[0] = gen_lowpart (mode, operands[0]);
+      operands[1] = gen_lowpart (mode, operands[1]);
+      operands[2] = gen_lowpart (mode, operands[2]);
+    }
 
-  if (TARGET_64BIT && mode != Pmode)
-    pat = gen_rtx_SUBREG (SImode, pat, 0);
+  pat = gen_rtx_PLUS (mode, operands[1], operands[2]);
 
   emit_insn (gen_rtx_SET (VOIDmode, operands[0], pat));
   DONE;
 })
 
-;; Convert lea to the lea pattern to avoid flags dependency.
+;; Convert add to the lea pattern to avoid flags dependency.
 ;; ??? This pattern handles immediate operands that do not satisfy immediate
 ;; operand predicate (TARGET_LEGITIMATE_CONSTANT_P) in the previous pattern.
 (define_split
@@ -5839,7 +5845,7 @@
   [(set (match_dup 0)
 	(plus:DI (match_dup 1) (match_dup 2)))])
 
-;; Convert lea to the lea pattern to avoid flags dependency.
+;; Convert add to the lea pattern to avoid flags dependency.
 (define_split
   [(set (match_operand:DI 0 "register_operand" "")
 	(zero_extend:DI
@@ -5849,11 +5855,7 @@
   "TARGET_64BIT && reload_completed
    && ix86_lea_for_add_ok (insn, operands)"
   [(set (match_dup 0)
-	(zero_extend:DI (subreg:SI (plus:DI (match_dup 1) (match_dup 2)) 0)))]
-{
-  operands[1] = gen_lowpart (DImode, operands[1]);
-  operands[2] = gen_lowpart (DImode, operands[2]);
-})
+	(zero_extend:DI (plus:SI (match_dup 1) (match_dup 2))))])
 
 (define_insn "*add<mode>_2"
   [(set (reg FLAGS_REG)
@@ -5901,7 +5903,7 @@
   [(set (reg FLAGS_REG)
 	(compare
 	  (plus:SI (match_operand:SI 1 "nonimmediate_operand" "%0")
-		   (match_operand:SI 2 "general_operand" "g"))
+		   (match_operand:SI 2 "x86_64_general_operand" "rme"))
 	  (const_int 0)))
    (set (match_operand:DI 0 "register_operand" "=r")
 	(zero_extend:DI (plus:SI (match_dup 1) (match_dup 2))))]
@@ -5979,7 +5981,7 @@
 (define_insn "*addsi_3_zext"
   [(set (reg FLAGS_REG)
 	(compare
-	  (neg:SI (match_operand:SI 2 "general_operand" "g"))
+	  (neg:SI (match_operand:SI 2 "x86_64_general_operand" "rme"))
 	  (match_operand:SI 1 "nonimmediate_operand" "%0")))
    (set (match_operand:DI 0 "register_operand" "=r")
 	(zero_extend:DI (plus:SI (match_dup 1) (match_dup 2))))]
@@ -6233,7 +6235,7 @@
   [(set_attr "type" "alu")
    (set_attr "mode" "QI")])
 
-;; The lea patterns for non-Pmodes needs to be matched by
+;; The lea patterns for modes less than 32 bits need to be matched by
 ;; several insns converted to real lea by splitters.
 
 (define_insn_and_split "*lea_general_1"
@@ -6241,8 +6243,7 @@
 	(plus (plus (match_operand 1 "index_register_operand" "l")
 		    (match_operand 2 "register_operand" "r"))
 	      (match_operand 3 "immediate_operand" "i")))]
-  "(GET_MODE (operands[0]) == QImode || GET_MODE (operands[0]) == HImode
-    || (TARGET_64BIT && GET_MODE (operands[0]) == SImode))
+  "(GET_MODE (operands[0]) == QImode || GET_MODE (operands[0]) == HImode)
    && (!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun))
    && GET_MODE (operands[0]) == GET_MODE (operands[1])
    && GET_MODE (operands[0]) == GET_MODE (operands[2])
@@ -6252,53 +6253,30 @@
   "&& reload_completed"
   [(const_int 0)]
 {
+  enum machine_mode mode = SImode;
   rtx pat;
-  operands[0] = gen_lowpart (SImode, operands[0]);
-  operands[1] = gen_lowpart (Pmode, operands[1]);
-  operands[2] = gen_lowpart (Pmode, operands[2]);
-  operands[3] = gen_lowpart (Pmode, operands[3]);
-  pat = gen_rtx_PLUS (Pmode, gen_rtx_PLUS (Pmode, operands[1], operands[2]),
+
+  operands[0] = gen_lowpart (mode, operands[0]);
+  operands[1] = gen_lowpart (mode, operands[1]);
+  operands[2] = gen_lowpart (mode, operands[2]);
+  operands[3] = gen_lowpart (mode, operands[3]);
+
+  pat = gen_rtx_PLUS (mode, gen_rtx_PLUS (mode, operands[1], operands[2]),
   		      operands[3]);
-  if (Pmode != SImode)
-    pat = gen_rtx_SUBREG (SImode, pat, 0);
+
   emit_insn (gen_rtx_SET (VOIDmode, operands[0], pat));
   DONE;
 }
   [(set_attr "type" "lea")
    (set_attr "mode" "SI")])
 
-(define_insn_and_split "*lea_general_1_zext"
-  [(set (match_operand:DI 0 "register_operand" "=r")
-	(zero_extend:DI
-	  (plus:SI (plus:SI
-		     (match_operand:SI 1 "index_register_operand" "l")
-		     (match_operand:SI 2 "register_operand" "r"))
-		   (match_operand:SI 3 "immediate_operand" "i"))))]
-  "TARGET_64BIT"
-  "#"
-  "&& reload_completed"
-  [(set (match_dup 0)
-	(zero_extend:DI (subreg:SI (plus:DI (plus:DI (match_dup 1)
-						     (match_dup 2))
-					    (match_dup 3)) 0)))]
-{
-  operands[1] = gen_lowpart (Pmode, operands[1]);
-  operands[2] = gen_lowpart (Pmode, operands[2]);
-  operands[3] = gen_lowpart (Pmode, operands[3]);
-}
-  [(set_attr "type" "lea")
-   (set_attr "mode" "SI")])
-
 (define_insn_and_split "*lea_general_2"
   [(set (match_operand 0 "register_operand" "=r")
 	(plus (mult (match_operand 1 "index_register_operand" "l")
-		    (match_operand 2 "const248_operand" "i"))
+		    (match_operand 2 "const248_operand" "n"))
 	      (match_operand 3 "nonmemory_operand" "ri")))]
-  "(GET_MODE (operands[0]) == QImode || GET_MODE (operands[0]) == HImode
-    || (TARGET_64BIT && GET_MODE (operands[0]) == SImode))
-   && (!TARGET_PARTIAL_REG_STALL
-       || GET_MODE (operands[0]) == SImode
-       || optimize_function_for_size_p (cfun))
+  "(GET_MODE (operands[0]) == QImode || GET_MODE (operands[0]) == HImode)
+   && (!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun))
    && GET_MODE (operands[0]) == GET_MODE (operands[1])
    && (GET_MODE (operands[0]) == GET_MODE (operands[3])
        || GET_MODE (operands[3]) == VOIDmode)"
@@ -6306,110 +6284,68 @@
   "&& reload_completed"
   [(const_int 0)]
 {
+  enum machine_mode mode = SImode;
   rtx pat;
-  operands[0] = gen_lowpart (SImode, operands[0]);
-  operands[1] = gen_lowpart (Pmode, operands[1]);
-  operands[3] = gen_lowpart (Pmode, operands[3]);
-  pat = gen_rtx_PLUS (Pmode, gen_rtx_MULT (Pmode, operands[1], operands[2]),
-  		      operands[3]);
-  if (Pmode != SImode)
-    pat = gen_rtx_SUBREG (SImode, pat, 0);
+
+  operands[0] = gen_lowpart (mode, operands[0]);
+  operands[1] = gen_lowpart (mode, operands[1]);
+  operands[3] = gen_lowpart (mode, operands[3]);
+
+  pat = gen_rtx_PLUS (mode, gen_rtx_MULT (mode, operands[1], operands[2]),
+		      operands[3]);
+
   emit_insn (gen_rtx_SET (VOIDmode, operands[0], pat));
   DONE;
 }
   [(set_attr "type" "lea")
    (set_attr "mode" "SI")])
 
-(define_insn_and_split "*lea_general_2_zext"
-  [(set (match_operand:DI 0 "register_operand" "=r")
-	(zero_extend:DI
-	  (plus:SI (mult:SI
-		     (match_operand:SI 1 "index_register_operand" "l")
-		     (match_operand:SI 2 "const248_operand" "n"))
-		   (match_operand:SI 3 "nonmemory_operand" "ri"))))]
-  "TARGET_64BIT"
-  "#"
-  "&& reload_completed"
-  [(set (match_dup 0)
-	(zero_extend:DI (subreg:SI (plus:DI (mult:DI (match_dup 1)
-						     (match_dup 2))
-					    (match_dup 3)) 0)))]
-{
-  operands[1] = gen_lowpart (Pmode, operands[1]);
-  operands[3] = gen_lowpart (Pmode, operands[3]);
-}
-  [(set_attr "type" "lea")
-   (set_attr "mode" "SI")])
-
 (define_insn_and_split "*lea_general_3"
   [(set (match_operand 0 "register_operand" "=r")
 	(plus (plus (mult (match_operand 1 "index_register_operand" "l")
-			  (match_operand 2 "const248_operand" "i"))
+			  (match_operand 2 "const248_operand" "n"))
 		    (match_operand 3 "register_operand" "r"))
 	      (match_operand 4 "immediate_operand" "i")))]
-  "(GET_MODE (operands[0]) == QImode || GET_MODE (operands[0]) == HImode
-    || (TARGET_64BIT && GET_MODE (operands[0]) == SImode))
-   && (!TARGET_PARTIAL_REG_STALL
-       || GET_MODE (operands[0]) == SImode
-       || optimize_function_for_size_p (cfun))
+  "(GET_MODE (operands[0]) == QImode || GET_MODE (operands[0]) == HImode)
+   && (!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun))
    && GET_MODE (operands[0]) == GET_MODE (operands[1])
    && GET_MODE (operands[0]) == GET_MODE (operands[3])"
   "#"
   "&& reload_completed"
   [(const_int 0)]
 {
+  enum machine_mode mode = SImode;
   rtx pat;
-  operands[0] = gen_lowpart (SImode, operands[0]);
-  operands[1] = gen_lowpart (Pmode, operands[1]);
-  operands[3] = gen_lowpart (Pmode, operands[3]);
-  operands[4] = gen_lowpart (Pmode, operands[4]);
-  pat = gen_rtx_PLUS (Pmode,
-  		      gen_rtx_PLUS (Pmode, gen_rtx_MULT (Pmode, operands[1],
-		      					 operands[2]),
+
+  operands[0] = gen_lowpart (mode, operands[0]);
+  operands[1] = gen_lowpart (mode, operands[1]);
+  operands[3] = gen_lowpart (mode, operands[3]);
+  operands[4] = gen_lowpart (mode, operands[4]);
+
+  pat = gen_rtx_PLUS (mode,
+  		      gen_rtx_PLUS (mode,
+				    gen_rtx_MULT (mode, operands[1],
+		      					operands[2]),
 				    operands[3]),
   		      operands[4]);
-  if (Pmode != SImode)
-    pat = gen_rtx_SUBREG (SImode, pat, 0);
+
   emit_insn (gen_rtx_SET (VOIDmode, operands[0], pat));
   DONE;
 }
   [(set_attr "type" "lea")
    (set_attr "mode" "SI")])
 
-(define_insn_and_split "*lea_general_3_zext"
-  [(set (match_operand:DI 0 "register_operand" "=r")
-	(zero_extend:DI
-	  (plus:SI (plus:SI
-		     (mult:SI
-		       (match_operand:SI 1 "index_register_operand" "l")
-		       (match_operand:SI 2 "const248_operand" "n"))
-		     (match_operand:SI 3 "register_operand" "r"))
-		   (match_operand:SI 4 "immediate_operand" "i"))))]
-  "TARGET_64BIT"
-  "#"
-  "&& reload_completed"
-  [(set (match_dup 0)
-	(zero_extend:DI (subreg:SI (plus:DI (plus:DI (mult:DI (match_dup 1)
-							      (match_dup 2))
-						     (match_dup 3))
-					    (match_dup 4)) 0)))]
-{
-  operands[1] = gen_lowpart (Pmode, operands[1]);
-  operands[3] = gen_lowpart (Pmode, operands[3]);
-  operands[4] = gen_lowpart (Pmode, operands[4]);
-}
-  [(set_attr "type" "lea")
-   (set_attr "mode" "SI")])
-
 (define_insn_and_split "*lea_general_4"
-  [(set (match_operand:SWI 0 "register_operand" "=r")
-	(any_or:SWI (ashift:SWI (match_operand:SWI 1 "index_register_operand" "l")
-				(match_operand:SWI 2 "const_int_operand" "n"))
-		    (match_operand 3 "const_int_operand" "n")))]
-  "(<MODE>mode == DImode
-    || <MODE>mode == SImode
-    || !TARGET_PARTIAL_REG_STALL
-    || optimize_function_for_size_p (cfun))
+  [(set (match_operand 0 "register_operand" "=r")
+	(any_or (ashift
+		  (match_operand 1 "index_register_operand" "l")
+		  (match_operand 2 "const_int_operand" "n"))
+		(match_operand 3 "const_int_operand" "n")))]
+  "(((GET_MODE (operands[0]) == QImode || GET_MODE (operands[0]) == HImode)
+      && (!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun)))
+    || GET_MODE (operands[0]) == SImode
+    || (TARGET_64BIT && GET_MODE (operands[0]) == DImode))
+   && GET_MODE (operands[0]) == GET_MODE (operands[1])
    && ((unsigned HOST_WIDE_INT) INTVAL (operands[2])) - 1 < 3
    && ((unsigned HOST_WIDE_INT) INTVAL (operands[3])
        < ((unsigned HOST_WIDE_INT) 1 << INTVAL (operands[2])))"
@@ -6417,23 +6353,29 @@
   "&& reload_completed"
   [(const_int 0)]
 {
+  enum machine_mode mode = GET_MODE (operands[0]);
   rtx pat;
-  if (<MODE>mode != DImode)
-    operands[0] = gen_lowpart (SImode, operands[0]);
-  operands[1] = gen_lowpart (Pmode, operands[1]);
+
+  if (GET_MODE_SIZE (mode) < GET_MODE_SIZE (SImode))
+    { 
+      mode = SImode; 
+      operands[0] = gen_lowpart (mode, operands[0]);
+      operands[1] = gen_lowpart (mode, operands[1]);
+    }
+
   operands[2] = GEN_INT (1 << INTVAL (operands[2]));
-  pat = plus_constant (gen_rtx_MULT (Pmode, operands[1], operands[2]),
+
+  pat = plus_constant (gen_rtx_MULT (mode, operands[1], operands[2]),
 		       INTVAL (operands[3]));
-  if (Pmode != SImode && <MODE>mode != DImode)
-    pat = gen_rtx_SUBREG (SImode, pat, 0);
+
   emit_insn (gen_rtx_SET (VOIDmode, operands[0], pat));
   DONE;
 }
   [(set_attr "type" "lea")
    (set (attr "mode")
-      (if_then_else (eq (symbol_ref "<MODE>mode == DImode") (const_int 0))
-	(const_string "SI")
-	(const_string "DI")))])
+      (if_then_else (match_operand:DI 0 "" "")
+	(const_string "DI")
+	(const_string "SI")))])
 \f
 ;; Subtract instructions
 
@@ -6481,7 +6423,7 @@
   [(set (match_operand:DI 0 "register_operand" "=r")
 	(zero_extend:DI
 	  (minus:SI (match_operand:SI 1 "register_operand" "0")
-		    (match_operand:SI 2 "general_operand" "g"))))
+		    (match_operand:SI 2 "x86_64_general_operand" "rme"))))
    (clobber (reg:CC FLAGS_REG))]
   "TARGET_64BIT && ix86_binary_operator_ok (MINUS, SImode, operands)"
   "sub{l}\t{%2, %k0|%k0, %2}"
@@ -6518,7 +6460,7 @@
   [(set (reg FLAGS_REG)
 	(compare
 	  (minus:SI (match_operand:SI 1 "register_operand" "0")
-		    (match_operand:SI 2 "general_operand" "g"))
+		    (match_operand:SI 2 "x86_64_general_operand" "rme"))
 	  (const_int 0)))
    (set (match_operand:DI 0 "register_operand" "=r")
 	(zero_extend:DI
@@ -6545,7 +6487,7 @@
 (define_insn "*subsi_3_zext"
   [(set (reg FLAGS_REG)
 	(compare (match_operand:SI 1 "register_operand" "0")
-		 (match_operand:SI 2 "general_operand" "g")))
+		 (match_operand:SI 2 "x86_64_general_operand" "rme")))
    (set (match_operand:DI 0 "register_operand" "=r")
 	(zero_extend:DI
 	  (minus:SI (match_dup 1)
@@ -6592,7 +6534,7 @@
 	  (plus:SI (match_operand:SI 1 "nonimmediate_operand" "%0")
 		   (plus:SI (match_operator 3 "ix86_carry_flag_operator"
 			     [(reg FLAGS_REG) (const_int 0)])
-			    (match_operand:SI 2 "general_operand" "g")))))
+			    (match_operand:SI 2 "x86_64_general_operand" "rme")))))
    (clobber (reg:CC FLAGS_REG))]
   "TARGET_64BIT && ix86_binary_operator_ok (PLUS, SImode, operands)"
   "adc{l}\t{%2, %k0|%k0, %2}"
@@ -6607,7 +6549,7 @@
 	  (minus:SI (match_operand:SI 1 "register_operand" "0")
 		    (plus:SI (match_operator 3 "ix86_carry_flag_operator"
 			      [(reg FLAGS_REG) (const_int 0)])
-			     (match_operand:SI 2 "general_operand" "g")))))
+			     (match_operand:SI 2 "x86_64_general_operand" "rme")))))
    (clobber (reg:CC FLAGS_REG))]
   "TARGET_64BIT && ix86_binary_operator_ok (MINUS, SImode, operands)"
   "sbb{l}\t{%2, %k0|%k0, %2}"
@@ -6661,7 +6603,7 @@
 	(compare:CCC
 	  (plusminus:SI
 	    (match_operand:SI 1 "nonimmediate_operand" "<comm>0")
-	    (match_operand:SI 2 "general_operand" "g"))
+	    (match_operand:SI 2 "x86_64_general_operand" "rme"))
 	  (match_dup 1)))
    (set (match_operand:DI 0 "register_operand" "=r")
 	(zero_extend:DI (plusminus:SI (match_dup 1) (match_dup 2))))]
@@ -6748,7 +6690,7 @@
   [(set (match_operand:DI 0 "register_operand" "=r,r,r")
 	(zero_extend:DI
 	  (mult:SI (match_operand:SI 1 "nonimmediate_operand" "%rm,rm,0")
-		   (match_operand:SI 2 "general_operand" "K,i,mr"))))
+		   (match_operand:SI 2 "x86_64_general_operand" "K,e,mr"))))
    (clobber (reg:CC FLAGS_REG))]
   "TARGET_64BIT
    && !(MEM_P (operands[1]) && MEM_P (operands[2]))"
@@ -7374,7 +7316,7 @@
   [(set (reg:CCNO FLAGS_REG)
 	(compare:CCNO
 	  (and:SI (match_operand:SI 0 "nonimmediate_operand" "")
-		  (match_operand:SI 1 "nonmemory_operand" ""))
+		  (match_operand:SI 1 "x86_64_nonmemory_operand" ""))
 	  (const_int 0)))])
 
 (define_expand "testqi_ccz_1"
@@ -7440,7 +7382,7 @@
 	(compare
 	 (and:SWI124
 	  (match_operand:SWI124 0 "nonimmediate_operand" "%!*a,<r>,<r>m")
-	  (match_operand:SWI124 1 "general_operand" "<i>,<i>,<r><i>"))
+	  (match_operand:SWI124 1 "<general_operand>" "<i>,<i>,<r><i>"))
 	 (const_int 0)))]
   "ix86_match_ccmode (insn, CCNOmode)
    && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
@@ -7730,7 +7672,7 @@
 (define_insn "*andsi_1"
   [(set (match_operand:SI 0 "nonimmediate_operand" "=rm,r,r")
 	(and:SI (match_operand:SI 1 "nonimmediate_operand" "%0,0,qm")
-		(match_operand:SI 2 "general_operand" "ri,rm,L")))
+		(match_operand:SI 2 "x86_64_general_operand" "re,rm,L")))
    (clobber (reg:CC FLAGS_REG))]
   "ix86_binary_operator_ok (AND, SImode, operands)"
 {
@@ -7777,7 +7719,7 @@
   [(set (match_operand:DI 0 "register_operand" "=r")
 	(zero_extend:DI
 	  (and:SI (match_operand:SI 1 "nonimmediate_operand" "%0")
-		  (match_operand:SI 2 "general_operand" "g"))))
+		  (match_operand:SI 2 "x86_64_general_operand" "rme"))))
    (clobber (reg:CC FLAGS_REG))]
   "TARGET_64BIT && ix86_binary_operator_ok (AND, SImode, operands)"
   "and{l}\t{%2, %k0|%k0, %2}"
@@ -7925,7 +7867,7 @@
   [(set (reg FLAGS_REG)
 	(compare (and:SWI124
 		  (match_operand:SWI124 1 "nonimmediate_operand" "%0,0")
-		  (match_operand:SWI124 2 "general_operand" "<g>,<r><i>"))
+		  (match_operand:SWI124 2 "<general_operand>" "<g>,<r><i>"))
 		 (const_int 0)))
    (set (match_operand:SWI124 0 "nonimmediate_operand" "=<r>,<r>m")
 	(and:SWI124 (match_dup 1) (match_dup 2)))]
@@ -7940,7 +7882,7 @@
   [(set (reg FLAGS_REG)
 	(compare (and:SI
 		  (match_operand:SI 1 "nonimmediate_operand" "%0")
-		  (match_operand:SI 2 "general_operand" "g"))
+		  (match_operand:SI 2 "x86_64_general_operand" "rme"))
 		 (const_int 0)))
    (set (match_operand:DI 0 "register_operand" "=r")
 	(zero_extend:DI (and:SI (match_dup 1) (match_dup 2))))]
@@ -8157,7 +8099,7 @@
   [(set (match_operand:DI 0 "register_operand" "=r")
 	(zero_extend:DI
 	 (any_or:SI (match_operand:SI 1 "nonimmediate_operand" "%0")
-		    (match_operand:SI 2 "general_operand" "g"))))
+		    (match_operand:SI 2 "x86_64_general_operand" "rme"))))
    (clobber (reg:CC FLAGS_REG))]
   "TARGET_64BIT && ix86_binary_operator_ok (<CODE>, SImode, operands)"
   "<logic>{l}\t{%2, %k0|%k0, %2}"
@@ -8205,7 +8147,7 @@
 (define_insn "*<code>si_2_zext"
   [(set (reg FLAGS_REG)
 	(compare (any_or:SI (match_operand:SI 1 "nonimmediate_operand" "%0")
-			    (match_operand:SI 2 "general_operand" "g"))
+			    (match_operand:SI 2 "x86_64_general_operand" "rme"))
 		 (const_int 0)))
    (set (match_operand:DI 0 "register_operand" "=r")
 	(zero_extend:DI (any_or:SI (match_dup 1) (match_dup 2))))]
@@ -9395,36 +9337,36 @@
        (const_string "*")))
    (set_attr "mode" "QI")])
 
-;; Convert lea to the lea pattern to avoid flags dependency.
+;; Convert ashift to the lea pattern to avoid flags dependency.
 (define_split
   [(set (match_operand 0 "register_operand" "")
 	(ashift (match_operand 1 "index_register_operand" "")
                 (match_operand:QI 2 "const_int_operand" "")))
    (clobber (reg:CC FLAGS_REG))]
-  "reload_completed
+  "GET_MODE (operands[0]) == GET_MODE (operands[1])
+   && reload_completed
    && true_regnum (operands[0]) != true_regnum (operands[1])"
   [(const_int 0)]
 {
-  rtx pat;
   enum machine_mode mode = GET_MODE (operands[0]);
-
-  if (mode != Pmode)
-    operands[1] = gen_lowpart (Pmode, operands[1]);
-  operands[2] = gen_int_mode (1 << INTVAL (operands[2]), Pmode);
-
-  pat = gen_rtx_MULT (Pmode, operands[1], operands[2]);
+  rtx pat;
 
   if (GET_MODE_SIZE (mode) < GET_MODE_SIZE (SImode))
-    operands[0] = gen_lowpart (SImode, operands[0]);
+    { 
+      mode = SImode; 
+      operands[0] = gen_lowpart (mode, operands[0]);
+      operands[1] = gen_lowpart (mode, operands[1]);
+    }
+
+  operands[2] = gen_int_mode (1 << INTVAL (operands[2]), mode);
 
-  if (TARGET_64BIT && mode != Pmode)
-    pat = gen_rtx_SUBREG (SImode, pat, 0);
+  pat = gen_rtx_MULT (mode, operands[1], operands[2]);
 
   emit_insn (gen_rtx_SET (VOIDmode, operands[0], pat));
   DONE;
 })
 
-;; Convert lea to the lea pattern to avoid flags dependency.
+;; Convert ashift to the lea pattern to avoid flags dependency.
 (define_split
   [(set (match_operand:DI 0 "register_operand" "")
 	(zero_extend:DI
@@ -10364,7 +10306,7 @@
 	  (zero_extract:SWI48
 	    (match_operand:SWI48 0 "register_operand" "r")
 	    (const_int 1)
-	    (match_operand:SWI48 1 "nonmemory_operand" "rN"))
+	    (match_operand:SWI48 1 "x86_64_nonmemory_operand" "rN"))
 	  (const_int 0)))]
   "TARGET_USE_BT || optimize_function_for_size_p (cfun)"
   "bt{<imodesuffix>}\t{%1, %0|%0, %1}"
@@ -16021,8 +15963,8 @@
 (define_expand "mov<mode>cc"
   [(set (match_operand:SWIM 0 "register_operand" "")
 	(if_then_else:SWIM (match_operand 1 "ordered_comparison_operator" "")
-			   (match_operand:SWIM 2 "general_operand" "")
-			   (match_operand:SWIM 3 "general_operand" "")))]
+			   (match_operand:SWIM 2 "<general_operand>" "")
+			   (match_operand:SWIM 3 "<general_operand>" "")))]
   ""
   "if (ix86_expand_int_movcc (operands)) DONE; else FAIL;")
 
Index: predicates.md
===================================================================
--- predicates.md	(revision 176782)
+++ predicates.md	(working copy)
@@ -366,9 +366,13 @@
 
 ;; Return true when operand is PIC expression that can be computed by lea
 ;; operation.
-(define_predicate "pic_32bit_operand"
+(define_special_predicate "pic_32bit_operand"
   (match_code "const,symbol_ref,label_ref")
 {
+  if (GET_MODE (op) != SImode
+      && GET_MODE (op) != DImode)
+    return false;
+
   if (!flag_pic)
     return false;
   /* Rule out relocations that translate into 64bit constants.  */
Index: i386.c
===================================================================
--- i386.c	(revision 176782)
+++ i386.c	(working copy)
@@ -14122,6 +14122,9 @@ ix86_print_operand_address (FILE *file, 
     }
   else
     {
+      /* Print DImode registers on 64bit targets to avoid addr32 prefixes.  */
+      int code = TARGET_64BIT ? 'q' : 0;
+
       if (ASSEMBLER_DIALECT == ASM_ATT)
 	{
 	  if (disp)
@@ -14136,11 +14139,11 @@ ix86_print_operand_address (FILE *file, 
 
 	  putc ('(', file);
 	  if (base)
-	    print_reg (base, 0, file);
+	    print_reg (base, code, file);
 	  if (index)
 	    {
 	      putc (',', file);
-	      print_reg (index, 0, file);
+	      print_reg (index, code, file);
 	      if (scale != 1)
 		fprintf (file, ",%d", scale);
 	    }
@@ -14175,7 +14178,7 @@ ix86_print_operand_address (FILE *file, 
 	  putc ('[', file);
 	  if (base)
 	    {
-	      print_reg (base, 0, file);
+	      print_reg (base, code, file);
 	      if (offset)
 		{
 		  if (INTVAL (offset) >= 0)
@@ -14191,7 +14194,7 @@ ix86_print_operand_address (FILE *file, 
 	  if (index)
 	    {
 	      putc ('+', file);
-	      print_reg (index, 0, file);
+	      print_reg (index, code, file);
 	      if (scale != 1)
 		fprintf (file, "*%d", scale);
 	    }

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2011-07-26 12:04 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-07-25  2:04 [PATCH, i386]: Rewrite LEA handling (was:Re: PATCH [10/n] X32: Support x32 LEA insns) Uros Bizjak
2011-07-25  3:23 ` H.J. Lu
2011-07-25 13:24   ` Uros Bizjak
2011-07-25 14:03     ` H.J. Lu
2011-07-25 14:08       ` Uros Bizjak
2011-07-25 14:20         ` H.J. Lu
2011-07-25 21:16       ` [PATCH, i386, take 2]: " Uros Bizjak
2011-07-25 21:47         ` H.J. Lu
2011-07-25 21:52           ` Uros Bizjak
2011-07-26 13:26             ` Uros Bizjak

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).