public inbox for gcc@gcc.gnu.org
* Various changes to i386.md
@ 1998-09-21 23:45 Jan Hubicka
From: Jan Hubicka @ 1998-09-21 23:45 UTC
  To: egcs

Hi,
I am sending this letter again because the server crashed just after I sent
it the first time and it seems to have been lost, since I still can't see it
on the mailing list. So please ignore it if you have seen it before.

Yesterday I spent some time playing with the i386.md file just to learn
how md files work. I've made some changes, and it is possible that
some of them are useful enough for inclusion. So I am sending a list of the
changes I've made and some questions about problems I've caused :). I am also
adding a patch. I would love it if someone would look at the list and tell me
which changes are good and which are complete nonsense.


 o I've added a new function unit, FPMUL, to tell gcc that FP multiply
   instructions on the Pentium are not completely pipelined and that it is a
   good idea to put another instruction between them. It seems to work and
   brings a 5-10% speedup in my Mandelbrot loop in XaoS.
 o The TEST instruction is pairable on the Pentium only with EAX as its
   operand (don't ask me why), so I've changed code generation to use CMP in
   other cases.
 o Changed the non-pairable NOT to XOR.
   The same should be done with NEG, but that is profitable only if the
   scheduler does a good job (a 1-cycle instruction expands to two pairable
   1-cycle instructions). Maybe we should split it and, if the scheduler
   doesn't put anything in between, use define_peephole or something similar
   to recombine it.
 o The FPDIV instruction takes 38 cycles on the Pentium. 2 of those cycles
   should overlap with other FP instructions and the rest with integer code,
   so I've described this to the scheduler.
 o I've started work on classifying instructions for the U and V pipelines.
   It is just an approximation for now. The patch also sets the "prefix"
   attribute if an instruction has the 32bit->16bit prefix. This should be
   useful for other processors too.
   I am not sure how to describe the behaviour of some patterns. For example,
   addhi3 has very strange behaviour, and I don't know whether it is possible
   to describe when a pairable instruction will be generated and when not.
 o I've made an attempt to specify the behaviour of the Pentium pipelines in
   greater detail, so gcc is now less optimistic about them and doesn't try to
   pair imuls, divs and other similar instructions. This reduces register
   lifetimes, so it should help a bit (0-20% in my tests).
   To describe it, I say that some instructions use multiple units
   (non-pairable instructions use both). It seems to work with HAIFA, but I am
   not sure whether it is correct.
 o To improve scheduling it is probably a good idea to split existing
   instruction patterns so that each generates just one instruction when
   possible. So I've started with the divmod instruction patterns, because
   they looked interesting. I changed them to use zero_extend/truncate
   operations. It seems to work, but in the following case it generates worse
   code:
    unsigned int u=rand(),b=rand(),c;
    asm(""::"d"(u));
    c=u%b;
   The rather funny output is:
   movl %ebx,%esi
   xorl %edi,%edi
   movl %esi,%eax
   movl %edi,%edx
   divl %ecx
   So I don't know whether it was a good idea. Probably not...
 o I've also looked at the zero_extendsidi pattern. It doesn't do anything
   special, so IMO it should be omitted to enable gcc's default version.

   There is a problem with life analysis. GCC adds a clobber before it, which
   generates an unnecessary conflict between source and target. Possibly GCC's
   default version should be modified to handle this in a better way.
   I've changed optabs to generate REG_NO_CONFLICT for the clobber and the
   final move.

   The problem is that the code for handling REG_NO_CONFLICT is disabled in
   global.c, because it can't catch partial conflicts. So until this gets
   fixed, this is probably not the way to go.
   I could try to change global.c to handle this case correctly, if no one
   is planning larger changes to global.c.
 o How to split the extendsi pattern? Is it possible to tell gcc to generate
   one instruction when the register is eax and to split it in other cases?
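
The TEST/CMP and NOT/XOR choices from the list above can be sketched in plain
C. This is only an illustration of the intended decisions, not code from the
patch: `enum cpu_model`, `use_test_insn` and `not_via_xor` are names invented
for this sketch, and register number 0 is assumed to be the accumulator.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the compiler's CPU selection (not GCC's enum). */
enum cpu_model { CPU_I386, CPU_I486, CPU_PENTIUM, CPU_PENTIUMPRO };

/* Return 1 when "test reg,reg" should be emitted and 0 when "cmp $0,reg"
   should be used instead.  Assumption: regno 0 is the accumulator.  */
static int use_test_insn(enum cpu_model cpu, int optimize_size, int regno)
{
    return cpu != CPU_PENTIUM || optimize_size || regno == 0;
}

/* One's complement written as XOR with an all-ones mask of the operand
   width.  Only the widths 8, 16 and 32 are handled in this sketch.  */
static uint32_t not_via_xor(uint32_t value, int width_bits)
{
    uint32_t mask;

    assert(width_bits == 8 || width_bits == 16 || width_bits == 32);
    mask = (width_bits == 32) ? 0xffffffffu : ((1u << width_bits) - 1u);
    return (value ^ mask) & mask;
}
```

On a Pentium without size optimization, only the accumulator keeps TEST; every
other register gets CMP with zero. The XOR form computes the same value as NOT
for each operand width, which is what makes the substitution safe.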

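The rewritten udivmodsi4 pattern describes DIV as a 64-bit dividend divided by
a zero-extended 32-bit divisor, with quotient and remainder truncated to 32
bits. A minimal C model of that arithmetic follows. It is a sketch under the
assumption that the quotient fits in 32 bits; `udivmod_model` and
`struct udivmod_result` are names made up for this example.

```c
#include <assert.h>
#include <stdint.h>

struct udivmod_result {
    uint32_t quotient;   /* corresponds to the result left in eax */
    uint32_t remainder;  /* corresponds to the result left in edx */
};

/* Model of truncate(udiv(dividend, zero_extend(divisor))) and the matching
   umod.  `high` and `low` are the two halves of the 64-bit dividend.  */
static struct udivmod_result udivmod_model(uint32_t high, uint32_t low,
                                           uint32_t divisor)
{
    uint64_t dividend = ((uint64_t)high << 32) | low;
    uint64_t wide_divisor = divisor;          /* the zero_extend step */
    struct udivmod_result result;

    assert(divisor != 0);
    assert(high < divisor);                   /* quotient fits in 32 bits */
    result.quotient = (uint32_t)(dividend / wide_divisor);
    result.remainder = (uint32_t)(dividend % wide_divisor);
    return result;
}
```

For an ordinary `c = u % b` the high half is zero, which is why the output
quoted in the list clears a register with xorl before the divide.
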
--- i386.md.old	Sat Sep 19 07:23:32 1998
+++ i386.md	Sun Sep 20 18:37:41 1998
@@ -75,6 +75,22 @@
   "integer,binary,memory,test,compare,fcompare,idiv,imul,lea,fld,fpop,fpdiv,fpmul"
   (const_string "integer"))
 
+;; True if the instruction has a 32bit to 16bit switching prefix.
+;; It is a _very_ rough approximation of the real situation, because many
+;; instruction patterns generate many different instructions, and I
+;; don't know how to write it more exactly. Someone should look at the HI mode
+;; patterns and improve this.
+(define_attr "prefix"
+  "true,false"
+  (const_string "false"))
+
+;; Pipelines used by the Pentium. FX is for floating point instructions that
+;; pair with fxch.
+;; It is just an approximation, for exactly the same purposes as "prefix".
+(define_attr "pipes"
+  "none,u,v,uv,fx"
+  (const_string "none"))
+
 (define_attr "memory" "none,load,store"
   (cond [(eq_attr "type" "idiv,lea")
 	 (const_string "none")
@@ -133,9 +149,26 @@
  (and (eq_attr "type" "fpop,fcompare") (eq_attr "cpu" "pentium,pentiumpro")) 
  3 0)
 
+;; Most FP instructions are decoded in u pipe
+(define_function_unit "upipe" 1 0
+ (and (eq_attr "type" "fpop,fcompare,fld,fpmul,fpdiv") (eq_attr "cpu" "pentium")) 
+ 1 0) 
+
+;; But some block the v pipe too
+(define_function_unit "vpipe" 1 0
+ (and (and (eq_attr "type" "fpop,fcompare,fld,fpmul") (eq_attr "cpu" "pentium")) 
+      (eq_attr "pipes" "!fx"))
+ 1 0) 
+
 (define_function_unit "fp" 1 0
  (and (eq_attr "type" "fpmul") (eq_attr "cpu" "pentium")) 
- 7 0)
+ 7 0) 
+;; It is recommended to put one fp instruction between two fmuls,
+;; since the unit is not completely pipelined
+(define_function_unit "fpmul" 1 1
+ (and (eq_attr "type" "fpmul") (eq_attr "cpu" "pentium")) 
+ 2 2) 
+
 
 (define_function_unit "fp" 1 0
  (and (eq_attr "type" "fpmul") (eq_attr "cpu" "pentiumpro")) 
@@ -150,9 +183,18 @@
  6 0)
 
 (define_function_unit "fp" 1 0
- (eq_attr "type" "fpdiv") 
+ (and (eq_attr "type" "fpdiv") 
+ (eq_attr "cpu" "!pentium"))
  10 10)
 
+;; fpdiv takes 38 cycles. 2 cycles should be used for fp instructions and
+;; rest for integer ones.
+(define_function_unit "fp" 1 0
+ (and (eq_attr "type" "fpdiv") 
+ (eq_attr "cpu" "pentium"))
+ 38 36)
+
+
 (define_function_unit "fp" 1 0
   (and (eq_attr "type" "fld") (eq_attr "cpu" "!pentiumpro,k6"))
  1 0)
@@ -165,7 +207,28 @@
 ;; i386 and i486 have one integer unit, which need not be modeled
 
 (define_function_unit "integer" 2 0
-  (and (eq_attr "type" "integer,binary,test,compare,lea") (eq_attr "cpu" "pentium,pentiumpro"))
+  (and (eq_attr "type" "integer,binary,test,compare,lea") (eq_attr "cpu" "pentiumpro"))
+ 1 0)
+
+;; The Pentium has u and v pipelines. They work in a very strange way, so
+;; this is just an approximation
+(define_function_unit "upipe" 1 0
+  (and (and (eq_attr "type" "integer,binary,test,compare,lea") (eq_attr "cpu" "pentium"))
+       (eq_attr "pipes" "u"))
+ 1 0)
+(define_function_unit "upipe" 1 0
+  (and (and (eq_attr "type" "integer,binary,test,compare,lea") (eq_attr "cpu" "pentium"))
+       (eq_attr "prefix" "true"))
+ 2 0) ;; one extra cycle
+
+(define_function_unit "vpipe" 1 0
+  (and (and (eq_attr "type" "integer,binary,test,compare,lea") (eq_attr "cpu" "pentium"))
+       (eq_attr "pipes" "v,none"))
+ 1 0)
+; uv instruction takes one extra cycle to avoid dependencies in uv only code
+(define_function_unit "vpipe" 1 0
+  (and (and (eq_attr "type" "integer,binary,test,compare,lea") (eq_attr "cpu" "pentium"))
+       (and (eq_attr "pipes" "uv") (eq_attr "prefix" "false")))
  1 0)
 
 (define_function_unit "integer" 2 0
@@ -182,16 +245,29 @@
 	    (eq_attr "memory" "load")))
   3 0)
 
-;; Multiplies use one of the integer units
-(define_function_unit "integer" 2 0
+(define_function_unit "upipe" 1 0
+  (and (eq_attr "cpu" "pentium") (eq_attr "type" "imul"))
+  11 11)
+(define_function_unit "vpipe" 1 0
   (and (eq_attr "cpu" "pentium") (eq_attr "type" "imul"))
   11 11)
+(define_function_unit "fp" 1 0
+  (and (eq_attr "cpu" "pentium") (eq_attr "type" "imul"))
+  11 11)
+
+;; Even fp unit is blocked
 
 (define_function_unit "integer" 2 0
   (and (eq_attr "cpu" "k6") (eq_attr "type" "imul"))
   2 2)
 
-(define_function_unit "integer" 2 0
+(define_function_unit "upipe" 1 0
+  (and (eq_attr "cpu" "pentium") (eq_attr "type" "idiv"))
+  25 25)
+(define_function_unit "vpipe" 1 0
+  (and (eq_attr "cpu" "pentium") (eq_attr "type" "idiv"))
+  25 25)
+(define_function_unit "fp" 1 0
   (and (eq_attr "cpu" "pentium") (eq_attr "type" "idiv"))
   25 25)
 
@@ -217,7 +293,6 @@
 (define_function_unit "store" 1 0
   (and (eq_attr "cpu" "k6") (eq_attr "type" "lea"))
   1 0)
-
 \f
 ;; "movl MEM,REG / testl REG,REG" is faster on a 486 than "cmpl $0,MEM".
 ;; But restricting MEM here would mean that gcc could not remove a redundant
@@ -243,13 +318,17 @@
   ""
   "*
 {
-  if (REG_P (operands[0]))
+  if (REG_P(operands[0]) && 
+     (ix86_cpu != PROCESSOR_PENTIUM || optimize_size || !REGNO (operands[0])))
+	  /* For obscure reasons test is pairable on the Pentium only with the
+	   * accumulator.  Use the pairable CMP with other registers.  */
     return AS2 (test%L0,%0,%0);
 
   operands[1] = const0_rtx;
   return AS2 (cmp%L0,%1,%0);
 }"
-  [(set_attr "type" "test")])
+  [(set_attr "pipes" "uv")
+   (set_attr "type" "test")])
 
 (define_expand "tstsi"
   [(set (cc0)
@@ -265,17 +344,22 @@
 
 (define_insn "tsthi_1"
   [(set (cc0)
-	(match_operand:HI 0 "nonimmediate_operand" "rm"))]
+	(match_operand:HI 0 "nonimmediate_operand" "a,r,m"))]
   ""
   "*
 {
-  if (REG_P (operands[0]))
+  if (REG_P(operands[0]) && 
+     (ix86_cpu != PROCESSOR_PENTIUM || optimize_size || !REGNO (operands[0])))
+	  /* For obscure reasons test is pairable on the Pentium only with the
+	   * accumulator.  Use the pairable CMP with other registers.  */
     return AS2 (test%W0,%0,%0);
 
   operands[1] = const0_rtx;
   return AS2 (cmp%W0,%1,%0);
 }"
-  [(set_attr "type" "test")])
+  [(set_attr "pipes" "uv,none,uv")
+   (set_attr "prefix" "true")
+   (set_attr "type" "test")])
 
 (define_expand "tsthi"
   [(set (cc0)
@@ -291,17 +375,21 @@
 
 (define_insn "tstqi_1"
   [(set (cc0)
-	(match_operand:QI 0 "nonimmediate_operand" "qm"))]
+	(match_operand:QI 0 "nonimmediate_operand" "q,m"))]
   ""
   "*
 {
-  if (REG_P (operands[0]))
+  if (REG_P(operands[0]) && 
+     (ix86_cpu != PROCESSOR_PENTIUM || optimize_size || !REGNO (operands[0])))
+	  /* For obscure reasons test is pairable on the Pentium only with the
+	   * accumulator.  Use the pairable CMP with other registers.  */
     return AS2 (test%B0,%0,%0);
 
   operands[1] = const0_rtx;
   return AS2 (cmp%B0,%1,%0);
 }"
-  [(set_attr "type" "test")])
+  [(set_attr "pipes" "uv")
+   (set_attr "type" "test")])
 
 (define_expand "tstqi"
   [(set (cc0)
@@ -429,7 +517,8 @@
 		 (match_operand:SI 1 "general_operand" "ri,mr")))]
   "GET_CODE (operands[0]) != MEM || GET_CODE (operands[1]) != MEM"
   "* return AS2 (cmp%L0,%1,%0);"
-  [(set_attr "type" "compare")])
+  [(set_attr "pipes" "uv")
+   (set_attr "type" "compare")])
 
 (define_expand "cmpsi"
   [(set (cc0)
@@ -453,7 +542,9 @@
 		 (match_operand:HI 1 "general_operand" "ri,mr")))]
   "GET_CODE (operands[0]) != MEM || GET_CODE (operands[1]) != MEM"
   "* return AS2 (cmp%W0,%1,%0);"
-  [(set_attr "type" "compare")])
+  [(set_attr "prefix" "true")
+   (set_attr "pipes" "uv")
+   (set_attr "type" "compare")])
 
 (define_expand "cmphi"
   [(set (cc0)
@@ -477,7 +568,8 @@
 		 (match_operand:QI 1 "general_operand" "qm,nq")))]
   "GET_CODE (operands[0]) != MEM || GET_CODE (operands[1]) != MEM"
   "* return AS2 (cmp%B0,%1,%0);"
-  [(set_attr "type" "compare")])
+  [(set_attr "pipes" "uv")
+   (set_attr "type" "compare")])
 
 (define_expand "cmpqi"
   [(set (cc0)
@@ -825,8 +917,8 @@
 
 (define_insn ""
   [(set (cc0)
-	(and:SI (match_operand:SI 0 "general_operand" "%ro")
-		(match_operand:SI 1 "nonmemory_operand" "ri")))]
+	(and:SI (match_operand:SI 0 "general_operand" "%ro,a,ro")
+		(match_operand:SI 1 "nonmemory_operand" "r,i,i")))]
   ""
   "*
 {
@@ -880,12 +972,13 @@
 
   return AS2 (test%L1,%0,%1);
 }"
-  [(set_attr "type" "compare")])
+  [(set_attr "pipes" "uv,uv,none")
+   (set_attr "type" "compare")])
 
 (define_insn ""
   [(set (cc0)
-	(and:HI (match_operand:HI 0 "general_operand" "%ro")
-		(match_operand:HI 1 "nonmemory_operand" "ri")))]
+	(and:HI (match_operand:HI 0 "general_operand" "%ro,a,ro")
+		(match_operand:HI 1 "nonmemory_operand" "r,i,i")))]
   ""
   "*
 {
@@ -929,12 +1022,15 @@
 
   return AS2 (test%W1,%0,%1);
 }"
-  [(set_attr "type" "compare")])
+  [
+   (set_attr "pipes" "uv,uv,none")
+   (set_attr "prefix" "true,false,false") ; FIXME - bit too optimistic
+   (set_attr "type" "compare")])
 
 (define_insn ""
   [(set (cc0)
-	(and:QI (match_operand:QI 0 "nonimmediate_operand" "%qm")
-		(match_operand:QI 1 "nonmemory_operand" "qi")))]
+	(and:QI (match_operand:QI 0 "nonimmediate_operand" "%qm,a,qm")
+		(match_operand:QI 1 "nonmemory_operand" "q,i,i")))]
   ""
   "*
 {
@@ -943,7 +1039,8 @@
 
   return AS2 (test%B1,%0,%1);
 }"
-  [(set_attr "type" "compare")])
+  [(set_attr "pipes" "uv,uv,none")
+   (set_attr "type" "compare")])
 \f
 ;; move instructions.
 ;; There is one for each machine mode,
@@ -955,14 +1052,16 @@
 	(match_operand:SI 1 "nonmemory_operand" "rn"))]
   "flag_pic"
   "* return AS1 (push%L0,%1);"
-  [(set_attr "memory" "store")])
+  [(set_attr "pipes" "uv")
+   (set_attr "memory" "store")])
 
 (define_insn ""
   [(set (match_operand:SI 0 "push_operand" "=<")
 	(match_operand:SI 1 "nonmemory_operand" "ri"))]
   "!flag_pic"
   "* return AS1 (push%L0,%1);"
-  [(set_attr "memory" "store")])
+  [(set_attr "pipes" "uv")
+   (set_attr "memory" "store")])
 
 ;; On a 386, it is faster to push MEM directly.
 
@@ -1037,7 +1136,8 @@
 
   return AS2 (mov%L0,%1,%0);
 }"
-  [(set_attr "type" "integer,integer,memory")
+  [(set_attr "pipes" "uv")
+   (set_attr "type" "integer,integer,memory")
    (set_attr "memory" "*,*,load")])
 
 (define_insn ""
@@ -1069,7 +1169,8 @@
 
   return AS2 (mov%L0,%1,%0);
 }"
-  [(set_attr "type" "integer,memory")
+  [(set_attr "pipes" "uv")
+   (set_attr "type" "integer,memory")
    (set_attr "memory" "*,load")])
 
 (define_insn ""
@@ -1077,7 +1178,9 @@
 	(match_operand:HI 1 "nonmemory_operand" "ri"))]
   ""
   "* return AS1 (push%W0,%1);"
-  [(set_attr "type" "memory")
+  [(set_attr "pipes" "uv")
+   (set_attr "prefix" "true")
+   (set_attr "type" "memory")
    (set_attr "memory" "store")])
 
 (define_insn ""
@@ -1107,8 +1210,8 @@
 }")
 
 (define_insn ""
-  [(set (match_operand:HI 0 "general_operand" "=g,r")
-	(match_operand:HI 1 "general_operand" "ri,m"))]
+  [(set (match_operand:HI 0 "general_operand" "=r,m,r")
+	(match_operand:HI 1 "general_operand" "ri,ri,m"))]
   "(!TARGET_MOVE || GET_CODE (operands[0]) != MEM) || (GET_CODE (operands[1]) != MEM)"
   "*
 {
@@ -1151,8 +1254,10 @@
 
   return AS2 (mov%W0,%1,%0);
 }"
-  [(set_attr "type" "integer,memory")
-   (set_attr "memory" "*,load")])
+  [(set_attr "prefix" "false,true,true")
+   (set_attr "pipes" "uv")
+   (set_attr "type" "integer,integer,memory")
+   (set_attr "memory" "*,*,load")])
 
 (define_expand "movstricthi"
   [(set (strict_low_part (match_operand:HI 0 "general_operand" ""))
@@ -1197,7 +1302,9 @@
 
   return AS2 (mov%W0,%1,%0);
 }"
-  [(set_attr "type" "integer,memory")])
+  [(set_attr "prefix" "true")
+   (set_attr "pipes" "uv")
+   (set_attr "type" "integer,memory")])
 
 ;; emit_push_insn when it calls move_by_pieces
 ;; requires an insn to "push a byte".
@@ -1207,7 +1314,9 @@
   [(set (match_operand:QI 0 "push_operand" "=<")
 	(match_operand:QI 1 "const_int_operand" "n"))]
   ""
-  "* return AS1(push%W0,%1);")
+  "* return AS1(push%W0,%1);"
+  [(set_attr "prefix" "true")
+   (set_attr "pipes" "uv")])
 
 (define_insn ""
   [(set (match_operand:QI 0 "push_operand" "=<")
@@ -1217,7 +1326,9 @@
 {
   operands[1] = gen_rtx_REG (HImode, REGNO (operands[1]));
   return AS1 (push%W0,%1);
-}")
+}"
+  [(set_attr "prefix" "true")
+   (set_attr "pipes" "uv")])
 
 ;; On i486, incb reg is faster than movb $1,reg.
 
@@ -1275,7 +1386,8 @@
     return (AS2 (mov%L0,%k1,%k0));
 
   return (AS2 (mov%B0,%1,%0));
-}")
+}"
+  [(set_attr "pipes" "uv")])
 
 ;; If it becomes necessary to support movstrictqi into %esi or %edi,
 ;; use the insn sequence:
@@ -1334,7 +1446,8 @@
     }
 
   return AS2 (mov%B0,%1,%0);
-}")
+}"
+  [(set_attr "pipes" "uv")])
 
 (define_insn "movsf_push"
   [(set (match_operand:SF 0 "push_operand" "=<,<")
@@ -1454,7 +1567,8 @@
 
   return singlemove_string (operands);
 }"
-  [(set_attr "type" "fld")])
+  [(set_attr "type" "fld")
+   (set_attr "pipes" "none,fx,fx,none")])
 
 
 (define_insn "swapsf"
@@ -1469,7 +1583,7 @@
     return AS1 (fxch,%1);
   else
     return AS1 (fxch,%0);
-}")
+}" [(set_attr "pipes" "v")])
 
 
 (define_insn "movdf_push"
@@ -1592,7 +1706,8 @@
 
   return output_move_double (operands);
 }"
-  [(set_attr "type" "fld")])
+  [(set_attr "type" "fld")
+   (set_attr "pipes" "none,fx,fx,none")])
 
 
 
@@ -1608,7 +1723,7 @@
     return AS1 (fxch,%1);
   else
     return AS1 (fxch,%0);
-}")
+}" [(set_attr "pipes" "v")])
 
 (define_insn "movxf_push"
   [(set (match_operand:XF 0 "push_operand" "=<,<")
@@ -1743,7 +1858,7 @@
     return AS1 (fxch,%1);
   else
     return AS1 (fxch,%0);
-}")
+}" [(set_attr "pipes" "v")])
 
 (define_insn ""
   [(set (match_operand:DI 0 "push_operand" "=<")
@@ -1829,7 +1944,9 @@
 #else
   return AS2 (movz%W0%L0,%1,%0);
 #endif
-}")
+}"
+   [(set_attr "pipes" "uv")
+    (set_attr "prefix" "false,true,true")])
 
 (define_split
   [(set (match_operand:SI 0 "register_operand" "")
@@ -1892,7 +2009,8 @@
 #else
   return AS2 (movz%B0%W0,%1,%0);
 #endif
-}")
+}"
+   [(set_attr "pipes" "uv")])
 
 (define_split
   [(set (match_operand:HI 0 "register_operand" "")
@@ -1983,7 +2101,8 @@
 #else
   return AS2 (movz%B0%L0,%1,%0);
 #endif
-}")
+}"
+   [(set_attr "pipes" "uv")])
 
 (define_split
   [(set (match_operand:SI 0 "register_operand" "")
@@ -2021,37 +2140,9 @@
 	       (const_int 255)))]
  "operands[2] = gen_rtx_REG (SImode, true_regnum (operands[1]));")
 
-(define_insn "zero_extendsidi2"
-  [(set (match_operand:DI 0 "nonimmediate_operand" "=A,r,?r,?m")
-	(zero_extend:DI (match_operand:SI 1 "register_operand" "0,0,rm,r")))]
-  ""
-  "*
-  {
-  rtx high[2], low[2], xops[4];
 
-  if (REG_P (operands[0]) && REG_P (operands[1])
-      && REGNO (operands[0]) == REGNO (operands[1]))
-    {
-      operands[0] = gen_rtx_REG (SImode, REGNO (operands[0]) + 1);
-      return AS2 (xor%L0,%0,%0);
-    }
+;; - sign extension instructions
 
-  split_di (operands, 1, low, high);
-  xops[0] = low[0];
-  xops[1] = operands[1];
-  xops[2] = high[0];
-  xops[3] = const0_rtx;
-
-  output_asm_insn (AS2 (mov%L0,%1,%0), xops);
-  if (GET_CODE (low[0]) == MEM)
-    output_asm_insn (AS2 (mov%L2,%3,%2), xops);
-  else
-    output_asm_insn (AS2 (xor%L2,%2,%2), xops);
-
-  RET;
-}")
-\f
-;;- sign extension instructions
 
 (define_insn "extendsidi2"
   [(set (match_operand:DI 0 "register_operand" "=r")
@@ -2074,7 +2165,8 @@
 
   operands[0] = GEN_INT (31);
   return AS2 (sar%L1,%0,%1);
-}")
+}"
+ [(set_attr "pipes" "none,v")])
 
 ;; Note that the i386 programmers' manual says that the opcodes
 ;; are named movsx..., but the assembler on Unix does not accept that.
@@ -2148,7 +2240,7 @@
     {
       rtx target = gen_reg_rtx (SImode);
       emit_insn (gen_truncdisi2 (target, operands[1]));
-      emit_move_insn (operands[0], target);
+      emit_move_insn (operands[0], const0_rtx);
       DONE;
     }
 }")
@@ -2168,7 +2260,8 @@
     output_asm_insn (AS2 (mov%L0,%1,%0), xops);
 
   RET;
-}")
+}"
+   [(set_attr "pipes" "uv")])
 
 (define_insn ""
   [(set (match_operand:SI 0 "nonimmediate_operand" "=r,m")
@@ -2186,7 +2279,8 @@
     output_asm_insn (AS2 (mov%L0,%1,%0), xops);
 
   RET;
-}")
+}"
+   [(set_attr "pipes" "uv")])
 
 
 \f
@@ -3001,9 +3095,9 @@
   "IX86_EXPAND_BINARY_OPERATOR (PLUS, SImode, operands);")
 
 (define_insn ""
-  [(set (match_operand:SI 0 "nonimmediate_operand" "=r,rm,r")
-	(plus:SI (match_operand:SI 1 "nonimmediate_operand" "%0,0,r")
-		 (match_operand:SI 2 "general_operand" "rmi,ri,ri")))]
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=r,rm,r,r")
+	(plus:SI (match_operand:SI 1 "nonimmediate_operand" "%0,0,r,r")
+		 (match_operand:SI 2 "general_operand" "rmi,ri,0,ri")))]
   "ix86_binary_operator_ok (PLUS, SImode, operands)"
   "*
 {
@@ -3054,7 +3148,8 @@
 
   return AS2 (add%L0,%2,%0);
 }"
-  [(set_attr "type" "binary")])
+  [(set_attr "type" "binary,binary,binary,lea")
+   (set_attr "pipes" "uv")])
 
 ;; addsi3 is faster, so put this after.
 
@@ -3084,7 +3179,8 @@
   CC_STATUS_INIT;
   return AS2 (lea%L0,%a1,%0);
 }"
-  [(set_attr "type" "lea")])
+  [(set_attr "type" "lea")
+   (set_attr "pipes" "uv")])
 
 ;; ??? `lea' here, for three operand add?  If leaw is used, only %bx,
 ;; %si and %di can appear in SET_SRC, and output_asm_insn might not be
@@ -3155,7 +3251,9 @@
 
   return AS2 (add%W0,%2,%0);
 }"
-  [(set_attr "type" "binary")])
+  [(set_attr "type" "binary")
+   (set_attr "prefix" "true")
+   (set_attr "pipes" "uv")])
 
 (define_expand "addqi3"
   [(set (match_operand:QI 0 "general_operand" "")
@@ -3181,7 +3279,8 @@
 
   return AS2 (add%B0,%2,%0);
 }"
-  [(set_attr "type" "binary")])
+  [(set_attr "type" "binary")
+   (set_attr "pipes" "uv")])
 
 ;Lennart Augustsson <augustss@cs.chalmers.se>
 ;says this pattern just makes slower code:
@@ -3370,7 +3469,8 @@
 		  (match_operand:SI 2 "general_operand" "ri,rm")))]
   "ix86_binary_operator_ok (MINUS, SImode, operands)"
   "* return AS2 (sub%L0,%2,%0);"
-  [(set_attr "type" "binary")])
+  [(set_attr "type" "binary")
+   (set_attr "pipes" "uv")])
 
 (define_expand "subhi3"
   [(set (match_operand:HI 0 "general_operand" "")
@@ -3411,7 +3511,8 @@
 		  (match_operand:QI 2 "general_operand" "qn,qmn")))]
   "ix86_binary_operator_ok (MINUS, QImode, operands)"
   "* return AS2 (sub%B0,%2,%0);"
-  [(set_attr "type" "binary")])
+  [(set_attr "type" "binary")
+   (set_attr "pipes" "uv")])
 
 ;; The patterns that match these are at the end of this file.
 
@@ -3459,7 +3560,8 @@
     return AS2 (imul%W0,%2,%0);
   return AS3 (imul%W0,%2,%1,%0);
 }"
-  [(set_attr "type" "imul")])
+  [(set_attr "type" "imul")
+   (set_attr "prefix" "true")])
 
 (define_insn "mulsi3"
   [(set (match_operand:SI 0 "register_operand" "=r,r")
@@ -3507,7 +3609,8 @@
 		 (sign_extend:DI (match_operand:SI 2 "nonimmediate_operand" "rm"))))]
   "TARGET_WIDE_MULTIPLY"
   "imul%L0 %2"
-  [(set_attr "type" "imul")])
+  [(set_attr "type" "imul")
+   (set_attr "prefix" "true")])
 
 (define_insn "umulsi3_highpart"
   [(set (match_operand:SI 0 "register_operand" "=d")
@@ -3596,18 +3699,13 @@
 
 (define_insn "divmodsi4"
   [(set (match_operand:SI 0 "register_operand" "=a")
-	(div:SI (match_operand:SI 1 "register_operand" "0")
-		(match_operand:SI 2 "nonimmediate_operand" "rm")))
-   (set (match_operand:SI 3 "register_operand" "=&d")
-	(mod:SI (match_dup 1) (match_dup 2)))]
+	(truncate:SI (div:DI (match_operand:DI 1 "register_operand" "A")
+		 (sign_extend:DI (match_operand:SI 2 "nonimmediate_operand" "rm")))))
+   (set (match_operand:SI 3 "register_operand" "=d")
+	(truncate:SI (mod:DI (match_dup 1) (sign_extend:DI (match_dup 2)))))]
   ""
   "*
 {
-#ifdef INTEL_SYNTAX
-  output_asm_insn (\"cdq\", operands);
-#else
-  output_asm_insn (\"cltd\", operands);
-#endif
   return AS1 (idiv%L0,%2);
 }"
   [(set_attr "type" "idiv")])
@@ -3620,23 +3718,47 @@
 	(mod:HI (match_dup 1) (match_dup 2)))]
   ""
   "cwtd\;idiv%W0 %2"
-  [(set_attr "type" "idiv")])
+  [(set_attr "type" "idiv")
+   (set_attr "prefix" "true")])
 
 ;; ??? Can we make gcc zero extend operand[0]?
+;; Possibly this way, but I am not sure what I am doing, so it might be
+;; complete nonsense. Seems to work in simple cases.
+;;
+;; In the following case it generates worse code:
+;;  unsigned int u=rand(),b=rand(),c;
+;;  asm(""::"d"(u));
+;;  c=u%b;
+;; The rather funny output is:
+;; movl %ebx,%esi
+;; xorl %edi,%edi
+;; movl %esi,%eax
+;; movl %edi,%edx
+;; divl %ecx
+;; I hope this situation is rare and that the advantages of better scheduling
+;; etc. will hide this.
+
+
 (define_insn "udivmodsi4"
   [(set (match_operand:SI 0 "register_operand" "=a")
-	(udiv:SI (match_operand:SI 1 "register_operand" "0")
-		 (match_operand:SI 2 "nonimmediate_operand" "rm")))
-   (set (match_operand:SI 3 "register_operand" "=&d")
-	(umod:SI (match_dup 1) (match_dup 2)))]
+	(truncate:SI (udiv:DI (match_operand:DI 1 "register_operand" "A")
+		 (zero_extend:DI (match_operand:SI 2 "nonimmediate_operand" "rm")))))
+   (set (match_operand:SI 3 "register_operand" "=d")
+	(truncate:SI (umod:DI (match_dup 1) (zero_extend:DI (match_dup 2)))))]
   ""
-  "*
-{
-  output_asm_insn (AS2 (xor%L3,%3,%3), operands);
-  return AS1 (div%L0,%2);
-}"
+  "div%L0 %2"
   [(set_attr "type" "idiv")])
-
+/*
+(define_insn "udivmodsi4"
+  [(set (subreg:SI (match_operand:DI 0 "register_operand" "=A") 0)
+	(truncate:SI (udiv:DI (match_operand:DI 1 "register_operand" "0")
+		     (zero_extend:DI (match_operand:SI 2 "nonimmediate_operand" "rm")))))
+   (set (subreg:SI (match_dup 0) 1)
+	(truncate:SI (umod:DI (match_dup 1) (zero_extend:DI (match_dup 2)))))]
+  ""
+  "div%L0 %2"
+  [(set_attr "type" "idiv")])
+*/
 ;; ??? Can we make gcc zero extend operand[0]?
 (define_insn "udivmodhi4"
   [(set (match_operand:HI 0 "register_operand" "=a")
@@ -3650,7 +3772,8 @@
   output_asm_insn (AS2 (xor%W0,%3,%3), operands);
   return AS1 (div%W0,%2);
 }"
-  [(set_attr "type" "idiv")])
+  [(set_attr "type" "idiv")
+   (set_attr "prefix" "true")])
 
 /*
 ;;this should be a valid double division which we may want to add
@@ -3838,7 +3961,8 @@
 
   return AS2 (and%L0,%2,%0);
 }"
-  [(set_attr "type" "binary")])
+  [(set_attr "type" "binary")
+   (set_attr "pipes" "uv")])
 
 (define_insn "andhi3"
   [(set (match_operand:HI 0 "nonimmediate_operand" "=rm,r")
@@ -3917,7 +4041,9 @@
 
   return AS2 (and%W0,%2,%0);
 }"
-  [(set_attr "type" "binary")])
+  [(set_attr "type" "binary")
+   (set_attr "prefix" "true")
+   (set_attr "pipes" "uv")])
 
 (define_insn "andqi3"
   [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,q")
@@ -3925,7 +4051,8 @@
 		(match_operand:QI 2 "general_operand" "qn,qmn")))]
   ""
   "* return AS2 (and%B0,%2,%0);"
-  [(set_attr "type" "binary")])
+  [(set_attr "type" "binary")
+   (set_attr "pipes" "uv")])
 
 /* I am nervous about these two.. add them later..
 ;I presume this means that we have something in say op0= eax which is small
@@ -4042,7 +4169,8 @@
 
   return AS2 (or%L0,%2,%0);
 }"
-  [(set_attr "type" "binary")])
+  [(set_attr "type" "binary")
+   (set_attr "pipes" "uv")])
 
 (define_insn "iorhi3"
   [(set (match_operand:HI 0 "nonimmediate_operand" "=rm,r")
@@ -4127,7 +4255,9 @@
 
   return AS2 (or%W0,%2,%0);
 }"
-  [(set_attr "type" "binary")])
+  [(set_attr "type" "binary")
+   (set_attr "prefix" "true")
+   (set_attr "pipes" "uv")])
 
 (define_insn "iorqi3"
   [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,q")
@@ -4135,7 +4265,8 @@
 		(match_operand:QI 2 "general_operand" "qn,qmn")))]
   ""
   "* return AS2 (or%B0,%2,%0);"
-  [(set_attr "type" "binary")])
+  [(set_attr "type" "binary")
+   (set_attr "pipes" "uv")])
 \f
 ;;- xor instructions
 
@@ -4171,7 +4302,8 @@
 byte_xor_operation:
 	    CC_STATUS_INIT;
 	      
-	    if (intval == 0xff)
+	    if (intval == 0xff && (optimize_size ||
+		    ix86_cpu!=PROCESSOR_PENTIUM))
 	      return AS1 (not%B0,%b0);
 
 	    if (intval != INTVAL (operands[2]))
@@ -4187,7 +4319,8 @@
 	  if (REG_P (operands[0]))
 	    {
 	      CC_STATUS_INIT;
-	      if (intval == 0xff)
+	      if (intval == 0xff && (optimize_size ||
+		      ix86_cpu!=PROCESSOR_PENTIUM))
 		return AS1 (not%B0,%h0);
 
 	      operands[2] = GEN_INT (intval);
@@ -4224,7 +4357,8 @@
 
   return AS2 (xor%L0,%2,%0);
 }"
-  [(set_attr "type" "binary")])
+  [(set_attr "type" "binary")
+   (set_attr "pipes" "uv")])
 
 (define_insn "xorhi3"
   [(set (match_operand:HI 0 "nonimmediate_operand" "=rm,r")
@@ -4244,7 +4378,8 @@
 	  if (INTVAL (operands[2]) & 0xffff0000)
 	    operands[2] = GEN_INT (INTVAL (operands[2]) & 0xffff);
 
-	  if (INTVAL (operands[2]) == 0xff)
+	  if (INTVAL (operands[2]) == 0xff && (optimize_size ||
+		  ix86_cpu!=PROCESSOR_PENTIUM))
 	    return AS1 (not%B0,%b0);
 
 	  return AS2 (xor%B0,%2,%b0);
@@ -4258,9 +4393,9 @@
 	  CC_STATUS_INIT;
 	  operands[2] = GEN_INT ((INTVAL (operands[2]) >> 8) & 0xff);
 
-	  if (INTVAL (operands[2]) == 0xff)
+	  if (INTVAL (operands[2]) == 0xff && (optimize_size ||
+	         ix86_cpu!=PROCESSOR_PENTIUM))
 	    return AS1 (not%B0,%h0);
-
 	  return AS2 (xor%B0,%2,%h0);
 	}
     }
@@ -4286,7 +4421,9 @@
 
   return AS2 (xor%W0,%2,%0);
 }"
-  [(set_attr "type" "binary")])
+  [(set_attr "type" "binary")
+   (set_attr "prefix" "true")
+   (set_attr "pipes" "uv")])
 
 (define_insn "xorqi3"
   [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,q")
@@ -4294,7 +4431,8 @@
 		(match_operand:QI 2 "general_operand" "qn,qm")))]
   ""
   "* return AS2 (xor%B0,%2,%0);"
-  [(set_attr "type" "binary")])
+  [(set_attr "type" "binary")
+   (set_attr "pipes" "uv")])
 \f
 ;; logical operations for DImode
 
@@ -4367,7 +4505,8 @@
   [(set (match_operand:HI 0 "nonimmediate_operand" "=rm")
 	(neg:HI (match_operand:HI 1 "nonimmediate_operand" "0")))]
   ""
-  "neg%W0 %0")
+  "neg%W0 %0"
+  [(set_attr "prefix" "true")])
 
 (define_insn "negqi2"
   [(set (match_operand:QI 0 "nonimmediate_operand" "=qm")
@@ -4379,31 +4518,36 @@
   [(set (match_operand:SF 0 "register_operand" "=f")
 	(neg:SF (match_operand:SF 1 "register_operand" "0")))]
   "TARGET_80387"
-  "fchs")
+  "fchs"
+  [(set_attr "pipes" "fx")])
 
 (define_insn "negdf2"
   [(set (match_operand:DF 0 "register_operand" "=f")
 	(neg:DF (match_operand:DF 1 "register_operand" "0")))]
   "TARGET_80387"
-  "fchs")
+  "fchs"
+  [(set_attr "pipes" "fx")])
 
 (define_insn ""
   [(set (match_operand:DF 0 "register_operand" "=f")
 	(neg:DF (float_extend:DF (match_operand:SF 1 "register_operand" "0"))))]
   "TARGET_80387"
-  "fchs")
+  "fchs"
+  [(set_attr "pipes" "fx")])
 
 (define_insn "negxf2"
   [(set (match_operand:XF 0 "register_operand" "=f")
 	(neg:XF (match_operand:XF 1 "register_operand" "0")))]
   "TARGET_80387"
-  "fchs")
+  "fchs"
+  [(set_attr "pipes" "fx")])
 
 (define_insn ""
   [(set (match_operand:XF 0 "register_operand" "=f")
 	(neg:XF (float_extend:XF (match_operand:DF 1 "register_operand" "0"))))]
   "TARGET_80387"
-  "fchs")
+  "fchs"
+  [(set_attr "pipes" "fx")])
 \f
 ;; Absolute value instructions
 
@@ -4412,35 +4556,40 @@
 	(abs:SF (match_operand:SF 1 "register_operand" "0")))]
   "TARGET_80387"
   "fabs"
-  [(set_attr "type" "fpop")])
+  [(set_attr "type" "fpop")
+   (set_attr "pipes" "fx")])
 
 (define_insn "absdf2"
   [(set (match_operand:DF 0 "register_operand" "=f")
 	(abs:DF (match_operand:DF 1 "register_operand" "0")))]
   "TARGET_80387"
   "fabs"
-  [(set_attr "type" "fpop")])
+  [(set_attr "type" "fpop")
+   (set_attr "pipes" "fx")])
 
 (define_insn ""
   [(set (match_operand:DF 0 "register_operand" "=f")
 	(abs:DF (float_extend:DF (match_operand:SF 1 "register_operand" "0"))))]
   "TARGET_80387"
   "fabs"
-  [(set_attr "type" "fpop")])
+  [(set_attr "type" "fpop")
+   (set_attr "pipes" "fx")])
 
 (define_insn "absxf2"
   [(set (match_operand:XF 0 "register_operand" "=f")
 	(abs:XF (match_operand:XF 1 "register_operand" "0")))]
   "TARGET_80387"
   "fabs"
-  [(set_attr "type" "fpop")])
+  [(set_attr "type" "fpop")
+   (set_attr "pipes" "fx")])
 
 (define_insn ""
   [(set (match_operand:XF 0 "register_operand" "=f")
 	(abs:XF (float_extend:XF (match_operand:DF 1 "register_operand" "0"))))]
   "TARGET_80387"
   "fabs"
-  [(set_attr "type" "fpop")])
+  [(set_attr "type" "fpop")
+   (set_attr "pipes" "fx")])
 
 (define_insn "sqrtsf2"
   [(set (match_operand:SF 0 "register_operand" "=f")
@@ -4536,22 +4685,53 @@
 ;;- one complement instructions
 
 (define_insn "one_cmplsi2"
-  [(set (match_operand:SI 0 "nonimmediate_operand" "=rm")
-	(not:SI (match_operand:SI 1 "nonimmediate_operand" "0")))]
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=r,o,m")
+	(not:SI (match_operand:SI 1 "nonimmediate_operand" "0,0,0")))]
   ""
-  "not%L0 %0")
+  "*
+  rtx xops[2];
+     if (ix86_cpu == PROCESSOR_PENTIUM && !optimize_size)
+     {
+       xops[0] = operands[0];
+       xops[1] = GEN_INT (0xffffffff);
+       output_asm_insn(AS2 (xor%L0,%1,%0),xops);
+       RET;
+     }
+    return AS1 (not%L0,%0);"
+  [(set_attr "pipes" "uv,uv,u")])
 
 (define_insn "one_cmplhi2"
-  [(set (match_operand:HI 0 "nonimmediate_operand" "=rm")
-	(not:HI (match_operand:HI 1 "nonimmediate_operand" "0")))]
+  [(set (match_operand:HI 0 "nonimmediate_operand" "=r,o,m")
+	(not:HI (match_operand:HI 1 "nonimmediate_operand" "0,0,0")))]
   ""
-  "not%W0 %0")
+  "*
+  rtx xops[2];
+     if (ix86_cpu == PROCESSOR_PENTIUM && !optimize_size)
+     {
+       xops[0] = operands[0];
+       xops[1] = GEN_INT (0xffff);
+       output_asm_insn(AS2 (xor%W0,%1,%0),xops);
+       RET;
+     }
+    return AS1 (not%W0,%0);"
+  [(set_attr "prefix" "true")
+   (set_attr "pipes" "uv,uv,u")])
 
 (define_insn "one_cmplqi2"
-  [(set (match_operand:QI 0 "nonimmediate_operand" "=qm")
-	(not:QI (match_operand:QI 1 "nonimmediate_operand" "0")))]
+  [(set (match_operand:QI 0 "nonimmediate_operand" "=q,o,m")
+	(not:QI (match_operand:QI 1 "nonimmediate_operand" "0,0,0")))]
   ""
-  "not%B0 %0")
+  "*
+  rtx xops[2];
+     if (ix86_cpu == PROCESSOR_PENTIUM && !optimize_size)
+     {
+       xops[0] = operands[0];
+       xops[1] = GEN_INT (0xff);
+       output_asm_insn(AS2 (xor%B0,%1,%0),xops);
+       RET;
+     }
+    return AS1 (not%B0,%0);"
+  [(set_attr "pipes" "uv,uv,u")])
 \f
 ;;- arithmetic shift instructions
 
@@ -4631,7 +4811,8 @@
       output_asm_insn (AS2 (sal%L2,%0,%2), xops);
     }
   RET;
-}")
+}"
+  [(set_attr "pipes" "u")])
 
 (define_insn "ashldi3_non_const_int"
   [(set (match_operand:DI 0 "register_operand" "=&r")
@@ -4667,9 +4848,9 @@
 ;; is smaller - use leal for now unless the shift count is 1.
 
 (define_insn "ashlsi3"
-  [(set (match_operand:SI 0 "nonimmediate_operand" "=r,rm")
-	(ashift:SI (match_operand:SI 1 "nonimmediate_operand" "r,0")
-		   (match_operand:SI 2 "nonmemory_operand" "M,cI")))]
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=r,rm,rm")
+	(ashift:SI (match_operand:SI 1 "nonimmediate_operand" "r,0,0")
+		   (match_operand:SI 2 "nonmemory_operand" "M,I,c")))]
   ""
   "*
 {
@@ -4702,12 +4883,13 @@
     return AS2 (add%L0,%0,%0);
 
   return AS2 (sal%L0,%2,%0);
-}")
+}"
+  [(set_attr "pipes" "u,u,none")])
 
 (define_insn "ashlhi3"
-  [(set (match_operand:HI 0 "nonimmediate_operand" "=rm")
-	(ashift:HI (match_operand:HI 1 "nonimmediate_operand" "0")
-		   (match_operand:HI 2 "nonmemory_operand" "cI")))]
+  [(set (match_operand:HI 0 "nonimmediate_operand" "=rm,rm")
+	(ashift:HI (match_operand:HI 1 "nonimmediate_operand" "0,0")
+		   (match_operand:HI 2 "nonmemory_operand" "I,c")))]
   ""
   "*
 {
@@ -4718,12 +4900,14 @@
     return AS2 (add%W0,%0,%0);
 
   return AS2 (sal%W0,%2,%0);
-}")
+}"
+  [(set_attr "pipes" "u,none")
+   (set_attr "prefix" "true")])
 
 (define_insn "ashlqi3"
-  [(set (match_operand:QI 0 "nonimmediate_operand" "=qm")
-	(ashift:QI (match_operand:QI 1 "nonimmediate_operand" "0")
-		   (match_operand:QI 2 "nonmemory_operand" "cI")))]
+  [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,qm")
+	(ashift:QI (match_operand:QI 1 "nonimmediate_operand" "0,0")
+		   (match_operand:QI 2 "nonmemory_operand" "I,c")))]
   ""
   "*
 {
@@ -4734,7 +4918,8 @@
     return AS2 (add%B0,%0,%0);
 
   return AS2 (sal%B0,%2,%0);
-}")
+}"
+  [(set_attr "pipes" "u,none")])
 
 ;; See comment above `ashldi3' about how this works.
 
@@ -4781,7 +4966,8 @@
     output_asm_insn (AS2 (xor%L2,%2,%2), xops);
 
   RET;
-}")
+}"
+  [(set_attr "pipes" "uv")])
 
 (define_insn "ashrdi3_const_int"
   [(set (match_operand:DI 0 "register_operand" "=&r")
@@ -4852,9 +5038,9 @@
 }")
 
 (define_insn "ashrsi3"
-  [(set (match_operand:SI 0 "nonimmediate_operand" "=rm")
-	(ashiftrt:SI (match_operand:SI 1 "nonimmediate_operand" "0")
-		     (match_operand:SI 2 "nonmemory_operand" "cI")))]
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=rm,rm")
+	(ashiftrt:SI (match_operand:SI 1 "nonimmediate_operand" "0,0")
+		     (match_operand:SI 2 "nonmemory_operand" "I,c")))]
   ""
   "*
 {
@@ -4862,12 +5048,13 @@
     return AS2 (sar%L0,%b2,%0);
   else
     return AS2 (sar%L0,%2,%0);
-}")
+}"
+  [(set_attr "pipes" "u,none")])
 
 (define_insn "ashrhi3"
-  [(set (match_operand:HI 0 "nonimmediate_operand" "=rm")
-	(ashiftrt:HI (match_operand:HI 1 "nonimmediate_operand" "0")
-		     (match_operand:HI 2 "nonmemory_operand" "cI")))]
+  [(set (match_operand:HI 0 "nonimmediate_operand" "=rm,rm")
+	(ashiftrt:HI (match_operand:HI 1 "nonimmediate_operand" "0,0")
+		     (match_operand:HI 2 "nonmemory_operand" "I,c")))]
   ""
   "*
 {
@@ -4875,12 +5062,14 @@
     return AS2 (sar%W0,%b2,%0);
   else
     return AS2 (sar%W0,%2,%0);
-}")
+}"
+  [(set_attr "pipes" "u,none")
+   (set_attr "prefix" "true")])
 
 (define_insn "ashrqi3"
-  [(set (match_operand:QI 0 "nonimmediate_operand" "=qm")
-	(ashiftrt:QI (match_operand:QI 1 "nonimmediate_operand" "0")
-		     (match_operand:QI 2 "nonmemory_operand" "cI")))]
+  [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,qm")
+	(ashiftrt:QI (match_operand:QI 1 "nonimmediate_operand" "0,0")
+		     (match_operand:QI 2 "nonmemory_operand" "I,c")))]
   ""
   "*
 {
@@ -4888,7 +5077,8 @@
     return AS2 (sar%B0,%b2,%0);
   else
     return AS2 (sar%B0,%2,%0);
-}")
+}"
+  [(set_attr "pipes" "u,none")])
 \f
 ;;- logical shift instructions
 
@@ -5006,9 +5196,9 @@
 }")
 
 (define_insn "lshrsi3"
-  [(set (match_operand:SI 0 "nonimmediate_operand" "=rm")
-	(lshiftrt:SI (match_operand:SI 1 "nonimmediate_operand" "0")
-		     (match_operand:SI 2 "nonmemory_operand" "cI")))]
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=rm,rm")
+	(lshiftrt:SI (match_operand:SI 1 "nonimmediate_operand" "0,0")
+		     (match_operand:SI 2 "nonmemory_operand" "I,c")))]
   ""
   "*
 {
@@ -5016,12 +5206,13 @@
     return AS2 (shr%L0,%b2,%0);
   else
     return AS2 (shr%L0,%2,%1);
-}")
+}"
+  [(set_attr "pipes" "u,none")])
 
 (define_insn "lshrhi3"
-  [(set (match_operand:HI 0 "nonimmediate_operand" "=rm")
-	(lshiftrt:HI (match_operand:HI 1 "nonimmediate_operand" "0")
-		     (match_operand:HI 2 "nonmemory_operand" "cI")))]
+  [(set (match_operand:HI 0 "nonimmediate_operand" "=rm,rm")
+	(lshiftrt:HI (match_operand:HI 1 "nonimmediate_operand" "0,0")
+		     (match_operand:HI 2 "nonmemory_operand" "I,c")))]
   ""
   "*
 {
@@ -5029,12 +5220,14 @@
     return AS2 (shr%W0,%b2,%0);
   else
     return AS2 (shr%W0,%2,%0);
-}")
+}"
+  [(set_attr "prefix" "true")
+   (set_attr "pipes" "u,none")])
 
 (define_insn "lshrqi3"
-  [(set (match_operand:QI 0 "nonimmediate_operand" "=qm")
-	(lshiftrt:QI (match_operand:QI 1 "nonimmediate_operand" "0")
-		     (match_operand:QI 2 "nonmemory_operand" "cI")))]
+  [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,qm")
+	(lshiftrt:QI (match_operand:QI 1 "nonimmediate_operand" "0,0")
+		     (match_operand:QI 2 "nonmemory_operand" "I,c")))]
   ""
   "*
 {
@@ -5042,7 +5235,8 @@
     return AS2 (shr%B0,%b2,%0);
   else
     return AS2 (shr%B0,%2,%0);
-}")
+}"
+  [(set_attr "pipes" "u,none")])
 \f
 ;;- rotate instructions
 
@@ -5070,7 +5264,8 @@
     return AS2 (rol%W0,%b2,%0);
   else
     return AS2 (rol%W0,%2,%0);
-}")
+}"
+ [(set_attr "prefix" "true")])
 
 (define_insn "rotlqi3"
   [(set (match_operand:QI 0 "nonimmediate_operand" "=qm")
@@ -5109,7 +5304,8 @@
     return AS2 (ror%W0,%b2,%0);
   else
     return AS2 (ror%W0,%2,%0);
-}")
+}"
+ [(set_attr "prefix" "true")])
 
 (define_insn "rotrqi3"
   [(set (match_operand:QI 0 "nonimmediate_operand" "=qm")
@@ -5210,7 +5406,8 @@
     return AS2 (bts%L0,%2,%0);
   else
     return AS2 (btr%L0,%2,%0);
-}")
+}"
+ [(set_attr "prefix" "true")])
 
 ;; Bit complement.  See comments on previous pattern.
 ;; ??? Is this really worthwhile?
@@ -5225,7 +5422,8 @@
   CC_STATUS_INIT;
 
   return AS2 (btc%L0,%1,%0);
-}")
+}"
+ [(set_attr "prefix" "true")])
 
 (define_insn ""
   [(set (match_operand:SI 0 "nonimmediate_operand" "=rm")
@@ -5238,7 +5436,8 @@
   CC_STATUS_INIT;
 
   return AS2 (btc%L0,%2,%0);
-}")
+}"
+ [(set_attr "prefix" "true")])
 \f
 ;; Recognizers for bit-test instructions.
 
@@ -5259,12 +5458,13 @@
 {
   cc_status.flags |= CC_Z_IN_NOT_C;
   return AS2 (bt%L0,%1,%0);
-}")
+}"
+ [(set_attr "prefix" "true")])
 
 (define_insn ""
-  [(set (cc0) (zero_extract (match_operand:SI 0 "register_operand" "r")
-			    (match_operand:SI 1 "const_int_operand" "n")
-			    (match_operand:SI 2 "const_int_operand" "n")))]
+  [(set (cc0) (zero_extract (match_operand:SI 0 "register_operand" "a,r")
+			    (match_operand:SI 1 "const_int_operand" "n,n")
+			    (match_operand:SI 2 "const_int_operand" "n,n")))]
   ""
   "*
 {
@@ -5292,6 +5492,7 @@
   return AS2 (test%L0,%1,%0);
 }")
 
+
 ;; ??? All bets are off if operand 0 is a volatile MEM reference.
 ;; The CPU may access unspecified bytes around the actual target byte.
 
@@ -5350,7 +5551,8 @@
     return AS2 (test%L0,%1,%0);
 
   return AS2 (test%L1,%0,%1);
-}")
+}"
+ [(set_attr "pipes" "none")])
 \f
 ;; Store-flag instructions.
 
@@ -5671,7 +5873,8 @@
     return (char *)0;
 
   return AS1(j%D0,%l1);
-}")
+}"
+ [(set_attr "pipes" "v")])
 
 (define_insn ""
   [(set (pc)
@@ -5725,7 +5928,8 @@
     return (char *)0;
 
   return AS1(j%d0,%l1);
-}")
+}"
+ [(set_attr "pipes" "v")])
 \f
 ;; Unconditional and other jump instructions
 
@@ -5733,7 +5937,8 @@
   [(set (pc)
 	(label_ref (match_operand 0 "" "")))]
   ""
-  "jmp %l0")
+  "jmp %l0"
+ [(set_attr "pipes" "v")])
 
 (define_insn "indirect_jump"
   [(set (pc) (match_operand:SI 0 "nonimmediate_operand" "rm"))]
@@ -5743,7 +5948,8 @@
   CC_STATUS_INIT;
 
   return AS1 (jmp,%*%0);
-}")
+}"
+ [(set_attr "pipes" "v")])
 
 ;; ??? could transform while(--i > 0) S; to if (--i > 0) do S; while(--i);
 ;;     if S does not change i
@@ -6030,7 +6236,7 @@
 ;; call* patterns.  Each named pattern is followed by an unnamed pattern
 ;; that matches any call to a symbolic CONST (ie, a symbol_ref).  The
 ;; unnamed patterns are only used while generating PIC code, because
-;; otherwise the named patterns match.
+;; otherwise the named patterns match.
 
 ;; Call subroutine returning no value.
 
@@ -6075,7 +6281,8 @@
     }
   else
     return AS1 (call,%P0);
-}")
+}"
+ [(set_attr "pipes" "v")])
 
 (define_insn ""
   [(call (mem:QI (match_operand:SI 0 "symbolic_operand" ""))
@@ -6123,14 +6330,16 @@
     }
   else
     return AS1 (call,%P0);
-}")
+}"
+ [(set_attr "pipes" "v")])
 
 (define_insn ""
   [(call (mem:QI (match_operand:SI 0 "symbolic_operand" ""))
 	 (match_operand:SI 1 "general_operand" "g"))]
   ;; Operand 1 not used on the i386.
   "!HALF_PIC_P ()"
-  "call %P0")
+  "call %P0"
+ [(set_attr "pipes" "v")])
 
 ;; Call subroutine, returning value in operand 0
 ;; (which must be a hard register).
@@ -6180,7 +6389,8 @@
     output_asm_insn (AS1 (call,%P1), operands);
 
   RET;
-}")
+}"
+ [(set_attr "pipes" "v")])
 
 (define_insn ""
   [(set (match_operand 0 "" "=rf")
@@ -6233,7 +6443,8 @@
     output_asm_insn (AS1 (call,%P1), operands);
 
   RET;
-}")
+}"
+ [(set_attr "pipes" "v")])
 
 (define_insn ""
   [(set (match_operand 0 "" "=rf")
@@ -6241,7 +6452,8 @@
 	      (match_operand:SI 2 "general_operand" "g")))]
   ;; Operand 2 not used on the i386.
   "!HALF_PIC_P ()"
-  "call %P1")
+  "call %P1"
+ [(set_attr "pipes" "v")])
 
 ;; Call subroutine returning any type.
 
@@ -6338,7 +6550,8 @@
   xops[1] = stack_pointer_rtx;
   output_asm_insn (AS2 (sub%L1,%0,%1), xops);
   RET;
-}")
+}"
+ [(set_attr "pipes" "uv")])
 
 (define_insn "prologue_set_got"
   [(set (match_operand:SI 0 "" "")
@@ -6362,7 +6575,8 @@
       output_asm_insn (buffer, operands);
     }    
   RET;
-}")
+}"
+ [(set_attr "pipes" "uv")])
 
 (define_insn "prologue_get_pc"
   [(set (match_operand:SI 0 "" "")
@@ -6378,7 +6592,8 @@
       ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, \"L\", CODE_LABEL_NUMBER (operands[1]));
     }    
   RET;
-}")
+}"
+ [(set_attr "pipes" "uv")])
 
 (define_insn "prologue_get_pc_and_set_got"
   [(unspec_volatile [(match_operand:SI 0 "" "")] 3)]
@@ -6756,7 +6971,8 @@
 			 (match_operand:DF 2 "nonimmediate_operand" "fm,0")]))]
   "TARGET_80387"
   "* return output_387_binary_op (insn, operands);"
-  [(set (attr "type") 
+  [(set_attr "pipes" "fx")
+   (set (attr "type") 
         (cond [(match_operand:DF 3 "is_mul" "") 
                  (const_string "fpmul")
                (match_operand:DF 3 "is_div" "") 
@@ -6773,7 +6989,8 @@
 	    (match_operand:DF 2 "register_operand" "0")]))]
   "TARGET_80387"
   "* return output_387_binary_op (insn, operands);"
-  [(set (attr "type") 
+  [(set_attr "pipes" "fx")
+   (set (attr "type") 
         (cond [(match_operand:DF 3 "is_mul" "") 
                  (const_string "fpmul")
                (match_operand:DF 3 "is_div" "") 
@@ -6790,7 +7007,8 @@
 			 (match_operand:XF 2 "register_operand" "f,0")]))]
   "TARGET_80387"
   "* return output_387_binary_op (insn, operands);"
-  [(set (attr "type") 
+  [(set_attr "pipes" "fx")
+   (set (attr "type") 
         (cond [(match_operand:DF 3 "is_mul" "") 
                  (const_string "fpmul")
                (match_operand:DF 3 "is_div" "") 
@@ -6807,7 +7025,8 @@
 	    (match_operand:XF 2 "register_operand" "0")]))]
   "TARGET_80387"
   "* return output_387_binary_op (insn, operands);"
-  [(set (attr "type") 
+  [(set_attr "pipes" "fx")
+   (set (attr "type") 
         (cond [(match_operand:DF 3 "is_mul" "") 
                  (const_string "fpmul")
                (match_operand:DF 3 "is_div" "") 
@@ -6824,7 +7043,8 @@
 	    (match_operand:XF 2 "register_operand" "0,f")]))]
   "TARGET_80387"
   "* return output_387_binary_op (insn, operands);"
-  [(set (attr "type") 
+  [(set_attr "pipes" "fx")
+   (set (attr "type") 
         (cond [(match_operand:DF 3 "is_mul" "") 
                  (const_string "fpmul")
                (match_operand:DF 3 "is_div" "") 
@@ -6841,7 +7061,8 @@
 	   (float:XF (match_operand:SI 2 "nonimmediate_operand" "rm"))]))]
   "TARGET_80387"
   "* return output_387_binary_op (insn, operands);"
-  [(set (attr "type") 
+  [(set_attr "pipes" "fx")
+   (set (attr "type") 
         (cond [(match_operand:DF 3 "is_mul" "") 
                  (const_string "fpmul")
                (match_operand:DF 3 "is_div" "") 
@@ -6859,7 +7080,8 @@
 	    (match_operand:SF 2 "nonimmediate_operand" "fm,0"))]))]
   "TARGET_80387"
   "* return output_387_binary_op (insn, operands);"
-  [(set (attr "type") 
+  [(set_attr "pipes" "fx")
+   (set (attr "type") 
         (cond [(match_operand:DF 3 "is_mul" "") 
                  (const_string "fpmul")
                (match_operand:DF 3 "is_div" "") 
@@ -6876,7 +7098,8 @@
 	    (match_operand:DF 2 "register_operand" "0,f")]))]
   "TARGET_80387"
   "* return output_387_binary_op (insn, operands);"
-  [(set (attr "type") 
+  [(set_attr "pipes" "fx")
+   (set (attr "type") 
         (cond [(match_operand:DF 3 "is_mul" "") 
                  (const_string "fpmul")
                (match_operand:DF 3 "is_div" "") 
@@ -6893,7 +7116,8 @@
 	   (float:DF (match_operand:SI 2 "nonimmediate_operand" "rm"))]))]
   "TARGET_80387"
   "* return output_387_binary_op (insn, operands);"
-  [(set (attr "type") 
+  [(set_attr "pipes" "fx")
+   (set (attr "type") 
         (cond [(match_operand:DF 3 "is_mul" "") 
                  (const_string "fpmul")
                (match_operand:DF 3 "is_div" "") 
@@ -6911,7 +7135,8 @@
 	    (match_operand:SF 2 "nonimmediate_operand" "fm,0"))]))]
   "TARGET_80387"
   "* return output_387_binary_op (insn, operands);"
-  [(set (attr "type") 
+  [(set_attr "pipes" "fx")
+   (set (attr "type") 
         (cond [(match_operand:DF 3 "is_mul" "") 
                  (const_string "fpmul")
                (match_operand:DF 3 "is_div" "") 
@@ -6928,7 +7153,8 @@
 			 (match_operand:SF 2 "nonimmediate_operand" "fm,0")]))]
   "TARGET_80387"
   "* return output_387_binary_op (insn, operands);"
-  [(set (attr "type") 
+  [(set_attr "pipes" "fx")
+   (set (attr "type") 
         (cond [(match_operand:DF 3 "is_mul" "") 
                  (const_string "fpmul")
                (match_operand:DF 3 "is_div" "") 
@@ -6945,7 +7171,8 @@
 	   (match_operand:SF 2 "register_operand" "0")]))]
   "TARGET_80387"
   "* return output_387_binary_op (insn, operands);"
-  [(set (attr "type") 
+  [(set_attr "pipes" "fx")
+   (set (attr "type") 
         (cond [(match_operand:DF 3 "is_mul" "") 
                  (const_string "fpmul")
                (match_operand:DF 3 "is_div" "") 
@@ -6962,7 +7189,8 @@
 	   (float:SF (match_operand:SI 2 "nonimmediate_operand" "rm"))]))]
   "TARGET_80387"
   "* return output_387_binary_op (insn, operands);"
-  [(set (attr "type") 
+  [(set_attr "pipes" "fx")
+   (set (attr "type") 
         (cond [(match_operand:DF 3 "is_mul" "") 
                  (const_string "fpmul")
                (match_operand:DF 3 "is_div" "") 
@@ -7651,3 +7879,6 @@
   load_pic_register (1);
   DONE;
 }")
+
+
+


* Re: Various changes to i386.md
  1998-09-21 20:09 Jan Hubicka
@ 1998-09-22 14:12 ` Joern Rennecke
  0 siblings, 0 replies; 3+ messages in thread
From: Joern Rennecke @ 1998-09-22 14:12 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: egcs

>  o How to split the extendsi pattern? Is it possible to tell gcc to generate
>    one instruction when the register is eax and split it in other cases?

There are two possible approaches:

- Predicate the define_split on REGNO (operands[0]) == 0.  You then have
  to leave the extendsidi2 pattern fully capable of handling all cases,
  so that compilation still succeeds when not scheduling (e.g. at -O0).

- Use C code in the define_split to output the instructions in one or both
  cases, and finish with DONE.  This will suppress the generation of the
  instructions in the result template.
  If you cover both cases in the C code, you can use a stunted result
  template, e.g. [(const_int 0)] .


* Various changes to i386.md
@ 1998-09-21 20:09 Jan Hubicka
  1998-09-22 14:12 ` Joern Rennecke
  0 siblings, 1 reply; 3+ messages in thread
From: Jan Hubicka @ 1998-09-21 20:09 UTC (permalink / raw)
  To: egcs

Hi
Yesterday I played a bit with i386.md, just to see how it works.
I've made a few changes, and there is a remote possibility that some
of them could be useful for inclusion in egcs. I would like it if someone
could look at the changes I made and tell me which are good and which
are complete nonsense (mostly). So here is the list of changes/problems:

 o I've added a new function unit, FPMUL, to tell gcc that fpmul instructions
   on the Pentium are not completely pipelined and that it is a good idea to
   put another instruction between them. It seems to work and brings a 5-10%
   speedup in my Mandelbrot loop in XaoS.
 o The TEST instruction is pairable on the Pentium only with EAX as its
   operand (don't ask me why), so I've changed code generation to use CMP in
   other cases.
 o Changed the non-pairable NOT to XOR.
   The same should be done with NEG, but that is profitable only if the
   scheduler does a good job (one 1-cycle instruction expands to two pairable
   1-cycle instructions). Maybe we should split it and, if the scheduler
   doesn't put anything in between, use define_peephole or something similar
   to recombine it.
 o The FPDIV instruction takes 38 cycles on the Pentium; 2 cycles should
   overlap with other FP instructions and the rest with integer code, so I've
   described that to the scheduler.
 o I've started work on classifying instructions for the U and V pipelines.
   It is just an approximation for now. The patch also sets the attribute
   "prefix" if an instruction has the 32bit->16bit prefix. This should be
   useful for other processors too.
   I am not sure how to describe the behaviour of some patterns. For example,
   addhi3 has very strange behaviour, and I don't know if it is possible to
   describe when a pairable instruction will be generated and when not.
 o I've made an attempt to specify the behaviour of the Pentium pipelines in
   greater detail, so gcc is now less optimistic about them and doesn't try
   to pair imuls, divs and other similar instructions. This reduces register
   lifetimes, so it should help a bit (0-20% in my tests).
   To describe it, I say that some instructions use multiple units
   (non-pairable instructions use both). It seems to work with HAIFA, but I
   am not sure if it is correct.
 o To improve scheduling, it is probably a good idea to split existing
   instruction patterns so that each generates just one instruction when
   possible. So I've started with the divmod instruction patterns, because
   they looked interesting. I changed them to use zero_extend/truncate
   operations. It seems to work; only in the following case does it generate
   worse code:
    unsigned int u=rand(),b=rand(),c;
    asm(""::"d"(u));
    c=u%b;
   The quite funny output is:
   movl %ebx,%esi
   xorl %edi,%edi
   movl %esi,%eax
   movl %edi,%edx
   divl %ecx
   So I don't know if it was a good idea. Probably not...
 o I've also looked at the zero_extendsidi pattern. It doesn't do anything
   special, so IMO it should be omitted to enable gcc's default version.

   There is a problem with life analysis. GCC adds a clobber before it, which
   generates an unnecessary conflict between source and target. Possibly
   GCC's default version should be modified to handle this in a better way.
   I've changed optabs to generate REG_NO_CONFLICT for the clobber and the
   final move.

   The problem is that the code for handling REG_NO_CONFLICT is disabled in
   global.c, because it can't catch partial conflicts. So until this gets
   fixed, this is probably not the way to go.
   I could try to change global.c to handle this case correctly, if nobody
   plans larger changes to global.c.
 o How to split the extendsi pattern? Is it possible to tell gcc to generate
   one instruction when the register is eax and split it in other cases?
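
The NOT-to-XOR and TEST-to-CMP items in the list above can be sanity-checked
in plain C. What follows is only a sketch under stated assumptions: the
function names, the enum and the convention that register number 0 stands for
EAX/AX/AL are hypothetical illustrations and not part of i386.md. It shows
that XOR with an all-ones mask of the operand width (0xffffffff, 0xffff and
0xff, the constants used in the patch) gives the same value as NOT, and it
spells out the rule described above for choosing between TEST and CMP.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the CPU the compiler is tuning for. */
enum cpu_kind { CPU_I386, CPU_I486, CPU_PENTIUM, CPU_PENTIUMPRO };

/* NOT written as XOR with an all-ones mask of the operand width.
   The resulting value is identical; note that on x86 XOR modifies
   the flags while NOT leaves them alone.  */
static uint32_t not_via_xor32(uint32_t x) { return x ^ UINT32_C(0xffffffff); }
static uint16_t not_via_xor16(uint16_t x) { return (uint16_t)(x ^ 0xffffu); }
static uint8_t  not_via_xor8(uint8_t x)   { return (uint8_t)(x ^ 0xffu); }

/* Returns 1 when "test reg,reg" should be used and 0 when "cmp $0,op"
   is preferred.  Assumption: regno 0 denotes the accumulator.  */
static int prefer_test(enum cpu_kind cpu, int optimize_size,
                       int is_reg, unsigned regno)
{
    if (!is_reg)
        return 0;   /* memory operand: compare against zero */
    return cpu != CPU_PENTIUM || optimize_size || regno == 0;
}

/* A few spot checks of the two ideas.  */
static void check_examples(void)
{
    assert(not_via_xor32(0u) == UINT32_C(0xffffffff));
    assert(not_via_xor32(0xdeadbeefu) == (uint32_t)~UINT32_C(0xdeadbeef));
    assert(not_via_xor16(0x00ffu) == 0xff00u);
    assert(not_via_xor8(0x0fu) == 0xf0u);

    assert(prefer_test(CPU_PENTIUM, 0, 1, 0) == 1);  /* accumulator */
    assert(prefer_test(CPU_PENTIUM, 0, 1, 3) == 0);  /* other register */
    assert(prefer_test(CPU_PENTIUM, 1, 1, 3) == 1);  /* optimizing for size */
    assert(prefer_test(CPU_I486, 0, 1, 3) == 1);     /* not a Pentium */
    assert(prefer_test(CPU_PENTIUM, 0, 0, 0) == 0);  /* memory operand */
}
```

With these definitions, prefer_test(CPU_PENTIUM, 0, 1, 3) is 0 (use CMP),
while prefer_test(CPU_PENTIUM, 0, 1, 0) is 1 (the accumulator keeps TEST).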

And here is the patch:

--- i386.md.old	Sat Sep 19 07:23:32 1998
+++ i386.md	Sun Sep 20 18:37:41 1998
@@ -75,6 +75,22 @@
   "integer,binary,memory,test,compare,fcompare,idiv,imul,lea,fld,fpop,fpdiv,fpmul"
   (const_string "integer"))
 
+;; true if the instruction has a 32bit to 16bit switching prefix
+;; it is a _very_ rough approximation of the real situation, because many
+;; instruction patterns generate many different instructions, and I
+;; don't know how to write it more exactly. Someone should look at HI mode
+;; patterns and improve this.
+(define_attr "prefix"
+  "true,false"
+  (const_string "false"))
+
+;; pipelines used by pentium. FX is for floating point instructions that
+;; pair with fxch
+;; it is just an approximation, for exactly the same purposes as the "prefix" attribute
+(define_attr "pipes"
+  "none,u,v,uv,fx"
+  (const_string "none"))
+
 (define_attr "memory" "none,load,store"
   (cond [(eq_attr "type" "idiv,lea")
 	 (const_string "none")
@@ -133,9 +149,26 @@
  (and (eq_attr "type" "fpop,fcompare") (eq_attr "cpu" "pentium,pentiumpro")) 
  3 0)
 
+;; Most FP instructions are decoded in u pipe
+(define_function_unit "upipe" 1 0
+ (and (eq_attr "type" "fpop,fcompare,fld,fpmul,fpdiv") (eq_attr "cpu" "pentium")) 
+ 1 0) 
+
+;; But some block the vpipe too
+(define_function_unit "vpipe" 1 0
+ (and (and (eq_attr "type" "fpop,fcompare,fld,fpmul") (eq_attr "cpu" "pentium")) 
+      (eq_attr "pipes" "!fx"))
+ 1 0) 
+
 (define_function_unit "fp" 1 0
  (and (eq_attr "type" "fpmul") (eq_attr "cpu" "pentium")) 
- 7 0)
+ 7 0) 
+;; It is recommended to put one fp instruction between two fmuls,
+;; since the unit is not completely pipelined
+(define_function_unit "fpmul" 1 1
+ (and (eq_attr "type" "fpmul") (eq_attr "cpu" "pentium")) 
+ 2 2) 
+
 
 (define_function_unit "fp" 1 0
  (and (eq_attr "type" "fpmul") (eq_attr "cpu" "pentiumpro")) 
@@ -150,9 +183,18 @@
  6 0)
 
 (define_function_unit "fp" 1 0
- (eq_attr "type" "fpdiv") 
+ (and (eq_attr "type" "fpdiv") 
+ (eq_attr "cpu" "!pentium"))
  10 10)
 
+;; fpdiv takes 38 cycles. 2 cycles should be used for fp instructions and
+;; rest for integer ones.
+(define_function_unit "fp" 1 0
+ (and (eq_attr "type" "fpdiv") 
+ (eq_attr "cpu" "pentium"))
+ 38 36)
+
+
 (define_function_unit "fp" 1 0
   (and (eq_attr "type" "fld") (eq_attr "cpu" "!pentiumpro,k6"))
  1 0)
@@ -165,7 +207,28 @@
 ;; i386 and i486 have one integer unit, which need not be modeled
 
 (define_function_unit "integer" 2 0
-  (and (eq_attr "type" "integer,binary,test,compare,lea") (eq_attr "cpu" "pentium,pentiumpro"))
+  (and (eq_attr "type" "integer,binary,test,compare,lea") (eq_attr "cpu" "pentiumpro"))
+ 1 0)
+
+;; Pentium has u and v pipelines. They work in a very strange way, so this is
+;; just an approximation
+(define_function_unit "upipe" 1 0
+  (and (and (eq_attr "type" "integer,binary,test,compare,lea") (eq_attr "cpu" "pentium"))
+       (eq_attr "pipes" "u"))
+ 1 0)
+(define_function_unit "upipe" 1 0
+  (and (and (eq_attr "type" "integer,binary,test,compare,lea") (eq_attr "cpu" "pentium"))
+       (eq_attr "prefix" "true"))
+ 2 0) ;; one extra cycle
+
+(define_function_unit "vpipe" 1 0
+  (and (and (eq_attr "type" "integer,binary,test,compare,lea") (eq_attr "cpu" "pentium"))
+       (eq_attr "pipes" "v,none"))
+ 1 0)
+; uv instruction takes one extra cycle to avoid dependencies in uv only code
+(define_function_unit "vpipe" 1 0
+  (and (and (eq_attr "type" "integer,binary,test,compare,lea") (eq_attr "cpu" "pentium"))
+       (and (eq_attr "pipes" "uv") (eq_attr "prefix" "false")))
  1 0)
 
 (define_function_unit "integer" 2 0
@@ -182,16 +245,29 @@
 	    (eq_attr "memory" "load")))
   3 0)
 
-;; Multiplies use one of the integer units
-(define_function_unit "integer" 2 0
+(define_function_unit "upipe" 1 0
+  (and (eq_attr "cpu" "pentium") (eq_attr "type" "imul"))
+  11 11)
+(define_function_unit "vpipe" 1 0
   (and (eq_attr "cpu" "pentium") (eq_attr "type" "imul"))
   11 11)
+(define_function_unit "fp" 1 0
+  (and (eq_attr "cpu" "pentium") (eq_attr "type" "imul"))
+  11 11)
+
+;; Even fp unit is blocked
 
 (define_function_unit "integer" 2 0
   (and (eq_attr "cpu" "k6") (eq_attr "type" "imul"))
   2 2)
 
-(define_function_unit "integer" 2 0
+(define_function_unit "upipe" 1 0
+  (and (eq_attr "cpu" "pentium") (eq_attr "type" "idiv"))
+  25 25)
+(define_function_unit "vpipe" 1 0
+  (and (eq_attr "cpu" "pentium") (eq_attr "type" "idiv"))
+  25 25)
+(define_function_unit "fp" 1 0
   (and (eq_attr "cpu" "pentium") (eq_attr "type" "idiv"))
   25 25)
 
@@ -217,7 +293,6 @@
 (define_function_unit "store" 1 0
   (and (eq_attr "cpu" "k6") (eq_attr "type" "lea"))
   1 0)
-
 \f
 ;; "movl MEM,REG / testl REG,REG" is faster on a 486 than "cmpl $0,MEM".
 ;; But restricting MEM here would mean that gcc could not remove a redundant
@@ -243,13 +318,17 @@
   ""
   "*
 {
-  if (REG_P (operands[0]))
+  if (REG_P(operands[0]) && 
+     (ix86_cpu != PROCESSOR_PENTIUM || optimize_size || !REGNO (operands[0])))
+	  /* For obscure reasons TEST is pairable on the Pentium only with the
+	   * accumulator.  Use the pairable CMP with other registers.  */
     return AS2 (test%L0,%0,%0);
 
   operands[1] = const0_rtx;
   return AS2 (cmp%L0,%1,%0);
 }"
-  [(set_attr "type" "test")])
+  [(set_attr "pipes" "uv")
+   (set_attr "type" "test")])
 
 (define_expand "tstsi"
   [(set (cc0)
@@ -265,17 +344,22 @@
 
 (define_insn "tsthi_1"
   [(set (cc0)
-	(match_operand:HI 0 "nonimmediate_operand" "rm"))]
+	(match_operand:HI 0 "nonimmediate_operand" "a,r,m"))]
   ""
   "*
 {
-  if (REG_P (operands[0]))
+  if (REG_P(operands[0]) && 
+     (ix86_cpu != PROCESSOR_PENTIUM || optimize_size || !REGNO (operands[0])))
+	  /* For obscure reasons TEST is pairable on the Pentium only with the
+	   * accumulator.  Use the pairable CMP with other registers.  */
     return AS2 (test%W0,%0,%0);
 
   operands[1] = const0_rtx;
   return AS2 (cmp%W0,%1,%0);
 }"
-  [(set_attr "type" "test")])
+  [(set_attr "pipes" "uv,none,uv")
+   (set_attr "prefix" "true")
+   (set_attr "type" "test")])
 
 (define_expand "tsthi"
   [(set (cc0)
@@ -291,17 +375,21 @@
 
 (define_insn "tstqi_1"
   [(set (cc0)
-	(match_operand:QI 0 "nonimmediate_operand" "qm"))]
+	(match_operand:QI 0 "nonimmediate_operand" "q,m"))]
   ""
   "*
 {
-  if (REG_P (operands[0]))
+  if (REG_P(operands[0]) && 
+     (ix86_cpu != PROCESSOR_PENTIUM || optimize_size || !REGNO (operands[0])))
+	  /* For obscure reasons TEST is pairable on the Pentium only with the
+	   * accumulator.  Use the pairable CMP with other registers.  */
     return AS2 (test%B0,%0,%0);
 
   operands[1] = const0_rtx;
   return AS2 (cmp%B0,%1,%0);
 }"
-  [(set_attr "type" "test")])
+  [(set_attr "pipes" "uv")
+   (set_attr "type" "test")])
 
 (define_expand "tstqi"
   [(set (cc0)
@@ -429,7 +517,8 @@
 		 (match_operand:SI 1 "general_operand" "ri,mr")))]
   "GET_CODE (operands[0]) != MEM || GET_CODE (operands[1]) != MEM"
   "* return AS2 (cmp%L0,%1,%0);"
-  [(set_attr "type" "compare")])
+  [(set_attr "pipes" "uv")
+   (set_attr "type" "compare")])
 
 (define_expand "cmpsi"
   [(set (cc0)
@@ -453,7 +542,9 @@
 		 (match_operand:HI 1 "general_operand" "ri,mr")))]
   "GET_CODE (operands[0]) != MEM || GET_CODE (operands[1]) != MEM"
   "* return AS2 (cmp%W0,%1,%0);"
-  [(set_attr "type" "compare")])
+  [(set_attr "prefix" "true")
+   (set_attr "pipes" "uv")
+   (set_attr "type" "compare")])
 
 (define_expand "cmphi"
   [(set (cc0)
@@ -477,7 +568,8 @@
 		 (match_operand:QI 1 "general_operand" "qm,nq")))]
   "GET_CODE (operands[0]) != MEM || GET_CODE (operands[1]) != MEM"
   "* return AS2 (cmp%B0,%1,%0);"
-  [(set_attr "type" "compare")])
+  [(set_attr "pipes" "uv")
+   (set_attr "type" "compare")])
 
 (define_expand "cmpqi"
   [(set (cc0)
@@ -825,8 +917,8 @@
 
 (define_insn ""
   [(set (cc0)
-	(and:SI (match_operand:SI 0 "general_operand" "%ro")
-		(match_operand:SI 1 "nonmemory_operand" "ri")))]
+	(and:SI (match_operand:SI 0 "general_operand" "%ro,a,ro")
+		(match_operand:SI 1 "nonmemory_operand" "r,i,i")))]
   ""
   "*
 {
@@ -880,12 +972,13 @@
 
   return AS2 (test%L1,%0,%1);
 }"
-  [(set_attr "type" "compare")])
+  [(set_attr "pipes" "uv,uv,none")
+   (set_attr "type" "compare")])
 
 (define_insn ""
   [(set (cc0)
-	(and:HI (match_operand:HI 0 "general_operand" "%ro")
-		(match_operand:HI 1 "nonmemory_operand" "ri")))]
+	(and:HI (match_operand:HI 0 "general_operand" "%ro,a,ro")
+		(match_operand:HI 1 "nonmemory_operand" "r,i,i")))]
   ""
   "*
 {
@@ -929,12 +1022,15 @@
 
   return AS2 (test%W1,%0,%1);
 }"
-  [(set_attr "type" "compare")])
+  [
+   (set_attr "pipes" "uv,uv,none")
+   (set_attr "prefix" "true,false,false") ; FIXME - bit too optimistic
+   (set_attr "type" "compare")])
 
 (define_insn ""
   [(set (cc0)
-	(and:QI (match_operand:QI 0 "nonimmediate_operand" "%qm")
-		(match_operand:QI 1 "nonmemory_operand" "qi")))]
+	(and:QI (match_operand:QI 0 "nonimmediate_operand" "%qm,a,qm")
+		(match_operand:QI 1 "nonmemory_operand" "q,i,i")))]
   ""
   "*
 {
@@ -943,7 +1039,8 @@
 
   return AS2 (test%B1,%0,%1);
 }"
-  [(set_attr "type" "compare")])
+  [(set_attr "pipes" "uv,uv,none")
+   (set_attr "type" "compare")])
 \f
 ;; move instructions.
 ;; There is one for each machine mode,
@@ -955,14 +1052,16 @@
 	(match_operand:SI 1 "nonmemory_operand" "rn"))]
   "flag_pic"
   "* return AS1 (push%L0,%1);"
-  [(set_attr "memory" "store")])
+  [(set_attr "pipes" "uv")
+   (set_attr "memory" "store")])
 
 (define_insn ""
   [(set (match_operand:SI 0 "push_operand" "=<")
 	(match_operand:SI 1 "nonmemory_operand" "ri"))]
   "!flag_pic"
   "* return AS1 (push%L0,%1);"
-  [(set_attr "memory" "store")])
+  [(set_attr "pipes" "uv")
+   (set_attr "memory" "store")])
 
 ;; On a 386, it is faster to push MEM directly.
 
@@ -1037,7 +1136,8 @@
 
   return AS2 (mov%L0,%1,%0);
 }"
-  [(set_attr "type" "integer,integer,memory")
+  [(set_attr "pipes" "uv")
+   (set_attr "type" "integer,integer,memory")
    (set_attr "memory" "*,*,load")])
 
 (define_insn ""
@@ -1069,7 +1169,8 @@
 
   return AS2 (mov%L0,%1,%0);
 }"
-  [(set_attr "type" "integer,memory")
+  [(set_attr "pipes" "uv")
+   (set_attr "type" "integer,memory")
    (set_attr "memory" "*,load")])
 
 (define_insn ""
@@ -1077,7 +1178,9 @@
 	(match_operand:HI 1 "nonmemory_operand" "ri"))]
   ""
   "* return AS1 (push%W0,%1);"
-  [(set_attr "type" "memory")
+  [(set_attr "pipes" "uv")
+   (set_attr "prefix" "true")
+   (set_attr "type" "memory")
    (set_attr "memory" "store")])
 
 (define_insn ""
@@ -1107,8 +1210,8 @@
 }")
 
 (define_insn ""
-  [(set (match_operand:HI 0 "general_operand" "=g,r")
-	(match_operand:HI 1 "general_operand" "ri,m"))]
+  [(set (match_operand:HI 0 "general_operand" "=r,m,r")
+	(match_operand:HI 1 "general_operand" "ri,ri,m"))]
   "(!TARGET_MOVE || GET_CODE (operands[0]) != MEM) || (GET_CODE (operands[1]) != MEM)"
   "*
 {
@@ -1151,8 +1254,10 @@
 
   return AS2 (mov%W0,%1,%0);
 }"
-  [(set_attr "type" "integer,memory")
-   (set_attr "memory" "*,load")])
+  [(set_attr "prefix" "false,true,true")
+   (set_attr "pipes" "uv")
+   (set_attr "type" "integer,integer,memory")
+   (set_attr "memory" "*,*,load")])
 
 (define_expand "movstricthi"
   [(set (strict_low_part (match_operand:HI 0 "general_operand" ""))
@@ -1197,7 +1302,9 @@
 
   return AS2 (mov%W0,%1,%0);
 }"
-  [(set_attr "type" "integer,memory")])
+  [(set_attr "prefix" "true")
+   (set_attr "pipes" "uv")
+   (set_attr "type" "integer,memory")])
 
 ;; emit_push_insn when it calls move_by_pieces
 ;; requires an insn to "push a byte".
@@ -1207,7 +1314,9 @@
   [(set (match_operand:QI 0 "push_operand" "=<")
 	(match_operand:QI 1 "const_int_operand" "n"))]
   ""
-  "* return AS1(push%W0,%1);")
+  "* return AS1(push%W0,%1);"
+  [(set_attr "prefix" "true")
+   (set_attr "pipes" "uv")])
 
 (define_insn ""
   [(set (match_operand:QI 0 "push_operand" "=<")
@@ -1217,7 +1326,9 @@
 {
   operands[1] = gen_rtx_REG (HImode, REGNO (operands[1]));
   return AS1 (push%W0,%1);
-}")
+}"
+  [(set_attr "prefix" "true")
+   (set_attr "pipes" "uv")])
 
 ;; On i486, incb reg is faster than movb $1,reg.
 
@@ -1275,7 +1386,8 @@
     return (AS2 (mov%L0,%k1,%k0));
 
   return (AS2 (mov%B0,%1,%0));
-}")
+}"
+  [(set_attr "pipes" "uv")])
 
 ;; If it becomes necessary to support movstrictqi into %esi or %edi,
 ;; use the insn sequence:
@@ -1334,7 +1446,8 @@
     }
 
   return AS2 (mov%B0,%1,%0);
-}")
+}"
+  [(set_attr "pipes" "uv")])
 
 (define_insn "movsf_push"
   [(set (match_operand:SF 0 "push_operand" "=<,<")
@@ -1454,7 +1567,8 @@
 
   return singlemove_string (operands);
 }"
-  [(set_attr "type" "fld")])
+  [(set_attr "type" "fld")
+   (set_attr "pipes" "none,fx,fx,none")])
 
 
 (define_insn "swapsf"
@@ -1469,7 +1583,7 @@
     return AS1 (fxch,%1);
   else
     return AS1 (fxch,%0);
-}")
+}" [(set_attr "pipes" "v")])
 
 
 (define_insn "movdf_push"
@@ -1592,7 +1706,8 @@
 
   return output_move_double (operands);
 }"
-  [(set_attr "type" "fld")])
+  [(set_attr "type" "fld")
+   (set_attr "pipes" "none,fx,fx,none")])
 
 
 
@@ -1608,7 +1723,7 @@
     return AS1 (fxch,%1);
   else
     return AS1 (fxch,%0);
-}")
+}" [(set_attr "pipes" "v")])
 
 (define_insn "movxf_push"
   [(set (match_operand:XF 0 "push_operand" "=<,<")
@@ -1743,7 +1858,7 @@
     return AS1 (fxch,%1);
   else
     return AS1 (fxch,%0);
-}")
+}" [(set_attr "pipes" "v")])
 
 (define_insn ""
   [(set (match_operand:DI 0 "push_operand" "=<")
@@ -1829,7 +1944,9 @@
 #else
   return AS2 (movz%W0%L0,%1,%0);
 #endif
-}")
+}"
+   [(set_attr "pipes" "uv")
+    (set_attr "prefix" "false,true,true")])
 
 (define_split
   [(set (match_operand:SI 0 "register_operand" "")
@@ -1892,7 +2009,8 @@
 #else
   return AS2 (movz%B0%W0,%1,%0);
 #endif
-}")
+}"
+   [(set_attr "pipes" "uv")])
 
 (define_split
   [(set (match_operand:HI 0 "register_operand" "")
@@ -1983,7 +2101,8 @@
 #else
   return AS2 (movz%B0%L0,%1,%0);
 #endif
-}")
+}"
+   [(set_attr "pipes" "uv")])
 
 (define_split
   [(set (match_operand:SI 0 "register_operand" "")
@@ -2021,37 +2140,9 @@
 	       (const_int 255)))]
  "operands[2] = gen_rtx_REG (SImode, true_regnum (operands[1]));")
 
-(define_insn "zero_extendsidi2"
-  [(set (match_operand:DI 0 "nonimmediate_operand" "=A,r,?r,?m")
-	(zero_extend:DI (match_operand:SI 1 "register_operand" "0,0,rm,r")))]
-  ""
-  "*
-  {
-  rtx high[2], low[2], xops[4];
 
-  if (REG_P (operands[0]) && REG_P (operands[1])
-      && REGNO (operands[0]) == REGNO (operands[1]))
-    {
-      operands[0] = gen_rtx_REG (SImode, REGNO (operands[0]) + 1);
-      return AS2 (xor%L0,%0,%0);
-    }
+;; - sign extension instructions
 
-  split_di (operands, 1, low, high);
-  xops[0] = low[0];
-  xops[1] = operands[1];
-  xops[2] = high[0];
-  xops[3] = const0_rtx;
-
-  output_asm_insn (AS2 (mov%L0,%1,%0), xops);
-  if (GET_CODE (low[0]) == MEM)
-    output_asm_insn (AS2 (mov%L2,%3,%2), xops);
-  else
-    output_asm_insn (AS2 (xor%L2,%2,%2), xops);
-
-  RET;
-}")
-\f
-;;- sign extension instructions
 
 (define_insn "extendsidi2"
   [(set (match_operand:DI 0 "register_operand" "=r")
@@ -2074,7 +2165,8 @@
 
   operands[0] = GEN_INT (31);
   return AS2 (sar%L1,%0,%1);
-}")
+}"
+ [(set_attr "pipes" "none,v")])
 
 ;; Note that the i386 programmers' manual says that the opcodes
 ;; are named movsx..., but the assembler on Unix does not accept that.
@@ -2148,7 +2240,7 @@
     {
       rtx target = gen_reg_rtx (SImode);
       emit_insn (gen_truncdisi2 (target, operands[1]));
-      emit_move_insn (operands[0], target);
+      emit_move_insn (operands[0], const0_rtx);
       DONE;
     }
 }")
@@ -2168,7 +2260,8 @@
     output_asm_insn (AS2 (mov%L0,%1,%0), xops);
 
   RET;
-}")
+}"
+   [(set_attr "pipes" "uv")])
 
 (define_insn ""
   [(set (match_operand:SI 0 "nonimmediate_operand" "=r,m")
@@ -2186,7 +2279,8 @@
     output_asm_insn (AS2 (mov%L0,%1,%0), xops);
 
   RET;
-}")
+}"
+   [(set_attr "pipes" "uv")])
 
 
 \f
@@ -3001,9 +3095,9 @@
   "IX86_EXPAND_BINARY_OPERATOR (PLUS, SImode, operands);")
 
 (define_insn ""
-  [(set (match_operand:SI 0 "nonimmediate_operand" "=r,rm,r")
-	(plus:SI (match_operand:SI 1 "nonimmediate_operand" "%0,0,r")
-		 (match_operand:SI 2 "general_operand" "rmi,ri,ri")))]
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=r,rm,r,r")
+	(plus:SI (match_operand:SI 1 "nonimmediate_operand" "%0,0,r,r")
+		 (match_operand:SI 2 "general_operand" "rmi,ri,0,ri")))]
   "ix86_binary_operator_ok (PLUS, SImode, operands)"
   "*
 {
@@ -3054,7 +3148,8 @@
 
   return AS2 (add%L0,%2,%0);
 }"
-  [(set_attr "type" "binary")])
+  [(set_attr "type" "binary,binary,binary,lea")
+   (set_attr "pipes" "uv")])
 
 ;; addsi3 is faster, so put this after.
 
@@ -3084,7 +3179,8 @@
   CC_STATUS_INIT;
   return AS2 (lea%L0,%a1,%0);
 }"
-  [(set_attr "type" "lea")])
+  [(set_attr "type" "lea")
+   (set_attr "pipes" "uv")])
 
 ;; ??? `lea' here, for three operand add?  If leaw is used, only %bx,
 ;; %si and %di can appear in SET_SRC, and output_asm_insn might not be
@@ -3155,7 +3251,9 @@
 
   return AS2 (add%W0,%2,%0);
 }"
-  [(set_attr "type" "binary")])
+  [(set_attr "type" "binary")
+   (set_attr "prefix" "true")
+   (set_attr "pipes" "uv")])
 
 (define_expand "addqi3"
   [(set (match_operand:QI 0 "general_operand" "")
@@ -3181,7 +3279,8 @@
 
   return AS2 (add%B0,%2,%0);
 }"
-  [(set_attr "type" "binary")])
+  [(set_attr "type" "binary")
+   (set_attr "pipes" "uv")])
 
 ;Lennart Augustsson <augustss@cs.chalmers.se>
 ;says this pattern just makes slower code:
@@ -3370,7 +3469,8 @@
 		  (match_operand:SI 2 "general_operand" "ri,rm")))]
   "ix86_binary_operator_ok (MINUS, SImode, operands)"
   "* return AS2 (sub%L0,%2,%0);"
-  [(set_attr "type" "binary")])
+  [(set_attr "type" "binary")
+   (set_attr "pipes" "uv")])
 
 (define_expand "subhi3"
   [(set (match_operand:HI 0 "general_operand" "")
@@ -3411,7 +3511,8 @@
 		  (match_operand:QI 2 "general_operand" "qn,qmn")))]
   "ix86_binary_operator_ok (MINUS, QImode, operands)"
   "* return AS2 (sub%B0,%2,%0);"
-  [(set_attr "type" "binary")])
+  [(set_attr "type" "binary")
+   (set_attr "pipes" "uv")])
 
 ;; The patterns that match these are at the end of this file.
 
@@ -3459,7 +3560,8 @@
     return AS2 (imul%W0,%2,%0);
   return AS3 (imul%W0,%2,%1,%0);
 }"
-  [(set_attr "type" "imul")])
+  [(set_attr "type" "imul")
+   (set_attr "prefix" "true")])
 
 (define_insn "mulsi3"
   [(set (match_operand:SI 0 "register_operand" "=r,r")
@@ -3507,7 +3609,8 @@
 		 (sign_extend:DI (match_operand:SI 2 "nonimmediate_operand" "rm"))))]
   "TARGET_WIDE_MULTIPLY"
   "imul%L0 %2"
-  [(set_attr "type" "imul")])
+  [(set_attr "type" "imul")
+   (set_attr "prefix" "true")])
 
 (define_insn "umulsi3_highpart"
   [(set (match_operand:SI 0 "register_operand" "=d")
@@ -3596,18 +3699,13 @@
 
 (define_insn "divmodsi4"
   [(set (match_operand:SI 0 "register_operand" "=a")
-	(div:SI (match_operand:SI 1 "register_operand" "0")
-		(match_operand:SI 2 "nonimmediate_operand" "rm")))
-   (set (match_operand:SI 3 "register_operand" "=&d")
-	(mod:SI (match_dup 1) (match_dup 2)))]
+	(truncate:SI (div:DI (match_operand:DI 1 "register_operand" "A")
+		 (sign_extend:DI (match_operand:SI 2 "nonimmediate_operand" "rm")))))
+   (set (match_operand:SI 3 "register_operand" "=d")
+	(truncate:SI (mod:DI (match_dup 1) (sign_extend:DI (match_dup 2)))))]
   ""
   "*
 {
-#ifdef INTEL_SYNTAX
-  output_asm_insn (\"cdq\", operands);
-#else
-  output_asm_insn (\"cltd\", operands);
-#endif
   return AS1 (idiv%L0,%2);
 }"
   [(set_attr "type" "idiv")])
@@ -3620,23 +3718,47 @@
 	(mod:HI (match_dup 1) (match_dup 2)))]
   ""
   "cwtd\;idiv%W0 %2"
-  [(set_attr "type" "idiv")])
+  [(set_attr "type" "idiv")
+   (set_attr "prefix" "true")])
 
 ;; ??? Can we make gcc zero extend operand[0]?
+;; Possibly this way, but I am not sure what I am doing, so it might be
+;; complete nonsense.  It seems to work in simple cases.
+;;
+;; In the following case it generates worse code:
+;;  unsigned int u=rand(),b=rand(),c;
+;;  asm(""::"d"(u));
+;;  c=u%b;
+;; The rather funny output is:
+;; movl %ebx,%esi
+;; xorl %edi,%edi
+;; movl %esi,%eax
+;; movl %edi,%edx
+;; divl %ecx
+;; Hopefully this situation is rare, and the gains from better scheduling
+;; etc. will outweigh it.
+
+
 (define_insn "udivmodsi4"
   [(set (match_operand:SI 0 "register_operand" "=a")
-	(udiv:SI (match_operand:SI 1 "register_operand" "0")
-		 (match_operand:SI 2 "nonimmediate_operand" "rm")))
-   (set (match_operand:SI 3 "register_operand" "=&d")
-	(umod:SI (match_dup 1) (match_dup 2)))]
+	(truncate:SI (udiv:DI (match_operand:DI 1 "register_operand" "A")
+		 (zero_extend:DI (match_operand:SI 2 "nonimmediate_operand" "rm")))))
+   (set (match_operand:SI 3 "register_operand" "=d")
+	(truncate:SI (umod:DI (match_dup 1) (zero_extend:DI (match_dup 2)))))]
   ""
-  "*
-{
-  output_asm_insn (AS2 (xor%L3,%3,%3), operands);
-  return AS1 (div%L0,%2);
-}"
+  "div%L0 %2"
   [(set_attr "type" "idiv")])
-
+/*
+(define_insn "udivmodsi4"
+  [(set (subreg:SI (match_operand:DI 0 "register_operand" "=A") 0)
+	(truncate:SI (udiv:DI (match_operand:DI 1 "register_operand" "0")
+		     (zero_extend:DI (match_operand:SI 2 "nonimmediate_operand" "rm")))))
+   (set (subreg:SI (match_dup 0) 1)
+	(truncate:SI (umod:DI (match_dup 1) (zero_extend:DI (match_dup 2)))))]
+  ""
+  "div%L0 %2"
+  [(set_attr "type" "idiv")])
+*/
 ;; ??? Can we make gcc zero extend operand[0]?
 (define_insn "udivmodhi4"
   [(set (match_operand:HI 0 "register_operand" "=a")
@@ -3650,7 +3772,8 @@
   output_asm_insn (AS2 (xor%W0,%3,%3), operands);
   return AS1 (div%W0,%2);
 }"
-  [(set_attr "type" "idiv")])
+  [(set_attr "type" "idiv")
+   (set_attr "prefix" "true")])
 
 /*
 ;;this should be a valid double division which we may want to add
@@ -3838,7 +3961,8 @@
 
   return AS2 (and%L0,%2,%0);
 }"
-  [(set_attr "type" "binary")])
+  [(set_attr "type" "binary")
+   (set_attr "pipes" "uv")])
 
 (define_insn "andhi3"
   [(set (match_operand:HI 0 "nonimmediate_operand" "=rm,r")
@@ -3917,7 +4041,9 @@
 
   return AS2 (and%W0,%2,%0);
 }"
-  [(set_attr "type" "binary")])
+  [(set_attr "type" "binary")
+   (set_attr "prefix" "true")
+   (set_attr "pipes" "uv")])
 
 (define_insn "andqi3"
   [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,q")
@@ -3925,7 +4051,8 @@
 		(match_operand:QI 2 "general_operand" "qn,qmn")))]
   ""
   "* return AS2 (and%B0,%2,%0);"
-  [(set_attr "type" "binary")])
+  [(set_attr "type" "binary")
+   (set_attr "pipes" "uv")])
 
 /* I am nervous about these two.. add them later..
 ;I presume this means that we have something in say op0= eax which is small
@@ -4042,7 +4169,8 @@
 
   return AS2 (or%L0,%2,%0);
 }"
-  [(set_attr "type" "binary")])
+  [(set_attr "type" "binary")
+   (set_attr "pipes" "uv")])
 
 (define_insn "iorhi3"
   [(set (match_operand:HI 0 "nonimmediate_operand" "=rm,r")
@@ -4127,7 +4255,9 @@
 
   return AS2 (or%W0,%2,%0);
 }"
-  [(set_attr "type" "binary")])
+  [(set_attr "type" "binary")
+   (set_attr "prefix" "true")
+   (set_attr "pipes" "uv")])
 
 (define_insn "iorqi3"
   [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,q")
@@ -4135,7 +4265,8 @@
 		(match_operand:QI 2 "general_operand" "qn,qmn")))]
   ""
   "* return AS2 (or%B0,%2,%0);"
-  [(set_attr "type" "binary")])
+  [(set_attr "type" "binary")
+   (set_attr "pipes" "uv")])
 \f
 ;;- xor instructions
 
@@ -4171,7 +4302,8 @@
 byte_xor_operation:
 	    CC_STATUS_INIT;
 	      
-	    if (intval == 0xff)
+	    if (intval == 0xff
+		&& (optimize_size || ix86_cpu != PROCESSOR_PENTIUM))
 	      return AS1 (not%B0,%b0);
 
 	    if (intval != INTVAL (operands[2]))
@@ -4187,7 +4319,8 @@
 	  if (REG_P (operands[0]))
 	    {
 	      CC_STATUS_INIT;
-	      if (intval == 0xff)
+	      if (intval == 0xff
+		  && (optimize_size || ix86_cpu != PROCESSOR_PENTIUM))
 		return AS1 (not%B0,%h0);
 
 	      operands[2] = GEN_INT (intval);
@@ -4224,7 +4357,8 @@
 
   return AS2 (xor%L0,%2,%0);
 }"
-  [(set_attr "type" "binary")])
+  [(set_attr "type" "binary")
+   (set_attr "pipes" "uv")])
 
 (define_insn "xorhi3"
   [(set (match_operand:HI 0 "nonimmediate_operand" "=rm,r")
@@ -4244,7 +4378,8 @@
 	  if (INTVAL (operands[2]) & 0xffff0000)
 	    operands[2] = GEN_INT (INTVAL (operands[2]) & 0xffff);
 
-	  if (INTVAL (operands[2]) == 0xff)
+	  if (INTVAL (operands[2]) == 0xff
+	      && (optimize_size || ix86_cpu != PROCESSOR_PENTIUM))
 	    return AS1 (not%B0,%b0);
 
 	  return AS2 (xor%B0,%2,%b0);
@@ -4258,9 +4393,9 @@
 	  CC_STATUS_INIT;
 	  operands[2] = GEN_INT ((INTVAL (operands[2]) >> 8) & 0xff);
 
-	  if (INTVAL (operands[2]) == 0xff)
+	  if (INTVAL (operands[2]) == 0xff
+	      && (optimize_size || ix86_cpu != PROCESSOR_PENTIUM))
 	    return AS1 (not%B0,%h0);
-
 	  return AS2 (xor%B0,%2,%h0);
 	}
     }
@@ -4286,7 +4421,9 @@
 
   return AS2 (xor%W0,%2,%0);
 }"
-  [(set_attr "type" "binary")])
+  [(set_attr "type" "binary")
+   (set_attr "prefix" "true")
+   (set_attr "pipes" "uv")])
 
 (define_insn "xorqi3"
   [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,q")
@@ -4294,7 +4431,8 @@
 		(match_operand:QI 2 "general_operand" "qn,qm")))]
   ""
   "* return AS2 (xor%B0,%2,%0);"
-  [(set_attr "type" "binary")])
+  [(set_attr "type" "binary")
+   (set_attr "pipes" "uv")])
 \f
 ;; logical operations for DImode
 
@@ -4367,7 +4505,8 @@
   [(set (match_operand:HI 0 "nonimmediate_operand" "=rm")
 	(neg:HI (match_operand:HI 1 "nonimmediate_operand" "0")))]
   ""
-  "neg%W0 %0")
+  "neg%W0 %0"
+  [(set_attr "prefix" "true")])
 
 (define_insn "negqi2"
   [(set (match_operand:QI 0 "nonimmediate_operand" "=qm")
@@ -4379,31 +4518,36 @@
   [(set (match_operand:SF 0 "register_operand" "=f")
 	(neg:SF (match_operand:SF 1 "register_operand" "0")))]
   "TARGET_80387"
-  "fchs")
+  "fchs"
+  [(set_attr "pipes" "fx")])
 
 (define_insn "negdf2"
   [(set (match_operand:DF 0 "register_operand" "=f")
 	(neg:DF (match_operand:DF 1 "register_operand" "0")))]
   "TARGET_80387"
-  "fchs")
+  "fchs"
+  [(set_attr "pipes" "fx")])
 
 (define_insn ""
   [(set (match_operand:DF 0 "register_operand" "=f")
 	(neg:DF (float_extend:DF (match_operand:SF 1 "register_operand" "0"))))]
   "TARGET_80387"
-  "fchs")
+  "fchs"
+  [(set_attr "pipes" "fx")])
 
 (define_insn "negxf2"
   [(set (match_operand:XF 0 "register_operand" "=f")
 	(neg:XF (match_operand:XF 1 "register_operand" "0")))]
   "TARGET_80387"
-  "fchs")
+  "fchs"
+  [(set_attr "pipes" "fx")])
 
 (define_insn ""
   [(set (match_operand:XF 0 "register_operand" "=f")
 	(neg:XF (float_extend:XF (match_operand:DF 1 "register_operand" "0"))))]
   "TARGET_80387"
-  "fchs")
+  "fchs"
+  [(set_attr "pipes" "fx")])
 \f
 ;; Absolute value instructions
 
@@ -4412,35 +4556,40 @@
 	(abs:SF (match_operand:SF 1 "register_operand" "0")))]
   "TARGET_80387"
   "fabs"
-  [(set_attr "type" "fpop")])
+  [(set_attr "type" "fpop")
+   (set_attr "pipes" "fx")])
 
 (define_insn "absdf2"
   [(set (match_operand:DF 0 "register_operand" "=f")
 	(abs:DF (match_operand:DF 1 "register_operand" "0")))]
   "TARGET_80387"
   "fabs"
-  [(set_attr "type" "fpop")])
+  [(set_attr "type" "fpop")
+   (set_attr "pipes" "fx")])
 
 (define_insn ""
   [(set (match_operand:DF 0 "register_operand" "=f")
 	(abs:DF (float_extend:DF (match_operand:SF 1 "register_operand" "0"))))]
   "TARGET_80387"
   "fabs"
-  [(set_attr "type" "fpop")])
+  [(set_attr "type" "fpop")
+   (set_attr "pipes" "fx")])
 
 (define_insn "absxf2"
   [(set (match_operand:XF 0 "register_operand" "=f")
 	(abs:XF (match_operand:XF 1 "register_operand" "0")))]
   "TARGET_80387"
   "fabs"
-  [(set_attr "type" "fpop")])
+  [(set_attr "type" "fpop")
+   (set_attr "pipes" "fx")])
 
 (define_insn ""
   [(set (match_operand:XF 0 "register_operand" "=f")
 	(abs:XF (float_extend:XF (match_operand:DF 1 "register_operand" "0"))))]
   "TARGET_80387"
   "fabs"
-  [(set_attr "type" "fpop")])
+  [(set_attr "type" "fpop")
+   (set_attr "pipes" "fx")])
 
 (define_insn "sqrtsf2"
   [(set (match_operand:SF 0 "register_operand" "=f")
@@ -4536,22 +4685,53 @@
 ;;- one complement instructions
 
 (define_insn "one_cmplsi2"
-  [(set (match_operand:SI 0 "nonimmediate_operand" "=rm")
-	(not:SI (match_operand:SI 1 "nonimmediate_operand" "0")))]
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=r,o,m")
+	(not:SI (match_operand:SI 1 "nonimmediate_operand" "0,0,0")))]
   ""
-  "not%L0 %0")
+  "*
+  rtx xops[2];
+     if (ix86_cpu == PROCESSOR_PENTIUM && !optimize_size)
+     {
+       xops[0] = operands[0];
+       xops[1] = GEN_INT (0xffffffff);
+       output_asm_insn(AS2 (xor%L0,%1,%0),xops);
+       RET;
+     }
+    return AS1 (not%L0,%0);"
+  [(set_attr "pipes" "uv,uv,u")])
 
 (define_insn "one_cmplhi2"
-  [(set (match_operand:HI 0 "nonimmediate_operand" "=rm")
-	(not:HI (match_operand:HI 1 "nonimmediate_operand" "0")))]
+  [(set (match_operand:HI 0 "nonimmediate_operand" "=r,o,m")
+	(not:HI (match_operand:HI 1 "nonimmediate_operand" "0,0,0")))]
   ""
-  "not%W0 %0")
+  "*
+  rtx xops[2];
+     if (ix86_cpu == PROCESSOR_PENTIUM && !optimize_size)
+     {
+       xops[0] = operands[0];
+       xops[1] = GEN_INT (0xffff);
+       output_asm_insn(AS2 (xor%W0,%1,%0),xops);
+       RET;
+     }
+    return AS1 (not%W0,%0);"
+  [(set_attr "prefix" "true")
+   (set_attr "pipes" "uv,uv,u")])
 
 (define_insn "one_cmplqi2"
-  [(set (match_operand:QI 0 "nonimmediate_operand" "=qm")
-	(not:QI (match_operand:QI 1 "nonimmediate_operand" "0")))]
+  [(set (match_operand:QI 0 "nonimmediate_operand" "=q,o,m")
+	(not:QI (match_operand:QI 1 "nonimmediate_operand" "0,0,0")))]
   ""
-  "not%B0 %0")
+  "*
+  rtx xops[2];
+     if (ix86_cpu == PROCESSOR_PENTIUM && !optimize_size)
+     {
+       xops[0] = operands[0];
+       xops[1] = GEN_INT (0xff);
+       output_asm_insn(AS2 (xor%B0,%1,%0),xops);
+       RET;
+     }
+    return AS1 (not%B0,%0);"
+  [(set_attr "pipes" "uv,uv,u")])
 \f
 ;;- arithmetic shift instructions
 
@@ -4631,7 +4811,8 @@
       output_asm_insn (AS2 (sal%L2,%0,%2), xops);
     }
   RET;
-}")
+}"
+  [(set_attr "pipes" "u")])
 
 (define_insn "ashldi3_non_const_int"
   [(set (match_operand:DI 0 "register_operand" "=&r")
@@ -4667,9 +4848,9 @@
 ;; is smaller - use leal for now unless the shift count is 1.
 
 (define_insn "ashlsi3"
-  [(set (match_operand:SI 0 "nonimmediate_operand" "=r,rm")
-	(ashift:SI (match_operand:SI 1 "nonimmediate_operand" "r,0")
-		   (match_operand:SI 2 "nonmemory_operand" "M,cI")))]
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=r,rm,rm")
+	(ashift:SI (match_operand:SI 1 "nonimmediate_operand" "r,0,0")
+		   (match_operand:SI 2 "nonmemory_operand" "M,I,c")))]
   ""
   "*
 {
@@ -4702,12 +4883,13 @@
     return AS2 (add%L0,%0,%0);
 
   return AS2 (sal%L0,%2,%0);
-}")
+}"
+  [(set_attr "pipes" "u,u,none")])
 
 (define_insn "ashlhi3"
-  [(set (match_operand:HI 0 "nonimmediate_operand" "=rm")
-	(ashift:HI (match_operand:HI 1 "nonimmediate_operand" "0")
-		   (match_operand:HI 2 "nonmemory_operand" "cI")))]
+  [(set (match_operand:HI 0 "nonimmediate_operand" "=rm,rm")
+	(ashift:HI (match_operand:HI 1 "nonimmediate_operand" "0,0")
+		   (match_operand:HI 2 "nonmemory_operand" "I,c")))]
   ""
   "*
 {
@@ -4718,12 +4900,14 @@
     return AS2 (add%W0,%0,%0);
 
   return AS2 (sal%W0,%2,%0);
-}")
+}"
+  [(set_attr "pipes" "u,none")
+   (set_attr "prefix" "true")])
 
 (define_insn "ashlqi3"
-  [(set (match_operand:QI 0 "nonimmediate_operand" "=qm")
-	(ashift:QI (match_operand:QI 1 "nonimmediate_operand" "0")
-		   (match_operand:QI 2 "nonmemory_operand" "cI")))]
+  [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,qm")
+	(ashift:QI (match_operand:QI 1 "nonimmediate_operand" "0,0")
+		   (match_operand:QI 2 "nonmemory_operand" "I,c")))]
   ""
   "*
 {
@@ -4734,7 +4918,8 @@
     return AS2 (add%B0,%0,%0);
 
   return AS2 (sal%B0,%2,%0);
-}")
+}"
+  [(set_attr "pipes" "u,none")])
 
 ;; See comment above `ashldi3' about how this works.
 
@@ -4781,7 +4966,8 @@
     output_asm_insn (AS2 (xor%L2,%2,%2), xops);
 
   RET;
-}")
+}"
+  [(set_attr "pipes" "uv")])
 
 (define_insn "ashrdi3_const_int"
   [(set (match_operand:DI 0 "register_operand" "=&r")
@@ -4852,9 +5038,9 @@
 }")
 
 (define_insn "ashrsi3"
-  [(set (match_operand:SI 0 "nonimmediate_operand" "=rm")
-	(ashiftrt:SI (match_operand:SI 1 "nonimmediate_operand" "0")
-		     (match_operand:SI 2 "nonmemory_operand" "cI")))]
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=rm,rm")
+	(ashiftrt:SI (match_operand:SI 1 "nonimmediate_operand" "0,0")
+		     (match_operand:SI 2 "nonmemory_operand" "I,c")))]
   ""
   "*
 {
@@ -4862,12 +5048,13 @@
     return AS2 (sar%L0,%b2,%0);
   else
     return AS2 (sar%L0,%2,%0);
-}")
+}"
+  [(set_attr "pipes" "u,none")])
 
 (define_insn "ashrhi3"
-  [(set (match_operand:HI 0 "nonimmediate_operand" "=rm")
-	(ashiftrt:HI (match_operand:HI 1 "nonimmediate_operand" "0")
-		     (match_operand:HI 2 "nonmemory_operand" "cI")))]
+  [(set (match_operand:HI 0 "nonimmediate_operand" "=rm,rm")
+	(ashiftrt:HI (match_operand:HI 1 "nonimmediate_operand" "0,0")
+		     (match_operand:HI 2 "nonmemory_operand" "I,c")))]
   ""
   "*
 {
@@ -4875,12 +5062,14 @@
     return AS2 (sar%W0,%b2,%0);
   else
     return AS2 (sar%W0,%2,%0);
-}")
+}"
+  [(set_attr "pipes" "u,none")
+   (set_attr "prefix" "true")])
 
 (define_insn "ashrqi3"
-  [(set (match_operand:QI 0 "nonimmediate_operand" "=qm")
-	(ashiftrt:QI (match_operand:QI 1 "nonimmediate_operand" "0")
-		     (match_operand:QI 2 "nonmemory_operand" "cI")))]
+  [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,qm")
+	(ashiftrt:QI (match_operand:QI 1 "nonimmediate_operand" "0,0")
+		     (match_operand:QI 2 "nonmemory_operand" "I,c")))]
   ""
   "*
 {
@@ -4888,7 +5077,8 @@
     return AS2 (sar%B0,%b2,%0);
   else
     return AS2 (sar%B0,%2,%0);
-}")
+}"
+  [(set_attr "pipes" "u,none")])
 \f
 ;;- logical shift instructions
 
@@ -5006,9 +5196,9 @@
 }")
 
 (define_insn "lshrsi3"
-  [(set (match_operand:SI 0 "nonimmediate_operand" "=rm")
-	(lshiftrt:SI (match_operand:SI 1 "nonimmediate_operand" "0")
-		     (match_operand:SI 2 "nonmemory_operand" "cI")))]
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=rm,rm")
+	(lshiftrt:SI (match_operand:SI 1 "nonimmediate_operand" "0,0")
+		     (match_operand:SI 2 "nonmemory_operand" "I,c")))]
   ""
   "*
 {
@@ -5016,12 +5206,13 @@
     return AS2 (shr%L0,%b2,%0);
   else
     return AS2 (shr%L0,%2,%1);
-}")
+}"
+  [(set_attr "pipes" "u,none")])
 
 (define_insn "lshrhi3"
-  [(set (match_operand:HI 0 "nonimmediate_operand" "=rm")
-	(lshiftrt:HI (match_operand:HI 1 "nonimmediate_operand" "0")
-		     (match_operand:HI 2 "nonmemory_operand" "cI")))]
+  [(set (match_operand:HI 0 "nonimmediate_operand" "=rm,rm")
+	(lshiftrt:HI (match_operand:HI 1 "nonimmediate_operand" "0,0")
+		     (match_operand:HI 2 "nonmemory_operand" "I,c")))]
   ""
   "*
 {
@@ -5029,12 +5220,14 @@
     return AS2 (shr%W0,%b2,%0);
   else
     return AS2 (shr%W0,%2,%0);
-}")
+}"
+  [(set_attr "prefix" "true")
+   (set_attr "pipes" "u,none")])
 
 (define_insn "lshrqi3"
-  [(set (match_operand:QI 0 "nonimmediate_operand" "=qm")
-	(lshiftrt:QI (match_operand:QI 1 "nonimmediate_operand" "0")
-		     (match_operand:QI 2 "nonmemory_operand" "cI")))]
+  [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,qm")
+	(lshiftrt:QI (match_operand:QI 1 "nonimmediate_operand" "0,0")
+		     (match_operand:QI 2 "nonmemory_operand" "I,c")))]
   ""
   "*
 {
@@ -5042,7 +5235,8 @@
     return AS2 (shr%B0,%b2,%0);
   else
     return AS2 (shr%B0,%2,%0);
-}")
+}"
+  [(set_attr "pipes" "u,none")])
 \f
 ;;- rotate instructions
 
@@ -5070,7 +5264,8 @@
     return AS2 (rol%W0,%b2,%0);
   else
     return AS2 (rol%W0,%2,%0);
-}")
+}"
+ [(set_attr "prefix" "true")])
 
 (define_insn "rotlqi3"
   [(set (match_operand:QI 0 "nonimmediate_operand" "=qm")
@@ -5109,7 +5304,8 @@
     return AS2 (ror%W0,%b2,%0);
   else
     return AS2 (ror%W0,%2,%0);
-}")
+}"
+ [(set_attr "prefix" "true")])
 
 (define_insn "rotrqi3"
   [(set (match_operand:QI 0 "nonimmediate_operand" "=qm")
@@ -5210,7 +5406,8 @@
     return AS2 (bts%L0,%2,%0);
   else
     return AS2 (btr%L0,%2,%0);
-}")
+}"
+ [(set_attr "prefix" "true")])
 
 ;; Bit complement.  See comments on previous pattern.
 ;; ??? Is this really worthwhile?
@@ -5225,7 +5422,8 @@
   CC_STATUS_INIT;
 
   return AS2 (btc%L0,%1,%0);
-}")
+}"
+ [(set_attr "prefix" "true")])
 
 (define_insn ""
   [(set (match_operand:SI 0 "nonimmediate_operand" "=rm")
@@ -5238,7 +5436,8 @@
   CC_STATUS_INIT;
 
   return AS2 (btc%L0,%2,%0);
-}")
+}"
+ [(set_attr "prefix" "true")])
 \f
 ;; Recognizers for bit-test instructions.
 
@@ -5259,12 +5458,13 @@
 {
   cc_status.flags |= CC_Z_IN_NOT_C;
   return AS2 (bt%L0,%1,%0);
-}")
+}"
+ [(set_attr "prefix" "true")])
 
 (define_insn ""
-  [(set (cc0) (zero_extract (match_operand:SI 0 "register_operand" "r")
-			    (match_operand:SI 1 "const_int_operand" "n")
-			    (match_operand:SI 2 "const_int_operand" "n")))]
+  [(set (cc0) (zero_extract (match_operand:SI 0 "register_operand" "a,r")
+			    (match_operand:SI 1 "const_int_operand" "n,n")
+			    (match_operand:SI 2 "const_int_operand" "n,n")))]
   ""
   "*
 {
@@ -5292,6 +5492,7 @@
   return AS2 (test%L0,%1,%0);
 }")
 
+
 ;; ??? All bets are off if operand 0 is a volatile MEM reference.
 ;; The CPU may access unspecified bytes around the actual target byte.
 
@@ -5350,7 +5551,8 @@
     return AS2 (test%L0,%1,%0);
 
   return AS2 (test%L1,%0,%1);
-}")
+}"
+ [(set_attr "pipes" "none")])
 \f
 ;; Store-flag instructions.
 
@@ -5671,7 +5873,8 @@
     return (char *)0;
 
   return AS1(j%D0,%l1);
-}")
+}"
+ [(set_attr "pipes" "v")])
 
 (define_insn ""
   [(set (pc)
@@ -5725,7 +5928,8 @@
     return (char *)0;
 
   return AS1(j%d0,%l1);
-}")
+}"
+ [(set_attr "pipes" "v")])
 \f
 ;; Unconditional and other jump instructions
 
@@ -5733,7 +5937,8 @@
   [(set (pc)
 	(label_ref (match_operand 0 "" "")))]
   ""
-  "jmp %l0")
+  "jmp %l0"
+ [(set_attr "pipes" "v")])
 
 (define_insn "indirect_jump"
   [(set (pc) (match_operand:SI 0 "nonimmediate_operand" "rm"))]
@@ -5743,7 +5948,8 @@
   CC_STATUS_INIT;
 
   return AS1 (jmp,%*%0);
-}")
+}"
+ [(set_attr "pipes" "v")])
 
 ;; ??? could transform while(--i > 0) S; to if (--i > 0) do S; while(--i);
 ;;     if S does not change i
@@ -6030,7 +6236,7 @@
 ;; call* patterns.  Each named pattern is followed by an unnamed pattern
 ;; that matches any call to a symbolic CONST (ie, a symbol_ref).  The
 ;; unnamed patterns are only used while generating PIC code, because
-;; otherwise the named patterns match.
+;; otherwise the named patterns match.
 
 ;; Call subroutine returning no value.
 
@@ -6075,7 +6281,8 @@
     }
   else
     return AS1 (call,%P0);
-}")
+}"
+ [(set_attr "pipes" "v")])
 
 (define_insn ""
   [(call (mem:QI (match_operand:SI 0 "symbolic_operand" ""))
@@ -6123,14 +6330,16 @@
     }
   else
     return AS1 (call,%P0);
-}")
+}"
+ [(set_attr "pipes" "v")])
 
 (define_insn ""
   [(call (mem:QI (match_operand:SI 0 "symbolic_operand" ""))
 	 (match_operand:SI 1 "general_operand" "g"))]
   ;; Operand 1 not used on the i386.
   "!HALF_PIC_P ()"
-  "call %P0")
+  "call %P0"
+ [(set_attr "pipes" "v")])
 
 ;; Call subroutine, returning value in operand 0
 ;; (which must be a hard register).
@@ -6180,7 +6389,8 @@
     output_asm_insn (AS1 (call,%P1), operands);
 
   RET;
-}")
+}"
+ [(set_attr "pipes" "v")])
 
 (define_insn ""
   [(set (match_operand 0 "" "=rf")
@@ -6233,7 +6443,8 @@
     output_asm_insn (AS1 (call,%P1), operands);
 
   RET;
-}")
+}"
+ [(set_attr "pipes" "v")])
 
 (define_insn ""
   [(set (match_operand 0 "" "=rf")
@@ -6241,7 +6452,8 @@
 	      (match_operand:SI 2 "general_operand" "g")))]
   ;; Operand 2 not used on the i386.
   "!HALF_PIC_P ()"
-  "call %P1")
+  "call %P1"
+ [(set_attr "pipes" "v")])
 
 ;; Call subroutine returning any type.
 
@@ -6338,7 +6550,8 @@
   xops[1] = stack_pointer_rtx;
   output_asm_insn (AS2 (sub%L1,%0,%1), xops);
   RET;
-}")
+}"
+ [(set_attr "pipes" "uv")])
 
 (define_insn "prologue_set_got"
   [(set (match_operand:SI 0 "" "")
@@ -6362,7 +6575,8 @@
       output_asm_insn (buffer, operands);
     }    
   RET;
-}")
+}"
+ [(set_attr "pipes" "uv")])
 
 (define_insn "prologue_get_pc"
   [(set (match_operand:SI 0 "" "")
@@ -6378,7 +6592,8 @@
       ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, \"L\", CODE_LABEL_NUMBER (operands[1]));
     }    
   RET;
-}")
+}"
+ [(set_attr "pipes" "uv")])
 
 (define_insn "prologue_get_pc_and_set_got"
   [(unspec_volatile [(match_operand:SI 0 "" "")] 3)]
@@ -6756,7 +6971,8 @@
 			 (match_operand:DF 2 "nonimmediate_operand" "fm,0")]))]
   "TARGET_80387"
   "* return output_387_binary_op (insn, operands);"
-  [(set (attr "type") 
+  [(set_attr "pipes" "fx")
+   (set (attr "type") 
         (cond [(match_operand:DF 3 "is_mul" "") 
                  (const_string "fpmul")
                (match_operand:DF 3 "is_div" "") 
@@ -6773,7 +6989,8 @@
 	    (match_operand:DF 2 "register_operand" "0")]))]
   "TARGET_80387"
   "* return output_387_binary_op (insn, operands);"
-  [(set (attr "type") 
+  [(set_attr "pipes" "fx")
+   (set (attr "type") 
         (cond [(match_operand:DF 3 "is_mul" "") 
                  (const_string "fpmul")
                (match_operand:DF 3 "is_div" "") 
@@ -6790,7 +7007,8 @@
 			 (match_operand:XF 2 "register_operand" "f,0")]))]
   "TARGET_80387"
   "* return output_387_binary_op (insn, operands);"
-  [(set (attr "type") 
+  [(set_attr "pipes" "fx")
+   (set (attr "type") 
         (cond [(match_operand:DF 3 "is_mul" "") 
                  (const_string "fpmul")
                (match_operand:DF 3 "is_div" "") 
@@ -6807,7 +7025,8 @@
 	    (match_operand:XF 2 "register_operand" "0")]))]
   "TARGET_80387"
   "* return output_387_binary_op (insn, operands);"
-  [(set (attr "type") 
+  [(set_attr "pipes" "fx")
+   (set (attr "type") 
         (cond [(match_operand:DF 3 "is_mul" "") 
                  (const_string "fpmul")
                (match_operand:DF 3 "is_div" "") 
@@ -6824,7 +7043,8 @@
 	    (match_operand:XF 2 "register_operand" "0,f")]))]
   "TARGET_80387"
   "* return output_387_binary_op (insn, operands);"
-  [(set (attr "type") 
+  [(set_attr "pipes" "fx")
+   (set (attr "type") 
         (cond [(match_operand:DF 3 "is_mul" "") 
                  (const_string "fpmul")
                (match_operand:DF 3 "is_div" "") 
@@ -6841,7 +7061,8 @@
 	   (float:XF (match_operand:SI 2 "nonimmediate_operand" "rm"))]))]
   "TARGET_80387"
   "* return output_387_binary_op (insn, operands);"
-  [(set (attr "type") 
+  [(set_attr "pipes" "fx")
+   (set (attr "type") 
         (cond [(match_operand:DF 3 "is_mul" "") 
                  (const_string "fpmul")
                (match_operand:DF 3 "is_div" "") 
@@ -6859,7 +7080,8 @@
 	    (match_operand:SF 2 "nonimmediate_operand" "fm,0"))]))]
   "TARGET_80387"
   "* return output_387_binary_op (insn, operands);"
-  [(set (attr "type") 
+  [(set_attr "pipes" "fx")
+   (set (attr "type") 
         (cond [(match_operand:DF 3 "is_mul" "") 
                  (const_string "fpmul")
                (match_operand:DF 3 "is_div" "") 
@@ -6876,7 +7098,8 @@
 	    (match_operand:DF 2 "register_operand" "0,f")]))]
   "TARGET_80387"
   "* return output_387_binary_op (insn, operands);"
-  [(set (attr "type") 
+  [(set_attr "pipes" "fx")
+   (set (attr "type") 
         (cond [(match_operand:DF 3 "is_mul" "") 
                  (const_string "fpmul")
                (match_operand:DF 3 "is_div" "") 
@@ -6893,7 +7116,8 @@
 	   (float:DF (match_operand:SI 2 "nonimmediate_operand" "rm"))]))]
   "TARGET_80387"
   "* return output_387_binary_op (insn, operands);"
-  [(set (attr "type") 
+  [(set_attr "pipes" "fx")
+   (set (attr "type") 
         (cond [(match_operand:DF 3 "is_mul" "") 
                  (const_string "fpmul")
                (match_operand:DF 3 "is_div" "") 
@@ -6911,7 +7135,8 @@
 	    (match_operand:SF 2 "nonimmediate_operand" "fm,0"))]))]
   "TARGET_80387"
   "* return output_387_binary_op (insn, operands);"
-  [(set (attr "type") 
+  [(set_attr "pipes" "fx")
+   (set (attr "type") 
         (cond [(match_operand:DF 3 "is_mul" "") 
                  (const_string "fpmul")
                (match_operand:DF 3 "is_div" "") 
@@ -6928,7 +7153,8 @@
 			 (match_operand:SF 2 "nonimmediate_operand" "fm,0")]))]
   "TARGET_80387"
   "* return output_387_binary_op (insn, operands);"
-  [(set (attr "type") 
+  [(set_attr "pipes" "fx")
+   (set (attr "type") 
         (cond [(match_operand:DF 3 "is_mul" "") 
                  (const_string "fpmul")
                (match_operand:DF 3 "is_div" "") 
@@ -6945,7 +7171,8 @@
 	   (match_operand:SF 2 "register_operand" "0")]))]
   "TARGET_80387"
   "* return output_387_binary_op (insn, operands);"
-  [(set (attr "type") 
+  [(set_attr "pipes" "fx")
+   (set (attr "type") 
         (cond [(match_operand:DF 3 "is_mul" "") 
                  (const_string "fpmul")
                (match_operand:DF 3 "is_div" "") 
@@ -6962,7 +7189,8 @@
 	   (float:SF (match_operand:SI 2 "nonimmediate_operand" "rm"))]))]
   "TARGET_80387"
   "* return output_387_binary_op (insn, operands);"
-  [(set (attr "type") 
+  [(set_attr "pipes" "fx")
+   (set (attr "type") 
         (cond [(match_operand:DF 3 "is_mul" "") 
                  (const_string "fpmul")
                (match_operand:DF 3 "is_div" "") 
@@ -7651,3 +7879,6 @@
   load_pic_register (1);
   DONE;
 }")
+
+
+

end of thread, other threads:[~1998-09-22 14:12 UTC | newest]

Thread overview: 3+ messages
1998-09-21 23:45 Various changes to i386.md Jan Hubicka
  -- strict thread matches above, loose matches on Subject: below --
1998-09-21 20:09 Jan Hubicka
1998-09-22 14:12 ` Joern Rennecke
