public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [COMMITTED] Fix debug/60438 -- i686 stack vs fp operations
@ 2014-03-13 20:52 Richard Henderson
  2014-04-26  9:38 ` Tom de Vries
  0 siblings, 1 reply; 4+ messages in thread
From: Richard Henderson @ 2014-03-13 20:52 UTC (permalink / raw)
  To: GCC Patches; +Cc: Uros Bizjak, Jakub Jelinek

[-- Attachment #1: Type: text/plain, Size: 1953 bytes --]

The original ICE is caused by the dwarf2cfi pass not noticing a stack
adjustment in the insn stream.  The reason for the miss is that the push/pop
was added by a post-reload splitter, which did nothing to mark the insn for
special treatment.

In the PR, Jakub and I threw around several ideas.

My first attempt, to annotate the stack adjustment so that dwarf2cfi could see
it, showed a lack of supporting infrastructure within the csa and dwarf2cfi
passes.  In order to make that path work in the short term, csa had to be crippled.

My second attempt removes ix86_force_to_memory and all uses thereof.  Thus
there are no longer any troublesome post-reload stack manipulations and
dwarf2cfi doesn't get confused.

In the case of int->float conversions, we can figure out at rtl-expand time
that we might need a bit o stack memory, and we can allocate it via
assign_386_stack_local.  This produces minimal churn in this area, though we
still get to remove some patterns that didn't have the scratch memory.

In the other cases the patterns are created by combine, and there we have no
chance to use assign_386_stack_local.  Here, I simply remove the register
alternatives, leaving the register allocator no choice but to force the value
into memory.  If I recall correctly, the old reload would barf on this (thus
the byzantine structure of the existing patterns).  But Vlad assured me that
LRA will handle this just fine.

Considering that we mostly will never choose the combined
int-convert-and-operate patterns (-Os or ancient cpu tunings only), I think
this is the best of the available options.

For stage1, it would be interesting to investigate whether we can eliminate the
assign_386_stack_local fiddly bits from int->float conversions, and simply rely
on LRA there as well.  It would certainly reduce some code duplication.
Indeed, with clever use of "enabled" perhaps we can reduce to a single pattern.

Tested on x86_64 and i686-linux.


r~

[-- Attachment #2: z --]
[-- Type: text/plain, Size: 25888 bytes --]

        PR debug/60438
        * config/i386/i386.c (ix86_split_fp_branch): Remove pushed argument.
        (ix86_force_to_memory, ix86_free_from_memory): Remove.
        * config/i386/i386-protos.h: Likewise.
        * config/i386/i386.md (floathi<X87MODEF>2): Use assign_386_stack_local
        in the expander instead of a splitter.
        (float<SWI48x><X87MODEF>2): Use assign_386_stack_local if there is
        any possibility of requiring a memory.
        (*floatsi<MODEF>2_vector_mixed): Remove, and the splitters.
        (*floatsi<MODEF>2_vector_sse): Remove, and the splitters.
        (fp branch splitters): Update for ix86_split_fp_branch.
        (*jcc<X87MODEF>_<SWI24>_i387): Remove r/f alternative.
        (*jcc<X87MODEF>_<SWI24>_r_i387): Likewise.
        (splitter for jcc<X87MODEF>_<SWI24>_i387 r/f): Remove.
        (*fop_<MODEF>_2_i387): Remove f/r alternative.
        (*fop_<MODEF>_3_i387): Likewise.
        (*fop_xf_2_i387, *fop_xf_3_i387): Likewise.
        (splitters for the fop_* register patterns): Remove.
        (fscalexf4_i387): Rename from *fscalexf4_i387.
        (ldexpxf3): Use gen_floatsixf2 and gen_fscalexf4_i387.



diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index 3493904..6e32978 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -154,13 +154,11 @@ extern enum machine_mode ix86_fp_compare_mode (enum rtx_code);
 extern rtx ix86_libcall_value (enum machine_mode);
 extern bool ix86_function_arg_regno_p (int);
 extern void ix86_asm_output_function_label (FILE *, const char *, tree);
-extern rtx ix86_force_to_memory (enum machine_mode, rtx);
-extern void ix86_free_from_memory (enum machine_mode);
 extern void ix86_call_abi_override (const_tree);
 extern int ix86_reg_parm_stack_space (const_tree);
 
 extern void ix86_split_fp_branch (enum rtx_code code, rtx, rtx,
-				  rtx, rtx, rtx, rtx);
+				  rtx, rtx, rtx);
 extern bool ix86_hard_regno_mode_ok (int, enum machine_mode);
 extern bool ix86_modes_tieable_p (enum machine_mode, enum machine_mode);
 extern bool ix86_secondary_memory_needed (enum reg_class, enum reg_class,
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 9e33d53..64b8e0a 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -19993,7 +19993,7 @@ ix86_expand_branch (enum rtx_code code, rtx op0, rtx op1, rtx label)
 /* Split branch based on floating point condition.  */
 void
 ix86_split_fp_branch (enum rtx_code code, rtx op1, rtx op2,
-		      rtx target1, rtx target2, rtx tmp, rtx pushed)
+		      rtx target1, rtx target2, rtx tmp)
 {
   rtx condition;
   rtx i;
@@ -20009,10 +20009,6 @@ ix86_split_fp_branch (enum rtx_code code, rtx op1, rtx op2,
   condition = ix86_expand_fp_compare (code, op1, op2,
 				      tmp);
 
-  /* Remove pushed operand from stack.  */
-  if (pushed)
-    ix86_free_from_memory (GET_MODE (pushed));
-
   i = emit_jump_insn (gen_rtx_SET
 		      (VOIDmode, pc_rtx,
 		       gen_rtx_IF_THEN_ELSE (VOIDmode,
@@ -36994,105 +36990,6 @@ avx_vperm2f128_parallel (rtx par, enum machine_mode mode)
   return mask + 1;
 }
 \f
-/* Store OPERAND to the memory after reload is completed.  This means
-   that we can't easily use assign_stack_local.  */
-rtx
-ix86_force_to_memory (enum machine_mode mode, rtx operand)
-{
-  rtx result;
-
-  gcc_assert (reload_completed);
-  if (ix86_using_red_zone ())
-    {
-      result = gen_rtx_MEM (mode,
-			    gen_rtx_PLUS (Pmode,
-					  stack_pointer_rtx,
-					  GEN_INT (-RED_ZONE_SIZE)));
-      emit_move_insn (result, operand);
-    }
-  else if (TARGET_64BIT)
-    {
-      switch (mode)
-	{
-	case HImode:
-	case SImode:
-	  operand = gen_lowpart (DImode, operand);
-	  /* FALLTHRU */
-	case DImode:
-	  emit_insn (
-		      gen_rtx_SET (VOIDmode,
-				   gen_rtx_MEM (DImode,
-						gen_rtx_PRE_DEC (DImode,
-							stack_pointer_rtx)),
-				   operand));
-	  break;
-	default:
-	  gcc_unreachable ();
-	}
-      result = gen_rtx_MEM (mode, stack_pointer_rtx);
-    }
-  else
-    {
-      switch (mode)
-	{
-	case DImode:
-	  {
-	    rtx operands[2];
-	    split_double_mode (mode, &operand, 1, operands, operands + 1);
-	    emit_insn (
-			gen_rtx_SET (VOIDmode,
-				     gen_rtx_MEM (SImode,
-						  gen_rtx_PRE_DEC (Pmode,
-							stack_pointer_rtx)),
-				     operands[1]));
-	    emit_insn (
-			gen_rtx_SET (VOIDmode,
-				     gen_rtx_MEM (SImode,
-						  gen_rtx_PRE_DEC (Pmode,
-							stack_pointer_rtx)),
-				     operands[0]));
-	  }
-	  break;
-	case HImode:
-	  /* Store HImodes as SImodes.  */
-	  operand = gen_lowpart (SImode, operand);
-	  /* FALLTHRU */
-	case SImode:
-	  emit_insn (
-		      gen_rtx_SET (VOIDmode,
-				   gen_rtx_MEM (GET_MODE (operand),
-						gen_rtx_PRE_DEC (SImode,
-							stack_pointer_rtx)),
-				   operand));
-	  break;
-	default:
-	  gcc_unreachable ();
-	}
-      result = gen_rtx_MEM (mode, stack_pointer_rtx);
-    }
-  return result;
-}
-
-/* Free operand from the memory.  */
-void
-ix86_free_from_memory (enum machine_mode mode)
-{
-  if (!ix86_using_red_zone ())
-    {
-      int size;
-
-      if (mode == DImode || TARGET_64BIT)
-	size = 8;
-      else
-	size = 4;
-      /* Use LEA to deallocate stack space.  In peephole2 it will be converted
-         to pop or add instruction if registers are available.  */
-      emit_insn (gen_rtx_SET (VOIDmode, stack_pointer_rtx,
-			      gen_rtx_PLUS (Pmode, stack_pointer_rtx,
-					    GEN_INT (size))));
-    }
-}
-
 /* Return a register priority for hard reg REGNO.  */
 static int
 ix86_register_priority (int hard_regno)
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index ea1d85f..03939fd 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -4656,26 +4656,16 @@
 ;; wants to be able to do this between registers.
 
 (define_expand "floathi<mode>2"
-  [(set (match_operand:X87MODEF 0 "register_operand")
-	(float:X87MODEF (match_operand:HI 1 "nonimmediate_operand")))]
-  "TARGET_80387
-   && (!(SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH)
-       || TARGET_MIX_SSE_I387)")
-
-;; Pre-reload splitter to add memory clobber to the pattern.
-(define_insn_and_split "*floathi<mode>2_1"
-  [(set (match_operand:X87MODEF 0 "register_operand")
-	(float:X87MODEF (match_operand:HI 1 "register_operand")))]
+  [(parallel [(set (match_operand:X87MODEF 0 "register_operand")
+		   (float:X87MODEF
+		     (match_operand:HI 1 "nonimmediate_operand")))
+              (clobber (match_dup 2))])]
   "TARGET_80387
    && (!(SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH)
-       || TARGET_MIX_SSE_I387)
-   && can_create_pseudo_p ()"
-  "#"
-  "&& 1"
-  [(parallel [(set (match_dup 0)
-	      (float:X87MODEF (match_dup 1)))
-   (clobber (match_dup 2))])]
-  "operands[2] = assign_386_stack_local (HImode, SLOT_TEMP);")
+       || TARGET_MIX_SSE_I387)"
+{
+  operands[2] = assign_386_stack_local (HImode, SLOT_TEMP);
+})
 
 (define_insn "*floathi<mode>2_i387_with_temp"
   [(set (match_operand:X87MODEF 0 "register_operand" "=f,f")
@@ -4723,14 +4713,17 @@
   [(set (match_dup 0) (float:X87MODEF (match_dup 1)))])
 
 (define_expand "float<SWI48x:mode><X87MODEF:mode>2"
-  [(set (match_operand:X87MODEF 0 "register_operand")
-	(float:X87MODEF
-	  (match_operand:SWI48x 1 "nonimmediate_operand")))]
+  [(parallel [(set (match_operand:X87MODEF 0 "register_operand")
+		   (float:X87MODEF
+		     (match_operand:SWI48x 1 "nonimmediate_operand")))
+              (clobber (match_dup 2))])]
   "TARGET_80387
    || ((<SWI48x:MODE>mode != DImode || TARGET_64BIT)
        && SSE_FLOAT_MODE_P (<X87MODEF:MODE>mode) && TARGET_SSE_MATH)"
 {
-  if (!((<SWI48x:MODE>mode != DImode || TARGET_64BIT)
+  bool native_int = TARGET_64BIT || <SWI48x:MODE>mode != DImode;
+
+  if (!(native_int
 	&& SSE_FLOAT_MODE_P (<X87MODEF:MODE>mode) && TARGET_SSE_MATH)
       && !X87_ENABLE_FLOAT (<X87MODEF:MODE>mode, <SWI48x:MODE>mode))
     {
@@ -4749,44 +4742,34 @@
       emit_insn (insn (operands[0], reg));
       DONE;
     }
-})
-
-;; Pre-reload splitter to add memory clobber to the pattern.
-(define_insn_and_split "*float<SWI48x:mode><X87MODEF:mode>2_1"
-  [(set (match_operand:X87MODEF 0 "register_operand")
-	(float:X87MODEF (match_operand:SWI48x 1 "register_operand")))]
-  "((TARGET_80387
-     && X87_ENABLE_FLOAT (<X87MODEF:MODE>mode, <SWI48x:MODE>mode)
-     && (!((<SWI48x:MODE>mode != DImode || TARGET_64BIT)
-	   && SSE_FLOAT_MODE_P (<X87MODEF:MODE>mode) && TARGET_SSE_MATH)
-	 || TARGET_MIX_SSE_I387))
-    || ((<SWI48x:MODE>mode != DImode || TARGET_64BIT)
-	&& SSE_FLOAT_MODE_P (<X87MODEF:MODE>mode) && TARGET_SSE_MATH
-	&& ((<SWI48x:MODE>mode == SImode
-	     && TARGET_SSE2 && TARGET_USE_VECTOR_CONVERTS
-	     && optimize_function_for_speed_p (cfun)
-	     && flag_trapping_math)
-	    || !(TARGET_INTER_UNIT_CONVERSIONS
-	         || optimize_function_for_size_p (cfun)))))
-   && can_create_pseudo_p ()"
-  "#"
-  "&& 1"
-  [(parallel [(set (match_dup 0) (float:X87MODEF (match_dup 1)))
-	      (clobber (match_dup 2))])]
-{
-  operands[2] = assign_386_stack_local (<SWI48x:MODE>mode, SLOT_TEMP);
 
   /* Avoid store forwarding (partial memory) stall penalty
      by passing DImode value through XMM registers.  */
-  if (<SWI48x:MODE>mode == DImode && !TARGET_64BIT
+  if (!native_int
       && TARGET_80387 && TARGET_SSE2 && TARGET_INTER_UNIT_MOVES_TO_VEC
       && optimize_function_for_speed_p (cfun))
     {
+      operands[2] = assign_386_stack_local (<SWI48x:MODE>mode, SLOT_TEMP);
       emit_insn (gen_floatdi<X87MODEF:mode>2_i387_with_xmm (operands[0],
 							    operands[1],
 							    operands[2]));
       DONE;
     }
+
+  /* Notice when we'd convert directly from general registers.  */
+  if (native_int
+      && (TARGET_MIX_SSE_I387 || TARGET_SSE_MATH)
+      && SSE_FLOAT_MODE_P (<X87MODEF:MODE>mode)
+      && (TARGET_INTER_UNIT_CONVERSIONS
+          || optimize_function_for_size_p (cfun)))
+    {
+      emit_insn (gen_rtx_SET
+                 (VOIDmode, operands[0],
+                  gen_rtx_FLOAT (<X87MODEF:MODE>mode, operands[1])));
+      DONE;
+    }
+
+  operands[2] = assign_386_stack_local (<SWI48x:MODE>mode, SLOT_TEMP);
 })
 
 (define_insn "*floatsi<mode>2_vector_mixed_with_temp"
@@ -4805,22 +4788,6 @@
    (set_attr "bdver1_decode" "*,*,double,direct,double")
    (set_attr "fp_int_src" "true")])
 
-(define_insn "*floatsi<mode>2_vector_mixed"
-  [(set (match_operand:MODEF 0 "register_operand" "=f,x")
-	(float:MODEF (match_operand:SI 1 "memory_operand" "m,m")))]
-  "TARGET_SSE2 && TARGET_MIX_SSE_I387
-   && TARGET_USE_VECTOR_CONVERTS && optimize_function_for_speed_p (cfun)"
-  "@
-   fild%Z1\t%1
-   #"
-  [(set_attr "type" "fmov,sseicvt")
-   (set_attr "mode" "<MODE>,<ssevecmode>")
-   (set_attr "unit" "i387,*")
-   (set_attr "athlon_decode" "*,direct")
-   (set_attr "amdfam10_decode" "*,double")
-   (set_attr "bdver1_decode" "*,direct")
-   (set_attr "fp_int_src" "true")])
-
 (define_insn "*float<SWI48:mode><MODEF:mode>2_mixed_with_temp"
   [(set (match_operand:MODEF 0 "register_operand" "=f,f,x,x")
 	(float:MODEF
@@ -4841,15 +4808,6 @@
 	(float:MODEF (match_operand:SWI48 1 "register_operand")))
    (clobber (match_operand:SWI48 2 "memory_operand"))]
   "SSE_FLOAT_MODE_P (<MODEF:MODE>mode) && TARGET_MIX_SSE_I387
-   && TARGET_INTER_UNIT_CONVERSIONS
-   && reload_completed && SSE_REG_P (operands[0])"
-  [(set (match_dup 0) (float:MODEF (match_dup 1)))])
-
-(define_split
-  [(set (match_operand:MODEF 0 "register_operand")
-	(float:MODEF (match_operand:SWI48 1 "register_operand")))
-   (clobber (match_operand:SWI48 2 "memory_operand"))]
-  "SSE_FLOAT_MODE_P (<MODEF:MODE>mode) && TARGET_MIX_SSE_I387
    && !(TARGET_INTER_UNIT_CONVERSIONS || optimize_function_for_size_p (cfun))
    && reload_completed && SSE_REG_P (operands[0])"
   [(set (match_dup 2) (match_dup 1))
@@ -4918,19 +4876,6 @@
    (set_attr "bdver1_decode" "double,direct,double")
    (set_attr "fp_int_src" "true")])
 
-(define_insn "*floatsi<mode>2_vector_sse"
-  [(set (match_operand:MODEF 0 "register_operand" "=x")
-	(float:MODEF (match_operand:SI 1 "memory_operand" "m")))]
-  "TARGET_SSE2 && TARGET_SSE_MATH
-   && TARGET_USE_VECTOR_CONVERTS && optimize_function_for_speed_p (cfun)"
-  "#"
-  [(set_attr "type" "sseicvt")
-   (set_attr "mode" "<MODE>")
-   (set_attr "athlon_decode" "direct")
-   (set_attr "amdfam10_decode" "double")
-   (set_attr "bdver1_decode" "direct")
-   (set_attr "fp_int_src" "true")])
-
 (define_split
   [(set (match_operand:MODEF 0 "register_operand")
 	(float:MODEF (match_operand:SI 1 "register_operand")))
@@ -4995,49 +4940,6 @@
 
 (define_split
   [(set (match_operand:MODEF 0 "register_operand")
-	(float:MODEF (match_operand:SI 1 "register_operand")))]
-  "TARGET_SSE2 && TARGET_SSE_MATH
-   && TARGET_USE_VECTOR_CONVERTS && optimize_function_for_speed_p (cfun)
-   && reload_completed && SSE_REG_P (operands[0])"
-  [(const_int 0)]
-{
-  rtx op1 = operands[1];
-
-  operands[3] = simplify_gen_subreg (<ssevecmode>mode, operands[0],
-				     <MODE>mode, 0);
-  if (GET_CODE (op1) == SUBREG)
-    op1 = SUBREG_REG (op1);
-
-  if (GENERAL_REG_P (op1))
-    {
-      operands[4] = simplify_gen_subreg (V4SImode, operands[0], <MODE>mode, 0);
-      if (TARGET_INTER_UNIT_MOVES_TO_VEC)
-	emit_insn (gen_sse2_loadld (operands[4],
-				    CONST0_RTX (V4SImode), operands[1]));
-      else
-	{
-	  operands[5] = ix86_force_to_memory (GET_MODE (operands[1]),
-					      operands[1]);
-	  emit_insn (gen_sse2_loadld (operands[4],
-				      CONST0_RTX (V4SImode), operands[5]));
-	  ix86_free_from_memory (GET_MODE (operands[1]));
-	}
-    }
-  /* We can ignore possible trapping value in the
-     high part of SSE register for non-trapping math. */
-  else if (SSE_REG_P (op1) && !flag_trapping_math)
-    operands[4] = simplify_gen_subreg (V4SImode, operands[1], SImode, 0);
-  else
-    gcc_unreachable ();
-  if (<ssevecmode>mode == V4SFmode)
-    emit_insn (gen_floatv4siv4sf2 (operands[3], operands[4]));
-  else
-    emit_insn (gen_sse2_cvtdq2pd (operands[3], operands[4]));
-  DONE;
-})
-
-(define_split
-  [(set (match_operand:MODEF 0 "register_operand")
 	(float:MODEF (match_operand:SI 1 "memory_operand")))]
   "TARGET_SSE2 && TARGET_SSE_MATH
    && TARGET_USE_VECTOR_CONVERTS && optimize_function_for_speed_p (cfun)
@@ -5134,14 +5036,6 @@
   [(set (match_dup 2) (match_dup 1))
    (set (match_dup 0) (float:MODEF (match_dup 2)))])
 
-(define_split
-  [(set (match_operand:MODEF 0 "register_operand")
-	(float:MODEF (match_operand:SWI48 1 "memory_operand")))
-   (clobber (match_operand:SWI48 2 "memory_operand"))]
-  "SSE_FLOAT_MODE_P (<MODEF:MODE>mode) && TARGET_SSE_MATH
-   && reload_completed && SSE_REG_P (operands[0])"
-  [(set (match_dup 0) (float:MODEF (match_dup 1)))])
-
 (define_insn "*float<SWI48x:mode><X87MODEF:mode>2_i387_with_temp"
   [(set (match_operand:X87MODEF 0 "register_operand" "=f,f")
 	(float:X87MODEF
@@ -11370,7 +11264,7 @@
   [(const_int 0)]
 {
   ix86_split_fp_branch (GET_CODE (operands[0]), operands[1], operands[2],
-	                operands[3], operands[4], NULL_RTX, NULL_RTX);
+	                operands[3], operands[4], NULL_RTX);
   DONE;
 })
 
@@ -11389,7 +11283,7 @@
   [(const_int 0)]
 {
   ix86_split_fp_branch (GET_CODE (operands[0]), operands[1], operands[2],
-	     		operands[3], operands[4], operands[5], NULL_RTX);
+			operands[3], operands[4], operands[5]);
   DONE;
 })
 
@@ -11403,13 +11297,13 @@
 	(if_then_else
 	  (match_operator:CCFP 0 "ix86_swapped_fp_comparison_operator"
 	    [(match_operator:X87MODEF 1 "float_operator"
-	      [(match_operand:SWI24 2 "nonimmediate_operand" "m,?r")])
-	     (match_operand:X87MODEF 3 "register_operand" "f,f")])
+	      [(match_operand:SWI24 2 "nonimmediate_operand" "m")])
+	     (match_operand:X87MODEF 3 "register_operand" "f")])
 	  (label_ref (match_operand 4))
 	  (pc)))
    (clobber (reg:CCFP FPSR_REG))
    (clobber (reg:CCFP FLAGS_REG))
-   (clobber (match_scratch:HI 5 "=a,a"))]
+   (clobber (match_scratch:HI 5 "=a"))]
   "TARGET_80387 && !TARGET_CMOVE
    && (TARGET_USE_<SWI24:MODE>MODE_FIOP
        || optimize_function_for_size_p (cfun))"
@@ -11420,13 +11314,13 @@
 	(if_then_else
 	  (match_operator:CCFP 0 "ix86_swapped_fp_comparison_operator"
 	    [(match_operator:X87MODEF 1 "float_operator"
-	      [(match_operand:SWI24 2 "nonimmediate_operand" "m,?r")])
-	     (match_operand:X87MODEF 3 "register_operand" "f,f")])
+	      [(match_operand:SWI24 2 "nonimmediate_operand" "m")])
+	     (match_operand:X87MODEF 3 "register_operand" "f")])
 	  (pc)
 	  (label_ref (match_operand 4))))
    (clobber (reg:CCFP FPSR_REG))
    (clobber (reg:CCFP FLAGS_REG))
-   (clobber (match_scratch:HI 5 "=a,a"))]
+   (clobber (match_scratch:HI 5 "=a"))]
   "TARGET_80387 && !TARGET_CMOVE
    && (TARGET_USE_<SWI24:MODE>MODE_FIOP
        || optimize_function_for_size_p (cfun))"
@@ -11450,32 +11344,7 @@
 {
   ix86_split_fp_branch (swap_condition (GET_CODE (operands[0])), operands[3],
 		        gen_rtx_FLOAT (GET_MODE (operands[1]), operands[2]),
-			operands[4], operands[5], operands[6], NULL_RTX);
-  DONE;
-})
-
-;; %%% Kill this when reload knows how to do it.
-(define_split
-  [(set (pc)
-	(if_then_else
-	  (match_operator:CCFP 0 "ix86_swapped_fp_comparison_operator"
-	    [(match_operator:X87MODEF 1 "float_operator"
-	      [(match_operand:SWI24 2 "register_operand")])
-	     (match_operand:X87MODEF 3 "register_operand")])
-	  (match_operand 4)
-	  (match_operand 5)))
-   (clobber (reg:CCFP FPSR_REG))
-   (clobber (reg:CCFP FLAGS_REG))
-   (clobber (match_scratch:HI 6))]
-  "TARGET_80387 && !TARGET_CMOVE
-   && reload_completed"
-  [(const_int 0)]
-{
-  operands[7] = ix86_force_to_memory (GET_MODE (operands[2]), operands[2]);
-
-  ix86_split_fp_branch (swap_condition (GET_CODE (operands[0])), operands[3],
-		       	gen_rtx_FLOAT (GET_MODE (operands[1]), operands[7]),
-			operands[4], operands[5], operands[6], operands[2]);
+			operands[4], operands[5], operands[6]);
   DONE;
 })
 \f
@@ -13423,15 +13292,16 @@
 
 ;; ??? Add SSE splitters for these!
 (define_insn "*fop_<MODEF:mode>_2_i387"
-  [(set (match_operand:MODEF 0 "register_operand" "=f,f")
+  [(set (match_operand:MODEF 0 "register_operand" "=f")
 	(match_operator:MODEF 3 "binary_fp_operator"
 	  [(float:MODEF
-	     (match_operand:SWI24 1 "nonimmediate_operand" "m,?r"))
-	   (match_operand:MODEF 2 "register_operand" "0,0")]))]
+	     (match_operand:SWI24 1 "nonimmediate_operand" "m"))
+	   (match_operand:MODEF 2 "register_operand" "0")]))]
   "TARGET_80387 && X87_ENABLE_FLOAT (<MODEF:MODE>mode, <SWI24:MODE>mode)
    && !(SSE_FLOAT_MODE_P (<MODEF:MODE>mode) && TARGET_SSE_MATH)
-   && (TARGET_USE_<SWI24:MODE>MODE_FIOP || optimize_function_for_size_p (cfun))"
-  "* return which_alternative ? \"#\" : output_387_binary_op (insn, operands);"
+   && (TARGET_USE_<SWI24:MODE>MODE_FIOP
+       || optimize_function_for_size_p (cfun))"
+  { return output_387_binary_op (insn, operands); }
   [(set (attr "type")
         (cond [(match_operand:MODEF 3 "mult_operator")
                  (const_string "fmul")
@@ -13443,15 +13313,16 @@
    (set_attr "mode" "<SWI24:MODE>")])
 
 (define_insn "*fop_<MODEF:mode>_3_i387"
-  [(set (match_operand:MODEF 0 "register_operand" "=f,f")
+  [(set (match_operand:MODEF 0 "register_operand" "=f")
 	(match_operator:MODEF 3 "binary_fp_operator"
-	  [(match_operand:MODEF 1 "register_operand" "0,0")
+	  [(match_operand:MODEF 1 "register_operand" "0")
 	   (float:MODEF
-	     (match_operand:SWI24 2 "nonimmediate_operand" "m,?r"))]))]
+	     (match_operand:SWI24 2 "nonimmediate_operand" "m"))]))]
   "TARGET_80387 && X87_ENABLE_FLOAT (<MODEF:MODE>mode, <SWI24:MODE>mode)
    && !(SSE_FLOAT_MODE_P (<MODEF:MODE>mode) && TARGET_SSE_MATH)
-   && (TARGET_USE_<SWI24:MODE>MODE_FIOP || optimize_function_for_size_p (cfun))"
-  "* return which_alternative ? \"#\" : output_387_binary_op (insn, operands);"
+   && (TARGET_USE_<SWI24:MODE>MODE_FIOP
+       || optimize_function_for_size_p (cfun))"
+  { return output_387_binary_op (insn, operands); }
   [(set (attr "type")
         (cond [(match_operand:MODEF 3 "mult_operator")
                  (const_string "fmul")
@@ -13550,13 +13421,14 @@
    (set_attr "mode" "XF")])
 
 (define_insn "*fop_xf_2_i387"
-  [(set (match_operand:XF 0 "register_operand" "=f,f")
+  [(set (match_operand:XF 0 "register_operand" "=f")
 	(match_operator:XF 3 "binary_fp_operator"
 	  [(float:XF
-	     (match_operand:SWI24 1 "nonimmediate_operand" "m,?r"))
-	   (match_operand:XF 2 "register_operand" "0,0")]))]
-  "TARGET_80387 && (TARGET_USE_<MODE>MODE_FIOP || optimize_function_for_size_p (cfun))"
-  "* return which_alternative ? \"#\" : output_387_binary_op (insn, operands);"
+	     (match_operand:SWI24 1 "nonimmediate_operand" "m"))
+	   (match_operand:XF 2 "register_operand" "0")]))]
+  "TARGET_80387
+   && (TARGET_USE_<MODE>MODE_FIOP || optimize_function_for_size_p (cfun))"
+  { return output_387_binary_op (insn, operands); }
   [(set (attr "type")
         (cond [(match_operand:XF 3 "mult_operator")
                  (const_string "fmul")
@@ -13568,13 +13440,14 @@
    (set_attr "mode" "<MODE>")])
 
 (define_insn "*fop_xf_3_i387"
-  [(set (match_operand:XF 0 "register_operand" "=f,f")
+  [(set (match_operand:XF 0 "register_operand" "=f")
 	(match_operator:XF 3 "binary_fp_operator"
-	  [(match_operand:XF 1 "register_operand" "0,0")
+	  [(match_operand:XF 1 "register_operand" "0")
 	   (float:XF
-	     (match_operand:SWI24 2 "nonimmediate_operand" "m,?r"))]))]
-  "TARGET_80387 && (TARGET_USE_<MODE>MODE_FIOP || optimize_function_for_size_p (cfun))"
-  "* return which_alternative ? \"#\" : output_387_binary_op (insn, operands);"
+	     (match_operand:SWI24 2 "nonimmediate_operand" "m"))]))]
+  "TARGET_80387
+   && (TARGET_USE_<MODE>MODE_FIOP || optimize_function_for_size_p (cfun))"
+  { return output_387_binary_op (insn, operands); }
   [(set (attr "type")
         (cond [(match_operand:XF 3 "mult_operator")
                  (const_string "fmul")
@@ -13636,48 +13509,6 @@
               ]
               (const_string "fop")))
    (set_attr "mode" "<MODE>")])
-
-(define_split
-  [(set (match_operand 0 "register_operand")
-	(match_operator 3 "binary_fp_operator"
-	   [(float (match_operand:SWI24 1 "register_operand"))
-	    (match_operand 2 "register_operand")]))]
-  "reload_completed
-   && X87_FLOAT_MODE_P (GET_MODE (operands[0]))
-   && X87_ENABLE_FLOAT (GET_MODE (operands[0]), GET_MODE (operands[1]))"
-  [(const_int 0)]
-{
-  operands[4] = ix86_force_to_memory (GET_MODE (operands[1]), operands[1]);
-  operands[4] = gen_rtx_FLOAT (GET_MODE (operands[0]), operands[4]);
-  emit_insn (gen_rtx_SET (VOIDmode, operands[0],
-			  gen_rtx_fmt_ee (GET_CODE (operands[3]),
-					  GET_MODE (operands[3]),
-					  operands[4],
-					  operands[2])));
-  ix86_free_from_memory (GET_MODE (operands[1]));
-  DONE;
-})
-
-(define_split
-  [(set (match_operand 0 "register_operand")
-	(match_operator 3 "binary_fp_operator"
-	   [(match_operand 1 "register_operand")
-	    (float (match_operand:SWI24 2 "register_operand"))]))]
-  "reload_completed
-   && X87_FLOAT_MODE_P (GET_MODE (operands[0]))
-   && X87_ENABLE_FLOAT (GET_MODE (operands[0]), GET_MODE (operands[2]))"
-  [(const_int 0)]
-{
-  operands[4] = ix86_force_to_memory (GET_MODE (operands[2]), operands[2]);
-  operands[4] = gen_rtx_FLOAT (GET_MODE (operands[0]), operands[4]);
-  emit_insn (gen_rtx_SET (VOIDmode, operands[0],
-			  gen_rtx_fmt_ee (GET_CODE (operands[3]),
-					  GET_MODE (operands[3]),
-					  operands[1],
-					  operands[4])));
-  ix86_free_from_memory (GET_MODE (operands[2]));
-  DONE;
-})
 \f
 ;; FPU special functions.
 
@@ -14567,7 +14398,7 @@
   [(set_attr "type" "fpspc")
    (set_attr "mode" "XF")])
 
-(define_insn "*fscalexf4_i387"
+(define_insn "fscalexf4_i387"
   [(set (match_operand:XF 0 "register_operand" "=f")
 	(unspec:XF [(match_operand:XF 2 "register_operand" "0")
 		    (match_operand:XF 3 "register_operand" "1")]
@@ -14791,15 +14622,9 @@
 })
 
 (define_expand "ldexpxf3"
-  [(set (match_dup 3)
-	(float:XF (match_operand:SI 2 "register_operand")))
-   (parallel [(set (match_operand:XF 0 " register_operand")
-		   (unspec:XF [(match_operand:XF 1 "register_operand")
-			       (match_dup 3)]
-			      UNSPEC_FSCALE_FRACT))
-	      (set (match_dup 4)
-		   (unspec:XF [(match_dup 1) (match_dup 3)]
-			      UNSPEC_FSCALE_EXP))])]
+  [(match_operand:XF 0 "register_operand")
+   (match_operand:XF 1 "register_operand")
+   (match_operand:SI 2 "register_operand")]
   "TARGET_USE_FANCY_MATH_387
    && flag_unsafe_math_optimizations"
 {
@@ -14808,6 +14633,11 @@
 
   operands[3] = gen_reg_rtx (XFmode);
   operands[4] = gen_reg_rtx (XFmode);
+
+  emit_insn (gen_floatsixf2 (operands[3], operands[2]));
+  emit_insn (gen_fscalexf4_i387 (operands[0], operands[4],
+                                 operands[1], operands[3]));
+  DONE;
 })
 
 (define_expand "ldexp<mode>3"
diff --git a/gcc/testsuite/g++.dg/torture/pr60438-1.C b/gcc/testsuite/g++.dg/torture/pr60438-1.C
new file mode 100644
index 0000000..748295a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/torture/pr60438-1.C
@@ -0,0 +1,26 @@
+// { dg-do compile }
+// { dg-options "-fomit-frame-pointer" }
+
+struct A { int a; };
+struct B { A foo (); };
+struct C { B *foo (); };
+int foo (struct C *, float);
+void bar (struct C *);
+void baz (struct A *);
+int a, b, c;
+
+int
+foo (struct C *y, float x)
+{
+  struct A d;
+  if (c)
+    bar (y);
+  else
+    {
+      C g;
+      g.foo ()->foo ();
+      a = b;
+      d.a = (int) (b * x);
+    }
+  baz (&d);
+}
diff --git a/gcc/testsuite/g++.dg/torture/pr60438-2.C b/gcc/testsuite/g++.dg/torture/pr60438-2.C
new file mode 100644
index 0000000..b32576f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/torture/pr60438-2.C
@@ -0,0 +1,3 @@
+// { dg-do compile }
+// { dg-options "-fomit-frame-pointer -fno-crossjumping" }
+#include "pr60438-1.C"

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [COMMITTED] Fix debug/60438 -- i686 stack vs fp operations
  2014-03-13 20:52 [COMMITTED] Fix debug/60438 -- i686 stack vs fp operations Richard Henderson
@ 2014-04-26  9:38 ` Tom de Vries
  2014-04-26 10:26   ` Uros Bizjak
  0 siblings, 1 reply; 4+ messages in thread
From: Tom de Vries @ 2014-04-26  9:38 UTC (permalink / raw)
  To: Richard Henderson; +Cc: GCC Patches, Uros Bizjak, Jakub Jelinek

[-- Attachment #1: Type: text/plain, Size: 1724 bytes --]

On 13-03-14 21:49, Richard Henderson wrote:
>   (define_expand "ldexpxf3"
> -  [(set (match_dup 3)
> -	(float:XF (match_operand:SI 2 "register_operand")))
> -   (parallel [(set (match_operand:XF 0 " register_operand")
> -		   (unspec:XF [(match_operand:XF 1 "register_operand")
> -			       (match_dup 3)]
> -			      UNSPEC_FSCALE_FRACT))
> -	      (set (match_dup 4)
> -		   (unspec:XF [(match_dup 1) (match_dup 3)]
> -			      UNSPEC_FSCALE_EXP))])]
> +  [(match_operand:XF 0 "register_operand")
> +   (match_operand:XF 1 "register_operand")
> +   (match_operand:SI 2 "register_operand")]
>     "TARGET_USE_FANCY_MATH_387
>      && flag_unsafe_math_optimizations"
>   {
> @@ -14808,6 +14633,11 @@
>
>     operands[3] = gen_reg_rtx (XFmode);
>     operands[4] = gen_reg_rtx (XFmode);
> +
> +  emit_insn (gen_floatsixf2 (operands[3], operands[2]));
> +  emit_insn (gen_fscalexf4_i387 (operands[0], operands[4],
> +                                 operands[1], operands[3]));
> +  DONE;
>   })

Richard,

For a non-bootstrap x86_64 build, gcc.dg/builtins-34.c fails for me with a sigsegv.

I've traced it back to this code in insn-emit.c:
...
rtx
gen_ldexpxf3 (rtx operand0,
         rtx operand1,
         rtx operand2)
{
   rtx _val = 0;
   start_sequence ();
   {
     rtx operands[3];
     operands[0] = operand0;
     operands[1] = operand1;
     operands[2] = operand2;

{
   if (optimize_insn_for_size_p ())
     FAIL;

   operands[3] = gen_reg_rtx (XFmode);
   operands[4] = gen_reg_rtx (XFmode);
...

operands is declared with size 3, and operands[3,4] accesses are out of bounds.

I've done a minimal build with attached patch, and reran the test-case, which 
passes now.

OK if bootstrap succeeds?

Thanks,
- Tom

[-- Attachment #2: Fix-out-of-bounds-array-accesses-in-ldexpxf3.patch --]
[-- Type: text/x-patch, Size: 1062 bytes --]

2014-04-26  Tom de Vries  <tom@codesourcery.com>

	* config/i386/i386.md (define_expand "ldexpxf3"): Fix out-of-bounds
	array accesses.
---
 gcc/config/i386/i386.md | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 25e2e93..9f103cf 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -14427,15 +14427,16 @@
   "TARGET_USE_FANCY_MATH_387
    && flag_unsafe_math_optimizations"
 {
+  rtx tmp1, tmp2;
   if (optimize_insn_for_size_p ())
     FAIL;
 
-  operands[3] = gen_reg_rtx (XFmode);
-  operands[4] = gen_reg_rtx (XFmode);
+  tmp1 = gen_reg_rtx (XFmode);
+  tmp2 = gen_reg_rtx (XFmode);
 
-  emit_insn (gen_floatsixf2 (operands[3], operands[2]));
-  emit_insn (gen_fscalexf4_i387 (operands[0], operands[4],
-                                 operands[1], operands[3]));
+  emit_insn (gen_floatsixf2 (tmp1, operands[2]));
+  emit_insn (gen_fscalexf4_i387 (operands[0], tmp2,
+                                 operands[1], tmp1));
   DONE;
 })
 
-- 
1.8.3.2


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [COMMITTED] Fix debug/60438 -- i686 stack vs fp operations
  2014-04-26  9:38 ` Tom de Vries
@ 2014-04-26 10:26   ` Uros Bizjak
  2014-04-26 17:44     ` Tom de Vries
  0 siblings, 1 reply; 4+ messages in thread
From: Uros Bizjak @ 2014-04-26 10:26 UTC (permalink / raw)
  To: Tom de Vries; +Cc: Richard Henderson, GCC Patches, Jakub Jelinek

On Sat, Apr 26, 2014 at 11:27 AM, Tom de Vries <Tom_deVries@mentor.com> wrote:
> On 13-03-14 21:49, Richard Henderson wrote:
>>
>>   (define_expand "ldexpxf3"
>> -  [(set (match_dup 3)
>> -       (float:XF (match_operand:SI 2 "register_operand")))
>> -   (parallel [(set (match_operand:XF 0 " register_operand")
>> -                  (unspec:XF [(match_operand:XF 1 "register_operand")
>> -                              (match_dup 3)]
>> -                             UNSPEC_FSCALE_FRACT))
>> -             (set (match_dup 4)
>> -                  (unspec:XF [(match_dup 1) (match_dup 3)]
>> -                             UNSPEC_FSCALE_EXP))])]
>> +  [(match_operand:XF 0 "register_operand")
>> +   (match_operand:XF 1 "register_operand")
>> +   (match_operand:SI 2 "register_operand")]
>>     "TARGET_USE_FANCY_MATH_387
>>      && flag_unsafe_math_optimizations"
>>   {
>> @@ -14808,6 +14633,11 @@
>>
>>     operands[3] = gen_reg_rtx (XFmode);
>>     operands[4] = gen_reg_rtx (XFmode);
>> +
>> +  emit_insn (gen_floatsixf2 (operands[3], operands[2]));
>> +  emit_insn (gen_fscalexf4_i387 (operands[0], operands[4],
>> +                                 operands[1], operands[3]));
>> +  DONE;
>>   })
>
>
> Richard,
>
> For a non-bootstrap x86_64 build, gcc.dg/builtins-34.c fails for me with a
> sigsegv.
>
> I've traced it back to this code in insn-emit.c:
> ...
> rtx
> gen_ldexpxf3 (rtx operand0,
>         rtx operand1,
>         rtx operand2)
> {
>   rtx _val = 0;
>   start_sequence ();
>   {
>     rtx operands[3];
>     operands[0] = operand0;
>     operands[1] = operand1;
>     operands[2] = operand2;
>
> {
>   if (optimize_insn_for_size_p ())
>     FAIL;
>
>   operands[3] = gen_reg_rtx (XFmode);
>   operands[4] = gen_reg_rtx (XFmode);
> ...
>
> operands is declared with size 3, and operands[3,4] accesses are out of
> bounds.
>
> I've done a minimal build with attached patch, and reran the test-case,
> which passes now.
>
> OK if bootstrap succeeds?
>
> 2014-04-26  Tom de Vries  <tom@codesourcery.com>
>
> * config/i386/i386.md (define_expand "ldexpxf3"): Fix out-of-bounds
> array accesses.

OK for mainline and 4.9 branch.

Thanks,
Uros.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [COMMITTED] Fix debug/60438 -- i686 stack vs fp operations
  2014-04-26 10:26   ` Uros Bizjak
@ 2014-04-26 17:44     ` Tom de Vries
  0 siblings, 0 replies; 4+ messages in thread
From: Tom de Vries @ 2014-04-26 17:44 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: Richard Henderson, GCC Patches, Jakub Jelinek

> OK if bootstrap succeeds?

With testing of the bootstrap build of the patch, I ran into the following 
regression compared to a reference bootstrap build without the patch:
...
FAIL: g++.dg/tsan/cond_race.C  -O2  output pattern test, is ==3087==WARNING: 
Program is run with unlimited stack size, which wouldn't work with Threa\
dSanitizer.
==3087==Re-execing with stack size limited to 33554432 bytes.
==================
WARNING: ThreadSanitizer: heap-use-after-free (pid=3087)
   Read of size 8 at 0x7d180000efc8 by thread T1:
     #0 pthread_cond_signal 
/home/vries/gcc_versions/data/test-fix-expand-ldexp/with/src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.i\
nc:2266 (libtsan.so.0+0x000000039b21)
     #1 thr(void*) 
/home/vries/gcc_versions/data/test-fix-expand-ldexp/with/src/gcc/testsuite/g++.dg/tsan/cond_race.C:20 
(cond_race.exe+0x000000001033\
)
   Previous write of size 8 at 0x7d180000efc8 by main thread:
     #0 operator delete(void*) 
/home/vries/gcc_versions/data/test-fix-expand-ldexp/with/src/libsanitizer/tsan/tsan_interceptors.cc:592 
(libtsan.so.0+0\
x0000000494b0)
     #1 main 
/home/vries/gcc_versions/data/test-fix-expand-ldexp/with/src/gcc/testsuite/g++.dg/tsan/cond_race.C:34 
(cond_race.exe+0x000000000ea0)
   Location is heap block of size 96 at 0x7d180000efa0 allocated by main thread:
     #0 operator new(unsigned long) 
/home/vries/gcc_versions/data/test-fix-expand-ldexp/with/src/libsanitizer/tsan/tsan_interceptors.cc:560 
(libtsan.s\
o.0+0x0000000496f2)
     #1 main 
/home/vries/gcc_versions/data/test-fix-expand-ldexp/with/src/gcc/testsuite/g++.dg/tsan/cond_race.C:25 
(cond_race.exe+0x000000000e12)
   Thread T1 (tid=3101, running) created by main thread at:
     #0 pthread_create 
/home/vries/gcc_versions/data/test-fix-expand-ldexp/with/src/libsanitizer/tsan/tsan_interceptors.cc:877 
(libtsan.so.0+0x0000000\
47c23)
     #1 main 
/home/vries/gcc_versions/data/test-fix-expand-ldexp/with/src/gcc/testsuite/g++.dg/tsan/cond_race.C:29 
(cond_race.exe+0x000000000e5a)
SUMMARY: ThreadSanitizer: heap-use-after-free 
/home/vries/gcc_versions/data/test-fix-expand-ldexp/with/src/gcc/testsuite/g++.dg/tsan/cond_race.C:20 
t\
hr(void*)
==================
ThreadSanitizer: reported 1 warnings
, should match ThreadSanitizer: data race.*pthread_cond_signal.*
...

I've found the same failure here: 
http://gcc.gnu.org/ml/gcc-testresults/2014-01/msg00127.html, so I'm assuming 
it's a spurious failure.

I've committed to trunk and 4.9.

Thanks,
- Tom

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2014-04-26 16:42 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-03-13 20:52 [COMMITTED] Fix debug/60438 -- i686 stack vs fp operations Richard Henderson
2014-04-26  9:38 ` Tom de Vries
2014-04-26 10:26   ` Uros Bizjak
2014-04-26 17:44     ` Tom de Vries

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).