public inbox for gcc-patches@gcc.gnu.org
* [RFC] Do not consider volatile asms as optimization barriers #1
@ 2014-03-01 14:37 Eric Botcazou
  2014-03-01 16:54 ` Richard Sandiford
                   ` (2 more replies)
  0 siblings, 3 replies; 36+ messages in thread
From: Eric Botcazou @ 2014-03-01 14:37 UTC (permalink / raw)
  To: gcc-patches; +Cc: Yury Gribov

[-- Attachment #1: Type: text/plain, Size: 3815 bytes --]

There seems to be a sufficiently large consensus that volatile asms should not 
be treated as full optimization barriers by the compiler.  This patch is a 
first step towards implementing this conclusion and aims only at addressing 
the code quality regressions introduced by
  http://gcc.gnu.org/viewcvs/gcc?view=revision&revision=193802
on the 4.8 branch and mainline for volatile asms.

It introduces a new, temporary predicate really_volatile_insn_p and invokes it 
from the 3 places in cse.c, cselib.c and dse.c which were touched above.  But 
this comes with a side effect: the "blockage" standard pattern needs to be 
adjusted to always use an UNSPEC_VOLATILE.  That's already the case for all 
the architectures that define it, but 21 architectures (aarch64 arc avr bfin 
cr16 cris epiphany h8300 m32c mcore mmix mn10300 moxie msp430 nds32 picochip 
rl78 rx score v850 xtensa) don't define it and therefore need to be adjusted.

Tested on x86_64-suse-linux and by building cc1 and compiling a relevant 
testcase for the 21 aforementioned architectures.


2014-03-01  Eric Botcazou  <ebotcazou@adacore.com>

	* doc/md.texi (blockage): Do not allow volatile asms.
	* rtl.def (UNSPEC_VOLATILE): Adjust description.
	* rtl.h (really_volatile_insn_p): Declare.
	* rtlanal.c (volatile_insn_1): New predicate copied from...
	(volatile_insn_p): ...this.  Call it.
	(really_volatile_insn_p): New predicate.
	* cse.c (cse_insn): Call really_volatile_insn_p.
	* cselib.c (cselib_process_insn): Likewise.
	* dse.c (scan_insn): Likewise.
	* emit-rtl.c (gen_blockage): Delete.
	* emit-rtl.h (gen_blockage): Likewise.
	* config/aarch64/aarch64.md (UNSPECV_BLOCKAGE): New enum value.
	(blockage): New insn.
	* config/arc/arc.md (VUNSPEC_BLOCKAGE): New constant.
	(blockage): New insn.
	* config/avr/avr.md (UNSPECV_BLOCKAGE): New enum value.
	(blockage): New insn.
	* config/bfin/bfin.md (UNSPEC_VOLATILE_BLOCKAGE): New constant.
	(blockage): New insn.
	* config/cr16/cr16.md (UNSPECV_BLOCKAGE): New constant.
	(blockage): New insn.
	* config/cris/cris.md (CRIS_UNSPECV_BLOCKAGE): New enum value.
	(blockage): New insn.
	* config/epiphany/epiphany.md (UNSPECV_BLOCKAGE): New constant.
	(blockage): New insn.
	* config/h8300/h8300.md (UNSPEC_BLOCKAGE): New constant.
	(blockage): New insn.
	* config/m32c/m32c.md (UNS_BLOCKAGE): New constant.
	(blockage): New insn.
	* config/mcore/mcore.md (UNSPECV_BLOCKAGE, UNSPECV_CONSTANT,
	UNSPECV_ALIGN, UNSPECV_TABLE): New enum values.
	(blockage): New insn.
	(consttable_4): Use UNSPECV_CONSTANT.
	(align_4): Use UNSPECV_ALIGN.
	(consttable_end): Use UNSPECV_TABLE.
	* config/mmix/mmix.md (UNSPECV_BLOCKAGE, UNSPECV_SYNC,
	UNSPECV_NONLOCAL): New enum values.
	(blockage): New insn.
	(nonlocal_goto_receiver): Use UNSPECV_NONLOCAL.
	(nonlocal_goto_receiver_expanded): Likewise.
	(sync_icache): Use UNSPECV_SYNC.
	* config/mn10300/mn10300.md (UNSPECV_BLOCKAGE): New constant.
	(blockage): New insn.
	* config/moxie/moxie.md (UNSPECV_BLOCKAGE): New enum value.
	(blockage): New insn.
	* config/msp430/msp430.md (UNS_BLOCKAGE): New enum value.
	(blockage): New insn.
	* config/nds32/constants.md (UNSPEC_VOLATILE_BLOCKAGE): New enum value.
	* config/nds32/nds32.md (blockage): New insn.
	* config/nds32/nds32.c (nds32_valid_stack_push_pop): Return false if
	the insn is of the wrong kind.
	* config/picochip/picochip.md (UNSPEC_BLOCKAGE): New constant.
	(blockage): New insn.
	* config/rl78/rl78.md (UNS_BLOCKAGE): New constant.
	(blockage): New insn.
	* config/rx/rx.md (UNSPEC_BLOCKAGE): New constant.
	(blockage): New insn.
	* config/score/score.md (BLOCKAGE): New constant.
	(blockage): New insn.
	* config/v850/v850.md (UNSPECV_BLOCKAGE): New constant.
	(blockage): New insn.
	* config/xtensa/xtensa.md (UNSPECV_BLOCKAGE): New constant.
	(blockage): New insn.
	
	
-- 
Eric Botcazou

[-- Attachment #2: volatile_asm1.diff --]
[-- Type: text/x-patch, Size: 24386 bytes --]

Index: doc/md.texi
===================================================================
--- doc/md.texi	(revision 208241)
+++ doc/md.texi	(working copy)
@@ -6228,7 +6228,7 @@ the values of operands 1 and 2.
 This pattern defines a pseudo insn that prevents the instruction
 scheduler and other passes from moving instructions and using register
 equivalences across the boundary defined by the blockage insn.
-This needs to be an UNSPEC_VOLATILE pattern or a volatile ASM.
+This needs to be an UNSPEC_VOLATILE pattern.
 
 @cindex @code{memory_barrier} instruction pattern
 @item @samp{memory_barrier}
Index: rtlanal.c
===================================================================
--- rtlanal.c	(revision 208241)
+++ rtlanal.c	(working copy)
@@ -2136,10 +2136,11 @@ remove_node_from_expr_list (const_rtx no
 /* Nonzero if X contains any volatile instructions.  These are instructions
    which may cause unpredictable machine state instructions, and thus no
    instructions or register uses should be moved or combined across them.
-   This includes only volatile asms and UNSPEC_VOLATILE instructions.  */
+   This includes only UNSPEC_VOLATILE instructions and, if WITH_ASMS is
+   true, also volatile asms.  */
 
-int
-volatile_insn_p (const_rtx x)
+static int
+volatile_insn_1 (const_rtx x, bool with_asms)
 {
   const RTX_CODE code = GET_CODE (x);
   switch (code)
@@ -2164,7 +2165,7 @@ volatile_insn_p (const_rtx x)
 
     case ASM_INPUT:
     case ASM_OPERANDS:
-      if (MEM_VOLATILE_P (x))
+      if (with_asms && MEM_VOLATILE_P (x))
 	return 1;
 
     default:
@@ -2196,6 +2197,32 @@ volatile_insn_p (const_rtx x)
   return 0;
 }
 
+/* Nonzero if X contains any volatile instructions.  These are instructions
+   which may cause unpredictable machine state instructions, and thus no
+   instructions or register uses should be moved or combined across them.
+   This includes only volatile asms and UNSPEC_VOLATILE instructions.
+
+   This is the historical version, now deprecated.  */
+
+int
+volatile_insn_p (const_rtx x)
+{
+  return volatile_insn_1 (x, true);
+}
+
+/* Nonzero if X contains any volatile instructions.  These are instructions
+   which may cause unpredictable machine state instructions, and thus no
+   instructions or register uses should be moved or combined across them.
+   This includes only UNSPEC_VOLATILE instructions.
+
+   This is the new version.  */
+
+int
+really_volatile_insn_p (const_rtx x)
+{
+  return volatile_insn_1 (x, false);
+}
+
 /* Nonzero if X contains any volatile memory references
    UNSPEC_VOLATILE operations or volatile ASM_OPERANDS expressions.  */
 
Index: cse.c
===================================================================
--- cse.c	(revision 208241)
+++ cse.c	(working copy)
@@ -5682,9 +5682,8 @@ cse_insn (rtx insn)
 	  invalidate (XEXP (dest, 0), GET_MODE (dest));
       }
 
-  /* A volatile ASM or an UNSPEC_VOLATILE invalidates everything.  */
-  if (NONJUMP_INSN_P (insn)
-      && volatile_insn_p (PATTERN (insn)))
+  /* A volatile instruction invalidates everything.  */
+  if (NONJUMP_INSN_P (insn) && really_volatile_insn_p (PATTERN (insn)))
     flush_hash_table ();
 
   /* Don't cse over a call to setjmp; on some machines (eg VAX)
Index: rtl.def
===================================================================
--- rtl.def	(revision 208241)
+++ rtl.def	(working copy)
@@ -236,7 +236,9 @@ DEF_RTL_EXPR(ASM_OPERANDS, "asm_operands
    */
 DEF_RTL_EXPR(UNSPEC, "unspec", "Ei", RTX_EXTRA)
 
-/* Similar, but a volatile operation and one which may trap.  */
+/* Similar, but a volatile operation and one which may trap.  Moreover, it's a
+   full optimization barrier, i.e. no instructions may be moved and no register
+   (hard or pseudo) or memory equivalences may be used across it.  */
 DEF_RTL_EXPR(UNSPEC_VOLATILE, "unspec_volatile", "Ei", RTX_EXTRA)
 
 /* ----------------------------------------------------------------------
Index: dse.c
===================================================================
--- dse.c	(revision 208241)
+++ dse.c	(working copy)
@@ -2472,8 +2472,7 @@ scan_insn (bb_info_t bb_info, rtx insn)
 
   /* Cselib clears the table for this case, so we have to essentially
      do the same.  */
-  if (NONJUMP_INSN_P (insn)
-      && volatile_insn_p (PATTERN (insn)))
+  if (NONJUMP_INSN_P (insn) && really_volatile_insn_p (PATTERN (insn)))
     {
       add_wild_read (bb_info);
       insn_info->cannot_delete = true;
Index: emit-rtl.c
===================================================================
--- emit-rtl.c	(revision 208241)
+++ emit-rtl.c	(working copy)
@@ -329,21 +329,6 @@ get_reg_attrs (tree decl, int offset)
   return (reg_attrs *) *slot;
 }
 
-
-#if !HAVE_blockage
-/* Generate an empty ASM_INPUT, which is used to block attempts to schedule,
-   and to block register equivalences to be seen across this insn.  */
-
-rtx
-gen_blockage (void)
-{
-  rtx x = gen_rtx_ASM_INPUT (VOIDmode, "");
-  MEM_VOLATILE_P (x) = true;
-  return x;
-}
-#endif
-
-
 /* Generate a new REG rtx.  Make sure ORIGINAL_REGNO is set properly, and
    don't attempt to share with the various global pieces of rtl (such as
    frame_pointer_rtx).  */
Index: cselib.c
===================================================================
--- cselib.c	(revision 208241)
+++ cselib.c	(working copy)
@@ -2655,10 +2655,8 @@ cselib_process_insn (rtx insn)
 
   /* Forget everything at a CODE_LABEL, a volatile insn, or a setjmp.  */
   if ((LABEL_P (insn)
-       || (CALL_P (insn)
-	   && find_reg_note (insn, REG_SETJMP, NULL))
-       || (NONJUMP_INSN_P (insn)
-	   && volatile_insn_p (PATTERN (insn))))
+       || (CALL_P (insn) && find_reg_note (insn, REG_SETJMP, NULL))
+       || (NONJUMP_INSN_P (insn) && really_volatile_insn_p (PATTERN (insn))))
       && !cselib_preserve_constants)
     {
       cselib_reset_table (next_uid);
Index: emit-rtl.h
===================================================================
--- emit-rtl.h	(revision 208241)
+++ emit-rtl.h	(working copy)
@@ -57,7 +57,6 @@ extern rtx replace_equiv_address (rtx, r
 /* Likewise, but the reference is not required to be valid.  */
 extern rtx replace_equiv_address_nv (rtx, rtx);
 
-extern rtx gen_blockage (void);
 extern rtvec gen_rtvec (int, ...);
 extern rtx copy_insn_1 (rtx);
 extern rtx copy_insn (rtx);
Index: rtl.h
===================================================================
--- rtl.h	(revision 208241)
+++ rtl.h	(working copy)
@@ -2064,6 +2064,7 @@ extern void remove_reg_equal_equiv_notes
 extern int side_effects_p (const_rtx);
 extern int volatile_refs_p (const_rtx);
 extern int volatile_insn_p (const_rtx);
+extern int really_volatile_insn_p (const_rtx x);
 extern int may_trap_p_1 (const_rtx, unsigned);
 extern int may_trap_p (const_rtx);
 extern int may_trap_or_fault_p (const_rtx);
Index: config/m32c/m32c.md
===================================================================
--- config/m32c/m32c.md	(revision 208241)
+++ config/m32c/m32c.md	(working copy)
@@ -37,7 +37,8 @@ (define_constants
    ])
 
 (define_constants
-  [(UNS_PROLOGUE_END 1)
+  [(UNS_BLOCKAGE 0)
+   (UNS_PROLOGUE_END 1)
    (UNS_EPILOGUE_START 2)
    (UNS_EH_EPILOGUE 3)
    (UNS_PUSHM 4)
@@ -64,6 +65,13 @@ (define_mode_attr bwl [(QI "b") (HI "w")
 (define_code_iterator eqne_cond [eq ne])
 
 
+(define_insn "blockage"
+  [(unspec_volatile [(const_int 0)] UNS_BLOCKAGE)]
+  ""
+  ""
+  [(set_attr "flags" "n")]
+)
+
 (define_insn "nop"
   [(const_int 0)]
   ""
Index: config/rx/rx.md
===================================================================
--- config/rx/rx.md	(revision 208241)
+++ config/rx/rx.md	(working copy)
@@ -75,6 +75,8 @@ (define_constants
    (UNSPEC_BUILTIN_WAIT	   51)
 
    (UNSPEC_PID_ADDR	   52)
+
+   (UNSPEC_BLOCKAGE	   53)
   ]
 )
 
@@ -2607,7 +2609,13 @@ (define_insn "mvfcp"
 
 ;;---------- Misc ------------------------
 
-;; Required by cfglayout.c...
+(define_insn "blockage"
+  [(unspec_volatile [(const_int 0)] UNSPEC_BLOCKAGE)]
+  ""
+  ""
+  [(set_attr "length" "0")]
+)
+
 (define_insn "nop"
   [(const_int 0)]
   ""
Index: config/avr/avr.md
===================================================================
--- config/avr/avr.md	(revision 208241)
+++ config/avr/avr.md	(working copy)
@@ -83,6 +83,7 @@ (define_c_enum "unspecv"
    UNSPECV_SLEEP
    UNSPECV_WDR
    UNSPECV_DELAY_CYCLES
+   UNSPECV_BLOCKAGE
    ])
 
 
@@ -4876,6 +4877,13 @@ (define_insn "call_value_insn"
    (set_attr "length" "1,*,1,*")
    (set_attr "adjust_len" "*,call,*,call")])
 
+(define_insn "blockage"
+  [(unspec_volatile [(const_int 0)] UNSPECV_BLOCKAGE)]
+  ""
+  ""
+  [(set_attr "length" "0")]
+)
+
 (define_insn "nop"
   [(const_int 0)]
   ""
Index: config/nds32/nds32.md
===================================================================
--- config/nds32/nds32.md	(revision 208241)
+++ config/nds32/nds32.md	(working copy)
@@ -1993,6 +1993,13 @@ (define_expand "epilogue" [(const_int 0)
 
 ;; nop instruction.
 
+(define_insn "blockage"
+  [(unspec_volatile [(const_int 0)] UNSPEC_VOLATILE_BLOCKAGE)]
+  ""
+  ""
+  [(set_attr "length" "0")]
+)
+
 (define_insn "nop"
   [(const_int 0)]
   ""
Index: config/nds32/nds32.c
===================================================================
--- config/nds32/nds32.c	(revision 208241)
+++ config/nds32/nds32.c	(working copy)
@@ -4322,6 +4322,8 @@ nds32_valid_stack_push_pop (rtx op, bool
       elt = XVECEXP (op, 0, 0);
       /* Pick up register element.  */
       elt_reg = push_p ? SET_SRC (elt) : SET_DEST (elt);
+      if (!REG_P (elt_reg))
+	return false;
       first_regno = REGNO (elt_reg);
 
       /* The 'push' operation is a kind of store operation.
Index: config/nds32/constants.md
===================================================================
--- config/nds32/constants.md	(revision 208241)
+++ config/nds32/constants.md	(working copy)
@@ -41,6 +41,7 @@ (define_c_enum "unspec_volatile_element"
   UNSPEC_VOLATILE_MTUSR
   UNSPEC_VOLATILE_SETGIE_EN
   UNSPEC_VOLATILE_SETGIE_DIS
+  UNSPEC_VOLATILE_BLOCKAGE
 ])
 
 ;; ------------------------------------------------------------------------
Index: config/xtensa/xtensa.md
===================================================================
--- config/xtensa/xtensa.md	(revision 208241)
+++ config/xtensa/xtensa.md	(working copy)
@@ -36,6 +36,7 @@ (define_constants [
   (UNSPEC_TP		10)
   (UNSPEC_MEMW		11)
 
+  (UNSPECV_BLOCKAGE     0)
   (UNSPECV_SET_FP	1)
   (UNSPECV_ENTRY	2)
   (UNSPECV_S32RI	4)
@@ -1614,6 +1615,13 @@ (define_expand "epilogue"
   DONE;
 })
 
+(define_insn "blockage"
+  [(unspec_volatile [(const_int 0)] UNSPECV_BLOCKAGE)]
+  ""
+  ""
+  [(set_attr "length" "0")]
+)
+
 (define_insn "nop"
   [(const_int 0)]
   ""
Index: config/epiphany/epiphany.md
===================================================================
--- config/epiphany/epiphany.md	(revision 208241)
+++ config/epiphany/epiphany.md	(working copy)
@@ -52,7 +52,8 @@ (define_constants
    (UNSPEC_FP_MODE		 1)
 
    (UNSPECV_GID			 0)
-   (UNSPECV_GIE			 1)])
+   (UNSPECV_GIE			 1)
+   (UNSPECV_BLOCKAGE		 2)])
 
 ;; Insn type.  Used to default other attribute values.
 
@@ -2805,6 +2806,14 @@ (define_expand "movmisalign<mode>"
   DONE;
 })
 \f
+(define_insn "blockage"
+  [(unspec_volatile [(const_int 0)] UNSPECV_BLOCKAGE)]
+  ""
+  ""
+  [(set_attr "length" "0")
+   (set_attr "type" "flow")]
+)
+
 (define_insn "nop"
   [(const_int 0)]
   ""
Index: config/moxie/moxie.md
===================================================================
--- config/moxie/moxie.md	(revision 208241)
+++ config/moxie/moxie.md	(working copy)
@@ -18,6 +18,9 @@
 ;; along with GCC; see the file COPYING3.  If not see
 ;; <http://www.gnu.org/licenses/>.
 
+(define_c_enum "unspecv"
+  [UNSPECV_BLOCKAGE])
+
 ;; -------------------------------------------------------------------------
 ;; Moxie specific constraints, predicates and attributes
 ;; -------------------------------------------------------------------------
@@ -32,6 +35,13 @@ (define_attr "length" "" (const_int 2))
 ;; nop instruction
 ;; -------------------------------------------------------------------------
 
+(define_insn "blockage"
+  [(unspec_volatile [(const_int 0)] UNSPECV_BLOCKAGE)]
+  ""
+  ""
+  [(set_attr "length" "0")]
+)
+
 (define_insn "nop"
   [(const_int 0)]
   ""
Index: config/cris/cris.md
===================================================================
--- config/cris/cris.md	(revision 208241)
+++ config/cris/cris.md	(working copy)
@@ -94,6 +94,9 @@ (define_c_enum ""
 
    ;; Swap all 32 bits of the operand; 31 <=> 0, 30 <=> 1...
    CRIS_UNSPEC_SWAP_BITS
+
+   ;; Blockage
+   CRIS_UNSPECV_BLOCKAGE
   ])
 
 ;; Register numbers.
@@ -3816,6 +3819,14 @@ (define_insn "*expanded_call_value_v32"
   [(set_attr "cc" "clobber")
    (set_attr "slottable" "has_call_slot")])
 
+(define_insn "blockage"
+  [(unspec_volatile [(const_int 0)] CRIS_UNSPECV_BLOCKAGE)]
+  ""
+  ""
+  [(set_attr "cc" "none")
+   (set_attr "length" "0")]
+)
+
 ;; Used in debugging.  No use for the direct pattern; unfilled
 ;; delayed-branches are taken care of by other means.
 
Index: config/mn10300/mn10300.md
===================================================================
--- config/mn10300/mn10300.md	(revision 208241)
+++ config/mn10300/mn10300.md	(working copy)
@@ -42,6 +42,8 @@ (define_constants [
   (UNSPEC_LIW		8)
   ;; This is for the low overhead loop instructions.
   (UNSPEC_SETLB         9)
+
+  (UNSPECV_BLOCKAGE     0)
 ])
 
 (include "predicates.md")
@@ -1678,6 +1680,12 @@ (define_expand "untyped_call"
   DONE;
 })
 
+(define_insn "blockage"
+  [(unspec_volatile [(const_int 0)] UNSPECV_BLOCKAGE)]
+  ""
+  ""
+)
+
 (define_insn "nop"
   [(const_int 0)]
   ""
Index: config/aarch64/aarch64.md
===================================================================
--- config/aarch64/aarch64.md	(revision 208241)
+++ config/aarch64/aarch64.md	(working copy)
@@ -106,6 +106,7 @@ (define_c_enum "unspec" [
 
 (define_c_enum "unspecv" [
     UNSPECV_EH_RETURN		; Represent EH_RETURN
+    UNSPECV_BLOCKAGE		; Represent 'blockage' insn
   ]
 )
 
@@ -286,6 +287,14 @@ (define_insn "casesi_dispatch"
    (set_attr "type" "branch")]
 )
 
+(define_insn "blockage"
+  [(unspec_volatile [(const_int 0)] UNSPECV_BLOCKAGE)]
+  ""
+  ""
+  [(set_attr "length" "0")
+   (set_attr "type" "no_insn")]
+)
+
 (define_insn "nop"
   [(unspec[(const_int 0)] UNSPEC_NOP)]
   ""
Index: config/picochip/picochip.md
===================================================================
--- config/picochip/picochip.md	(revision 208241)
+++ config/picochip/picochip.md	(working copy)
@@ -111,6 +111,9 @@ (define_constants
    ; Internal TSTPORT instruction, used to generate a single TSTPORT
    ; instruction for use in the testport branch split.
    (UNSPEC_INTERNAL_TESTPORT        19)
+
+   ; Blockage instruction
+   (UNSPEC_BLOCKAGE 20)
   ]
 )
 
@@ -926,6 +929,12 @@ (define_expand "movmemhi"
 ;; NOP
 ;;===========================================================================
 
+(define_insn "blockage"
+  [(unspec_volatile [(const_int 0)] UNSPEC_BLOCKAGE)]
+  ""
+  ""
+  [(set_attr "length" "0")])
+
 ;; No-operation (NOP)
 (define_insn "nop"
   [(const_int 0)]
Index: config/arc/arc.md
===================================================================
--- config/arc/arc.md	(revision 208241)
+++ config/arc/arc.md	(working copy)
@@ -124,6 +124,7 @@ (define_constants
    (VUNSPEC_SR 26) ; blockage insn for writing to an auxiliary register
    (VUNSPEC_TRAP_S 27) ; blockage insn for trap_s generation
    (VUNSPEC_UNIMP_S 28) ; blockage insn for unimp_s generation
+   (VUNSPEC_BLOCKAGE 29); blockage insn
 
    (R0_REG 0)
    (R1_REG 1)
@@ -3871,6 +3872,13 @@ (define_insn "call_value_prof"
    (set_attr "predicable" "yes,yes")
    (set_attr "length" "4,8")])
 
+(define_insn "blockage"
+  [(unspec_volatile [(const_int 0)] VUNSPEC_BLOCKAGE)]
+  ""
+  ""
+  [(set_attr "length" "0")
+   (set_attr "type" "misc")])
+
 (define_insn "nop"
   [(const_int 0)]
   ""
Index: config/mcore/mcore.md
===================================================================
--- config/mcore/mcore.md	(revision 208241)
+++ config/mcore/mcore.md	(working copy)
@@ -22,6 +22,14 @@
 
 
 
+(define_c_enum "unspecv" [
+    UNSPECV_BLOCKAGE
+    UNSPECV_CONSTANT
+    UNSPECV_ALIGN
+    UNSPECV_TABLE
+  ]
+)
+
 ;; -------------------------------------------------------------------------
 ;; Attributes
 ;; -------------------------------------------------------------------------
@@ -1583,6 +1591,13 @@ (define_insn "call_value_struct"
 ;; Misc insns
 ;; ------------------------------------------------------------------------
 
+(define_insn "blockage"
+  [(unspec_volatile [(const_int 0)] UNSPECV_BLOCKAGE)]
+  ""
+  ""
+  [(set_attr "length" "0")]
+)
+
 (define_insn "nop"
   [(const_int 0)]
   ""
@@ -2924,7 +2939,7 @@ (define_peephole
 ;;; 4 byte integer in line.
 
 (define_insn "consttable_4"
- [(unspec_volatile [(match_operand:SI 0 "general_operand" "=g")] 0)]
+ [(unspec_volatile [(match_operand:SI 0 "general_operand" "=g")] UNSPECV_CONSTANT)]
  ""
  "*
 {
@@ -2936,14 +2951,14 @@ (define_insn "consttable_4"
 ;;; align to a four byte boundary.
 
 (define_insn "align_4"
- [(unspec_volatile [(const_int 0)] 1)]
+ [(unspec_volatile [(const_int 0)] UNSPECV_ALIGN)]
  ""
  ".align 2")
 
 ;;; Handle extra constant pool entries created during final pass.
 
 (define_insn "consttable_end"
-  [(unspec_volatile [(const_int 0)] 2)]
+  [(unspec_volatile [(const_int 0)] UNSPECV_TABLE)]
   ""
   "* return mcore_output_jump_label_table ();")
 \f
Index: config/score/score.md
===================================================================
--- config/score/score.md	(revision 208241)
+++ config/score/score.md	(working copy)
@@ -68,7 +68,9 @@ (define_constants
     (LCW            8)
     (LCE            9)
 
-    (SFFS           10)])
+    (SFFS           10)
+
+    (BLOCKAGE       11)])
 
 (define_attr "type"
   "unknown,branch,jump,call,load,store,cmp,arith,move,const,nop,mul,div,cndmv,fce,tce,fsr,tsr,fcr,tcr"
@@ -1595,6 +1597,13 @@ (define_insn "return_internal_score7"
   "(TARGET_SCORE7 || TARGET_SCORE7D)"
   "br%S0\t%0")
 
+(define_insn "blockage"
+  [(unspec_volatile [(const_int 0)] BLOCKAGE)]
+  ""
+  ""
+  [(set_attr "length" "0")]
+)
+
 (define_insn "nop"
   [(const_int 0)]
   ""
Index: config/msp430/msp430.md
===================================================================
--- config/msp430/msp430.md	(revision 208241)
+++ config/msp430/msp430.md	(working copy)
@@ -38,7 +38,7 @@ (define_c_enum "unspec"
 
    UNS_GROW_AND_SWAP
    UNS_SWAP_AND_SHRINK
-   
+
    UNS_DINT
    UNS_EINT
    UNS_PUSH_INTR
@@ -47,6 +47,8 @@ (define_c_enum "unspec"
    UNS_BIS_SR
 
    UNS_REFSYM_NEED_EXIT
+
+   UNS_BLOCKAGE
   ])
   
 (include "predicates.md")
@@ -1254,6 +1256,12 @@ (define_insn "*bitbranch<mode>4_z"
 ;;------------------------------------------------------------
 ;; Misc
 
+(define_insn "blockage"
+  [(unspec_volatile [(const_int 0)] UNS_BLOCKAGE)]
+  ""
+  ""
+)
+
 (define_insn "nop"
   [(const_int 0)]
   "1"
Index: config/rl78/rl78.md
===================================================================
--- config/rl78/rl78.md	(revision 208241)
+++ config/rl78/rl78.md	(working copy)
@@ -51,8 +51,15 @@ (define_constants
    (UNS_TRAMPOLINE_UNINIT	21)
    (UNS_NONLOCAL_GOTO		22)
 
+   (UNS_BLOCKAGE 23)
    ])
 
+(define_insn "blockage"
+  [(unspec_volatile [(const_int 0)] UNS_BLOCKAGE)]
+  ""
+  ""
+)
+
 (define_insn "nop"
   [(const_int 0)]
   ""
Index: config/h8300/h8300.md
===================================================================
--- config/h8300/h8300.md	(revision 208241)
+++ config/h8300/h8300.md	(working copy)
@@ -48,7 +48,8 @@
 
 (define_constants
   [(UNSPEC_INCDEC	0)
-   (UNSPEC_MONITOR	1)])
+   (UNSPEC_MONITOR	1)
+   (UNSPEC_BLOCKAGE     2)])
 
 (define_constants
   [(UNSPEC_MOVMD	100)
@@ -2533,6 +2534,14 @@ (define_insn "call_value"
 		      (const_int 2)
 		      (const_int 4)))])
 
+(define_insn "blockage"
+  [(unspec_volatile [(const_int 0)] UNSPEC_BLOCKAGE)]
+  ""
+  ""
+  [(set_attr "cc" "none")
+   (set_attr "length" "0")]
+)
+
 (define_insn "nop"
   [(const_int 0)]
   ""
Index: config/v850/v850.md
===================================================================
--- config/v850/v850.md	(revision 208241)
+++ config/v850/v850.md	(working copy)
@@ -44,6 +44,7 @@ (define_constants
    (CC_REGNUM       		32)         ; Condition code pseudo register
    (FCC_REGNUM      		33)         ; Floating Condition code pseudo register
    (UNSPEC_LOOP                200)         ; loop counter
+   (UNSPECV_BLOCKAGE           201)         ; blockage
   ]
 )
 
@@ -1824,6 +1825,13 @@ (define_insn "call_value_internal_long"
    (set_attr "cc" "clobber,clobber")]
 )
 
+(define_insn "blockage"
+  [(unspec_volatile [(const_int 0)] UNSPECV_BLOCKAGE)]
+  ""
+  ""
+  [(set_attr "length" "0")
+   (set_attr "cc" "none")])
+
 (define_insn "nop"
   [(const_int 0)]
   ""
Index: config/mmix/mmix.md
===================================================================
--- config/mmix/mmix.md	(revision 208241)
+++ config/mmix/mmix.md	(working copy)
@@ -24,11 +24,13 @@
 ;; See file "rtl.def" for documentation on define_insn, match_*, et al.
 
 ;; Uses of UNSPEC in this file:
-;; UNSPEC_VOLATILE:
-;;
-;;	0	sync_icache (sync icache before trampoline jump)
-;;	1	nonlocal_goto_receiver
-;;
+
+(define_c_enum "unspecv" [
+    UNSPECV_BLOCKAGE  ;; blockage
+    UNSPECV_SYNC      ;; sync_icache (sync icache before trampoline jump)
+    UNSPECV_NONLOCAL  ;; nonlocal_goto_receiver
+  ]
+)
 
 ;; The order of insns is as in Node: Standard Names, with smaller modes
 ;; before bigger modes.
@@ -1088,6 +1090,12 @@ (define_expand "epilogue"
   ""
   "mmix_expand_epilogue ();")
 
+(define_insn "blockage"
+  [(unspec_volatile [(const_int 0)] UNSPECV_BLOCKAGE)]
+  ""
+  ""
+)
+
 (define_insn "nop"
   [(const_int 0)]
   ""
@@ -1119,7 +1127,7 @@ (define_insn "tablejump"
 ;; of "pop 0,0" until rO equals the saved value.  (If it goes lower, we
 ;; should die with a trap.)
 (define_expand "nonlocal_goto_receiver"
-  [(parallel [(unspec_volatile [(match_dup 1)] 1)
+  [(parallel [(unspec_volatile [(match_dup 1)] UNSPECV_NONLOCAL)
 	      (clobber (scratch:DI))
 	      (clobber (reg:DI MMIX_rJ_REGNUM))])
    (set (reg:DI MMIX_rJ_REGNUM) (match_dup 0))]
@@ -1146,7 +1154,7 @@ (define_expand "nonlocal_goto_receiver"
 ;; address and re-use them after the register stack unwind, so it's best
 ;; to form the address ourselves.
 (define_insn "*nonlocal_goto_receiver_expanded"
-  [(unspec_volatile [(match_operand:DI 1 "frame_pointer_operand" "Yf")] 1)
+  [(unspec_volatile [(match_operand:DI 1 "frame_pointer_operand" "Yf")] UNSPECV_NONLOCAL)
    (clobber (match_scratch:DI 0 "=&r"))
    (clobber (reg:DI MMIX_rJ_REGNUM))]
   ""
@@ -1233,7 +1241,7 @@ (define_insn "*nxor"
 
 (define_insn "sync_icache"
   [(unspec_volatile [(match_operand:DI 0 "memory_operand" "m")
-		     (match_operand:DI 1 "const_int_operand" "I")] 0)]
+		     (match_operand:DI 1 "const_int_operand" "I")] UNSPECV_SYNC)]
   ""
   "SYNCID %1,%0")
 
Index: config/cr16/cr16.md
===================================================================
--- config/cr16/cr16.md	(revision 208241)
+++ config/cr16/cr16.md	(working copy)
@@ -40,6 +40,12 @@ (define_constants
   ]
 )
 
+;; UNSPEC_VOLATILE usage
+(define_constants
+  [(UNSPECV_BLOCKAGE            0)
+  ]
+)
+
 ;; Attributes
 (define_attr "length" "" (const_int 2))
 
@@ -1053,6 +1059,12 @@ (define_insn "cr16_call_value_insn_jump"
   [(set_attr "length" "2")]
 )
 
+(define_insn "blockage"
+  [(unspec_volatile [(const_int 0)] UNSPECV_BLOCKAGE)]
+  ""
+  ""
+  [(set_attr "length" "0")]
+)
 
 ;;  Nop
 (define_insn "nop"
Index: config/bfin/bfin.md
===================================================================
--- config/bfin/bfin.md	(revision 208241)
+++ config/bfin/bfin.md	(working copy)
@@ -147,7 +147,8 @@ (define_constants
    (UNSPEC_VOLATILE_LOAD_FUNCDESC 3)
    (UNSPEC_VOLATILE_STORE_EH_HANDLER 4)
    (UNSPEC_VOLATILE_DUMMY 5)
-   (UNSPEC_VOLATILE_STALL 6)])
+   (UNSPEC_VOLATILE_STALL 6)
+   (UNSPEC_VOLATILE_BLOCKAGE 7)])
 
 (define_constants
   [(MACFLAG_NONE 0)
@@ -2555,6 +2556,13 @@ (define_expand "cstoresi4"
   DONE;
 })
 
+(define_insn "blockage"
+  [(unspec_volatile [(const_int 0)] UNSPEC_VOLATILE_BLOCKAGE)]
+  ""
+  ""
+  [(set_attr "length" "0")]
+)
+
 (define_insn "nop"
   [(const_int 0)]
   ""


* Re: [RFC] Do not consider volatile asms as optimization barriers #1
  2014-03-01 14:37 [RFC] Do not consider volatile asms as optimization barriers #1 Eric Botcazou
@ 2014-03-01 16:54 ` Richard Sandiford
  2014-03-02 10:43   ` Eric Botcazou
  2014-03-03 21:07 ` Mike Stump
  2014-03-03 22:01 ` Richard Sandiford
  2 siblings, 1 reply; 36+ messages in thread
From: Richard Sandiford @ 2014-03-01 16:54 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches, Yury Gribov

Eric Botcazou <ebotcazou@adacore.com> writes:
> There seems to be a sufficiently large consensus that volatile asms should not 
> be treated as full optimization barriers by the compiler.  This patch is a 
> first step towards implementing this conclusion and aims only at addressing 
> the code quality regressions introduced by
>   http://gcc.gnu.org/viewcvs/gcc?view=revision&revision=193802
> on the 4.8 branch and mainline for volatile asms.
>
> It introduces a new, temporary predicate really_volatile_insn_p and invokes it 
> from the 3 places in cse.c, cselib.c and dse.c which were touched above.  But 
> this comes with a side effect: the "blockage" standard pattern needs to be 
> adjusted to always use an UNSPEC_VOLATILE.  That's already the case for all 
> the architectures that define it, but 21 architectures (aarch64 arc avr bfin 
> cr16 cris epiphany h8300 m32c mcore mmix mn10300 moxie msp430 nds32 picochip 
> rl78 rx score v850 xtensa) don't define it and therefore need to be adjusted.
>
> Tested on x86_64-suse-linux and by building cc1 and compiling a relevant 
> testcase for the 21 aforementioned architectures.

Thanks for doing this.  FWIW I agree it's probably the best stop-gap fix.
But the implication seems to be that unspec_volatile and volatile asms
are volatile in different ways.  IMO they're volatile in the same way
and the problems for volatile asms apply to unspec_volatile too.

E.g. although cse.c will flush the table for unspec_volatile,
it isn't the case that unspec_volatile forces a containing function
to save all call-saved registers.  That would be excessive for a plain
blockage instruction.  So again we seem to be assuming one thing in places
like cse.c and another in the register allocator.  Code that uses the DF
framework will also assume that registers are not implicitly clobbered
by an unspec_volatile:

    case ASM_OPERANDS:
    case UNSPEC_VOLATILE:
    case TRAP_IF:
    case ASM_INPUT:
      {
	/* Traditional and volatile asm instructions must be
	   considered to use and clobber all hard registers, all
	   pseudo-registers and all of memory.  So must TRAP_IF and
	   UNSPEC_VOLATILE operations.

	   Consider for instance a volatile asm that changes the fpu
	   rounding mode.  An insn should not be moved across this
	   even if it only uses pseudo-regs because it might give an
	   incorrectly rounded result.

	   However, flow.c's liveness computation did *not* do this,
	   giving the reasoning as " ?!? Unfortunately, marking all
	   hard registers as live causes massive problems for the
	   register allocator and marking all pseudos as live creates
	   mountains of uninitialized variable warnings."

	   In order to maintain the status quo with regard to liveness
	   and uses, we do what flow.c did and just mark any regs we
	   can find in ASM_OPERANDS as used.  In global asm insns are
	   scanned and regs_asm_clobbered is filled out.

	   For all ASM_OPERANDS, we must traverse the vector of input
	   operands.  We can not just fall through here since then we
	   would be confused by the ASM_INPUT rtx inside ASM_OPERANDS,
	   which do not indicate traditional asms unlike their normal
	   usage.  */
	if (code == ASM_OPERANDS)
	  {
	    int j;

	    for (j = 0; j < ASM_OPERANDS_INPUT_LENGTH (x); j++)
	      df_uses_record (collection_rec, &ASM_OPERANDS_INPUT (x, j),
			      DF_REF_REG_USE, bb, insn_info, flags);
	    return;
	  }
	break;
      }

Also, ira-lives.c (which tracks the liveness of both pseudo and hard
registers) doesn't mention "volatile" at all.

So most passes assume that no pseudos or hard registers will be
implicitly clobbered by unspec_volatile, just like for a volatile asm.
And IMO that's right.  I think the rule should be the same for volatile
asms and unspec_volatiles, and the same for registers as it already is for
memory: if the instruction clobbers something, it should say so explicitly.
Volatile itself should:

(a) prevent deletion or duplication of the operation
(b) prevent reordering wrt other volatiles
(c) prevent the operation from being considered equivalent to any other
    operation (even if it's structurally identical and has the same inputs)

but nothing beyond that.

So in terms of where we go from here, I'd hope we'd add something like
fake hard registers to track any processor mode of interest, such as FP
rounding mode once that is modelled properly.  Then anything that relies
on a specific mode would use that register and anything that changes the
mode would set the register, meaning we have a proper dependency chain.
I think we should do the same for whatever unspec_volatiles caused the
RTL CSE & co. to be so conservative.  Trying to leave it implicit seems
too error-prone, because then you have to make sure that every rtl pass
agrees on what the implicit behaviour is.

Thanks,
Richard


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC] Do not consider volatile asms as optimization barriers #1
  2014-03-01 16:54 ` Richard Sandiford
@ 2014-03-02 10:43   ` Eric Botcazou
  2014-03-03  8:01     ` Richard Sandiford
  0 siblings, 1 reply; 36+ messages in thread
From: Eric Botcazou @ 2014-03-02 10:43 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: gcc-patches, Yury Gribov

> Thanks for doing this.  FWIW I agree it's probably the best stop-gap fix.
> But the implication seems to be that unspec_volatile and volatile asms
> are volatile in different ways.  IMO they're volatile in the same way
> and the problems for volatile asms apply to unspec_volatile too.

I disagree: we need a simple way for the RTL middle-end as well as the back-
ends to block most optimizations across a specific point (e.g. a non-local 
label as in HP's fix), and UNSPEC_VOLATILE is the best candidate, at least in 
the short term.

> E.g. although cse.c will flush the table for unspec_volatile,
> it isn't the case that unspec_volatile forces a containing function
> to save all call-saved registers.  That would be excessive for a plain
> blockage instruction.  So again we seem to be assuming one thing in places
> like cse.c and another in the register allocator.  Code that uses the DF
> framework will also assume that registers are not implicitly clobbered
> by an unspec_volatile:
> [...]
> Also, ira-lives.c (which tracks the liveness of both pseudo and hard
> registers) doesn't mention "volatile" at all.

Yes, the definition of a blockage instruction is somewhat vague and I agree 
that it shouldn't cause registers to be spilled.  But it needs to block most, 
if not all, optimizations.

> So most passes assume that no pseudos or hard registers will be
> implicitly clobbered by unspec_volatile, just like for a volatile asm.
> And IMO that's right.  I think the rule should be the same for volatile
> asms and unspec_volatiles, and the same for registers as it already is for
> memory: if the instruction clobbers something, it should say so explicitly.

IMO that would buy us nothing and, on the contrary, would add complexity where 
there currently is none.  We really need a simple blockage instruction.
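For reference, the minimal blockage pattern being discussed looks roughly like this in a port's machine description (a sketch only; the UNSPECV_BLOCKAGE code name is a placeholder -- each port defines its own unspec_volatile constant):

```
;; A scheduling/optimization barrier: an UNSPEC_VOLATILE that sets
;; nothing and emits no code, so its only effect is being "volatile".
(define_insn "blockage"
  [(unspec_volatile [(const_int 0)] UNSPECV_BLOCKAGE)]
  ""
  ""
  [(set_attr "length" "0")])
```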

> Volatile itself should:
> 
> (a) prevent deletion or duplication of the operation
> (b) prevent reordering wrt other volatiles
> (c) prevent the operation from being considered equivalent to any other
>     operation (even if it's structurally identical and has the same inputs)
> 
> but nothing beyond that.

Maybe UNSPEC_VOLATILE is a misnomer then and we should allow volatile UNSPECs 
along the above lines.

-- 
Eric Botcazou


* Re: [RFC] Do not consider volatile asms as optimization barriers #1
  2014-03-02 10:43   ` Eric Botcazou
@ 2014-03-03  8:01     ` Richard Sandiford
  2014-03-03 10:16       ` Richard Biener
  2014-03-03 11:19       ` Eric Botcazou
  0 siblings, 2 replies; 36+ messages in thread
From: Richard Sandiford @ 2014-03-03  8:01 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches, Yury Gribov

Eric Botcazou <ebotcazou@adacore.com> writes:
>> Thanks for doing this.  FWIW I agree it's probably the best stop-gap fix.
>> But the implication seems to be that unspec_volatile and volatile asms
>> are volatile in different ways.  IMO they're volatile in the same way
>> and the problems for volatile asms apply to unspec_volatile too.
>
> I disagree, we need a simple way for the RTL middle-end as well as the back-
> ends to block most optimizations across a specific point (e.g. a non-local 
> label as in HP's fix) and UNSPEC_VOLATILE is the best candidate, at least in 
> the short term.

I don't agree it's the best candidate if...

>> E.g. although cse.c will flush the table for unspec_volatile,
>> it isn't the case that unspec_volatile forces a containing function
>> to save all call-saved registers.  That would be excessive for a plain
>> blockage instruction.  So again we seem to be assuming one thing in places
>> like cse.c and another in the register allocator.  Code that uses the DF
>> framework will also assume that registers are not implicitly clobbered
>> by an unspec_volatile:
>> [...]
>> Also, ira-lives.c (which tracks the liveness of both pseudo and hard
>> registers) doesn't mention "volatile" at all.
>
> Yes, the definition of a blockage instruction is somewhat vague and I agree 
> that it shouldn't cause registers to be spilled.  But it needs to block most, 
> if not all, optimizations.

...it's so loosely defined.  If non-local labels are the specific problem,
I think it'd be better to limit the flush to that.

I'm back to throwing examples around, sorry, but take the MIPS testcase:

    volatile int x = 1;

    void foo (void)
    {
      x = 1;
      __builtin_mips_set_fcsr (0);
      x = 2;
    }

where __builtin_mips_set_fcsr is a handy way of getting unspec_volatile.
(I'm not interested in what the function does here.)  Even at -O2,
the cse.c code successfully prevents %hi(x) from being shared,
as you'd expect:

        li      $3,1                    # 0x1
        lui     $2,%hi(x)
        sw      $3,%lo(x)($2)
        move    $2,$0
        ctc1    $2,$31
        li      $3,2                    # 0x2
        lui     $2,%hi(x)
        sw      $3,%lo(x)($2)
        j       $31
        nop

But put it in a loop:

    void frob (void)
    {
      for (;;)
        {
          x = 1;
          __builtin_mips_set_fcsr (0);
          x = 2;
        }
    }

and we get the rather bizarre code:

        lui     $2,%hi(x)
        li      $6,1                    # 0x1
        move    $5,$0
        move    $4,$2
        li      $3,2                    # 0x2
        .align  3
.L3:
        sw      $6,%lo(x)($2)
        ctc1    $5,$31
        sw      $3,%lo(x)($4)
        j       .L3
        lui     $2,%hi(x)

Here the _second_ %hi(x), the 1 and the 2 have been hoisted but the first
%hi(x) is reloaded each time.  So what's the correct behaviour here?
Should the hoisting of the second %hi(x) have been disabled because the
loop contains an unspec_volatile?  What about the 1 (from the first store)
and the 2?

If instead it was:

   for (i = 0; i < 100; i++)

would converting to a hardware do-loop be acceptable?

>> So most passes assume that no pseudos or hard registers will be
>> implicitly clobbered by unspec_volatile, just like for a volatile asm.
>> And IMO that's right.  I think the rule should be the same for volatile
>> asms and unspec_volatiles, and the same for registers as it already is for
>> memory: if the instruction clobbers something, it should say so explicitly.
>
> IMO that would buy us nothing and, on the contrary, would add complexity where 
> there currently isn't.  We really need a simple blockage instruction.
>
>> Volatile itself should:
>> 
>> (a) prevent deletion or duplication of the operation
>> (b) prevent reordering wrt other volatiles
>> (c) prevent the operation from being considered equivalent to any other
>>     operation (even if it's structurally identical and has the same inputs)
>> 
>> but nothing beyond that.
>
> Maybe UNSPEC_VOLATILE is a misnomer then and we should allow volatile UNSPECs 
> along the above lines.

That'd be fine with me, especially since with the patch it sounds like
using volatile asm would produce better code than a built-in function
that expands to an unspec_volatile, whereas IMO the opposite should
always be true.

But I still think we need a similar list for what unspec_volatile
means now, if we decide to keep something with the current meaning.

Thanks,
Richard


* Re: [RFC] Do not consider volatile asms as optimization barriers #1
  2014-03-03  8:01     ` Richard Sandiford
@ 2014-03-03 10:16       ` Richard Biener
  2014-03-03 11:05         ` Eric Botcazou
  2014-03-03 11:19       ` Eric Botcazou
  1 sibling, 1 reply; 36+ messages in thread
From: Richard Biener @ 2014-03-03 10:16 UTC (permalink / raw)
  To: Eric Botcazou, GCC Patches, Yury Gribov, Richard Sandiford

On Mon, Mar 3, 2014 at 9:01 AM, Richard Sandiford
<rdsandiford@googlemail.com> wrote:
> Eric Botcazou <ebotcazou@adacore.com> writes:
>>> Thanks for doing this.  FWIW I agree it's probably the best stop-gap fix.
>>> But the implication seems to be that unspec_volatile and volatile asms
>>> are volatile in different ways.  IMO they're volatile in the same way
>>> and the problems for volatile asms apply to unspec_volatile too.
>>
>> I disagree, we need a simple way for the RTL middle-end as well as the back-
>> ends to block most optimizations across a specific point (e.g. a non-local
>> label as in HP's fix) and UNSPEC_VOLATILE is the best candidate, at least in
>> the short term.
>
> I don't agree it's the best candidate if...
>
>>> E.g. although cse.c will flush the table for unspec_volatile,
>>> it isn't the case that unspec_volatile forces a containing function
>>> to save all call-saved registers.  That would be excessive for a plain
>>> blockage instruction.  So again we seem to be assuming one thing in places
>>> like cse.c and another in the register allocator.  Code that uses the DF
>>> framework will also assume that registers are not implicitly clobbered
>>> by an unspec_volatile:
>>> [...]
>>> Also, ira-lives.c (which tracks the liveness of both pseudo and hard
>>> registers) doesn't mention "volatile" at all.
>>
>> Yes, the definition of a blockage instruction is somewhat vague and I agree
>> that it shouldn't cause registers to be spilled.  But it needs to block most,
>> if not all, optimizations.
>
> ...it's so loosely defined.  If non-local labels are the specific problem,
> I think it'd be better to limit the flush to that.

non-local labels should block most optimizations by the fact that they
are receivers of control flow and thus should have an abnormal
edge coming into them.  If that's not the case (no abnormal edge)
then that's the bug to fix.

Otherwise I agree with Richard.  Please sit down and define _exactly_
what guarantees 'volatile' in an asm provides compared to non-volatile
asms.  Likewise do so for volatile UNSPECs.

A volatile shouldn't be a cheap way out of properly enumerating all
uses, defs and clobbers of a stmt.  If volatile is used to tell the
insn has additional uses/defs or clobbers to those explicitly given
the only reason that may be valid is because we cannot explicitly
enumerate those.  But we should fix that instead (for example with
the special register idea or by adding a middle-end wide "special"
"blockage" that you can use/def/clobber).

To better assess the problem at hand, can you enumerate the cases
where you need that special easy "blockage" instruction?  With
testcases, please?

Note that on GIMPLE even volatiles are not strictly ordered if they
don't have a dependence that orders them (that doesn't mean that
any existing transform deliberately re-orders them, but as shown
with the loop example below such re-ordering can happen
as a side-effect of a valid transform).

> I'm back to throwing examples around, sorry, but take the MIPS testcase:
>
>     volatile int x = 1;
>
>     void foo (void)
>     {
>       x = 1;
>       __builtin_mips_set_fcsr (0);
>       x = 2;
>     }
>
> where __builtin_mips_set_fcsr is a handy way of getting unspec_volatile.
> (I'm not interested in what the function does here.)  Even at -O2,
> the cse.c code successfully prevents %hi(x) from being shared,
> as you'd expect:
>
>         li      $3,1                    # 0x1
>         lui     $2,%hi(x)
>         sw      $3,%lo(x)($2)
>         move    $2,$0
>         ctc1    $2,$31
>         li      $3,2                    # 0x2
>         lui     $2,%hi(x)
>         sw      $3,%lo(x)($2)
>         j       $31
>         nop
>
> But put it in a loop:
>
>     void frob (void)
>     {
>       for (;;)
>         {
>           x = 1;
>           __builtin_mips_set_fcsr (0);
>           x = 2;
>         }
>     }
>
> and we get the rather bizarre code:
>
>         lui     $2,%hi(x)
>         li      $6,1                    # 0x1
>         move    $5,$0
>         move    $4,$2
>         li      $3,2                    # 0x2
>         .align  3
> .L3:
>         sw      $6,%lo(x)($2)
>         ctc1    $5,$31
>         sw      $3,%lo(x)($4)
>         j       .L3
>         lui     $2,%hi(x)
>
> Here the _second_ %hi(x), the 1 and the 2 have been hoisted but the first
> %hi(x) is reloaded each time.  So what's the correct behaviour here?
> Should the hoisting of the second %hi(x) have been disabled because the
> loop contains an unspec_volatile?  What about the 1 (from the first store)
> and the 2?
>
> If instead it was:
>
>    for (i = 0; i < 100; i++)
>
> would converting to a hardware do-loop be acceptable?
>
>>> So most passes assume that no pseudos or hard registers will be
>>> implicitly clobbered by unspec_volatile, just like for a volatile asm.
>>> And IMO that's right.  I think the rule should be the same for volatile
>>> asms and unspec_volatiles, and the same for registers as it already is for
>>> memory: if the instruction clobbers something, it should say so explicitly.
>>
>> IMO that would buy us nothing and, on the contrary, would add complexity where
>> there currently isn't.  We really need a simple blockage instruction.
>>
>>> Volatile itself should:
>>>
>>> (a) prevent deletion or duplication of the operation
>>> (b) prevent reordering wrt other volatiles
>>> (c) prevent the operation from being considered equivalent to any other
>>>     operation (even if it's structurally identical and has the same inputs)
>>>
>>> but nothing beyond that.
>>
>> Maybe UNSPEC_VOLATILE is a misnomer then and we should allow volatile UNSPECs
>> along the above lines.
>
> That'd be fine with me, especially since with the patch it sounds like
> using volatile asm would produce better code than a built-in function
> that expands to an unspec_volatile, whereas IMO the opposite should
> always be true.
>
> But I still think we need a similar list for what unspec_volatile
> means now, if we decide to keep something with the current meaning.
>
> Thanks,
> Richard


* Re: [RFC] Do not consider volatile asms as optimization barriers #1
  2014-03-03 10:16       ` Richard Biener
@ 2014-03-03 11:05         ` Eric Botcazou
  2014-03-03 11:14           ` Richard Sandiford
  0 siblings, 1 reply; 36+ messages in thread
From: Eric Botcazou @ 2014-03-03 11:05 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, Yury Gribov, Richard Sandiford

> non-local labels should block most optimizations by the fact they
> are a receiver of control flow and thus should have an abnormal
> edge coming into them.  If that's not the case (no abnormal edge)
> then that's the bug to fix.

It's (of course) more complicated: you need to look at HP's fix and testcase 
to see why we need a full optimization barrier.  See also the prologue and 
epilogue of many architectures which also need a blockage when they are 
establishing or destroying the frame.

> Otherwise I agree with Richard.  Please sit down and _exactly_ define
> what 'volatile' in an asm provides for guarantees compared to non-volatile
> asms.  Likewise do so for volatile UNSPECs.

Too late, we apparently all agree about what volatile asms and future volatile 
UNSPECs mean. :-)  The remaining point is UNSPEC_VOLATILE, but the discussion 
can be deferred until the next stage 1.

> A volatile shouldn't be a cheap way out of properly enumerating all
> uses, defs and clobbers of a stmt.  If volatile is used to tell the
> insn has additional uses/defs or clobbers to those explicitly given
> the only reason that may be valid is because we cannot explicitly
> enumerate those.  But we should fix that instead (for example with
> the special register idea or by adding a middle-end wide "special"
> "blockage" that you can use/def/clobber).

For the time being this special blockage is UNSPEC_VOLATILE for RTL.

-- 
Eric Botcazou


* Re: [RFC] Do not consider volatile asms as optimization barriers #1
  2014-03-03 11:05         ` Eric Botcazou
@ 2014-03-03 11:14           ` Richard Sandiford
  0 siblings, 0 replies; 36+ messages in thread
From: Richard Sandiford @ 2014-03-03 11:14 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: Richard Biener, gcc-patches, Yury Gribov

Eric Botcazou <ebotcazou@adacore.com> writes:
>> non-local labels should block most optimizations by the fact they
>> are a receiver of control flow and thus should have an abnormal
>> edge coming into them.  If that's not the case (no abnormal edge)
>> then that's the bug to fix.
>
> It's (of course) more complicated, you need to look at HP's fix and testcase 
> to see why we need a full optimization barrier.  See also the prologue and 
> epilogue of many architectures which also need a blockage when they are 
> establishing or destroying the frame.

But the prologue/epilogue case often doesn't need to be a full blockage.
We could move a load-immediate instruction -- or even an access to
known-global memory -- before the allocation or after the deallocation.  This can
actually be important on architectures that use load-multiple to restore the
return register and want the prefetcher to see the target address as early
as possible.

So I think the prologue and epilogue is one case where we really do want
to spell out what's clobbered by the allocation and deallocation.

Thanks,
Richard


* Re: [RFC] Do not consider volatile asms as optimization barriers #1
  2014-03-03  8:01     ` Richard Sandiford
  2014-03-03 10:16       ` Richard Biener
@ 2014-03-03 11:19       ` Eric Botcazou
  2014-03-03 13:12         ` Richard Sandiford
  1 sibling, 1 reply; 36+ messages in thread
From: Eric Botcazou @ 2014-03-03 11:19 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: gcc-patches, Yury Gribov

> ...it's so loosely defined.  If non-local labels are the specific problem,
> I think it'd be better to limit the flush to that.

No, it said "e.g.", so non-local labels are not the only problem.

> I'm back to throwing examples around, sorry, but take the MIPS testcase:
> 
>     volatile int x = 1;
> 
>     void foo (void)
>     {
>       x = 1;
>       __builtin_mips_set_fcsr (0);
>       x = 2;
>     }
> 
> where __builtin_mips_set_fcsr is a handy way of getting unspec_volatile.
> (I'm not interested in what the function does here.)  Even at -O2,
> the cse.c code successfully prevents %hi(x) from being shared,
> as you'd expect:
> 
>         li      $3,1                    # 0x1
>         lui     $2,%hi(x)
>         sw      $3,%lo(x)($2)
>         move    $2,$0
>         ctc1    $2,$31
>         li      $3,2                    # 0x2
>         lui     $2,%hi(x)
>         sw      $3,%lo(x)($2)
>         j       $31
>         nop
> 
> But put it in a loop:
> 
>     void frob (void)
>     {
>       for (;;)
>         {
>           x = 1;
>           __builtin_mips_set_fcsr (0);
>           x = 2;
>         }
>     }
> 
> and we get the rather bizarre code:
> 
>         lui     $2,%hi(x)
>         li      $6,1                    # 0x1
>         move    $5,$0
>         move    $4,$2
>         li      $3,2                    # 0x2
>         .align  3
> .L3:
>         sw      $6,%lo(x)($2)
>         ctc1    $5,$31
>         sw      $3,%lo(x)($4)
>         j       .L3
>         lui     $2,%hi(x)
> 
> Here the _second_ %hi(x), the 1 and the 2 have been hoisted but the first
> %hi(x) is reloaded each time.  So what's the correct behaviour here?
> Should the hoisting of the second %hi(x) have been disabled because the
> loop contains an unspec_volatile?  What about the 1 (from the first store)
> and the 2?

Well, I personally wouldn't spend much time on the code generated in a loop 
containing an UNSPEC_VOLATILE.  If an instruction or a builtin is supposed to 
be performance-sensitive, then by all means don't use an UNSPEC_VOLATILE and 
properly model it instead!

-- 
Eric Botcazou


* Re: [RFC] Do not consider volatile asms as optimization barriers #1
  2014-03-03 11:19       ` Eric Botcazou
@ 2014-03-03 13:12         ` Richard Sandiford
  2014-03-03 22:58           ` Eric Botcazou
  0 siblings, 1 reply; 36+ messages in thread
From: Richard Sandiford @ 2014-03-03 13:12 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches, Yury Gribov

Eric Botcazou <ebotcazou@adacore.com> writes:
>>> ...it's so loosely defined.  If non-local labels are the specific problem,
>> I think it'd be better to limit the flush to that.
>
> No, there was "e.g." written so non-local labels are not the only problem.

What are the others though?  As discussed in the other subthread,
I don't think prologue and epilogue barriers are quite the same.

>> I'm back to throwing examples around, sorry, but take the MIPS testcase:
>> 
>>     volatile int x = 1;
>> 
>>     void foo (void)
>>     {
>>       x = 1;
>>       __builtin_mips_set_fcsr (0);
>>       x = 2;
>>     }
>> 
>> where __builtin_mips_set_fcsr is a handy way of getting unspec_volatile.
>> (I'm not interested in what the function does here.)  Even at -O2,
>> the cse.c code successfully prevents %hi(x) from being shared,
>> as you'd expect:
>> 
>>         li      $3,1                    # 0x1
>>         lui     $2,%hi(x)
>>         sw      $3,%lo(x)($2)
>>         move    $2,$0
>>         ctc1    $2,$31
>>         li      $3,2                    # 0x2
>>         lui     $2,%hi(x)
>>         sw      $3,%lo(x)($2)
>>         j       $31
>>         nop
>> 
>> But put it in a loop:
>> 
>>     void frob (void)
>>     {
>>       for (;;)
>>         {
>>           x = 1;
>>           __builtin_mips_set_fcsr (0);
>>           x = 2;
>>         }
>>     }
>> 
>> and we get the rather bizarre code:
>> 
>>         lui     $2,%hi(x)
>>         li      $6,1                    # 0x1
>>         move    $5,$0
>>         move    $4,$2
>>         li      $3,2                    # 0x2
>>         .align  3
>> .L3:
>>         sw      $6,%lo(x)($2)
>>         ctc1    $5,$31
>>         sw      $3,%lo(x)($4)
>>         j       .L3
>>         lui     $2,%hi(x)
>> 
>> Here the _second_ %hi(x), the 1 and the 2 have been hoisted but the first
>> %hi(x) is reloaded each time.  So what's the correct behaviour here?
>> Should the hoisting of the second %hi(x) have been disabled because the
>> loop contains an unspec_volatile?  What about the 1 (from the first store)
>> and the 2?
>
> Well, I personally wouldn't spend much time on the code generated in a loop 
> containing an UNSPEC_VOLATILE.  If an instruction or a builtin is supposed to 
> be performance-sensitive, then by all means don't use an UNSPEC_VOLATILE and 
> properly model it instead!

That doesn't really answer the question though.  What's the correct
behaviour for an unspec volatile in a loop?  I don't think it's what
we did in the example above, since it doesn't seem self-consistent.
And "not spending too much time" is again a bit vague in terms of
saying what's right and what's wrong.

My point is that if the construct is well-defined enough to handle the
important things we want it to handle, the answer should be known to somebody,
even if it isn't to me. :-)

Thanks,
Richard


* Re: [RFC] Do not consider volatile asms as optimization barriers #1
  2014-03-01 14:37 [RFC] Do not consider volatile asms as optimization barriers #1 Eric Botcazou
  2014-03-01 16:54 ` Richard Sandiford
@ 2014-03-03 21:07 ` Mike Stump
  2014-03-03 22:01 ` Richard Sandiford
  2 siblings, 0 replies; 36+ messages in thread
From: Mike Stump @ 2014-03-03 21:07 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches, Yury Gribov

On Mar 1, 2014, at 6:36 AM, Eric Botcazou <ebotcazou@adacore.com> wrote:
> It introduces a new, temporary predicate really_volatile_insn_p

"really" is a really horrible name.  Hint: if a CS domain-specific Wikipedia page described what you were doing, what would that page be called?  "really" is unlikely to be it.


* Re: [RFC] Do not consider volatile asms as optimization barriers #1
  2014-03-01 14:37 [RFC] Do not consider volatile asms as optimization barriers #1 Eric Botcazou
  2014-03-01 16:54 ` Richard Sandiford
  2014-03-03 21:07 ` Mike Stump
@ 2014-03-03 22:01 ` Richard Sandiford
  2014-03-04  7:27   ` Richard Sandiford
                     ` (3 more replies)
  2 siblings, 4 replies; 36+ messages in thread
From: Richard Sandiford @ 2014-03-03 22:01 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches, Yury Gribov

AIUI:

(1) The main point of http://gcc.gnu.org/ml/gcc-patches/2012-10/msg01172.html
    was that we had:

      emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx);
      /* This might change the hard frame pointer in ways that aren't
	 apparent to early optimization passes, so force a clobber.  */
      emit_clobber (hard_frame_pointer_rtx);

    and, after elimination, the clobber killed the assignment to the
    hard frame pointer.  Which it could. :-)

(2) Assuming for simplicity that STARTING_FRAME_OFFSET == 0, we end up
    with an assignment like:

      (set frame_pointer_rtx hard_frame_pointer_rtx)

    And the problem is that frame_pointer_rtx often isn't a real register:
    it gets eliminated to hard_frame_pointer_rtx+X, where X isn't known
    until RA time.  I.e. the assignment is really:

      (set (plus hard_frame_pointer_rtx X) hard_frame_pointer_rtx)

    which becomes:

      (set hard_frame_pointer_rtx (plus hard_frame_pointer_rtx -X))

    Before RA it isn't obvious that the assignment clobbers
    hard_frame_pointer_rtx, so it would seem to most passes that
    frame_pointer_rtx and hard_frame_pointer_rtx are equivalent
    after the set.  That was why the clobber was there.

(3) We chose to fix this by deleting the explicit clobber and relying on
    the magicness of unspec_volatile to imply the clobber instead.

But I don't think (3) is a good idea.  We should instead fix the dataflow
so that it's accurate.

Long term it would be good to have a different representation for (2),
but I don't have any bright ideas what that might be.  Until then I think
we can model the dataflow accurately by having a use as well as a clobber
of hard_frame_pointer_rtx.  I went back to r192682:

  http://gcc.gnu.org/ml/gcc-testresults/2012-10/msg02350.html

and saw the -m32 tests fail without the builtins.c part of the patch.
They passed after it.  I then went to r200643:

2013-07-03  Hans-Peter Nilsson  <hp@bitrange.com>

	PR middle-end/55030
	* stmt.c (expand_nl_goto_receiver): Remove almost-copy of
	expand_builtin_setjmp_receiver.
	(expand_label): Adjust, call expand_builtin_setjmp_receiver
	with NULL for the label parameter.
	* builtins.c (expand_builtin_setjmp_receiver): Don't clobber
	the frame-pointer.  Adjust comments.
	[HAVE_builtin_setjmp_receiver]: Emit builtin_setjmp_receiver
	only if LABEL is non-NULL.

and applied the full patch.  The tests still passed, despite the removal
of the volatile checks.  (To recap, the volatile checks touched here
were the same ones touched by:

  http://gcc.gnu.org/ml/gcc-patches/2012-11/msg00868.html

Rather than reverting that part I'm removing the check entirely,
since we seem to agree that the original asm handling wasn't necessary.)

I'll run a full test overnight, but does this look like it might be
a way out, at least for 4.9?

Thanks,
Richard


gcc/
	* builtins.c (expand_builtin_setjmp_receiver): Use and clobber
	hard_frame_pointer_rtx.
	* cse.c (cse_insn): Remove volatile check.
	* cselib.c (cselib_process_insn): Likewise.
	* dse.c (scan_insn): Likewise.

Index: gcc/builtins.c
===================================================================
--- gcc/builtins.c	2014-03-03 21:47:59.749026019 +0000
+++ gcc/builtins.c	2014-03-03 21:48:00.550030853 +0000
@@ -910,18 +910,27 @@ expand_builtin_setjmp_receiver (rtx rece
 #ifdef HAVE_nonlocal_goto
   if (! HAVE_nonlocal_goto)
 #endif
-    /* First adjust our frame pointer to its actual value.  It was
-       previously set to the start of the virtual area corresponding to
-       the stacked variables when we branched here and now needs to be
-       adjusted to the actual hardware fp value.
-
-       Assignments to virtual registers are converted by
-       instantiate_virtual_regs into the corresponding assignment
-       to the underlying register (fp in this case) that makes
-       the original assignment true.
-       So the following insn will actually be decrementing fp by
-       STARTING_FRAME_OFFSET.  */
-    emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx);
+    {
+      /* First adjust our frame pointer to its actual value.  It was
+	 previously set to the start of the virtual area corresponding to
+	 the stacked variables when we branched here and now needs to be
+	 adjusted to the actual hardware fp value.
+
+	 Assignments to virtual registers are converted by
+	 instantiate_virtual_regs into the corresponding assignment
+	 to the underlying register (fp in this case) that makes
+	 the original assignment true.
+	 So the following insn will actually be decrementing fp by
+	 STARTING_FRAME_OFFSET.  */
+      emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx);
+
+      /* Restoring the frame pointer also modifies the hard frame pointer.
+	 Mark it used (so that the previous assignment remains live once
+	 the frame pointer is eliminated) and clobbered (to represent the
+	 implicit update from the assignment).  */
+      emit_use (hard_frame_pointer_rtx);
+      emit_clobber (hard_frame_pointer_rtx);
+    }
 
 #if !HARD_FRAME_POINTER_IS_ARG_POINTER
   if (fixed_regs[ARG_POINTER_REGNUM])
@@ -965,8 +974,7 @@ expand_builtin_setjmp_receiver (rtx rece
 
   /* We must not allow the code we just generated to be reordered by
      scheduling.  Specifically, the update of the frame pointer must
-     happen immediately, not later.  Similarly, we must block
-     (frame-related) register values to be used across this code.  */
+     happen immediately, not later.  */
   emit_insn (gen_blockage ());
 }
 
Index: gcc/cse.c
===================================================================
--- gcc/cse.c	2014-03-03 21:47:59.869026741 +0000
+++ gcc/cse.c	2014-03-03 21:48:00.625031305 +0000
@@ -5664,11 +5664,6 @@ cse_insn (rtx insn)
 	  invalidate (XEXP (dest, 0), GET_MODE (dest));
       }
 
-  /* A volatile ASM or an UNSPEC_VOLATILE invalidates everything.  */
-  if (NONJUMP_INSN_P (insn)
-      && volatile_insn_p (PATTERN (insn)))
-    flush_hash_table ();
-
   /* Don't cse over a call to setjmp; on some machines (eg VAX)
      the regs restored by the longjmp come from a later time
      than the setjmp.  */
Index: gcc/cselib.c
===================================================================
--- gcc/cselib.c	2014-03-03 21:47:59.870026748 +0000
+++ gcc/cselib.c	2014-03-03 21:48:00.626031311 +0000
@@ -2629,9 +2629,7 @@ cselib_process_insn (rtx insn)
   /* Forget everything at a CODE_LABEL, a volatile insn, or a setjmp.  */
   if ((LABEL_P (insn)
        || (CALL_P (insn)
-	   && find_reg_note (insn, REG_SETJMP, NULL))
-       || (NONJUMP_INSN_P (insn)
-	   && volatile_insn_p (PATTERN (insn))))
+	   && find_reg_note (insn, REG_SETJMP, NULL)))
       && !cselib_preserve_constants)
     {
       cselib_reset_table (next_uid);
Index: gcc/dse.c
===================================================================
--- gcc/dse.c	2014-03-03 21:47:59.871026754 +0000
+++ gcc/dse.c	2014-03-03 21:48:00.627031317 +0000
@@ -2470,16 +2470,6 @@ scan_insn (bb_info_t bb_info, rtx insn)
       return;
     }
 
-  /* Cselib clears the table for this case, so we have to essentially
-     do the same.  */
-  if (NONJUMP_INSN_P (insn)
-      && volatile_insn_p (PATTERN (insn)))
-    {
-      add_wild_read (bb_info);
-      insn_info->cannot_delete = true;
-      return;
-    }
-
   /* Look at all of the uses in the insn.  */
   note_uses (&PATTERN (insn), check_mem_read_use, bb_info);
 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC] Do not consider volatile asms as optimization barriers #1
  2014-03-03 13:12         ` Richard Sandiford
@ 2014-03-03 22:58           ` Eric Botcazou
  2014-03-04  7:30             ` Richard Sandiford
  0 siblings, 1 reply; 36+ messages in thread
From: Eric Botcazou @ 2014-03-03 22:58 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: gcc-patches, Yury Gribov

> That doesn't really answer the question though.  What's the correct
> behaviour for an unspec volatile in a loop?  I don't think it's what
> we did in the example above, since it doesn't seem self-consistent.
> And "not spending too much time" is again a bit vague in terms of
> saying what's right and what's wrong.

"not spending too much time" is a polite way to say I don't really care. :-)
If you do, feel free to post a formal definition, an implementation plan and 
maybe a patch at some point.

-- 
Eric Botcazou


* Re: [RFC] Do not consider volatile asms as optimization barriers #1
  2014-03-03 22:01 ` Richard Sandiford
@ 2014-03-04  7:27   ` Richard Sandiford
  2014-03-04  9:21   ` Richard Biener
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 36+ messages in thread
From: Richard Sandiford @ 2014-03-04  7:27 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches, Yury Gribov

Richard Sandiford <rdsandiford@googlemail.com> writes:
> I'll run a full test overnight, but does this look like it might be
> a way out, at least for 4.9?

FWIW, it passed testing on x86_64-linux-gnu ({,-m32}, all,ada).
Here it is again with an updated cselib.c comment.  OK to install?

Thanks,
Richard


gcc/
	* builtins.c (expand_builtin_setjmp_receiver): Use and clobber
	hard_frame_pointer_rtx.
	* cse.c (cse_insn): Remove volatile check.
	* cselib.c (cselib_process_insn): Likewise.
	* dse.c (scan_insn): Likewise.

Index: gcc/builtins.c
===================================================================
--- gcc/builtins.c	2014-03-03 21:47:59.749026019 +0000
+++ gcc/builtins.c	2014-03-03 21:48:00.550030853 +0000
@@ -910,18 +910,27 @@ expand_builtin_setjmp_receiver (rtx rece
 #ifdef HAVE_nonlocal_goto
   if (! HAVE_nonlocal_goto)
 #endif
-    /* First adjust our frame pointer to its actual value.  It was
-       previously set to the start of the virtual area corresponding to
-       the stacked variables when we branched here and now needs to be
-       adjusted to the actual hardware fp value.
-
-       Assignments to virtual registers are converted by
-       instantiate_virtual_regs into the corresponding assignment
-       to the underlying register (fp in this case) that makes
-       the original assignment true.
-       So the following insn will actually be decrementing fp by
-       STARTING_FRAME_OFFSET.  */
-    emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx);
+    {
+      /* First adjust our frame pointer to its actual value.  It was
+	 previously set to the start of the virtual area corresponding to
+	 the stacked variables when we branched here and now needs to be
+	 adjusted to the actual hardware fp value.
+
+	 Assignments to virtual registers are converted by
+	 instantiate_virtual_regs into the corresponding assignment
+	 to the underlying register (fp in this case) that makes
+	 the original assignment true.
+	 So the following insn will actually be decrementing fp by
+	 STARTING_FRAME_OFFSET.  */
+      emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx);
+
+      /* Restoring the frame pointer also modifies the hard frame pointer.
+	 Mark it used (so that the previous assignment remains live once
+	 the frame pointer is eliminated) and clobbered (to represent the
+	 implicit update from the assignment).  */
+      emit_use (hard_frame_pointer_rtx);
+      emit_clobber (hard_frame_pointer_rtx);
+    }
 
 #if !HARD_FRAME_POINTER_IS_ARG_POINTER
   if (fixed_regs[ARG_POINTER_REGNUM])
@@ -965,8 +974,7 @@ expand_builtin_setjmp_receiver (rtx rece
 
   /* We must not allow the code we just generated to be reordered by
      scheduling.  Specifically, the update of the frame pointer must
-     happen immediately, not later.  Similarly, we must block
-     (frame-related) register values to be used across this code.  */
+     happen immediately, not later.  */
   emit_insn (gen_blockage ());
 }
 
Index: gcc/cse.c
===================================================================
--- gcc/cse.c	2014-03-03 21:47:59.869026741 +0000
+++ gcc/cse.c	2014-03-03 21:48:00.625031305 +0000
@@ -5664,11 +5664,6 @@ cse_insn (rtx insn)
 	  invalidate (XEXP (dest, 0), GET_MODE (dest));
       }
 
-  /* A volatile ASM or an UNSPEC_VOLATILE invalidates everything.  */
-  if (NONJUMP_INSN_P (insn)
-      && volatile_insn_p (PATTERN (insn)))
-    flush_hash_table ();
-
   /* Don't cse over a call to setjmp; on some machines (eg VAX)
      the regs restored by the longjmp come from a later time
      than the setjmp.  */
Index: gcc/cselib.c
===================================================================
--- gcc/cselib.c	2014-03-03 21:47:59.870026748 +0000
+++ gcc/cselib.c	2014-03-03 22:09:24.211994918 +0000
@@ -2626,12 +2626,10 @@ cselib_process_insn (rtx insn)
 
   cselib_current_insn = insn;
 
-  /* Forget everything at a CODE_LABEL, a volatile insn, or a setjmp.  */
+  /* Forget everything at a CODE_LABEL or a setjmp.  */
   if ((LABEL_P (insn)
        || (CALL_P (insn)
-	   && find_reg_note (insn, REG_SETJMP, NULL))
-       || (NONJUMP_INSN_P (insn)
-	   && volatile_insn_p (PATTERN (insn))))
+	   && find_reg_note (insn, REG_SETJMP, NULL)))
       && !cselib_preserve_constants)
     {
       cselib_reset_table (next_uid);
Index: gcc/dse.c
===================================================================
--- gcc/dse.c	2014-03-03 21:47:59.871026754 +0000
+++ gcc/dse.c	2014-03-03 21:48:00.627031317 +0000
@@ -2470,16 +2470,6 @@ scan_insn (bb_info_t bb_info, rtx insn)
       return;
     }
 
-  /* Cselib clears the table for this case, so we have to essentially
-     do the same.  */
-  if (NONJUMP_INSN_P (insn)
-      && volatile_insn_p (PATTERN (insn)))
-    {
-      add_wild_read (bb_info);
-      insn_info->cannot_delete = true;
-      return;
-    }
-
   /* Look at all of the uses in the insn.  */
   note_uses (&PATTERN (insn), check_mem_read_use, bb_info);
 


* Re: [RFC] Do not consider volatile asms as optimization barriers #1
  2014-03-03 22:58           ` Eric Botcazou
@ 2014-03-04  7:30             ` Richard Sandiford
  0 siblings, 0 replies; 36+ messages in thread
From: Richard Sandiford @ 2014-03-04  7:30 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches, Yury Gribov

Eric Botcazou <ebotcazou@adacore.com> writes:
>> That doesn't really answer the question though.  What's the correct
>> behaviour for an unspec volatile in a loop?  I don't think it's what
>> we did in the example above, since it doesn't seem self-consistent.
>> And "not spending too much time" is again a bit vague in terms of
>> saying what's right and what's wrong.
>
> "not spending too much time" is a polite way to say I don't really care. :-)
> If you do, feel free to post a formal definition, an implementation plan and 
> maybe a patch at some point.

Well, I still reckon unspec_volatile and volatile asm should be volatile
in the same way.  There should be no implicit magic clobbers.

Thanks,
Richard


* Re: [RFC] Do not consider volatile asms as optimization barriers #1
  2014-03-03 22:01 ` Richard Sandiford
  2014-03-04  7:27   ` Richard Sandiford
@ 2014-03-04  9:21   ` Richard Biener
  2014-03-04 10:11     ` Richard Sandiford
  2014-03-04 11:42   ` Bernd Schmidt
  2014-03-11  6:42   ` Hans-Peter Nilsson
  3 siblings, 1 reply; 36+ messages in thread
From: Richard Biener @ 2014-03-04  9:21 UTC (permalink / raw)
  To: Eric Botcazou, GCC Patches, Yury Gribov, Richard Sandiford

On Mon, Mar 3, 2014 at 11:01 PM, Richard Sandiford
<rdsandiford@googlemail.com> wrote:
> AIUI:
>
> (1) The main point of http://gcc.gnu.org/ml/gcc-patches/2012-10/msg01172.html
>     was that we had:
>
>       emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx);
>       /* This might change the hard frame pointer in ways that aren't
>          apparent to early optimization passes, so force a clobber.  */
>       emit_clobber (hard_frame_pointer_rtx);
>
>     and, after elimination, the clobber killed the assignment to the
>     hard frame pointer.  Which it could. :-)
>
> (2) Assuming for simplicity that STARTING_FRAME_OFFSET == 0, we end up
>     with an assignment like:
>
>       (set frame_pointer_rtx hard_frame_pointer_rtx)
>
>     And the problem is that frame_pointer_rtx often isn't a real register:
>     it gets eliminated to hard_frame_pointer_rtx+X, where X isn't known
>     until RA time.  I.e. the assignment is really:
>
>       (set (plus hard_frame_pointer_rtx X) hard_frame_pointer_rtx)
>
>     which becomes:
>
>       (set hard_frame_pointer_rtx (plus hard_frame_pointer_rtx -X))
>
>     Before RA it isn't obvious that the assignment clobbers
>     hard_frame_pointer_rtx, so it would seem to most passes that
>     frame_pointer_rtx and hard_frame_pointer_rtx are equivalent
>     after the set.  That was why the clobber was there.
>
> (3) We chose to fix this by deleting the explicit clobber and relying on
>     the magicness of unspec_volatile to imply the clobber instead.
>
> But I don't think (3) is a good idea.  We should instead fix the dataflow
> so that it's accurate.
>
> Long term it would be good to have a different representation for (2),

An UNSPEC that sets frame_pointer_rtx, uses and clobbers hard_frame_pointer_rtx.

In fact the plain move_insn is a lie and the clobber should have been
at least in a parallel with the set of frame_pointer_rtx ...

?

> but I don't have any bright ideas what that might be.  Until then I think
> we can model the dataflow accurately by having a use as well as a clobber
> of hard_frame_pointer_rtx.  I went back to r192682:

Indeed.

>   http://gcc.gnu.org/ml/gcc-testresults/2012-10/msg02350.html
>
> and saw the -m32 tests fail without the builtins.c part of the patch.
> They passed after it.  I then went to r200643:
>
> 2013-07-03  Hans-Peter Nilsson  <hp@bitrange.com>
>
>         PR middle-end/55030
>         * stmt.c (expand_nl_goto_receiver): Remove almost-copy of
>         expand_builtin_setjmp_receiver.
>         (expand_label): Adjust, call expand_builtin_setjmp_receiver
>         with NULL for the label parameter.
>         * builtins.c (expand_builtin_setjmp_receiver): Don't clobber
>         the frame-pointer.  Adjust comments.
>         [HAVE_builtin_setjmp_receiver]: Emit builtin_setjmp_receiver
>         only if LABEL is non-NULL.
>
> and applied the full patch.  The tests still passed, despite the removal
> of the volatile checks.  (To recap, the volatile checks touched here
> were the same ones touched by:
>
>   http://gcc.gnu.org/ml/gcc-patches/2012-11/msg00868.html
>
> Rather than reverting that part I'm removing the check entirely,
> since we seem to agree that the original asm handling wasn't necessary.)
>
> I'll run a full test overnight, but does this look like it might be
> a way out, at least for 4.9?

Shouldn't the use and clobber be part of the "move" and thus wrapped
inside a parallel?  Or am I misunderstanding how parallel works?
What keeps those three insns adjacent otherwise?

Thanks,
Richard.

> Thanks,
> Richard
>
>
> gcc/
>         * builtins.c (expand_builtin_setjmp_receiver): Use and clobber
>         hard_frame_pointer_rtx.
>         * cse.c (cse_insn): Remove volatile check.
>         * cselib.c (cselib_process_insn): Likewise.
>         * dse.c (scan_insn): Likewise.
>
> Index: gcc/builtins.c
> ===================================================================
> --- gcc/builtins.c      2014-03-03 21:47:59.749026019 +0000
> +++ gcc/builtins.c      2014-03-03 21:48:00.550030853 +0000
> @@ -910,18 +910,27 @@ expand_builtin_setjmp_receiver (rtx rece
>  #ifdef HAVE_nonlocal_goto
>    if (! HAVE_nonlocal_goto)
>  #endif
> -    /* First adjust our frame pointer to its actual value.  It was
> -       previously set to the start of the virtual area corresponding to
> -       the stacked variables when we branched here and now needs to be
> -       adjusted to the actual hardware fp value.
> -
> -       Assignments to virtual registers are converted by
> -       instantiate_virtual_regs into the corresponding assignment
> -       to the underlying register (fp in this case) that makes
> -       the original assignment true.
> -       So the following insn will actually be decrementing fp by
> -       STARTING_FRAME_OFFSET.  */
> -    emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx);
> +    {
> +      /* First adjust our frame pointer to its actual value.  It was
> +        previously set to the start of the virtual area corresponding to
> +        the stacked variables when we branched here and now needs to be
> +        adjusted to the actual hardware fp value.
> +
> +        Assignments to virtual registers are converted by
> +        instantiate_virtual_regs into the corresponding assignment
> +        to the underlying register (fp in this case) that makes
> +        the original assignment true.
> +        So the following insn will actually be decrementing fp by
> +        STARTING_FRAME_OFFSET.  */
> +      emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx);
> +
> +      /* Restoring the frame pointer also modifies the hard frame pointer.
> +        Mark it used (so that the previous assignment remains live once
> +        the frame pointer is eliminated) and clobbered (to represent the
> +        implicit update from the assignment).  */
> +      emit_use (hard_frame_pointer_rtx);
> +      emit_clobber (hard_frame_pointer_rtx);
> +    }
>
>  #if !HARD_FRAME_POINTER_IS_ARG_POINTER
>    if (fixed_regs[ARG_POINTER_REGNUM])
> @@ -965,8 +974,7 @@ expand_builtin_setjmp_receiver (rtx rece
>
>    /* We must not allow the code we just generated to be reordered by
>       scheduling.  Specifically, the update of the frame pointer must
> -     happen immediately, not later.  Similarly, we must block
> -     (frame-related) register values to be used across this code.  */
> +     happen immediately, not later.  */
>    emit_insn (gen_blockage ());
>  }
>
> Index: gcc/cse.c
> ===================================================================
> --- gcc/cse.c   2014-03-03 21:47:59.869026741 +0000
> +++ gcc/cse.c   2014-03-03 21:48:00.625031305 +0000
> @@ -5664,11 +5664,6 @@ cse_insn (rtx insn)
>           invalidate (XEXP (dest, 0), GET_MODE (dest));
>        }
>
> -  /* A volatile ASM or an UNSPEC_VOLATILE invalidates everything.  */
> -  if (NONJUMP_INSN_P (insn)
> -      && volatile_insn_p (PATTERN (insn)))
> -    flush_hash_table ();
> -
>    /* Don't cse over a call to setjmp; on some machines (eg VAX)
>       the regs restored by the longjmp come from a later time
>       than the setjmp.  */
> Index: gcc/cselib.c
> ===================================================================
> --- gcc/cselib.c        2014-03-03 21:47:59.870026748 +0000
> +++ gcc/cselib.c        2014-03-03 21:48:00.626031311 +0000
> @@ -2629,9 +2629,7 @@ cselib_process_insn (rtx insn)
>    /* Forget everything at a CODE_LABEL, a volatile insn, or a setjmp.  */
>    if ((LABEL_P (insn)
>         || (CALL_P (insn)
> -          && find_reg_note (insn, REG_SETJMP, NULL))
> -       || (NONJUMP_INSN_P (insn)
> -          && volatile_insn_p (PATTERN (insn))))
> +          && find_reg_note (insn, REG_SETJMP, NULL)))
>        && !cselib_preserve_constants)
>      {
>        cselib_reset_table (next_uid);
> Index: gcc/dse.c
> ===================================================================
> --- gcc/dse.c   2014-03-03 21:47:59.871026754 +0000
> +++ gcc/dse.c   2014-03-03 21:48:00.627031317 +0000
> @@ -2470,16 +2470,6 @@ scan_insn (bb_info_t bb_info, rtx insn)
>        return;
>      }
>
> -  /* Cselib clears the table for this case, so we have to essentially
> -     do the same.  */
> -  if (NONJUMP_INSN_P (insn)
> -      && volatile_insn_p (PATTERN (insn)))
> -    {
> -      add_wild_read (bb_info);
> -      insn_info->cannot_delete = true;
> -      return;
> -    }
> -
>    /* Look at all of the uses in the insn.  */
>    note_uses (&PATTERN (insn), check_mem_read_use, bb_info);
>


* Re: [RFC] Do not consider volatile asms as optimization barriers #1
  2014-03-04  9:21   ` Richard Biener
@ 2014-03-04 10:11     ` Richard Sandiford
  0 siblings, 0 replies; 36+ messages in thread
From: Richard Sandiford @ 2014-03-04 10:11 UTC (permalink / raw)
  To: Richard Biener; +Cc: Eric Botcazou, GCC Patches, Yury Gribov

Richard Biener <richard.guenther@gmail.com> writes:
> On Mon, Mar 3, 2014 at 11:01 PM, Richard Sandiford
> <rdsandiford@googlemail.com> wrote:
>> AIUI:
>>
>> (1) The main point of http://gcc.gnu.org/ml/gcc-patches/2012-10/msg01172.html
>>     was that we had:
>>
>>       emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx);
>>       /* This might change the hard frame pointer in ways that aren't
>>          apparent to early optimization passes, so force a clobber.  */
>>       emit_clobber (hard_frame_pointer_rtx);
>>
>>     and, after elimination, the clobber killed the assignment to the
>>     hard frame pointer.  Which it could. :-)
>>
>> (2) Assuming for simplicity that STARTING_FRAME_OFFSET == 0, we end up
>>     with an assignment like:
>>
>>       (set frame_pointer_rtx hard_frame_pointer_rtx)
>>
>>     And the problem is that frame_pointer_rtx often isn't a real register:
>>     it gets eliminated to hard_frame_pointer_rtx+X, where X isn't known
>>     until RA time.  I.e. the assignment is really:
>>
>>       (set (plus hard_frame_pointer_rtx X) hard_frame_pointer_rtx)
>>
>>     which becomes:
>>
>>       (set hard_frame_pointer_rtx (plus hard_frame_pointer_rtx -X))
>>
>>     Before RA it isn't obvious that the assignment clobbers
>>     hard_frame_pointer_rtx, so it would seem to most passes that
>>     frame_pointer_rtx and hard_frame_pointer_rtx are equivalent
>>     after the set.  That was why the clobber was there.
>>
>> (3) We chose to fix this by deleting the explicit clobber and relying on
>>     the magicness of unspec_volatile to imply the clobber instead.
>>
>> But I don't think (3) is a good idea.  We should instead fix the dataflow
>> so that it's accurate.
>>
>> Long term it would be good to have a different representation for (2),
>
> An UNSPEC that sets frame_pointer_rtx, uses and clobbers hard_frame_pointer_rtx.
>
> In fact the plain move_insn is a lie and the clobber should have been
> at least in a parallel with the set of frame_pointer_rtx ...
>
> ?

Yeah, the plain move is a lie, which is why ideally I'd like to change it.
But the problem is that the elimination code specifically expects this
kind of pattern to be used.  It eliminates the target of the move to
reg+offset and treats the new insn as being an assignment to reg with
offset subtracted from the source.  It needs to be something simple like
a plain move or a:

  (set (reg) (plus (reg) (const_int)))

for subtracting offset to work correctly.

So changing the representation of the move would mean changing the way
that those eliminations are handled too.  Not 4.9 material :-)

>> but I don't have any bright ideas what that might be.  Until then I think
>> we can model the dataflow accurately by having a use as well as a clobber
>> of hard_frame_pointer_rtx.  I went back to r192682:
>
> Indeed.
>
>>   http://gcc.gnu.org/ml/gcc-testresults/2012-10/msg02350.html
>>
>> and saw the -m32 tests fail without the builtins.c part of the patch.
>> They passed after it.  I then went to r200643:
>>
>> 2013-07-03  Hans-Peter Nilsson  <hp@bitrange.com>
>>
>>         PR middle-end/55030
>>         * stmt.c (expand_nl_goto_receiver): Remove almost-copy of
>>         expand_builtin_setjmp_receiver.
>>         (expand_label): Adjust, call expand_builtin_setjmp_receiver
>>         with NULL for the label parameter.
>>         * builtins.c (expand_builtin_setjmp_receiver): Don't clobber
>>         the frame-pointer.  Adjust comments.
>>         [HAVE_builtin_setjmp_receiver]: Emit builtin_setjmp_receiver
>>         only if LABEL is non-NULL.
>>
>> and applied the full patch.  The tests still passed, despite the removal
>> of the volatile checks.  (To recap, the volatile checks touched here
>> were the same ones touched by:
>>
>>   http://gcc.gnu.org/ml/gcc-patches/2012-11/msg00868.html
>>
>> Rather than reverting that part I'm removing the check entirely,
>> since we seem to agree that the original asm handling wasn't necessary.)
>>
>> I'll run a full test overnight, but does this look like it might be
>> a way out, at least for 4.9?
>
> Shouldn't the use and clobber be part of the "move" and thus wrapped
> inside a parallel?  Or am I misunderstanding how parallel works?
> What keeps those three insns adjacent otherwise?

Strict adjacency doesn't really matter.  We just need to kill the
equivalence between the soft and hard frame pointers for anything
after the clobber.

I did wonder about using an empty asm that takes the old hard frame
pointer as input and produces a new hard frame pointer as output,
but normal asms aren't allowed to change the frame pointer.

Thanks,
Richard


* Re: [RFC] Do not consider volatile asms as optimization barriers #1
  2014-03-03 22:01 ` Richard Sandiford
  2014-03-04  7:27   ` Richard Sandiford
  2014-03-04  9:21   ` Richard Biener
@ 2014-03-04 11:42   ` Bernd Schmidt
  2014-03-07 14:05     ` Richard Sandiford
  2014-03-11  6:42   ` Hans-Peter Nilsson
  3 siblings, 1 reply; 36+ messages in thread
From: Bernd Schmidt @ 2014-03-04 11:42 UTC (permalink / raw)
  To: Eric Botcazou, gcc-patches, Yury Gribov, rdsandiford

On 03/03/2014 11:01 PM, Richard Sandiford wrote:

> I'll run a full test overnight, but does this look like it might be
> a way out, at least for 4.9?

Pretty much agree with everything you've written in this thread.  I 
think this patch is fine.

> gcc/
> 	* builtins.c (expand_builtin_setjmp_receiver): Use and clobber
> 	hard_frame_pointer_rtx.
> 	* cse.c (cse_insn): Remove volatile check.
> 	* cselib.c (cselib_process_insn): Likewise.
> 	* dse.c (scan_insn): Likewise.


Bernd


* Re: [RFC] Do not consider volatile asms as optimization barriers #1
  2014-03-04 11:42   ` Bernd Schmidt
@ 2014-03-07 14:05     ` Richard Sandiford
  2014-03-11  6:17       ` Yury Gribov
  0 siblings, 1 reply; 36+ messages in thread
From: Richard Sandiford @ 2014-03-07 14:05 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: Eric Botcazou, gcc-patches, Yury Gribov

Bernd Schmidt <bernds@codesourcery.com> writes:
> On 03/03/2014 11:01 PM, Richard Sandiford wrote:
>
>> I'll run a full test overnight, but does this look like it might be
>> a way out, at least for 4.9?
>
> Pretty much agree with everything you've written in this thread.  I 
> think this patch is fine.

Thanks.  The thread seems to have died down, so should I go ahead
and install it?  I didn't want to be too hasty since it's obviously
a bit of a controversial area.

Richard


* Re: [RFC] Do not consider volatile asms as optimization barriers #1
  2014-03-07 14:05     ` Richard Sandiford
@ 2014-03-11  6:17       ` Yury Gribov
  0 siblings, 0 replies; 36+ messages in thread
From: Yury Gribov @ 2014-03-11  6:17 UTC (permalink / raw)
  To: Bernd Schmidt, Eric Botcazou, gcc-patches, rdsandiford

Richard wrote:
 > The thread seems to have died down,
 > so should I go ahead and install it?

Looks like all reviews were more or less positive...

-Y


* Re: [RFC] Do not consider volatile asms as optimization barriers #1
  2014-03-03 22:01 ` Richard Sandiford
                     ` (2 preceding siblings ...)
  2014-03-04 11:42   ` Bernd Schmidt
@ 2014-03-11  6:42   ` Hans-Peter Nilsson
  2014-03-11 21:57     ` Richard Sandiford
  3 siblings, 1 reply; 36+ messages in thread
From: Hans-Peter Nilsson @ 2014-03-11  6:42 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: Eric Botcazou, gcc-patches, Yury Gribov

On Mon, 3 Mar 2014, Richard Sandiford wrote:
> AIUI:

Reading back the references doesn't yield any dissenting
flash-backs, FWIW.

So, a (use fp) then a (clobber fp)?  That was probably just too
weird for me to think of, much like a hypercorrect ending of the
previous clause. :)

Thanks for dealing with this, and for not making my initial
nightmarish interpretation of $SUBJECT come true: "Do not
consider volatile asms as anything we have to consider".
At least I hope so.  Dig up this horse in 6 months?

brgds, H-P


* Re: [RFC] Do not consider volatile asms as optimization barriers #1
  2014-03-11  6:42   ` Hans-Peter Nilsson
@ 2014-03-11 21:57     ` Richard Sandiford
  2014-03-12  8:51       ` Eric Botcazou
  0 siblings, 1 reply; 36+ messages in thread
From: Richard Sandiford @ 2014-03-11 21:57 UTC (permalink / raw)
  To: Hans-Peter Nilsson; +Cc: Eric Botcazou, gcc-patches, Yury Gribov

Hans-Peter Nilsson <hp@bitrange.com> writes:
> On Mon, 3 Mar 2014, Richard Sandiford wrote:
>> AIUI:
>
> Reading back the references don't yield any dissenting
> flash-backs, FWIW.
>
> So, a (use fp) then a (clobber fp)?  That was probably just too
> weird for me to think of, much like a hypercorrect ending of the
> previous clause. :)
>
> Thanks for dealing with this, and for not making my initial
> nightmarish interpretation of $SUBJECT come true: "Do not
> consider volatile asms as anything we have to consider".
> At least I hope so.  Dig up this horse in 6 months?

Thanks, and to Bernd for the review.  I went ahead and applied it to trunk.

Richard


* Re: [RFC] Do not consider volatile asms as optimization barriers #1
  2014-03-11 21:57     ` Richard Sandiford
@ 2014-03-12  8:51       ` Eric Botcazou
  2014-03-13  7:26         ` Richard Sandiford
  0 siblings, 1 reply; 36+ messages in thread
From: Eric Botcazou @ 2014-03-12  8:51 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: gcc-patches, Hans-Peter Nilsson, Yury Gribov

> Thanks, and to Bernd for the review.  I went ahead and applied it to trunk.

Thanks.  We need something for the 4.8 branch as well, probably the builtins.c 
hunk and the reversion of the cse.c/cselib.c/dse.c changes to the 4.7 state.

-- 
Eric Botcazou


* Re: [RFC] Do not consider volatile asms as optimization barriers #1
  2014-03-12  8:51       ` Eric Botcazou
@ 2014-03-13  7:26         ` Richard Sandiford
  2014-03-13  7:58           ` Jakub Jelinek
  0 siblings, 1 reply; 36+ messages in thread
From: Richard Sandiford @ 2014-03-13  7:26 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches, Hans-Peter Nilsson, Yury Gribov

Eric Botcazou <ebotcazou@adacore.com> writes:
>> Thanks, and to Bernd for the review.  I went ahead and applied it to trunk.
>
> Thanks.  We need something for the 4.8 branch as well, probably the builtins.c 
> hunk and the reversion of the cse.c/cselib.c/dse.c changes to the 4.7 state.

OK, how about this?  It looks like the builtins.c and stmt.c stuff wasn't
merged until 4.9, and at this stage it seemed safer to just add the same
use/clobber sequence to both places.

Tested on x86_64-linux-gnu.  OK to install?

Thanks,
Richard


gcc/
	* builtins.c (expand_builtin_setjmp_receiver): Emit a use of
	the hard frame pointer.  Synchronize commentary with mainline.
	* cse.c (cse_insn): Only check for volatile asms.
	* cselib.c (cselib_process_insn): Likewise.
	* dse.c (scan_insn): Likewise.
	* stmt.c (expand_nl_goto_receiver): Emit a use and a clobber of
	the hard frame pointer.

Index: gcc/builtins.c
===================================================================
--- gcc/builtins.c	2014-03-12 18:24:02.919132339 +0000
+++ gcc/builtins.c	2014-03-12 18:24:17.679262346 +0000
@@ -905,9 +905,24 @@ expand_builtin_setjmp_receiver (rtx rece
   if (! HAVE_nonlocal_goto)
 #endif
     {
+      /* First adjust our frame pointer to its actual value.  It was
+	 previously set to the start of the virtual area corresponding to
+	 the stacked variables when we branched here and now needs to be
+	 adjusted to the actual hardware fp value.
+
+	 Assignments to virtual registers are converted by
+	 instantiate_virtual_regs into the corresponding assignment
+	 to the underlying register (fp in this case) that makes
+	 the original assignment true.
+	 So the following insn will actually be decrementing fp by
+	 STARTING_FRAME_OFFSET.  */
       emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx);
-      /* This might change the hard frame pointer in ways that aren't
-	 apparent to early optimization passes, so force a clobber.  */
+
+      /* Restoring the frame pointer also modifies the hard frame pointer.
+	 Mark it used (so that the previous assignment remains live once
+	 the frame pointer is eliminated) and clobbered (to represent the
+	 implicit update from the assignment).  */
+      emit_use (hard_frame_pointer_rtx);
       emit_clobber (hard_frame_pointer_rtx);
     }
 
@@ -948,8 +963,7 @@ expand_builtin_setjmp_receiver (rtx rece
 
   /* We must not allow the code we just generated to be reordered by
      scheduling.  Specifically, the update of the frame pointer must
-     happen immediately, not later.  Similarly, we must block
-     (frame-related) register values to be used across this code.  */
+     happen immediately, not later.  */
   emit_insn (gen_blockage ());
 }
 
Index: gcc/cse.c
===================================================================
--- gcc/cse.c	2014-03-12 18:24:02.919132339 +0000
+++ gcc/cse.c	2014-03-12 18:24:17.680262355 +0000
@@ -5659,9 +5659,10 @@ cse_insn (rtx insn)
 	  invalidate (XEXP (dest, 0), GET_MODE (dest));
       }
 
-  /* A volatile ASM or an UNSPEC_VOLATILE invalidates everything.  */
+  /* A volatile ASM invalidates everything.  */
   if (NONJUMP_INSN_P (insn)
-      && volatile_insn_p (PATTERN (insn)))
+      && GET_CODE (PATTERN (insn)) == ASM_OPERANDS
+      && MEM_VOLATILE_P (PATTERN (insn)))
     flush_hash_table ();
 
   /* Don't cse over a call to setjmp; on some machines (eg VAX)
Index: gcc/cselib.c
===================================================================
--- gcc/cselib.c	2014-03-12 18:24:02.919132339 +0000
+++ gcc/cselib.c	2014-03-12 18:24:17.681262364 +0000
@@ -2623,12 +2623,13 @@ cselib_process_insn (rtx insn)
 
   cselib_current_insn = insn;
 
-  /* Forget everything at a CODE_LABEL, a volatile insn, or a setjmp.  */
+  /* Forget everything at a CODE_LABEL, a volatile asm, or a setjmp.  */
   if ((LABEL_P (insn)
        || (CALL_P (insn)
 	   && find_reg_note (insn, REG_SETJMP, NULL))
        || (NONJUMP_INSN_P (insn)
-	   && volatile_insn_p (PATTERN (insn))))
+	   && GET_CODE (PATTERN (insn)) == ASM_OPERANDS
+	   && MEM_VOLATILE_P (PATTERN (insn))))
       && !cselib_preserve_constants)
     {
       cselib_reset_table (next_uid);
Index: gcc/dse.c
===================================================================
--- gcc/dse.c	2014-03-12 18:24:02.919132339 +0000
+++ gcc/dse.c	2014-03-12 18:24:17.681262364 +0000
@@ -2518,7 +2518,8 @@ scan_insn (bb_info_t bb_info, rtx insn)
   /* Cselib clears the table for this case, so we have to essentially
      do the same.  */
   if (NONJUMP_INSN_P (insn)
-      && volatile_insn_p (PATTERN (insn)))
+      && GET_CODE (PATTERN (insn)) == ASM_OPERANDS
+      && MEM_VOLATILE_P (PATTERN (insn)))
     {
       add_wild_read (bb_info);
       insn_info->cannot_delete = true;
Index: gcc/stmt.c
===================================================================
--- gcc/stmt.c	2014-03-12 18:24:15.789245708 +0000
+++ gcc/stmt.c	2014-03-12 18:24:24.062318544 +0000
@@ -1602,18 +1602,27 @@ expand_nl_goto_receiver (void)
 #ifdef HAVE_nonlocal_goto
   if (! HAVE_nonlocal_goto)
 #endif
-    /* First adjust our frame pointer to its actual value.  It was
-       previously set to the start of the virtual area corresponding to
-       the stacked variables when we branched here and now needs to be
-       adjusted to the actual hardware fp value.
+    {
+      /* First adjust our frame pointer to its actual value.  It was
+	 previously set to the start of the virtual area corresponding to
+	 the stacked variables when we branched here and now needs to be
+	 adjusted to the actual hardware fp value.
 
-       Assignments are to virtual registers are converted by
-       instantiate_virtual_regs into the corresponding assignment
-       to the underlying register (fp in this case) that makes
-       the original assignment true.
-       So the following insn will actually be
-       decrementing fp by STARTING_FRAME_OFFSET.  */
-    emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx);
+	 Assignments to virtual registers are converted by
+	 instantiate_virtual_regs into the corresponding assignment
+	 to the underlying register (fp in this case) that makes
+	 the original assignment true.
+	 So the following insn will actually be decrementing fp by
+	 STARTING_FRAME_OFFSET.  */
+      emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx);
+
+      /* Restoring the frame pointer also modifies the hard frame pointer.
+	 Mark it used (so that the previous assignment remains live once
+	 the frame pointer is eliminated) and clobbered (to represent the
+	 implicit update from the assignment).  */
+      emit_use (hard_frame_pointer_rtx);
+      emit_clobber (hard_frame_pointer_rtx);
+    }
 
 #if !HARD_FRAME_POINTER_IS_ARG_POINTER
   if (fixed_regs[ARG_POINTER_REGNUM])

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC] Do not consider volatile asms as optimization barriers #1
  2014-03-13  7:26         ` Richard Sandiford
@ 2014-03-13  7:58           ` Jakub Jelinek
  2014-03-25  7:58             ` Yury Gribov
  0 siblings, 1 reply; 36+ messages in thread
From: Jakub Jelinek @ 2014-03-13  7:58 UTC (permalink / raw)
  To: Eric Botcazou, gcc-patches, Hans-Peter Nilsson, Yury Gribov, rdsandiford

On Thu, Mar 13, 2014 at 07:15:34AM +0000, Richard Sandiford wrote:
> Eric Botcazou <ebotcazou@adacore.com> writes:
> >> Thanks, and to Bernd for the review.  I went ahead and applied it to trunk.
> >
> > Thanks.  We need something for the 4.8 branch as well, probably the builtins.c 
> > hunk and the reversion of the cse.c/cselib.c/dse.c changes to the 4.7 state.
> 
> OK, how about this?  It looks like the builtins.c and stmt.c stuff wasn't
> merged until 4.9, and at this stage it seemed safer to just add the same
> use/clobber sequence to both places.

Please wait a little bit: the patch has been committed to the trunk only
very recently, and we want to see whether it has any fallout.

	Jakub


* Re: [RFC] Do not consider volatile asms as optimization barriers #1
  2014-03-13  7:58           ` Jakub Jelinek
@ 2014-03-25  7:58             ` Yury Gribov
  2014-04-17 12:16               ` [RFC][PING^2] " Yury Gribov
  0 siblings, 1 reply; 36+ messages in thread
From: Yury Gribov @ 2014-03-25  7:58 UTC (permalink / raw)
  To: Jakub Jelinek, Eric Botcazou, gcc-patches, Hans-Peter Nilsson,
	rdsandiford

Jakub Jelinek wrote:
> Richard Sandiford wrote:
>> OK, how about this?  It looks like the builtins.c and stmt.c stuff wasn't
>> merged until 4.9, and at this stage it seemed safer to just add the same
>> use/clobber sequence to both places.
>
> Please wait a little bit: the patch has been committed to the trunk only
> very recently, and we want to see whether it has any fallout.

It has been two weeks since Richard committed this to trunk. Perhaps it's 
ok to backport to the 4.8 branch now?

-Y


* [RFC][PING^2] Do not consider volatile asms as optimization barriers #1
  2014-03-25  7:58             ` Yury Gribov
@ 2014-04-17 12:16               ` Yury Gribov
  2014-05-09  6:44                 ` Eric Botcazou
  0 siblings, 1 reply; 36+ messages in thread
From: Yury Gribov @ 2014-04-17 12:16 UTC (permalink / raw)
  To: Jakub Jelinek, Eric Botcazou, gcc-patches, Hans-Peter Nilsson,
	rdsandiford

----------------------------------------------
From: Yury Gribov <y.gribov@samsung.com>
Sent:  Tuesday, March 25, 2014 11:57AM
To: Jakub Jelinek <jakub@redhat.com>, Eric Botcazou 
<ebotcazou@adacore.com>, gcc-patches@gcc.gnu.org, Hans-Peter Nilsson 
<hp@bitrange.com>, rdsandiford@googlemail.com
Subject: Re: [RFC] Do not consider volatile asms as optimization barriers #1

On 03/25/2014 11:57 AM, Yury Gribov wrote:
Jakub Jelinek wrote:
> Richard Sandiford wrote:
>> OK, how about this?  It looks like the builtins.c and stmt.c stuff
>> wasn't
>> merged until 4.9, and at this stage it seemed safer to just add the same
>> use/clobber sequence to both places.
>
> Please wait a little bit: the patch has been committed to the trunk only
> very recently, and we want to see whether it has any fallout.

It has been two weeks since Richard committed this to trunk. Perhaps it's
ok to backport to the 4.8 branch now?

-Y

Link to original email: 
http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01306.html


* Re: [RFC][PING^2] Do not consider volatile asms as optimization barriers #1
  2014-04-17 12:16               ` [RFC][PING^2] " Yury Gribov
@ 2014-05-09  6:44                 ` Eric Botcazou
  2014-05-09  7:24                   ` Richard Sandiford
  0 siblings, 1 reply; 36+ messages in thread
From: Eric Botcazou @ 2014-05-09  6:44 UTC (permalink / raw)
  To: gcc-patches; +Cc: Yury Gribov, Jakub Jelinek, Hans-Peter Nilsson, rdsandiford

> It has been two weeks since Richard commited this to trunk. Perhaps it's
> ok to backport to 4.8 branch now?

Richard, can you do that before the 4.8.3 release?  Thanks in advance.

-- 
Eric Botcazou


* Re: [RFC][PING^2] Do not consider volatile asms as optimization barriers #1
  2014-05-09  6:44                 ` Eric Botcazou
@ 2014-05-09  7:24                   ` Richard Sandiford
  2014-05-09  8:17                     ` Eric Botcazou
  0 siblings, 1 reply; 36+ messages in thread
From: Richard Sandiford @ 2014-05-09  7:24 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches, Yury Gribov, Jakub Jelinek, Hans-Peter Nilsson

Eric Botcazou <ebotcazou@adacore.com> writes:
>> It has been two weeks since Richard commited this to trunk. Perhaps it's
>> ok to backport to 4.8 branch now?
>
> Richard, can you do that before the 4.8.3 release?  Thanks in advance.

Sure, that'd be fine by me.  I'm not sure whether the backport has
been approved yet though.

Thanks,
Richard


* Re: [RFC][PING^2] Do not consider volatile asms as optimization barriers #1
  2014-05-09  7:24                   ` Richard Sandiford
@ 2014-05-09  8:17                     ` Eric Botcazou
  2014-05-09  8:36                       ` Jakub Jelinek
  2014-05-09 10:17                       ` Richard Sandiford
  0 siblings, 2 replies; 36+ messages in thread
From: Eric Botcazou @ 2014-05-09  8:17 UTC (permalink / raw)
  To: Richard Sandiford
  Cc: gcc-patches, Yury Gribov, Jakub Jelinek, Hans-Peter Nilsson

> Sure, that'd be fine by me.  I'm not sure whether the backport has
> been approved yet though.

At least it looks fine to me...

[I still don't grasp this "wait-a-little-before-backporting" policy, it always 
leads to forgotten patches that need to be applied just days ahead of releases 
instead of undergoing a longer period of testing]

-- 
Eric Botcazou


* Re: [RFC][PING^2] Do not consider volatile asms as optimization barriers #1
  2014-05-09  8:17                     ` Eric Botcazou
@ 2014-05-09  8:36                       ` Jakub Jelinek
  2014-05-13 17:20                         ` Eric Botcazou
  2014-05-09 10:17                       ` Richard Sandiford
  1 sibling, 1 reply; 36+ messages in thread
From: Jakub Jelinek @ 2014-05-09  8:36 UTC (permalink / raw)
  To: Eric Botcazou
  Cc: Richard Sandiford, gcc-patches, Yury Gribov, Hans-Peter Nilsson

On Fri, May 09, 2014 at 10:16:14AM +0200, Eric Botcazou wrote:
> > Sure, that'd be fine by me.  I'm not sure whether the backport has
> > been approved yet though.
> 
> At least it looks fine to me...
> 
> [I still don't grasp this "wait-a-little-before-backporting" policy, it always 
> leads to forgotten patches that need to be applied just days ahead of releases 
> instead of undergoing a longer period of testing]

The point of it is that the release branches are actually used by GCC users:
many people don't use just the official releases, but arbitrary snapshots from
the release branches.  If a potentially risky patch is applied immediately
to the release branches as well and soon needs follow-ups (as has happened
many times in recent history), the result is unnecessary breakage.
Of course, for many simpler patches that aren't deemed risky, waiting
before applying is not needed, but I don't think this patch fell into that
category.

	Jakub


* Re: [RFC][PING^2] Do not consider volatile asms as optimization barriers #1
  2014-05-09  8:17                     ` Eric Botcazou
  2014-05-09  8:36                       ` Jakub Jelinek
@ 2014-05-09 10:17                       ` Richard Sandiford
  2014-05-13 17:21                         ` Eric Botcazou
  1 sibling, 1 reply; 36+ messages in thread
From: Richard Sandiford @ 2014-05-09 10:17 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches, Yury Gribov, Jakub Jelinek, Hans-Peter Nilsson

Eric Botcazou <ebotcazou@adacore.com> writes:
>> Sure, that'd be fine by me.  I'm not sure whether the backport has
>> been approved yet though.
>
> At least it looks fine to me...

OK, thanks.  Richard also gave an RM's OK on IRC so I've now applied it.

> [I still don't grasp this "wait-a-little-before-backporting" policy,
> it always leads to forgotten patches that need to be applied just days
> ahead of releases instead of undergoing a longer period of testing]

FWIW I hadn't forgotten about it.  Yury was pinging it on a regular
basis so I felt I didn't have to :-)

Richard


* Re: [RFC][PING^2] Do not consider volatile asms as optimization barriers #1
  2014-05-09  8:36                       ` Jakub Jelinek
@ 2014-05-13 17:20                         ` Eric Botcazou
  2014-05-13 17:25                           ` Jakub Jelinek
  0 siblings, 1 reply; 36+ messages in thread
From: Eric Botcazou @ 2014-05-13 17:20 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: gcc-patches, Richard Sandiford, Yury Gribov, Hans-Peter Nilsson

> The point of it is that the release branches are actually used by GCC users,
> many people don't use just official releases, but arbitrary snapshots from
> the release branches.  If a potentially risky patch is applied immediately
> also to release branches and soon needs follow-ups (happened many times in
> the recent history), then that results in unnecessary breakage.

IMO that's a rather weak point, if people use arbitrary snapshots then they 
can revert to the previous one; similarly, if they use the SVN tree, then they 
can back out the patch manually.  You cannot do that for an official release.

-- 
Eric Botcazou


* Re: [RFC][PING^2] Do not consider volatile asms as optimization barriers #1
  2014-05-09 10:17                       ` Richard Sandiford
@ 2014-05-13 17:21                         ` Eric Botcazou
  0 siblings, 0 replies; 36+ messages in thread
From: Eric Botcazou @ 2014-05-13 17:21 UTC (permalink / raw)
  To: Richard Sandiford
  Cc: gcc-patches, Yury Gribov, Jakub Jelinek, Hans-Peter Nilsson

> OK, thanks.  Richard also gave an RM's OK on IRC so I've now applied it.

Thanks!

-- 
Eric Botcazou


* Re: [RFC][PING^2] Do not consider volatile asms as optimization barriers #1
  2014-05-13 17:20                         ` Eric Botcazou
@ 2014-05-13 17:25                           ` Jakub Jelinek
  2014-05-13 17:35                             ` Eric Botcazou
  0 siblings, 1 reply; 36+ messages in thread
From: Jakub Jelinek @ 2014-05-13 17:25 UTC (permalink / raw)
  To: Eric Botcazou
  Cc: gcc-patches, Richard Sandiford, Yury Gribov, Hans-Peter Nilsson

On Tue, May 13, 2014 at 07:18:56PM +0200, Eric Botcazou wrote:
> > The point of it is that the release branches are actually used by GCC users,
> > many people don't use just official releases, but arbitrary snapshots from
> > the release branches.  If a potentially risky patch is applied immediately
> > also to release branches and soon needs follow-ups (happened many times in
> > the recent history), then that results in unnecessary breakage.
> 
> IMO that's a rather weak point, if people use arbitrary snapshots then they 
> can revert to the previous one; similarly, if they use the SVN tree, then they 
> can back out the patch manually.  You cannot do that for an official release.

Even with an official release you can apply a patch; that is not the point.
The point is that many people expect the release branches (and IMHO rightly
so) to be stable all the time, rather than seriously unstable most of the
time and only converging to stability around the official releases.  So, if
there is something possibly risky, it is better to give it some time on the
trunk rather than breaking everybody and fixing it a few days afterwards.
That doesn't mean you should forget about your patch and only remember it
days before the release, of course.

	Jakub


* Re: [RFC][PING^2] Do not consider volatile asms as optimization barriers #1
  2014-05-13 17:25                           ` Jakub Jelinek
@ 2014-05-13 17:35                             ` Eric Botcazou
  2014-05-13 17:47                               ` Jakub Jelinek
  0 siblings, 1 reply; 36+ messages in thread
From: Eric Botcazou @ 2014-05-13 17:35 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: gcc-patches, Richard Sandiford, Yury Gribov, Hans-Peter Nilsson

> Even with official release you can apply a patch, that is not the point.
> The point is that many people expect the release branches (and IMHO rightly
> so) to be supposedly stable all the time, rather than being seriously
> unstable most of the time and only converging to stability around the
> official releases.

Possibly, but with our scheme it's the opposite: it's stable most of the time,
and batches of patches are applied just before the releases...

-- 
Eric Botcazou


* Re: [RFC][PING^2] Do not consider volatile asms as optimization barriers #1
  2014-05-13 17:35                             ` Eric Botcazou
@ 2014-05-13 17:47                               ` Jakub Jelinek
  0 siblings, 0 replies; 36+ messages in thread
From: Jakub Jelinek @ 2014-05-13 17:47 UTC (permalink / raw)
  To: Eric Botcazou
  Cc: gcc-patches, Richard Sandiford, Yury Gribov, Hans-Peter Nilsson

On Tue, May 13, 2014 at 07:34:51PM +0200, Eric Botcazou wrote:
> > Even with official release you can apply a patch, that is not the point.
> > The point is that many people expect the release branches (and IMHO rightly
> > so) to be supposedly stable all the time, rather than being seriously
> > unstable most of the time and only converging to stability around the
> > official releases.
> 
> Possibly, but with our scheme it's the opposite: it's stable most of the time 
> and series of patches are applied just before the releases...

That's not completely true.  Some people do backports immediately (at least
for the non-risky changes); others often do them in batches, days to weeks
to months after the trunk commits, but most of those batches aren't close to
releases.
If you look e.g. at 4.8 branch, there were less than 50 (non-DATESTAMP)
commits in the last month, but more than 300 (non-DATESTAMP) commits this year
(and quite a few in between 4.8.2 release and New Year).  So I'd say the
last month on 4.8 branch was pretty much a normal month on the branch,
nothing exceptional.

	Jakub


end of thread, other threads:[~2014-05-13 17:47 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-03-01 14:37 [RFC] Do not consider volatile asms as optimization barriers #1 Eric Botcazou
2014-03-01 16:54 ` Richard Sandiford
2014-03-02 10:43   ` Eric Botcazou
2014-03-03  8:01     ` Richard Sandiford
2014-03-03 10:16       ` Richard Biener
2014-03-03 11:05         ` Eric Botcazou
2014-03-03 11:14           ` Richard Sandiford
2014-03-03 11:19       ` Eric Botcazou
2014-03-03 13:12         ` Richard Sandiford
2014-03-03 22:58           ` Eric Botcazou
2014-03-04  7:30             ` Richard Sandiford
2014-03-03 21:07 ` Mike Stump
2014-03-03 22:01 ` Richard Sandiford
2014-03-04  7:27   ` Richard Sandiford
2014-03-04  9:21   ` Richard Biener
2014-03-04 10:11     ` Richard Sandiford
2014-03-04 11:42   ` Bernd Schmidt
2014-03-07 14:05     ` Richard Sandiford
2014-03-11  6:17       ` Yury Gribov
2014-03-11  6:42   ` Hans-Peter Nilsson
2014-03-11 21:57     ` Richard Sandiford
2014-03-12  8:51       ` Eric Botcazou
2014-03-13  7:26         ` Richard Sandiford
2014-03-13  7:58           ` Jakub Jelinek
2014-03-25  7:58             ` Yury Gribov
2014-04-17 12:16               ` [RFC][PING^2] " Yury Gribov
2014-05-09  6:44                 ` Eric Botcazou
2014-05-09  7:24                   ` Richard Sandiford
2014-05-09  8:17                     ` Eric Botcazou
2014-05-09  8:36                       ` Jakub Jelinek
2014-05-13 17:20                         ` Eric Botcazou
2014-05-13 17:25                           ` Jakub Jelinek
2014-05-13 17:35                             ` Eric Botcazou
2014-05-13 17:47                               ` Jakub Jelinek
2014-05-09 10:17                       ` Richard Sandiford
2014-05-13 17:21                         ` Eric Botcazou
