* Re: Optimization of conditional access to globals: thread-unsafe? [not found] ` <Pine.LNX.4.64.0710261836440.23011@wotan.suse.de> @ 2007-10-26 21:47 ` Ian Lance Taylor 2007-10-26 21:51 ` Diego Novillo ` (4 more replies) 0 siblings, 5 replies; 15+ messages in thread From: Ian Lance Taylor @ 2007-10-26 21:47 UTC (permalink / raw) To: Michael Matz; +Cc: gcc, gcc-patches Michael Matz <matz@suse.de> writes: > Both, the assessment of far-stretchedness and these numbers seem to be > invented ad hoc. The latter is irrelevant (it's not interesting how many > cases there are, but how important those cases which occur are, for some > metric, let's say performance). And the former isn't true, i.e. the > concern is not far-stretched. For 456.hmmer for instance it is crucial > that this transformation happens, the basic situation looks like so: What do people think of this patch? This seems to fix the problem case without breaking Michael's case. It basically avoids store speculation: we don't write to a MEM unless the function unconditionally writes to the MEM anyhow. This is basically a public relations exercise. I doubt this optimization is especially important, so I think it's OK to disable it to keep people happy. Even though the optimization has been there since gcc 3.4 and nobody noticed. Of course this kind of thing will break again until somebody takes the time to fully implement something like the C++0x memory model. I haven't tested this patch. Ian Index: ifcvt.c =================================================================== --- ifcvt.c (revision 128958) +++ ifcvt.c (working copy) @@ -2139,6 +2139,32 @@ noce_mem_write_may_trap_or_fault_p (cons return false; } +/* Return whether a MEM is unconditionally set in the function + following TOP_BB. */ + +static bool +noce_mem_unconditionally_set_p (basic_block top_bb, const_rtx mem) +{ + basic_block dominator; + + for (dominator = get_immediate_dominator (CDI_POST_DOMINATORS, top_bb); + dominator != NULL; + dominator = get_immediate_dominator (CDI_POST_DOMINATORS, dominator)) + { + rtx insn; + + FOR_BB_INSNS (dominator, insn) + { + if (memory_modified_in_insn_p (mem, insn)) + return true; + if (modified_in_p (XEXP (mem, 0), insn)) + return false; + } + } + + return false; +} + /* Given a simple IF-THEN-JOIN or IF-THEN-ELSE-JOIN block, attempt to convert it without using conditional execution. Return TRUE if we were successful at converting the block. */ @@ -2292,17 +2318,31 @@ noce_process_if_block (struct noce_if_in goto success; } - /* Disallow the "if (...) x = a;" form (with an implicit "else x = x;") - for optimizations if writing to x may trap or fault, i.e. it's a memory - other than a static var or a stack slot, is misaligned on strict - aligned machines or is read-only. - If x is a read-only memory, then the program is valid only if we - avoid the store into it. If there are stores on both the THEN and - ELSE arms, then we can go ahead with the conversion; either the - program is broken, or the condition is always false such that the - other memory is selected. */ - if (!set_b && MEM_P (orig_x) && noce_mem_write_may_trap_or_fault_p (orig_x)) - return FALSE; + if (!set_b && MEM_P (orig_x)) + { + /* Disallow the "if (...) x = a;" form (implicit "else x = x;") + for optimizations if writing to x may trap or fault, + i.e. it's a memory other than a static var or a stack slot, + is misaligned on strict aligned machines or is read-only. If + x is a read-only memory, then the program is valid only if we + avoid the store into it. If there are stores on both the + THEN and ELSE arms, then we can go ahead with the conversion; + either the program is broken, or the condition is always + false such that the other memory is selected. */ + if (noce_mem_write_may_trap_or_fault_p (orig_x)) + return FALSE; + + /* Avoid store speculation: given "if (...) x = a" where x is a + MEM, we only want to do the store if x is always set + somewhere in the function. This avoids cases like + if (pthread_mutex_trylock(mutex)) + ++global_variable; + where we only want global_variable to be changed if the mutex + is held. FIXME: This should ideally be expressed directly in + RTL somehow. */ + if (!noce_mem_unconditionally_set_p (test_bb, orig_x)) + return FALSE; + } if (noce_try_move (if_info)) goto success; @@ -3957,7 +3997,7 @@ dead_or_predicable (basic_block test_bb, /* Main entry point for all if-conversion. */ static void -if_convert (bool recompute_dominance) +if_convert (void) { basic_block bb; int pass; @@ -3977,9 +4017,8 @@ if_convert (bool recompute_dominance) loop_optimizer_finalize (); free_dominance_info (CDI_DOMINATORS); - /* Compute postdominators if we think we'll use them. */ - if (HAVE_conditional_execution || recompute_dominance) - calculate_dominance_info (CDI_POST_DOMINATORS); + /* Compute postdominators. */ + calculate_dominance_info (CDI_POST_DOMINATORS); df_set_flags (DF_LR_RUN_DCE); @@ -4068,7 +4107,7 @@ rest_of_handle_if_conversion (void) if (dump_file) dump_flow_info (dump_file, dump_flags); cleanup_cfg (CLEANUP_EXPENSIVE); - if_convert (false); + if_convert (); } cleanup_cfg (0); @@ -4105,7 +4144,7 @@ gate_handle_if_after_combine (void) static unsigned int rest_of_handle_if_after_combine (void) { - if_convert (true); + if_convert (); return 0; } @@ -4138,7 +4177,7 @@ gate_handle_if_after_reload (void) static unsigned int rest_of_handle_if_after_reload (void) { - if_convert (true); + if_convert (); return 0; } ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Optimization of conditional access to globals: thread-unsafe? 2007-10-26 21:47 ` Optimization of conditional access to globals: thread-unsafe? Ian Lance Taylor @ 2007-10-26 21:51 ` Diego Novillo 2007-10-26 23:10 ` Ian Lance Taylor 2007-10-26 22:06 ` Daniel Jacobowitz ` (3 subsequent siblings) 4 siblings, 1 reply; 15+ messages in thread From: Diego Novillo @ 2007-10-26 21:51 UTC (permalink / raw) To: Ian Lance Taylor; +Cc: Michael Matz, gcc, gcc-patches On 26 Oct 2007 14:24:21 -0700, Ian Lance Taylor <iant@google.com> wrote: > What do people think of this patch? This seems to fix the problem > case without breaking Michael's case. It basically avoids store > speculation: we don't write to a MEM unless the function > unconditionally writes to the MEM anyhow. I think it couldn't hurt. Providing it as a QOI feature might be good. However, we should predicate these changes on a -fthread-safe flag. More and more of these corner cases will start popping up. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Optimization of conditional access to globals: thread-unsafe? 2007-10-26 21:51 ` Diego Novillo @ 2007-10-26 23:10 ` Ian Lance Taylor 2007-10-27 0:06 ` Jonathan Wakely ` (2 more replies) 0 siblings, 3 replies; 15+ messages in thread From: Ian Lance Taylor @ 2007-10-26 23:10 UTC (permalink / raw) To: Diego Novillo; +Cc: Michael Matz, gcc, gcc-patches "Diego Novillo" <dnovillo@google.com> writes: > On 26 Oct 2007 14:24:21 -0700, Ian Lance Taylor <iant@google.com> wrote: > > > What do people think of this patch? This seems to fix the problem > > case without breaking Michael's case. It basically avoids store > > speculation: we don't write to a MEM unless the function > > unconditionally writes to the MEM anyhow. > > I think it couldn't hurt. Providing it as a QOI feature might be > good. However, we should predicate these changes on a -fthread-safe > flag. More and more of these corner cases will start popping up. It appears that the draft C++0x memory model prohibits speculative stores. Therefore I now think we should aim toward prohibiting them unconditionally. That memory model is just a draft. But I think we should implement it unconditionally when it exists. Ian ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Optimization of conditional access to globals: thread-unsafe? 2007-10-26 23:10 ` Ian Lance Taylor @ 2007-10-27 0:06 ` Jonathan Wakely 2007-10-27 0:43 ` Diego Novillo 2007-10-31 21:14 ` Jason Merrill 2 siblings, 0 replies; 15+ messages in thread From: Jonathan Wakely @ 2007-10-27 0:06 UTC (permalink / raw) To: Ian Lance Taylor; +Cc: Diego Novillo, Michael Matz, gcc, gcc-patches On 26 Oct 2007 15:20:01 -0700, Ian Lance Taylor <iant@google.com> wrote: > > It appears that the draft C++0x memory model prohibits speculative > stores. > > Therefore I now think we should aim toward prohibiting them > unconditionally. That memory model is just a draft. But I think we > should implement it unconditionally when it exists. In case anyone who's interested hasn't seen it, the draft memory model is accompanied by N2338, "Concurrency memory model compiler consequences" http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2338.html Jon ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Optimization of conditional access to globals: thread-unsafe? 2007-10-26 23:10 ` Ian Lance Taylor 2007-10-27 0:06 ` Jonathan Wakely @ 2007-10-27 0:43 ` Diego Novillo 2007-10-31 21:14 ` Jason Merrill 2 siblings, 0 replies; 15+ messages in thread From: Diego Novillo @ 2007-10-27 0:43 UTC (permalink / raw) To: Ian Lance Taylor; +Cc: Michael Matz, gcc, gcc-patches On 26 Oct 2007 15:20:01 -0700, Ian Lance Taylor <iant@google.com> wrote: > It appears that the draft C++0x memory model prohibits speculative > stores. Well, sure, might as well. Though the final form of the standard may be different, I doubt that this case will change significantly. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Optimization of conditional access to globals: thread-unsafe? 2007-10-26 23:10 ` Ian Lance Taylor 2007-10-27 0:06 ` Jonathan Wakely 2007-10-27 0:43 ` Diego Novillo @ 2007-10-31 21:14 ` Jason Merrill 2007-10-31 22:16 ` Jason Merrill 2 siblings, 1 reply; 15+ messages in thread From: Jason Merrill @ 2007-10-31 21:14 UTC (permalink / raw) To: Ian Lance Taylor; +Cc: Diego Novillo, Michael Matz, gcc, gcc-patches Ian Lance Taylor wrote: > It appears that the draft C++0x memory model prohibits speculative > stores. > > Therefore I now think we should aim toward prohibiting them > unconditionally. I agree, or perhaps unless the user specifies a flag like -fthread-unsafe-opts or something. > That memory model is just a draft. It was voted into the C++ standard working paper at the last meeting. And the C committee has expressed interest in adopting it, or something similar, as well. Jason ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Optimization of conditional access to globals: thread-unsafe? 2007-10-31 21:14 ` Jason Merrill @ 2007-10-31 22:16 ` Jason Merrill 0 siblings, 0 replies; 15+ messages in thread From: Jason Merrill @ 2007-10-31 22:16 UTC (permalink / raw) To: gcc-patches; +Cc: gcc, gcc-patches Ian Lance Taylor wrote: > It appears that the draft C++0x memory model prohibits speculative > stores. > > Therefore I now think we should aim toward prohibiting them > unconditionally. I agree, or perhaps unless the user specifies a flag like -fthread-unsafe-opts or something. > That memory model is just a draft. It was voted into the C++ standard working paper at the last meeting. And the C committee has expressed interest in adopting it, or something similar, as well. Jason ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Optimization of conditional access to globals: thread-unsafe? 2007-10-26 21:47 ` Optimization of conditional access to globals: thread-unsafe? Ian Lance Taylor 2007-10-26 21:51 ` Diego Novillo @ 2007-10-26 22:06 ` Daniel Jacobowitz 2007-10-26 22:50 ` Jakub Jelinek ` (2 subsequent siblings) 4 siblings, 0 replies; 15+ messages in thread From: Daniel Jacobowitz @ 2007-10-26 22:06 UTC (permalink / raw) To: Ian Lance Taylor; +Cc: Michael Matz, gcc, gcc-patches On Fri, Oct 26, 2007 at 02:24:21PM -0700, Ian Lance Taylor wrote: > What do people think of this patch? This seems to fix the problem > case without breaking Michael's case. It basically avoids store > speculation: we don't write to a MEM unless the function > unconditionally writes to the MEM anyhow. > > This is basically a public relations exercise. I doubt this > optimization is especially important, so I think it's OK to disable it > to keep people happy. Even though the optimization has been there > since gcc 3.4 and nobody noticed. > > Of course this kind of thing will break again until somebody takes the > time to fully implement something like the C++0x memory model. Right. In fact it seems to me to be still broken; you just need a bigger test case. if (trylock) { var++; unlock; } sleep lock var++; unlock I'm sure someone can turn that into a sensible looking example, with a little inlining. -- Daniel Jacobowitz CodeSourcery ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Optimization of conditional access to globals: thread-unsafe? 2007-10-26 21:47 ` Optimization of conditional access to globals: thread-unsafe? Ian Lance Taylor 2007-10-26 21:51 ` Diego Novillo 2007-10-26 22:06 ` Daniel Jacobowitz @ 2007-10-26 22:50 ` Jakub Jelinek 2007-10-27 0:15 ` Ian Lance Taylor 2007-10-27 1:04 ` skaller 2007-10-29 21:23 ` Ian Lance Taylor 4 siblings, 1 reply; 15+ messages in thread From: Jakub Jelinek @ 2007-10-26 22:50 UTC (permalink / raw) To: Ian Lance Taylor; +Cc: Michael Matz, gcc, gcc-patches On Fri, Oct 26, 2007 at 02:24:21PM -0700, Ian Lance Taylor wrote: > What do people think of this patch? This seems to fix the problem > case without breaking Michael's case. It basically avoids store > speculation: we don't write to a MEM unless the function > unconditionally writes to the MEM anyhow. This still isn't enough. If you have a non-pure/non-const CALL_INSN before the unconditional store into it, you need to return false from noce_mem_unconditionally_set_p as that function could have a barrier in it. Similarly for inline asm or __sync_* builtin generated insns (not sure ATM if just stopping on UNSPEC_VOLATILE/ASM_INPUT/ASM_OPERANDS or something else is needed). Jakub ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Optimization of conditional access to globals: thread-unsafe? 2007-10-26 22:50 ` Jakub Jelinek @ 2007-10-27 0:15 ` Ian Lance Taylor 0 siblings, 0 replies; 15+ messages in thread From: Ian Lance Taylor @ 2007-10-27 0:15 UTC (permalink / raw) To: Jakub Jelinek; +Cc: Michael Matz, gcc, gcc-patches Jakub Jelinek <jakub@redhat.com> writes: > On Fri, Oct 26, 2007 at 02:24:21PM -0700, Ian Lance Taylor wrote: > > What do people think of this patch? This seems to fix the problem > > case without breaking Michael's case. It basically avoids store > > speculation: we don't write to a MEM unless the function > > unconditionally writes to the MEM anyhow. > > This still isn't enough. If you have a non-pure/non-const CALL_INSN > before the unconditional store into it, you need to return false from > noce_mem_unconditionally_set_p as that function could have a barrier > in it. Similarly for inline asm or __sync_* builtin generated insns > (not sure ATM if just stopping on UNSPEC_VOLATILE/ASM_INPUT/ASM_OPERANDS > or something else is needed). Yeah, I thought of that later. This is the patch I'm actually testing. Does this look OK to you? Ian Index: ifcvt.c =================================================================== --- ifcvt.c (revision 129661) +++ ifcvt.c (working copy) @@ -2139,6 +2139,47 @@ noce_mem_write_may_trap_or_fault_p (cons return false; } +/* Return whether we can use store speculation for MEM. TOP_BB is the + basic block above the conditional block where we are considering + doing the speculative store. We look for whether MEM is set + unconditionally later in the function. */ + +static bool +noce_can_store_speculate_p (basic_block top_bb, const_rtx mem) +{ + basic_block dominator; + + for (dominator = get_immediate_dominator (CDI_POST_DOMINATORS, top_bb); + dominator != NULL; + dominator = get_immediate_dominator (CDI_POST_DOMINATORS, dominator)) + { + rtx insn; + + FOR_BB_INSNS (dominator, insn) + { + if (memory_modified_in_insn_p (mem, insn)) + return true; + if (modified_in_p (XEXP (mem, 0), insn)) + return false; + + /* If we see something that might be a memory barrier, we + have to stop looking. Even if the MEM is set later in + the function, we still don't want to set it + unconditionally before the barrier. Note that + memory_modified_in_p will return true for an asm which + clobbers memory. */ + if (INSN_P (insn) + && (volatile_insn_p (PATTERN (insn)) + || (CALL_P (insn) + && (!CONST_OR_PURE_CALL_P (insn) + || pure_call_p (insn))))) + return false; + } + } + + return false; +} + /* Given a simple IF-THEN-JOIN or IF-THEN-ELSE-JOIN block, attempt to convert it without using conditional execution. Return TRUE if we were successful at converting the block. */ @@ -2292,17 +2333,31 @@ noce_process_if_block (struct noce_if_in goto success; } - /* Disallow the "if (...) x = a;" form (with an implicit "else x = x;") - for optimizations if writing to x may trap or fault, i.e. it's a memory - other than a static var or a stack slot, is misaligned on strict - aligned machines or is read-only. - If x is a read-only memory, then the program is valid only if we - avoid the store into it. If there are stores on both the THEN and - ELSE arms, then we can go ahead with the conversion; either the - program is broken, or the condition is always false such that the - other memory is selected. */ - if (!set_b && MEM_P (orig_x) && noce_mem_write_may_trap_or_fault_p (orig_x)) - return FALSE; + if (!set_b && MEM_P (orig_x)) + { + /* Disallow the "if (...) x = a;" form (implicit "else x = x;") + for optimizations if writing to x may trap or fault, + i.e. it's a memory other than a static var or a stack slot, + is misaligned on strict aligned machines or is read-only. If + x is a read-only memory, then the program is valid only if we + avoid the store into it. If there are stores on both the + THEN and ELSE arms, then we can go ahead with the conversion; + either the program is broken, or the condition is always + false such that the other memory is selected. */ + if (noce_mem_write_may_trap_or_fault_p (orig_x)) + return FALSE; + + /* Avoid store speculation: given "if (...) x = a" where x is a + MEM, we only want to do the store if x is always set + somewhere in the function. This avoids cases like + if (pthread_mutex_trylock(mutex)) + ++global_variable; + where we only want global_variable to be changed if the mutex + is held. FIXME: This should ideally be expressed directly in + RTL somehow. */ + if (!noce_can_store_speculate_p (test_bb, orig_x)) + return FALSE; + } if (noce_try_move (if_info)) goto success; @@ -3957,7 +4012,7 @@ dead_or_predicable (basic_block test_bb, /* Main entry point for all if-conversion. */ static void -if_convert (bool recompute_dominance) +if_convert (void) { basic_block bb; int pass; @@ -3977,9 +4032,8 @@ if_convert (bool recompute_dominance) loop_optimizer_finalize (); free_dominance_info (CDI_DOMINATORS); - /* Compute postdominators if we think we'll use them. */ - if (HAVE_conditional_execution || recompute_dominance) - calculate_dominance_info (CDI_POST_DOMINATORS); + /* Compute postdominators. */ + calculate_dominance_info (CDI_POST_DOMINATORS); df_set_flags (DF_LR_RUN_DCE); @@ -4068,7 +4122,7 @@ rest_of_handle_if_conversion (void) if (dump_file) dump_flow_info (dump_file, dump_flags); cleanup_cfg (CLEANUP_EXPENSIVE); - if_convert (false); + if_convert (); } cleanup_cfg (0); @@ -4105,7 +4159,7 @@ gate_handle_if_after_combine (void) static unsigned int rest_of_handle_if_after_combine (void) { - if_convert (true); + if_convert (); return 0; } @@ -4138,7 +4192,7 @@ gate_handle_if_after_reload (void) static unsigned int rest_of_handle_if_after_reload (void) { - if_convert (true); + if_convert (); return 0; } ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Optimization of conditional access to globals: thread-unsafe? 2007-10-26 21:47 ` Optimization of conditional access to globals: thread-unsafe? Ian Lance Taylor ` (2 preceding siblings ...) 2007-10-26 22:50 ` Jakub Jelinek @ 2007-10-27 1:04 ` skaller 2007-10-27 13:40 ` Andrew Haley 2007-10-29 21:23 ` Ian Lance Taylor 4 siblings, 1 reply; 15+ messages in thread From: skaller @ 2007-10-27 1:04 UTC (permalink / raw) To: Ian Lance Taylor; +Cc: Michael Matz, gcc, gcc-patches On Fri, 2007-10-26 at 14:24 -0700, Ian Lance Taylor wrote: > Michael Matz <matz@suse.de> writes: > This is basically a public relations exercise. I doubt this > optimization is especially important, so I think it's OK to disable it > to keep people happy. Even though the optimization has been there > since gcc 3.4 and nobody noticed. Most people didn't have multi-core processors then.. -- John Skaller <skaller at users dot sf dot net> Felix, successor to C++: http://felix.sf.net ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Optimization of conditional access to globals: thread-unsafe? 2007-10-27 1:04 ` skaller @ 2007-10-27 13:40 ` Andrew Haley 0 siblings, 0 replies; 15+ messages in thread From: Andrew Haley @ 2007-10-27 13:40 UTC (permalink / raw) To: skaller; +Cc: Ian Lance Taylor, Michael Matz, gcc, gcc-patches skaller writes: > > On Fri, 2007-10-26 at 14:24 -0700, Ian Lance Taylor wrote: > > > This is basically a public relations exercise. I doubt this > > optimization is especially important, so I think it's OK to > > disable it to keep people happy. Even though the optimization > > has been there since gcc 3.4 and nobody noticed. > > Most people didn't have multi-core processors then.. It's partly that. It's also that some people, particularly kernel hackers, are super-paranoid about this sort of thing, and that's all to the good. The window of vulnerability introduced by this "optimization" is extremely small -- but it is there. As far as I can tell from reading the Linux kernel list, almost every kernel programmer wants this transformation to be removed. I suspect that most pthreads programmers do too. So yeah, it's mostly a public relations exercise, but a worthwhile one. I look forward to going to the kernel list and telling them we've done what they wanted us to do. They're an important part of our community. Andrew. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Optimization of conditional access to globals: thread-unsafe? 2007-10-26 21:47 ` Optimization of conditional access to globals: thread-unsafe? Ian Lance Taylor ` (3 preceding siblings ...) 2007-10-27 1:04 ` skaller @ 2007-10-29 21:23 ` Ian Lance Taylor 2007-10-30 7:15 ` Ian Lance Taylor 4 siblings, 1 reply; 15+ messages in thread From: Ian Lance Taylor @ 2007-10-29 21:23 UTC (permalink / raw) To: gcc-patches Ian Lance Taylor <iant@google.com> writes: > What do people think of this patch? This seems to fix the problem > case without breaking Michael's case. It basically avoids store > speculation: we don't write to a MEM unless the function > unconditionally writes to the MEM anyhow. Bootstrapped and tested on i686-pc-linux-gnu. Committed to mainline as follows. Ian 2007-10-29 Ian Lance Taylor <iant@google.com> * ifcvt.c (noce_can_store_speculate_p): New static function. (noce_process_if_block): Call it. (if_convert): Remove recompute_dominance parameter. Change all callers. Index: ifcvt.c =================================================================== --- ifcvt.c (revision 129728) +++ ifcvt.c (working copy) @@ -2139,6 +2139,46 @@ noce_mem_write_may_trap_or_fault_p (cons return false; } +/* Return whether we can use store speculation for MEM. TOP_BB is the + basic block above the conditional block where we are considering + doing the speculative store. We look for whether MEM is set + unconditionally later in the function. */ + +static bool +noce_can_store_speculate_p (basic_block top_bb, const_rtx mem) +{ + basic_block dominator; + + for (dominator = get_immediate_dominator (CDI_POST_DOMINATORS, top_bb); + dominator != NULL; + dominator = get_immediate_dominator (CDI_POST_DOMINATORS, dominator)) + { + rtx insn; + + FOR_BB_INSNS (dominator, insn) + { + /* If we see something that might be a memory barrier, we + have to stop looking. Even if the MEM is set later in + the function, we still don't want to set it + unconditionally before the barrier. */ + if (INSN_P (insn) + && (volatile_insn_p (PATTERN (insn)) + || (CALL_P (insn) + && (!CONST_OR_PURE_CALL_P (insn) + || pure_call_p (insn))))) + return false; + + if (memory_modified_in_insn_p (mem, insn)) + return true; + if (modified_in_p (XEXP (mem, 0), insn)) + return false; + + } + } + + return false; +} + /* Given a simple IF-THEN-JOIN or IF-THEN-ELSE-JOIN block, attempt to convert it without using conditional execution. Return TRUE if we were successful at converting the block. */ @@ -2292,17 +2332,31 @@ noce_process_if_block (struct noce_if_in goto success; } - /* Disallow the "if (...) x = a;" form (with an implicit "else x = x;") - for optimizations if writing to x may trap or fault, i.e. it's a memory - other than a static var or a stack slot, is misaligned on strict - aligned machines or is read-only. - If x is a read-only memory, then the program is valid only if we - avoid the store into it. If there are stores on both the THEN and - ELSE arms, then we can go ahead with the conversion; either the - program is broken, or the condition is always false such that the - other memory is selected. */ - if (!set_b && MEM_P (orig_x) && noce_mem_write_may_trap_or_fault_p (orig_x)) - return FALSE; + if (!set_b && MEM_P (orig_x)) + { + /* Disallow the "if (...) x = a;" form (implicit "else x = x;") + for optimizations if writing to x may trap or fault, + i.e. it's a memory other than a static var or a stack slot, + is misaligned on strict aligned machines or is read-only. If + x is a read-only memory, then the program is valid only if we + avoid the store into it. If there are stores on both the + THEN and ELSE arms, then we can go ahead with the conversion; + either the program is broken, or the condition is always + false such that the other memory is selected. */ + if (noce_mem_write_may_trap_or_fault_p (orig_x)) + return FALSE; + + /* Avoid store speculation: given "if (...) x = a" where x is a + MEM, we only want to do the store if x is always set + somewhere in the function. This avoids cases like + if (pthread_mutex_trylock(mutex)) + ++global_variable; + where we only want global_variable to be changed if the mutex + is held. FIXME: This should ideally be expressed directly in + RTL somehow. */ + if (!noce_can_store_speculate_p (test_bb, orig_x)) + return FALSE; + } if (noce_try_move (if_info)) goto success; @@ -3957,7 +4011,7 @@ dead_or_predicable (basic_block test_bb, /* Main entry point for all if-conversion. */ static void -if_convert (bool recompute_dominance) +if_convert (void) { basic_block bb; int pass; @@ -3977,9 +4031,8 @@ if_convert (bool recompute_dominance) loop_optimizer_finalize (); free_dominance_info (CDI_DOMINATORS); - /* Compute postdominators if we think we'll use them. */ - if (HAVE_conditional_execution || recompute_dominance) - calculate_dominance_info (CDI_POST_DOMINATORS); + /* Compute postdominators. */ + calculate_dominance_info (CDI_POST_DOMINATORS); df_set_flags (DF_LR_RUN_DCE); @@ -4068,7 +4121,7 @@ rest_of_handle_if_conversion (void) if (dump_file) dump_flow_info (dump_file, dump_flags); cleanup_cfg (CLEANUP_EXPENSIVE); - if_convert (false); + if_convert (); } cleanup_cfg (0); @@ -4105,7 +4158,7 @@ gate_handle_if_after_combine (void) static unsigned int rest_of_handle_if_after_combine (void) { - if_convert (true); + if_convert (); return 0; } @@ -4138,7 +4191,7 @@ gate_handle_if_after_reload (void) static unsigned int rest_of_handle_if_after_reload (void) { - if_convert (true); + if_convert (); return 0; } ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Optimization of conditional access to globals: thread-unsafe? 2007-10-29 21:23 ` Ian Lance Taylor @ 2007-10-30 7:15 ` Ian Lance Taylor 2007-10-30 14:33 ` Ian Lance Taylor 0 siblings, 1 reply; 15+ messages in thread From: Ian Lance Taylor @ 2007-10-30 7:15 UTC (permalink / raw) To: gcc-patches Ian Lance Taylor <iant@google.com> writes: > Ian Lance Taylor <iant@google.com> writes: > > > What do people think of this patch? This seems to fix the problem > > case without breaking Michael's case. It basically avoids store > > speculation: we don't write to a MEM unless the function > > unconditionally writes to the MEM anyhow. > > Bootstrapped and tested on i686-pc-linux-gnu. > > Committed to mainline as follows. Here is the version I committed to 4.2 branch. Bootstrapped and tested on i686-pc-linux-gnu. Ian 2007-10-29 Ian Lance Taylor <iant@google.com> * ifcvt.c (noce_can_store_speculate_p): New static function. (noce_process_if_block): Call it. (find_if_header): Only call find_if_case_1 and find_if_case_2 if life_data_ok is set. (if_convert): Always compute postdominators. Index: ifcvt.c =================================================================== --- ifcvt.c (revision 129762) +++ ifcvt.c (working copy) @@ -2163,6 +2163,46 @@ noce_mem_write_may_trap_or_fault_p (rtx return false; } +/* Return whether we can use store speculation for MEM. TOP_BB is the + basic block above the conditional block where we are considering + doing the speculative store. We look for whether MEM is set + unconditionally later in the function. */ + +static bool +noce_can_store_speculate_p (basic_block top_bb, rtx mem) +{ + basic_block dominator; + + for (dominator = get_immediate_dominator (CDI_POST_DOMINATORS, top_bb); + dominator != NULL; + dominator = get_immediate_dominator (CDI_POST_DOMINATORS, dominator)) + { + rtx insn; + + FOR_BB_INSNS (dominator, insn) + { + /* If we see something that might be a memory barrier, we + have to stop looking. Even if the MEM is set later in + the function, we still don't want to set it + unconditionally before the barrier. */ + if (INSN_P (insn) + && (volatile_insn_p (PATTERN (insn)) + || (CALL_P (insn) + && (!CONST_OR_PURE_CALL_P (insn) + || pure_call_p (insn))))) + return false; + + if (memory_modified_in_insn_p (mem, insn)) + return true; + if (modified_in_p (XEXP (mem, 0), insn)) + return false; + + } + } + + return false; +} + /* Given a simple IF-THEN or IF-THEN-ELSE block, attempt to convert it without using conditional execution. Return TRUE if we were successful at converting the block. */ @@ -2321,17 +2361,31 @@ noce_process_if_block (struct ce_if_bloc goto success; } - /* Disallow the "if (...) x = a;" form (with an implicit "else x = x;") - for optimizations if writing to x may trap or fault, i.e. it's a memory - other than a static var or a stack slot, is misaligned on strict - aligned machines or is read-only. - If x is a read-only memory, then the program is valid only if we - avoid the store into it. If there are stores on both the THEN and - ELSE arms, then we can go ahead with the conversion; either the - program is broken, or the condition is always false such that the - other memory is selected. */ - if (!set_b && MEM_P (orig_x) && noce_mem_write_may_trap_or_fault_p (orig_x)) - return FALSE; + if (!set_b && MEM_P (orig_x)) + { + /* Disallow the "if (...) x = a;" form (implicit "else x = x;") + for optimizations if writing to x may trap or fault, + i.e. it's a memory other than a static var or a stack slot, + is misaligned on strict aligned machines or is read-only. If + x is a read-only memory, then the program is valid only if we + avoid the store into it. If there are stores on both the + THEN and ELSE arms, then we can go ahead with the conversion; + either the program is broken, or the condition is always + false such that the other memory is selected. */ + if (noce_mem_write_may_trap_or_fault_p (orig_x)) + return FALSE; + + /* Avoid store speculation: given "if (...) x = a" where x is a + MEM, we only want to do the store if x is always set + somewhere in the function. This avoids cases like + if (pthread_mutex_trylock(mutex)) + ++global_variable; + where we only want global_variable to be changed if the mutex + is held. FIXME: This should ideally be expressed directly in + RTL somehow. */ + if (!noce_can_store_speculate_p (test_bb, orig_x)) + return FALSE; + } if (noce_try_move (&if_info)) goto success; @@ -2847,7 +2901,8 @@ find_if_header (basic_block test_bb, int && find_cond_trap (test_bb, then_edge, else_edge)) goto success; - if (dom_computed[CDI_POST_DOMINATORS] >= DOM_NO_FAST_QUERY + if (life_data_ok + && dom_computed[CDI_POST_DOMINATORS] >= DOM_NO_FAST_QUERY && (! HAVE_conditional_execution || reload_completed)) { if (find_if_case_1 (test_bb, then_edge, else_edge)) @@ -3862,9 +3917,8 @@ if_convert (int x_life_data_ok) free_dominance_info (CDI_DOMINATORS); } - /* Compute postdominators if we think we'll use them. */ - if (HAVE_conditional_execution || life_data_ok) - calculate_dominance_info (CDI_POST_DOMINATORS); + /* Compute postdominators. */ + calculate_dominance_info (CDI_POST_DOMINATORS); if (life_data_ok) clear_bb_flags (); ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Optimization of conditional access to globals: thread-unsafe? 2007-10-30 7:15 ` Ian Lance Taylor @ 2007-10-30 14:33 ` Ian Lance Taylor 0 siblings, 0 replies; 15+ messages in thread From: Ian Lance Taylor @ 2007-10-30 14:33 UTC (permalink / raw) To: gcc-patches Ian Lance Taylor <iant@google.com> writes: > Ian Lance Taylor <iant@google.com> writes: > > > Ian Lance Taylor <iant@google.com> writes: > > > > > What do people think of this patch? This seems to fix the problem > > > case without breaking Michael's case. It basically avoids store > > > speculation: we don't write to a MEM unless the function > > > unconditionally writes to the MEM anyhow. > > > > Bootstrapped and tested on i686-pc-linux-gnu. > > > > Committed to mainline as follows. > > Here is the version I committed to 4.2 branch. Bootstrapped and > tested on i686-pc-linux-gnu. And here is the version I committed to 4.1 branch. Ian 2007-10-30 Ian Lance Taylor <iant@google.com> * ifcvt.c (noce_can_store_speculate_p): New static function. (noce_process_if_block): Call it. (find_if_header): Only call find_if_case_1 and find_if_case_2 if life_data_ok is set. (if_convert): Always compute postdominators. Index: ifcvt.c =================================================================== --- ifcvt.c (revision 129770) +++ ifcvt.c (working copy) @@ -2116,6 +2116,46 @@ noce_mem_write_may_trap_or_fault_p (rtx return false; } +/* Return whether we can use store speculation for MEM. TOP_BB is the + basic block above the conditional block where we are considering + doing the speculative store. We look for whether MEM is set + unconditionally later in the function. */ + +static bool +noce_can_store_speculate_p (basic_block top_bb, rtx mem) +{ + basic_block dominator; + + for (dominator = get_immediate_dominator (CDI_POST_DOMINATORS, top_bb); + dominator != NULL && dominator != EXIT_BLOCK_PTR; + dominator = get_immediate_dominator (CDI_POST_DOMINATORS, dominator)) + { + rtx insn; + + FOR_BB_INSNS (dominator, insn) + { + /* If we see something that might be a memory barrier, we + have to stop looking. Even if the MEM is set later in + the function, we still don't want to set it + unconditionally before the barrier. */ + if (INSN_P (insn) + && (volatile_insn_p (PATTERN (insn)) + || (CALL_P (insn) + && (!CONST_OR_PURE_CALL_P (insn) + || pure_call_p (insn))))) + return false; + + if (memory_modified_in_insn_p (mem, insn)) + return true; + if (modified_in_p (XEXP (mem, 0), insn)) + return false; + + } + } + + return false; +} + /* Given a simple IF-THEN or IF-THEN-ELSE block, attempt to convert it without using conditional execution. Return TRUE if we were successful at converting the block. */ @@ -2299,17 +2339,31 @@ noce_process_if_block (struct ce_if_bloc goto success; } - /* Disallow the "if (...) x = a;" form (with an implicit "else x = x;") - for optimizations if writing to x may trap or fault, i.e. it's a memory - other than a static var or a stack slot, is misaligned on strict - aligned machines or is read-only. - If x is a read-only memory, then the program is valid only if we - avoid the store into it. If there are stores on both the THEN and - ELSE arms, then we can go ahead with the conversion; either the - program is broken, or the condition is always false such that the - other memory is selected. */ - if (!set_b && MEM_P (orig_x) && noce_mem_write_may_trap_or_fault_p (orig_x)) - return FALSE; + if (!set_b && MEM_P (orig_x)) + { + /* Disallow the "if (...) x = a;" form (implicit "else x = x;") + for optimizations if writing to x may trap or fault, + i.e. it's a memory other than a static var or a stack slot, + is misaligned on strict aligned machines or is read-only. If + x is a read-only memory, then the program is valid only if we + avoid the store into it. If there are stores on both the + THEN and ELSE arms, then we can go ahead with the conversion; + either the program is broken, or the condition is always + false such that the other memory is selected. */ + if (noce_mem_write_may_trap_or_fault_p (orig_x)) + return FALSE; + + /* Avoid store speculation: given "if (...) x = a" where x is a + MEM, we only want to do the store if x is always set + somewhere in the function. This avoids cases like + if (pthread_mutex_trylock(mutex)) + ++global_variable; + where we only want global_variable to be changed if the mutex + is held. FIXME: This should ideally be expressed directly in + RTL somehow. */ + if (!noce_can_store_speculate_p (test_bb, orig_x)) + return FALSE; + } if (noce_try_move (&if_info)) goto success; @@ -2584,7 +2638,8 @@ find_if_header (basic_block test_bb, int && find_cond_trap (test_bb, then_edge, else_edge)) goto success; - if (dom_computed[CDI_POST_DOMINATORS] >= DOM_NO_FAST_QUERY + if (life_data_ok + && dom_computed[CDI_POST_DOMINATORS] >= DOM_NO_FAST_QUERY && (! HAVE_conditional_execution || reload_completed)) { if (find_if_case_1 (test_bb, then_edge, else_edge)) @@ -3592,9 +3647,8 @@ if_convert (int x_life_data_ok) free_dominance_info (CDI_DOMINATORS); } - /* Compute postdominators if we think we'll use them. */ - if (HAVE_conditional_execution || life_data_ok) - calculate_dominance_info (CDI_POST_DOMINATORS); + /* Compute postdominators. */ + calculate_dominance_info (CDI_POST_DOMINATORS); if (life_data_ok) clear_bb_flags (); ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2007-10-31 20:45 UTC | newest] Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <e2e108260710260541n61462585u99de9bc0617720f4@mail.gmail.com> [not found] ` <e2e108260710260620k2a2e21b3t1d6c052f14d36094@mail.gmail.com> [not found] ` <20071026143334.GA5041@moonlight.home> [not found] ` <m38x5pj3ig.fsf@localhost.localdomain> [not found] ` <20071026155101.GB5041@moonlight.home> [not found] ` <016201c817e9$5454edd0$2e08a8c0@CAM.ARTIMI.COM> [not found] ` <20071026161739.GC5041@moonlight.home> [not found] ` <Pine.LNX.4.64.0710261836440.23011@wotan.suse.de> 2007-10-26 21:47 ` Optimization of conditional access to globals: thread-unsafe? Ian Lance Taylor 2007-10-26 21:51 ` Diego Novillo 2007-10-26 23:10 ` Ian Lance Taylor 2007-10-27 0:06 ` Jonathan Wakely 2007-10-27 0:43 ` Diego Novillo 2007-10-31 21:14 ` Jason Merrill 2007-10-31 22:16 ` Jason Merrill 2007-10-26 22:06 ` Daniel Jacobowitz 2007-10-26 22:50 ` Jakub Jelinek 2007-10-27 0:15 ` Ian Lance Taylor 2007-10-27 1:04 ` skaller 2007-10-27 13:40 ` Andrew Haley 2007-10-29 21:23 ` Ian Lance Taylor 2007-10-30 7:15 ` Ian Lance Taylor 2007-10-30 14:33 ` Ian Lance Taylor
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).