From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 28789 invoked by alias); 26 Oct 2007 22:50:49 -0000 Received: (qmail 28769 invoked by uid 22791); 26 Oct 2007 22:50:48 -0000 X-Spam-Check-By: sourceware.org Received: from smtp-out.google.com (HELO smtp-out.google.com) (216.239.33.17) by sourceware.org (qpsmtpd/0.31) with ESMTP; Fri, 26 Oct 2007 22:50:45 +0000 Received: from zps38.corp.google.com (zps38.corp.google.com [172.25.146.38]) by smtp-out.google.com with ESMTP id l9QModV9026161; Fri, 26 Oct 2007 23:50:40 +0100 Received: from localhost.localdomain.google.com (dhcp-172-18-116-197.corp.google.com [172.18.116.197]) (authenticated bits=0) by zps38.corp.google.com with ESMTP id l9QMocSB019836 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NOT); Fri, 26 Oct 2007 15:50:39 -0700 To: Jakub Jelinek Cc: Michael Matz , gcc@gcc.gnu.org, gcc-patches@gcc.gnu.org Subject: Re: Optimization of conditional access to globals: thread-unsafe? References: <20071026143334.GA5041@moonlight.home> <20071026155101.GB5041@moonlight.home> <016201c817e9$5454edd0$2e08a8c0@CAM.ARTIMI.COM> <20071026161739.GC5041@moonlight.home> <20071026220614.GS5451@devserv.devel.redhat.com> From: Ian Lance Taylor Date: Fri, 26 Oct 2007 22:55:00 -0000 In-Reply-To: <20071026220614.GS5451@devserv.devel.redhat.com> Message-ID: User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.4 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-IsSubscribed: yes Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-owner@gcc.gnu.org X-SW-Source: 2007-10/txt/msg00555.txt.bz2 Jakub Jelinek writes: > On Fri, Oct 26, 2007 at 02:24:21PM -0700, Ian Lance Taylor wrote: > > What do people think of this patch? This seems to fix the problem > > case without breaking Michael's case. It basically avoids store > > speculation: we don't write to a MEM unless the function > > unconditionally writes to the MEM anyhow. > > This still isn't enough. If you have a non-pure/non-const CALL_INSN > before the unconditional store into it, you need to return false from > noce_mem_unconditionally_set_p as that function could have a barrier > in it. Similarly for inline asm or __sync_* builtin generated insns > (not sure ATM if just stopping on UNSPEC_VOLATILE/ASM_INPUT/ASM_OPERANDS > or something else is needed). Yeah, I thought of that later. This is the patch I'm actually testing. Does this look OK to you? Ian Index: ifcvt.c =================================================================== --- ifcvt.c (revision 129661) +++ ifcvt.c (working copy) @@ -2139,6 +2139,47 @@ noce_mem_write_may_trap_or_fault_p (cons return false; } +/* Return whether we can use store speculation for MEM. TOP_BB is the + basic block above the conditional block where we are considering + doing the speculative store. We look for whether MEM is set + unconditionally later in the function. */ + +static bool +noce_can_store_speculate_p (basic_block top_bb, const_rtx mem) +{ + basic_block dominator; + + for (dominator = get_immediate_dominator (CDI_POST_DOMINATORS, top_bb); + dominator != NULL; + dominator = get_immediate_dominator (CDI_POST_DOMINATORS, dominator)) + { + rtx insn; + + FOR_BB_INSNS (dominator, insn) + { + if (memory_modified_in_insn_p (mem, insn)) + return true; + if (modified_in_p (XEXP (mem, 0), insn)) + return false; + + /* If we see something that might be a memory barrier, we + have to stop looking. Even if the MEM is set later in + the function, we still don't want to set it + unconditionally before the barrier. Note that + memory_modified_in_p will return true for an asm which + clobbers memory. */ + if (INSN_P (insn) + && (volatile_insn_p (PATTERN (insn)) + || (CALL_P (insn) + && (!CONST_OR_PURE_CALL_P (insn) + || pure_call_p (insn))))) + return false; + } + } + + return false; +} + /* Given a simple IF-THEN-JOIN or IF-THEN-ELSE-JOIN block, attempt to convert it without using conditional execution. Return TRUE if we were successful at converting the block. */ @@ -2292,17 +2333,31 @@ noce_process_if_block (struct noce_if_in goto success; } - /* Disallow the "if (...) x = a;" form (with an implicit "else x = x;") - for optimizations if writing to x may trap or fault, i.e. it's a memory - other than a static var or a stack slot, is misaligned on strict - aligned machines or is read-only. - If x is a read-only memory, then the program is valid only if we - avoid the store into it. If there are stores on both the THEN and - ELSE arms, then we can go ahead with the conversion; either the - program is broken, or the condition is always false such that the - other memory is selected. */ - if (!set_b && MEM_P (orig_x) && noce_mem_write_may_trap_or_fault_p (orig_x)) - return FALSE; + if (!set_b && MEM_P (orig_x)) + { + /* Disallow the "if (...) x = a;" form (implicit "else x = x;") + for optimizations if writing to x may trap or fault, + i.e. it's a memory other than a static var or a stack slot, + is misaligned on strict aligned machines or is read-only. If + x is a read-only memory, then the program is valid only if we + avoid the store into it. If there are stores on both the + THEN and ELSE arms, then we can go ahead with the conversion; + either the program is broken, or the condition is always + false such that the other memory is selected. */ + if (noce_mem_write_may_trap_or_fault_p (orig_x)) + return FALSE; + + /* Avoid store speculation: given "if (...) x = a" where x is a + MEM, we only want to do the store if x is always set + somewhere in the function. This avoids cases like + if (pthread_mutex_trylock(mutex)) + ++global_variable; + where we only want global_variable to be changed if the mutex + is held. FIXME: This should ideally be expressed directly in + RTL somehow. */ + if (!noce_can_store_speculate_p (test_bb, orig_x)) + return FALSE; + } if (noce_try_move (if_info)) goto success; @@ -3957,7 +4012,7 @@ dead_or_predicable (basic_block test_bb, /* Main entry point for all if-conversion. */ static void -if_convert (bool recompute_dominance) +if_convert (void) { basic_block bb; int pass; @@ -3977,9 +4032,8 @@ if_convert (bool recompute_dominance) loop_optimizer_finalize (); free_dominance_info (CDI_DOMINATORS); - /* Compute postdominators if we think we'll use them. */ - if (HAVE_conditional_execution || recompute_dominance) - calculate_dominance_info (CDI_POST_DOMINATORS); + /* Compute postdominators. */ + calculate_dominance_info (CDI_POST_DOMINATORS); df_set_flags (DF_LR_RUN_DCE); @@ -4068,7 +4122,7 @@ rest_of_handle_if_conversion (void) if (dump_file) dump_flow_info (dump_file, dump_flags); cleanup_cfg (CLEANUP_EXPENSIVE); - if_convert (false); + if_convert (); } cleanup_cfg (0); @@ -4105,7 +4159,7 @@ gate_handle_if_after_combine (void) static unsigned int rest_of_handle_if_after_combine (void) { - if_convert (true); + if_convert (); return 0; } @@ -4138,7 +4192,7 @@ gate_handle_if_after_reload (void) static unsigned int rest_of_handle_if_after_reload (void) { - if_convert (true); + if_convert (); return 0; }