Remove noce_mem_write_may_trap_or_fault

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* Remove noce_mem_write_may_trap_or_fault_p in ifcvt
@ 2015-11-06 17:21 Bernd Schmidt
  2015-11-06 18:39 ` Jeff Law
  0 siblings, 1 reply; 24+ messages in thread
From: Bernd Schmidt @ 2015-11-06 17:21 UTC (permalink / raw)
  To: GCC Patches, Jakub Jelinek, Sebastian Pop

[-- Attachment #1: Type: text/plain, Size: 989 bytes --]

The ifcvt scratchpad patch had some modifications to the function 
noce_mem_write_may_trap_or_fault_p in ifcvt, but as far as I can tell, 
that function itself makes no sense whatsoever. It returns true for 
MEM_READONLY_P which is sensible, but then it also goes on an unreliable 
search through the address, looking for SYMBOL_REFs that are in 
decl_readonly_section. Needless to say, this won't ever find anything on 
targets that don't allow symbolic addresses, and is not reliable even on 
targets that do. The MEM_READONLY_P test must suffice; if there's a 
reason why it doesn't, we'd need to figure out why and possibly disable 
if-conversion of stores more thoroughly.

As a sanity check I bootstrapped and tested the first of the two 
attached patches, which changes the "return true" paths to 
gcc_unreachable. Since that passed, I propose the second of the two 
attached patches, which removes the function and replaces it with a 
simpler check. Ok if that passes too?


Bernd

[-- Attachment #2: ic-sanity.diff --]
[-- Type: text/x-patch, Size: 508 bytes --]

Index: gcc/ifcvt.c
===================================================================
--- gcc/ifcvt.c	(revision 229049)
+++ gcc/ifcvt.c	(working copy)
@@ -2868,11 +2868,12 @@ noce_mem_write_may_trap_or_fault_p (cons
 	  return false;
 	break;
       case LABEL_REF:
-	return true;
+	gcc_unreachable ();
       case SYMBOL_REF:
 	if (SYMBOL_REF_DECL (addr)
 	    && decl_readonly_section (SYMBOL_REF_DECL (addr), 0))
-	  return true;
+	  gcc_unreachable ();
+
 	return false;
       default:
 	return false;

[-- Attachment #3: ic-memro.diff --]
[-- Type: text/x-patch, Size: 2115 bytes --]

	* ifcvt.c (noce_mem_write_may_trap_or_fault_p): Delete function.
	(noce_process_if_block): Don't call it, test MEM_READONLY_P and
	may_trap_or_fault_p instead.

Index: gcc/ifcvt.c
===================================================================
--- gcc/ifcvt.c	(revision 229049)
+++ gcc/ifcvt.c	(working copy)
@@ -2828,59 +2828,6 @@ noce_operand_ok (const_rtx op)
   return ! may_trap_p (op);
 }
 
-/* Return true if a write into MEM may trap or fault.  */
-
-static bool
-noce_mem_write_may_trap_or_fault_p (const_rtx mem)
-{
-  rtx addr;
-
-  if (MEM_READONLY_P (mem))
-    return true;
-
-  if (may_trap_or_fault_p (mem))
-    return true;
-
-  addr = XEXP (mem, 0);
-
-  /* Call target hook to avoid the effects of -fpic etc....  */
-  addr = targetm.delegitimize_address (addr);
-
-  while (addr)
-    switch (GET_CODE (addr))
-      {
-      case CONST:
-      case PRE_DEC:
-      case PRE_INC:
-      case POST_DEC:
-      case POST_INC:
-      case POST_MODIFY:
-	addr = XEXP (addr, 0);
-	break;
-      case LO_SUM:
-      case PRE_MODIFY:
-	addr = XEXP (addr, 1);
-	break;
-      case PLUS:
-	if (CONST_INT_P (XEXP (addr, 1)))
-	  addr = XEXP (addr, 0);
-	else
-	  return false;
-	break;
-      case LABEL_REF:
-	return true;
-      case SYMBOL_REF:
-	if (SYMBOL_REF_DECL (addr)
-	    && decl_readonly_section (SYMBOL_REF_DECL (addr), 0))
-	  return true;
-	return false;
-      default:
-	return false;
-      }
-
-  return false;
-}
-
 /* Return whether we can use store speculation for MEM.  TOP_BB is the
    basic block above the conditional block where we are considering
    doing the speculative store.  We look for whether MEM is set
@@ -3210,7 +3157,8 @@ noce_process_if_block (struct noce_if_in
 	 THEN and ELSE arms, then we can go ahead with the conversion;
 	 either the program is broken, or the condition is always
 	 false such that the other memory is selected.  */
-      if (noce_mem_write_may_trap_or_fault_p (orig_x))
+      if (may_trap_or_fault_p (orig_x)
+	  || MEM_READONLY_P (orig_x))
 	return FALSE;
 
       /* Avoid store speculation: given "if (...) x = a" where x is a

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Remove noce_mem_write_may_trap_or_fault_p in ifcvt
  2015-11-06 17:21 Remove noce_mem_write_may_trap_or_fault_p in ifcvt Bernd Schmidt
@ 2015-11-06 18:39 ` Jeff Law
  2015-11-06 18:44   ` Bernd Schmidt
  0 siblings, 1 reply; 24+ messages in thread
From: Jeff Law @ 2015-11-06 18:39 UTC (permalink / raw)
  To: Bernd Schmidt, GCC Patches, Jakub Jelinek, Sebastian Pop

On 11/06/2015 10:21 AM, Bernd Schmidt wrote:
> The ifcvt scratchpad patch had some modifications to the function
> noce_mem_write_may_trap_or_fault_p in ifcvt, but as far as I can tell,
> that function itself makes no sense whatsoever. It returns true for
> MEM_READONLY_P which is sensible, but then it also goes on an unreliable
> search through the address, looking for SYMBOL_REFs that are in
> decl_readonly_section. Needless to say, this won't ever find anything on
> targets that don't allow symbolic addresses, and is not reliable even on
> targets that do. The MEM_READONLY_P test must suffice; if there's a
> reason why it doesn't, we'd need to figure out why and possibly disable
> if-conversion of stores more thoroughly.
>
> As a sanity check I bootstrapped and tested the first of the two
> attached patches, which changes the "return true" paths to
> gcc_unreachable. Since that passed, I propose the second of the two
> attached patches, which removes the function and replaces it with a
> simpler check. Ok if that passes too?
Given the name "..may_trap_or_fault_p" ISTM that its mode of operation 
should be to return true (the safe value) unless we can prove the write 
will not fault.  The more cases we can prove true, the better AFAICT.

The PLUS case looks totally wrong.  Though it could possibly be made 
correct by looking for [sp,fp,ap] + offset addresses and verifying we're 
doing a mis-aligned write.  We'd probably also need some kind of 
sensible verification that the offset isn't too large/small.

The SYMBOL_REF case is interesting too.  Can a write to a SYMBOL_REF 
that is not in a readonly section ever fault?  Perhaps a weak symbol? 
Or a mis-aligned write?

So I totally agree this code is bogus.  The fact that we're not hitting 
those cases is disturbing though.

You might look at BZ23567 and its associated test 20051104-1.c. 
Presumably those aren't hitting this path anymore, but why?



Jeff

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Remove noce_mem_write_may_trap_or_fault_p in ifcvt
  2015-11-06 18:39 ` Jeff Law
@ 2015-11-06 18:44   ` Bernd Schmidt
  2015-11-06 19:20     ` Jeff Law
  0 siblings, 1 reply; 24+ messages in thread
From: Bernd Schmidt @ 2015-11-06 18:44 UTC (permalink / raw)
  To: Jeff Law, Bernd Schmidt, GCC Patches, Jakub Jelinek, Sebastian Pop

On 11/06/2015 07:39 PM, Jeff Law wrote:
> Given the name "..may_trap_or_fault_p" ISTM that its mode of operation
> should be to return true (the safe value) unless we can prove the write
> will not fault.  The more cases we can prove true, the better AFAICT.
>
> The PLUS case looks totally wrong.  Though it could possibly be made
> correct by looking for [sp,fp,ap] + offset addresses and verifying we're
> doing a mis-aligned write.  We'd probably also need some kind of
> sensible verification that the offset isn't too large/small.

I'm guessing this is already covered by the call to may_trap_or_fault_p. 
The only additional thing that this function tries to prove is that the 
mem isn't readonly. IMO either MEM_READONLY_P is sufficient for that 
(and my patches operate under that assumption), or it isn't sufficient 
and no amount of checking the address is going to make the function useful.


Bernd

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Remove noce_mem_write_may_trap_or_fault_p in ifcvt
  2015-11-06 18:44   ` Bernd Schmidt
@ 2015-11-06 19:20     ` Jeff Law
  2015-11-06 19:30       ` Bernd Schmidt
  0 siblings, 1 reply; 24+ messages in thread
From: Jeff Law @ 2015-11-06 19:20 UTC (permalink / raw)
  To: Bernd Schmidt, Bernd Schmidt, GCC Patches, Jakub Jelinek, Sebastian Pop

On 11/06/2015 11:44 AM, Bernd Schmidt wrote:
> On 11/06/2015 07:39 PM, Jeff Law wrote:
>> Given the name "..may_trap_or_fault_p" ISTM that its mode of operation
>> should be to return true (the safe value) unless we can prove the write
>> will not fault.  The more cases we can prove true, the better AFAICT.
>>
>> The PLUS case looks totally wrong.  Though it could possibly be made
>> correct by looking for [sp,fp,ap] + offset addresses and verifying we're
>> doing a mis-aligned write.  We'd probably also need some kind of
>> sensible verification that the offset isn't too large/small.
>
> I'm guessing this is already covered by the call to may_trap_or_fault_p.
Yea, it looks like may_trap_or_fault_p tries to handle unaligned and 
wacky offsets for sp,fp,ap relative addresses.

So maybe what noce_mem_write_may_trap_or_fault_p is really trying to do 
is take objects where may_trap_or_fault_p returns false, but which in 
fact may fault if we write that memory?  ie, working around lameness in 
may_trap_or_fault_p not knowing the context (ie read vs write).

> The only additional thing that this function tries to prove is that the
> mem isn't readonly. IMO either MEM_READONLY_P is sufficient for that
> (and my patches operate under that assumption), or it isn't sufficient
> and no amount of checking the address is going to make the function useful.
Right.  I think we've come to agreement on what 
noce_mem_write_may_trap_or_fault_p is trying to to.

The problem I have is I don't think we can assume that the lack of 
MEM_READONLY_P is sufficient to determine that a particular memory 
reference is not to readonly memory.   ie, when MEM_READONLY_P is set, 
the object must be readonly, when it is not set, we know nothing.

I think what we really want here is to fix may_trap_or_fault_p to be 
safe.  It's returning "false" in cases where it really ought be 
returning "true".  Then we just use may_trap_or_fault_p, dropping 
noce_mem_write_may_trap_or_fault_p.

Jeff

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Remove noce_mem_write_may_trap_or_fault_p in ifcvt
  2015-11-06 19:20     ` Jeff Law
@ 2015-11-06 19:30       ` Bernd Schmidt
  2015-11-06 21:09         ` Jeff Law
  0 siblings, 1 reply; 24+ messages in thread
From: Bernd Schmidt @ 2015-11-06 19:30 UTC (permalink / raw)
  To: Jeff Law, GCC Patches, Jakub Jelinek, Sebastian Pop

On 11/06/2015 08:20 PM, Jeff Law wrote:
> So maybe what noce_mem_write_may_trap_or_fault_p is really trying to do
> is take objects where may_trap_or_fault_p returns false, but which in
> fact may fault if we write that memory?  ie, working around lameness in
> may_trap_or_fault_p not knowing the context (ie read vs write).

Yes exactly, that is the aim - detect extra possibilities for faults 
knowing that the context is a write.

> I think what we really want here is to fix may_trap_or_fault_p to be
> safe.  It's returning "false" in cases where it really ought be
> returning "true".  Then we just use may_trap_or_fault_p, dropping
> noce_mem_write_may_trap_or_fault_p.

Well, I think if MEM_READONLY_P is insufficient (and I guess people 
could cast away const - bother), then the current 
noce_mem_write_may_trap_or_fault_p should be simplified to "return 
true;" and eliminated.  We could try to special case stack accesses 
within a known limit later on.

Maybe it doesn't even matter given that we call noce_can_store_speculate 
immediately after this test.

Bernd

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Remove noce_mem_write_may_trap_or_fault_p in ifcvt
  2015-11-06 19:30       ` Bernd Schmidt
@ 2015-11-06 21:09         ` Jeff Law
  2015-11-18 19:16           ` Bernd Schmidt
  0 siblings, 1 reply; 24+ messages in thread
From: Jeff Law @ 2015-11-06 21:09 UTC (permalink / raw)
  To: Bernd Schmidt, GCC Patches, Jakub Jelinek, Sebastian Pop

On 11/06/2015 12:30 PM, Bernd Schmidt wrote:
> On 11/06/2015 08:20 PM, Jeff Law wrote:
>> So maybe what noce_mem_write_may_trap_or_fault_p is really trying to do
>> is take objects where may_trap_or_fault_p returns false, but which in
>> fact may fault if we write that memory?  ie, working around lameness in
>> may_trap_or_fault_p not knowing the context (ie read vs write).
>
> Yes exactly, that is the aim - detect extra possibilities for faults
> knowing that the context is a write.
>
>> I think what we really want here is to fix may_trap_or_fault_p to be
>> safe.  It's returning "false" in cases where it really ought be
>> returning "true".  Then we just use may_trap_or_fault_p, dropping
>> noce_mem_write_may_trap_or_fault_p.
>
> Well, I think if MEM_READONLY_P is insufficient (and I guess people
> could cast away const - bother), then the current
> noce_mem_write_may_trap_or_fault_p should be simplified to "return
> true;" and eliminated.  We could try to special case stack accesses
> within a known limit later on.
>
> Maybe it doesn't even matter given that we call noce_can_store_speculate
> immediately after this test.
may_trap_or_fault_p already has this kind of knowledge.  It just doesn't 
know if the memory is a read or a write.  Hence my suggestion we should 
fix may_trap_or_fault_p.

Essentially we pass in the type of access, read, write, unknown (where 
unknown must essentially be treated as write).  That gets passed down 
into the helpers where it's taken into account.  Particularly in 
rtx_addr_can_trap_p_1.

My concern with relying on MEM_READONLY_P and/or may_trap_or_fault_p is 
we know that may_trap_or_fault_p been called before in cases where it's 
returned false for readonly memory locations.  Ripping out 
noce_mem_write_may_trap_or_fault_p without fixing may_trap_or_fault_p 
introduces a latent code code generation issue.

Jeff

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Remove noce_mem_write_may_trap_or_fault_p in ifcvt
  2015-11-06 21:09         ` Jeff Law
@ 2015-11-18 19:16           ` Bernd Schmidt
  2015-11-18 23:49             ` Jeff Law
  0 siblings, 1 reply; 24+ messages in thread
From: Bernd Schmidt @ 2015-11-18 19:16 UTC (permalink / raw)
  To: Jeff Law, GCC Patches, Jakub Jelinek, Sebastian Pop

[-- Attachment #1: Type: text/plain, Size: 3556 bytes --]

Ok, so this is a thorny problem. I thought I had a solution, and I'll 
start by describing it, but in the end I think it doesn't actually work.

On 11/06/2015 10:09 PM, Jeff Law wrote:
> On 11/06/2015 12:30 PM, Bernd Schmidt wrote:
>> Well, I think if MEM_READONLY_P is insufficient (and I guess people
>> could cast away const - bother), then the current
>> noce_mem_write_may_trap_or_fault_p should be simplified to "return
>> true;" and eliminated.  We could try to special case stack accesses
>> within a known limit later on.
>>
>> Maybe it doesn't even matter given that we call noce_can_store_speculate
>> immediately after this test.
> may_trap_or_fault_p already has this kind of knowledge.  It just doesn't
> know if the memory is a read or a write.  Hence my suggestion we should
> fix may_trap_or_fault_p.

For the current issue I've come to the conclusion that this kind of 
analysis is irrelevant here (and that is not subject to the problem I'll 
describe later), because of the use of noce_can_store_speculate. See below.

> Ripping out noce_mem_write_may_trap_or_fault_p without fixing
> may_trap_or_fault_p introduces a latent code code generation issue.

I don't think so, actually. One safe option would be to rip it out and 
just stop transforming this case, but let's start by looking at the code 
just a bit further down, calling noce_can_store_speculate. This was 
added later than the code we're discussing, and it tries to verify that 
the same memory location will unconditionally be written to at a point 
later than the one we're trying to convert (but why aren't we testing 
for prior writes?). That should make any may_trap test unnecessary, and 
we can just simply delete the call to 
noce_mem_write_may_trap_or_fault_p. (We could leave such a test in as a 
compile-time focused early-out, but it may actually be too conservative. 
We may have an address which we can't prove safe in isolation, but if we 
know that we also have an unconditional store, it must be safe.)

So... I was about to propose the attached patch, which also fixes some 
oversights in the can_store_speculate path: we shouldn't allow autoinc 
addresses here. The added test in noce_can_store_speculate_p is not 
quite necessary given that the same one is also added to 
memory_must_be_modified_in_insn_p, but it avoids the need for an insn 
walk in cases where it isn't necessary. This bootstrapped and tested ok 
on x86_64-linux.

But then, I looked at noce_can_store_speculate again, and it seems 
broken. We walk over the post-dominators of the block, see if we find a 
store, and fail if something modifies the address:

   for (dominator = get_immediate_dominator (CDI_POST_DOMINATORS, top_bb);
        dominator != NULL;
        dominator = get_immediate_dominator (CDI_POST_DOMINATORS, 
dominator))
     {
       rtx_insn *insn;

       FOR_BB_INSNS (dominator, insn)
	{
[...]
	  if (modified_in_p (XEXP (mem, 0), insn))
	    return false;
	}

But what if the address is modified, but not in a post-dominator block? 
Let's say

  if (cond)
    *p = x; // this is the store we want to convert
  if (other_cond)
    p = some_pointer;
  else
    p = some_other_pointer;
  *p = y; // we'll see this store which postdominates the original
          // one, but not the modifications of p

So I think that algorithm doesn't work. My suggestion at this point 
would be to give up on converting stores entirely (deleting 
noce_can_store_speculate_p and noce_mem_write_may_trap_or_fault_p) until 
someone produces a reasonable scratchpad patch.


Bernd

[-- Attachment #2: noce-write.diff --]
[-- Type: text/x-patch, Size: 4584 bytes --]

	* alias.c (memory_must_be_modified_in_insn_p): Return false for
	addresses with side effects.
	* ifcvt.c (noce_mem_write_may_trap_or_fault_p): Delete function.
	(noce_can_store_speculate_p): Return false for addresses with side
	effects.
	(noce_process_if_block): Rely only on noce_can_store_speculate_p.

Index: gcc/alias.c
===================================================================
--- gcc/alias.c	(revision 230541)
+++ gcc/alias.c	(working copy)
@@ -2974,6 +2974,10 @@ memory_must_be_modified_in_insn_p (const
 {
   if (!INSN_P (insn))
     return false;
+  /* Our rtx_equal_p tests will not do the right thing in this case.  */
+  if (side_effects_p (XEXP (mem, 0)))
+    return false;
+
   insn = PATTERN (insn);
   if (GET_CODE (insn) == SET)
     return set_dest_equal_p (insn, mem);
@@ -2984,7 +2988,7 @@ memory_must_be_modified_in_insn_p (const
 	{
 	  rtx sub = XVECEXP (insn, 0, i);
 	  if (GET_CODE (sub) == SET
-	      &&  set_dest_equal_p (sub, mem))
+	      && set_dest_equal_p (sub, mem))
 	    return true;
 	}
     }
Index: gcc/ifcvt.c
===================================================================
--- gcc/ifcvt.c	(revision 230541)
+++ gcc/ifcvt.c	(working copy)
@@ -2919,59 +2919,6 @@ noce_operand_ok (const_rtx op)
   return ! may_trap_p (op);
 }
 
-/* Return true if a write into MEM may trap or fault.  */
-
-static bool
-noce_mem_write_may_trap_or_fault_p (const_rtx mem)
-{
-  rtx addr;
-
-  if (MEM_READONLY_P (mem))
-    return true;
-
-  if (may_trap_or_fault_p (mem))
-    return true;
-
-  addr = XEXP (mem, 0);
-
-  /* Call target hook to avoid the effects of -fpic etc....  */
-  addr = targetm.delegitimize_address (addr);
-
-  while (addr)
-    switch (GET_CODE (addr))
-      {
-      case CONST:
-      case PRE_DEC:
-      case PRE_INC:
-      case POST_DEC:
-      case POST_INC:
-      case POST_MODIFY:
-	addr = XEXP (addr, 0);
-	break;
-      case LO_SUM:
-      case PRE_MODIFY:
-	addr = XEXP (addr, 1);
-	break;
-      case PLUS:
-	if (CONST_INT_P (XEXP (addr, 1)))
-	  addr = XEXP (addr, 0);
-	else
-	  return false;
-	break;
-      case LABEL_REF:
-	return true;
-      case SYMBOL_REF:
-	if (SYMBOL_REF_DECL (addr)
-	    && decl_readonly_section (SYMBOL_REF_DECL (addr), 0))
-	  return true;
-	return false;
-      default:
-	return false;
-      }
-
-  return false;
-}
-
 /* Return whether we can use store speculation for MEM.  TOP_BB is the
    basic block above the conditional block where we are considering
    doing the speculative store.  We look for whether MEM is set
@@ -2982,6 +2929,9 @@ noce_can_store_speculate_p (basic_block
 {
   basic_block dominator;
 
+  if (side_effects_p (XEXP (mem, 0)))
+    return false;
+
   for (dominator = get_immediate_dominator (CDI_POST_DOMINATORS, top_bb);
        dominator != NULL;
        dominator = get_immediate_dominator (CDI_POST_DOMINATORS, dominator))
@@ -3540,26 +3490,21 @@ noce_process_if_block (struct noce_if_in
 
   if (!set_b && MEM_P (orig_x))
     {
-      /* Disallow the "if (...) x = a;" form (implicit "else x = x;")
-	 for optimizations if writing to x may trap or fault,
-	 i.e. it's a memory other than a static var or a stack slot,
-	 is misaligned on strict aligned machines or is read-only.  If
-	 x is a read-only memory, then the program is valid only if we
-	 avoid the store into it.  If there are stores on both the
-	 THEN and ELSE arms, then we can go ahead with the conversion;
-	 either the program is broken, or the condition is always
-	 false such that the other memory is selected.  */
-      if (noce_mem_write_may_trap_or_fault_p (orig_x))
-	return FALSE;
+      /* Here we have the "if (...) x = a;" form (with an implicit
+	 "else x = x;"), where x is a store to memory.
 
-      /* Avoid store speculation: given "if (...) x = a" where x is a
-	 MEM, we only want to do the store if x is always set
-	 somewhere in the function.  This avoids cases like
+	 We want to avoid store speculation: we only want to do the store
+	 if x is always set on a path leading to this point without an
+	 intervening memory barrier.  This avoids cases like
 	   if (pthread_mutex_trylock(mutex))
 	     ++global_variable;
 	 where we only want global_variable to be changed if the mutex
 	 is held.  FIXME: This should ideally be expressed directly in
-	 RTL somehow.  */
+	 RTL somehow.
+
+	 A success of this test implies that the access cannot trap, so
+	 we need not call may_trap_or_fault_p; doing so may even lose
+	 valid optimization opportunities.  */
       if (!noce_can_store_speculate_p (test_bb, orig_x))
 	return FALSE;
     }

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Remove noce_mem_write_may_trap_or_fault_p in ifcvt
  2015-11-18 19:16           ` Bernd Schmidt
@ 2015-11-18 23:49             ` Jeff Law
  2015-11-20 14:09               ` Bernd Schmidt
  0 siblings, 1 reply; 24+ messages in thread
From: Jeff Law @ 2015-11-18 23:49 UTC (permalink / raw)
  To: Bernd Schmidt, GCC Patches, Jakub Jelinek, Sebastian Pop

On 11/18/2015 12:16 PM, Bernd Schmidt wrote:
>
> For the current issue I've come to the conclusion that this kind of
> analysis is irrelevant here (and that is not subject to the problem I'll
> describe later), because of the use of noce_can_store_speculate. See below.
Hmm....

>
>> Ripping out noce_mem_write_may_trap_or_fault_p without fixing
>> may_trap_or_fault_p introduces a latent code code generation issue.
>
> I don't think so, actually. One safe option would be to rip it out and
> just stop transforming this case, but let's start by looking at the code
> just a bit further down, calling noce_can_store_speculate. This was
> added later than the code we're discussing, and it tries to verify that
> the same memory location will unconditionally be written to at a point
> later than the one we're trying to convert
And if we dig into that thread, Ian suggests this isn't terribly 
important anyway.

However, if we go even further upthread, we find an assertion from 
Michael that this is critical for 456.hmmer and references BZ 27313.

https://gcc.gnu.org/ml/gcc-patches/2007-08/msg01978.html

Sadly, no testcase was included.


> (but why aren't we testing
> for prior writes?).
Oversight I suspect.  Essentially you want to know if the store is 
anticipated (occurs on all paths to a given point).  There's a similar 
term for always occurs on all paths leaving a given point, but I can't 
remember it offhand.

>
> So... I was about to propose the attached patch, which also fixes some
> oversights in the can_store_speculate path: we shouldn't allow autoinc
> addresses here. The added test in noce_can_store_speculate_p is not
> quite necessary given that the same one is also added to
> memory_must_be_modified_in_insn_p, but it avoids the need for an insn
> walk in cases where it isn't necessary. This bootstrapped and tested ok
> on x86_64-linux.
Right.....

>
> But then, I looked at noce_can_store_speculate again, and it seems
> broken. We walk over the post-dominators of the block, see if we find a
> store, and fail if something modifies the address:
Egad.  That's clearly wrong.  Post domination means that paths to the 
exit must go through the post-dominator.  A path not


>
>    for (dominator = get_immediate_dominator (CDI_POST_DOMINATORS, top_bb);
>         dominator != NULL;
>         dominator = get_immediate_dominator (CDI_POST_DOMINATORS,
> dominator))
>      {
>        rtx_insn *insn;
>
>        FOR_BB_INSNS (dominator, insn)
>      {
> [...]
>        if (modified_in_p (XEXP (mem, 0), insn))
>          return false;
>      }
>
> But what if the address is modified, but not in a post-dominator block?
> Let's say
>
>   if (cond)
>     *p = x; // this is the store we want to convert
>   if (other_cond)
>     p = some_pointer;
>   else
>     p = some_other_pointer;
>   *p = y; // we'll see this store which postdominates the original
>           // one, but not the modifications of p
Right.  You essentially have to check all the blocks in the path.  At 
which point you might as well just run standard dataflow algorithms 
rather than coding up something ad-hoc here.

>
> So I think that algorithm doesn't work. My suggestion at this point
> would be to give up on converting stores entirely (deleting
> noce_can_store_speculate_p and noce_mem_write_may_trap_or_fault_p) until
> someone produces a reasonable scratchpad patch.
So if it weren't for the assertion that it's critical for hmmr, I'd be 
convinced that just ripping out was the right thing to do.

Can you look at 27313 and hmmr and see if there's an impact.  Just maybe 
the critical stuff for those is handled by the tree if converter and we 
can just rip out the clearly incorrect RTL bits without regressing 
anything performance-wise.  If there is an impact, then I think we have 
to look at either improving the tree bits (so we can remove the rtl 
bits) or we have to do real dataflow analysis in the rtl bits.


jeff

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Remove noce_mem_write_may_trap_or_fault_p in ifcvt
  2015-11-18 23:49             ` Jeff Law
@ 2015-11-20 14:09               ` Bernd Schmidt
  2015-11-20 18:57                 ` Jeff Law
  2015-11-23 16:03                 ` Michael Matz
  0 siblings, 2 replies; 24+ messages in thread
From: Bernd Schmidt @ 2015-11-20 14:09 UTC (permalink / raw)
  To: Jeff Law, GCC Patches, Jakub Jelinek, Sebastian Pop, Michael Matz

On 11/19/2015 12:49 AM, Jeff Law wrote:
> On 11/18/2015 12:16 PM, Bernd Schmidt wrote:
>> I don't think so, actually. One safe option would be to rip it out and
>> just stop transforming this case, but let's start by looking at the code
>> just a bit further down, calling noce_can_store_speculate. This was
>> added later than the code we're discussing, and it tries to verify that
>> the same memory location will unconditionally be written to at a point
>> later than the one we're trying to convert
> And if we dig into that thread, Ian suggests this isn't terribly
> important anyway.
>
> However, if we go even further upthread, we find an assertion from
> Michael that this is critical for 456.hmmer and references BZ 27313.

Cc'd in case he has additional input.

> https://gcc.gnu.org/ml/gcc-patches/2007-08/msg01978.html
>
> Sadly, no testcase was included.

BZ27313 is marked as fixed by the introduction of the tree cselim pass, 
thus the problem won't even be seen at RTL level.
I'm undecided on whether cs-elim is safe wrt the store speculation vs 
locks concerns raised in the thread discussing Ian's 
noce_can_store_speculate_p, but that's not something we have to consider 
to solve the problem at hand.

> So if it weren't for the assertion that it's critical for hmmr, I'd be
> convinced that just ripping out was the right thing to do.
>
> Can you look at 27313 and hmmr and see if there's an impact.  Just maybe
> the critical stuff for those is handled by the tree if converter and we
> can just rip out the clearly incorrect RTL bits without regressing
> anything performance-wise.  If there is an impact, then I think we have
> to look at either improving the tree bits (so we can remove the rtl
> bits) or we have to do real dataflow analysis in the rtl bits.

So I made this change:

    if (!set_b && MEM_P (orig_x))
-    {
-      /* Disallow the "if (...) x = a;" form (implicit "else x = x;")
-        for optimizations if writing to x may trap or fault,
-        i.e. it's a memory other than a static var or a stack slot,
-        is misaligned on strict aligned machines or is read-only.  If
-        x is a read-only memory, then the program is valid only if we
-        avoid the store into it.  If there are stores on both the
-        THEN and ELSE arms, then we can go ahead with the conversion;
-        either the program is broken, or the condition is always
-        false such that the other memory is selected.  */
-      if (noce_mem_write_may_trap_or_fault_p (orig_x))
-       return FALSE;
-
-      /* Avoid store speculation: given "if (...) x = a" where x is a
-        MEM, we only want to do the store if x is always set
-        somewhere in the function.  This avoids cases like
-          if (pthread_mutex_trylock(mutex))
-            ++global_variable;
-        where we only want global_variable to be changed if the mutex
-        is held.  FIXME: This should ideally be expressed directly in
-        RTL somehow.  */
-      if (!noce_can_store_speculate_p (test_bb, orig_x))
-       return FALSE;
-    }
+    return FALSE;

As far as I can tell hmmer and the 27313 testcase are unaffected at -O2 
(if anything, hmmer was very slightly faster afterwards). The run wasn't 
super-scientific, but I wouldn't have expected anything else given the 
existence of cs-elim.

Ok to do the above, removing all the bits made unnecessary (including 
memory_must_be_modified_in_insn_p in alias.c)?


Bernd

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Remove noce_mem_write_may_trap_or_fault_p in ifcvt
  2015-11-20 14:09               ` Bernd Schmidt
@ 2015-11-20 18:57                 ` Jeff Law
  2015-11-23 16:07                   ` Michael Matz
  2015-11-25 10:36                   ` Bernd Schmidt
  2015-11-23 16:03                 ` Michael Matz
  1 sibling, 2 replies; 24+ messages in thread
From: Jeff Law @ 2015-11-20 18:57 UTC (permalink / raw)
  To: Bernd Schmidt, GCC Patches, Jakub Jelinek, Sebastian Pop, Michael Matz

On 11/20/2015 07:08 AM, Bernd Schmidt wrote:
>
> BZ27313 is marked as fixed by the introduction of the tree cselim pass,
> thus the problem won't even be seen at RTL level.
Cool.

> I'm undecided on whether cs-elim is safe wrt the store speculation vs
> locks concerns raised in the thread discussing Ian's
> noce_can_store_speculate_p, but that's not something we have to consider
> to solve the problem at hand.
I don't think cs-elim is safe WRT locks and such in multi-threaded code.

    In particular it replaces this:

      bb0:
        if (cond) goto bb2; else goto bb1;
      bb1:
        *p = RHS;
      bb2:

    with

      bb0:
        if (cond) goto bb1; else goto bb2;
      bb1:
        condtmp' = *p;
      bb2:
        condtmp = PHI <RHS, condtmp'>
        *p = condtmp;


If *p is a shared memory location, then there may be another writer.  If 
that writer happens to store something in that location after the load 
of *p, but before the store to *p, then that store will get lost in the 
transformed pseudo code.

That seems to introduce a data race.  Presumably one would call the 
original code ill-formed WRT the C11/C++11 memory model since the shared 
location is not marked as such.

I'm willing to consider this an independent problem.



>
> As far as I can tell hmmer and the 27313 testcase are unaffected at -O2
> (if anything, hmmer was very slightly faster afterwards). The run wasn't
> super-scientific, but I wouldn't have expected anything else given the
> existence of cs-elim.
Cool.  It doesn't have to be super-scientific.  Good to see it didn't 
harm anything -- thanks for going the extra mile on this one.

>
> Ok to do the above, removing all the bits made unnecessary (including
> memory_must_be_modified_in_insn_p in alias.c)?
Yup.  Zap it.
jeff

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Remove noce_mem_write_may_trap_or_fault_p in ifcvt
  2015-11-20 14:09               ` Bernd Schmidt
  2015-11-20 18:57                 ` Jeff Law
@ 2015-11-23 16:03                 ` Michael Matz
  1 sibling, 0 replies; 24+ messages in thread
From: Michael Matz @ 2015-11-23 16:03 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: Jeff Law, GCC Patches, Jakub Jelinek, Sebastian Pop

Hi,

On Fri, 20 Nov 2015, Bernd Schmidt wrote:

> On 11/19/2015 12:49 AM, Jeff Law wrote:
> > On 11/18/2015 12:16 PM, Bernd Schmidt wrote:
> > > I don't think so, actually. One safe option would be to rip it out and
> > > just stop transforming this case, but let's start by looking at the code
> > > just a bit further down, calling noce_can_store_speculate. This was
> > > added later than the code we're discussing, and it tries to verify that
> > > the same memory location will unconditionally be written to at a point
> > > later than the one we're trying to convert
> > And if we dig into that thread, Ian suggests this isn't terribly
> > important anyway.
> > 
> > However, if we go even further upthread, we find an assertion from
> > Michael that this is critical for 456.hmmer and references BZ 27313.
> 
> Cc'd in case he has additional input.

The transformation I indicated in the mail introducing the cselim pass is 
indeed crucial for hmmer.  But that has nothing to do with 
noce_mem_write_may_trap_or_fault_p.  The latter is purely an RTL 
transformation, the transformation needs to happen on gimple level.  I 
agree with Bernd that the dominator walking code on the RTL side trying to 
find an unconditional write to the same address in order to validate the 
transformation looks wrong.


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Remove noce_mem_write_may_trap_or_fault_p in ifcvt
  2015-11-20 18:57                 ` Jeff Law
@ 2015-11-23 16:07                   ` Michael Matz
  2015-11-25 10:46                     ` Bernd Schmidt
  2015-11-25 10:36                   ` Bernd Schmidt
  1 sibling, 1 reply; 24+ messages in thread
From: Michael Matz @ 2015-11-23 16:07 UTC (permalink / raw)
  To: Jeff Law; +Cc: Bernd Schmidt, GCC Patches, Jakub Jelinek, Sebastian Pop

Hi,

On Fri, 20 Nov 2015, Jeff Law wrote:

> > I'm undecided on whether cs-elim is safe wrt the store speculation vs
> > locks concerns raised in the thread discussing Ian's
> > noce_can_store_speculate_p, but that's not something we have to consider
> > to solve the problem at hand.
> I don't think cs-elim is safe WRT locks and such in multi-threaded code.
> 
>    In particular it replaces this:
> 
>      bb0:
>        if (cond) goto bb2; else goto bb1;
>      bb1:
>        *p = RHS;
>      bb2:
> 
>    with
> 
>      bb0:
>        if (cond) goto bb1; else goto bb2;
>      bb1:
>        condtmp' = *p;
>      bb2:
>        condtmp = PHI <RHS, condtmp'>
>        *p = condtmp;

It only does so under some conditions, amongst them if it sees a 
dominating access to the same memory of the same type (load or store) and 
size.  So it doesn't introduce writes on paths that don't already contain 
a write, and hence are multi-thread safe.  Or, at least, that's the 
intention.

> If *p is a shared memory location, then there may be another writer.  
> If that writer happens to store something in that location after the 
> load of *p, but before the store to *p, then that store will get lost in 
> the transformed pseudo code.

Due to the above checking, also the first thread must have been an writer 
so there already are two writers on the same location, and hence a 
race is pre-existing.  It's only when you introduce a write when there was 
none whatsoever before (perhaps due to conditionals) that you create a new 
problem.

Ciao,
Michael.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Remove noce_mem_write_may_trap_or_fault_p in ifcvt
  2015-11-20 18:57                 ` Jeff Law
  2015-11-23 16:07                   ` Michael Matz
@ 2015-11-25 10:36                   ` Bernd Schmidt
  1 sibling, 0 replies; 24+ messages in thread
From: Bernd Schmidt @ 2015-11-25 10:36 UTC (permalink / raw)
  To: Jeff Law, GCC Patches, Jakub Jelinek, Sebastian Pop, Michael Matz

[-- Attachment #1: Type: text/plain, Size: 627 bytes --]

On 11/20/2015 07:57 PM, Jeff Law wrote:
>> Ok to do the above, removing all the bits made unnecessary (including
>> memory_must_be_modified_in_insn_p in alias.c)?
> Yup.  Zap it.

I briefly looked into keeping a reduced version of 
noce_can_store_speculate_p, looking only at the test block, so as to 
catch what I guess is a fairly common idiom:

*p = v1;
if (test)
   *p = v2;

But it turns out the SSA optimizers are already turning this into

if (test)
   *p = v2;
else
   *p = v1;

which I think ifcvt can handle has a different case, so there really 
seems no point. I committed the following after a test cycle.


Bernd

[-- Attachment #2: noce-write2.diff --]
[-- Type: text/x-patch, Size: 6022 bytes --]

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 49fa59b..1e788e8 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,11 @@
+2015-11-25  Bernd Schmidt  <bschmidt@redhat.com>
+
+	* ifcvt.c (noce_mem_write_may_trap_or_fault_p,
+	noce_can_store_speculate): Delete.
+	(noce_process_if_block): Don't try to handle single MEM stores.
+	* rtl.h (memory_must_be_modified_in_insn_p): Don't declare.
+	* alias.c (memory_must_be_modified_in_insn_p): Delete.
+
 2015-11-25  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
 
 	PR rtl-optimization/68435
diff --git a/gcc/alias.c b/gcc/alias.c
index fb7919a..9a642dd 100644
--- a/gcc/alias.c
+++ b/gcc/alias.c
@@ -3032,30 +3032,6 @@ set_dest_equal_p (const_rtx set, const_rtx item)
   return rtx_equal_p (dest, item);
 }
 
-/* Like memory_modified_in_insn_p, but return TRUE if INSN will
-   *DEFINITELY* modify the memory contents of MEM.  */
-bool
-memory_must_be_modified_in_insn_p (const_rtx mem, const_rtx insn)
-{
-  if (!INSN_P (insn))
-    return false;
-  insn = PATTERN (insn);
-  if (GET_CODE (insn) == SET)
-    return set_dest_equal_p (insn, mem);
-  else if (GET_CODE (insn) == PARALLEL)
-    {
-      int i;
-      for (i = 0; i < XVECLEN (insn, 0); i++)
-	{
-	  rtx sub = XVECEXP (insn, 0, i);
-	  if (GET_CODE (sub) == SET
-	      &&  set_dest_equal_p (sub, mem))
-	    return true;
-	}
-    }
-  return false;
-}
-
 /* Initialize the aliasing machinery.  Initialize the REG_KNOWN_VALUE
    array.  */
 
diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index 212d320..fc724bc 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -2962,97 +2962,6 @@ noce_operand_ok (const_rtx op)
   return ! may_trap_p (op);
 }
 
-/* Return true if a write into MEM may trap or fault.  */
-
-static bool
-noce_mem_write_may_trap_or_fault_p (const_rtx mem)
-{
-  rtx addr;
-
-  if (MEM_READONLY_P (mem))
-    return true;
-
-  if (may_trap_or_fault_p (mem))
-    return true;
-
-  addr = XEXP (mem, 0);
-
-  /* Call target hook to avoid the effects of -fpic etc....  */
-  addr = targetm.delegitimize_address (addr);
-
-  while (addr)
-    switch (GET_CODE (addr))
-      {
-      case CONST:
-      case PRE_DEC:
-      case PRE_INC:
-      case POST_DEC:
-      case POST_INC:
-      case POST_MODIFY:
-	addr = XEXP (addr, 0);
-	break;
-      case LO_SUM:
-      case PRE_MODIFY:
-	addr = XEXP (addr, 1);
-	break;
-      case PLUS:
-	if (CONST_INT_P (XEXP (addr, 1)))
-	  addr = XEXP (addr, 0);
-	else
-	  return false;
-	break;
-      case LABEL_REF:
-	return true;
-      case SYMBOL_REF:
-	if (SYMBOL_REF_DECL (addr)
-	    && decl_readonly_section (SYMBOL_REF_DECL (addr), 0))
-	  return true;
-	return false;
-      default:
-	return false;
-      }
-
-  return false;
-}
-
-/* Return whether we can use store speculation for MEM.  TOP_BB is the
-   basic block above the conditional block where we are considering
-   doing the speculative store.  We look for whether MEM is set
-   unconditionally later in the function.  */
-
-static bool
-noce_can_store_speculate_p (basic_block top_bb, const_rtx mem)
-{
-  basic_block dominator;
-
-  for (dominator = get_immediate_dominator (CDI_POST_DOMINATORS, top_bb);
-       dominator != NULL;
-       dominator = get_immediate_dominator (CDI_POST_DOMINATORS, dominator))
-    {
-      rtx_insn *insn;
-
-      FOR_BB_INSNS (dominator, insn)
-	{
-	  /* If we see something that might be a memory barrier, we
-	     have to stop looking.  Even if the MEM is set later in
-	     the function, we still don't want to set it
-	     unconditionally before the barrier.  */
-	  if (INSN_P (insn)
-	      && (volatile_insn_p (PATTERN (insn))
-		  || (CALL_P (insn) && (!RTL_CONST_CALL_P (insn)))))
-	    return false;
-
-	  if (memory_must_be_modified_in_insn_p (mem, insn))
-	    return true;
-	  if (modified_in_p (XEXP (mem, 0), insn))
-	    return false;
-
-	}
-    }
-
-  return false;
-}
-
 /* Return true if X contains a MEM subrtx.  */
 
 static bool
@@ -3582,30 +3491,12 @@ noce_process_if_block (struct noce_if_info *if_info)
     }
 
   if (!set_b && MEM_P (orig_x))
-    {
-      /* Disallow the "if (...) x = a;" form (implicit "else x = x;")
-	 for optimizations if writing to x may trap or fault,
-	 i.e. it's a memory other than a static var or a stack slot,
-	 is misaligned on strict aligned machines or is read-only.  If
-	 x is a read-only memory, then the program is valid only if we
-	 avoid the store into it.  If there are stores on both the
-	 THEN and ELSE arms, then we can go ahead with the conversion;
-	 either the program is broken, or the condition is always
-	 false such that the other memory is selected.  */
-      if (noce_mem_write_may_trap_or_fault_p (orig_x))
-	return FALSE;
-
-      /* Avoid store speculation: given "if (...) x = a" where x is a
-	 MEM, we only want to do the store if x is always set
-	 somewhere in the function.  This avoids cases like
-	   if (pthread_mutex_trylock(mutex))
-	     ++global_variable;
-	 where we only want global_variable to be changed if the mutex
-	 is held.  FIXME: This should ideally be expressed directly in
-	 RTL somehow.  */
-      if (!noce_can_store_speculate_p (test_bb, orig_x))
-	return FALSE;
-    }
+    /* We want to avoid store speculation to avoid cases like
+	 if (pthread_mutex_trylock(mutex))
+	   ++global_variable;
+       Rather than go to much effort here, we rely on the SSA optimizers,
+       which do a good enough job these days.  */
+    return FALSE;
 
   if (noce_try_move (if_info))
     goto success;
diff --git a/gcc/rtl.h b/gcc/rtl.h
index 194ed9b..0033466 100644
--- a/gcc/rtl.h
+++ b/gcc/rtl.h
@@ -3657,7 +3657,6 @@ extern void init_alias_analysis (void);
 extern void end_alias_analysis (void);
 extern void vt_equate_reg_base_value (const_rtx, const_rtx);
 extern bool memory_modified_in_insn_p (const_rtx, const_rtx);
-extern bool memory_must_be_modified_in_insn_p (const_rtx, const_rtx);
 extern bool may_be_sp_based_p (rtx);
 extern rtx gen_hard_reg_clobber (machine_mode, unsigned int);
 extern rtx get_reg_known_value (unsigned int);

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Remove noce_mem_write_may_trap_or_fault_p in ifcvt
  2015-11-23 16:07                   ` Michael Matz
@ 2015-11-25 10:46                     ` Bernd Schmidt
  2015-11-25 11:53                       ` Richard Biener
  2015-11-25 13:18                       ` Michael Matz
  0 siblings, 2 replies; 24+ messages in thread
From: Bernd Schmidt @ 2015-11-25 10:46 UTC (permalink / raw)
  To: Michael Matz, Jeff Law; +Cc: GCC Patches, Jakub Jelinek, Sebastian Pop

On 11/23/2015 05:05 PM, Michael Matz wrote:
>
> It only does so under some conditions, amongst them if it sees a
> dominating access to the same memory of the same type (load or store) and
> size.  So it doesn't introduce writes on paths that don't already contain
> a write, and hence are multi-thread safe.  Or, at least, that's the
> intention.

Does it also ensure there's no memory barrier in between?


Bernd

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Remove noce_mem_write_may_trap_or_fault_p in ifcvt
  2015-11-25 10:46                     ` Bernd Schmidt
@ 2015-11-25 11:53                       ` Richard Biener
  2015-11-25 13:18                         ` Michael Matz
  2015-11-25 13:18                       ` Michael Matz
  1 sibling, 1 reply; 24+ messages in thread
From: Richard Biener @ 2015-11-25 11:53 UTC (permalink / raw)
  To: Bernd Schmidt
  Cc: Michael Matz, Jeff Law, GCC Patches, Jakub Jelinek, Sebastian Pop

On Wed, Nov 25, 2015 at 11:44 AM, Bernd Schmidt <bschmidt@redhat.com> wrote:
> On 11/23/2015 05:05 PM, Michael Matz wrote:
>>
>>
>> It only does so under some conditions, amongst them if it sees a
>> dominating access to the same memory of the same type (load or store) and
>> size.  So it doesn't introduce writes on paths that don't already contain
>> a write, and hence are multi-thread safe.  Or, at least, that's the
>> intention.
>
>
> Does it also ensure there's no memory barrier in between?

I don't think so.  Btw, if you want to add this please add a new gimple
predicate to identify "memory barrier" (any call or asm with a VDEF).

Richard.

>
> Bernd
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Remove noce_mem_write_may_trap_or_fault_p in ifcvt
  2015-11-25 11:53                       ` Richard Biener
@ 2015-11-25 13:18                         ` Michael Matz
  2015-11-25 14:52                           ` Richard Biener
  0 siblings, 1 reply; 24+ messages in thread
From: Michael Matz @ 2015-11-25 13:18 UTC (permalink / raw)
  To: Richard Biener
  Cc: Bernd Schmidt, Jeff Law, GCC Patches, Jakub Jelinek, Sebastian Pop

Hi,

On Wed, 25 Nov 2015, Richard Biener wrote:

> I don't think so.  Btw, if you want to add this please add a new gimple 
> predicate to identify "memory barrier" (any call or asm with a VDEF).

      if (is_gimple_call (stmt) && !nonfreeing_call_p (stmt))
        nt_call_phase++;


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Remove noce_mem_write_may_trap_or_fault_p in ifcvt
  2015-11-25 10:46                     ` Bernd Schmidt
  2015-11-25 11:53                       ` Richard Biener
@ 2015-11-25 13:18                       ` Michael Matz
  1 sibling, 0 replies; 24+ messages in thread
From: Michael Matz @ 2015-11-25 13:18 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: Jeff Law, GCC Patches, Jakub Jelinek, Sebastian Pop

Hi,

On Wed, 25 Nov 2015, Bernd Schmidt wrote:

> On 11/23/2015 05:05 PM, Michael Matz wrote:
> > 
> > It only does so under some conditions, amongst them if it sees a
> > dominating access to the same memory of the same type (load or store) and
> > size.  So it doesn't introduce writes on paths that don't already contain
> > a write, and hence are multi-thread safe.  Or, at least, that's the
> > intention.
> 
> Does it also ensure there's no memory barrier in between?

Yes.  Whenever it sees any unknown call (including mem barriers) it 
forgets all known mem accesses (and hence can't mark accesses further down 
as not trapping).


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Remove noce_mem_write_may_trap_or_fault_p in ifcvt
  2015-11-25 13:18                         ` Michael Matz
@ 2015-11-25 14:52                           ` Richard Biener
  2015-11-25 15:02                             ` Jakub Jelinek
  0 siblings, 1 reply; 24+ messages in thread
From: Richard Biener @ 2015-11-25 14:52 UTC (permalink / raw)
  To: Michael Matz
  Cc: Bernd Schmidt, Jeff Law, GCC Patches, Jakub Jelinek, Sebastian Pop

On Wed, Nov 25, 2015 at 2:18 PM, Michael Matz <matz@suse.de> wrote:
> Hi,
>
> On Wed, 25 Nov 2015, Richard Biener wrote:
>
>> I don't think so.  Btw, if you want to add this please add a new gimple
>> predicate to identify "memory barrier" (any call or asm with a VDEF).
>
>       if (is_gimple_call (stmt) && !nonfreeing_call_p (stmt))
>         nt_call_phase++;

That looks bogus to me.  It misses asm()s and at least today
nonfreeing_call_p too much checks what it sounds like it checks.
In practice it might work though.  At least all the __sync_* and
__atomic_* calls are _not_ barriers this way.  A pthread_mutex_lock is
though as we don't have a builtin for it.

I'd change the above to a conservative

   if ((is_gimple_call (stmt) || is_gimple_asm (stmt)) && gimple_vuse (stmt))

Richard.

>
> Ciao,
> Michael.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Remove noce_mem_write_may_trap_or_fault_p in ifcvt
  2015-11-25 14:52                           ` Richard Biener
@ 2015-11-25 15:02                             ` Jakub Jelinek
  2015-11-25 15:18                               ` Michael Matz
  2015-11-25 15:49                               ` Bernd Schmidt
  0 siblings, 2 replies; 24+ messages in thread
From: Jakub Jelinek @ 2015-11-25 15:02 UTC (permalink / raw)
  To: Richard Biener
  Cc: Michael Matz, Bernd Schmidt, Jeff Law, GCC Patches, Sebastian Pop

On Wed, Nov 25, 2015 at 03:52:10PM +0100, Richard Biener wrote:
> On Wed, Nov 25, 2015 at 2:18 PM, Michael Matz <matz@suse.de> wrote:
> > Hi,
> >
> > On Wed, 25 Nov 2015, Richard Biener wrote:
> >
> >> I don't think so.  Btw, if you want to add this please add a new gimple
> >> predicate to identify "memory barrier" (any call or asm with a VDEF).
> >
> >       if (is_gimple_call (stmt) && !nonfreeing_call_p (stmt))
> >         nt_call_phase++;
> 
> That looks bogus to me.  It misses asm()s and at least today
> nonfreeing_call_p too much checks what it sounds like it checks.
> In practice it might work though.  At least all the __sync_* and
> __atomic_* calls are _not_ barriers this way.  A pthread_mutex_lock is
> though as we don't have a builtin for it.
> 
> I'd change the above to a conservative
> 
>    if ((is_gimple_call (stmt) || is_gimple_asm (stmt)) && gimple_vuse (stmt))

nonfreeing_call_p is one necessary condition (if that is true, it means
the call could mean that the first access does not trap while the second one
does).
But I agree that we need a predicate for nonbarrier_call_p or similar.
Some atomic builtins are not barriers though and IMHO should not be treated
as such, at least relaxed atomic loads/stores and pehraps even other relaxed
atomics.  And it would be nice to have IPA discovery of nonbarrier_call_p
calls, like we have for nonfreeing_fn.

	Jakub

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Remove noce_mem_write_may_trap_or_fault_p in ifcvt
  2015-11-25 15:02                             ` Jakub Jelinek
@ 2015-11-25 15:18                               ` Michael Matz
  2015-11-25 15:49                               ` Bernd Schmidt
  1 sibling, 0 replies; 24+ messages in thread
From: Michael Matz @ 2015-11-25 15:18 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Richard Biener, Bernd Schmidt, Jeff Law, GCC Patches, Sebastian Pop

Hi,

On Wed, 25 Nov 2015, Jakub Jelinek wrote:

> > That looks bogus to me.  It misses asm()s and at least today 
> > nonfreeing_call_p too much checks what it sounds like it checks. In 
> > practice it might work though.  At least all the __sync_* and 
> > __atomic_* calls are _not_ barriers this way.  A pthread_mutex_lock is 
> > though as we don't have a builtin for it.
> > 
> > I'd change the above to a conservative
> > 
> >    if ((is_gimple_call (stmt) || is_gimple_asm (stmt)) && gimple_vuse (stmt))
> 
> nonfreeing_call_p is one necessary condition (if that is true, it means
> the call could mean that the first access does not trap while the second one
> does).

Yes, Richis test is conservatively catching this as well (it's clear the 
table for all calls and asms that even remotely can change memory state 
(well, when using gimple_vdef, not gimple_vuse).


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Remove noce_mem_write_may_trap_or_fault_p in ifcvt
  2015-11-25 15:02                             ` Jakub Jelinek
  2015-11-25 15:18                               ` Michael Matz
@ 2015-11-25 15:49                               ` Bernd Schmidt
  2015-11-25 15:55                                 ` Michael Matz
  1 sibling, 1 reply; 24+ messages in thread
From: Bernd Schmidt @ 2015-11-25 15:49 UTC (permalink / raw)
  To: Jakub Jelinek, Richard Biener
  Cc: Michael Matz, Jeff Law, GCC Patches, Sebastian Pop

[-- Attachment #1: Type: text/plain, Size: 700 bytes --]

On 11/25/2015 04:00 PM, Jakub Jelinek wrote:
> nonfreeing_call_p is one necessary condition (if that is true, it means
> the call could mean that the first access does not trap while the second one
> does).
> But I agree that we need a predicate for nonbarrier_call_p or similar.
> Some atomic builtins are not barriers though and IMHO should not be treated
> as such, at least relaxed atomic loads/stores and pehraps even other relaxed
> atomics.  And it would be nice to have IPA discovery of nonbarrier_call_p
> calls, like we have for nonfreeing_fn.

So here's a very basic version which I think is appropriate for the 
current stage, and can be extended later. Ok if it passes testing?


Bernd


[-- Attachment #2: nonbarrier.diff --]
[-- Type: text/x-patch, Size: 1840 bytes --]

diff --git a/gcc/gimple.c b/gcc/gimple.c
index 2764df8..bf552a7 100644
--- a/gcc/gimple.c
+++ b/gcc/gimple.c
@@ -2636,6 +2636,18 @@ nonfreeing_call_p (gimple *call)
   return n->nonfreeing_fn;
 }
 
+/* Return true when CALL is a call stmt that definitely need not
+   be considered to be a memory barrier.  */
+bool
+nonbarrier_call_p (gimple *call)
+{
+  if (gimple_call_flags (call) & (ECF_PURE | ECF_CONST))
+    return true;
+  /* Should extend this to have a nonbarrier_fn flag, just as above in
+     the nonfreeing case.  */
+  return false;
+}
+
 /* Callback for walk_stmt_load_store_ops.
  
    Return TRUE if OP will dereference the tree stored in DATA, FALSE
diff --git a/gcc/gimple.h b/gcc/gimple.h
index 6eb22de..0b04804 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -1507,6 +1507,7 @@ extern bool gimple_call_builtin_p (const gimple *, enum built_in_function);
 extern bool gimple_asm_clobbers_memory_p (const gasm *);
 extern void dump_decl_set (FILE *, bitmap);
 extern bool nonfreeing_call_p (gimple *);
+extern bool nonbarrier_call_p (gimple *);
 extern bool infer_nonnull_range (gimple *, tree);
 extern bool infer_nonnull_range_by_dereference (gimple *, tree);
 extern bool infer_nonnull_range_by_attribute (gimple *, tree);
diff --git a/gcc/tree-ssa-phiopt.c b/gcc/tree-ssa-phiopt.c
index 02d5aa0..56b3732 100644
--- a/gcc/tree-ssa-phiopt.c
+++ b/gcc/tree-ssa-phiopt.c
@@ -1519,7 +1519,10 @@ nontrapping_dom_walker::before_dom_children (basic_block bb)
     {
       gimple *stmt = gsi_stmt (gsi);
 
-      if (is_gimple_call (stmt) && !nonfreeing_call_p (stmt))
+      if (gimple_code (stmt) == GIMPLE_ASM
+	  || (is_gimple_call (stmt)
+	      && (!nonfreeing_call_p (stmt)
+		  || !nonbarrier_call_p (stmt))))
 	nt_call_phase++;
       else if (gimple_assign_single_p (stmt) && !gimple_has_volatile_ops (stmt))
 	{

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Remove noce_mem_write_may_trap_or_fault_p in ifcvt
  2015-11-25 15:49                               ` Bernd Schmidt
@ 2015-11-25 15:55                                 ` Michael Matz
  2015-11-26  9:50                                   ` Richard Biener
  0 siblings, 1 reply; 24+ messages in thread
From: Michael Matz @ 2015-11-25 15:55 UTC (permalink / raw)
  To: Bernd Schmidt
  Cc: Jakub Jelinek, Richard Biener, Jeff Law, GCC Patches, Sebastian Pop

Hi,

On Wed, 25 Nov 2015, Bernd Schmidt wrote:

> So here's a very basic version which I think is appropriate for the 
> current stage, and can be extended later. Ok if it passes testing?

When we're improving that place, we should really only consider ASMs that 
change memory state to be problematic ("&& gimple_vdef (stmt)").


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Remove noce_mem_write_may_trap_or_fault_p in ifcvt
  2015-11-25 15:55                                 ` Michael Matz
@ 2015-11-26  9:50                                   ` Richard Biener
  2015-11-27 10:22                                     ` Bernd Schmidt
  0 siblings, 1 reply; 24+ messages in thread
From: Richard Biener @ 2015-11-26  9:50 UTC (permalink / raw)
  To: Michael Matz
  Cc: Bernd Schmidt, Jakub Jelinek, Jeff Law, GCC Patches, Sebastian Pop

On Wed, Nov 25, 2015 at 4:54 PM, Michael Matz <matz@suse.de> wrote:
> Hi,
>
> On Wed, 25 Nov 2015, Bernd Schmidt wrote:
>
>> So here's a very basic version which I think is appropriate for the
>> current stage, and can be extended later. Ok if it passes testing?
>
> When we're improving that place, we should really only consider ASMs that
> change memory state to be problematic ("&& gimple_vdef (stmt)").

Ok with the change suggested by Micha for the asm()s.  Note that I
originally used gimple_vuse () instead of gimple_vdef () as even
reading random memory is a barrier for the compiler to move stores
across it (not reads, of course).  Which is why I also considered
pure (global memory reading) calls to be a barrier (for the stores).

Of course as we don't consider regular assign statement reads (or stores)
to be a "barrier" in the sense that matters here (we're not looking for
memory optimization barriers!) this might be moot and then the
middle-end will effectively require all synchronization barriers (which we
are looking for(?)) to appear as clobbering memory.

Thanks,
Richard.

>
> Ciao,
> Michael.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Remove noce_mem_write_may_trap_or_fault_p in ifcvt
  2015-11-26  9:50                                   ` Richard Biener
@ 2015-11-27 10:22                                     ` Bernd Schmidt
  0 siblings, 0 replies; 24+ messages in thread
From: Bernd Schmidt @ 2015-11-27 10:22 UTC (permalink / raw)
  To: Richard Biener, Michael Matz
  Cc: Jakub Jelinek, Jeff Law, GCC Patches, Sebastian Pop

On 11/26/2015 10:46 AM, Richard Biener wrote:

> Ok with the change suggested by Micha for the asm()s.  Note that I
> originally used gimple_vuse () instead of gimple_vdef () as even
> reading random memory is a barrier for the compiler to move stores
> across it (not reads, of course).  Which is why I also considered
> pure (global memory reading) calls to be a barrier (for the stores).

Yes, but IIUC the stores aren't being moved, they may just be turned 
into unconditional ones. Being nontrapping is one necessary condition 
for that (which we already compute), but we also want to make sure that 
we don't introduce surprises for threaded programs.

> Of course as we don't consider regular assign statement reads (or stores)
> to be a "barrier" in the sense that matters here (we're not looking for
> memory optimization barriers!) this might be moot and then the
> middle-end will effectively require all synchronization barriers (which we
> are looking for(?)) to appear as clobbering memory.

If I read this correctly you have reached the same conclusions. Test 
results came back ok (with vdef tested for asms), and I've committed the 
change.


Bernd

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2015-11-27 10:07 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-06 17:21 Remove noce_mem_write_may_trap_or_fault_p in ifcvt Bernd Schmidt
2015-11-06 18:39 ` Jeff Law
2015-11-06 18:44   ` Bernd Schmidt
2015-11-06 19:20     ` Jeff Law
2015-11-06 19:30       ` Bernd Schmidt
2015-11-06 21:09         ` Jeff Law
2015-11-18 19:16           ` Bernd Schmidt
2015-11-18 23:49             ` Jeff Law
2015-11-20 14:09               ` Bernd Schmidt
2015-11-20 18:57                 ` Jeff Law
2015-11-23 16:07                   ` Michael Matz
2015-11-25 10:46                     ` Bernd Schmidt
2015-11-25 11:53                       ` Richard Biener
2015-11-25 13:18                         ` Michael Matz
2015-11-25 14:52                           ` Richard Biener
2015-11-25 15:02                             ` Jakub Jelinek
2015-11-25 15:18                               ` Michael Matz
2015-11-25 15:49                               ` Bernd Schmidt
2015-11-25 15:55                                 ` Michael Matz
2015-11-26  9:50                                   ` Richard Biener
2015-11-27 10:22                                     ` Bernd Schmidt
2015-11-25 13:18                       ` Michael Matz
2015-11-25 10:36                   ` Bernd Schmidt
2015-11-23 16:03                 ` Michael Matz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).