public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] Generate more efficient memory barriers for LEON3
@ 2014-06-25 11:21 Daniel Cederman
  2014-07-10  9:39 ` Eric Botcazou
  0 siblings, 1 reply; 8+ messages in thread
From: Daniel Cederman @ 2014-06-25 11:21 UTC (permalink / raw)
  To: gcc-patches; +Cc: Software

Hello,

The memory barriers generated for SPARC are targeting the weakest memory 
model allowed for SPARC. The LEON3/4 SPARC processors are using a 
stronger memory model and thus have less requirements on the memory 
barriers. For LEON3/4, StoreStore is compiler-only, instead of "stbar", 
and StoreLoad can be achieved with a normal byte write "stb", instead of 
an atomic byte read-write "ldstub". The provided patch changes the 
previously mentioned memory barriers for TARGET_LEON3.

Best regards,
Daniel Cederman

ChangeLog:

2014-06-25  Daniel Cederman  <cederman@gaisler.com>

gcc/config/sparc/
	* sync.md: Generate more efficient memory barriers for LEON3

diff --git a/gcc/config/sparc/sync.md b/gcc/config/sparc/sync.md
index e6e237f..26173a7 100644
--- a/gcc/config/sparc/sync.md
+++ b/gcc/config/sparc/sync.md
@@ -56,6 +56,15 @@
    [(set_attr "type" "multi")
     (set_attr "length" "0")])

+;; For LEON3, membar #StoreStore is compiler-only.
+(define_insn "*membar_storestore_leon3"
+  [(set (match_operand:BLK 0 "" "")
+       (unspec:BLK [(match_dup 0) (const_int 8)] UNSPEC_MEMBAR))]
+  "TARGET_LEON3"
+  ""
+  [(set_attr "type" "multi")
+  (set_attr "length" "0")])
+
  ;; For V8, STBAR is exactly membar #StoreStore, by definition.
  (define_insn "*membar_storestore"
    [(set (match_operand:BLK 0 "" "")
@@ -64,6 +73,14 @@
    "stbar"
    [(set_attr "type" "multi")])

+;; For LEON3, STB has the effect of membar #StoreLoad.
+(define_insn "*membar_storeload_leon3"
+  [(set (match_operand:BLK 0 "" "")
+       (unspec:BLK [(match_dup 0) (const_int 2)] UNSPEC_MEMBAR))]
+  "TARGET_LEON3"
+  "stb\t%%g0, [%%sp-1]"
+  [(set_attr "type" "multi")])
+
  ;; For V8, LDSTUB has the effect of membar #StoreLoad.
  (define_insn "*membar_storeload"
    [(set (match_operand:BLK 0 "" "")
@@ -72,6 +89,15 @@
    "ldstub\t[%%sp-1], %%g0"
    [(set_attr "type" "multi")])

+;; For LEON3, membar #StoreLoad is enough for a full barrier.
+(define_insn "*membar_leon3"
+  [(set (match_operand:BLK 0 "" "")
+       (unspec:BLK [(match_dup 0) (match_operand:SI 1 "const_int_operand")]
+                   UNSPEC_MEMBAR))]
+  "TARGET_LEON3"
+  "stb\t%%g0, [%%sp-1]"
+  [(set_attr "type" "multi")])
+
  ;; Put the two together, in combination with the fact that V8 
implements PSO
  ;; as its weakest memory model, means a full barrier.  Match all remaining
  ;; instances of the membar pattern for Sparc V8.

-- 
Daniel Cederman
Software Engineer
Aeroflex Gaisler AB
Aeroflex Microelectronic Solutions – HiRel
Kungsgatan 12
SE-411 19 Gothenburg, Sweden
Phone: +46 31 7758665
cederman@gaisler.com
www.Aeroflex.com/Gaisler

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Generate more efficient memory barriers for LEON3
  2014-06-25 11:21 [PATCH] Generate more efficient memory barriers for LEON3 Daniel Cederman
@ 2014-07-10  9:39 ` Eric Botcazou
  2014-07-11  7:55   ` Daniel Cederman
  0 siblings, 1 reply; 8+ messages in thread
From: Eric Botcazou @ 2014-07-10  9:39 UTC (permalink / raw)
  To: Daniel Cederman; +Cc: gcc-patches, Software

> The memory barriers generated for SPARC are targeting the weakest memory
> model allowed for SPARC.

That's not quite true, they are targeting the sparc_memory_model, which is the 
memory model selected for the architecture/OS pair by default and which can be 
overridden by the user with -mmemory-model=[default|rmo|pso|tso|sc].

> The LEON3/4 SPARC processors are using a stronger memory model and thus have
> less requirements on the memory barriers.

My understanding is that they use TSO, in which case...

> For LEON3/4, StoreStore is compiler-only, instead of "stbar",

..."stdbar" should never be generated since #StoreStore is implied by TSO.

> and StoreLoad can be achieved with a normal byte write "stb", instead of
> an atomic byte read-write "ldstub".

OK, thanks.  Does this result in a significance performance gain?

> The provided patch changes the previously mentioned memory barriers for
> TARGET_LEON3.

I think that only the membar_storeload_leon3 pattern is necessary.  Couple of 
more nits: the new pattern is not "multi", it's "store" and you need to add:

  && !TARGET_LEON3

to the original membar_storeload since TARGET_LEON3 is also TARGET_V8.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Generate more efficient memory barriers for LEON3
  2014-07-10  9:39 ` Eric Botcazou
@ 2014-07-11  7:55   ` Daniel Cederman
  2014-07-11  9:17     ` Eric Botcazou
  0 siblings, 1 reply; 8+ messages in thread
From: Daniel Cederman @ 2014-07-11  7:55 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches, Software

Hi,

Thank you for your comments.

 > ..."stdbar" should never be generated since #StoreStore is implied by 
TSO.

I missed that #StoreStore is never generated for TSO, so I'm removing 
that pattern.

 > OK, thanks.  Does this result in a significance performance gain?

stb seems to be at least twice as fast as ldstub.

 > I think that only the membar_storeload_leon3 pattern is necessary.

The full barrier pattern membar_leon3 also gets generated so I think 
that one should be kept also.

I'm changing the pattern type to "store" and the condition on the 
original patterns to "&& !TARGET_LEON3" and resubmitting the patch.

On 2014-07-10 11:37, Eric Botcazou wrote:
>> The memory barriers generated for SPARC are targeting the weakest memory
>> model allowed for SPARC.
>
> That's not quite true, they are targeting the sparc_memory_model, which is the
> memory model selected for the architecture/OS pair by default and which can be
> overridden by the user with -mmemory-model=[default|rmo|pso|tso|sc].
>
>> The LEON3/4 SPARC processors are using a stronger memory model and thus have
>> less requirements on the memory barriers.
>
> My understanding is that they use TSO, in which case...
>
>> For LEON3/4, StoreStore is compiler-only, instead of "stbar",
>
> ..."stdbar" should never be generated since #StoreStore is implied by TSO.
>
>> and StoreLoad can be achieved with a normal byte write "stb", instead of
>> an atomic byte read-write "ldstub".
>
> OK, thanks.  Does this result in a significance performance gain?
>
>> The provided patch changes the previously mentioned memory barriers for
>> TARGET_LEON3.
>
> I think that only the membar_storeload_leon3 pattern is necessary.  Couple of
> more nits: the new pattern is not "multi", it's "store" and you need to add:
>
>    && !TARGET_LEON3
>
> to the original membar_storeload since TARGET_LEON3 is also TARGET_V8.
>

-- 
Daniel Cederman
Software Engineer
Aeroflex Gaisler AB
Aeroflex Microelectronic Solutions – HiRel
Kungsgatan 12
SE-411 19 Gothenburg, Sweden
Phone: +46 31 7758665
cederman@gaisler.com
www.Aeroflex.com/Gaisler

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Generate more efficient memory barriers for LEON3
  2014-07-11  7:55   ` Daniel Cederman
@ 2014-07-11  9:17     ` Eric Botcazou
  2014-07-11 10:25       ` Daniel Cederman
  0 siblings, 1 reply; 8+ messages in thread
From: Eric Botcazou @ 2014-07-11  9:17 UTC (permalink / raw)
  To: Daniel Cederman; +Cc: gcc-patches, Software

> The full barrier pattern membar_leon3 also gets generated so I think
> that one should be kept also.

Do you have a testcase?  membar is generated by sparc_emit_membar_for_model 
and, for the TSO model of LEON3, implied = StoreStore | LoadLoad | LoadStore 
so mm can only be StoreLoad, which means that membar_storeload will match so 
the full barrier never will.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Generate more efficient memory barriers for LEON3
  2014-07-11  9:17     ` Eric Botcazou
@ 2014-07-11 10:25       ` Daniel Cederman
  2014-07-12  9:22         ` Eric Botcazou
  0 siblings, 1 reply; 8+ messages in thread
From: Daniel Cederman @ 2014-07-11 10:25 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches, Software

That was an error on my side. The wrong memory model had gotten cached 
in a generated make script. Lets drop membar_leon3 also then :)

On 2014-07-11 11:15, Eric Botcazou wrote:
>> The full barrier pattern membar_leon3 also gets generated so I think
>> that one should be kept also.
>
> Do you have a testcase?  membar is generated by sparc_emit_membar_for_model
> and, for the TSO model of LEON3, implied = StoreStore | LoadLoad | LoadStore
> so mm can only be StoreLoad, which means that membar_storeload will match so
> the full barrier never will.
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Generate more efficient memory barriers for LEON3
  2014-07-11 10:25       ` Daniel Cederman
@ 2014-07-12  9:22         ` Eric Botcazou
  2014-07-14 11:41           ` Daniel Cederman
  0 siblings, 1 reply; 8+ messages in thread
From: Eric Botcazou @ 2014-07-12  9:22 UTC (permalink / raw)
  To: Daniel Cederman; +Cc: gcc-patches, Software

> That was an error on my side. The wrong memory model had gotten cached
> in a generated make script. Lets drop membar_leon3 also then :)

Fine with me but, on second thoughts, if a mere "stb" is a #StoreLoad memory 
barrier for LEON3, doesn't this simply mean that the memory model of the LEON3 
is Strong Consistency and not TSO?  In which case, the only thing to change is 
the default setting for LEON3 in sparc_option_override.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Generate more efficient memory barriers for LEON3
  2014-07-12  9:22         ` Eric Botcazou
@ 2014-07-14 11:41           ` Daniel Cederman
  2014-07-19  9:31             ` Eric Botcazou
  0 siblings, 1 reply; 8+ messages in thread
From: Daniel Cederman @ 2014-07-14 11:41 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches, Software

LEON3 has a store buffer of length 1, so an additional store is required 
to be sure that the one preceding it has been written to memory. I am 
not familiar enough with the internals of gcc to pick the optimal place 
to encode this behavior, so any suggestions from you are appreciated :)

On 2014-07-12 11:18, Eric Botcazou wrote:
>> That was an error on my side. The wrong memory model had gotten cached
>> in a generated make script. Lets drop membar_leon3 also then :)
>
> Fine with me but, on second thoughts, if a mere "stb" is a #StoreLoad memory
> barrier for LEON3, doesn't this simply mean that the memory model of the LEON3
> is Strong Consistency and not TSO?  In which case, the only thing to change is
> the default setting for LEON3 in sparc_option_override.
>

-- 
Daniel Cederman

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Generate more efficient memory barriers for LEON3
  2014-07-14 11:41           ` Daniel Cederman
@ 2014-07-19  9:31             ` Eric Botcazou
  0 siblings, 0 replies; 8+ messages in thread
From: Eric Botcazou @ 2014-07-19  9:31 UTC (permalink / raw)
  To: Daniel Cederman; +Cc: gcc-patches, Software

> LEON3 has a store buffer of length 1, so an additional store is required
> to be sure that the one preceding it has been written to memory. I am
> not familiar enough with the internals of gcc to pick the optimal place
> to encode this behavior, so any suggestions from you are appreciated :)

OK, that's not Strong Consistency so I'm going to apply your latest patch.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2014-07-19  9:29 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-06-25 11:21 [PATCH] Generate more efficient memory barriers for LEON3 Daniel Cederman
2014-07-10  9:39 ` Eric Botcazou
2014-07-11  7:55   ` Daniel Cederman
2014-07-11  9:17     ` Eric Botcazou
2014-07-11 10:25       ` Daniel Cederman
2014-07-12  9:22         ` Eric Botcazou
2014-07-14 11:41           ` Daniel Cederman
2014-07-19  9:31             ` Eric Botcazou

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).