* [PATCH] Generate more efficient memory barriers for LEON3
From: Daniel Cederman @ 2014-06-25 11:21 UTC (permalink / raw)
To: gcc-patches; +Cc: Software
Hello,
The memory barriers generated for SPARC are targeting the weakest memory
model allowed for SPARC. The LEON3/4 SPARC processors are using a
stronger memory model and thus have fewer requirements on the memory
barriers. For LEON3/4, StoreStore is compiler-only, instead of "stbar",
and StoreLoad can be achieved with a normal byte write "stb", instead of
an atomic byte read-write "ldstub". The provided patch changes the
previously mentioned memory barriers for TARGET_LEON3.
Best regards,
Daniel Cederman
ChangeLog:
2014-06-25 Daniel Cederman <cederman@gaisler.com>
gcc/config/sparc/
* sync.md: Generate more efficient memory barriers for LEON3
diff --git a/gcc/config/sparc/sync.md b/gcc/config/sparc/sync.md
index e6e237f..26173a7 100644
--- a/gcc/config/sparc/sync.md
+++ b/gcc/config/sparc/sync.md
@@ -56,6 +56,15 @@
[(set_attr "type" "multi")
(set_attr "length" "0")])
+;; For LEON3, membar #StoreStore is compiler-only.
+(define_insn "*membar_storestore_leon3"
+ [(set (match_operand:BLK 0 "" "")
+ (unspec:BLK [(match_dup 0) (const_int 8)] UNSPEC_MEMBAR))]
+ "TARGET_LEON3"
+ ""
+ [(set_attr "type" "multi")
+ (set_attr "length" "0")])
+
;; For V8, STBAR is exactly membar #StoreStore, by definition.
(define_insn "*membar_storestore"
[(set (match_operand:BLK 0 "" "")
@@ -64,6 +73,14 @@
"stbar"
[(set_attr "type" "multi")])
+;; For LEON3, STB has the effect of membar #StoreLoad.
+(define_insn "*membar_storeload_leon3"
+ [(set (match_operand:BLK 0 "" "")
+ (unspec:BLK [(match_dup 0) (const_int 2)] UNSPEC_MEMBAR))]
+ "TARGET_LEON3"
+ "stb\t%%g0, [%%sp-1]"
+ [(set_attr "type" "multi")])
+
;; For V8, LDSTUB has the effect of membar #StoreLoad.
(define_insn "*membar_storeload"
[(set (match_operand:BLK 0 "" "")
@@ -72,6 +89,15 @@
"ldstub\t[%%sp-1], %%g0"
[(set_attr "type" "multi")])
+;; For LEON3, membar #StoreLoad is enough for a full barrier.
+(define_insn "*membar_leon3"
+ [(set (match_operand:BLK 0 "" "")
+ (unspec:BLK [(match_dup 0) (match_operand:SI 1 "const_int_operand")]
+ UNSPEC_MEMBAR))]
+ "TARGET_LEON3"
+ "stb\t%%g0, [%%sp-1]"
+ [(set_attr "type" "multi")])
+
;; Put the two together, in combination with the fact that V8 implements PSO
;; as its weakest memory model, means a full barrier. Match all remaining
;; instances of the membar pattern for Sparc V8.
--
Daniel Cederman
Software Engineer
Aeroflex Gaisler AB
Aeroflex Microelectronic Solutions – HiRel
Kungsgatan 12
SE-411 19 Gothenburg, Sweden
Phone: +46 31 7758665
cederman@gaisler.com
www.Aeroflex.com/Gaisler
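
For context, here is a minimal C11 sketch of source code that needs the #StoreLoad barrier discussed in this patch: a store, then a sequentially consistent fence, then a load of a different location. The names flag_a, flag_b and announce_and_check are made up for illustration. Which instruction a given GCC emits for the fence depends on the target and on -mmemory-model, so treat this as a hypothetical test input rather than output produced by the patch.

```c
#include <stdatomic.h>

/* Two flags shared between two hypothetical threads (Dekker-style). */
static atomic_int flag_a;
static atomic_int flag_b;

/* Publish our flag, then check the other side's flag.  The fence must
   keep the load from being satisfied before the store is visible,
   which is exactly a store-to-load ordering requirement. */
int announce_and_check(void)
{
    atomic_store_explicit(&flag_a, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&flag_b, memory_order_relaxed);
}
```

Compiling a function like this for the target of interest and inspecting the assembly is one way to see which barrier pattern was selected.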
* Re: [PATCH] Generate more efficient memory barriers for LEON3
From: Eric Botcazou @ 2014-07-10 9:39 UTC (permalink / raw)
To: Daniel Cederman; +Cc: gcc-patches, Software
> The memory barriers generated for SPARC are targeting the weakest memory
> model allowed for SPARC.
That's not quite true, they are targeting the sparc_memory_model, which is the
memory model selected for the architecture/OS pair by default and which can be
overridden by the user with -mmemory-model=[default|rmo|pso|tso|sc].
> The LEON3/4 SPARC processors are using a stronger memory model and thus have
> fewer requirements on the memory barriers.
My understanding is that they use TSO, in which case...
> For LEON3/4, StoreStore is compiler-only, instead of "stbar",
..."stbar" should never be generated since #StoreStore is implied by TSO.
> and StoreLoad can be achieved with a normal byte write "stb", instead of
> an atomic byte read-write "ldstub".
OK, thanks. Does this result in a significant performance gain?
> The provided patch changes the previously mentioned memory barriers for
> TARGET_LEON3.
I think that only the membar_storeload_leon3 pattern is necessary. A couple
more nits: the new pattern is not "multi", it's "store" and you need to add:
&& !TARGET_LEON3
to the original membar_storeload since TARGET_LEON3 is also TARGET_V8.
--
Eric Botcazou
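
The point about overlapping conditions can be sketched in plain C. This is an illustration only, not GCC code: storeload_matches is a hypothetical helper, and it assumes the original membar_storeload pattern is conditioned on TARGET_V8 alone. It counts how many #StoreLoad patterns have a true condition, with and without the extra "&& !TARGET_LEON3" guard.

```c
#include <stdbool.h>

/* Hypothetical model: number of #StoreLoad patterns whose condition
   holds.  guard_v8 says whether the V8 pattern carries the extra
   "&& !TARGET_LEON3" term requested in the review. */
int storeload_matches(bool target_v8, bool target_leon3, bool guard_v8)
{
    /* New pattern: condition is TARGET_LEON3. */
    bool leon3_pattern = target_leon3;
    /* Original pattern: TARGET_V8, optionally excluding LEON3. */
    bool v8_pattern = target_v8 && (!guard_v8 || !target_leon3);
    return (int)leon3_pattern + (int)v8_pattern;
}
```

With target_v8 and target_leon3 both true, the count is 2 without the guard and 1 with it, so the guard is what makes the two conditions mutually exclusive.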
* Re: [PATCH] Generate more efficient memory barriers for LEON3
From: Daniel Cederman @ 2014-07-11 7:55 UTC (permalink / raw)
To: Eric Botcazou; +Cc: gcc-patches, Software
Hi,
Thank you for your comments.
> ..."stbar" should never be generated since #StoreStore is implied by TSO.
I missed that #StoreStore is never generated for TSO, so I'm removing
that pattern.
> OK, thanks. Does this result in a significant performance gain?
stb seems to be at least twice as fast as ldstub.
> I think that only the membar_storeload_leon3 pattern is necessary.
The full barrier pattern membar_leon3 also gets generated so I think
that one should be kept also.
I'm changing the pattern type to "store", adding "&& !TARGET_LEON3" to the
condition of the original patterns, and resubmitting the patch.
--
Daniel Cederman
* Re: [PATCH] Generate more efficient memory barriers for LEON3
From: Eric Botcazou @ 2014-07-11 9:17 UTC (permalink / raw)
To: Daniel Cederman; +Cc: gcc-patches, Software
> The full barrier pattern membar_leon3 also gets generated so I think
> that one should be kept also.
Do you have a testcase? membar is generated by sparc_emit_membar_for_model
and, for the TSO model of LEON3, implied = StoreStore | LoadLoad | LoadStore
so mm can only be StoreLoad, which means that membar_storeload will match so
the full barrier never will.
--
Eric Botcazou
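
The filtering argument in the message above can be sketched as a small C model. This is not the body of sparc_emit_membar_for_model, just a simplified illustration with hypothetical names (implied_barriers, barriers_to_emit, the MM_* enumerators). The bit values follow the patch, which uses 8 for #StoreStore and 2 for #StoreLoad. Only the TSO row is stated in this thread; the SC, PSO and RMO rows are filled in from the usual definitions of those models and should be read as assumptions.

```c
/* Barrier bits; 2 and 8 match the const_int values in the patch. */
enum {
    LOAD_LOAD   = 1,
    STORE_LOAD  = 2,
    LOAD_STORE  = 4,
    STORE_STORE = 8
};

/* Hypothetical names for the -mmemory-model choices. */
enum memory_model { MM_RMO, MM_PSO, MM_TSO, MM_SC };

/* Orderings the hardware already guarantees under each model. */
static int implied_barriers(enum memory_model m)
{
    switch (m) {
    case MM_SC:  return LOAD_LOAD | STORE_LOAD | LOAD_STORE | STORE_STORE;
    case MM_TSO: return LOAD_LOAD | LOAD_STORE | STORE_STORE;
    case MM_PSO: return LOAD_LOAD | LOAD_STORE;
    case MM_RMO: return 0;
    }
    return 0;
}

/* Only the requested orderings that are not implied need an insn. */
int barriers_to_emit(enum memory_model m, int requested)
{
    return requested & ~implied_barriers(m);
}
```

Under this model a full barrier request on TSO reduces to STORE_LOAD alone, which is why the dedicated StoreLoad pattern matches and a separate full-barrier pattern for LEON3 is never reached.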
* Re: [PATCH] Generate more efficient memory barriers for LEON3
From: Daniel Cederman @ 2014-07-11 10:25 UTC (permalink / raw)
To: Eric Botcazou; +Cc: gcc-patches, Software
That was an error on my side. The wrong memory model had gotten cached
in a generated make script. Let's drop membar_leon3 also then :)
* Re: [PATCH] Generate more efficient memory barriers for LEON3
From: Eric Botcazou @ 2014-07-12 9:22 UTC (permalink / raw)
To: Daniel Cederman; +Cc: gcc-patches, Software
> That was an error on my side. The wrong memory model had gotten cached
> in a generated make script. Let's drop membar_leon3 also then :)
Fine with me but, on second thoughts, if a mere "stb" is a #StoreLoad memory
barrier for LEON3, doesn't this simply mean that the memory model of the LEON3
is Strong Consistency and not TSO? In which case, the only thing to change is
the default setting for LEON3 in sparc_option_override.
--
Eric Botcazou
* Re: [PATCH] Generate more efficient memory barriers for LEON3
From: Daniel Cederman @ 2014-07-14 11:41 UTC (permalink / raw)
To: Eric Botcazou; +Cc: gcc-patches, Software
LEON3 has a store buffer of length 1, so an additional store is required
to be sure that the one preceding it has been written to memory. I am
not familiar enough with the internals of gcc to pick the optimal place
to encode this behavior, so any suggestions from you are appreciated :)
--
Daniel Cederman
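
The store-buffer argument above can be illustrated with a toy C model. It is a sketch built only from the description in this message (a store buffer of length 1), not a description of the real LEON3 pipeline. All names (cpu_model, model_store, model_observe, MEM_WORDS) are hypothetical. In the model, a new store pushes the previously buffered store out to memory, and an outside observer sees memory only.

```c
#include <stdbool.h>
#include <stdint.h>

enum { MEM_WORDS = 8 };

/* Toy CPU with a one-entry store buffer in front of memory. */
struct cpu_model {
    uint32_t memory[MEM_WORDS]; /* what other observers can see */
    bool     pending;           /* is the single buffer slot in use? */
    unsigned pending_addr;
    uint32_t pending_value;
};

/* A store first drains the previously buffered store to memory,
   then occupies the buffer slot itself. */
void model_store(struct cpu_model *c, unsigned addr, uint32_t value)
{
    if (c->pending)
        c->memory[c->pending_addr] = c->pending_value;
    c->pending = true;
    c->pending_addr = addr;
    c->pending_value = value;
}

/* What another observer would read: memory only, never the buffer. */
uint32_t model_observe(const struct cpu_model *c, unsigned addr)
{
    return c->memory[addr];
}
```

In this model, after model_store(&c, 3, 42) an observer still reads 0 at address 3. A second, unrelated store (playing the role of the dummy "stb" to the stack) makes 42 visible, while the dummy store itself stays in the buffer.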
* Re: [PATCH] Generate more efficient memory barriers for LEON3
From: Eric Botcazou @ 2014-07-19 9:31 UTC (permalink / raw)
To: Daniel Cederman; +Cc: gcc-patches, Software
> LEON3 has a store buffer of length 1, so an additional store is required
> to be sure that the one preceding it has been written to memory. I am
> not familiar enough with the internals of gcc to pick the optimal place
> to encode this behavior, so any suggestions from you are appreciated :)
OK, that's not Strong Consistency so I'm going to apply your latest patch.
--
Eric Botcazou