* [committed] pa: Revise memory barriers to use strongly ordered ldcw instruction
@ 2019-11-08 0:51 John David Anglin
0 siblings, 0 replies; only message in thread
From: John David Anglin @ 2019-11-08 0:51 UTC (permalink / raw)
To: GCC Patches
This change revises the memory barrier patterns to use the ldcw instruction instead of
the sync instruction. The sync instruction performs better and I have more confidence
in it than sync.
We use a location just above the top of the stack for these operations. The stack address
is aligned to a 16-byte boundary if the system is not coherent.
I have added two new options. The first is the -mcoherent-ldcw option. The majority of
PA 2.0 system have coherent caches and as a result the coherent ldcw completer can be used.
In that case, the ldcw address doesn't require 16-byte alignment. We set the default to
-mcoherent-ldcw.
The second option is the -mordered option. Although all PA 1.x systems have ordered memory
accesses, PA 2.0 systems are weakly ordered. Since PA 2.0 are now prevalent, we set the
default to -mno-ordered. For ordered systems, we fall back to just a compiler memory barrier.
I believe acquire and release fences can be defined in a similar way using an ordered load
and an ordered store, respectively.
Tested on hppa-unknown-linux-gnu, hppa2.0w-hp-hpux11.11 and hppa64-hp-hpux11.11.
Committed to trunk.
Dave
2019-11-07 John David Anglin <danglin@gcc.gnu.org>
* config/pa/pa.md (memory_barrier): Revise to use ldcw barriers.
Enhance comment.
(memory_barrier_coherent, memory_barrier_64, memory_barrier_32): New
insn patterns using ldcw instruction.
(memory_barrier): Remove insn pattern using sync instruction.
* config/pa/pa.opt (coherent-ldcw): New option.
(ordered): New option.
Index: config/pa/pa.md
===================================================================
--- config/pa/pa.md (revision 277870)
+++ config/pa/pa.md (working copy)
@@ -10086,23 +10086,55 @@
(set_attr "length" "4,16")])
;; PA 2.0 hardware supports out-of-order execution of loads and stores, so
-;; we need a memory barrier to enforce program order for memory references.
-;; Since we want PA 1.x code to be PA 2.0 compatible, we also need the
-;; barrier when generating PA 1.x code.
+;; we need memory barriers to enforce program order for memory references
+;; when the TLB and PSW O bits are not set. We assume all PA 2.0 systems
+;; are weakly ordered since neither HP-UX or Linux set the PSW O bit. Since
+;; we want PA 1.x code to be PA 2.0 compatible, we also need barriers when
+;; generating PA 1.x code even though all PA 1.x systems are strongly ordered.
+;; When barriers are needed, we use a strongly ordered ldcw instruction as
+;; the barrier. Most PA 2.0 targets are cache coherent. In that case, we
+;; can use the coherent cache control hint and avoid aligning the ldcw
+;; address. In spite of its description, it is not clear that the sync
+;; instruction works as a barrier.
+
(define_expand "memory_barrier"
- [(set (match_dup 0)
- (unspec:BLK [(match_dup 0)] UNSPEC_MEMORY_BARRIER))]
+ [(parallel
+ [(set (match_dup 0) (unspec:BLK [(match_dup 0)] UNSPEC_MEMORY_BARRIER))
+ (clobber (match_dup 1))])]
""
{
- operands[0] = gen_rtx_MEM (BLKmode, gen_rtx_SCRATCH (Pmode));
+ /* We don't need a barrier if the target uses ordered memory references. */
+ if (TARGET_ORDERED)
+ FAIL;
+ operands[1] = gen_reg_rtx (Pmode);
+ operands[0] = gen_rtx_MEM (BLKmode, operands[1]);
MEM_VOLATILE_P (operands[0]) = 1;
})
-(define_insn "*memory_barrier"
+(define_insn "*memory_barrier_coherent"
[(set (match_operand:BLK 0 "" "")
- (unspec:BLK [(match_dup 0)] UNSPEC_MEMORY_BARRIER))]
+ (unspec:BLK [(match_dup 0)] UNSPEC_MEMORY_BARRIER))
+ (clobber (match_operand 1 "pmode_register_operand" "=r"))]
+ "TARGET_PA_20 && TARGET_COHERENT_LDCW"
+ "ldcw,co 0(%%sp),%1"
+ [(set_attr "type" "binary")
+ (set_attr "length" "4")])
+
+(define_insn "*memory_barrier_64"
+ [(set (match_operand:BLK 0 "" "")
+ (unspec:BLK [(match_dup 0)] UNSPEC_MEMORY_BARRIER))
+ (clobber (match_operand 1 "pmode_register_operand" "=&r"))]
+ "TARGET_64BIT"
+ "ldo 15(%%sp),%1\n\tdepd %%r0,63,3,%1\n\tldcw 0(%1),%1"
+ [(set_attr "type" "binary")
+ (set_attr "length" "12")])
+
+(define_insn "*memory_barrier_32"
+ [(set (match_operand:BLK 0 "" "")
+ (unspec:BLK [(match_dup 0)] UNSPEC_MEMORY_BARRIER))
+ (clobber (match_operand 1 "pmode_register_operand" "=&r"))]
""
- "sync"
+ "ldo 15(%%sp),%1\n\t{dep|depw} %%r0,31,3,%1\n\tldcw 0(%1),%1"
[(set_attr "type" "binary")
- (set_attr "length" "4")])
+ (set_attr "length" "12")])
Index: config/pa/pa.opt
===================================================================
--- config/pa/pa.opt (revision 277870)
+++ config/pa/pa.opt (working copy)
@@ -45,6 +45,10 @@
Target Report Mask(CALLER_COPIES)
Caller copies function arguments passed by hidden reference.
+mcoherent-ldcw
+Target Report Var(TARGET_COHERENT_LDCW) Init(1)
+Use ldcw/ldcd coherent cache-control hint.
+
mdisable-fpregs
Target Report Mask(DISABLE_FPREGS)
Disable FP regs.
@@ -90,6 +94,10 @@
Target RejectNegative Report Mask(NO_SPACE_REGS)
Disable space regs.
+mordered
+Target Report Var(TARGET_ORDERED) Init(0)
+Assume memory references are ordered and barriers are not needed.
+
mpa-risc-1-0
Target RejectNegative
Generate PA1.0 code.
^ permalink raw reply [flat|nested] only message in thread
only message in thread, other threads:[~2019-11-08 0:51 UTC | newest]
Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-08 0:51 [committed] pa: Revise memory barriers to use strongly ordered ldcw instruction John David Anglin
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).