* [PATCH, ARM 2/7] Adapt atomic and exclusive load and store to ARMv8-M Baseline
@ 2016-09-22 13:44 Thomas Preudhomme
2016-09-22 16:42 ` [arm-embedded] " Thomas Preudhomme
2016-10-03 16:43 ` [PATCH, ARM 2/7, ping] " Thomas Preudhomme
0 siblings, 2 replies; 9+ messages in thread
From: Thomas Preudhomme @ 2016-09-22 13:44 UTC (permalink / raw)
To: gcc-patches, Kyrill Tkachov, Ramana Radhakrishnan, Richard Earnshaw
[-- Attachment #1: Type: text/plain, Size: 8059 bytes --]
Hi,
This patch is part of a patch series to add support for atomic operations on
ARMv8-M Baseline targets in GCC. This specific patch adapts atomic and exclusive
load and store patterns to the constraints of ARMv8-M Baseline. It consists of
two sets of changes:
- adding non-predicated output templates, because ARMv8-M Baseline does not have
the IT instruction
- using low registers for ldr/str
Together, these changes require creating two new alternatives for atomic_load and
atomic_store: (i) one for memory models other than relaxed, consume and release
(the new Pf constraint), where lda/stl are used and any register is allowed, and
(ii) another one for the remaining memory models, where ldr/str are used and thus
low registers must be used. These are separate from the alternative for 32-bit
targets, whose output templates expect predication.
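For illustration only, the memory model checks behind this split can be sketched
in plain C. The helper names below (load_uses_lda, store_uses_stl, pf_matches,
check_examples) are hypothetical and are not GCC functions; they only mirror the
conditions visible in the attached patch, using the __ATOMIC_* macros predefined
by GCC and Clang. The sketch assumes a plain model value and ignores whatever
normalisation memmodel_from_int performs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helpers mirroring the conditions in the patch.
   They are illustrative only and do not exist in GCC.  */

/* Loads: relaxed, consume and release use a plain ldr, anything else lda.  */
static bool load_uses_lda(int model)
{
    return !(model == __ATOMIC_RELAXED
             || model == __ATOMIC_CONSUME
             || model == __ATOMIC_RELEASE);
}

/* Stores: relaxed, consume and acquire use a plain str, anything else stl.
   Note that this check tests acquire where the load check tests release.  */
static bool store_uses_stl(int model)
{
    return !(model == __ATOMIC_RELAXED
             || model == __ATOMIC_CONSUME
             || model == __ATOMIC_ACQUIRE);
}

/* The Pf constraint as defined in the patch: every memory model except
   relaxed, consume and release.  For loads this is exactly the lda case.  */
static bool pf_matches(int model)
{
    return model != __ATOMIC_RELAXED
           && model != __ATOMIC_CONSUME
           && model != __ATOMIC_RELEASE;
}

/* A few spot checks of the helpers above.  */
void check_examples(void)
{
    assert(!load_uses_lda(__ATOMIC_RELAXED));
    assert(load_uses_lda(__ATOMIC_ACQUIRE));
    assert(load_uses_lda(__ATOMIC_SEQ_CST));
    assert(!store_uses_stl(__ATOMIC_RELAXED));
    assert(store_uses_stl(__ATOMIC_RELEASE));
    assert(pf_matches(__ATOMIC_SEQ_CST));
    assert(!pf_matches(__ATOMIC_RELEASE));
}
```

For loads, pf_matches and load_uses_lda agree for every model, which is why an
alternative guarded by Pf can accept any register: lda is not limited to low
registers, whereas the Thumb-1 ldr is.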
ChangeLog entry is as follows:
*** gcc/ChangeLog ***
2016-07-05 Thomas Preud'homme <thomas.preudhomme@arm.com>
* config/arm/constraints.md (Q constraint): Document its use for
Thumb-1.
(Pf constraint): New constraint for memory models other than relaxed,
consume or release.
* config/arm/sync.md (atomic_load<mode>): Add new ARMv8-M Baseline only
alternatives to allow any register when memory model matches Pf and
thus lda is used, but only low registers otherwise. Use unpredicated
output template for Thumb-1 targets.
(atomic_store<mode>): Likewise for stl.
(arm_load_exclusive<mode>): Add new ARMv8-M Baseline only alternative
whose output template does not have predication.
(arm_load_acquire_exclusive<mode>): Likewise.
(arm_load_exclusivesi): Likewise.
(arm_load_acquire_exclusivesi): Likewise.
(arm_store_release_exclusive<mode>): Likewise.
(arm_store_exclusive<mode>): Use unpredicated output template for
Thumb-1 targets.
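As a rough illustration of the "unpredicated output template" entries above, the
choice made in the atomic_load<mode> output code can be modelled in standalone C.
The function names atomic_load_template and check_templates are hypothetical,
the <sync_sfx> size suffix is left out, and the strings are simplified copies of
the templates in the attached patch. "%?" is the placeholder where the ARM
backend prints a condition code, so the Thumb-1 variants simply omit it.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the template selection in the patch;
   not a GCC function.  thumb1 models TARGET_THUMB1, acquire models
   "the memory model needs lda rather than ldr".  */
const char *atomic_load_template(bool thumb1, bool acquire)
{
    if (acquire)
        return thumb1 ? "lda\t%0, %1" : "lda%?\t%0, %1";
    return thumb1 ? "ldr\t%0, %1" : "ldr%?\t%0, %1";
}

/* Spot checks: Thumb-1 templates carry no predication marker,
   32-bit templates keep it.  */
void check_templates(void)
{
    assert(strcmp(atomic_load_template(true, false), "ldr\t%0, %1") == 0);
    assert(strchr(atomic_load_template(true, true), '?') == NULL);
    assert(strstr(atomic_load_template(false, true), "%?") != NULL);
}
```

The exclusive load and store patterns achieve the same effect differently: they
add a second alternative with the unpredicated string and select it through the
"arch" attribute instead of testing TARGET_THUMB1 in C code.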
Testing: No code generation difference for ARMv7-A, ARMv7VE and ARMv8-A on all
atomic and synchronization testcases in the testsuite [2]. Patchset was also
bootstrapped with --enable-itm --enable-gomp on ARMv8-A in ARM and Thumb mode at
optimization level -O1 and above [1] without any regression in the testsuite and
no code generation difference in libitm and libgomp.
Code generation for ARMv8-M Baseline has been manually examined and compared
against ARMv8-A Thumb-2 in the following configurations without finding any issues:
gcc.dg/atomic-op-2.c at -Os
gcc.dg/atomic-compare-exchange-2.c at -Os
gcc.dg/atomic-compare-exchange-3.c at -O3
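To inspect the generated code outside the testsuite, a minimal sketch such as the
following could be used. It is not one of the testcases listed here, and the
function names are made up for illustration. Assuming a bare-metal toolchain
(for example arm-none-eabi-gcc with -march=armv8-m.base -mthumb -Os -S, which is
an assumption and not the configuration used for the testing described in this
mail), one would expect the relaxed accesses to use ldr/str, the acquire and
release accesses to use lda/stl, and the compare-and-swap to expand to an
exclusive load/store loop.

```c
#include <assert.h>

static int shared;

/* Plain load and store with the weakest memory model.  */
int load_relaxed(void)
{
    return __atomic_load_n(&shared, __ATOMIC_RELAXED);
}

void store_relaxed(int value)
{
    __atomic_store_n(&shared, value, __ATOMIC_RELAXED);
}

/* Load-acquire and store-release.  */
int load_acquire(void)
{
    return __atomic_load_n(&shared, __ATOMIC_ACQUIRE);
}

void store_release(int value)
{
    __atomic_store_n(&shared, value, __ATOMIC_RELEASE);
}

/* Strong compare-and-swap; returns nonzero when the swap happened.  */
int cas_seq_cst(int expected, int desired)
{
    return __atomic_compare_exchange_n(&shared, &expected, desired, 0,
                                       __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}

/* Single-threaded sanity check so the file can also be run on a host.  */
void smoke_test(void)
{
    store_relaxed(1);
    assert(load_acquire() == 1);
    assert(cas_seq_cst(1, 2));
    assert(load_relaxed() == 2);
    assert(!cas_seq_cst(1, 3));
    store_release(7);
    assert(load_relaxed() == 7);
}
```

The __atomic_* builtins are provided by GCC itself, so the file needs no extra
headers beyond assert.h for the sanity check.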
Is this ok for trunk?
Best regards,
Thomas
[1] CFLAGS_FOR_TARGET and CXXFLAGS_FOR_TARGET were set to "-O1 -g", "-O3 -g" and
undefined ("-O2 -g")
[2] The exact list is:
gcc/testsuite/gcc.dg/atomic-compare-exchange-1.c
gcc/testsuite/gcc.dg/atomic-compare-exchange-2.c
gcc/testsuite/gcc.dg/atomic-compare-exchange-3.c
gcc/testsuite/gcc.dg/atomic-exchange-1.c
gcc/testsuite/gcc.dg/atomic-exchange-2.c
gcc/testsuite/gcc.dg/atomic-exchange-3.c
gcc/testsuite/gcc.dg/atomic-fence.c
gcc/testsuite/gcc.dg/atomic-flag.c
gcc/testsuite/gcc.dg/atomic-generic.c
gcc/testsuite/gcc.dg/atomic-generic-aux.c
gcc/testsuite/gcc.dg/atomic-invalid-2.c
gcc/testsuite/gcc.dg/atomic-load-1.c
gcc/testsuite/gcc.dg/atomic-load-2.c
gcc/testsuite/gcc.dg/atomic-load-3.c
gcc/testsuite/gcc.dg/atomic-lockfree.c
gcc/testsuite/gcc.dg/atomic-lockfree-aux.c
gcc/testsuite/gcc.dg/atomic-noinline.c
gcc/testsuite/gcc.dg/atomic-noinline-aux.c
gcc/testsuite/gcc.dg/atomic-op-1.c
gcc/testsuite/gcc.dg/atomic-op-2.c
gcc/testsuite/gcc.dg/atomic-op-3.c
gcc/testsuite/gcc.dg/atomic-op-6.c
gcc/testsuite/gcc.dg/atomic-store-1.c
gcc/testsuite/gcc.dg/atomic-store-2.c
gcc/testsuite/gcc.dg/atomic-store-3.c
gcc/testsuite/g++.dg/ext/atomic-1.C
gcc/testsuite/g++.dg/ext/atomic-2.C
gcc/testsuite/gcc.target/arm/atomic-comp-swap-release-acquire.c
gcc/testsuite/gcc.target/arm/atomic-op-acq_rel.c
gcc/testsuite/gcc.target/arm/atomic-op-acquire.c
gcc/testsuite/gcc.target/arm/atomic-op-char.c
gcc/testsuite/gcc.target/arm/atomic-op-consume.c
gcc/testsuite/gcc.target/arm/atomic-op-int.c
gcc/testsuite/gcc.target/arm/atomic-op-relaxed.c
gcc/testsuite/gcc.target/arm/atomic-op-release.c
gcc/testsuite/gcc.target/arm/atomic-op-seq_cst.c
gcc/testsuite/gcc.target/arm/atomic-op-short.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_1.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_2.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_3.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_4.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_5.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_6.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_7.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_8.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_9.c
gcc/testsuite/gcc.target/arm/sync-1.c
gcc/testsuite/gcc.target/arm/synchronize.c
gcc/testsuite/gcc.target/arm/armv8-sync-comp-swap.c
gcc/testsuite/gcc.target/arm/armv8-sync-op-acquire.c
gcc/testsuite/gcc.target/arm/armv8-sync-op-full.c
gcc/testsuite/gcc.target/arm/armv8-sync-op-release.c
libstdc++-v3/testsuite/29_atomics/atomic/60658.cc
libstdc++-v3/testsuite/29_atomics/atomic/62259.cc
libstdc++-v3/testsuite/29_atomics/atomic/64658.cc
libstdc++-v3/testsuite/29_atomics/atomic/65147.cc
libstdc++-v3/testsuite/29_atomics/atomic/65913.cc
libstdc++-v3/testsuite/29_atomics/atomic/70766.cc
libstdc++-v3/testsuite/29_atomics/atomic/cons/49445.cc
libstdc++-v3/testsuite/29_atomics/atomic/cons/constexpr.cc
libstdc++-v3/testsuite/29_atomics/atomic/cons/copy_list.cc
libstdc++-v3/testsuite/29_atomics/atomic/cons/default.cc
libstdc++-v3/testsuite/29_atomics/atomic/cons/direct_list.cc
libstdc++-v3/testsuite/29_atomics/atomic/cons/single_value.cc
libstdc++-v3/testsuite/29_atomics/atomic/cons/user_pod.cc
libstdc++-v3/testsuite/29_atomics/atomic/operators/51811.cc
libstdc++-v3/testsuite/29_atomics/atomic/operators/56011.cc
libstdc++-v3/testsuite/29_atomics/atomic/operators/integral_assignment.cc
libstdc++-v3/testsuite/29_atomics/atomic/operators/integral_conversion.cc
libstdc++-v3/testsuite/29_atomics/atomic/operators/pointer_partial_void.cc
libstdc++-v3/testsuite/29_atomics/atomic/requirements/base_classes.cc
libstdc++-v3/testsuite/29_atomics/atomic/requirements/compare_exchange_lowering.cc
libstdc++-v3/testsuite/29_atomics/atomic/requirements/explicit_instantiation/1.cc
libstdc++-v3/testsuite/29_atomics/atomic_flag/clear/1.cc
libstdc++-v3/testsuite/29_atomics/atomic_flag/cons/1.cc
libstdc++-v3/testsuite/29_atomics/atomic_flag/cons/56012.cc
libstdc++-v3/testsuite/29_atomics/atomic_flag/cons/aggregate.cc
libstdc++-v3/testsuite/29_atomics/atomic_flag/cons/default.cc
libstdc++-v3/testsuite/29_atomics/atomic_flag/requirements/standard_layout.cc
libstdc++-v3/testsuite/29_atomics/atomic_flag/requirements/trivial.cc
libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/explicit.cc
libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/implicit.cc
libstdc++-v3/testsuite/29_atomics/atomic_integral/60940.cc
libstdc++-v3/testsuite/29_atomics/atomic_integral/65147.cc
libstdc++-v3/testsuite/29_atomics/atomic_integral/cons/constexpr.cc
libstdc++-v3/testsuite/29_atomics/atomic_integral/cons/copy_list.cc
libstdc++-v3/testsuite/29_atomics/atomic_integral/cons/default.cc
libstdc++-v3/testsuite/29_atomics/atomic_integral/cons/direct_list.cc
libstdc++-v3/testsuite/29_atomics/atomic_integral/cons/single_value.cc
libstdc++-v3/testsuite/29_atomics/atomic_integral/operators/bitwise.cc
libstdc++-v3/testsuite/29_atomics/atomic_integral/operators/decrement.cc
libstdc++-v3/testsuite/29_atomics/atomic_integral/operators/increment.cc
libstdc++-v3/testsuite/29_atomics/atomic_integral/operators/integral_assignment.cc
libstdc++-v3/testsuite/29_atomics/atomic_integral/operators/integral_conversion.cc
libstdc++-v3/testsuite/29_atomics/atomic_integral/requirements/standard_layout.cc
libstdc++-v3/testsuite/29_atomics/atomic_integral/requirements/trivial.cc
libstdc++-v3/testsuite/29_atomics/headers/atomic/functions_std_c++0x.cc
libstdc++-v3/testsuite/29_atomics/headers/atomic/macros.cc
libstdc++-v3/testsuite/29_atomics/headers/atomic/types_std_c++0x.cc
[-- Attachment #2: 2_adapt_atomic_load_store_v8m_baseline.patch --]
[-- Type: text/x-patch, Size: 7789 bytes --]
diff --git a/gcc/config/arm/constraints.md b/gcc/config/arm/constraints.md
index 4ece5f013c92adee04157b5c909e1d47c894c994..65098ceeb1a66174b345bcfb0688152f9f137150 100644
--- a/gcc/config/arm/constraints.md
+++ b/gcc/config/arm/constraints.md
@@ -34,11 +34,13 @@
;; in ARM/Thumb-2 state: Da, Db, Dc, Dd, Dn, Dl, DL, Do, Dv, Dy, Di, Dt, Dp, Dz
;; in Thumb-1 state: Pa, Pb, Pc, Pd, Pe
;; in Thumb-2 state: Pj, PJ, Ps, Pt, Pu, Pv, Pw, Px, Py
+;; in all states: Pf
;; The following memory constraints have been used:
-;; in ARM/Thumb-2 state: Q, Uh, Ut, Uv, Uy, Un, Um, Us
+;; in ARM/Thumb-2 state: Uh, Ut, Uv, Uy, Un, Um, Us
;; in ARM state: Uq
;; in Thumb state: Uu, Uw
+;; in all states: Q
(define_register_constraint "t" "TARGET_32BIT ? VFP_LO_REGS : NO_REGS"
@@ -180,6 +182,13 @@
(and (match_code "const_int")
(match_test "TARGET_THUMB1 && ival >= 256 && ival <= 510")))
+(define_constraint "Pf"
+ "Memory models except relaxed, consume or release ones."
+ (and (match_code "const_int")
+ (match_test "!is_mm_relaxed (memmodel_from_int (ival))
+ && !is_mm_consume (memmodel_from_int (ival))
+ && !is_mm_release (memmodel_from_int (ival))")))
+
(define_constraint "Ps"
"@internal In Thumb-2 state a constant in the range -255 to +255"
(and (match_code "const_int")
@@ -407,7 +416,7 @@
(define_memory_constraint "Q"
"@internal
- In ARM/Thumb-2 state an address that is a single base register."
+ An address that is a single base register."
(and (match_code "mem")
(match_test "REG_P (XEXP (op, 0))")))
diff --git a/gcc/config/arm/sync.md b/gcc/config/arm/sync.md
index d10ede4175f94e627a23bf32d19d2b5f3de76771..d36c24f76f670d7602f766d7172286504faa7af5 100644
--- a/gcc/config/arm/sync.md
+++ b/gcc/config/arm/sync.md
@@ -63,37 +63,59 @@
(set_attr "predicable" "no")])
(define_insn "atomic_load<mode>"
- [(set (match_operand:QHSI 0 "register_operand" "=r")
+ [(set (match_operand:QHSI 0 "register_operand" "=r,r,l")
(unspec_volatile:QHSI
- [(match_operand:QHSI 1 "arm_sync_memory_operand" "Q")
- (match_operand:SI 2 "const_int_operand")] ;; model
+ [(match_operand:QHSI 1 "arm_sync_memory_operand" "Q,Q,Q")
+ (match_operand:SI 2 "const_int_operand" "n,Pf,n")] ;; model
VUNSPEC_LDA))]
"TARGET_HAVE_LDACQ"
{
enum memmodel model = memmodel_from_int (INTVAL (operands[2]));
if (is_mm_relaxed (model) || is_mm_consume (model) || is_mm_release (model))
- return \"ldr<sync_sfx>%?\\t%0, %1\";
+ {
+ if (TARGET_THUMB1)
+ return \"ldr<sync_sfx>\\t%0, %1\";
+ else
+ return \"ldr<sync_sfx>%?\\t%0, %1\";
+ }
else
- return \"lda<sync_sfx>%?\\t%0, %1\";
+ {
+ if (TARGET_THUMB1)
+ return \"lda<sync_sfx>\\t%0, %1\";
+ else
+ return \"lda<sync_sfx>%?\\t%0, %1\";
+ }
}
- [(set_attr "predicable" "yes")
+ [(set_attr "arch" "32,v8mb,any")
+ (set_attr "predicable" "yes")
(set_attr "predicable_short_it" "no")])
(define_insn "atomic_store<mode>"
- [(set (match_operand:QHSI 0 "memory_operand" "=Q")
+ [(set (match_operand:QHSI 0 "memory_operand" "=Q,Q,Q")
(unspec_volatile:QHSI
- [(match_operand:QHSI 1 "general_operand" "r")
- (match_operand:SI 2 "const_int_operand")] ;; model
+ [(match_operand:QHSI 1 "general_operand" "r,r,l")
+ (match_operand:SI 2 "const_int_operand" "n,Pf,n")] ;; model
VUNSPEC_STL))]
"TARGET_HAVE_LDACQ"
{
enum memmodel model = memmodel_from_int (INTVAL (operands[2]));
if (is_mm_relaxed (model) || is_mm_consume (model) || is_mm_acquire (model))
- return \"str<sync_sfx>%?\t%1, %0\";
+ {
+ if (TARGET_THUMB1)
+ return \"str<sync_sfx>\t%1, %0\";
+ else
+ return \"str<sync_sfx>%?\t%1, %0\";
+ }
else
- return \"stl<sync_sfx>%?\t%1, %0\";
+ {
+ if (TARGET_THUMB1)
+ return \"stl<sync_sfx>\t%1, %0\";
+ else
+ return \"stl<sync_sfx>%?\t%1, %0\";
+ }
}
- [(set_attr "predicable" "yes")
+ [(set_attr "arch" "32,v8mb,any")
+ (set_attr "predicable" "yes")
(set_attr "predicable_short_it" "no")])
;; An LDRD instruction usable by the atomic_loaddi expander on LPAE targets
@@ -380,45 +402,57 @@
})
(define_insn "arm_load_exclusive<mode>"
- [(set (match_operand:SI 0 "s_register_operand" "=r")
+ [(set (match_operand:SI 0 "s_register_operand" "=r,r")
(zero_extend:SI
(unspec_volatile:NARROW
- [(match_operand:NARROW 1 "mem_noofs_operand" "Ua")]
+ [(match_operand:NARROW 1 "mem_noofs_operand" "Ua,Ua")]
VUNSPEC_LL)))]
"TARGET_HAVE_LDREXBH"
- "ldrex<sync_sfx>%?\t%0, %C1"
- [(set_attr "predicable" "yes")
+ "@
+ ldrex<sync_sfx>%?\t%0, %C1
+ ldrex<sync_sfx>\t%0, %C1"
+ [(set_attr "arch" "32,v8mb")
+ (set_attr "predicable" "yes")
(set_attr "predicable_short_it" "no")])
(define_insn "arm_load_acquire_exclusive<mode>"
- [(set (match_operand:SI 0 "s_register_operand" "=r")
+ [(set (match_operand:SI 0 "s_register_operand" "=r,r")
(zero_extend:SI
(unspec_volatile:NARROW
- [(match_operand:NARROW 1 "mem_noofs_operand" "Ua")]
+ [(match_operand:NARROW 1 "mem_noofs_operand" "Ua,Ua")]
VUNSPEC_LAX)))]
"TARGET_HAVE_LDACQ"
- "ldaex<sync_sfx>%?\\t%0, %C1"
- [(set_attr "predicable" "yes")
+ "@
+ ldaex<sync_sfx>%?\\t%0, %C1
+ ldaex<sync_sfx>\\t%0, %C1"
+ [(set_attr "arch" "32,v8mb")
+ (set_attr "predicable" "yes")
(set_attr "predicable_short_it" "no")])
(define_insn "arm_load_exclusivesi"
- [(set (match_operand:SI 0 "s_register_operand" "=r")
+ [(set (match_operand:SI 0 "s_register_operand" "=r,r")
(unspec_volatile:SI
- [(match_operand:SI 1 "mem_noofs_operand" "Ua")]
+ [(match_operand:SI 1 "mem_noofs_operand" "Ua,Ua")]
VUNSPEC_LL))]
"TARGET_HAVE_LDREX"
- "ldrex%?\t%0, %C1"
- [(set_attr "predicable" "yes")
+ "@
+ ldrex%?\t%0, %C1
+ ldrex\t%0, %C1"
+ [(set_attr "arch" "32,v8mb")
+ (set_attr "predicable" "yes")
(set_attr "predicable_short_it" "no")])
(define_insn "arm_load_acquire_exclusivesi"
- [(set (match_operand:SI 0 "s_register_operand" "=r")
+ [(set (match_operand:SI 0 "s_register_operand" "=r,r")
(unspec_volatile:SI
- [(match_operand:SI 1 "mem_noofs_operand" "Ua")]
+ [(match_operand:SI 1 "mem_noofs_operand" "Ua,Ua")]
VUNSPEC_LAX))]
"TARGET_HAVE_LDACQ"
- "ldaex%?\t%0, %C1"
- [(set_attr "predicable" "yes")
+ "@
+ ldaex%?\t%0, %C1
+ ldaex\t%0, %C1"
+ [(set_attr "arch" "32,v8mb")
+ (set_attr "predicable" "yes")
(set_attr "predicable_short_it" "no")])
(define_insn "arm_load_exclusivedi"
@@ -460,7 +494,10 @@
gcc_assert ((REGNO (operands[2]) & 1) == 0 || TARGET_THUMB2);
return "strexd%?\t%0, %2, %H2, %C1";
}
- return "strex<sync_sfx>%?\t%0, %2, %C1";
+ if (TARGET_THUMB1)
+ return "strex<sync_sfx>\t%0, %2, %C1";
+ else
+ return "strex<sync_sfx>%?\t%0, %2, %C1";
}
[(set_attr "predicable" "yes")
(set_attr "predicable_short_it" "no")])
@@ -482,13 +519,16 @@
(set_attr "predicable_short_it" "no")])
(define_insn "arm_store_release_exclusive<mode>"
- [(set (match_operand:SI 0 "s_register_operand" "=&r")
+ [(set (match_operand:SI 0 "s_register_operand" "=&r,&r")
(unspec_volatile:SI [(const_int 0)] VUNSPEC_SLX))
- (set (match_operand:QHSI 1 "mem_noofs_operand" "=Ua")
+ (set (match_operand:QHSI 1 "mem_noofs_operand" "=Ua,Ua")
(unspec_volatile:QHSI
- [(match_operand:QHSI 2 "s_register_operand" "r")]
+ [(match_operand:QHSI 2 "s_register_operand" "r,r")]
VUNSPEC_SLX))]
"TARGET_HAVE_LDACQ"
- "stlex<sync_sfx>%?\t%0, %2, %C1"
- [(set_attr "predicable" "yes")
+ "@
+ stlex<sync_sfx>%?\t%0, %2, %C1
+ stlex<sync_sfx>\t%0, %2, %C1"
+ [(set_attr "arch" "32,v8mb")
+ (set_attr "predicable" "yes")
(set_attr "predicable_short_it" "no")])
^ permalink raw reply [flat|nested] 9+ messages in thread
* [arm-embedded] [PATCH, ARM 2/7] Adapt atomic and exclusive load and store to ARMv8-M Baseline
2016-09-22 13:44 [PATCH, ARM 2/7] Adapt atomic and exclusive load and store to ARMv8-M Baseline Thomas Preudhomme
@ 2016-09-22 16:42 ` Thomas Preudhomme
2016-10-27 12:55 ` Thomas Preudhomme
2016-10-03 16:43 ` [PATCH, ARM 2/7, ping] " Thomas Preudhomme
1 sibling, 1 reply; 9+ messages in thread
From: Thomas Preudhomme @ 2016-09-22 16:42 UTC (permalink / raw)
To: gcc-patches
[-- Attachment #1: Type: text/plain, Size: 97 bytes --]
Hi,
We've decided to apply the following patch to ARM/embedded-6-branch.
Best regards,
Thomas
[-- Attachment #2: [PATCH, ARM 2/7] Adapt atomic and exclusive load and store to ARMv8-M Baseline.eml --]
[-- Type: message/rfc822, Size: 16999 bytes --]
* Re: [PATCH, ARM 2/7, ping] Adapt atomic and exclusive load and store to ARMv8-M Baseline
2016-09-22 13:44 [PATCH, ARM 2/7] Adapt atomic and exclusive load and store to ARMv8-M Baseline Thomas Preudhomme
2016-09-22 16:42 ` [arm-embedded] " Thomas Preudhomme
@ 2016-10-03 16:43 ` Thomas Preudhomme
2016-10-14 13:48 ` [PATCH, ARM 2/7, ping2] " Thomas Preudhomme
1 sibling, 1 reply; 9+ messages in thread
From: Thomas Preudhomme @ 2016-10-03 16:43 UTC (permalink / raw)
To: gcc-patches, Kyrill Tkachov, Ramana Radhakrishnan, Richard Earnshaw
[-- Attachment #1: Type: text/plain, Size: 8416 bytes --]
Ping?
Best regards,
Thomas
On 22/09/16 14:41, Thomas Preudhomme wrote:
> Hi,
>
> This patch is part of a patch series to add support for atomic operations on
> ARMv8-M Baseline targets in GCC. This specific patch adapts atomic and exclusive
> load and store patterns to the constraints of ARMv8-M Baseline. It consists of
> two sets of changes:
>
> - adding non-predicated output templates, because ARMv8-M Baseline does not have
> the IT instruction
> - using low registers for ldr/str
>
> Together these changes require creating two new alternatives for atomic_load and
> atomic_store: (i) one for the relaxed, consume and release memory models, where
> ldr/str are used and thus low registers must be used, and (ii) another one for
> the other memory models (those matched by the new Pf constraint), where lda/stl
> are used. These are separate from the alternative for 32-bit targets, whose
> output templates expect predication.
>
> ChangeLog entry is as follows:
>
> *** gcc/ChangeLog ***
>
> 2016-07-05 Thomas Preud'homme <thomas.preudhomme@arm.com>
>
> * config/arm/constraints.md (Q constraint): Document its use for
> Thumb-1.
> (Pf constraint): New constraint for memory models other than relaxed,
> consume or release.
> * config/arm/sync.md (atomic_load<mode>): Add new ARMv8-M Baseline only
> alternatives to allow any register when memory model matches Pf and
> thus lda is used, but only low registers otherwise. Use unpredicated
> output template for Thumb-1 targets.
> (atomic_store<mode>): Likewise for stl.
> (arm_load_exclusive<mode>): Add new ARMv8-M Baseline only alternative
> whose output template does not have predication.
> (arm_load_acquire_exclusive<mode>): Likewise.
> (arm_load_exclusivesi): Likewise.
> (arm_load_acquire_exclusivesi): Likewise.
> (arm_store_release_exclusive<mode>): Likewise.
> (arm_store_exclusive<mode>): Use unpredicated output template for
> Thumb-1 targets.
>
>
> Testing: No code generation difference for ARMv7-A, ARMv7VE and ARMv8-A on all
> atomic and synchronization testcases in the testsuite [2]. Patchset was also
> bootstrapped with --enable-itm --enable-gomp on ARMv8-A in ARM and Thumb mode at
> optimization level -O1 and above [1] without any regression in the testsuite and
> no code generation difference in libitm and libgomp.
>
> Code generation for ARMv8-M Baseline has been manually examined and compared
> against ARMv8-A Thumb-2 for the following configurations without finding any issue:
>
> gcc.dg/atomic-op-2.c at -Os
> gcc.dg/atomic-compare-exchange-2.c at -Os
> gcc.dg/atomic-compare-exchange-3.c at -O3
>
>
> Is this ok for trunk?
>
> Best regards,
>
> Thomas
>
> [1] CFLAGS_FOR_TARGET and CXXFLAGS_FOR_TARGET were set to "-O1 -g", "-O3 -g" and
> undefined ("-O2 -g")
> [2] The exact list is:
>
> gcc/testsuite/gcc.dg/atomic-compare-exchange-1.c
> gcc/testsuite/gcc.dg/atomic-compare-exchange-2.c
> gcc/testsuite/gcc.dg/atomic-compare-exchange-3.c
> gcc/testsuite/gcc.dg/atomic-exchange-1.c
> gcc/testsuite/gcc.dg/atomic-exchange-2.c
> gcc/testsuite/gcc.dg/atomic-exchange-3.c
> gcc/testsuite/gcc.dg/atomic-fence.c
> gcc/testsuite/gcc.dg/atomic-flag.c
> gcc/testsuite/gcc.dg/atomic-generic.c
> gcc/testsuite/gcc.dg/atomic-generic-aux.c
> gcc/testsuite/gcc.dg/atomic-invalid-2.c
> gcc/testsuite/gcc.dg/atomic-load-1.c
> gcc/testsuite/gcc.dg/atomic-load-2.c
> gcc/testsuite/gcc.dg/atomic-load-3.c
> gcc/testsuite/gcc.dg/atomic-lockfree.c
> gcc/testsuite/gcc.dg/atomic-lockfree-aux.c
> gcc/testsuite/gcc.dg/atomic-noinline.c
> gcc/testsuite/gcc.dg/atomic-noinline-aux.c
> gcc/testsuite/gcc.dg/atomic-op-1.c
> gcc/testsuite/gcc.dg/atomic-op-2.c
> gcc/testsuite/gcc.dg/atomic-op-3.c
> gcc/testsuite/gcc.dg/atomic-op-6.c
> gcc/testsuite/gcc.dg/atomic-store-1.c
> gcc/testsuite/gcc.dg/atomic-store-2.c
> gcc/testsuite/gcc.dg/atomic-store-3.c
> gcc/testsuite/g++.dg/ext/atomic-1.C
> gcc/testsuite/g++.dg/ext/atomic-2.C
> gcc/testsuite/gcc.target/arm/atomic-comp-swap-release-acquire.c
> gcc/testsuite/gcc.target/arm/atomic-op-acq_rel.c
> gcc/testsuite/gcc.target/arm/atomic-op-acquire.c
> gcc/testsuite/gcc.target/arm/atomic-op-char.c
> gcc/testsuite/gcc.target/arm/atomic-op-consume.c
> gcc/testsuite/gcc.target/arm/atomic-op-int.c
> gcc/testsuite/gcc.target/arm/atomic-op-relaxed.c
> gcc/testsuite/gcc.target/arm/atomic-op-release.c
> gcc/testsuite/gcc.target/arm/atomic-op-seq_cst.c
> gcc/testsuite/gcc.target/arm/atomic-op-short.c
> gcc/testsuite/gcc.target/arm/atomic_loaddi_1.c
> gcc/testsuite/gcc.target/arm/atomic_loaddi_2.c
> gcc/testsuite/gcc.target/arm/atomic_loaddi_3.c
> gcc/testsuite/gcc.target/arm/atomic_loaddi_4.c
> gcc/testsuite/gcc.target/arm/atomic_loaddi_5.c
> gcc/testsuite/gcc.target/arm/atomic_loaddi_6.c
> gcc/testsuite/gcc.target/arm/atomic_loaddi_7.c
> gcc/testsuite/gcc.target/arm/atomic_loaddi_8.c
> gcc/testsuite/gcc.target/arm/atomic_loaddi_9.c
> gcc/testsuite/gcc.target/arm/sync-1.c
> gcc/testsuite/gcc.target/arm/synchronize.c
> gcc/testsuite/gcc.target/arm/armv8-sync-comp-swap.c
> gcc/testsuite/gcc.target/arm/armv8-sync-op-acquire.c
> gcc/testsuite/gcc.target/arm/armv8-sync-op-full.c
> gcc/testsuite/gcc.target/arm/armv8-sync-op-release.c
> libstdc++-v3/testsuite/29_atomics/atomic/60658.cc
> libstdc++-v3/testsuite/29_atomics/atomic/62259.cc
> libstdc++-v3/testsuite/29_atomics/atomic/64658.cc
> libstdc++-v3/testsuite/29_atomics/atomic/65147.cc
> libstdc++-v3/testsuite/29_atomics/atomic/65913.cc
> libstdc++-v3/testsuite/29_atomics/atomic/70766.cc
> libstdc++-v3/testsuite/29_atomics/atomic/cons/49445.cc
> libstdc++-v3/testsuite/29_atomics/atomic/cons/constexpr.cc
> libstdc++-v3/testsuite/29_atomics/atomic/cons/copy_list.cc
> libstdc++-v3/testsuite/29_atomics/atomic/cons/default.cc
> libstdc++-v3/testsuite/29_atomics/atomic/cons/direct_list.cc
> libstdc++-v3/testsuite/29_atomics/atomic/cons/single_value.cc
> libstdc++-v3/testsuite/29_atomics/atomic/cons/user_pod.cc
> libstdc++-v3/testsuite/29_atomics/atomic/operators/51811.cc
> libstdc++-v3/testsuite/29_atomics/atomic/operators/56011.cc
> libstdc++-v3/testsuite/29_atomics/atomic/operators/integral_assignment.cc
> libstdc++-v3/testsuite/29_atomics/atomic/operators/integral_conversion.cc
> libstdc++-v3/testsuite/29_atomics/atomic/operators/pointer_partial_void.cc
> libstdc++-v3/testsuite/29_atomics/atomic/requirements/base_classes.cc
> libstdc++-v3/testsuite/29_atomics/atomic/requirements/compare_exchange_lowering.cc
> libstdc++-v3/testsuite/29_atomics/atomic/requirements/explicit_instantiation/1.cc
> libstdc++-v3/testsuite/29_atomics/atomic_flag/clear/1.cc
> libstdc++-v3/testsuite/29_atomics/atomic_flag/cons/1.cc
> libstdc++-v3/testsuite/29_atomics/atomic_flag/cons/56012.cc
> libstdc++-v3/testsuite/29_atomics/atomic_flag/cons/aggregate.cc
> libstdc++-v3/testsuite/29_atomics/atomic_flag/cons/default.cc
> libstdc++-v3/testsuite/29_atomics/atomic_flag/requirements/standard_layout.cc
> libstdc++-v3/testsuite/29_atomics/atomic_flag/requirements/trivial.cc
> libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/explicit.cc
> libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/implicit.cc
> libstdc++-v3/testsuite/29_atomics/atomic_integral/60940.cc
> libstdc++-v3/testsuite/29_atomics/atomic_integral/65147.cc
> libstdc++-v3/testsuite/29_atomics/atomic_integral/cons/constexpr.cc
> libstdc++-v3/testsuite/29_atomics/atomic_integral/cons/copy_list.cc
> libstdc++-v3/testsuite/29_atomics/atomic_integral/cons/default.cc
> libstdc++-v3/testsuite/29_atomics/atomic_integral/cons/direct_list.cc
> libstdc++-v3/testsuite/29_atomics/atomic_integral/cons/single_value.cc
> libstdc++-v3/testsuite/29_atomics/atomic_integral/operators/bitwise.cc
> libstdc++-v3/testsuite/29_atomics/atomic_integral/operators/decrement.cc
> libstdc++-v3/testsuite/29_atomics/atomic_integral/operators/increment.cc
> libstdc++-v3/testsuite/29_atomics/atomic_integral/operators/integral_assignment.cc
> libstdc++-v3/testsuite/29_atomics/atomic_integral/operators/integral_conversion.cc
> libstdc++-v3/testsuite/29_atomics/atomic_integral/requirements/standard_layout.cc
> libstdc++-v3/testsuite/29_atomics/atomic_integral/requirements/trivial.cc
> libstdc++-v3/testsuite/29_atomics/headers/atomic/functions_std_c++0x.cc
> libstdc++-v3/testsuite/29_atomics/headers/atomic/macros.cc
> libstdc++-v3/testsuite/29_atomics/headers/atomic/types_std_c++0x.cc
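For anyone wanting to repeat the manual code generation check, a small file along the following lines could be used. This is only a sketch: the function names are made up and the file is not one of the testcases listed above. It relies solely on GCC's `__atomic_load_n` and `__atomic_store_n` builtins.

```c
/* Hypothetical inspection file, not from the testsuite: one function per
   memory model so that each can be matched against the emitted instruction.  */

int
load_relaxed (int *p)
{
  /* Relaxed model: the atomic_load pattern is expected to emit ldr.  */
  return __atomic_load_n (p, __ATOMIC_RELAXED);
}

int
load_acquire (int *p)
{
  /* Acquire model: the atomic_load pattern is expected to emit lda.  */
  return __atomic_load_n (p, __ATOMIC_ACQUIRE);
}

void
store_relaxed (int *p, int v)
{
  /* Relaxed model: the atomic_store pattern is expected to emit str.  */
  __atomic_store_n (p, v, __ATOMIC_RELAXED);
}

void
store_release (int *p, int v)
{
  /* Release model: the atomic_store pattern is expected to emit stl.  */
  __atomic_store_n (p, v, __ATOMIC_RELEASE);
}
```

Building it with -march=armv8-m.base -mthumb -Os -S on a compiler that includes this series should, if the patterns behave as described, show ldr/str for the relaxed functions and lda/stl for the acquire and release ones.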
[-- Attachment #2: 2_adapt_atomic_load_store_v8m_baseline.patch --]
[-- Type: text/x-patch, Size: 7789 bytes --]
diff --git a/gcc/config/arm/constraints.md b/gcc/config/arm/constraints.md
index 4ece5f013c92adee04157b5c909e1d47c894c994..65098ceeb1a66174b345bcfb0688152f9f137150 100644
--- a/gcc/config/arm/constraints.md
+++ b/gcc/config/arm/constraints.md
@@ -34,11 +34,13 @@
;; in ARM/Thumb-2 state: Da, Db, Dc, Dd, Dn, Dl, DL, Do, Dv, Dy, Di, Dt, Dp, Dz
;; in Thumb-1 state: Pa, Pb, Pc, Pd, Pe
;; in Thumb-2 state: Pj, PJ, Ps, Pt, Pu, Pv, Pw, Px, Py
+;; in all states: Pf
;; The following memory constraints have been used:
-;; in ARM/Thumb-2 state: Q, Uh, Ut, Uv, Uy, Un, Um, Us
+;; in ARM/Thumb-2 state: Uh, Ut, Uv, Uy, Un, Um, Us
;; in ARM state: Uq
;; in Thumb state: Uu, Uw
+;; in all states: Q
(define_register_constraint "t" "TARGET_32BIT ? VFP_LO_REGS : NO_REGS"
@@ -180,6 +182,13 @@
(and (match_code "const_int")
(match_test "TARGET_THUMB1 && ival >= 256 && ival <= 510")))
+(define_constraint "Pf"
+ "Memory models except relaxed, consume or release ones."
+ (and (match_code "const_int")
+ (match_test "!is_mm_relaxed (memmodel_from_int (ival))
+ && !is_mm_consume (memmodel_from_int (ival))
+ && !is_mm_release (memmodel_from_int (ival))")))
+
(define_constraint "Ps"
"@internal In Thumb-2 state a constant in the range -255 to +255"
(and (match_code "const_int")
@@ -407,7 +416,7 @@
(define_memory_constraint "Q"
"@internal
- In ARM/Thumb-2 state an address that is a single base register."
+ An address that is a single base register."
(and (match_code "mem")
(match_test "REG_P (XEXP (op, 0))")))
diff --git a/gcc/config/arm/sync.md b/gcc/config/arm/sync.md
index d10ede4175f94e627a23bf32d19d2b5f3de76771..d36c24f76f670d7602f766d7172286504faa7af5 100644
--- a/gcc/config/arm/sync.md
+++ b/gcc/config/arm/sync.md
@@ -63,37 +63,59 @@
(set_attr "predicable" "no")])
(define_insn "atomic_load<mode>"
- [(set (match_operand:QHSI 0 "register_operand" "=r")
+ [(set (match_operand:QHSI 0 "register_operand" "=r,r,l")
(unspec_volatile:QHSI
- [(match_operand:QHSI 1 "arm_sync_memory_operand" "Q")
- (match_operand:SI 2 "const_int_operand")] ;; model
+ [(match_operand:QHSI 1 "arm_sync_memory_operand" "Q,Q,Q")
+ (match_operand:SI 2 "const_int_operand" "n,Pf,n")] ;; model
VUNSPEC_LDA))]
"TARGET_HAVE_LDACQ"
{
enum memmodel model = memmodel_from_int (INTVAL (operands[2]));
if (is_mm_relaxed (model) || is_mm_consume (model) || is_mm_release (model))
- return \"ldr<sync_sfx>%?\\t%0, %1\";
+ {
+ if (TARGET_THUMB1)
+ return \"ldr<sync_sfx>\\t%0, %1\";
+ else
+ return \"ldr<sync_sfx>%?\\t%0, %1\";
+ }
else
- return \"lda<sync_sfx>%?\\t%0, %1\";
+ {
+ if (TARGET_THUMB1)
+ return \"lda<sync_sfx>\\t%0, %1\";
+ else
+ return \"lda<sync_sfx>%?\\t%0, %1\";
+ }
}
- [(set_attr "predicable" "yes")
+ [(set_attr "arch" "32,v8mb,any")
+ (set_attr "predicable" "yes")
(set_attr "predicable_short_it" "no")])
(define_insn "atomic_store<mode>"
- [(set (match_operand:QHSI 0 "memory_operand" "=Q")
+ [(set (match_operand:QHSI 0 "memory_operand" "=Q,Q,Q")
(unspec_volatile:QHSI
- [(match_operand:QHSI 1 "general_operand" "r")
- (match_operand:SI 2 "const_int_operand")] ;; model
+ [(match_operand:QHSI 1 "general_operand" "r,r,l")
+ (match_operand:SI 2 "const_int_operand" "n,Pf,n")] ;; model
VUNSPEC_STL))]
"TARGET_HAVE_LDACQ"
{
enum memmodel model = memmodel_from_int (INTVAL (operands[2]));
if (is_mm_relaxed (model) || is_mm_consume (model) || is_mm_acquire (model))
- return \"str<sync_sfx>%?\t%1, %0\";
+ {
+ if (TARGET_THUMB1)
+ return \"str<sync_sfx>\t%1, %0\";
+ else
+ return \"str<sync_sfx>%?\t%1, %0\";
+ }
else
- return \"stl<sync_sfx>%?\t%1, %0\";
+ {
+ if (TARGET_THUMB1)
+ return \"stl<sync_sfx>\t%1, %0\";
+ else
+ return \"stl<sync_sfx>%?\t%1, %0\";
+ }
}
- [(set_attr "predicable" "yes")
+ [(set_attr "arch" "32,v8mb,any")
+ (set_attr "predicable" "yes")
(set_attr "predicable_short_it" "no")])
;; An LDRD instruction usable by the atomic_loaddi expander on LPAE targets
@@ -380,45 +402,57 @@
})
(define_insn "arm_load_exclusive<mode>"
- [(set (match_operand:SI 0 "s_register_operand" "=r")
+ [(set (match_operand:SI 0 "s_register_operand" "=r,r")
(zero_extend:SI
(unspec_volatile:NARROW
- [(match_operand:NARROW 1 "mem_noofs_operand" "Ua")]
+ [(match_operand:NARROW 1 "mem_noofs_operand" "Ua,Ua")]
VUNSPEC_LL)))]
"TARGET_HAVE_LDREXBH"
- "ldrex<sync_sfx>%?\t%0, %C1"
- [(set_attr "predicable" "yes")
+ "@
+ ldrex<sync_sfx>%?\t%0, %C1
+ ldrex<sync_sfx>\t%0, %C1"
+ [(set_attr "arch" "32,v8mb")
+ (set_attr "predicable" "yes")
(set_attr "predicable_short_it" "no")])
(define_insn "arm_load_acquire_exclusive<mode>"
- [(set (match_operand:SI 0 "s_register_operand" "=r")
+ [(set (match_operand:SI 0 "s_register_operand" "=r,r")
(zero_extend:SI
(unspec_volatile:NARROW
- [(match_operand:NARROW 1 "mem_noofs_operand" "Ua")]
+ [(match_operand:NARROW 1 "mem_noofs_operand" "Ua,Ua")]
VUNSPEC_LAX)))]
"TARGET_HAVE_LDACQ"
- "ldaex<sync_sfx>%?\\t%0, %C1"
- [(set_attr "predicable" "yes")
+ "@
+ ldaex<sync_sfx>%?\\t%0, %C1
+ ldaex<sync_sfx>\\t%0, %C1"
+ [(set_attr "arch" "32,v8mb")
+ (set_attr "predicable" "yes")
(set_attr "predicable_short_it" "no")])
(define_insn "arm_load_exclusivesi"
- [(set (match_operand:SI 0 "s_register_operand" "=r")
+ [(set (match_operand:SI 0 "s_register_operand" "=r,r")
(unspec_volatile:SI
- [(match_operand:SI 1 "mem_noofs_operand" "Ua")]
+ [(match_operand:SI 1 "mem_noofs_operand" "Ua,Ua")]
VUNSPEC_LL))]
"TARGET_HAVE_LDREX"
- "ldrex%?\t%0, %C1"
- [(set_attr "predicable" "yes")
+ "@
+ ldrex%?\t%0, %C1
+ ldrex\t%0, %C1"
+ [(set_attr "arch" "32,v8mb")
+ (set_attr "predicable" "yes")
(set_attr "predicable_short_it" "no")])
(define_insn "arm_load_acquire_exclusivesi"
- [(set (match_operand:SI 0 "s_register_operand" "=r")
+ [(set (match_operand:SI 0 "s_register_operand" "=r,r")
(unspec_volatile:SI
- [(match_operand:SI 1 "mem_noofs_operand" "Ua")]
+ [(match_operand:SI 1 "mem_noofs_operand" "Ua,Ua")]
VUNSPEC_LAX))]
"TARGET_HAVE_LDACQ"
- "ldaex%?\t%0, %C1"
- [(set_attr "predicable" "yes")
+ "@
+ ldaex%?\t%0, %C1
+ ldaex\t%0, %C1"
+ [(set_attr "arch" "32,v8mb")
+ (set_attr "predicable" "yes")
(set_attr "predicable_short_it" "no")])
(define_insn "arm_load_exclusivedi"
@@ -460,7 +494,10 @@
gcc_assert ((REGNO (operands[2]) & 1) == 0 || TARGET_THUMB2);
return "strexd%?\t%0, %2, %H2, %C1";
}
- return "strex<sync_sfx>%?\t%0, %2, %C1";
+ if (TARGET_THUMB1)
+ return "strex<sync_sfx>\t%0, %2, %C1";
+ else
+ return "strex<sync_sfx>%?\t%0, %2, %C1";
}
[(set_attr "predicable" "yes")
(set_attr "predicable_short_it" "no")])
@@ -482,13 +519,16 @@
(set_attr "predicable_short_it" "no")])
(define_insn "arm_store_release_exclusive<mode>"
- [(set (match_operand:SI 0 "s_register_operand" "=&r")
+ [(set (match_operand:SI 0 "s_register_operand" "=&r,&r")
(unspec_volatile:SI [(const_int 0)] VUNSPEC_SLX))
- (set (match_operand:QHSI 1 "mem_noofs_operand" "=Ua")
+ (set (match_operand:QHSI 1 "mem_noofs_operand" "=Ua,Ua")
(unspec_volatile:QHSI
- [(match_operand:QHSI 2 "s_register_operand" "r")]
+ [(match_operand:QHSI 2 "s_register_operand" "r,r")]
VUNSPEC_SLX))]
"TARGET_HAVE_LDACQ"
- "stlex<sync_sfx>%?\t%0, %2, %C1"
- [(set_attr "predicable" "yes")
+ "@
+ stlex<sync_sfx>%?\t%0, %2, %C1
+ stlex<sync_sfx>\t%0, %2, %C1"
+ [(set_attr "arch" "32,v8mb")
+ (set_attr "predicable" "yes")
(set_attr "predicable_short_it" "no")])
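To make the interaction between the Pf constraint and the new atomic_load alternatives easier to follow, here is a small stand-alone C model of it. It is only a sketch of my reading of the patch: the enum and function names are invented, the memory model values are numbered like GCC's `__ATOMIC_*` macros, and the extra bits that `memmodel_from_int` handles are ignored.

```c
#include <assert.h>
#include <stdbool.h>

/* Memory models, numbered like GCC's __ATOMIC_* macros (assumption of
   this sketch; the real code goes through memmodel_from_int).  */
enum mm
{
  MM_RELAXED, MM_CONSUME, MM_ACQUIRE, MM_RELEASE, MM_ACQ_REL, MM_SEQ_CST
};

/* Mirrors the Pf test: true for every memory model except relaxed,
   consume and release.  */
bool
matches_pf (enum mm model)
{
  assert (model <= MM_SEQ_CST);
  return model != MM_RELAXED && model != MM_CONSUME && model != MM_RELEASE;
}

/* Register class allowed for the destination of atomic_load on ARMv8-M
   Baseline: 'r' (any core register) when Pf matches, 'l' (low registers
   only) otherwise.  */
char
load_reg_class (enum mm model)
{
  return matches_pf (model) ? 'r' : 'l';
}

/* Mnemonic chosen by the atomic_load output code: lda when Pf matches,
   plain ldr otherwise.  */
const char *
load_mnemonic (enum mm model)
{
  return matches_pf (model) ? "lda" : "ldr";
}
```

In this model the ldr cases are exactly the ones restricted to low registers, which is the property the two ARMv8-M Baseline alternatives are meant to guarantee for atomic_load.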
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH, ARM 2/7, ping2] Adapt atomic and exclusive load and store to ARMv8-M Baseline
2016-10-03 16:43 ` [PATCH, ARM 2/7, ping] " Thomas Preudhomme
@ 2016-10-14 13:48 ` Thomas Preudhomme
2016-10-24 8:04 ` [PATCH, ARM 2/7, ping3] " Thomas Preudhomme
0 siblings, 1 reply; 9+ messages in thread
From: Thomas Preudhomme @ 2016-10-14 13:48 UTC (permalink / raw)
To: gcc-patches, Kyrill Tkachov, Ramana Radhakrishnan, Richard Earnshaw
[-- Attachment #1: Type: text/plain, Size: 8679 bytes --]
Ping?
Best regards,
Thomas
[-- Attachment #2: 2_adapt_atomic_load_store_v8m_baseline.patch --]
[-- Type: text/x-patch, Size: 7789 bytes --]
* Re: [PATCH, ARM 2/7, ping3] Adapt atomic and exclusive load and store to ARMv8-M Baseline
2016-10-14 13:48 ` [PATCH, ARM 2/7, ping2] " Thomas Preudhomme
@ 2016-10-24 8:04 ` Thomas Preudhomme
2016-10-24 16:40 ` Kyrill Tkachov
0 siblings, 1 reply; 9+ messages in thread
From: Thomas Preudhomme @ 2016-10-24 8:04 UTC (permalink / raw)
To: gcc-patches, Kyrill Tkachov, Ramana Radhakrishnan, Richard Earnshaw
[-- Attachment #1: Type: text/plain, Size: 8961 bytes --]
Ping?
Best regards,
Thomas
On 14/10/16 14:48, Thomas Preudhomme wrote:
> Ping?
>
> Best regards,
>
> Thomas
>
> On 03/10/16 17:42, Thomas Preudhomme wrote:
>> Ping?
>>
>> Best regards,
>>
>> Thomas
>>
>> On 22/09/16 14:41, Thomas Preudhomme wrote:
>>> Hi,
>>>
>>> This patch is part of a patch series to add support for atomic operations on
>>> ARMv8-M Baseline targets in GCC. This specific patch adapts atomic and exclusive
>>> load and store patterns to the constraints of ARMv8-M Baseline. It consists of
>>> two sets of changes:
>>>
>>> - adding non-predicated output templates, because ARMv8-M Baseline does not have
>>> the IT instruction
>>> - using low registers for ldr/str
>>>
>>> Together these changes require creating 2 new alternatives for atomic_load and
>>> atomic_store: (i) one for memory models other than relaxed, consume and release
>>> (the new Pf constraint), where lda/stl are used and any register is allowed, and
>>> (ii) another one accepting any memory model, where ldr/str may be used and thus
>>> only low registers are allowed. These are separate from the alternative for
>>> 32-bit targets, whose output templates expect predication.
>>>
>>> ChangeLog entry is as follows:
>>>
>>> *** gcc/ChangeLog ***
>>>
>>> 2016-07-05 Thomas Preud'homme <thomas.preudhomme@arm.com>
>>>
>>> * config/arm/constraints.md (Q constraint): Document its use for
>>> Thumb-1.
>>> (Pf constraint): New constraint for memory models other than relaxed,
>>> consume or release.
>>> * config/arm/sync.md (atomic_load<mode>): Add new ARMv8-M Baseline only
>>> alternatives to allow any register when memory model matches Pf and
>>> thus lda is used, but only low registers otherwise. Use unpredicated
>>> output template for Thumb-1 targets.
>>> (atomic_store<mode>): Likewise for stl.
>>> (arm_load_exclusive<mode>): Add new ARMv8-M Baseline only alternative
>>> whose output template does not have predication.
>>> (arm_load_acquire_exclusive<mode>): Likewise.
>>> (arm_load_exclusivesi): Likewise.
>>> (arm_load_acquire_exclusivesi): Likewise.
>>> (arm_store_release_exclusive<mode>): Likewise.
>>> (arm_store_exclusive<mode>): Use unpredicated output template for
>>> Thumb-1 targets.
>>>
>>>
>>> Testing: No code generation difference for ARMv7-A, ARMv7VE and ARMv8-A on all
>>> atomic and synchronization testcases in the testsuite [2]. Patchset was also
>>> bootstrapped with --enable-itm --enable-gomp on ARMv8-A in ARM and Thumb mode at
>>> optimization level -O1 and above [1] without any regression in the testsuite and
>>> no code generation difference in libitm and libgomp.
>>>
>>> Code generation for ARMv8-M Baseline has been manually examined and compared
>>> against ARMv8-A Thumb-2 for the following configuration without finding any
>>> issue:
>>>
>>> gcc.dg/atomic-op-2.c at -Os
>>> gcc.dg/atomic-compare-exchange-2.c at -Os
>>> gcc.dg/atomic-compare-exchange-3.c at -O3
>>>
>>>
>>> Is this ok for trunk?
>>>
>>> Best regards,
>>>
>>> Thomas
>>>
>>> [1] CFLAGS_FOR_TARGET and CXXFLAGS_FOR_TARGET were set to "-O1 -g", "-O3 -g" and
>>> undefined ("-O2 -g")
>>> [2] The exact list is:
>>>
>>> gcc/testsuite/gcc.dg/atomic-compare-exchange-1.c
>>> gcc/testsuite/gcc.dg/atomic-compare-exchange-2.c
>>> gcc/testsuite/gcc.dg/atomic-compare-exchange-3.c
>>> gcc/testsuite/gcc.dg/atomic-exchange-1.c
>>> gcc/testsuite/gcc.dg/atomic-exchange-2.c
>>> gcc/testsuite/gcc.dg/atomic-exchange-3.c
>>> gcc/testsuite/gcc.dg/atomic-fence.c
>>> gcc/testsuite/gcc.dg/atomic-flag.c
>>> gcc/testsuite/gcc.dg/atomic-generic.c
>>> gcc/testsuite/gcc.dg/atomic-generic-aux.c
>>> gcc/testsuite/gcc.dg/atomic-invalid-2.c
>>> gcc/testsuite/gcc.dg/atomic-load-1.c
>>> gcc/testsuite/gcc.dg/atomic-load-2.c
>>> gcc/testsuite/gcc.dg/atomic-load-3.c
>>> gcc/testsuite/gcc.dg/atomic-lockfree.c
>>> gcc/testsuite/gcc.dg/atomic-lockfree-aux.c
>>> gcc/testsuite/gcc.dg/atomic-noinline.c
>>> gcc/testsuite/gcc.dg/atomic-noinline-aux.c
>>> gcc/testsuite/gcc.dg/atomic-op-1.c
>>> gcc/testsuite/gcc.dg/atomic-op-2.c
>>> gcc/testsuite/gcc.dg/atomic-op-3.c
>>> gcc/testsuite/gcc.dg/atomic-op-6.c
>>> gcc/testsuite/gcc.dg/atomic-store-1.c
>>> gcc/testsuite/gcc.dg/atomic-store-2.c
>>> gcc/testsuite/gcc.dg/atomic-store-3.c
>>> gcc/testsuite/g++.dg/ext/atomic-1.C
>>> gcc/testsuite/g++.dg/ext/atomic-2.C
>>> gcc/testsuite/gcc.target/arm/atomic-comp-swap-release-acquire.c
>>> gcc/testsuite/gcc.target/arm/atomic-op-acq_rel.c
>>> gcc/testsuite/gcc.target/arm/atomic-op-acquire.c
>>> gcc/testsuite/gcc.target/arm/atomic-op-char.c
>>> gcc/testsuite/gcc.target/arm/atomic-op-consume.c
>>> gcc/testsuite/gcc.target/arm/atomic-op-int.c
>>> gcc/testsuite/gcc.target/arm/atomic-op-relaxed.c
>>> gcc/testsuite/gcc.target/arm/atomic-op-release.c
>>> gcc/testsuite/gcc.target/arm/atomic-op-seq_cst.c
>>> gcc/testsuite/gcc.target/arm/atomic-op-short.c
>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_1.c
>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_2.c
>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_3.c
>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_4.c
>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_5.c
>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_6.c
>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_7.c
>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_8.c
>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_9.c
>>> gcc/testsuite/gcc.target/arm/sync-1.c
>>> gcc/testsuite/gcc.target/arm/synchronize.c
>>> gcc/testsuite/gcc.target/arm/armv8-sync-comp-swap.c
>>> gcc/testsuite/gcc.target/arm/armv8-sync-op-acquire.c
>>> gcc/testsuite/gcc.target/arm/armv8-sync-op-full.c
>>> gcc/testsuite/gcc.target/arm/armv8-sync-op-release.c
>>> libstdc++-v3/testsuite/29_atomics/atomic/60658.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic/62259.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic/64658.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic/65147.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic/65913.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic/70766.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/49445.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/constexpr.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/copy_list.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/default.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/direct_list.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/single_value.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/user_pod.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic/operators/51811.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic/operators/56011.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic/operators/integral_assignment.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic/operators/integral_conversion.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic/operators/pointer_partial_void.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic/requirements/base_classes.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic/requirements/compare_exchange_lowering.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic/requirements/explicit_instantiation/1.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/clear/1.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/cons/1.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/cons/56012.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/cons/aggregate.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/cons/default.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/requirements/standard_layout.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/requirements/trivial.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/explicit.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/implicit.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/60940.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/65147.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/cons/constexpr.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/cons/copy_list.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/cons/default.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/cons/direct_list.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/cons/single_value.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/operators/bitwise.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/operators/decrement.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/operators/increment.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/operators/integral_assignment.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/operators/integral_conversion.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/requirements/standard_layout.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/requirements/trivial.cc
>>> libstdc++-v3/testsuite/29_atomics/headers/atomic/functions_std_c++0x.cc
>>> libstdc++-v3/testsuite/29_atomics/headers/atomic/macros.cc
>>> libstdc++-v3/testsuite/29_atomics/headers/atomic/types_std_c++0x.cc
[-- Attachment #2: 2_adapt_atomic_load_store_v8m_baseline.patch --]
[-- Type: text/x-patch, Size: 7789 bytes --]
diff --git a/gcc/config/arm/constraints.md b/gcc/config/arm/constraints.md
index 4ece5f013c92adee04157b5c909e1d47c894c994..65098ceeb1a66174b345bcfb0688152f9f137150 100644
--- a/gcc/config/arm/constraints.md
+++ b/gcc/config/arm/constraints.md
@@ -34,11 +34,13 @@
;; in ARM/Thumb-2 state: Da, Db, Dc, Dd, Dn, Dl, DL, Do, Dv, Dy, Di, Dt, Dp, Dz
;; in Thumb-1 state: Pa, Pb, Pc, Pd, Pe
;; in Thumb-2 state: Pj, PJ, Ps, Pt, Pu, Pv, Pw, Px, Py
+;; in all states: Pf
;; The following memory constraints have been used:
-;; in ARM/Thumb-2 state: Q, Uh, Ut, Uv, Uy, Un, Um, Us
+;; in ARM/Thumb-2 state: Uh, Ut, Uv, Uy, Un, Um, Us
;; in ARM state: Uq
;; in Thumb state: Uu, Uw
+;; in all states: Q
(define_register_constraint "t" "TARGET_32BIT ? VFP_LO_REGS : NO_REGS"
@@ -180,6 +182,13 @@
(and (match_code "const_int")
(match_test "TARGET_THUMB1 && ival >= 256 && ival <= 510")))
+(define_constraint "Pf"
+ "Memory models except relaxed, consume or release ones."
+ (and (match_code "const_int")
+ (match_test "!is_mm_relaxed (memmodel_from_int (ival))
+ && !is_mm_consume (memmodel_from_int (ival))
+ && !is_mm_release (memmodel_from_int (ival))")))
+
(define_constraint "Ps"
"@internal In Thumb-2 state a constant in the range -255 to +255"
(and (match_code "const_int")
@@ -407,7 +416,7 @@
(define_memory_constraint "Q"
"@internal
- In ARM/Thumb-2 state an address that is a single base register."
+ An address that is a single base register."
(and (match_code "mem")
(match_test "REG_P (XEXP (op, 0))")))
diff --git a/gcc/config/arm/sync.md b/gcc/config/arm/sync.md
index d10ede4175f94e627a23bf32d19d2b5f3de76771..d36c24f76f670d7602f766d7172286504faa7af5 100644
--- a/gcc/config/arm/sync.md
+++ b/gcc/config/arm/sync.md
@@ -63,37 +63,59 @@
(set_attr "predicable" "no")])
(define_insn "atomic_load<mode>"
- [(set (match_operand:QHSI 0 "register_operand" "=r")
+ [(set (match_operand:QHSI 0 "register_operand" "=r,r,l")
(unspec_volatile:QHSI
- [(match_operand:QHSI 1 "arm_sync_memory_operand" "Q")
- (match_operand:SI 2 "const_int_operand")] ;; model
+ [(match_operand:QHSI 1 "arm_sync_memory_operand" "Q,Q,Q")
+ (match_operand:SI 2 "const_int_operand" "n,Pf,n")] ;; model
VUNSPEC_LDA))]
"TARGET_HAVE_LDACQ"
{
enum memmodel model = memmodel_from_int (INTVAL (operands[2]));
if (is_mm_relaxed (model) || is_mm_consume (model) || is_mm_release (model))
- return \"ldr<sync_sfx>%?\\t%0, %1\";
+ {
+ if (TARGET_THUMB1)
+ return \"ldr<sync_sfx>\\t%0, %1\";
+ else
+ return \"ldr<sync_sfx>%?\\t%0, %1\";
+ }
else
- return \"lda<sync_sfx>%?\\t%0, %1\";
+ {
+ if (TARGET_THUMB1)
+ return \"lda<sync_sfx>\\t%0, %1\";
+ else
+ return \"lda<sync_sfx>%?\\t%0, %1\";
+ }
}
- [(set_attr "predicable" "yes")
+ [(set_attr "arch" "32,v8mb,any")
+ (set_attr "predicable" "yes")
(set_attr "predicable_short_it" "no")])
(define_insn "atomic_store<mode>"
- [(set (match_operand:QHSI 0 "memory_operand" "=Q")
+ [(set (match_operand:QHSI 0 "memory_operand" "=Q,Q,Q")
(unspec_volatile:QHSI
- [(match_operand:QHSI 1 "general_operand" "r")
- (match_operand:SI 2 "const_int_operand")] ;; model
+ [(match_operand:QHSI 1 "general_operand" "r,r,l")
+ (match_operand:SI 2 "const_int_operand" "n,Pf,n")] ;; model
VUNSPEC_STL))]
"TARGET_HAVE_LDACQ"
{
enum memmodel model = memmodel_from_int (INTVAL (operands[2]));
if (is_mm_relaxed (model) || is_mm_consume (model) || is_mm_acquire (model))
- return \"str<sync_sfx>%?\t%1, %0\";
+ {
+ if (TARGET_THUMB1)
+ return \"str<sync_sfx>\t%1, %0\";
+ else
+ return \"str<sync_sfx>%?\t%1, %0\";
+ }
else
- return \"stl<sync_sfx>%?\t%1, %0\";
+ {
+ if (TARGET_THUMB1)
+ return \"stl<sync_sfx>\t%1, %0\";
+ else
+ return \"stl<sync_sfx>%?\t%1, %0\";
+ }
}
- [(set_attr "predicable" "yes")
+ [(set_attr "arch" "32,v8mb,any")
+ (set_attr "predicable" "yes")
(set_attr "predicable_short_it" "no")])
;; An LDRD instruction usable by the atomic_loaddi expander on LPAE targets
@@ -380,45 +402,57 @@
})
(define_insn "arm_load_exclusive<mode>"
- [(set (match_operand:SI 0 "s_register_operand" "=r")
+ [(set (match_operand:SI 0 "s_register_operand" "=r,r")
(zero_extend:SI
(unspec_volatile:NARROW
- [(match_operand:NARROW 1 "mem_noofs_operand" "Ua")]
+ [(match_operand:NARROW 1 "mem_noofs_operand" "Ua,Ua")]
VUNSPEC_LL)))]
"TARGET_HAVE_LDREXBH"
- "ldrex<sync_sfx>%?\t%0, %C1"
- [(set_attr "predicable" "yes")
+ "@
+ ldrex<sync_sfx>%?\t%0, %C1
+ ldrex<sync_sfx>\t%0, %C1"
+ [(set_attr "arch" "32,v8mb")
+ (set_attr "predicable" "yes")
(set_attr "predicable_short_it" "no")])
(define_insn "arm_load_acquire_exclusive<mode>"
- [(set (match_operand:SI 0 "s_register_operand" "=r")
+ [(set (match_operand:SI 0 "s_register_operand" "=r,r")
(zero_extend:SI
(unspec_volatile:NARROW
- [(match_operand:NARROW 1 "mem_noofs_operand" "Ua")]
+ [(match_operand:NARROW 1 "mem_noofs_operand" "Ua,Ua")]
VUNSPEC_LAX)))]
"TARGET_HAVE_LDACQ"
- "ldaex<sync_sfx>%?\\t%0, %C1"
- [(set_attr "predicable" "yes")
+ "@
+ ldaex<sync_sfx>%?\\t%0, %C1
+ ldaex<sync_sfx>\\t%0, %C1"
+ [(set_attr "arch" "32,v8mb")
+ (set_attr "predicable" "yes")
(set_attr "predicable_short_it" "no")])
(define_insn "arm_load_exclusivesi"
- [(set (match_operand:SI 0 "s_register_operand" "=r")
+ [(set (match_operand:SI 0 "s_register_operand" "=r,r")
(unspec_volatile:SI
- [(match_operand:SI 1 "mem_noofs_operand" "Ua")]
+ [(match_operand:SI 1 "mem_noofs_operand" "Ua,Ua")]
VUNSPEC_LL))]
"TARGET_HAVE_LDREX"
- "ldrex%?\t%0, %C1"
- [(set_attr "predicable" "yes")
+ "@
+ ldrex%?\t%0, %C1
+ ldrex\t%0, %C1"
+ [(set_attr "arch" "32,v8mb")
+ (set_attr "predicable" "yes")
(set_attr "predicable_short_it" "no")])
(define_insn "arm_load_acquire_exclusivesi"
- [(set (match_operand:SI 0 "s_register_operand" "=r")
+ [(set (match_operand:SI 0 "s_register_operand" "=r,r")
(unspec_volatile:SI
- [(match_operand:SI 1 "mem_noofs_operand" "Ua")]
+ [(match_operand:SI 1 "mem_noofs_operand" "Ua,Ua")]
VUNSPEC_LAX))]
"TARGET_HAVE_LDACQ"
- "ldaex%?\t%0, %C1"
- [(set_attr "predicable" "yes")
+ "@
+ ldaex%?\t%0, %C1
+ ldaex\t%0, %C1"
+ [(set_attr "arch" "32,v8mb")
+ (set_attr "predicable" "yes")
(set_attr "predicable_short_it" "no")])
(define_insn "arm_load_exclusivedi"
@@ -460,7 +494,10 @@
gcc_assert ((REGNO (operands[2]) & 1) == 0 || TARGET_THUMB2);
return "strexd%?\t%0, %2, %H2, %C1";
}
- return "strex<sync_sfx>%?\t%0, %2, %C1";
+ if (TARGET_THUMB1)
+ return "strex<sync_sfx>\t%0, %2, %C1";
+ else
+ return "strex<sync_sfx>%?\t%0, %2, %C1";
}
[(set_attr "predicable" "yes")
(set_attr "predicable_short_it" "no")])
@@ -482,13 +519,16 @@
(set_attr "predicable_short_it" "no")])
(define_insn "arm_store_release_exclusive<mode>"
- [(set (match_operand:SI 0 "s_register_operand" "=&r")
+ [(set (match_operand:SI 0 "s_register_operand" "=&r,&r")
(unspec_volatile:SI [(const_int 0)] VUNSPEC_SLX))
- (set (match_operand:QHSI 1 "mem_noofs_operand" "=Ua")
+ (set (match_operand:QHSI 1 "mem_noofs_operand" "=Ua,Ua")
(unspec_volatile:QHSI
- [(match_operand:QHSI 2 "s_register_operand" "r")]
+ [(match_operand:QHSI 2 "s_register_operand" "r,r")]
VUNSPEC_SLX))]
"TARGET_HAVE_LDACQ"
- "stlex<sync_sfx>%?\t%0, %2, %C1"
- [(set_attr "predicable" "yes")
+ "@
+ stlex<sync_sfx>%?\t%0, %2, %C1
+ stlex<sync_sfx>\t%0, %2, %C1"
+ [(set_attr "arch" "32,v8mb")
+ (set_attr "predicable" "yes")
(set_attr "predicable_short_it" "no")])
* Re: [PATCH, ARM 2/7, ping3] Adapt atomic and exclusive load and store to ARMv8-M Baseline
2016-10-24 8:04 ` [PATCH, ARM 2/7, ping3] " Thomas Preudhomme
@ 2016-10-24 16:40 ` Kyrill Tkachov
2016-10-24 17:01 ` Thomas Preudhomme
0 siblings, 1 reply; 9+ messages in thread
From: Kyrill Tkachov @ 2016-10-24 16:40 UTC (permalink / raw)
To: Thomas Preudhomme, gcc-patches, Ramana Radhakrishnan, Richard Earnshaw
Hi Thomas,
On 24/10/16 09:04, Thomas Preudhomme wrote:
> Ping?
>
> Best regards,
>
> Thomas
>
> On 14/10/16 14:48, Thomas Preudhomme wrote:
>> Ping?
>>
>> Best regards,
>>
>> Thomas
>>
>> On 03/10/16 17:42, Thomas Preudhomme wrote:
>>> Ping?
>>>
>>> Best regards,
>>>
>>> Thomas
>>>
>>> On 22/09/16 14:41, Thomas Preudhomme wrote:
>>>> Hi,
>>>>
>>>> This patch is part of a patch series to add support for atomic operations on
>>>> ARMv8-M Baseline targets in GCC. This specific patch adapts atomic and exclusive
>>>> load and store patterns to the constraints of ARMv8-M Baseline. It consists of
>>>> two sets of changes:
>>>>
>>>> - adding non-predicated output templates, because ARMv8-M Baseline does not have
>>>> the IT instruction
>>>> - using low registers for ldr/str
>>>>
>>>> Together these changes require creating 2 new alternatives for atomic_load and
>>>> atomic_store: (i) one for memory models other than relaxed, consume and release
>>>> (the new Pf constraint), where lda/stl are used and any register is allowed, and
>>>> (ii) another one accepting any memory model, where ldr/str may be used and thus
>>>> only low registers are allowed. These are separate from the alternative for
>>>> 32-bit targets, whose output templates expect predication.
>>>>
>>>> ChangeLog entry is as follows:
>>>>
>>>> *** gcc/ChangeLog ***
>>>>
>>>> 2016-07-05 Thomas Preud'homme <thomas.preudhomme@arm.com>
>>>>
>>>> * config/arm/constraints.md (Q constraint): Document its use for
>>>> Thumb-1.
>>>> (Pf constraint): New constraint for memory models other than relaxed,
>>>> consume or release.
>>>> * config/arm/sync.md (atomic_load<mode>): Add new ARMv8-M Baseline only
>>>> alternatives to allow any register when memory model matches Pf and
>>>> thus lda is used, but only low registers otherwise. Use unpredicated
>>>> output template for Thumb-1 targets.
>>>> (atomic_store<mode>): Likewise for stl.
>>>> (arm_load_exclusive<mode>): Add new ARMv8-M Baseline only alternative
>>>> whose output template does not have predication.
>>>> (arm_load_acquire_exclusive<mode>): Likewise.
>>>> (arm_load_exclusivesi): Likewise.
>>>> (arm_load_acquire_exclusivesi): Likewise.
>>>> (arm_store_release_exclusive<mode>): Likewise.
>>>> (arm_store_exclusive<mode>): Use unpredicated output template for
>>>> Thumb-1 targets.
>>>>
>>>>
>>>> Testing: No code generation difference for ARMv7-A, ARMv7VE and ARMv8-A on all
>>>> atomic and synchronization testcases in the testsuite [2]. Patchset was also
>>>> bootstrapped with --enable-itm --enable-gomp on ARMv8-A in ARM and Thumb mode at
>>>> optimization level -O1 and above [1] without any regression in the testsuite and
>>>> no code generation difference in libitm and libgomp.
>>>>
>>>> Code generation for ARMv8-M Baseline has been manually examined and compared
>>>> against ARMv8-A Thumb-2 for the following configuration without finding any
>>>> issue:
>>>>
>>>> gcc.dg/atomic-op-2.c at -Os
>>>> gcc.dg/atomic-compare-exchange-2.c at -Os
>>>> gcc.dg/atomic-compare-exchange-3.c at -O3
>>>>
>>>>
>>>> Is this ok for trunk?
>>>>
>>>> Best regards,
>>>>
>>>> Thomas
>>>>
>>>> [1] CFLAGS_FOR_TARGET and CXXFLAGS_FOR_TARGET were set to "-O1 -g", "-O3 -g" and
>>>> undefined ("-O2 -g")
>>>> [2] The exact list is:
>>>>
>>>> gcc/testsuite/gcc.dg/atomic-compare-exchange-1.c
>>>> gcc/testsuite/gcc.dg/atomic-compare-exchange-2.c
>>>> gcc/testsuite/gcc.dg/atomic-compare-exchange-3.c
>>>> gcc/testsuite/gcc.dg/atomic-exchange-1.c
>>>> gcc/testsuite/gcc.dg/atomic-exchange-2.c
>>>> gcc/testsuite/gcc.dg/atomic-exchange-3.c
>>>> gcc/testsuite/gcc.dg/atomic-fence.c
>>>> gcc/testsuite/gcc.dg/atomic-flag.c
>>>> gcc/testsuite/gcc.dg/atomic-generic.c
>>>> gcc/testsuite/gcc.dg/atomic-generic-aux.c
>>>> gcc/testsuite/gcc.dg/atomic-invalid-2.c
>>>> gcc/testsuite/gcc.dg/atomic-load-1.c
>>>> gcc/testsuite/gcc.dg/atomic-load-2.c
>>>> gcc/testsuite/gcc.dg/atomic-load-3.c
>>>> gcc/testsuite/gcc.dg/atomic-lockfree.c
>>>> gcc/testsuite/gcc.dg/atomic-lockfree-aux.c
>>>> gcc/testsuite/gcc.dg/atomic-noinline.c
>>>> gcc/testsuite/gcc.dg/atomic-noinline-aux.c
>>>> gcc/testsuite/gcc.dg/atomic-op-1.c
>>>> gcc/testsuite/gcc.dg/atomic-op-2.c
>>>> gcc/testsuite/gcc.dg/atomic-op-3.c
>>>> gcc/testsuite/gcc.dg/atomic-op-6.c
>>>> gcc/testsuite/gcc.dg/atomic-store-1.c
>>>> gcc/testsuite/gcc.dg/atomic-store-2.c
>>>> gcc/testsuite/gcc.dg/atomic-store-3.c
>>>> gcc/testsuite/g++.dg/ext/atomic-1.C
>>>> gcc/testsuite/g++.dg/ext/atomic-2.C
>>>> gcc/testsuite/gcc.target/arm/atomic-comp-swap-release-acquire.c
>>>> gcc/testsuite/gcc.target/arm/atomic-op-acq_rel.c
>>>> gcc/testsuite/gcc.target/arm/atomic-op-acquire.c
>>>> gcc/testsuite/gcc.target/arm/atomic-op-char.c
>>>> gcc/testsuite/gcc.target/arm/atomic-op-consume.c
>>>> gcc/testsuite/gcc.target/arm/atomic-op-int.c
>>>> gcc/testsuite/gcc.target/arm/atomic-op-relaxed.c
>>>> gcc/testsuite/gcc.target/arm/atomic-op-release.c
>>>> gcc/testsuite/gcc.target/arm/atomic-op-seq_cst.c
>>>> gcc/testsuite/gcc.target/arm/atomic-op-short.c
>>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_1.c
>>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_2.c
>>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_3.c
>>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_4.c
>>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_5.c
>>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_6.c
>>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_7.c
>>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_8.c
>>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_9.c
>>>> gcc/testsuite/gcc.target/arm/sync-1.c
>>>> gcc/testsuite/gcc.target/arm/synchronize.c
>>>> gcc/testsuite/gcc.target/arm/armv8-sync-comp-swap.c
>>>> gcc/testsuite/gcc.target/arm/armv8-sync-op-acquire.c
>>>> gcc/testsuite/gcc.target/arm/armv8-sync-op-full.c
>>>> gcc/testsuite/gcc.target/arm/armv8-sync-op-release.c
>>>> libstdc++-v3/testsuite/29_atomics/atomic/60658.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic/62259.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic/64658.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic/65147.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic/65913.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic/70766.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/49445.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/constexpr.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/copy_list.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/default.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/direct_list.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/single_value.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/user_pod.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic/operators/51811.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic/operators/56011.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic/operators/integral_assignment.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic/operators/integral_conversion.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic/operators/pointer_partial_void.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic/requirements/base_classes.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic/requirements/compare_exchange_lowering.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic/requirements/explicit_instantiation/1.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/clear/1.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/cons/1.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/cons/56012.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/cons/aggregate.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/cons/default.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/requirements/standard_layout.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/requirements/trivial.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/explicit.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/implicit.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/60940.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/65147.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/cons/constexpr.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/cons/copy_list.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/cons/default.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/cons/direct_list.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/cons/single_value.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/operators/bitwise.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/operators/decrement.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/operators/increment.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/operators/integral_assignment.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/operators/integral_conversion.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/requirements/standard_layout.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/requirements/trivial.cc
>>>> libstdc++-v3/testsuite/29_atomics/headers/atomic/functions_std_c++0x.cc
>>>> libstdc++-v3/testsuite/29_atomics/headers/atomic/macros.cc
>>>> libstdc++-v3/testsuite/29_atomics/headers/atomic/types_std_c++0x.cc
<snip>
else
- return \"lda<sync_sfx>%?\\t%0, %1\";
+ {
+ if (TARGET_THUMB1)
+ return \"lda<sync_sfx>\\t%0, %1\";
+ else
+ return \"lda<sync_sfx>%?\\t%0, %1\";
+ }
}
- [(set_attr "predicable" "yes")
+ [(set_attr "arch" "32,v8mb,any")
+ (set_attr "predicable" "yes")
(set_attr "predicable_short_it" "no")])
Please set the predicable attribute to "no" for the v8mb alternative.
It wouldn't change any functionality as the ifcvt pass for conditional execution
won't run for ARMv8-M Baseline but it's better to be explicit for documentation purposes.
Same for the other patterns where you add new v8mb alternatives.
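To illustrate (a sketch only, not the committed revision): for a two-alternative
pattern such as arm_load_exclusivesi, this amounts to giving "predicable" one
value per alternative, in the same comma-separated form the patch already uses
for "arch":

```
;; Sketch: alternative 0 is "32" (predicable), alternative 1 is "v8mb" (not).
  [(set_attr "arch" "32,v8mb")
   (set_attr "predicable" "yes,no")
   (set_attr "predicable_short_it" "no")])
```

The three-alternative patterns (atomic_load<mode>, atomic_store<mode>) would
need one value per alternative in the same way.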
Ok with that change.
Sorry for the delay,
Kyrill
* Re: [PATCH, ARM 2/7, ping3] Adapt atomic and exclusive load and store to ARMv8-M Baseline
2016-10-24 16:40 ` Kyrill Tkachov
@ 2016-10-24 17:01 ` Thomas Preudhomme
2016-10-24 20:18 ` Kyrylo Tkachov
0 siblings, 1 reply; 9+ messages in thread
From: Thomas Preudhomme @ 2016-10-24 17:01 UTC (permalink / raw)
To: Kyrill Tkachov, gcc-patches, Ramana Radhakrishnan, Richard Earnshaw
Hi Kyrill,
On 24/10/16 17:40, Kyrill Tkachov wrote:
> Hi Thomas,
>
> On 24/10/16 09:04, Thomas Preudhomme wrote:
>> Ping?
>>
>> Best regards,
>>
>> Thomas
>>
>> On 14/10/16 14:48, Thomas Preudhomme wrote:
>>> Ping?
>>>
>>> Best regards,
>>>
>>> Thomas
>>>
>>> On 03/10/16 17:42, Thomas Preudhomme wrote:
>>>> Ping?
>>>>
>>>> Best regards,
>>>>
>>>> Thomas
>>>>
>>>> On 22/09/16 14:41, Thomas Preudhomme wrote:
>>>>> Hi,
>>>>>
>>>>> This patch is part of a patch series to add support for atomic operations on
>>>>> ARMv8-M Baseline targets in GCC. This specific patch adapts atomic and exclusive
>>>>> load and store patterns to the constraints of ARMv8-M Baseline. It consists of
>>>>> two sets of changes:
>>>>>
>>>>> - adding non-predicated output templates, because ARMv8-M Baseline does not have
>>>>> the IT instruction
>>>>> - using low registers for ldr/str
>>>>>
>>>>> Together these changes require creating 2 new alternatives for atomic_load and
>>>>> atomic_store: (i) one for memory models other than relaxed, consume and release
>>>>> (the new Pf constraint), where lda/stl are used and any register is allowed, and
>>>>> (ii) another one accepting any memory model, where ldr/str may be used and thus
>>>>> only low registers are allowed. These are separate from the alternative for
>>>>> 32-bit targets, whose output templates expect predication.
>>>>>
>>>>> ChangeLog entry is as follows:
>>>>>
>>>>> *** gcc/ChangeLog ***
>>>>>
>>>>> 2016-07-05 Thomas Preud'homme <thomas.preudhomme@arm.com>
>>>>>
>>>>> * config/arm/constraints.md (Q constraint): Document its use for
>>>>> Thumb-1.
>>>>> (Pf constraint): New constraint for memory models other than relaxed,
>>>>> consume or release.
>>>>> * config/arm/sync.md (atomic_load<mode>): Add new ARMv8-M Baseline only
>>>>> alternatives to allow any register when memory model matches Pf and
>>>>> thus lda is used, but only low registers otherwise. Use unpredicated
>>>>> output template for Thumb-1 targets.
>>>>> (atomic_store<mode>): Likewise for stl.
>>>>> (arm_load_exclusive<mode>): Add new ARMv8-M Baseline only alternative
>>>>> whose output template does not have predication.
>>>>> (arm_load_acquire_exclusive<mode>): Likewise.
>>>>> (arm_load_exclusivesi): Likewise.
>>>>> (arm_load_acquire_exclusivesi): Likewise.
>>>>> (arm_store_release_exclusive<mode>): Likewise.
>>>>> (arm_store_exclusive<mode>): Use unpredicated output template for
>>>>> Thumb-1 targets.
>>>>>
>>>>>
>>>>> Testing: No code generation difference for ARMv7-A, ARMv7VE and ARMv8-A on all
>>>>> atomic and synchronization testcases in the testsuite [2]. Patchset was also
>>>>> bootstrapped with --enable-itm --enable-gomp on ARMv8-A in ARM and Thumb mode at
>>>>> optimization level -O1 and above [1] without any regression in the testsuite and
>>>>> no code generation difference in libitm and libgomp.
>>>>>
>>>>> Code generation for ARMv8-M Baseline has been manually examined and compared
>>>>> against ARMv8-A Thumb-2 for the following configuration without finding any
>>>>> issue:
>>>>>
>>>>> gcc.dg/atomic-op-2.c at -Os
>>>>> gcc.dg/atomic-compare-exchange-2.c at -Os
>>>>> gcc.dg/atomic-compare-exchange-3.c at -O3
>>>>>
>>>>>
>>>>> Is this ok for trunk?
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Thomas
>>>>>
>>>>> [1] CFLAGS_FOR_TARGET and CXXFLAGS_FOR_TARGET were set to "-O1 -g", "-O3 -g" and
>>>>> undefined ("-O2 -g")
>>>>> [2] The exact list is:
>>>>>
>>>>> gcc/testsuite/gcc.dg/atomic-compare-exchange-1.c
>>>>> gcc/testsuite/gcc.dg/atomic-compare-exchange-2.c
>>>>> gcc/testsuite/gcc.dg/atomic-compare-exchange-3.c
>>>>> gcc/testsuite/gcc.dg/atomic-exchange-1.c
>>>>> gcc/testsuite/gcc.dg/atomic-exchange-2.c
>>>>> gcc/testsuite/gcc.dg/atomic-exchange-3.c
>>>>> gcc/testsuite/gcc.dg/atomic-fence.c
>>>>> gcc/testsuite/gcc.dg/atomic-flag.c
>>>>> gcc/testsuite/gcc.dg/atomic-generic.c
>>>>> gcc/testsuite/gcc.dg/atomic-generic-aux.c
>>>>> gcc/testsuite/gcc.dg/atomic-invalid-2.c
>>>>> gcc/testsuite/gcc.dg/atomic-load-1.c
>>>>> gcc/testsuite/gcc.dg/atomic-load-2.c
>>>>> gcc/testsuite/gcc.dg/atomic-load-3.c
>>>>> gcc/testsuite/gcc.dg/atomic-lockfree.c
>>>>> gcc/testsuite/gcc.dg/atomic-lockfree-aux.c
>>>>> gcc/testsuite/gcc.dg/atomic-noinline.c
>>>>> gcc/testsuite/gcc.dg/atomic-noinline-aux.c
>>>>> gcc/testsuite/gcc.dg/atomic-op-1.c
>>>>> gcc/testsuite/gcc.dg/atomic-op-2.c
>>>>> gcc/testsuite/gcc.dg/atomic-op-3.c
>>>>> gcc/testsuite/gcc.dg/atomic-op-6.c
>>>>> gcc/testsuite/gcc.dg/atomic-store-1.c
>>>>> gcc/testsuite/gcc.dg/atomic-store-2.c
>>>>> gcc/testsuite/gcc.dg/atomic-store-3.c
>>>>> gcc/testsuite/g++.dg/ext/atomic-1.C
>>>>> gcc/testsuite/g++.dg/ext/atomic-2.C
>>>>> gcc/testsuite/gcc.target/arm/atomic-comp-swap-release-acquire.c
>>>>> gcc/testsuite/gcc.target/arm/atomic-op-acq_rel.c
>>>>> gcc/testsuite/gcc.target/arm/atomic-op-acquire.c
>>>>> gcc/testsuite/gcc.target/arm/atomic-op-char.c
>>>>> gcc/testsuite/gcc.target/arm/atomic-op-consume.c
>>>>> gcc/testsuite/gcc.target/arm/atomic-op-int.c
>>>>> gcc/testsuite/gcc.target/arm/atomic-op-relaxed.c
>>>>> gcc/testsuite/gcc.target/arm/atomic-op-release.c
>>>>> gcc/testsuite/gcc.target/arm/atomic-op-seq_cst.c
>>>>> gcc/testsuite/gcc.target/arm/atomic-op-short.c
>>>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_1.c
>>>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_2.c
>>>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_3.c
>>>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_4.c
>>>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_5.c
>>>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_6.c
>>>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_7.c
>>>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_8.c
>>>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_9.c
>>>>> gcc/testsuite/gcc.target/arm/sync-1.c
>>>>> gcc/testsuite/gcc.target/arm/synchronize.c
>>>>> gcc/testsuite/gcc.target/arm/armv8-sync-comp-swap.c
>>>>> gcc/testsuite/gcc.target/arm/armv8-sync-op-acquire.c
>>>>> gcc/testsuite/gcc.target/arm/armv8-sync-op-full.c
>>>>> gcc/testsuite/gcc.target/arm/armv8-sync-op-release.c
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/60658.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/62259.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/64658.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/65147.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/65913.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/70766.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/49445.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/constexpr.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/copy_list.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/default.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/direct_list.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/single_value.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/user_pod.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/operators/51811.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/operators/56011.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/operators/integral_assignment.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/operators/integral_conversion.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/operators/pointer_partial_void.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/requirements/base_classes.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/requirements/compare_exchange_lowering.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/requirements/explicit_instantiation/1.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/clear/1.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/cons/1.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/cons/56012.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/cons/aggregate.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/cons/default.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/requirements/standard_layout.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/requirements/trivial.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/explicit.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/implicit.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/60940.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/65147.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/cons/constexpr.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/cons/copy_list.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/cons/default.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/cons/direct_list.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/cons/single_value.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/operators/bitwise.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/operators/decrement.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/operators/increment.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/operators/integral_assignment.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/operators/integral_conversion.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/requirements/standard_layout.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/requirements/trivial.cc
>>>>> libstdc++-v3/testsuite/29_atomics/headers/atomic/functions_std_c++0x.cc
>>>>> libstdc++-v3/testsuite/29_atomics/headers/atomic/macros.cc
>>>>> libstdc++-v3/testsuite/29_atomics/headers/atomic/types_std_c++0x.cc
> <snip>
>
> else
> - return \"lda<sync_sfx>%?\\t%0, %1\";
> + {
> + if (TARGET_THUMB1)
> + return \"lda<sync_sfx>\\t%0, %1\";
> + else
> + return \"lda<sync_sfx>%?\\t%0, %1\";
> + }
> }
> - [(set_attr "predicable" "yes")
> + [(set_attr "arch" "32,v8mb,any")
> + (set_attr "predicable" "yes")
> (set_attr "predicable_short_it" "no")])
>
>
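The hunk quoted above picks the output template according to the target. A minimal sketch of that selection in plain C follows, assuming a hypothetical boolean parameter in place of GCC's TARGET_THUMB1 macro and a hypothetical function name; the template strings are copied from the hunk with the .md escaping removed. The only difference between the two is the %? predication placeholder, which the Thumb-1 (ARMv8-M Baseline) variant omits.

```c
#include <stdbool.h>

/* Illustrative only: lda_template and target_thumb1 are hypothetical
   names, not part of sync.md.  target_thumb1 stands in for the
   TARGET_THUMB1 test made in the pattern's output code.  */
static const char *
lda_template (bool target_thumb1)
{
  if (target_thumb1)
    /* Unpredicated form: no %? placeholder.  */
    return "lda<sync_sfx>\t%0, %1";

  /* Predicated form used by the 32-bit alternatives.  */
  return "lda<sync_sfx>%?\t%0, %1";
}
```

Calling lda_template (true) gives the template without %?, and lda_template (false) gives the one with it.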
> Please set the predicable attribute to "no" for the v8mb alternative.
> It wouldn't change any functionality as the ifcvt pass for conditional execution
> won't run for ARMv8-M Baseline but it's better to be explicit for documentation
> purposes.
> Same for the other patterns where you add new v8mb alternatives.
predicable cannot be set on a per architecture basis which is why I kept it this
way. See SET_ATTR_ALTERNATIVE case in is_predicable function in gensupport.c
Best regards,
Thomas
>
> Ok with that change.
Ok without that then?
Best regards,
Thomas
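
The restriction described above can be pictured with a small model. This is a simplified sketch, not the gensupport.c code: it assumes that a per-alternative attribute value is written as a comma-separated string (as in the "32,v8mb,any" value of the arch attribute in the quoted hunk), and the function name is hypothetical. Under that assumption, a predicable value is acceptable only when it is a single value that applies to every alternative.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, for illustration only.  Returns true when
   VALUE names one setting for all alternatives, i.e. it contains no
   comma separating per-alternative settings.  */
static bool
predicable_value_is_uniform (const char *value)
{
  return strchr (value, ',') == NULL;
}
```

With this model, "yes" is accepted while a value such as "yes,no,yes" is rejected, which is why the pattern keeps a single "yes" for all three alternatives.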
* Re: [PATCH, ARM 2/7, ping3] Adapt atomic and exclusive load and store to ARMv8-M Baseline
2016-10-24 17:01 ` Thomas Preudhomme
@ 2016-10-24 20:18 ` Kyrylo Tkachov
0 siblings, 0 replies; 9+ messages in thread
From: Kyrylo Tkachov @ 2016-10-24 20:18 UTC (permalink / raw)
To: Thomas Preudhomme, gcc-patches, Ramana Radhakrishnan, Richard Earnshaw; +Cc: nd
>>
>> Please set the predicable attribute to "no" for the v8mb alternative.
>> It wouldn't change any functionality as the ifcvt pass for conditional execution
>> won't run for ARMv8-M Baseline but it's better to be explicit for documentation
>> purposes.
>> Same for the other patterns where you add new v8mb alternatives.
> predicable cannot be set on a per architecture basis which is why I kept it this
> way. See SET_ATTR_ALTERNATIVE case in is_predicable function in gensupport.c
> Best regards,
> Thomas
>>
>> Ok with that change.
> Ok without that then?
You're right.
This is ok then,
Kyrill
* Re: [arm-embedded] [PATCH, ARM 2/7] Adapt atomic and exclusive load and store to ARMv8-M Baseline
2016-09-22 16:42 ` [arm-embedded] " Thomas Preudhomme
@ 2016-10-27 12:55 ` Thomas Preudhomme
0 siblings, 0 replies; 9+ messages in thread
From: Thomas Preudhomme @ 2016-10-27 12:55 UTC (permalink / raw)
To: gcc-patches
On 22/09/16 17:41, Thomas Preudhomme wrote:
> Hi,
>
> We've decided to apply the following patch to ARM/embedded-6-branch.
Sorry, I meant ARM/embedded-5-branch. This was also applied to
ARM/embedded-6-branch one day ago (2016-10-26).
Best regards,
Thomas
Thread overview: 9+ messages
2016-09-22 13:44 [PATCH, ARM 2/7] Adapt atomic and exclusive load and store to ARMv8-M Baseline Thomas Preudhomme
2016-09-22 16:42 ` [arm-embedded] " Thomas Preudhomme
2016-10-27 12:55 ` Thomas Preudhomme
2016-10-03 16:43 ` [PATCH, ARM 2/7, ping] " Thomas Preudhomme
2016-10-14 13:48 ` [PATCH, ARM 2/7, ping2] " Thomas Preudhomme
2016-10-24 8:04 ` [PATCH, ARM 2/7, ping3] " Thomas Preudhomme
2016-10-24 16:40 ` Kyrill Tkachov
2016-10-24 17:01 ` Thomas Preudhomme
2016-10-24 20:18 ` Kyrylo Tkachov