From: Andrea Parri <andrea@rivosinc.com>
To: Patrick O'Neill <patrick@rivosinc.com>
Cc: gcc-patches@gcc.gnu.org, palmer@rivosinc.com,
gnu-toolchain@rivosinc.com, vineetg@rivosinc.com,
andrew@sifive.com, kito.cheng@sifive.com, dlustig@nvidia.com,
cmuellner@gcc.gnu.org, hboehm@google.com, jeffreyalaw@gmail.com
Subject: Re: [PATCH v5 00/11] RISC-V: Implement ISA Manual Table A.6 Mappings
Date: Thu, 27 Apr 2023 19:20:14 +0200
Message-ID: <ZEquzlKEegm0k3lx@andrea>
In-Reply-To: <20230427162301.1151333-1-patrick@rivosinc.com>
On Thu, Apr 27, 2023 at 09:22:50AM -0700, Patrick O'Neill wrote:
> This patchset aims to make the RISC-V atomics implementation stronger
> than the recommended mapping present in table A.6 of the ISA manual.
> https://github.com/riscv/riscv-isa-manual/blob/c7cf84547b3aefacab5463add1734c1602b67a49/src/memory.tex#L1083-L1157
>
> Context
> ---------
> GCC defined RISC-V mappings [1] before the Memory Model task group
> finalized their work and provided the ISA Manual Table A.6/A.7 mappings[2].
>
> For at least a year now, we've known that the mappings were different,
> but it wasn't clear if these unique mappings had correctness issues.
>
> Andrea Parri found an issue with the GCC mappings, showing that
> atomic_compare_exchange_weak_explicit(-,-,-,release,relaxed) mappings do
> not enforce release ordering guarantees. (Meaning the GCC mappings have
> a correctness issue).
> https://inbox.sourceware.org/gcc-patches/Y1GbJuhcBFpPGJQ0@andrea/
>
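For reference, the affected pattern is a release compare-exchange used to publish data. The following is a minimal C++ sketch, not code from the patchset; `payload`, `ready`, `publish`, and `try_consume` are hypothetical names chosen for illustration, and a single writer that publishes once is assumed. A correct mapping must keep the write to `payload` ordered before the successful exchange on `ready`.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical shared state, for illustration only.
int payload = 0;
std::atomic<int> ready(0);

// Writer: store the payload, then publish it with a release compare-exchange.
// The weak form may fail spuriously, so it is retried in a loop.
void publish(int value) {
  payload = value;
  int expected = 0;
  while (!ready.compare_exchange_weak(expected, 1,
                                      std::memory_order_release,
                                      std::memory_order_relaxed)) {
    // Single writer assumed, so any failure is spurious and the value
    // written back into `expected` is still 0.
    assert(expected == 0);
  }
}

// Reader: an acquire load that observes 1 must also observe the payload.
bool try_consume(int *out) {
  if (ready.load(std::memory_order_acquire) != 1) return false;
  *out = payload;
  return true;
}
```

If the release ordering on the successful exchange is not enforced, a reader on another hart could see `ready == 1` and still read a stale `payload`.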
> Why not A.6?
> ---------
> We can update our mappings now, so the obvious choice would be to
> implement Table A.6 (what LLVM implements/ISA manual recommends).
>
> The reason why that isn't the best path forward for GCC is due to a
> proposal by Hans Boehm to add L{d|w|b|h}.aq/rl and S{d|w|b|h}.aq/rl.
>
> For context, there is discussion about fast-tracking the addition of
> these instructions. The RISC-V architectural review committee supports
> adopting a "new and common atomics ABI for gcc and LLVM toolchains ...
> that assumes the addition of the preceding instructions". That common
> ABI is likely to be A.7.
> https://lists.riscv.org/g/tech-privileged/message/1284
>
> Transitioning from A.6 to A.7 will cause an ABI break. We can hedge
> against that risk by emitting a conservative fence after SEQ_CST stores
> to make the mapping compatible with both A.6 and A.7.
>
> What does a mapping compatible with both A.6 & A.7 look like?
> ---------
> It is exactly the same as Table A.6, but SEQ_CST stores have a trailing
> fence rw,rw. It's strictly stronger than Table A.6.
>
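The difference between the two store mappings can be written down as a small lookup. This is only a sketch: `store_mapping` is a hypothetical helper, a word-sized store (`sw`) is assumed, and the A.6 entries are reproduced from the ISA manual table as understood here rather than from the patches themselves.

```cpp
#include <atomic>
#include <cassert>
#include <string>

// Returns the instruction sequence for a word-sized atomic store.
// compat == false: Table A.6 as recommended by the ISA manual.
// compat == true:  Table A.6 plus the trailing fence on SEQ_CST stores.
std::string store_mapping(std::memory_order order, bool compat) {
  switch (order) {
    case std::memory_order_relaxed:
      return "sw";
    case std::memory_order_release:
      return "fence rw,w; sw";
    case std::memory_order_seq_cst:
      return compat ? "fence rw,w; sw; fence rw,rw" : "fence rw,w; sw";
    default:
      // Acquire-style orders are not valid for a store.
      assert(false && "not a valid memory order for a store");
      return "";
  }
}
```

Only the SEQ_CST row changes; relaxed and release stores are identical in both variants.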
> Microbenchmark
> ---------
> Hans Boehm helpfully wrote a microbenchmark [3] that uses ARM to give a
> rough estimate for the performance benefits/penalties of the different
> mappings. The microbenchmark is single threaded and almost-write-only.
> This case seems unlikely but is useful for getting a rough idea of the
> workload that would be impacted the most.
>
> Testcases
> -------
> Control: A simple volatile store. This is most similar to a relaxed
> store.
> Release Store: This is most similar to Sw.rl (one of the instructions in
> Hans' proposal).
> Store with release fence: This is most similar to the mapping present in
> Table A.6.
> Store with two fences: This is most similar to the compatibility mapping
> present in this patchset.
>
> Machines
> -------
> Intel(R) Core(TM) i7-8650U (sanity check only): x86 TSO
> Cortex A53 (Raspberry Pi): ARM in-order core
> Cortex A55 (Pixel 6 Pro): ARM in-order core
> Cortex A76 (Pixel 6 Pro): ARM out-of-order core
> Cortex X1 (Pixel 6 Pro): ARM out-of-order core
>
> Microbenchmark Results [4]
> --------
> Units are nsecs per iteration.
>
> Sanity check
> Machine          CONTROL  REL_STORE  STORE_REL_FENCE  STORE_TWO_FENCE
> -------          -------  ---------  ---------------  ---------------
> Intel i7-8650U   1.34812  1.30038    1.2933           18.0474
>
> Machine          CONTROL  REL_STORE  STORE_REL_FENCE  STORE_TWO_FENCE
> -------          -------  ---------  ---------------  ---------------
> Cortex A53       7.15224  10.7282    7.15221          10.013
> Cortex A55       2.77965  8.89654    4.44787          7.78331
> Cortex A76       1.78021  1.86095    5.33088          8.88462
> Cortex X1        2.14252  2.14258    4.32982          7.05234
>
> Reordered tests (using -r flag on microbenchmark)
> Machine          CONTROL  REL_STORE  STORE_REL_FENCE  STORE_TWO_FENCE
> -------          -------  ---------  ---------------  ---------------
> Cortex A53       7.15227  10.7282    7.16133          10.034
> Cortex A55       2.78024  8.89574    4.44844          7.78428
> Cortex A76       1.77686  1.81081    5.3301           8.88346
> Cortex X1        2.14254  2.14251    4.3273           7.05239
>
> Benchmark Interpretation
> --------
> As expected, out-of-order machines are significantly faster with the
> REL_STORE mappings. Unexpectedly, the in-order machines are
> significantly slower with REL_STORE than with STORE_REL_FENCE.
>
> Most machines in the wild are expected to use Table A.7 once the
> instructions are introduced.
> Incurring this added cost now will make it easier for compiled RISC-V
> binaries to transition to the A.7 memory model mapping.
>
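To put the added cost in numbers, the slowdown of the two-fence store relative to the release-fence store can be computed directly from the first results table. A small sketch; the `Row` type, `kRows` array, and `compat_slowdown` helper are hypothetical names, and the values are copied from the STORE_REL_FENCE and STORE_TWO_FENCE columns of that table.

```cpp
#include <cassert>

// One row of the first (non-reordered) results table, in ns per iteration.
struct Row {
  const char *machine;
  double store_rel_fence;  // STORE_REL_FENCE column (closest to Table A.6)
  double store_two_fence;  // STORE_TWO_FENCE column (closest to this patchset)
};

const Row kRows[] = {
  {"Cortex A53", 7.15221, 10.013},
  {"Cortex A55", 4.44787, 7.78331},
  {"Cortex A76", 5.33088, 8.88462},
  {"Cortex X1", 4.32982, 7.05234},
};

// Slowdown factor of the two-fence store relative to the release-fence store.
double compat_slowdown(const Row &row) {
  assert(row.store_rel_fence > 0.0);
  return row.store_two_fence / row.store_rel_fence;
}
```

On these figures the two-fence store costs roughly 1.4x to 1.75x the release-fence store, depending on the core.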
> The performance benefits of moving to A.7 can be more clearly seen using
> an almost-all-load microbenchmark (included on page 3 of Hans’
> proposal). The code for that microbenchmark is attached below [5].
> https://lists.riscv.org/g/tech-unprivileged/attachment/382/0/load-acquire110422.pdf
> https://lists.riscv.org/g/tech-unprivileged/topic/92916241
>
> Caveats
> --------
> This is a very synthetic microbenchmark that represents what is expected
> to be a very unlikely workload. Nevertheless, it's helpful to see the
> worst-case price we are paying for compatibility.
>
> “All times include an entire loop iteration, indirect dispatch and all.
> The benchmark alternates tests, but does not lock CPU frequency. Since a
> single core was in use, I expect this was running at basically full
> speed. Any throttling affected everything more or less uniformly.”
> - Hans Boehm
>
> Patchset overview
> --------
> Patch 1 simplifies the memmodel to ignore MEMMODEL_SYNC_* cases (legacy
> cases that aren't handled differently for RISC-V).
> Patches 2-6 make the mappings strictly stronger.
> Patches 7-10 weaken the mappings to be in line with table A.6 of the ISA
> manual.
> Patch 11 adds some basic conformance tests to ensure the implemented
> mapping matches table A.6 with stronger SEQ_CST stores.
>
> Conformance test cases notes
> --------
> The conformance tests in this patch are a good sanity check but do not
> guarantee that Table A.6 is followed exactly. They check that the right
> instructions are emitted (e.g. fence rw,r) but not the order of those
> instructions.
>
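The gap between checking that instructions are present and checking their order can be illustrated with two string checks over assembly text. This is a hedged sketch, not how the testsuite is implemented; `contains_all` and `appears_in_order` are hypothetical helpers, and the assembly strings in the note below are made up for the example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Presence check: every pattern occurs somewhere in the assembly text.
bool contains_all(const std::string &assembly,
                  const std::vector<std::string> &patterns) {
  for (const std::string &p : patterns) {
    if (assembly.find(p) == std::string::npos) return false;
  }
  return true;
}

// Order check: each pattern occurs after the end of the previous match.
bool appears_in_order(const std::string &assembly,
                      const std::vector<std::string> &patterns) {
  std::string::size_type pos = 0;
  for (const std::string &p : patterns) {
    std::string::size_type found = assembly.find(p, pos);
    if (found == std::string::npos) return false;
    pos = found + p.size();
  }
  return true;
}
```

For a release store, output such as `sw a1,0(a0)` followed by `fence rw,w` passes the presence check for `{"fence rw,w", "sw"}` but fails the order check, which is the kind of mistake a presence-only test would miss.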
> LLVM mapping notes
> --------
> LLVM emits corresponding fences for atomic_signal_fence instructions.
> This seems to be an oversight since AFAIK atomic_signal_fence acts as a
> compiler directive. GCC does not emit any fences for atomic_signal_fence
> instructions.
>
> Future work
> --------
> There still remains some work to be done in this space after this
> patchset fixes the correctness of the GCC mappings.
> * Look into explicitly handling subword loads/stores.
> * Look into using AMOSWAP.rl for store words/doubles.
> * L{b|h|w|d}.aq/rl & S{b|h|w|d}.aq/rl support once ratified.
> * Ztso mappings.
>
> Prior Patchsets
> --------
> Patchset v1:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-April/592950.html
>
> Patchset v2:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-April/615264.html
>
> Patchset v3:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-April/615431.html
>
> Patchset v4:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-April/615748.html
>
> Changelogs
> --------
> Changes for v2:
> * Use memmodel_base rather than a custom simplify_memmodel function
> (Inspired by Christoph Muellner's patch 1/9)
> * Move instruction styling change from [v1 5/7] to [v2 3/8] to reduce
> [v2 6/8]'s complexity
> * Eliminated %K flag for atomic store introduced in v1 in favor of
> if/else
> * Rebase/test
>
> Changes for v3:
> * Use a trailing fence for atomic stores to be compatible with table A.7
> * Emit an optimized fence r,rw following a SEQ_CST load
> * Consolidate tests in [PATCH v3 10/10]
> * Add tests for basic A.6 conformance
>
> Changes for v4:
> * Update cover letter to cover more of the reasoning behind moving to a
> compatibility mapping
> * Improve conformance testcases patch assertions and add new
> compare-exchange testcases
>
> Changes for v5:
> * Update cover letter to cover more context and reasoning behind moving
> to a compatibility mapping
> * Rebase to include the subword-atomic cases introduced here:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616080.html
> * Add basic amo-add subword atomic testcases
> * Reformat changelogs
> * Fix misc. whitespace issues
>
> [1] GCC port with mappings merged 06 Feb 2017
> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=09cae7507d9e88f2b05cf3a9404bf181e65ccbac
>
> [2] A.6 mappings added to ISA manual 12 Dec 2017
> https://github.com/riscv/riscv-isa-manual/commit/9da1a115bcc4fe327f35acceb851d4850d12e9fa
>
> [3] Hans Boehm almost-all-store Microbenchmark:
> // Copyright 2023 Google LLC.
> // SPDX-License-Identifier: Apache-2.0
>
> #include <atomic>
> #include <cstdint>   // uint64_t
> #include <cstdlib>   // exit
> #include <iostream>
> #include <utility>   // std::swap
> #include <time.h>
>
> static constexpr int INNER_ITERS = 10'000'000;
> static constexpr int OUTER_ITERS = 20;
> static constexpr int N_TESTS = 4;
>
> volatile int the_volatile(17);
> std::atomic<int> the_atomic(17);
>
> void test1(int i) {
>   the_volatile = i;
> }
>
> void test2(int i) {
>   the_atomic.store(i, std::memory_order_release);
> }
>
> void test3(int i) {
>   atomic_thread_fence(std::memory_order_release);
>   the_atomic.store(i, std::memory_order_relaxed);
> }
>
> void test4(int i) {
>   atomic_thread_fence(std::memory_order_release);
>   the_atomic.store(i, std::memory_order_relaxed);
>   atomic_thread_fence(std::memory_order_seq_cst);
> }
>
> typedef void (*int_func)(int);
>
> uint64_t getnanos() {
>   struct timespec result;
>   if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &result) != 0) {
>     std::cerr << "clock_gettime() failed\n";
>     exit(1);
>   }
>   return (uint64_t)result.tv_nsec + 1'000'000'000 * (uint64_t)result.tv_sec;
> }
>
> int_func tests[N_TESTS] = { test1, test2, test3, test4 };
> const char *test_names[N_TESTS] =
>     { "control", "release store", "store with release fence", "store with two fences" };
> uint64_t total_time[N_TESTS] = { 0 };
>
> int main(int argc, char **argv) {
>   struct timespec res;
>   if (clock_getres(CLOCK_PROCESS_CPUTIME_ID, &res) != 0) {
>     std::cerr << "clock_getres() failed\n";
>     exit(1);
>   } else {
>     std::cout << "nsec resolution = " << res.tv_nsec << std::endl;
>   }
>   if (argc == 2 && argv[1][0] == 'r') {
>     // Run tests in reverse order.
>     for (int i = 0; i < N_TESTS / 2; ++i) {
>       std::swap(tests[i], tests[N_TESTS - 1 - i]);
>       std::swap(test_names[i], test_names[N_TESTS - 1 - i]);
>     }
>   }
>   for (int i = 0; i < OUTER_ITERS; ++i) {
>     // Alternate tests to minimize bias due to thermal throttling.
>     for (int j = 0; j < N_TESTS; ++j) {
>       uint64_t start_time = getnanos();
>       for (int k = 1; k <= INNER_ITERS; ++k) {
>         tests[j](k);  // Provides memory accesses between tests.
>       }
>       // Ignore first iteration for all tests. The first iteration of the first test is
>       // empirically slightly slower.
>       if (i != 0) {
>         total_time[j] += getnanos() - start_time;
>       }
>       if ((tests[j] == test1 ? the_volatile : the_atomic.load()) != INNER_ITERS) {
>         std::cerr << "result check failed, test = " << j << ", " << the_volatile << std::endl;
>         exit(1);
>       }
>     }
>   }
>   for (int i = 0; i < N_TESTS; ++i) {
>     double nsecs_per_iter = (double) total_time[i] / INNER_ITERS / (OUTER_ITERS - 1);
>     std::cout << test_names[i] << " took " << nsecs_per_iter << " nseconds per iteration\n";
>   }
>   exit(0);
> }
>
> [4] Hans Boehm Raw Microbenchmark Results
> Intel(R) Core(TM) i7-8650U (sanity check only):
>
> hboehm@hboehm-glaptop0:~/tests$ ./a.out
> nsec resolution = 1
> control took 1.34812 nseconds per iteration
> release store took 1.30038 nseconds per iteration
> store with release fence took 1.2933 nseconds per iteration
> store with two fences took 18.0474 nseconds per iteration
>
> Cortex A53 (Raspberry pi)
> hboehm@rpi3-20210823:~/tests$ ./a.out
> nsec resolution = 1
> control took 7.15224 nseconds per iteration
> release store took 10.7282 nseconds per iteration
> store with release fence took 7.15221 nseconds per iteration
> store with two fences took 10.013 nseconds per iteration
> hboehm@rpi3-20210823:~/tests$ ./a.out -r
> nsec resolution = 1
> control took 7.15227 nseconds per iteration
> release store took 10.7282 nseconds per iteration
> store with release fence took 7.16133 nseconds per iteration
> store with two fences took 10.034 nseconds per iteration
>
> Cortex A55 (Pixel 6 Pro)
>
> raven:/data/tmp # taskset 0f ./release-timer
> nsec resolution = 1
> control took 2.77965 nseconds per iteration
> release store took 8.89654 nseconds per iteration
> store with release fence took 4.44787 nseconds per iteration
> store with two fences took 7.78331 nseconds per iteration
> raven:/data/tmp # taskset 0f ./release-timer -r
> nsec resolution = 1
> control took 2.78024 nseconds per iteration
> release store took 8.89574 nseconds per iteration
> store with release fence took 4.44844 nseconds per iteration
> store with two fences took 7.78428 nseconds per iteration
>
> Cortex A76 (Pixel 6 Pro)
> raven:/data/tmp # taskset 30 ./release-timer -r
> nsec resolution = 1
> control took 1.77686 nseconds per iteration
> release store took 1.81081 nseconds per iteration
> store with release fence took 5.3301 nseconds per iteration
> store with two fences took 8.88346 nseconds per iteration
> raven:/data/tmp # taskset 30 ./release-timer
> nsec resolution = 1
> control took 1.78021 nseconds per iteration
> release store took 1.86095 nseconds per iteration
> store with release fence took 5.33088 nseconds per iteration
> store with two fences took 8.88462 nseconds per iteration
>
> Cortex X1 (Pixel 6 Pro)
> raven:/data/tmp # taskset c0 ./release-timer
> nsec resolution = 1
> control took 2.14252 nseconds per iteration
> release store took 2.14258 nseconds per iteration
> store with release fence took 4.32982 nseconds per iteration
> store with two fences took 7.05234 nseconds per iteration
> raven:/data/tmp # taskset c0 ./release-timer -r
> nsec resolution = 1
> control took 2.14254 nseconds per iteration
> release store took 2.14251 nseconds per iteration
> store with release fence took 4.3273 nseconds per iteration
> store with two fences took 7.05239 nseconds per iteration
>
> [5] Hans Boehm almost-all-load Microbenchmark:
> // Copyright 2023 Google LLC.
> // SPDX-License-Identifier: Apache-2.0
>
> #include <atomic>
> #include <cstdint>   // uint64_t
> #include <cstdlib>   // exit
> #include <iostream>
> #include <utility>   // std::swap
> #include <time.h>
>
> static constexpr int INNER_ITERS = 10'000'000;
> static constexpr int OUTER_ITERS = 20;
> static constexpr int N_TESTS = 4;
>
> volatile int the_volatile(17);
> std::atomic<int> the_atomic(17);
>
> int test1() {
>   return the_volatile;
> }
>
> int test2() {
>   return the_atomic.load(std::memory_order_acquire);
> }
>
> int test3() {
>   int result = the_atomic.load(std::memory_order_relaxed);
>   atomic_thread_fence(std::memory_order_acquire);
>   return result;
> }
>
> int test4() {
>   atomic_thread_fence(std::memory_order_seq_cst);
>   int result = the_atomic.load(std::memory_order_relaxed);
>   atomic_thread_fence(std::memory_order_acquire);
>   return result;
> }
>
> typedef int (*int_func)();
>
> uint64_t getnanos() {
>   struct timespec result;
>   if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &result) != 0) {
>     std::cerr << "clock_gettime() failed\n";
>     exit(1);
>   }
>   return (uint64_t)result.tv_nsec + 1'000'000'000 * (uint64_t)result.tv_sec;
> }
>
> int_func tests[N_TESTS] = { test1, test2, test3, test4 };
> const char *test_names[N_TESTS] =
>     { "control", "acquire load", "load with acquire fence", "load with two fences" };
> uint64_t total_time[N_TESTS] = { 0 };
>
> unsigned int sum, last_sum = 0;
>
> int main(int argc, char **argv) {
>   struct timespec res;
>   if (clock_getres(CLOCK_PROCESS_CPUTIME_ID, &res) != 0) {
>     std::cerr << "clock_getres() failed\n";
>     exit(1);
>   } else {
>     std::cout << "nsec resolution = " << res.tv_nsec << std::endl;
>   }
>   if (argc == 2 && argv[1][0] == 'r') {
>     // Run tests in reverse order.
>     for (int i = 0; i < N_TESTS / 2; ++i) {
>       std::swap(tests[i], tests[N_TESTS - 1 - i]);
>       std::swap(test_names[i], test_names[N_TESTS - 1 - i]);
>     }
>   }
>   for (int i = 0; i < OUTER_ITERS; ++i) {
>     // Alternate tests to minimize bias due to thermal throttling.
>     for (int j = 0; j < N_TESTS; ++j) {
>       sum = 0;
>       uint64_t start_time = getnanos();
>       for (int k = 0; k < INNER_ITERS; ++k) {
>         sum += tests[j]();  // Provides memory accesses between tests.
>       }
>       // Ignore first iteration for all tests. The first iteration of the first test is
>       // empirically slightly slower.
>       if (i != 0) {
>         total_time[j] += getnanos() - start_time;
>       }
>       // Every test reads the same value, so all sums must be nonzero and equal.
>       if (sum == 0 || (last_sum != 0 && sum != last_sum)) {
>         std::cerr << "result check failed";
>         exit(1);
>       }
>       last_sum = sum;
>     }
>   }
>   for (int i = 0; i < N_TESTS; ++i) {
>     double nsecs_per_iter = (double) total_time[i] / INNER_ITERS / (OUTER_ITERS - 1);
>     std::cout << test_names[i] << " took " << nsecs_per_iter << " nseconds per iteration\n";
>   }
>   exit(0);
> }
>
> Patrick O'Neill (11):
> RISC-V: Eliminate SYNC memory models
> RISC-V: Enforce Libatomic LR/SC SEQ_CST
> RISC-V: Enforce subword atomic LR/SC SEQ_CST
> RISC-V: Enforce atomic compare_exchange SEQ_CST
> RISC-V: Add AMO release bits
> RISC-V: Strengthen atomic stores
> RISC-V: Eliminate AMO op fences
> RISC-V: Weaken LR/SC pairs
> RISC-V: Weaken mem_thread_fence
> RISC-V: Weaken atomic loads
> RISC-V: Table A.6 conformance tests
>
> gcc/config/riscv/riscv-protos.h | 3 +
> gcc/config/riscv/riscv.cc | 66 ++++--
> gcc/config/riscv/sync.md | 194 ++++++++++++------
> .../riscv/amo-table-a-6-amo-add-1.c | 8 +
> .../riscv/amo-table-a-6-amo-add-2.c | 8 +
> .../riscv/amo-table-a-6-amo-add-3.c | 8 +
> .../riscv/amo-table-a-6-amo-add-4.c | 8 +
> .../riscv/amo-table-a-6-amo-add-5.c | 8 +
> .../riscv/amo-table-a-6-compare-exchange-1.c | 10 +
> .../riscv/amo-table-a-6-compare-exchange-2.c | 10 +
> .../riscv/amo-table-a-6-compare-exchange-3.c | 10 +
> .../riscv/amo-table-a-6-compare-exchange-4.c | 10 +
> .../riscv/amo-table-a-6-compare-exchange-5.c | 10 +
> .../riscv/amo-table-a-6-compare-exchange-6.c | 11 +
> .../riscv/amo-table-a-6-compare-exchange-7.c | 10 +
> .../gcc.target/riscv/amo-table-a-6-fence-1.c | 8 +
> .../gcc.target/riscv/amo-table-a-6-fence-2.c | 10 +
> .../gcc.target/riscv/amo-table-a-6-fence-3.c | 10 +
> .../gcc.target/riscv/amo-table-a-6-fence-4.c | 10 +
> .../gcc.target/riscv/amo-table-a-6-fence-5.c | 10 +
> .../gcc.target/riscv/amo-table-a-6-load-1.c | 9 +
> .../gcc.target/riscv/amo-table-a-6-load-2.c | 11 +
> .../gcc.target/riscv/amo-table-a-6-load-3.c | 11 +
> .../gcc.target/riscv/amo-table-a-6-store-1.c | 9 +
> .../gcc.target/riscv/amo-table-a-6-store-2.c | 11 +
> .../riscv/amo-table-a-6-store-compat-3.c | 11 +
> .../riscv/amo-table-a-6-subword-amo-add-1.c | 9 +
> .../riscv/amo-table-a-6-subword-amo-add-2.c | 9 +
> .../riscv/amo-table-a-6-subword-amo-add-3.c | 9 +
> .../riscv/amo-table-a-6-subword-amo-add-4.c | 9 +
> .../riscv/amo-table-a-6-subword-amo-add-5.c | 9 +
> gcc/testsuite/gcc.target/riscv/pr89835.c | 9 +
> libgcc/config/riscv/atomic.c | 4 +-
> 33 files changed, 467 insertions(+), 75 deletions(-)
> create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-amo-add-1.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-amo-add-2.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-amo-add-3.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-amo-add-4.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-amo-add-5.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-1.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-2.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-3.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-4.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-5.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-6.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-7.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-fence-1.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-fence-2.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-fence-3.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-fence-4.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-fence-5.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-load-1.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-load-2.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-load-3.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-store-1.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-store-2.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-store-compat-3.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-subword-amo-add-1.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-subword-amo-add-2.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-subword-amo-add-3.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-subword-amo-add-4.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-subword-amo-add-5.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/pr89835.c

These changes address and fix all the issues I reported/I'm aware of,
thank you!

Tested-by: Andrea Parri <andrea@rivosinc.com>

  Andrea