public inbox for gcc-patches@gcc.gnu.org
From: Andrea Parri <andrea@rivosinc.com>
To: Patrick O'Neill <patrick@rivosinc.com>
Cc: gcc-patches@gcc.gnu.org, palmer@rivosinc.com,
	gnu-toolchain@rivosinc.com, vineetg@rivosinc.com,
	andrew@sifive.com, kito.cheng@sifive.com, dlustig@nvidia.com,
	cmuellner@gcc.gnu.org, hboehm@google.com, jeffreyalaw@gmail.com
Subject: Re: [PATCH v5 00/11] RISC-V: Implement ISA Manual Table A.6 Mappings
Date: Thu, 27 Apr 2023 19:20:14 +0200
Message-ID: <ZEquzlKEegm0k3lx@andrea>
In-Reply-To: <20230427162301.1151333-1-patrick@rivosinc.com>

On Thu, Apr 27, 2023 at 09:22:50AM -0700, Patrick O'Neill wrote:
> This patchset aims to make the RISC-V atomics implementation stronger
> than the recommended mapping present in table A.6 of the ISA manual.
> https://github.com/riscv/riscv-isa-manual/blob/c7cf84547b3aefacab5463add1734c1602b67a49/src/memory.tex#L1083-L1157 
> 
> Context
> ---------
> GCC defined RISC-V mappings [1] before the Memory Model task group
> finalized their work and provided the ISA Manual Table A.6/A.7 mappings[2].
> 
> For at least a year now, we've known that the mappings were different,
> but it wasn't clear if these unique mappings had correctness issues.
> 
> Andrea Parri found an issue with the GCC mappings, showing that
> atomic_compare_exchange_weak_explicit(-,-,-,release,relaxed) mappings do
> not enforce release ordering guarantees. (Meaning the GCC mappings have
> a correctness issue).
>   https://inbox.sourceware.org/gcc-patches/Y1GbJuhcBFpPGJQ0@andrea/ 
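
For readers unfamiliar with the operation named above, here is a minimal
message-passing sketch of where release ordering on a successful
compare-exchange matters. The names flag, payload, publish and consume are
made up for illustration; this is not the litmus test from the linked report.

```cpp
#include <atomic>

std::atomic<int> flag{0};
int payload = 0;  // plain data, published through `flag`

// Writer: fill in the payload, then flip flag 0 -> 1 with a
// release-on-success / relaxed-on-failure weak compare-exchange.
// Assumes `flag` is still 0 when called (illustrative only).
void publish(int value) {
  payload = value;
  int expected = 0;
  while (!flag.compare_exchange_weak(expected, 1,
                                     std::memory_order_release,
                                     std::memory_order_relaxed)) {
    expected = 0;  // a weak CAS may fail spuriously; reset and retry
  }
}

// Reader: once the acquire load observes 1, the earlier write to
// `payload` must be visible, provided the CAS has release semantics.
int consume() {
  while (flag.load(std::memory_order_acquire) != 1) {
  }
  return payload;
}
```

In real use publish() and consume() would run on different threads. If the
generated code for the compare-exchange does not enforce release ordering,
the reader could in principle see the flag set but a stale payload.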
> 
> Why not A.6?
> ---------
> We can update our mappings now, so the obvious choice would be to
> implement Table A.6 (what LLVM implements/ISA manual recommends).
> 
> The reason why that isn't the best path forward for GCC is due to a
> proposal by Hans Boehm to add L{d|w|b|h}.aq/rl and S{d|w|b|h}.aq/rl.
> 
> For context, there is discussion about fast-tracking the addition of
> these instructions. The RISC-V architectural review committee supports
> adopting a "new and common atomics ABI for gcc and LLVM toolchains ...
> that assumes the addition of the preceding instructions". That common
> ABI is likely to be A.7.
>   https://lists.riscv.org/g/tech-privileged/message/1284 
> 
> Transitioning from A.6 to A.7 will cause an ABI break. We can hedge
> against that risk by emitting a conservative fence after SEQ_CST stores
> to make the mapping compatible with both A.6 and A.7.
> 
> What does a mapping compatible with both A.6 & A.7 look like?
> ---------
> It is exactly the same as Table A.6, but SEQ_CST stores have a trailing
> fence rw,rw. It's strictly stronger than Table A.6.
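
As a rough illustration of the difference described above, the sketch below
annotates a SEQ_CST store and load with the instruction sequences one would
expect for a 32-bit word. The comments reflect the Table A.6 recommendation
plus the trailing fence described in this letter; they are an assumption
about the shape of the output, not captured compiler output, and the names
shared_word, store_seq_cst and load_seq_cst are hypothetical.

```cpp
#include <atomic>

std::atomic<int> shared_word{0};

// SEQ_CST store.
//   Table A.6 mapping:      fence rw,w ; sw
//   Compatibility mapping:  fence rw,w ; sw ; fence rw,rw
// The trailing fence is the only difference between the two.
void store_seq_cst(int v) {
  shared_word.store(v, std::memory_order_seq_cst);
}

// SEQ_CST load, the same in both mappings:
//   fence rw,rw ; lw ; fence r,rw
int load_seq_cst() {
  return shared_word.load(std::memory_order_seq_cst);
}
```

The actual instructions emitted depend on the compiler version and flags, so
treat the comments as a reading aid rather than a specification.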
> 
> Microbenchmark
> ---------
> Hans Boehm helpfully wrote a microbenchmark [3] that uses ARM to give a
> rough estimate for the performance benefits/penalties of the different
> mappings. The microbenchmark is single threaded and almost-write-only.
> This case seems unlikely but is useful for getting a rough idea of the
> workload that would be impacted the most.
> 
> Testcases
> -------
> Control: A simple volatile store. This is most similar to a relaxed
> store.
> Release Store: This is most similar to Sw.rl (one of the instructions in
> Hans' proposal).
> Store with release fence: This is most similar to the mapping present in
> Table A.6.
> Store with two fences: This is most similar to the compatibility mapping
> present in this patchset.
> 
> Machines
> -------
> Intel(R) Core(TM) i7-8650U (sanity check only): x86 TSO
> Cortex A53 (Raspberry pi): ARM In order core
> Cortex A55 (Pixel 6 Pro): ARM In order core
> Cortex A76 (Pixel 6 Pro): ARM Out of order core
> Cortex X1 (Pixel 6 Pro): ARM Out of order core
> 
> Microbenchmark Results [4]
> --------
> Units are nsecs per iteration.
> 
> Sanity check
> Machine          CONTROL   REL_STORE   STORE_REL_FENCE   STORE_TWO_FENCE
> -------          -------   ---------   ---------------   ---------------
> Intel i7-8650U   1.34812   1.30038     1.2933            18.0474
> 
> 
> Machine    	CONTROL   REL_STORE   STORE_REL_FENCE   STORE_TWO_FENCE
> -------    	-------   ---------   ---------------   ---------------
> Cortex A53 	7.15224   10.7282     7.15221           10.013
> Cortex A55 	2.77965   8.89654     4.44787           7.78331
> Cortex A76 	1.78021   1.86095     5.33088           8.88462
> Cortex X1  	2.14252   2.14258     4.32982           7.05234
> 
> Reordered tests (using -r flag on microbenchmark)
> Machine    	CONTROL   REL_STORE   STORE_REL_FENCE   STORE_TWO_FENCE
> -------    	-------   ---------   ---------------   ---------------
> Cortex A53 	7.15227   10.7282     7.16133           10.034
> Cortex A55 	2.78024   8.89574     4.44844           7.78428
> Cortex A76 	1.77686   1.81081     5.3301            8.88346
> Cortex X1  	2.14254   2.14251     4.3273            7.05239
> 
> Benchmark Interpretation
> --------
> As expected, out-of-order machines are significantly faster with the
> REL_STORE mappings. Unexpectedly, the in-order machines are
> significantly slower with REL_STORE than with STORE_REL_FENCE.
> 
> Most machines in the wild are expected to use Table A.7 once the
> instructions are introduced. 
> Incurring this added cost now will make it easier for compiled RISC-V
> binaries to transition to the A.7 memory model mapping.
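
To put a number on the added cost, the sketch below divides the
STORE_TWO_FENCE column by the STORE_REL_FENCE column for each ARM machine in
the first (non-reordered) results table. The struct and function names are
invented for this example; only the figures come from the table.

```cpp
#include <cstdio>

struct Row {
  const char *machine;
  double control;
  double rel_store;
  double store_rel_fence;
  double store_two_fence;
};

// Values copied from the first results table, in nsecs per iteration.
static const Row kRows[] = {
    {"Cortex A53", 7.15224, 10.7282, 7.15221, 10.013},
    {"Cortex A55", 2.77965, 8.89654, 4.44787, 7.78331},
    {"Cortex A76", 1.78021, 1.86095, 5.33088, 8.88462},
    {"Cortex X1", 2.14252, 2.14258, 4.32982, 7.05234},
};

// Cost of the two-fence store (closest to the compatibility mapping)
// relative to the release-fence store (closest to Table A.6).
double compat_overhead(const Row &r) {
  return r.store_two_fence / r.store_rel_fence;
}

void print_overheads() {
  for (const Row &r : kRows) {
    std::printf("%-10s  %.2fx\n", r.machine, compat_overhead(r));
  }
}
```

On these figures the two-fence store costs roughly 1.4x to 1.75x the
release-fence store, in a benchmark that does almost nothing but stores.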
> 
> The performance benefits of moving to A.7 can be more clearly seen using
> an almost-all-load microbenchmark (included on page 3 of Hans’
> proposal). The code for that microbenchmark is attached below [5].
>   https://lists.riscv.org/g/tech-unprivileged/attachment/382/0/load-acquire110422.pdf 
>   https://lists.riscv.org/g/tech-unprivileged/topic/92916241 
> 
> Caveats
> --------
> This is a very synthetic microbenchmark that represents what is expected
> to be a very unlikely workload. Nevertheless, it's helpful to see the
> worst-case price we are paying for compatibility. 
> 
> “All times include an entire loop iteration, indirect dispatch and all.
> The benchmark alternates tests, but does not lock CPU frequency. Since a
> single core was in use, I expect this was running at basically full
> speed. Any throttling affected everything more or less uniformly.”
> - Hans Boehm
> 
> Patchset overview
> --------
> Patch 1 simplifies the memmodel to ignore MEMMODEL_SYNC_* cases (legacy
> cases that aren't handled differently for RISC-V).
> Patches 2-6 make the mappings strictly stronger.
> Patches 7-10 weaken the mappings to be in line with table A.6 of the ISA
> manual.
> Patch 11 adds some basic conformance tests to ensure the implemented
> mapping matches table A.6 with stronger SEQ_CST stores.
> 
> Conformance test cases notes
> --------
> The conformance tests in this patch are a good sanity check but do not
> guarantee that Table A.6 is followed exactly. They check that the right
> instructions are emitted (e.g. fence rw,r) but not the order of those
> instructions.
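
The limitation noted above can be shown with a toy checker. This is not how
the testsuite is implemented; it is a standalone sketch with hypothetical
helper names that contrasts a presence-only check with an order-aware one on
two hand-written assembly snippets.

```cpp
#include <cstddef>
#include <string>

// Count non-overlapping occurrences of `needle` in `haystack`.
std::size_t count_occurrences(const std::string &haystack,
                              const std::string &needle) {
  if (needle.empty()) {
    return 0;
  }
  std::size_t count = 0;
  for (std::size_t pos = haystack.find(needle); pos != std::string::npos;
       pos = haystack.find(needle, pos + needle.size())) {
    ++count;
  }
  return count;
}

// Presence-only check: passes whenever the instruction appears exactly once.
bool has_exactly_one(const std::string &asm_text, const std::string &insn) {
  return count_occurrences(asm_text, insn) == 1;
}

// Order-aware check: `first` must appear before `second`.
bool appears_before(const std::string &asm_text, const std::string &first,
                    const std::string &second) {
  const std::size_t a = asm_text.find(first);
  const std::size_t b = asm_text.find(second);
  return a != std::string::npos && b != std::string::npos && a < b;
}
```

Given "fence\trw,w\nsw\ta1,0(a0)\n" and the same two lines swapped, the
presence-only check accepts both, while the order-aware check accepts only
the first.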
> 
> LLVM mapping notes
> --------
> LLVM emits corresponding fences for atomic_signal_fence calls.
> This seems to be an oversight since AFAIK atomic_signal_fence acts only
> as a compiler barrier. GCC does not emit any fences for
> atomic_signal_fence.
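
For context, atomic_signal_fence only has to order accesses between a thread
and a signal handler running on that same thread, which is why a
compiler-level barrier is enough. A minimal sketch of that use, with
hypothetical names (ready, handler_data, observed, on_signal,
publish_to_handler):

```cpp
#include <atomic>
#include <csignal>

std::atomic<int> ready{0};  // assumed lock-free, so usable in a handler
int handler_data = 0;       // plain data shared with the handler
int observed = -1;          // what the handler saw

// Signal handler: runs on the thread that was interrupted.
extern "C" void on_signal(int) {
  if (ready.load(std::memory_order_relaxed) == 1) {
    // Keeps the compiler from hoisting the read of handler_data
    // above the check of `ready`.
    std::atomic_signal_fence(std::memory_order_acquire);
    observed = handler_data;
  }
}

void publish_to_handler(int value) {
  handler_data = value;
  // Keeps the compiler from sinking the write to handler_data
  // below the store to `ready`.
  std::atomic_signal_fence(std::memory_order_release);
  ready.store(1, std::memory_order_relaxed);
}
```

A caller would install the handler with std::signal, call
publish_to_handler(), and then deliver the signal (for example with
std::raise) on the same thread.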
> 
> Future work
> --------
> There still remains some work to be done in this space after this
> patchset fixes the correctness of the GCC mappings. 
> * Look into explicitly handling subword loads/stores.
> * Look into using AMOSWAP.rl for store words/doubles.
> * L{b|h|w|d}.aq/rl & S{b|h|w|d}.aq/rl support once ratified.
> * Ztso mappings.
> 
> Prior Patchsets
> --------
> Patchset v1:
>   https://gcc.gnu.org/pipermail/gcc-patches/2022-April/592950.html 
> 
> Patchset v2:
>   https://gcc.gnu.org/pipermail/gcc-patches/2023-April/615264.html 
> 
> Patchset v3:
>   https://gcc.gnu.org/pipermail/gcc-patches/2023-April/615431.html 
> 
> Patchset v4:
>   https://gcc.gnu.org/pipermail/gcc-patches/2023-April/615748.html 
> 
> Changelogs
> --------
> Changes for v2:
> * Use memmodel_base rather than a custom simplify_memmodel function
>   (Inspired by Christoph Muellner's patch 1/9)
> * Move instruction styling change from [v1 5/7] to [v2 3/8] to reduce
>   [v2 6/8]'s complexity
> * Eliminated %K flag for atomic store introduced in v1 in favor of
>   if/else
> * Rebase/test
> 
> Changes for v3:
> * Use a trailing fence for atomic stores to be compatible with table A.7
> * Emit an optimized fence r,rw following a SEQ_CST load
> * Consolidate tests in [PATCH v3 10/10]
> * Add tests for basic A.6 conformance
> 
> Changes for v4:
> * Update cover letter to cover more of the reasoning behind moving to a
>   compatibility mapping
> * Improve conformance testcases patch assertions and add new
>   compare-exchange testcases
> 
> Changes for v5:
> * Update cover letter to cover more context and reasoning behind moving
>   to a compatibility mapping
> * Rebase to include the subword-atomic cases introduced here:
>   https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616080.html
> * Add basic amo-add subword atomic testcases
> * Reformat changelogs
> * Fix misc. whitespace issues
> 
> [1] GCC port with mappings merged 06 Feb 2017
>   https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=09cae7507d9e88f2b05cf3a9404bf181e65ccbac
> 
> [2] A.6 mappings added to ISA manual 12 Dec 2017
> https://github.com/riscv/riscv-isa-manual/commit/9da1a115bcc4fe327f35acceb851d4850d12e9fa 
> 
> [3] Hans Boehm almost-all-store Microbenchmark:
> // Copyright 2023 Google LLC.
> // SPDX-License-Identifier: Apache-2.0
> 
> #include <atomic>
> #include <cstdint>   // uint64_t
> #include <cstdlib>   // exit
> #include <iostream>
> #include <time.h>
> #include <utility>   // std::swap
> 
> static constexpr int INNER_ITERS = 10'000'000;
> static constexpr int OUTER_ITERS = 20;
> static constexpr int N_TESTS = 4;
> 
> volatile int the_volatile(17);
> std::atomic<int> the_atomic(17);
> 
> void test1(int i) {
>   the_volatile = i;
> }
> 
> void test2(int i) {
>   the_atomic.store(i, std::memory_order_release);
> }
> 
> void test3(int i) {
>   atomic_thread_fence(std::memory_order_release);
>   the_atomic.store(i, std::memory_order_relaxed);
> }
> 
> void test4(int i) {
>   atomic_thread_fence(std::memory_order_release);
>   the_atomic.store(i, std::memory_order_relaxed);
>   atomic_thread_fence(std::memory_order_seq_cst);
> }
> 
> typedef void (*int_func)(int);
> 
> uint64_t getnanos() {
>   struct timespec result;
>   if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &result) != 0) {
> 	std::cerr << "clock_gettime() failed\n";
> 	exit(1);
>   }
>   return (uint64_t)result.tv_nsec + 1'000'000'000 * (uint64_t)result.tv_sec;
> }
> 
> int_func tests[N_TESTS] = { test1, test2, test3, test4 };
> const char *test_names[N_TESTS] =
> 	{ "control", "release store", "store with release fence", "store with two fences" };
> uint64_t total_time[N_TESTS] = { 0 };
> 
> int main(int argc, char **argv) {
>   struct timespec res;
>   if (clock_getres(CLOCK_PROCESS_CPUTIME_ID, &res) != 0) {
> 	std::cerr << "clock_getres() failed\n";
> 	exit(1);
>   } else {
> 	std::cout << "nsec resolution = " << res.tv_nsec << std::endl;
>   }
>   if (argc == 2 && (argv[1][0] == 'r' || (argv[1][0] == '-' && argv[1][1] == 'r'))) {
> 	// Run tests in reverse order.
> 	for (int i = 0; i < N_TESTS / 2; ++i) {
>   	std::swap(tests[i], tests[N_TESTS - 1 - i]);
>   	std::swap(test_names[i], test_names[N_TESTS - 1 - i]);
> 	}
>   }
>   for (int i = 0; i < OUTER_ITERS; ++i) {
> 	// Alternate tests to minimize bias due to thermal throttling.
> 	for (int j = 0; j < N_TESTS; ++j) {
>   	uint64_t start_time = getnanos();
>   	for (int k = 1; k <= INNER_ITERS; ++k) {
>     	tests[j](k); // Provides memory accesses between tests.
>   	}
>   	// Ignore first iteration for all tests. The first iteration of the first test is
>   	// empirically slightly slower.
>   	if (i != 0) {
>     	total_time[j] += getnanos() - start_time;
>   	}
>   	if ((tests[j] == test1 ? the_volatile : the_atomic.load()) != INNER_ITERS) {
>     	std::cerr << "result check failed, test = " << j << ", " << the_volatile << std::endl;
>     	exit(1);
>   	}
> 	}
>   }
>   for (int i = 0; i < N_TESTS; ++i) {
> 	double nsecs_per_iter = (double) total_time[i] / INNER_ITERS / (OUTER_ITERS - 1);
> 	std::cout << test_names[i] << " took " << nsecs_per_iter << " nseconds per iteration\n";
>   }
>   exit(0);
> }
> 
> [4] Hans Boehm Raw Microbenchmark Results
> Intel(R) Core(TM) i7-8650U (sanity check only):
> 
> hboehm@hboehm-glaptop0:~/tests$ ./a.out
> nsec resolution = 1
> control took 1.34812 nseconds per iteration
> release store took 1.30038 nseconds per iteration
> store with release fence took 1.2933 nseconds per iteration
> store with two fences took 18.0474 nseconds per iteration
> 
> Cortex A53 (Raspberry pi)
> hboehm@rpi3-20210823:~/tests$ ./a.out
> nsec resolution = 1
> control took 7.15224 nseconds per iteration
> release store took 10.7282 nseconds per iteration
> store with release fence took 7.15221 nseconds per iteration
> store with two fences took 10.013 nseconds per iteration
> hboehm@rpi3-20210823:~/tests$ ./a.out -r
> nsec resolution = 1
> control took 7.15227 nseconds per iteration
> release store took 10.7282 nseconds per iteration
> store with release fence took 7.16133 nseconds per iteration
> store with two fences took 10.034 nseconds per iteration
> 
> Cortex A55 (Pixel 6 Pro)
> 
> raven:/data/tmp # taskset 0f ./release-timer
> nsec resolution = 1
> control took 2.77965 nseconds per iteration
> release store took 8.89654 nseconds per iteration
> store with release fence took 4.44787 nseconds per iteration
> store with two fences took 7.78331 nseconds per iteration
> raven:/data/tmp # taskset 0f ./release-timer -r                                                                 	 
> nsec resolution = 1
> control took 2.78024 nseconds per iteration
> release store took 8.89574 nseconds per iteration
> store with release fence took 4.44844 nseconds per iteration
> store with two fences took 7.78428 nseconds per iteration
> 
> Cortex A76 (Pixel 6 Pro)
> raven:/data/tmp # taskset 30 ./release-timer -r                                                                 	 
> nsec resolution = 1
> control took 1.77686 nseconds per iteration
> release store took 1.81081 nseconds per iteration
> store with release fence took 5.3301 nseconds per iteration
> store with two fences took 8.88346 nseconds per iteration
> raven:/data/tmp # taskset 30 ./release-timer                                                                   	 
> nsec resolution = 1
> control took 1.78021 nseconds per iteration
> release store took 1.86095 nseconds per iteration
> store with release fence took 5.33088 nseconds per iteration
> store with two fences took 8.88462 nseconds per iteration
> 
> Cortex X1 (Pixel 6 Pro)
> raven:/data/tmp # taskset c0 ./release-timer                                                                   	 
> nsec resolution = 1
> control took 2.14252 nseconds per iteration
> release store took 2.14258 nseconds per iteration
> store with release fence took 4.32982 nseconds per iteration
> store with two fences took 7.05234 nseconds per iteration
> raven:/data/tmp # taskset c0 ./release-timer -r                                                                 	 
> nsec resolution = 1
> control took 2.14254 nseconds per iteration
> release store took 2.14251 nseconds per iteration
> store with release fence took 4.3273 nseconds per iteration
> store with two fences took 7.05239 nseconds per iteration
> 
> [5] Hans Boehm almost-all-load Microbenchmark:
> // Copyright 2023 Google LLC.
> // SPDX-License-Identifier: Apache-2.0
> 
> #include <atomic>
> #include <cstdint>   // uint64_t
> #include <cstdlib>   // exit
> #include <iostream>
> #include <time.h>
> #include <utility>   // std::swap
> 
> static constexpr int INNER_ITERS = 10'000'000;
> static constexpr int OUTER_ITERS = 20;
> static constexpr int N_TESTS = 4;
> 
> volatile int the_volatile(17);
> std::atomic<int> the_atomic(17);
> 
> int test1() {
>   return the_volatile;
> }
> 
> int test2() {
>   return the_atomic.load(std::memory_order_acquire);
> }
> 
> int test3() {
>   int result = the_atomic.load(std::memory_order_relaxed);
>   atomic_thread_fence(std::memory_order_acquire);
>   return result;
> }
> 
> int test4() {
>   atomic_thread_fence(std::memory_order_seq_cst);
>   int result = the_atomic.load(std::memory_order_relaxed);
>   atomic_thread_fence(std::memory_order_acquire);
>   return result;
> }
> 
> typedef int (*int_func)();
> 
> uint64_t getnanos() {
>   struct timespec result;
>   if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &result) != 0) {
> 	std::cerr << "clock_gettime() failed\n";
> 	exit(1);
>   }
>   return (uint64_t)result.tv_nsec + 1'000'000'000 * (uint64_t)result.tv_sec;
> }
> 
> int_func tests[N_TESTS] = { test1, test2, test3, test4 };
> const char *test_names[N_TESTS] =
> 	{ "control", "acquire load", "load with acquire fence", "load with two fences" };
> uint64_t total_time[N_TESTS] = { 0 };
> 
> unsigned int sum = 0, last_sum = 0;
> 
> int main(int argc, char **argv) {
>   struct timespec res;
>   if (clock_getres(CLOCK_PROCESS_CPUTIME_ID, &res) != 0) {
> 	std::cerr << "clock_getres() failed\n";
> 	exit(1);
>   } else {
> 	std::cout << "nsec resolution = " << res.tv_nsec << std::endl;
>   }
>   if (argc == 2 && (argv[1][0] == 'r' || (argv[1][0] == '-' && argv[1][1] == 'r'))) {
> 	// Run tests in reverse order.
> 	for (int i = 0; i < N_TESTS / 2; ++i) {
>   	std::swap(tests[i], tests[N_TESTS - 1 - i]);
>   	std::swap(test_names[i], test_names[N_TESTS - 1 - i]);
> 	}
>   }
>   for (int i = 0; i < OUTER_ITERS; ++i) {
> 	// Alternate tests to minimize bias due to thermal throttling.
> 	for (int j = 0; j < N_TESTS; ++j) {
>   	sum = 0;
>   	uint64_t start_time = getnanos();
>   	for (int k = 0; k < INNER_ITERS; ++k) {
>     	sum += tests[j](); // Provides memory accesses between tests.
>   	}
>   	// Ignore first iteration for all tests. The first iteration of the first test is
>   	// empirically slightly slower.
>   	if (i != 0) {
>     	total_time[j] += getnanos() - start_time;
>   	}
>   	if (sum == 0 || (last_sum != 0 && sum != last_sum)) {
>     	std::cerr << "result check failed\n";
>     	exit(1);
>   	}
>   	last_sum = sum;
> 	}
>   }
>   for (int i = 0; i < N_TESTS; ++i) {
> 	double nsecs_per_iter = (double) total_time[i] / INNER_ITERS / (OUTER_ITERS - 1);
> 	std::cout << test_names[i] << " took " << nsecs_per_iter << " nseconds per iteration\n";
>   }
>   exit(0);
> }
> 
> Patrick O'Neill (11):
>   RISC-V: Eliminate SYNC memory models
>   RISC-V: Enforce Libatomic LR/SC SEQ_CST
>   RISC-V: Enforce subword atomic LR/SC SEQ_CST
>   RISC-V: Enforce atomic compare_exchange SEQ_CST
>   RISC-V: Add AMO release bits
>   RISC-V: Strengthen atomic stores
>   RISC-V: Eliminate AMO op fences
>   RISC-V: Weaken LR/SC pairs
>   RISC-V: Weaken mem_thread_fence
>   RISC-V: Weaken atomic loads
>   RISC-V: Table A.6 conformance tests
> 
>  gcc/config/riscv/riscv-protos.h               |   3 +
>  gcc/config/riscv/riscv.cc                     |  66 ++++--
>  gcc/config/riscv/sync.md                      | 194 ++++++++++++------
>  .../riscv/amo-table-a-6-amo-add-1.c           |   8 +
>  .../riscv/amo-table-a-6-amo-add-2.c           |   8 +
>  .../riscv/amo-table-a-6-amo-add-3.c           |   8 +
>  .../riscv/amo-table-a-6-amo-add-4.c           |   8 +
>  .../riscv/amo-table-a-6-amo-add-5.c           |   8 +
>  .../riscv/amo-table-a-6-compare-exchange-1.c  |  10 +
>  .../riscv/amo-table-a-6-compare-exchange-2.c  |  10 +
>  .../riscv/amo-table-a-6-compare-exchange-3.c  |  10 +
>  .../riscv/amo-table-a-6-compare-exchange-4.c  |  10 +
>  .../riscv/amo-table-a-6-compare-exchange-5.c  |  10 +
>  .../riscv/amo-table-a-6-compare-exchange-6.c  |  11 +
>  .../riscv/amo-table-a-6-compare-exchange-7.c  |  10 +
>  .../gcc.target/riscv/amo-table-a-6-fence-1.c  |   8 +
>  .../gcc.target/riscv/amo-table-a-6-fence-2.c  |  10 +
>  .../gcc.target/riscv/amo-table-a-6-fence-3.c  |  10 +
>  .../gcc.target/riscv/amo-table-a-6-fence-4.c  |  10 +
>  .../gcc.target/riscv/amo-table-a-6-fence-5.c  |  10 +
>  .../gcc.target/riscv/amo-table-a-6-load-1.c   |   9 +
>  .../gcc.target/riscv/amo-table-a-6-load-2.c   |  11 +
>  .../gcc.target/riscv/amo-table-a-6-load-3.c   |  11 +
>  .../gcc.target/riscv/amo-table-a-6-store-1.c  |   9 +
>  .../gcc.target/riscv/amo-table-a-6-store-2.c  |  11 +
>  .../riscv/amo-table-a-6-store-compat-3.c      |  11 +
>  .../riscv/amo-table-a-6-subword-amo-add-1.c   |   9 +
>  .../riscv/amo-table-a-6-subword-amo-add-2.c   |   9 +
>  .../riscv/amo-table-a-6-subword-amo-add-3.c   |   9 +
>  .../riscv/amo-table-a-6-subword-amo-add-4.c   |   9 +
>  .../riscv/amo-table-a-6-subword-amo-add-5.c   |   9 +
>  gcc/testsuite/gcc.target/riscv/pr89835.c      |   9 +
>  libgcc/config/riscv/atomic.c                  |   4 +-
>  33 files changed, 467 insertions(+), 75 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-amo-add-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-amo-add-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-amo-add-3.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-amo-add-4.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-amo-add-5.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-3.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-4.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-5.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-6.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-7.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-fence-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-fence-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-fence-3.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-fence-4.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-fence-5.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-load-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-load-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-load-3.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-store-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-store-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-store-compat-3.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-subword-amo-add-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-subword-amo-add-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-subword-amo-add-3.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-subword-amo-add-4.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-subword-amo-add-5.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/pr89835.c

These changes address and fix all the issues I reported/I'm aware of,
thank you!

Tested-by: Andrea Parri <andrea@rivosinc.com>

  Andrea
