[PATCH v5 00/11] RISC-V: Implement ISA Manual Table A.6 Mappings

public inbox for gcc-patches@gcc.gnu.org

From: Patrick O'Neill <patrick@rivosinc.com>
To: gcc-patches@gcc.gnu.org
Cc: palmer@rivosinc.com, gnu-toolchain@rivosinc.com,
	vineetg@rivosinc.com, andrew@sifive.com, kito.cheng@sifive.com,
	dlustig@nvidia.com, cmuellner@gcc.gnu.org, andrea@rivosinc.com,
	hboehm@google.com, jeffreyalaw@gmail.com,
	Patrick O'Neill <patrick@rivosinc.com>
Subject: [PATCH v5 00/11] RISC-V: Implement ISA Manual Table A.6 Mappings
Date: Thu, 27 Apr 2023 09:22:50 -0700
Message-ID: <20230427162301.1151333-1-patrick@rivosinc.com>
In-Reply-To: <20230414170942.1695672-1-patrick@rivosinc.com>

This patchset aims to make the RISC-V atomics implementation stronger
than the recommended mapping present in Table A.6 of the ISA manual.
https://github.com/riscv/riscv-isa-manual/blob/c7cf84547b3aefacab5463add1734c1602b67a49/src/memory.tex#L1083-L1157 

Context
---------
GCC defined its RISC-V mappings [1] before the Memory Model task group
finalized their work and provided the ISA Manual Table A.6/A.7 mappings [2].

For at least a year now, we've known that the mappings were different,
but it wasn't clear if these unique mappings had correctness issues.

Andrea Parri found an issue with the GCC mappings, showing that the
atomic_compare_exchange_weak_explicit(-,-,-,release,relaxed) mapping does
not enforce release ordering guarantees (meaning the GCC mappings have
a correctness issue).
  https://inbox.sourceware.org/gcc-patches/Y1GbJuhcBFpPGJQ0@andrea/ 
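
To make the ordering requirement concrete, here is a minimal sketch of the
kind of code that relies on it. It is not taken from Andrea's report; the
names payload, flag, publish and try_consume are hypothetical. A
compare-exchange with release ordering on success must make the earlier
plain write to payload visible to any thread that acquire-loads the
updated flag.

```cpp
#include <atomic>

int payload = 0;            // plain data published through the flag
std::atomic<int> flag{0};   // 0 = empty, 1 = published

// Publish `value`: write the payload, then flip the flag 0 -> 1 with a
// compare-exchange that is release on success and relaxed on failure.
// Returns false if another publisher already set the flag.
bool publish(int value) {
  payload = value;
  int expected = 0;
  while (!flag.compare_exchange_weak(expected, 1,
                                     std::memory_order_release,
                                     std::memory_order_relaxed)) {
    if (expected != 0) {
      return false;  // genuine failure: the flag was already set
    }
    // Otherwise the failure was spurious (expected is still 0), so retry.
  }
  return true;
}

// Returns the payload if the flag has been published, otherwise -1.
// The acquire load pairs with the release compare-exchange above.
int try_consume() {
  if (flag.load(std::memory_order_acquire) == 1) {
    return payload;
  }
  return -1;
}
```

If publish() and try_consume() run on different threads, a mapping that
drops the release ordering would allow try_consume() to see flag == 1 and
still read a stale payload.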

Why not A.6?
---------
We can update our mappings now, so the obvious choice would be to
implement Table A.6 (what LLVM implements/ISA manual recommends).

The reason why that isn't the best path forward for GCC is due to a
proposal by Hans Boehm to add L{d|w|b|h}.aq/rl and S{d|w|b|h}.aq/rl.

For context, there is discussion about fast-tracking the addition of
these instructions. The RISC-V architectural review committee supports
adopting a "new and common atomics ABI for gcc and LLVM toolchains ...
that assumes the addition of the preceding instructions". That common
ABI is likely to be A.7.
  https://lists.riscv.org/g/tech-privileged/message/1284 

Transitioning from A.6 to A.7 will cause an ABI break. We can hedge
against that risk by emitting a conservative fence after SEQ_CST stores
to make the mapping compatible with both A.6 and A.7.

What does a mapping compatible with both A.6 & A.7 look like?
---------
It is exactly the same as Table A.6, but SEQ_CST stores have a trailing
fence rw,rw. It's strictly stronger than Table A.6.
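
As a rough illustration of what the SEQ_CST store mapping has to
guarantee, the sketch below is the classic store-buffering litmus test
written against std::atomic. The function names are hypothetical and the
instruction sequences in the comments are only a reading of the
description above (Table A.6's "fence rw,w; s{b|h|w|d}" plus the trailing
fence), not compiler output.

```cpp
#include <atomic>
#include <thread>

std::atomic<int> x{0}, y{0};

// One round of the store-buffering litmus test. Returns true if both
// threads read 0, an outcome that sequential consistency forbids.
bool store_buffering_round() {
  x.store(0, std::memory_order_relaxed);
  y.store(0, std::memory_order_relaxed);
  int r0 = -1, r1 = -1;
  std::thread t0([&] {
    // Compatibility mapping as described above (assumed rendering):
    //   fence rw,w ; sw ; fence rw,rw
    x.store(1, std::memory_order_seq_cst);
    r0 = y.load(std::memory_order_seq_cst);
  });
  std::thread t1([&] {
    y.store(1, std::memory_order_seq_cst);
    r1 = x.load(std::memory_order_seq_cst);
  });
  t0.join();
  t1.join();
  return r0 == 0 && r1 == 0;
}

// Runs `rounds` rounds and counts how often the forbidden outcome shows up.
int count_forbidden(int rounds) {
  int forbidden = 0;
  for (int i = 0; i < rounds; ++i) {
    if (store_buffering_round()) {
      ++forbidden;
    }
  }
  return forbidden;
}
```

A correct SEQ_CST mapping must keep count_forbidden() at zero. The
trailing fence keeps the store ordered before the later load without
relying on a fence in front of that load.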

Microbenchmark
---------
Hans Boehm helpfully wrote a microbenchmark [3] that uses ARM hardware to
give a rough estimate of the performance benefits/penalties of the different
mappings. The microbenchmark is single threaded and almost-write-only.
This case seems unlikely but is useful for getting a rough idea of the
workload that would be impacted the most.

Testcases
-------
Control: A simple volatile store. This is most similar to a relaxed
store.
Release Store: This is most similar to Sw.rl (one of the instructions in
Hans' proposal).
Store with release fence: This is most similar to the mapping present in
Table A.6.
Store with two fences: This is most similar to the compatibility mapping
present in this patchset.

Machines
-------
Intel(R) Core(TM) i7-8650U (sanity check only): x86 TSO
Cortex A53 (Raspberry pi): ARM In order core
Cortex A55 (Pixel 6 Pro): ARM In order core
Cortex A76 (Pixel 6 Pro): ARM Out of order core
Cortex X1 (Pixel 6 Pro): ARM Out of order core

Microbenchmark Results [4]
--------
Units are nsecs per iteration.

Sanity check
Machine          CONTROL   REL_STORE   STORE_REL_FENCE   STORE_TWO_FENCE
-------          -------   ---------   ---------------   ---------------
Intel i7-8650U   1.34812   1.30038     1.2933            18.0474


Machine    	CONTROL   REL_STORE   STORE_REL_FENCE   STORE_TWO_FENCE
-------    	-------   ---------   ---------------   ---------------
Cortex A53 	7.15224   10.7282     7.15221           10.013
Cortex A55 	2.77965   8.89654     4.44787           7.78331
Cortex A76 	1.78021   1.86095     5.33088           8.88462
Cortex X1  	2.14252   2.14258     4.32982           7.05234

Reordered tests (using -r flag on microbenchmark)
Machine    	CONTROL   REL_STORE   STORE_REL_FENCE   STORE_TWO_FENCE
-------    	-------   ---------   ---------------   ---------------
Cortex A53      7.15227   10.7282     7.16133           10.034
Cortex A55 	2.78024   8.89574     4.44844           7.78428
Cortex A76 	1.77686   1.81081     5.3301            8.88346
Cortex X1  	2.14254   2.14251     4.3273            7.05239

Benchmark Interpretation
--------
As expected, out-of-order machines are significantly faster with the
REL_STORE mapping than with either fence-based mapping. Unexpectedly, the
in-order machines are significantly slower with REL_STORE than with
STORE_REL_FENCE.

Most machines in the wild are expected to use Table A.7 once the
instructions are introduced. 
Incurring this added cost now will make it easier for compiled RISC-V
binaries to transition to the A.7 memory model mapping.
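
To put a number on the added cost, the following sketch divides the
STORE_TWO_FENCE column by the STORE_REL_FENCE column of the first ARM
table above. The struct and function names are hypothetical; only the
timings come from the table. On these four cores the ratio works out to
roughly 1.4x to 1.75x for this store-only workload.

```cpp
#include <cstdio>

// Timings in nsecs per iteration, copied from the results table above.
struct Row {
  const char *machine;
  double store_rel_fence;  // closest to the Table A.6 mapping
  double store_two_fence;  // closest to the compatibility mapping
};

constexpr Row rows[] = {
  {"Cortex A53", 7.15221, 10.013},
  {"Cortex A55", 4.44787, 7.78331},
  {"Cortex A76", 5.33088, 8.88462},
  {"Cortex X1",  4.32982, 7.05234},
};

// Relative cost of the compatibility mapping versus the A.6-like mapping.
constexpr double slowdown(const Row &r) {
  return r.store_two_fence / r.store_rel_fence;
}

// Prints one line per machine with the slowdown factor.
void print_slowdowns() {
  for (const Row &r : rows) {
    std::printf("%-10s  %.2fx\n", r.machine, slowdown(r));
  }
}
```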

The performance benefits of moving to A.7 can be more clearly seen using
an almost-all-load microbenchmark (included on page 3 of Hans’
proposal). The code for that microbenchmark is attached below [5].
  https://lists.riscv.org/g/tech-unprivileged/attachment/382/0/load-acquire110422.pdf 
  https://lists.riscv.org/g/tech-unprivileged/topic/92916241 

Caveats
--------
This is a very synthetic microbenchmark that represents what is expected
to be a very unlikely workload. Nevertheless, it's helpful to see the
worst-case price we are paying for compatibility. 

“All times include an entire loop iteration, indirect dispatch and all.
The benchmark alternates tests, but does not lock CPU frequency. Since a
single core was in use, I expect this was running at basically full
speed. Any throttling affected everything more or less uniformly.”
- Hans Boehm

Patchset overview
--------
Patch 1 simplifies the memmodel to ignore MEMMODEL_SYNC_* cases (legacy
cases that aren't handled differently for RISC-V).
Patches 2-6 make the mappings strictly stronger.
Patches 7-10 weaken the mappings to be in line with table A.6 of the ISA
manual.
Patch 11 adds some basic conformance tests to ensure the implemented
mapping matches table A.6 with stronger SEQ_CST stores.
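
As a sketch of what "ignoring MEMMODEL_SYNC_* cases" in patch 1 amounts
to: the changelog below says the patch uses memmodel_base, which strips
the legacy sync marker so that only the base ordering remains. The code
here is a simplified stand-in, not GCC's definitions; the enum, the flag
value and the function name are assumptions made for illustration.

```cpp
// Hypothetical stand-in for a memory-model enum with a legacy "sync" bit.
enum Model : unsigned {
  Relaxed = 0,
  Consume = 1,
  Acquire = 2,
  Release = 3,
  AcqRel  = 4,
  SeqCst  = 5,
};

// Assumed flag bit marking the legacy __sync variants.
constexpr unsigned kSyncFlag = 1u << 15;
constexpr unsigned kBaseMask = kSyncFlag - 1;

// Drop the sync marker so callers only have to handle the base models.
constexpr Model base_model(unsigned raw) {
  return static_cast<Model>(raw & kBaseMask);
}
```

With this shape, a backend that treats the sync variants like their base
models can switch on base_model(raw) and skip the extra cases.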

Conformance test cases notes
--------
The conformance tests in this patch are a good sanity check but do not
guarantee that the output exactly follows Table A.6. They check that the
right instructions are emitted (ex. fence rw,r) but not the order of those
instructions.
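
For readers unfamiliar with this style of test, here is a sketch of what
such a check can look like. It is not copied from the patchset: the
function name and the scan patterns are assumptions. The dg-do, dg-final
and scan-assembler-times directives are the usual GCC testsuite
mechanisms, and they count matching lines in the generated assembly
without looking at their relative order.

```cpp
/* { dg-do compile } */
/* Assumed patterns: one leading "fence rw,w" and one trailing
   "fence rw,rw" around a SEQ_CST store under the compatibility mapping.  */
/* { dg-final { scan-assembler-times "fence\trw,w" 1 } } */
/* { dg-final { scan-assembler-times "fence\trw,rw" 1 } } */

// Hypothetical function under test: a sequentially consistent store.
void store_seq_cst(int *dst, int value) {
  __atomic_store_n(dst, value, __ATOMIC_SEQ_CST);
}
```

Because each directive only counts occurrences, a test like this would
still pass if both fences were emitted before the store, which is the
limitation described above.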

LLVM mapping notes
--------
LLVM emits corresponding fences for atomic_signal_fence instructions.
This seems to be an oversight since AFAIK atomic_signal_fence acts as a
compiler directive. GCC does not emit any fences for atomic_signal_fence
instructions.
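
A small sketch of the intended use of atomic_signal_fence, with
hypothetical names: the fence orders accesses between a thread and a
signal handler running on that same thread, so it only has to constrain
compiler reordering and needs no hardware fence.

```cpp
#include <atomic>
#include <csignal>

int payload = 0;              // plain data handed to the handler
std::atomic<int> ready{0};    // set once payload is valid
std::atomic<int> seen{-1};    // what the handler observed

extern "C" void on_signal(int) {
  if (ready.load(std::memory_order_relaxed) == 1) {
    // Pairs with the release signal fence in publish_and_signal().
    std::atomic_signal_fence(std::memory_order_acquire);
    seen.store(payload, std::memory_order_relaxed);
  }
}

// Writes the payload, raises a signal on the current thread and returns
// the value the handler saw.
int publish_and_signal(int value) {
  std::signal(SIGINT, on_signal);
  payload = value;
  // Keeps the compiler from moving the payload write past the flag store.
  std::atomic_signal_fence(std::memory_order_release);
  ready.store(1, std::memory_order_relaxed);
  std::raise(SIGINT);  // the handler runs before raise() returns
  return seen.load(std::memory_order_relaxed);
}
```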

Future work
--------
There still remains some work to be done in this space after this
patchset fixes the correctness of the GCC mappings. 
* Look into explicitly handling subword loads/stores.
* Look into using AMOSWAP.rl for store words/doubles.
* L{b|h|w|d}.aq/rl & S{b|h|w|d}.aq/rl support once ratified.
* Ztso mappings.

Prior Patchsets
--------
Patchset v1:
  https://gcc.gnu.org/pipermail/gcc-patches/2022-April/592950.html 

Patchset v2:
  https://gcc.gnu.org/pipermail/gcc-patches/2023-April/615264.html 

Patchset v3:
  https://gcc.gnu.org/pipermail/gcc-patches/2023-April/615431.html 

Patchset v4:
  https://gcc.gnu.org/pipermail/gcc-patches/2023-April/615748.html 

Changelogs
--------
Changes for v2:
* Use memmodel_base rather than a custom simplify_memmodel function
  (Inspired by Christoph Muellner's patch 1/9)
* Move instruction styling change from [v1 5/7] to [v2 3/8] to reduce
  [v2 6/8]'s complexity
* Eliminated %K flag for atomic store introduced in v1 in favor of
  if/else
* Rebase/test

Changes for v3:
* Use a trailing fence for atomic stores to be compatible with table A.7
* Emit an optimized fence r,rw following a SEQ_CST load
* Consolidate tests in [PATCH v3 10/10]
* Add tests for basic A.6 conformance

Changes for v4:
* Update cover letter to cover more of the reasoning behind moving to a
  compatibility mapping
* Improve conformance testcases patch assertions and add new
  compare-exchange testcases

Changes for v5:
* Update cover letter to cover more context and reasoning behind moving
  to a compatibility mapping
* Rebase to include the subword-atomic cases introduced here:
  https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616080.html
* Add basic amo-add subword atomic testcases
* Reformat changelogs
* Fix misc. whitespace issues

[1] GCC port with mappings merged 06 Feb 2017
  https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=09cae7507d9e88f2b05cf3a9404bf181e65ccbac

[2] A.6 mappings added to ISA manual 12 Dec 2017
https://github.com/riscv/riscv-isa-manual/commit/9da1a115bcc4fe327f35acceb851d4850d12e9fa 

[3] Hans Boehm almost-all-store Microbenchmark:
// Copyright 2023 Google LLC.
// SPDX-License-Identifier: Apache-2.0

#include <atomic>
#include <iostream>
#include <utility>   // std::swap
#include <stdint.h>  // uint64_t
#include <stdlib.h>  // exit
#include <time.h>

static constexpr int INNER_ITERS = 10'000'000;
static constexpr int OUTER_ITERS = 20;
static constexpr int N_TESTS = 4;

volatile int the_volatile(17);
std::atomic<int> the_atomic(17);

void test1(int i) {
  the_volatile = i;
}

void test2(int i) {
  the_atomic.store(i, std::memory_order_release);
}

void test3(int i) {
  atomic_thread_fence(std::memory_order_release);
  the_atomic.store(i, std::memory_order_relaxed);
}

void test4(int i) {
  atomic_thread_fence(std::memory_order_release);
  the_atomic.store(i, std::memory_order_relaxed);
  atomic_thread_fence(std::memory_order_seq_cst);
}

typedef void (*int_func)(int);

uint64_t getnanos() {
  struct timespec result;
  if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &result) != 0) {
    std::cerr << "clock_gettime() failed\n";
    exit(1);
  }
  return (uint64_t)result.tv_nsec + 1'000'000'000 * (uint64_t)result.tv_sec;
}

int_func tests[N_TESTS] = { test1, test2, test3, test4 };
const char *test_names[N_TESTS] =
    { "control", "release store", "store with release fence", "store with two fences" };
uint64_t total_time[N_TESTS] = { 0 };

int main(int argc, char **argv) {
  struct timespec res;
  if (clock_getres(CLOCK_PROCESS_CPUTIME_ID, &res) != 0) {
    std::cerr << "clock_getres() failed\n";
    exit(1);
  } else {
    std::cout << "nsec resolution = " << res.tv_nsec << std::endl;
  }
  if (argc == 2 && argv[1][0] == 'r') {
    // Run tests in reverse order. This tests the first character of the
    // argument, so it matches "r" but not "-r".
    for (int i = 0; i < N_TESTS / 2; ++i) {
      std::swap(tests[i], tests[N_TESTS - 1 - i]);
      std::swap(test_names[i], test_names[N_TESTS - 1 - i]);
    }
  }
  for (int i = 0; i < OUTER_ITERS; ++i) {
    // Alternate tests to minimize bias due to thermal throttling.
    for (int j = 0; j < N_TESTS; ++j) {
      uint64_t start_time = getnanos();
      for (int k = 1; k <= INNER_ITERS; ++k) {
        tests[j](k); // Provides memory accesses between tests.
      }
      // Ignore first iteration for all tests. The first iteration of the first test is
      // empirically slightly slower.
      if (i != 0) {
        total_time[j] += getnanos() - start_time;
      }
      if ((tests[j] == test1 ? the_volatile : the_atomic.load()) != INNER_ITERS) {
        std::cerr << "result check failed, test = " << j << ", " << the_volatile << std::endl;
        exit(1);
      }
    }
  }
  for (int i = 0; i < N_TESTS; ++i) {
    double nsecs_per_iter = (double) total_time[i] / INNER_ITERS / (OUTER_ITERS - 1);
    std::cout << test_names[i] << " took " << nsecs_per_iter << " nseconds per iteration\n";
  }
  exit(0);
}

[4] Hans Boehm Raw Microbenchmark Results
Intel(R) Core(TM) i7-8650U (sanity check only):

hboehm@hboehm-glaptop0:~/tests$ ./a.out
nsec resolution = 1
control took 1.34812 nseconds per iteration
release store took 1.30038 nseconds per iteration
store with release fence took 1.2933 nseconds per iteration
store with two fences took 18.0474 nseconds per iteration

Cortex A53 (Raspberry pi)
hboehm@rpi3-20210823:~/tests$ ./a.out
nsec resolution = 1
control took 7.15224 nseconds per iteration
release store took 10.7282 nseconds per iteration
store with release fence took 7.15221 nseconds per iteration
store with two fences took 10.013 nseconds per iteration
hboehm@rpi3-20210823:~/tests$ ./a.out -r
nsec resolution = 1
control took 7.15227 nseconds per iteration
release store took 10.7282 nseconds per iteration
store with release fence took 7.16133 nseconds per iteration
store with two fences took 10.034 nseconds per iteration

Cortex A55 (Pixel 6 Pro)

raven:/data/tmp # taskset 0f ./release-timer
nsec resolution = 1
control took 2.77965 nseconds per iteration
release store took 8.89654 nseconds per iteration
store with release fence took 4.44787 nseconds per iteration
store with two fences took 7.78331 nseconds per iteration
raven:/data/tmp # taskset 0f ./release-timer -r                                                                 	 
nsec resolution = 1
control took 2.78024 nseconds per iteration
release store took 8.89574 nseconds per iteration
store with release fence took 4.44844 nseconds per iteration
store with two fences took 7.78428 nseconds per iteration

Cortex A76 (Pixel 6 Pro)
raven:/data/tmp # taskset 30 ./release-timer -r                                                                 	 
nsec resolution = 1
control took 1.77686 nseconds per iteration
release store took 1.81081 nseconds per iteration
store with release fence took 5.3301 nseconds per iteration
store with two fences took 8.88346 nseconds per iteration
raven:/data/tmp # taskset 30 ./release-timer                                                                   	 
nsec resolution = 1
control took 1.78021 nseconds per iteration
release store took 1.86095 nseconds per iteration
store with release fence took 5.33088 nseconds per iteration
store with two fences took 8.88462 nseconds per iteration

Cortex X1 (Pixel 6 Pro)
raven:/data/tmp # taskset c0 ./release-timer                                                                   	 
nsec resolution = 1
control took 2.14252 nseconds per iteration
release store took 2.14258 nseconds per iteration
store with release fence took 4.32982 nseconds per iteration
store with two fences took 7.05234 nseconds per iteration
raven:/data/tmp # taskset c0 ./release-timer -r                                                                 	 
nsec resolution = 1
control took 2.14254 nseconds per iteration
release store took 2.14251 nseconds per iteration
store with release fence took 4.3273 nseconds per iteration
store with two fences took 7.05239 nseconds per iteration

[5] Hans Boehm almost-all-load Microbenchmark:
// Copyright 2023 Google LLC.
// SPDX-License-Identifier: Apache-2.0

#include <atomic>
#include <iostream>
#include <utility>   // std::swap
#include <stdint.h>  // uint64_t
#include <stdlib.h>  // exit
#include <time.h>

static constexpr int INNER_ITERS = 10'000'000;
static constexpr int OUTER_ITERS = 20;
static constexpr int N_TESTS = 4;

volatile int the_volatile(17);
std::atomic<int> the_atomic(17);

int test1() {
  return the_volatile;
}

int test2() {
  return the_atomic.load(std::memory_order_acquire);
}

int test3() {
  int result = the_atomic.load(std::memory_order_relaxed);
  atomic_thread_fence(std::memory_order_acquire);
  return result;
}

int test4() {
  atomic_thread_fence(std::memory_order_seq_cst);
  int result = the_atomic.load(std::memory_order_relaxed);
  atomic_thread_fence(std::memory_order_acquire);
  return result;
}

typedef int (*int_func)();

uint64_t getnanos() {
  struct timespec result;
  if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &result) != 0) {
    std::cerr << "clock_gettime() failed\n";
    exit(1);
  }
  return (uint64_t)result.tv_nsec + 1'000'000'000 * (uint64_t)result.tv_sec;
}

int_func tests[N_TESTS] = { test1, test2, test3, test4 };
const char *test_names[N_TESTS] =
    { "control", "acquire load", "load with acquire fence", "load with two fences" };
uint64_t total_time[N_TESTS] = { 0 };

// Running sum of loaded values; compared across tests as a result check.
unsigned int sum, last_sum = 0;

int main(int argc, char **argv) {
  struct timespec res;
  if (clock_getres(CLOCK_PROCESS_CPUTIME_ID, &res) != 0) {
    std::cerr << "clock_getres() failed\n";
    exit(1);
  } else {
    std::cout << "nsec resolution = " << res.tv_nsec << std::endl;
  }
  if (argc == 2 && argv[1][0] == 'r') {
    // Run tests in reverse order.
    for (int i = 0; i < N_TESTS / 2; ++i) {
      std::swap(tests[i], tests[N_TESTS - 1 - i]);
      std::swap(test_names[i], test_names[N_TESTS - 1 - i]);
    }
  }
  for (int i = 0; i < OUTER_ITERS; ++i) {
    // Alternate tests to minimize bias due to thermal throttling.
    for (int j = 0; j < N_TESTS; ++j) {
      sum = 0;
      uint64_t start_time = getnanos();
      for (int k = 0; k < INNER_ITERS; ++k) {
        sum += tests[j](); // Provides memory accesses between tests.
      }
      // Ignore first iteration for all tests. The first iteration of the first test is
      // empirically slightly slower.
      if (i != 0) {
        total_time[j] += getnanos() - start_time;
      }
      // Every test must produce the same nonzero sum.
      if (sum == 0 || (last_sum != 0 && sum != last_sum)) {
        std::cerr << "result check failed";
        exit(1);
      }
      last_sum = sum;
    }
  }
  for (int i = 0; i < N_TESTS; ++i) {
    double nsecs_per_iter = (double) total_time[i] / INNER_ITERS / (OUTER_ITERS - 1);
    std::cout << test_names[i] << " took " << nsecs_per_iter << " nseconds per iteration\n";
  }
  exit(0);
}

Patrick O'Neill (11):
  RISC-V: Eliminate SYNC memory models
  RISC-V: Enforce Libatomic LR/SC SEQ_CST
  RISC-V: Enforce subword atomic LR/SC SEQ_CST
  RISC-V: Enforce atomic compare_exchange SEQ_CST
  RISC-V: Add AMO release bits
  RISC-V: Strengthen atomic stores
  RISC-V: Eliminate AMO op fences
  RISC-V: Weaken LR/SC pairs
  RISC-V: Weaken mem_thread_fence
  RISC-V: Weaken atomic loads
  RISC-V: Table A.6 conformance tests

 gcc/config/riscv/riscv-protos.h               |   3 +
 gcc/config/riscv/riscv.cc                     |  66 ++++--
 gcc/config/riscv/sync.md                      | 194 ++++++++++++------
 .../riscv/amo-table-a-6-amo-add-1.c           |   8 +
 .../riscv/amo-table-a-6-amo-add-2.c           |   8 +
 .../riscv/amo-table-a-6-amo-add-3.c           |   8 +
 .../riscv/amo-table-a-6-amo-add-4.c           |   8 +
 .../riscv/amo-table-a-6-amo-add-5.c           |   8 +
 .../riscv/amo-table-a-6-compare-exchange-1.c  |  10 +
 .../riscv/amo-table-a-6-compare-exchange-2.c  |  10 +
 .../riscv/amo-table-a-6-compare-exchange-3.c  |  10 +
 .../riscv/amo-table-a-6-compare-exchange-4.c  |  10 +
 .../riscv/amo-table-a-6-compare-exchange-5.c  |  10 +
 .../riscv/amo-table-a-6-compare-exchange-6.c  |  11 +
 .../riscv/amo-table-a-6-compare-exchange-7.c  |  10 +
 .../gcc.target/riscv/amo-table-a-6-fence-1.c  |   8 +
 .../gcc.target/riscv/amo-table-a-6-fence-2.c  |  10 +
 .../gcc.target/riscv/amo-table-a-6-fence-3.c  |  10 +
 .../gcc.target/riscv/amo-table-a-6-fence-4.c  |  10 +
 .../gcc.target/riscv/amo-table-a-6-fence-5.c  |  10 +
 .../gcc.target/riscv/amo-table-a-6-load-1.c   |   9 +
 .../gcc.target/riscv/amo-table-a-6-load-2.c   |  11 +
 .../gcc.target/riscv/amo-table-a-6-load-3.c   |  11 +
 .../gcc.target/riscv/amo-table-a-6-store-1.c  |   9 +
 .../gcc.target/riscv/amo-table-a-6-store-2.c  |  11 +
 .../riscv/amo-table-a-6-store-compat-3.c      |  11 +
 .../riscv/amo-table-a-6-subword-amo-add-1.c   |   9 +
 .../riscv/amo-table-a-6-subword-amo-add-2.c   |   9 +
 .../riscv/amo-table-a-6-subword-amo-add-3.c   |   9 +
 .../riscv/amo-table-a-6-subword-amo-add-4.c   |   9 +
 .../riscv/amo-table-a-6-subword-amo-add-5.c   |   9 +
 gcc/testsuite/gcc.target/riscv/pr89835.c      |   9 +
 libgcc/config/riscv/atomic.c                  |   4 +-
 33 files changed, 467 insertions(+), 75 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-amo-add-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-amo-add-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-amo-add-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-amo-add-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-amo-add-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-compare-exchange-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-fence-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-fence-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-fence-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-fence-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-fence-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-load-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-load-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-load-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-store-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-store-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-store-compat-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-subword-amo-add-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-subword-amo-add-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-subword-amo-add-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-subword-amo-add-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/amo-table-a-6-subword-amo-add-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/pr89835.c

-- 
2.34.1

Thread overview: 98+ messages
     [not found] <20220407182918.294892-1-patrick@rivosinc.com>
2023-04-05 21:01 ` [PATCH v2 0/8] RISCV: " Patrick O'Neill
2023-04-05 21:01   ` [PATCH v2 1/8] RISCV: Eliminate SYNC memory models Patrick O'Neill
2023-04-05 21:01   ` [PATCH v2 2/8] RISCV: Enforce Libatomic LR/SC SEQ_CST Patrick O'Neill
2023-04-05 21:01   ` [PATCH v2 3/8] RISCV: Enforce atomic compare_exchange SEQ_CST Patrick O'Neill
2023-04-05 21:01   ` [PATCH v2 4/8] RISCV: Add AMO release bits Patrick O'Neill
2023-04-05 21:01   ` [PATCH v2 5/8] RISCV: Eliminate AMO op fences Patrick O'Neill
2023-04-05 21:01   ` [PATCH v2 6/8] RISCV: Weaken compare_exchange LR/SC pairs Patrick O'Neill
2023-04-05 21:01   ` [PATCH v2 7/8] RISCV: Weaken atomic stores Patrick O'Neill
2023-04-05 21:01   ` [PATCH v2 8/8] RISCV: Weaken mem_thread_fence Patrick O'Neill
2023-04-10 18:23   ` [PATCH v3 00/10] RISCV: Implement ISA Manual Table A.6 Mappings Patrick O'Neill
2023-04-10 18:23     ` [PATCH v3 01/10] RISCV: Eliminate SYNC memory models Patrick O'Neill
2023-04-10 18:23     ` [PATCH v3 02/10] RISCV: Enforce Libatomic LR/SC SEQ_CST Patrick O'Neill
2023-04-10 18:23     ` [PATCH v3 03/10] RISCV: Enforce atomic compare_exchange SEQ_CST Patrick O'Neill
2023-04-10 18:23     ` [PATCH v3 04/10] RISCV: Add AMO release bits Patrick O'Neill
2023-04-10 18:23     ` [PATCH v3 05/10] RISCV: Strengthen atomic stores Patrick O'Neill
2023-04-10 18:23     ` [PATCH v3 06/10] RISCV: Eliminate AMO op fences Patrick O'Neill
2023-04-10 18:23     ` [PATCH v3 07/10] RISCV: Weaken compare_exchange LR/SC pairs Patrick O'Neill
2023-04-10 18:23     ` [PATCH v3 08/10] RISCV: Weaken mem_thread_fence Patrick O'Neill
2023-04-10 18:23     ` [PATCH v3 09/10] RISCV: Weaken atomic loads Patrick O'Neill
2023-04-10 18:23     ` [PATCH v3 10/10] RISCV: Table A.6 conformance tests Patrick O'Neill
2023-04-14 17:09     ` [PATCH v4 00/10] RISCV: Implement ISA Manual Table A.6 Mappings Patrick O'Neill
2023-04-14 17:09       ` [PATCH v4 01/10] RISCV: Eliminate SYNC memory models Patrick O'Neill
2023-04-14 17:09       ` [PATCH v4 02/10] RISCV: Enforce Libatomic LR/SC SEQ_CST Patrick O'Neill
2023-04-14 17:09       ` [PATCH v4 03/10] RISCV: Enforce atomic compare_exchange SEQ_CST Patrick O'Neill
2023-04-14 17:09       ` [PATCH v4 04/10] RISCV: Add AMO release bits Patrick O'Neill
2023-04-14 17:09       ` [PATCH v4 05/10] RISCV: Strengthen atomic stores Patrick O'Neill
2023-04-14 17:09       ` [PATCH v4 06/10] RISCV: Eliminate AMO op fences Patrick O'Neill
2023-04-14 17:09       ` [PATCH v4 07/10] RISCV: Weaken compare_exchange LR/SC pairs Patrick O'Neill
2023-04-14 17:09       ` [PATCH v4 08/10] RISCV: Weaken mem_thread_fence Patrick O'Neill
2023-04-14 17:09       ` [PATCH v4 09/10] RISCV: Weaken atomic loads Patrick O'Neill
2023-04-14 17:09       ` [PATCH v4 10/10] RISCV: Table A.6 conformance tests Patrick O'Neill
2023-04-27 16:22       ` Patrick O'Neill [this message]
2023-04-27 16:22         ` [PATCH v5 01/11] RISC-V: Eliminate SYNC memory models Patrick O'Neill
2023-04-28 16:23           ` Jeff Law
2023-05-02 20:12             ` [Committed " Patrick O'Neill
2023-04-27 16:22         ` [PATCH v5 02/11] RISC-V: Enforce Libatomic LR/SC SEQ_CST Patrick O'Neill
2023-04-28 16:50           ` Jeff Law
2023-05-02 20:12             ` [Committed " Patrick O'Neill
2023-04-27 16:22         ` [PATCH v5 03/11] RISC-V: Enforce subword atomic " Patrick O'Neill
2023-05-02 20:14           ` [Committed " Patrick O'Neill
2023-04-27 16:22         ` [PATCH v5 04/11] RISC-V: Enforce atomic compare_exchange SEQ_CST Patrick O'Neill
2023-04-28 17:23           ` Jeff Law
2023-05-02 20:15             ` [Committed " Patrick O'Neill
2023-04-27 16:22         ` [PATCH v5 05/11] RISC-V: Add AMO release bits Patrick O'Neill
2023-04-28 17:34           ` Jeff Law
2023-05-02 20:16             ` Patrick O'Neill
2023-04-27 16:22         ` [PATCH v5 06/11] RISC-V: Strengthen atomic stores Patrick O'Neill
2023-04-28 17:40           ` Jeff Law
2023-04-28 17:43             ` Palmer Dabbelt
2023-04-28 21:42               ` Hans Boehm
2023-04-28 22:21                 ` Hans Boehm
2023-04-30 17:10                 ` Jeff Law
2023-05-02 20:18             ` [Committed " Patrick O'Neill
2023-05-02 16:11           ` [PATCH v5 " Patrick O'Neill
2023-04-27 16:22         ` [PATCH v5 07/11] RISC-V: Eliminate AMO op fences Patrick O'Neill
2023-04-28 17:43           ` Jeff Law
2023-05-02 20:19             ` [Committed " Patrick O'Neill
2023-04-27 16:22         ` [PATCH v5 08/11] RISC-V: Weaken LR/SC pairs Patrick O'Neill
2023-04-28 17:56           ` Jeff Law
2023-05-02 20:19             ` [Committed " Patrick O'Neill
2023-04-27 16:22         ` [PATCH v5 09/11] RISC-V: Weaken mem_thread_fence Patrick O'Neill
2023-04-28 18:00           ` Jeff Law
2023-05-02 20:20             ` [Committed " Patrick O'Neill
2023-05-03 12:18           ` [PATCH v5 " Andreas Schwab
2023-05-03 12:22             ` Martin Liška
2023-04-27 16:23         ` [PATCH v5 10/11] RISC-V: Weaken atomic loads Patrick O'Neill
2023-04-28 18:04           ` Jeff Law
2023-05-02 20:20             ` [Committed " Patrick O'Neill
2023-04-27 16:23         ` [PATCH v5 11/11] RISC-V: Table A.6 conformance tests Patrick O'Neill
2023-04-28 18:07           ` Jeff Law
2023-05-02 20:28             ` [Committed " Patrick O'Neill
2023-04-27 17:20         ` [PATCH v5 00/11] RISC-V: Implement ISA Manual Table A.6 Mappings Andrea Parri
2023-04-28 16:14         ` Jeff Law
2023-04-28 16:29           ` Palmer Dabbelt
2023-04-28 17:44             ` Patrick O'Neill
2023-04-28 18:18               ` Patrick O'Neill
     [not found]               ` <CAMOCf+hK9nedV+UeENbTn=Uy3RpYLeMt04mLiLmDsZyNm83CCg@mail.gmail.com>
2023-04-30 16:37                 ` Jeff Law
2023-07-25 18:01         ` [gcc13 backport 00/12] " Patrick O'Neill
2023-07-25 18:01           ` [gcc13 backport 01/12] RISC-V: Eliminate SYNC memory models Patrick O'Neill
2023-07-25 18:01           ` [gcc13 backport 02/12] RISC-V: Enforce Libatomic LR/SC SEQ_CST Patrick O'Neill
2023-07-25 18:01           ` [gcc13 backport 03/12] RISC-V: Enforce subword atomic " Patrick O'Neill
2023-07-25 18:01           ` [gcc13 backport 04/12] RISC-V: Enforce atomic compare_exchange SEQ_CST Patrick O'Neill
2023-07-25 18:01           ` [gcc13 backport 05/12] RISC-V: Add AMO release bits Patrick O'Neill
2023-07-25 18:02           ` [gcc13 backport 06/12] RISC-V: Strengthen atomic stores Patrick O'Neill
2023-07-25 18:02           ` [gcc13 backport 07/12] RISC-V: Eliminate AMO op fences Patrick O'Neill
2023-07-25 18:02           ` [gcc13 backport 08/12] RISC-V: Weaken LR/SC pairs Patrick O'Neill
2023-07-25 18:02           ` [gcc13 backport 09/12] RISC-V: Weaken mem_thread_fence Patrick O'Neill
2023-07-25 18:02           ` [gcc13 backport 10/12] RISC-V: Weaken atomic loads Patrick O'Neill
2023-07-25 18:02           ` [gcc13 backport 11/12] RISC-V: Table A.6 conformance tests Patrick O'Neill
2023-07-25 18:02           ` [gcc13 backport 12/12] riscv: fix error: control reaches end of non-void function Patrick O'Neill
2023-07-26  1:22             ` Kito Cheng
2023-07-26 17:41               ` Patrick O'Neill
2023-07-25 19:50           ` [gcc13 backport 00/12] RISC-V: Implement ISA Manual Table A.6 Mappings Jakub Jelinek
2023-07-25 20:01             ` Palmer Dabbelt
2023-07-25 21:02             ` Jeff Law
2023-07-25 21:16               ` Palmer Dabbelt
2023-07-25 19:58           ` Palmer Dabbelt
2023-07-31 16:19           ` [Committed] " Patrick O'Neill
