public inbox for gcc-bugs@sourceware.org
From: "campbell+gcc-bugzilla at mumble dot net" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/110592] [SPARC] GCC should default to TSO memory model when compiling for sparc32
Date: Mon, 10 Jul 2023 13:19:38 +0000
Message-ID: <bug-110592-4-AZkvpfJYoT@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-110592-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110592

Taylor R Campbell <campbell+gcc-bugzilla at mumble dot net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |campbell+gcc-bugzilla@mumbl
                   |                            |e.net

--- Comment #5 from Taylor R Campbell <campbell+gcc-bugzilla at mumble dot net> ---
(In reply to Eric Botcazou from comment #4)
> Well, you need to elaborate a bit here, because the current configuration
> has been there for a quarter of century and everybody had apparently
> survived it until a couple of days ago.

For most of that quarter century, memory ordering was limited to out-of-line
barrier/fence subroutines implemented in assembly, like membar_sync in Solaris
and NetBSD, or the thread-switch assembly routines in the kernel.

It is only relatively recently, since C11 and C++11, that a lot of programs
started using in-line barriers/fences and ordered memory operations like
store-release/load-acquire.
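
A minimal sketch of what such in-line ordered operations look like in C11,
assuming a hypothetical one-slot mailbox (the names `payload', `ready',
`publish' and `try_consume' are invented for illustration and come from no
real program):

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical one-slot mailbox: `payload` is plain data, `ready` is the
   flag that publishes it.  Static storage starts out zero/false. */
static int payload;
static atomic_bool ready;

/* Producer: write the data, then publish it with a store-release. */
void publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Consumer: a load-acquire that observes `ready` set also observes the
   earlier write to `payload`. */
bool try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return false;
    *out = payload;
    return true;
}
```

Here the compiler, not a hand-written assembly subroutine, decides which
barrier instructions (if any) the release and acquire operations need on the
target CPU.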

In that time, sparcv7 and sparcv8 haven't gotten a lot of attention, of course.

But since they were introduced, NetBSD has had a common userland for sparcv7
and sparcv8, just called `NetBSD/sparc', with a special libc loaded on sparcv8
to use v8-only instructions like SMUL and UMUL for runtime multiplication
subroutines to improve performance.  (We could in principle do the same for
LDSTUB in membar_sync on sparcv7, although we don't at the moment.)

But now that programs rely on compiler-generated barriers, there's a conflict
between gcc's v7 and v8 code generation:

1. `gcc -mcpu=v7' generates code that lacks LDSTUB where store-before-load
barriers are needed, so anything that uses Dekker's algorithm with in-line
barriers may not work correctly on a sparcv8 CPU (though the failure will
manifest only in extremely rare, hard-to-diagnose scenarios, since it takes
both a program that relies on Dekker's algorithm and an unlucky race).

2. `gcc -mcpu=v8' generates code that uses SMUL and UMUL and other instructions
that don't exist on sparcv7.
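
The store-before-load pattern in item 1 can be sketched in C11 roughly as
follows.  This is a simplified Dekker-style handshake, not code from any
affected program; `want', `try_enter' and `leave' are hypothetical names.  The
seq_cst fence marks the spot where, per the description above, a v8 build
needs a real barrier instruction and a v7 build currently gets none:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical two-party flags; index 0 or 1 names the caller. */
static atomic_int want[2];

/* Announce intent, then check the peer.  Returns true if the caller may
   proceed; on conflict it withdraws its own flag and returns false. */
bool try_enter(int self)
{
    int other = 1 - self;

    atomic_store_explicit(&want[self], 1, memory_order_relaxed);

    /* Store-before-load barrier: without it, a TSO CPU may satisfy the
       load below while the store above still sits in its store buffer,
       so both parties could see 0 and both proceed. */
    atomic_thread_fence(memory_order_seq_cst);

    if (atomic_load_explicit(&want[other], memory_order_acquire) != 0) {
        atomic_store_explicit(&want[self], 0, memory_order_relaxed);
        return false;
    }
    return true;
}

/* Withdraw intent; the release pairs with the acquire load in try_enter. */
void leave(int self)
{
    atomic_store_explicit(&want[self], 0, memory_order_release);
}
```

If the fence compiles to nothing, the code is still correct on hardware that
is sequentially consistent, but not on a CPU that lets loads pass earlier
stores.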

Evidently gcc can be made to generate SMUL/UMUL but omit LDSTUB barriers by
using `gcc -mcpu=v8 -mmemory-model=sc', but the other way around doesn't work:
`gcc -mcpu=v7 -mmemory-model=tso' still omits the LDSTUB barriers, because the
code generation rules for barriers are all gated on TARGET_V8 || TARGET_V9.
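
As a toy truth table only (this is not GCC source; the enums and function
names are made up, and it models just the flag combinations discussed here),
the current gating and the behavior we are asking for might be summarized as:

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the -mcpu and -mmemory-model settings. */
enum cpu   { CPU_V7, CPU_V8, CPU_V9 };
enum model { MODEL_SC, MODEL_TSO };

/* Behavior as described above: barriers are generated only when the
   target is v8 or v9, whatever memory model was requested. */
bool barrier_emitted_current(enum cpu cpu, enum model model)
{
    return model != MODEL_SC && (cpu == CPU_V8 || cpu == CPU_V9);
}

/* Requested behavior: the memory model alone decides, so v7 + tso
   gets its store-before-load barriers. */
bool barrier_emitted_proposed(enum cpu cpu, enum model model)
{
    (void)cpu;  /* the CPU level no longer gates the barrier */
    return model != MODEL_SC;
}
```

The only combination whose answer changes between the two functions is v7
with tso, which is exactly the configuration NetBSD/sparc wants.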

What we would like to do for NetBSD/sparc is use `-mcpu=v7 -mmemory-model=tso'
-- that is, if it worked -- by default.  The original submitter drafted a
relatively small patch to achieve this, mostly by removing TARGET_V8 ||
TARGET_V9 conditionals or changing TARGET_V8 to !TARGET_V9 in membar-related
code generation rules.  But we'd also like to avoid diverging from gcc
upstream.  Could we convince you to take up an approach like this?

Applications built to run on v7 only, of course, could omit the LDSTUBs by
using `-mcpu=v7 -mmemory-model=sc' (or perhaps we could have the default be
`-mcpu=v7 -mmemory-model=tso', but have bare `-mcpu=v7' imply `-mcpu=v7
-mmemory-model=sc' or something), and applications built to run on v8 only can
still use `-mcpu=v8' to take advantage of SMUL/UMUL.

I expect this would only affect a tiny fraction of programs in extremely rare
scenarios -- those that actually rely on Dekker's algorithm (already rare), and
hit problems with memory ordering (also rare, only under high contention),
using in-line barriers or ordered memory operations (which wasn't the norm a
quarter century ago when v7 and v8 were relevant).  So you have to go out of
your way to hit problems in practice, and any negative performance impact of
the extra LDSTUBs on v7 CPUs that don't need them is likely negligible.  But
it's clear from code inspection and theory that the problem is there.
