Re: [PATCH] Reenable CSE of non-volatile inline asm (PR rtl-optimization/63637)

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: Jakub Jelinek <jakub@redhat.com>
To: Richard Biener <richard.guenther@gmail.com>
Cc: Jeff Law <law@redhat.com>,
	Segher Boessenkool <segher@kernel.crashing.org>,
	       Richard Biener <rguenther@suse.de>,
	       Eric Botcazou <ebotcazou@adacore.com>,
	gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] Reenable CSE of non-volatile inline asm (PR rtl-optimization/63637)
Date: Thu, 15 Jan 2015 08:40:00 -0000	[thread overview]
Message-ID: <20150115081330.GB1405@tucnak.redhat.com> (raw)
In-Reply-To: <DE331E6C-39D7-4F70-9E25-A3A4F7F49640@gmail.com>

On Thu, Jan 15, 2015 at 07:46:18AM +0100, Richard Biener wrote:
> >> That last line means the compiler is free to delete a non-volatile
> >> asm with a memory clobber if that asm is not needed for dataflow.  Or
> >> that is how I read it; it is trying to indicate that if you want to
> >> prevent the memory clobber from being deleted (together with the rest
> >> of the asm), you need to make the asm volatile.
> >>
> >> So as far as I can see the compiler can CSE two identical
> >non-volatile
> >> asms with memory clobber just fine.  Older GCC (I tried 4.7.2) does
> >do
> >> this; current mainline doesn't.  I think it should.
> >No, it should not CSE those two cases.  That's simply wrong and if an 
> >older version did that optimization, that's a bug.
> 
> I think segher has a point here.  If the asm with memory clobber would store to random memory and the point would be to preserve that then the whole distinction with volatile doesn't make much sense (after all without volatile we happily DCE such asm if the regular outputs are not needed).
> 
> This doesn't mean 'memory' is a well-designed thing, of course. Just its
> effects are effectively limited to reads without volatile(?)

Segher's mails talk about "memory" being a write but not read.
If we even can't agree on what non-volatile "memory" means, I think
we should treat it more conservatively, because every user (and there are
lots of people using non-volatile asm with "memory" in the wild) treats it
differently.  Just trying to grep for a few:
glibc:
./sysdeps/alpha/bits/atomic.h:# define atomic_full_barrier()	__asm ("mb" : : : "memory");
./sysdeps/alpha/bits/atomic.h:# define atomic_read_barrier()	__asm ("mb" : : : "memory");
./sysdeps/alpha/bits/atomic.h:# define atomic_write_barrier()	__asm ("wmb" : : : "memory");
./sysdeps/sparc/sparc32/bits/atomic.h:# define atomic_full_barrier() __asm ("" ::: "memory")
./sysdeps/powerpc/powerpc32/bits/atomic.h:# define atomic_read_barrier()	__asm ("lwsync" ::: "memory")
./sysdeps/powerpc/powerpc32/bits/atomic.h:# define atomic_read_barrier()	__asm ("sync" ::: "memory")
./sysdeps/powerpc/powerpc64/bits/atomic.h:#define atomic_read_barrier()	__asm ("lwsync" ::: "memory")
./sysdeps/powerpc/bits/atomic.h:#define atomic_full_barrier()	__asm ("sync" ::: "memory")
./sysdeps/powerpc/bits/atomic.h:#define atomic_write_barrier()	__asm ("eieio" ::: "memory")
./sysdeps/generic/malloc-machine.h:# define atomic_full_barrier() __asm ("" ::: "memory")
./elf/tst-tlsmod3.c:  asm ("" ::: "memory");
./elf/tst-tlsmod4.c:  asm ("" ::: "memory");
./elf/tst-tlsmod1.c:  asm ("" ::: "memory");
./elf/tst-tlsmod2.c:  asm ("" ::: "memory");
./include/atomic.h:# define atomic_full_barrier() __asm ("" ::: "memory")
linux kernel:
./arch/arm/mach-omap2/pm24xx.c:	asm("mcr p15, 0, %0, c7, c0, 4" : : "r" (zero) : "memory", "cc");
./arch/arm/include/asm/irqflags.h:#define local_fiq_enable()  __asm__("cpsie f	@ __stf" : : : "memory", "cc")
./arch/arm/include/asm/irqflags.h:#define local_fiq_disable() __asm__("cpsid f	@ __clf" : : : "memory", "cc")
./arch/x86/include/asm/uaccess_64.h:		asm("":::"memory");
./arch/x86/include/asm/uaccess_64.h:		asm("":::"memory");
./arch/x86/include/asm/segment.h:	asm("mov %%" #seg ",%0":"=r" (value) : : "memory")
./arch/x86/include/asm/stackprotector.h:	asm("mov %0, %%gs" : : "r" (__KERNEL_STACK_CANARY) : "memory");
./arch/arm64/include/asm/irqflags.h:#define local_fiq_enable()	asm("msr	daifclr, #1" : : : "memory")
./arch/arm64/include/asm/irqflags.h:#define local_fiq_disable()	asm("msr	daifset, #1" : : : "memory")
./arch/arm64/include/asm/irqflags.h:#define local_async_enable()	asm("msr	daifclr, #4" : : : "memory")
./arch/arm64/include/asm/irqflags.h:#define local_async_disable()	asm("msr	daifset, #4" : : : "memory")
./arch/tile/lib/memcpy_64.c:				__asm__ ("" : : : "memory");
./arch/tile/lib/memcpy_64.c:				__asm__ ("" : : : "memory");
./arch/tile/include/hv/netio_intf.h:  __asm__("" : : : "memory");
./drivers/net/ethernet/sgi/ioc3-eth.c:	__asm__("sync" ::: "memory")
./lib/sha1.c:  #define setW(x, val) do { W(x) = (val); __asm__("":::"memory"); } while (0)

The glibc barriers are supposedly something that can be CSEd (one barrier instead of
two consecutive barriers is enough), but certainly not moved across any loads/stores
in between.  In the kernel case, the enable/disable probably wouldn't allow even CSE.

So I'm with Jeff that we should treat "memory" at least as unspecified read and write,
and whether we can CSE them if there are no memory loads/stores in between them can
be discussed (most likely the kernel would be safe even in that case, because those
usually don't nest and appear in pairs, or act as barriers (like the glibc case),
or read from segment registers (guess again ok to be CSEd with no intervening loads/stores).

In 4.9 backport I'd prefer not to CSE them at all though, stay conservative.

	Jakub

next prev parent reply	other threads:[~2015-01-15  8:13 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-01-13 16:22 Jakub Jelinek
2015-01-13 17:06 ` Segher Boessenkool
2015-01-13 20:02   ` Jeff Law
2015-01-13 20:29     ` Jakub Jelinek
2015-01-13 22:28       ` Jeff Law
2015-01-14  3:44         ` Segher Boessenkool
2015-01-14  6:52           ` Jeff Law
2015-01-14 15:40             ` Segher Boessenkool
2015-01-15  6:46               ` Jeff Law
2015-01-15  7:54                 ` Richard Biener
2015-01-15  8:40                   ` Jakub Jelinek [this message]
2015-01-15  8:43                     ` Richard Biener
2015-01-15  9:50                     ` Jakub Jelinek
2015-01-15 18:22                     ` Jeff Law
2015-01-23 21:39                     ` Richard Henderson
2015-01-23 22:53                       ` Segher Boessenkool
2015-01-23 23:12                         ` Jakub Jelinek
2015-01-24  7:23                           ` Segher Boessenkool
2015-01-24 14:39                             ` Richard Sandiford
2015-01-13 22:42     ` Segher Boessenkool
2015-01-14  0:40       ` Segher Boessenkool
2015-01-14  7:12 ` Jeff Law

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150115081330.GB1405@tucnak.redhat.com \
    --to=jakub@redhat.com \
    --cc=ebotcazou@adacore.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=law@redhat.com \
    --cc=rguenther@suse.de \
    --cc=richard.guenther@gmail.com \
    --cc=segher@kernel.crashing.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).