public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* Re: spin_lock forgets to clobber memory and other smp fixes [was Re: [patch] waitqueue optimization, 2.4.0-test7]
       [not found] ` <Pine.LNX.4.21.0009071653350.7236-100000@inspiron.random>
@ 2000-09-07  8:59   ` Jamie Lokier
  2000-09-07  9:29     ` spin_lock forgets to clobber memory and other smp fixes [wasRe: " Andrea Arcangeli
  2000-09-07  9:38     ` Linus Torvalds
  0 siblings, 2 replies; 10+ messages in thread
From: Jamie Lokier @ 2000-09-07  8:59 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: David S. Miller, Ingo Molnar, Linus Torvalds, linux-kernel, gcc

asm *__volatile__* seems to make no difference.  I've tried a few things.

Andrea Arcangeli wrote:
> Maybe we can rely on the __volatile__ statement of the asm that will
> enforce that if we write:
> 
> 	*p = 0;
> 	__asm__ __volatile__("" : :);
> 	*p = 1;
>
> in the assembler we'll then find both a write of 0 and then a write of 1
> to memory.

That does 2 writes with gcc-2.96 and also egcs-2.91.66/19990314
(Red Hat's kgcc), with or without -fstrict-aliasing.

It also does 2 writes without __volatile__.

> 	int a = *p;
> 	__asm__ __volatile__("" : :);
> 	a = *p;
> 
> (to do two explicit reads)

Sorry, that does just one read, kgcc (old stable gcc) and also with
gcc-2.96.  Type aliasing on/off makes no difference to the number of reads.

Again, __volatile__ makes no difference.

I cannot tell from the GCC manual whether either of these behaviours is
correct or not.

-- Jamie

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: spin_lock forgets to clobber memory and other smp fixes [wasRe: [patch] waitqueue optimization, 2.4.0-test7]
  2000-09-07  8:59   ` spin_lock forgets to clobber memory and other smp fixes [was Re: [patch] waitqueue optimization, 2.4.0-test7] Jamie Lokier
@ 2000-09-07  9:29     ` Andrea Arcangeli
  2000-09-07  9:39       ` spin_lock forgets to clobber memory and other smp fixes [was " Jamie Lokier
  2000-09-07  9:38     ` Linus Torvalds
  1 sibling, 1 reply; 10+ messages in thread
From: Andrea Arcangeli @ 2000-09-07  9:29 UTC (permalink / raw)
  To: Jamie Lokier
  Cc: David S. Miller, Ingo Molnar, Linus Torvalds, linux-kernel, gcc

On Thu, 7 Sep 2000, Jamie Lokier wrote:

>asm *__volatile__* seems to make no difference.  I've tried a few things.

It makes a difference, see below.

>
>Andrea Arcangeli wrote:
>> Maybe we can rely on the __volatile__ statement of the asm that will
>> enforce that if we write:
>> 
>> 	*p = 0;
>> 	__asm__ __volatile__("" : :);
>> 	*p = 1;
>>
>> in the assembler we'll then find both a write of 0 and then a write of 1
>> to memory.
>
>That does 2 writes with gcc-2.96 and also egcs-2.91.66/19990314
>(Red Hat's kgcc), with or without -fstrict-aliasing.
>
>It also does 2 writes without __volatile__.
>
>> 	int a = *p;
>> 	__asm__ __volatile__("" : :);
>> 	a = *p;
>> 
>> (to do two explicit reads)
>
>Sorry, that does just one read, kgcc (old stable gcc) and also with
>gcc-2.96.  Type aliasing on/off makes no difference to the number of reads.

I wrote the above not just as a complete testecase, but just to mean what
the case I was talking about. You made int a a local variable and the
thing you noticed is an otimization that the compiler is allowed to do
regardless of the "memory" clobber too (`int a' have to be at least extern
otherwise the compiler understands the first read can go away). Try this:

int * p;
int a;

extern f(int);

main()
{
	a = *p;
	
	__asm__ __volatile__("zzz" : : );

	a = *p;

	f(a);
}

Try to add "memory" as clobber to the above testcase and nothing will
change. (that's what I meant in my previous email saying that even w/o
"memory" things got compiled right at least in my simple testcases)

>Again, __volatile__ makes no difference.

Note that __volatile__ really makes a difference if for example you
speficy as output an operand that isn't used anymore.

Try this:

main()
{
	int a;
	__asm__("zzz" : "=r" (a));
}

and then this:

main()
{
	int a;
	__asm__ __volatile__("zzz" : "=r" (a));
}


--- p.s.nonvolatile	Thu Sep  7 18:05:30 2000
+++ p.s	Thu Sep  7 18:05:53 2000
@@ -6,10 +6,13 @@
 .globl main
 	.type	 main,@function
 main:
 	pushl %ebp
 	movl %esp,%ebp
+#APP
+	zzz
+#NO_APP
 	movl %ebp,%esp
 	popl %ebp
 	ret
 .Lfe1:
 	.size	 main,.Lfe1-main

>I cannot tell from the GCC manual whether either of these behaviours is
>correct or not.

The behaviour of what you described is definitely correct/safe.

My only wondering was about "memory" non-"memory" as clobber because gcc
was doing things right just with the __asm__ __volatile__ thing w/o
"memory" as clobber. However I had the confirm my worries was right and
that "memory" is needed for all the spinlocks.

BTW, we had a bug in the alpha port last month in the
linux/include/asm-alpha/fpu.h:wrfpcr() function, read it for a real world
example of where __volatile__ must be used.

Andrea

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: spin_lock forgets to clobber memory and other smp fixes [wasRe: [patch] waitqueue optimization, 2.4.0-test7]
  2000-09-07  8:59   ` spin_lock forgets to clobber memory and other smp fixes [was Re: [patch] waitqueue optimization, 2.4.0-test7] Jamie Lokier
  2000-09-07  9:29     ` spin_lock forgets to clobber memory and other smp fixes [wasRe: " Andrea Arcangeli
@ 2000-09-07  9:38     ` Linus Torvalds
  2000-09-07  9:56       ` spin_lock forgets to clobber memory and other smp fixes [was " Jamie Lokier
  1 sibling, 1 reply; 10+ messages in thread
From: Linus Torvalds @ 2000-09-07  9:38 UTC (permalink / raw)
  To: Jamie Lokier
  Cc: Andrea Arcangeli, David S. Miller, Ingo Molnar, linux-kernel, gcc

On Thu, 7 Sep 2000, Jamie Lokier wrote:

> asm *__volatile__* seems to make no difference.  I've tried a few things.
> 
> Andrea Arcangeli wrote:
> > Maybe we can rely on the __volatile__ statement of the asm that will
> > enforce that if we write:
> > 
> > 	*p = 0;
> > 	__asm__ __volatile__("" : :);
> > 	*p = 1;
> >
> > in the assembler we'll then find both a write of 0 and then a write of 1
> > to memory.
> 
> That does 2 writes with gcc-2.96 and also egcs-2.91.66/19990314
> (Red Hat's kgcc), with or without -fstrict-aliasing.
> 
> It also does 2 writes without __volatile__.

Your test is broken.

Read the gcc documentation. A inline asm with no outputs is implicitly
considered volatile. So _both_ your tests had volatile there.

Now, that may not matter that much fo ryour test-case: gcc gets careful
around inline asm anyway, even without the volatile.


Change it to something like

	__asm__("":"=r" (x):"0" (x));

and the "volatile" should matter.

Not for memory references, perhaps. But for the movement issues.

			Linus

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: spin_lock forgets to clobber memory and other smp fixes [was Re: [patch] waitqueue optimization, 2.4.0-test7]
  2000-09-07  9:29     ` spin_lock forgets to clobber memory and other smp fixes [wasRe: " Andrea Arcangeli
@ 2000-09-07  9:39       ` Jamie Lokier
  2000-09-07  9:48         ` spin_lock forgets to clobber memory and other smp fixes [wasRe: " Linus Torvalds
  2000-09-07 10:05         ` spin_lock forgets to clobber memory and other smp fixes [wasRe: " Andrea Arcangeli
  0 siblings, 2 replies; 10+ messages in thread
From: Jamie Lokier @ 2000-09-07  9:39 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: David S. Miller, Ingo Molnar, Linus Torvalds, linux-kernel, gcc

Andrea Arcangeli wrote:
> >> 	int a = *p;
> >> 	__asm__ __volatile__("" : :);
> >> 	a = *p;
> >> 
> >> (to do two explicit reads)
> >
> >Sorry, that does just one read, kgcc (old stable gcc) and also with
> >gcc-2.96.  Type aliasing on/off makes no difference to the number of reads.
> 
> I wrote the above not just as a complete testecase, but just to mean what
> the case I was talking about. You made int a a local variable and the
> thing you noticed is an otimization that the compiler is allowed to do
> regardless of the "memory" clobber too (`int a' have to be at least extern
> otherwise the compiler understands the first read can go away).

Interestingly enough, the local variable case is one where "memory" does
make a difference.  Without "memory":

        movl    p, %eax
        movl    (%eax), %eax
#APP
#NO_APP

With "memory":

#APP
#NO_APP
        movl    p, %eax
        movl    (%eax), %eax

> Try to add "memory" as clobber to the above testcase and nothing will
> change. (that's what I meant in my previous email saying that even w/o
> "memory" things got compiled right at least in my simple testcases)

As you can see from above, there are cases where

   local_var = shared->mumble;
   // ...
   spin_lock (&spinlock);
   local_var = shared->mumble;

requires a "memory" clobber, otherwise the second read, which is in a
critical region, won't be emitted by the compiler.

-- Jamie

ps. There is a _clobber_ for memory, but no way to say "this asm _reads_
arbitrary memory".  __volatile__ may be filling that role though.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: spin_lock forgets to clobber memory and other smp fixes [wasRe: [patch] waitqueue optimization, 2.4.0-test7]
  2000-09-07  9:39       ` spin_lock forgets to clobber memory and other smp fixes [was " Jamie Lokier
@ 2000-09-07  9:48         ` Linus Torvalds
  2000-09-07 10:11           ` spin_lock forgets to clobber memory and other smp fixes [was " Jamie Lokier
  2000-09-07 10:05         ` spin_lock forgets to clobber memory and other smp fixes [wasRe: " Andrea Arcangeli
  1 sibling, 1 reply; 10+ messages in thread
From: Linus Torvalds @ 2000-09-07  9:48 UTC (permalink / raw)
  To: Jamie Lokier
  Cc: Andrea Arcangeli, David S. Miller, Ingo Molnar, linux-kernel, gcc

On Thu, 7 Sep 2000, Jamie Lokier wrote:
> 
> ps. There is a _clobber_ for memory, but no way to say "this asm _reads_
> arbitrary memory".  __volatile__ may be filling that role though.

Nope. "memory" fills that role too. Remember: "memory" doesn't actually
say "this clobbers all memory". That would be silly: an asm that just
wipes all memory would not be a very useful asm (or rather, it would have
just _one_ use: "execve()"). So "memory" really says that the asm clobbers
_some_ memory.

Which in turn means that the code scheduler has to synchronize all memory
accesses around it - as if the asm was reading all memory. Because if the
scheduler would move a store to after the asm, it would give the wrong
results if the asm happened to clobber _that_ memory. And the scheduler
obviously cannot just drop the store ("Oh, the asm will clobber this
anyway"), because it doesn't know which memory regions get clobbered.

Now, the fact that the "memory" clobber also seems to clobber local
variables is a bug, I think. Or at least a misfeature. It should not be
considered to clobber reloads, as those are really in "registers" - at
least as far as the asm is concerned (the compiler could have chosen to
just allocate a hard register to that local variable or argument).

		Linus

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: spin_lock forgets to clobber memory and other smp fixes [was Re: [patch] waitqueue optimization, 2.4.0-test7]
  2000-09-07  9:38     ` Linus Torvalds
@ 2000-09-07  9:56       ` Jamie Lokier
  0 siblings, 0 replies; 10+ messages in thread
From: Jamie Lokier @ 2000-09-07  9:56 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrea Arcangeli, David S. Miller, Ingo Molnar, linux-kernel, gcc

Linus Torvalds wrote:
> Change it to something like
> 
> 	__asm__("":"=r" (x):"0" (x));
> 
> and the "volatile" should matter.

Yes it does.  Without "volatile", the asm disappears :-)

> Not for memory references, perhaps. But for the movement issues.

The compiler isn't moving memory references around "asm volatile", but
it is doing CSE around them to _eliminate_ memory references.

Thus spin_lock needs the memory clobber, to prevent CSE of non-volatile
memory references between the critical region and outside the critical
region.

Maybe spin_unlock doesn't need one because CSE doesn't work the other
way.  (I'd put that clobber in anyway, to be sure).

-- Jamie


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: spin_lock forgets to clobber memory and other smp fixes [wasRe: [patch] waitqueue optimization, 2.4.0-test7]
  2000-09-07  9:39       ` spin_lock forgets to clobber memory and other smp fixes [was " Jamie Lokier
  2000-09-07  9:48         ` spin_lock forgets to clobber memory and other smp fixes [wasRe: " Linus Torvalds
@ 2000-09-07 10:05         ` Andrea Arcangeli
  2000-09-07 10:14           ` spin_lock forgets to clobber memory and other smp fixes [was " Jamie Lokier
  1 sibling, 1 reply; 10+ messages in thread
From: Andrea Arcangeli @ 2000-09-07 10:05 UTC (permalink / raw)
  To: Jamie Lokier
  Cc: David S. Miller, Ingo Molnar, Linus Torvalds, linux-kernel, gcc

On Thu, 7 Sep 2000, Jamie Lokier wrote:

>Interestingly enough, the local variable case is one where "memory" does
>make a difference.  Without "memory":
>
>        movl    p, %eax
>        movl    (%eax), %eax
>#APP
>#NO_APP
>
>With "memory":
>
>#APP
>#NO_APP
>        movl    p, %eax
>        movl    (%eax), %eax

My gcc doesn't make differences between "memory" and non "memory" in this
testcase:

int * p;

extern f(int);

main()
{
	int a;

	a = *p;
	
	__asm__ __volatile__("zzz" : :);

	a = *p;

	f(a);
}

My compiler _always_ produced first the zzz and then it loads p
(regardless of "memory" clobber or not). That's why I said I couldn't
reproduce miscompilations.

>As you can see from above, there are cases where
>
>   local_var = shared->mumble;
>   // ...
    ^^^^^^
>   spin_lock (&spinlock);
>   local_var = shared->mumble;
>
>requires a "memory" clobber, otherwise the second read, which is in a
>critical region, won't be emitted by the compiler.

In your testcase have only `//' in the underlined line, so the compiler is
100% allowed to throw away the first read to local_val, so far so good.

So the compiler does only one read from the `p' pointer and on with my
compiler it's always done _after_ the spin_lock (or after the __asm__
__volatile__ in the above testcase).

Of course "memory" should enforce the read to be done after the spin_lock
but in real life it seems to do the right thing anyway and I couldn't
reproduce miscompilation.

Said that if your compiler puts the read before the spin_lock without the
memory clobber, it is allowed to do that, and in such case you would proof
it was a real world bug (not just a "documentation" one).

Or maybe your testcase was a bit different then mine, in such case please
send it to me (I'm curious indeed :).

Andrea

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: spin_lock forgets to clobber memory and other smp fixes [was Re: [patch] waitqueue optimization, 2.4.0-test7]
  2000-09-07  9:48         ` spin_lock forgets to clobber memory and other smp fixes [wasRe: " Linus Torvalds
@ 2000-09-07 10:11           ` Jamie Lokier
  0 siblings, 0 replies; 10+ messages in thread
From: Jamie Lokier @ 2000-09-07 10:11 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrea Arcangeli, David S. Miller, Ingo Molnar, linux-kernel, gcc

Linus Torvalds wrote:
> Nope. "memory" fills that role too. Remember: "memory" doesn't actually
> say "this clobbers all memory". That would be silly: an asm that just
> wipes all memory would not be a very useful asm (or rather, it would have
> just _one_ use: "execve()"). So "memory" really says that the asm clobbers
> _some_ memory.
> 
> Which in turn means that the code scheduler has to synchronize all memory
> accesses around it - as if the asm was reading all memory. Because if the
> scheduler would move a store to after the asm, it would give the wrong
> results if the asm happened to clobber _that_ memory. And the scheduler
> obviously cannot just drop the store ("Oh, the asm will clobber this
> anyway"), because it doesn't know which memory regions get clobbered.

There are other reasons why stores can be eliminated, and "memory" does
not prevent those.

Example: you have two stores to the same address.  An "asm" with
"memory" is in between the stores.

The first store can still be dropped, even though we don't know _which_
memory the "memory" clobbers.

Now, "memory" may well mean "this asm is considered to read some memory
and clobber some memory".  But that's not what the manual says, and it
would be inconsistent with other kinds of clobber.

> Now, the fact that the "memory" clobber also seems to clobber local
> variables is a bug, I think.

If you are thinking of my examples, "memory" doesn't clobber local
variables.  It does affect the scheduling of reads from general memory
into a local variable.

-- Jamie

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: spin_lock forgets to clobber memory and other smp fixes [was Re: [patch] waitqueue optimization, 2.4.0-test7]
  2000-09-07 10:05         ` spin_lock forgets to clobber memory and other smp fixes [wasRe: " Andrea Arcangeli
@ 2000-09-07 10:14           ` Jamie Lokier
  2000-09-07 10:20             ` spin_lock forgets to clobber memory and other smp fixes [wasRe: " Andrea Arcangeli
  0 siblings, 1 reply; 10+ messages in thread
From: Jamie Lokier @ 2000-09-07 10:14 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: David S. Miller, Ingo Molnar, Linus Torvalds, linux-kernel, gcc

Andrea Arcangeli wrote:
> Said that if your compiler puts the read before the spin_lock without the
> memory clobber, it is allowed to do that, and in such case you would proof
> it was a real world bug (not just a "documentation" one).

Yes, it does.

> Or maybe your testcase was a bit different then mine, in such case please
> send it to me (I'm curious indeed :).

int *p;
int func()
{
  int x = *p;
  __asm__ __volatile ("zzz" : :);
  x = *p;
  return x;
}

In output:

        .ident  "GCC: (GNU) 2.96 20000724 (experimental)"

From the Red Hat 7 beta.

-- Jamie

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: spin_lock forgets to clobber memory and other smp fixes [wasRe: [patch] waitqueue optimization, 2.4.0-test7]
  2000-09-07 10:14           ` spin_lock forgets to clobber memory and other smp fixes [was " Jamie Lokier
@ 2000-09-07 10:20             ` Andrea Arcangeli
  0 siblings, 0 replies; 10+ messages in thread
From: Andrea Arcangeli @ 2000-09-07 10:20 UTC (permalink / raw)
  To: Jamie Lokier
  Cc: David S. Miller, Ingo Molnar, Linus Torvalds, linux-kernel, gcc

On Thu, 7 Sep 2000, Jamie Lokier wrote:

>Yes, it does.

Nice.

>        .ident  "GCC: (GNU) 2.96 20000724 (experimental)"
>
>From the Red Hat 7 beta.

Ok.

Andrea

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2000-09-07 10:20 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <Pine.LNX.4.21.0009041922260.10246-100000@inspiron.random>
     [not found] ` <Pine.LNX.4.21.0009071653350.7236-100000@inspiron.random>
2000-09-07  8:59   ` spin_lock forgets to clobber memory and other smp fixes [was Re: [patch] waitqueue optimization, 2.4.0-test7] Jamie Lokier
2000-09-07  9:29     ` spin_lock forgets to clobber memory and other smp fixes [wasRe: " Andrea Arcangeli
2000-09-07  9:39       ` spin_lock forgets to clobber memory and other smp fixes [was " Jamie Lokier
2000-09-07  9:48         ` spin_lock forgets to clobber memory and other smp fixes [wasRe: " Linus Torvalds
2000-09-07 10:11           ` spin_lock forgets to clobber memory and other smp fixes [was " Jamie Lokier
2000-09-07 10:05         ` spin_lock forgets to clobber memory and other smp fixes [wasRe: " Andrea Arcangeli
2000-09-07 10:14           ` spin_lock forgets to clobber memory and other smp fixes [was " Jamie Lokier
2000-09-07 10:20             ` spin_lock forgets to clobber memory and other smp fixes [wasRe: " Andrea Arcangeli
2000-09-07  9:38     ` Linus Torvalds
2000-09-07  9:56       ` spin_lock forgets to clobber memory and other smp fixes [was " Jamie Lokier

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).