public inbox for gcc@gcc.gnu.org
* GCC proposal for "@" asm constraint
       [not found] <Pine.LNX.4.21.0009071837310.8494-100000@inspiron.random>
@ 2000-09-08  4:42 ` Jamie Lokier
  2000-09-18 15:35   ` Andrea Arcangeli
  0 siblings, 1 reply; 18+ messages in thread
From: Jamie Lokier @ 2000-09-08  4:42 UTC (permalink / raw)
  To: gcc; +Cc: Andrea Arcangeli, kuznet, linux-kernel, torvalds

Andrea Arcangeli wrote:
> >BTW Look also into asm-i386/bitops.h and dummy cast to some crap there.
> >Are you impressed? 8)
> 
> Yep 8). If we add "memory" such stuff could be removed I think. As far as I
> can see the object of such stuff is to cause gcc to say `I'm too lazy to
> see exactly what memory this guy is trying to change, so just assume he
> added "memory" in the clobber list' :))

No, that's not the reason for __dummy.  It is an important side effect,
though as ever it isn't guaranteed.  Someone should add "memory" to the
bitops _iff_ the bitops are supposed to imply a compiler memory barrier.
It's a kernel policy decision.

     -----------

For the benefit of GCC list readers: Linux uses asm constraints like
this:  `"m" (*(volatile struct __dummy *) &object)', where __dummy is
defined by  `struct __dummy { unsigned long a[100]; }'.

This is used extensively in asms for spinlocks, semaphores and atomic
bit operations, atomic counters etc.  In short, anything needing to
operate on a specific memory object.

Passing the address as an operand would be correct but generates worse
code, because in general we don't need a register to hold the address of
`object'.  It is often part of a larger struct, and the __dummy method
lets it be addressed using offset addressing, and often fewer registers.

Casting via __dummy is there so that the "m" (or "=m") memory constraint
will make that operand refer to the actual object in memory, and not a
copy (in a different area of memory).

Most of the time there is no reason for GCC to use a copy of the object
for an "m" constraint, but things like CSE can allow the compiler to
choose a different object known to have the same contents.  Other
scenarios cause __dummy to be required for "=m" constraints.

(Even with __dummy there is no guarantee that a future GCC won't use a
different object anyway, but I expect that is years away).
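
For readers who have not seen the idiom, a minimal sketch of the __dummy
cast inside an atomic bit operation might look like the following.  The
names `sketch_set_bit' and `ADDR' are made up for illustration and are not
the kernel's actual definitions; the asm is x86-only (with a plain,
non-atomic C fallback elsewhere), and it uses the read-write "+m" form,
which is an assumption on my part rather than what the kernel headers of
the time used.

```c
/* Oversized dummy type, as described above: the cast makes the "m"
 * operand name a large region starting at the object's own address. */
struct __dummy { unsigned long a[100]; };
#define ADDR(p) (*(volatile struct __dummy *) (p))

/* Hypothetical helper: atomically set bit nr, counting from the 32-bit
 * word at addr.  No register is reserved for the address unless the
 * addressing mode needs one. */
static inline void sketch_set_bit(int nr, volatile void *addr)
{
#if defined(__i386__) || defined(__x86_64__)
	__asm__ __volatile__("lock; btsl %1,%0"
			     : "+m" (ADDR(addr))
			     : "Ir" (nr)
			     : "cc");
#else
	/* Non-atomic fallback so the sketch still compiles elsewhere. */
	((volatile unsigned int *) addr)[nr >> 5] |= 1u << (nr & 31);
#endif
}
```

Note that there is no "memory" clobber here, so this sketch says nothing
about ordering against unrelated loads and stores.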

I'm posting this to the GCC list to make a feature request.

GCC feature request
-------------------

An additional constraint character, like "m" but meaning "the object at
the specified address".

The operand in C source code would be the object's address (not
dereferenced as Linux does now), so there is no need for bizarre
semantics.

So if I write (assume `@' because most letters are taken):

   asm ("movl %1,%0" : "=r" (result) : "@" (&object));

it would be a clean equivalent to this:

   asm ("movl %1,%0" : "=r" (result)
                     : "m" (*(volatile struct __dummy *) &object));

and more or less equivalent to this (but generates better code):

   asm ("movl (%1),%0" : "=r" (result) : "r" (&object));

An alternative would be a modifier to "m" that means "definitely use the
actual object referred to".  I prefer the direct approach.
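
Since "@" does not exist, only the two existing forms can actually be
compiled.  A hedged, self-contained sketch of both follows; the function
names are hypothetical, the asm is guarded for x86 with a plain C
fallback, and the "memory" clobber on the address-in-register form is my
addition so that the compiler knows the asm reads memory it cannot see in
the operand list.

```c
struct __dummy { unsigned long a[100]; };

/* "m" operand through the __dummy cast: GCC picks the addressing mode,
 * so a register for the address is needed only if the mode requires it. */
static inline int load_via_m(int *object)
{
	int result;
#if defined(__i386__) || defined(__x86_64__)
	__asm__ __volatile__("movl %1,%0"
			     : "=r" (result)
			     : "m" (*(volatile struct __dummy *) object));
#else
	result = *(volatile int *) object;
#endif
	return result;
}

/* Address passed as an operand: correct, but always ties up a register
 * to hold the address. */
static inline int load_via_r(int *object)
{
	int result;
#if defined(__i386__) || defined(__x86_64__)
	__asm__ __volatile__("movl (%1),%0"
			     : "=r" (result)
			     : "r" (object)
			     : "memory");
#else
	result = *(volatile int *) object;
#endif
	return result;
}
```

Comparing the `gcc -O2 -S' output of the two functions for an object that
is a member of a larger struct is one way to see the difference in
register usage.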

What do GCC designers think?  This is useful for any code that must use
asms for atomic operations such as semaphores in threads etc.

-- Jamie


* Re: GCC proposal for "@" asm constraint
  2000-09-08  4:42 ` GCC proposal for "@" asm constraint Jamie Lokier
@ 2000-09-18 15:35   ` Andrea Arcangeli
  2000-09-18 15:40     ` Linus Torvalds
  0 siblings, 1 reply; 18+ messages in thread
From: Andrea Arcangeli @ 2000-09-18 15:35 UTC (permalink / raw)
  To: Jamie Lokier; +Cc: gcc, kuznet, linux-kernel, torvalds

On Fri, Sep 08, 2000 at 01:41:05PM +0200, Jamie Lokier wrote:
> Casting via __dummy is there so that the "m" (or "=m") memory constraint
> will make that operand refer to the actual object in memory, and not a
> copy (in a different area of memory).

Are you really sure gcc could pass a copy even when I specify "memory" in the
clobber list? My point is that "memory" could also mean that the address where
gcc keeps the _copy_ could be clobbered by the asm itself as a side effect, so
gcc shouldn't be allowed to assume anything: it should first copy the result of
the previous operations to its real final location before starting any asm
that clobbers "memory". I think the "memory" clobber should be enough to
ensure that the address used with the .*m constraints refers to the real
backing store of the memory passed as a parameter to the inline asm.

I think we can remove all the __dummy stuff and put the "memory" in such asm
statements.
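
To make the suggestion concrete, here is a rough userspace sketch of what
a spinlock with a plain "m" operand plus a "memory" clobber could look
like. It is not the kernel's spinlock.h: the type and function names are
hypothetical, the out-of-line .text.lock section is replaced by a local
label, "+m" is used instead of "=m", and non-x86 targets fall back to the
compiler's __atomic builtins.

```c
typedef struct { volatile unsigned int lock; } sketch_spinlock_t;

/* Spin until bit 0 of the lock word is successfully set. */
static inline void sketch_spin_lock(sketch_spinlock_t *l)
{
#if defined(__i386__) || defined(__x86_64__)
	__asm__ __volatile__(
		"1:\tlock; btsl $0,%0\n\t"	/* try to take the lock   */
		"jnc 3f\n"			/* bit was clear: success */
		"2:\ttestb $1,%0\n\t"		/* wait until it is free  */
		"jne 2b\n\t"
		"jmp 1b\n"
		"3:"
		: "+m" (l->lock)
		:
		: "memory", "cc");
#else
	while (__atomic_exchange_n(&l->lock, 1u, __ATOMIC_ACQUIRE))
		;
#endif
}

/* Clear bit 0 again; the clobber keeps protected accesses inside. */
static inline void sketch_spin_unlock(sketch_spinlock_t *l)
{
#if defined(__i386__) || defined(__x86_64__)
	__asm__ __volatile__("lock; btrl $0,%0"
			     : "+m" (l->lock)
			     :
			     : "memory", "cc");
#else
	__atomic_store_n(&l->lock, 0u, __ATOMIC_RELEASE);
#endif
}
```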

Comments?

Andrea


* Re: GCC proposal for "@" asm constraint
  2000-09-18 15:35   ` Andrea Arcangeli
@ 2000-09-18 15:40     ` Linus Torvalds
  2000-09-18 15:59       ` Andrea Arcangeli
  0 siblings, 1 reply; 18+ messages in thread
From: Linus Torvalds @ 2000-09-18 15:40 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: Jamie Lokier, gcc, kuznet, linux-kernel

On Tue, 19 Sep 2000, Andrea Arcangeli wrote:
> 
> I think we can remove all the __dummy stuff and put the "memory" in such asm
> statements.
> 
> Comments?

Have you looked at the code it generates? Quite sad, really.

		Linus


* Re: GCC proposal for "@" asm constraint
  2000-09-18 15:40     ` Linus Torvalds
@ 2000-09-18 15:59       ` Andrea Arcangeli
  0 siblings, 0 replies; 18+ messages in thread
From: Andrea Arcangeli @ 2000-09-18 15:59 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jamie Lokier, gcc, kuznet, linux-kernel

On Mon, Sep 18, 2000 at 03:39:50PM -0700, Linus Torvalds wrote:

> Have you looked at the code it generates? Quite sad, really.

I read the asm produced by some of my testcases.  The current spinlock
implementation seems to do exactly the _right_ thing in practice and nothing
more. "memory" was instead causing reloads of constant addresses into registers
and stuff that shouldn't be necessary (I was in fact wondering about the reason
for those spurious loads also in my first email).

That said, I heard that some recent gcc miscompiles the kernel and we also have
to always compile with -fno-strict-aliasing. I think gcc developers should
comment on this issue. If they say the __dummy way is still going to be safe
for some gcc releases, we can skip those spurious loads caused by the "memory"
clobber. From the email I received from Richard Henderson in this thread it
seems they prefer that the kernel doesn't rely on those __dummy casts just now
(and they have the right to complain because that's a kernel bug).

Andrea


* Re: GCC proposal for "@" asm constraint
  2000-09-19 13:16 John Wehle
@ 2000-09-19 13:34 ` Andrea Arcangeli
  0 siblings, 0 replies; 18+ messages in thread
From: Andrea Arcangeli @ 2000-09-19 13:34 UTC (permalink / raw)
  To: John Wehle; +Cc: egcs, gcc, kuznet, linux-kernel, torvalds, rth

On Tue, Sep 19, 2000 at 04:16:07PM -0400, John Wehle wrote:
> Umm ... "miscompilation"?  As in the compiler produced the wrong code
> based on the input provided?

That's not a gcc bug (gcc is doing the right thing). It's the kernel that
should use the "memory" clobber in the spinlock implementation.

The sad code generated was in reality the _right_ code. I was blind in not
noticing the missing $ (I missed it the first time in the first testcase I
tried and I kept missing it; I was probably also biased, assuming the current
spinlocks were safe with the commonly used compilers, and I thought I was
fixing only a theoretical bug). I'm sorry for that (and thanks again to
Richard and Jamie).

Andrea


* Re: GCC proposal for "@" asm constraint
@ 2000-09-19 13:16 John Wehle
  2000-09-19 13:34 ` Andrea Arcangeli
  0 siblings, 1 reply; 18+ messages in thread
From: John Wehle @ 2000-09-19 13:16 UTC (permalink / raw)
  To: andrea; +Cc: egcs, gcc, kuznet, linux-kernel, torvalds, rth

> I see. So Jamie was right and we reproduced a case of miscompilation.

Umm ... "miscompilation"?  As in the compiler produced the wrong code
based on the input provided?

  int * p;

  ...

  a = *p;

        movl p,%eax
        movl (%eax),%edx

The assembly code appears to load the address stored at p (keep in mind
that p is a pointer), then use that address to fetch the value which is
placed in a.  What do you believe should have been generated by the compiler?

-- John
-------------------------------------------------------------------------
|   Feith Systems  |   Voice: 1-215-646-8000  |  Email: john@feith.com  |
|    John Wehle    |     Fax: 1-215-540-5495  |                         |
-------------------------------------------------------------------------


* Re: GCC proposal for "@" asm constraint
  2000-09-19 10:23       ` Jamie Lokier
@ 2000-09-19 11:19         ` Andrea Arcangeli
  0 siblings, 0 replies; 18+ messages in thread
From: Andrea Arcangeli @ 2000-09-19 11:19 UTC (permalink / raw)
  To: Jamie Lokier; +Cc: John Wehle, gcc, kuznet, linux-kernel, torvalds

On Tue, Sep 19, 2000 at 07:22:48PM +0200, Jamie Lokier wrote:
> That instruction loads the _value_ of p.  I.e. reads the memory from
> location 0x80495a4 into %eax.  The source instruction was:
> 
>        movl p,%eax
> 
> The instructions that you're thinking of, that load fixed addresses,
> look like these:
> 
>        mov $0x80495a4,%eax
>        lea 0x80495a4,%eax
> 
> or in source form:
> 
>        movl $p,%eax
>        leal p,%eax

Thanks for noticing my error.

Andrea


* Re: GCC proposal for "@" asm constraint
  2000-09-19 10:23       ` Richard Henderson
@ 2000-09-19 11:17         ` Andrea Arcangeli
  0 siblings, 0 replies; 18+ messages in thread
From: Andrea Arcangeli @ 2000-09-19 11:17 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Jamie Lokier, John Wehle, gcc, kuznet, linux-kernel, torvalds

On Tue, Sep 19, 2000 at 10:23:05AM -0700, Richard Henderson wrote:
> On Tue, Sep 19, 2000 at 04:32:16PM +0200, Andrea Arcangeli wrote:
> > Wrong: it's really loading the _address_.
> [...]
> >  80483f5:       a1 a4 95 04 08          mov    0x80495a4,%eax
> >                    ^^^^^^^^^^^                 ^^^^^^^^^
> 
> No, that's an absolute memory load.  If we were loading
> the address, there'd be a '$' before that number.

I see. So Jamie was right and we reproduced a case of miscompilation.

Andrea


* Re: GCC proposal for "@" asm constraint
  2000-09-19  7:21     ` Andrea Arcangeli
  2000-09-19 10:23       ` Jamie Lokier
@ 2000-09-19 10:23       ` Richard Henderson
  2000-09-19 11:17         ` Andrea Arcangeli
  1 sibling, 1 reply; 18+ messages in thread
From: Richard Henderson @ 2000-09-19 10:23 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Jamie Lokier, John Wehle, gcc, kuznet, linux-kernel, torvalds

On Tue, Sep 19, 2000 at 04:32:16PM +0200, Andrea Arcangeli wrote:
> Wrong: it's really loading the _address_.
[...]
>  80483f5:       a1 a4 95 04 08          mov    0x80495a4,%eax
>                    ^^^^^^^^^^^                 ^^^^^^^^^

No, that's an absolute memory load.  If we were loading
the address, there'd be a '$' before that number.


r~


* Re: GCC proposal for "@" asm constraint
  2000-09-19  7:21     ` Andrea Arcangeli
@ 2000-09-19 10:23       ` Jamie Lokier
  2000-09-19 11:19         ` Andrea Arcangeli
  2000-09-19 10:23       ` Richard Henderson
  1 sibling, 1 reply; 18+ messages in thread
From: Jamie Lokier @ 2000-09-19 10:23 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: John Wehle, gcc, kuznet, linux-kernel, torvalds

Andrea Arcangeli wrote:
> > p is a variable.  The _address_ of p is constant, but the reload is
> > not loading the address of p, it's loading the _value_.  That value can
> 
> Wrong: it's really loading the _address_. The value that is loaded into the
> register is the address of p that is stored only in the .text and that is an
> immediate value embedded into the opcodes of the asm.
> 
> If you can change the address it means it's self-modifying code, and GCC
> shouldn't really assume anything about that so GCC is _wrong_.
> 
...
>  80483f5:       a1 a4 95 04 08          mov    0x80495a4,%eax

That instruction loads the _value_ of p.  I.e. reads the memory from
location 0x80495a4 into %eax.  The source instruction was:

       movl p,%eax

The instructions that you're thinking of, that load fixed addresses,
look like these:

       mov $0x80495a4,%eax
       lea 0x80495a4,%eax

or in source form:

       movl $p,%eax
       leal p,%eax
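
The same distinction can be seen from the C side. In the small sketch
below (the two function names are invented for illustration), the first
function reads the variable and so needs a memory load, while the second
only produces a link-time constant. The instruction forms in the comments
are what one would typically expect from GCC on i386, not a guarantee.

```c
int *p;		/* a pointer variable; its contents can change at run time */

/* Reads the _value_ of p: typically `movl p,%eax' on i386. */
int *load_value_of_p(void)
{
	return p;
}

/* Yields the _address_ of p: typically `movl $p,%eax' on i386.
 * This is a constant fixed at link time and never needs reloading. */
int **load_address_of_p(void)
{
	return &p;
}
```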

You must be working too hard.
Please take a short break, relax, enjoy nature again :-)

have a nice day,
-- Jamie


* Re: GCC proposal for "@" asm constraint
  2000-09-19  8:01   ` David Howells
@ 2000-09-19  8:14     ` Andrea Arcangeli
  0 siblings, 0 replies; 18+ messages in thread
From: Andrea Arcangeli @ 2000-09-19  8:14 UTC (permalink / raw)
  To: David Howells; +Cc: John Wehle, gcc, linux-kernel

On Tue, Sep 19, 2000 at 04:01:26PM +0100, David Howells wrote:
> I can't remember exactly what it was now, but I think it was either something
> to do with spinlocks or bitops. I'll re-investigate tonight and see if I can
> come back with some benchmarks/code-snippets tomorrow.

Yes, you should tell us which inlined function generated the different asm
(if you post the two different asms or the two different .o files we can
probably find it ourselves).

I've seen that the rw_spin_locks are silly in requesting the address of the
spinlock to be in the register eax when the address of the spinlock isn't a
constant (they should instead at least use "r" and not "a"), and I was going to
fix it; however, that hasn't changed between test7 and test8...
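
As a rough illustration of the "a" versus "r" point (this is not the
rwlock code itself, and both function names are hypothetical): the "a"
constraint forces the address into eax, so anything live there must be
moved out of the way, while "r" lets the register allocator choose. The
extra "m" input is my addition; it only tells the compiler which memory
the asm reads.

```c
/* Address forced into eax (rax on x86-64) by the "a" constraint. */
static inline int load_addr_in_eax(const int *addr)
{
	int v;
#if defined(__i386__) || defined(__x86_64__)
	__asm__ __volatile__("movl (%1),%0"
			     : "=r" (v)
			     : "a" (addr), "m" (*addr));
#else
	v = *(const volatile int *) addr;
#endif
	return v;
}

/* Address in any general register: less pressure on eax. */
static inline int load_addr_in_any_reg(const int *addr)
{
	int v;
#if defined(__i386__) || defined(__x86_64__)
	__asm__ __volatile__("movl (%1),%0"
			     : "=r" (v)
			     : "r" (addr), "m" (*addr));
#else
	v = *(const volatile int *) addr;
#endif
	return v;
}
```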

Andrea


* Re: GCC proposal for "@" asm constraint
  2000-09-19  7:27 ` Andrea Arcangeli
@ 2000-09-19  8:01   ` David Howells
  2000-09-19  8:14     ` Andrea Arcangeli
  0 siblings, 1 reply; 18+ messages in thread
From: David Howells @ 2000-09-19  8:01 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: John Wehle, gcc, linux-kernel

I've been writing a kernel module and I've noticed a measurable performance
drop between the same code compiled against linux-2.4.0-test7 and against
test8. I disassembled the code to try and work out what was going on and I saw
the following happen:

 * [test8]
   One of the atomic memory access primitives supplied by the kernel
   was putting immediate data into a register outside of the inline-asm
   instruction group and then using the register inside.

 * [test7]
   The immediate data was passed directly to the inline-asm instruction.

In test8, of course, this means that the compiler has to scratch up a spare
register, which is totally unnecessary, as the immediate data could be
attached directly to the instruction opcode as was done in test7.

This had the effect of making the compiler have to push the old contents of
the register into a slot on the stack (I think it held a local variable at the
time), which had the further effects of using more stack memory, introducing
more register rearrangement (the code ended up longer), and burning up more
CPU cycles.
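
One way such a difference can arise (a guess at the mechanism, not a
diagnosis of the test7/test8 change) is the operand constraint: an "i"
operand is encoded in the instruction itself, while an "r" operand makes
the compiler load the constant into a spare register first. The function
names below are made up, and the asm is x86-only with a C fallback.

```c
/* Constant attached directly to the instruction: `addl $5,<reg>'. */
static inline int add_five_imm(int x)
{
#if defined(__i386__) || defined(__x86_64__)
	__asm__("addl %1,%0" : "+r" (x) : "i" (5) : "cc");
#else
	x += 5;
#endif
	return x;
}

/* Same constant forced through a register: the compiler must find a
 * free register and emit a separate load outside the asm. */
static inline int add_five_reg(int x)
{
	int five = 5;
#if defined(__i386__) || defined(__x86_64__)
	__asm__("addl %1,%0" : "+r" (x) : "r" (five) : "cc");
#else
	x += five;
#endif
	return x;
}
```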

I can't remember exactly what it was now, but I think it was either something
to do with spinlocks or bitops. I'll re-investigate tonight and see if I can
come back with some benchmarks/code-snippets tomorrow.

David Howells


* Re: GCC proposal for "@" asm constraint
  2000-09-18 18:38 John Wehle
@ 2000-09-19  7:27 ` Andrea Arcangeli
  2000-09-19  8:01   ` David Howells
  0 siblings, 1 reply; 18+ messages in thread
From: Andrea Arcangeli @ 2000-09-19  7:27 UTC (permalink / raw)
  To: John Wehle; +Cc: lk, gcc, kuznet, linux-kernel, torvalds

On Mon, Sep 18, 2000 at 09:37:43PM -0400, John Wehle wrote:
> It's perhaps not optimal, however I'm not sure that it's wrong.  In

It's not "wrong" in the sense that something breaks but it's definitely
suboptimal. There's no reason to reload a value that can't change because it's
embedded into the opcodes of the asm and set at static link time.

> any case if you can supply a small standalone test case (i.e. preprocessed
> source code) I'll take a quick look at things.  I take it that you haven't
> tried the current gcc sources?

The first testcase is the current spinlock implementation, the second testcase
adds the "memory" clobber and it generates the _spurious_ reload of `p'. You
should be able to compile with `gcc -O2 -S p-*.i`.

Andrea


* Re: GCC proposal for "@" asm constraint
  2000-09-19  1:41   ` Jamie Lokier
@ 2000-09-19  7:21     ` Andrea Arcangeli
  2000-09-19 10:23       ` Jamie Lokier
  2000-09-19 10:23       ` Richard Henderson
  0 siblings, 2 replies; 18+ messages in thread
From: Andrea Arcangeli @ 2000-09-19  7:21 UTC (permalink / raw)
  To: Jamie Lokier; +Cc: John Wehle, gcc, kuznet, linux-kernel, torvalds

On Tue, Sep 19, 2000 at 10:40:04AM +0200, Jamie Lokier wrote:
> p is a variable.  The _address_ of p is constant, but the reload is
> not loading the address of p, it's loading the _value_.  That value can

Wrong: it's really loading the _address_. The value that is loaded into the
register is the address of p that is stored only in the .text and that is an
immediate value embedded into the opcodes of the asm.

If you can change the address it means it's self-modifying code, and GCC
shouldn't really assume anything about that so GCC is _wrong_.

> Here, the saved cycle is a kernel bug.

Nope. Here is a disassembly taken after linking the testcase, so you
can see where the value `p' comes from:

080483e0 <main>:
 80483e0:       55                      push   %ebp
 80483e1:       89 e5                   mov    %esp,%ebp
 80483e3:       83 ec 08                sub    $0x8,%esp
 80483e6:       f0 0f ba 2d b8 94 04    lock btsl $0x0,0x80494b8
 80483ed:       08 00 
 80483ef:       0f 82 77 00 00 00       jb     804846c <gcc2_compiled.>
 80483f5:       a1 a4 95 04 08          mov    0x80495a4,%eax
                   ^^^^^^^^^^^                 ^^^^^^^^^
 80483fa:       8b 10                   mov    (%eax),%edx
 80483fc:       f0 0f ba 35 b8 94 04    lock btrl $0x0,0x80494b8
 8048403:       08 00 
 8048405:       f0 0f ba 2d b8 94 04    lock btsl $0x0,0x80494b8
 804840c:       08 00 
 804840e:       0f 82 66 00 00 00       jb     804847a <gcc2_compiled.+0xe>
 8048414:       8b 00                   mov    (%eax),%eax
 8048416:       f0 0f ba 35 b8 94 04    lock btrl $0x0,0x80494b8
 804841d:       08 00 
 804841f:       83 c4 f8                add    $0xfffffff8,%esp
 8048422:       50                      push   %eax
 8048423:       52                      push   %edx
 8048424:       e8 a7 ff ff ff          call   80483d0 <dummy>
 8048429:       89 ec                   mov    %ebp,%esp
 804842b:       5d                      pop    %ebp
 804842c:       c3                      ret    
 804842d:       90                      nop    
 804842e:       90                      nop    
 804842f:       90                      nop    

Andrea


* Re: GCC proposal for "@" asm constraint
  2000-09-18 18:07 ` Andrea Arcangeli
@ 2000-09-19  1:41   ` Jamie Lokier
  2000-09-19  7:21     ` Andrea Arcangeli
  0 siblings, 1 reply; 18+ messages in thread
From: Jamie Lokier @ 2000-09-19  1:41 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: John Wehle, gcc, kuznet, linux-kernel, torvalds

Andrea Arcangeli wrote:
> int * p;
> [...]
>         spin_lock(&lock);
>         a = *p;
>         spin_unlock(&lock);
> 
>         spin_lock(&lock);  
>         b = *p;
>         spin_unlock(&lock);

> [With "memory" clobber] the [second] reload of the address of `p'
> isn't necessary and gcc is wrong in generating it.

Wrong, GCC is behaving correctly.

> p is a constant embedded into the .text section and set at link time,

p is a variable.  The _address_ of p is constant, but the reload is
not loading the address of p, it's loading the _value_.  That value can
be changed by other threads.

In fact, you have demonstrated why the "memory" clobber is necessary for
spinlocks.  A perfect test case!

In your first example, without the clobber, the asm code is incorrect.
A parallel thread can change the value of p between the first
spin_unlock and the second spin_lock, and the GCC-generated code does
not notice.

> The above reloads are just wasted CPU cycles that we're a little worried
> about wasting.

Here, the saved cycle is a kernel bug.
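
Stripped of the locking itself, the effect can be sketched with an empty
asm that carries only the "memory" clobber. The helper names here are
hypothetical, and the barrier merely stands in for the
spin_unlock()/spin_lock() pair of the test case: with it, the compiler
has to read the variable p again for the second dereference; without it,
the compiler is free to reuse the pointer it already holds in a register.

```c
int *p;		/* shared pointer; another thread may change it */

/* Emits no instructions, but tells GCC that memory may have changed. */
static inline void compiler_barrier(void)
{
	__asm__ __volatile__("" : : : "memory");
}

void read_p_twice(int *out_a, int *out_b)
{
	int a, b;

	a = *p;
	compiler_barrier();	/* stands in for unlock + lock */
	b = *p;			/* p itself is loaded again here */

	*out_a = a;
	*out_b = b;
}
```

A single-threaded run cannot show the race, of course; the point is only
visible in the generated asm or with a second thread updating p.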

-- Jamie


* Re: GCC proposal for "@" asm constraint
@ 2000-09-18 18:38 John Wehle
  2000-09-19  7:27 ` Andrea Arcangeli
  0 siblings, 1 reply; 18+ messages in thread
From: John Wehle @ 2000-09-18 18:38 UTC (permalink / raw)
  To: andrea; +Cc: lk, gcc, kuznet, linux-kernel, torvalds

> The reload of the address of `p' isn't necessary and gcc is wrong in
> generating it. p is a constant embedded into the .text section ...

It's perhaps not optimal, however I'm not sure that it's wrong.  In
any case if you can supply a small standalone test case (i.e. preprocessed
source code) I'll take a quick look at things.  I take it that you haven't
tried the current gcc sources?

-- John
-------------------------------------------------------------------------
|   Feith Systems  |   Voice: 1-215-646-8000  |  Email: john@feith.com  |
|    John Wehle    |     Fax: 1-215-540-5495  |                         |
-------------------------------------------------------------------------


* Re: GCC proposal for "@" asm constraint
  2000-09-18 16:53 John Wehle
@ 2000-09-18 18:07 ` Andrea Arcangeli
  2000-09-19  1:41   ` Jamie Lokier
  0 siblings, 1 reply; 18+ messages in thread
From: Andrea Arcangeli @ 2000-09-18 18:07 UTC (permalink / raw)
  To: John Wehle; +Cc: lk, gcc, kuznet, linux-kernel, torvalds

On Mon, Sep 18, 2000 at 07:53:04PM -0400, John Wehle wrote:
> What version of gcc?  Recently some work was done to improve the handling of
> constant memory.

I'm using 2.95.2 19991024.

Take this small testcase:

#include <linux/spinlock.h>

int * p;
spinlock_t lock = SPIN_LOCK_UNLOCKED;

extern void dummy(int, int);

void myfunc(void) {
        int a, b; 
        spin_lock(&lock);
        a = *p;
        spin_unlock(&lock);

        spin_lock(&lock);  
        b = *p;
        spin_unlock(&lock);

        dummy(a,b);
}

If I compile it with:

	gcc -O2 -D__SMP__ -I ~/kernel/devel/2.2.18pre9aa1/include/ -S p.c

where 2.2.18pre9aa1 is the current spinlock implementation without
the "memory" clobber and with the __dummy trick I get:

	.file	"p.c"
	.version	"01.01"
gcc2_compiled.:
.globl lock
.data
	.align 4
	.type	 lock,@object
	.size	 lock,4
lock:
	.long 0
.text
	.align 16
.globl myfunc
	.type	 myfunc,@function
myfunc:
	pushl %ebp
	movl %esp,%ebp
	subl $8,%esp
#APP
	
1:	lock ; btsl $0,lock
	jc 2f
.section .text.lock,"ax"
2:	testb $1,lock
	jne 2b
	jmp 1b
.previous
#NO_APP
	movl p,%eax
	^^^^^^^^^^^
	movl (%eax),%edx
#APP
	lock ; btrl $0,lock
	
1:	lock ; btsl $0,lock
	jc 2f
.section .text.lock,"ax"
2:	testb $1,lock
	jne 2b
	jmp 1b
.previous
#NO_APP
	movl (%eax),%eax
#APP
	lock ; btrl $0,lock
#NO_APP
	addl $-8,%esp
	pushl %eax
	pushl %edx
	call dummy
	movl %ebp,%esp
	popl %ebp
	ret
.Lfe1:
	.size	 myfunc,.Lfe1-myfunc
	.comm	p,4,4
	.ident	"GCC: (GNU) 2.95.2 19991024 (release)"

If now I repeat the same after applying this patch to the
kernel tree that I was inlining:

--- 2.2.18pre9aa1/include/asm-i386/spinlock.h.~1~	Mon Sep 18 04:56:28 2000
+++ 2.2.18pre9aa1/include/asm-i386/spinlock.h	Tue Sep 19 03:04:56 2000
@@ -173,12 +173,12 @@
 #define spin_lock(lock) \
 __asm__ __volatile__( \
 	spin_lock_string \
-	:"=m" (__dummy_lock(lock)))
+	:"=m" (__dummy_lock(lock)) : : "memory")
 
 #define spin_unlock(lock) \
 __asm__ __volatile__( \
 	spin_unlock_string \
-	:"=m" (__dummy_lock(lock)))
+	:"=m" (__dummy_lock(lock)) : : "memory")
 
 #define spin_trylock(lock) (!test_and_set_bit(0,(lock)))

I then get this assembler:

	.file	"p.c"
	.version	"01.01"
gcc2_compiled.:
.globl lock
.data
	.align 4
	.type	 lock,@object
	.size	 lock,4
lock:
	.long 0
.text
	.align 16
.globl myfunc
	.type	 myfunc,@function
myfunc:
	pushl %ebp
	movl %esp,%ebp
	subl $8,%esp
#APP
	
1:	lock ; btsl $0,lock
	jc 2f
.section .text.lock,"ax"
2:	testb $1,lock
	jne 2b
	jmp 1b
.previous
#NO_APP
	movl p,%eax
	^^^^^^^^^^^
	movl (%eax),%edx
#APP
	lock ; btrl $0,lock
	
1:	lock ; btsl $0,lock
	jc 2f
.section .text.lock,"ax"
2:	testb $1,lock
	jne 2b
	jmp 1b
.previous
#NO_APP
	movl p,%eax
	^^^^^^^^^^^
	movl (%eax),%eax
#APP
	lock ; btrl $0,lock
#NO_APP
	addl $-8,%esp
	pushl %eax
	pushl %edx
	call dummy
	movl %ebp,%esp
	popl %ebp
	ret
.Lfe1:
	.size	 myfunc,.Lfe1-myfunc
	.comm	p,4,4
	.ident	"GCC: (GNU) 2.95.2 19991024 (release)"

The diff between the generated asms:

--- p.s.default-spinlocks	Tue Sep 19 03:10:14 2000
+++ p.s.memory	Tue Sep 19 03:10:29 2000
@@ -39,6 +39,7 @@
 	jmp 1b
 .previous
 #NO_APP
+	movl p,%eax
 	movl (%eax),%eax
 #APP
 	lock ; btrl $0,lock


The reload of the address of `p' isn't necessary and gcc is wrong in generating
it. p is a constant embedded into the .text section and set at link time; the
only way to change it would be for the assembler that declares "memory" as a
clobber to modify the code itself, and gcc should assume nothing about
self-modifying code (no part of IA32 Linux is self-modifying).

The above reloads are just wasted CPU cycles that we're a little worried about
wasting.

Andrea


* Re: GCC proposal for "@" asm constraint
@ 2000-09-18 16:53 John Wehle
  2000-09-18 18:07 ` Andrea Arcangeli
  0 siblings, 1 reply; 18+ messages in thread
From: John Wehle @ 2000-09-18 16:53 UTC (permalink / raw)
  To: andrea; +Cc: lk, gcc, kuznet, linux-kernel, torvalds

> I read the asm produced by some of my testcases.  The current spinlock
> implementation seems to do exactly the _right_ thing in practice and nothing
> more. "memory" was instead causing reloads of constant addresses into registers
> and stuff that shouldn't be necessary (I was in fact wondering about the reason
> for those spurious loads also in my first email).

What version of gcc?  Recently some work was done to improve the handling
of constant memory.

-- John
-------------------------------------------------------------------------
|   Feith Systems  |   Voice: 1-215-646-8000  |  Email: john@feith.com  |
|    John Wehle    |     Fax: 1-215-540-5495  |                         |
-------------------------------------------------------------------------



Thread overview: 18+ messages
     [not found] <Pine.LNX.4.21.0009071837310.8494-100000@inspiron.random>
2000-09-08  4:42 ` GCC proposal for "@" asm constraint Jamie Lokier
2000-09-18 15:35   ` Andrea Arcangeli
2000-09-18 15:40     ` Linus Torvalds
2000-09-18 15:59       ` Andrea Arcangeli
2000-09-18 16:53 John Wehle
2000-09-18 18:07 ` Andrea Arcangeli
2000-09-19  1:41   ` Jamie Lokier
2000-09-19  7:21     ` Andrea Arcangeli
2000-09-19 10:23       ` Jamie Lokier
2000-09-19 11:19         ` Andrea Arcangeli
2000-09-19 10:23       ` Richard Henderson
2000-09-19 11:17         ` Andrea Arcangeli
2000-09-18 18:38 John Wehle
2000-09-19  7:27 ` Andrea Arcangeli
2000-09-19  8:01   ` David Howells
2000-09-19  8:14     ` Andrea Arcangeli
2000-09-19 13:16 John Wehle
2000-09-19 13:34 ` Andrea Arcangeli
