public inbox for gcc-prs@sourceware.org
* libstdc++/3584: arm-specific atomic operations not atomic
@ 2001-07-06 2:56 robin.farine
0 siblings, 0 replies; 8+ messages in thread
From: robin.farine @ 2001-07-06 2:56 UTC (permalink / raw)
To: gcc-gnats; +Cc: acnrf
>Number: 3584
>Category: libstdc++
>Synopsis: arm-specific atomic operations not atomic
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: unassigned
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Fri Jul 06 02:56:00 PDT 2001
>Closed-Date:
>Last-Modified:
>Originator: Robin Farine
>Release: gcc version 3.0
>Organization:
>Environment:
build/host: Debian GNU/Linux
target: arm-unknown-elf
>Description:
Routines such as atomic_add() that read a memory location,
apply an operation to the value read, use the swp
instruction to update the memory location, and swp again if
the value read the second time does not equal the initial
value, introduce a race condition. The attachment
illustrates such a case.
>How-To-Repeat:
>Fix:
For single CPU systems, disable thread scheduling during the
operation.
>Release-Note:
>Audit-Trail:
>Unformatted:
----gnatsweb-attachment----
Content-Type: text/plain; name="arm-atomic-add.txt"
Content-Disposition: inline; filename="arm-atomic-add.txt"
The atomic_add() for ARM (with instruction numbering):
static inline void
__attribute__ ((__unused__))
__atomic_add (volatile _Atomic_word *__mem, int __val)
{
  _Atomic_word __tmp, __tmp2, __tmp3;
  __asm__ __volatile__ (
    "\n"
    "0:\t"
1   "ldr   %0, [%3]     \n\t"
2   "add   %1, %0, %4   \n\t"
3   "swp   %2, %1, [%3] \n\t"
4   "cmp   %0, %2       \n\t"
5   "swpne %1, %2, [%3] \n\t"
6   "bne   0b           \n\t"
7   ""
    : "=&r"(__tmp), "=&r"(__tmp2), "=&r"(__tmp3)
    : "r" (__mem), "r"(__val)
    : "cc", "memory");
}
Example of a sequence that produces an invalid result (*__mem == 2 instead of 4 at
the end). A line in the T1 or T2 column shows the register values as that thread
is about to execute the numbered instruction:

T1 (%0 %1 %2)            T2 (%0 %1 %2)            *__mem
1 (x x x) __val=1                                   0
2 (0 x x)
3 (0 1 x) <-- reschedule
                         1 (x x x) __val=1
                         2 (0 x x)
                         3 (0 1 x)
                         4 (0 1 0)                  1
                         5 (0 1 0)
                         6 (0 1 0)
                         7 (0 1 0)
                         ... <-- reschedule
4 (0 1 1)                                           1
5 (0 1 1) <-- reschedule
                         1 (x x x) __val=2
                         2 (1 x x)
                         3 (1 3 x)
                         4 (1 3 1)                  3
                         5 (1 3 1)
                         6 (1 3 1)
                         7 (1 3 1)
                         .... <-- reschedule
6 (0 3 1)                                           1
1 (0 3 1)
2 (1 3 1)
3 (1 2 1)
4 (1 2 1)                                           2
5 (1 2 1)
6 (1 2 1)
7 (1 2 1)
....
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: libstdc++/3584: arm-specific atomic operations not atomic
@ 2002-10-03 9:17 rearnsha
0 siblings, 0 replies; 8+ messages in thread
From: rearnsha @ 2002-10-03 9:17 UTC (permalink / raw)
To: acnrf, gcc-bugs, gcc-prs, nobody, rearnsha, robin.farine
Synopsis: arm-specific atomic operations not atomic
State-Changed-From-To: feedback->closed
State-Changed-By: rearnsha
State-Changed-When: Thu Oct 3 09:17:26 2002
State-Changed-Why:
I'm closing this because there doesn't seem to be much more that can be
done other than reverting to the generic (no thread safety) model, which
is what I've just done.
See http://gcc.gnu.org/ml/gcc-patches/2002-10/msg00157.html
for further details on the problem with the old code.
If thread support is added to the generic model, then we can make use of that on ARM too.
http://gcc.gnu.org/cgi-bin/gnatsweb.pl?cmd=view%20audit-trail&database=gcc&pr=3584
* Re: libstdc++/3584: arm-specific atomic operations not atomic
@ 2002-05-31 18:09 pme
0 siblings, 0 replies; 8+ messages in thread
From: pme @ 2002-05-31 18:09 UTC (permalink / raw)
To: acnrf, gcc-bugs, gcc-prs, nobody, rearnsha, robin.farine
Synopsis: arm-specific atomic operations not atomic
State-Changed-From-To: open->feedback
State-Changed-By: pme
State-Changed-When: Fri May 31 18:05:20 2002
State-Changed-Why:
What's the status of this PR? How are things for 3.1
and 3.2?
> I'm not up on what threading models are usable with
> libstdc++-v3, so I'll have to leave that side of
> things to the c++ library maintainers.
We simply grab whatever thread model has been configured
for GCC, as represented by the various gcc/gthr-*.h files.
http://gcc.gnu.org/cgi-bin/gnatsweb.pl?cmd=view%20audit-trail&database=gcc&pr=3584
* Re: libstdc++/3584: arm-specific atomic operations not atomic
@ 2001-07-12 7:26 Richard Earnshaw
0 siblings, 0 replies; 8+ messages in thread
From: Richard Earnshaw @ 2001-07-12 7:26 UTC (permalink / raw)
To: nobody; +Cc: gcc-prs
The following reply was made to PR libstdc++/3584; it has been noted by GNATS.
From: Richard Earnshaw <rearnsha@arm.com>
To: Robin Farine <acnrf@dial.eunet.ch>
Cc: Richard.Earnshaw@arm.com, robin.farine@terminus.org, gcc-gnats@gcc.gnu.org
Subject: Re: libstdc++/3584: arm-specific atomic operations not atomic
Date: Thu, 12 Jul 2001 15:23:57 +0100
acnrf@dial.eunet.ch said:
> Besides the fact that the label "0:" should *precede* the "mov r2, 1"
> instruction, this version too would allow starvation of a low-priority
> thread. I suppose that a multi-threads and multi-processors safe
> implementation of atomic_add() would look more like this:
I think we should forget multi-processor versions for the moment. SWP
semantics don't work well with these, since SWP memory locations need to
be placed in uncached/unbuffered pages to have well-defined
multi-processor semantics.
I'm not up on what threading models are usable with libstdc++-v3, so I'll
have to leave that side of things to the c++ library maintainers.
* Re: libstdc++/3584: arm-specific atomic operations not atomic
@ 2001-07-12 7:16 Robin Farine
0 siblings, 0 replies; 8+ messages in thread
From: Robin Farine @ 2001-07-12 7:16 UTC (permalink / raw)
To: nobody; +Cc: gcc-prs
The following reply was made to PR libstdc++/3584; it has been noted by GNATS.
From: Robin Farine <acnrf@dial.eunet.ch>
To: Richard.Earnshaw@arm.com
Cc: robin.farine@terminus.org, gcc-gnats@gcc.gnu.org
Subject: Re: libstdc++/3584: arm-specific atomic operations not atomic
Date: 12 Jul 2001 16:08:17 +0200
Robin Farine <rfarine@halftrack.hq.acn-group.ch> writes:
[...]
>
> @ r0 = address (struct {int lock, int value}), r1 = increment.
> ... @ prolog
> mov r2, 1
> 0:
> swp r3, r2, [r0]
> teq r3, #0
> beq 1f @ got the lock
> mov lr, pc
> ldr pc, .thread_yield
> b 0b
> .thread_yield: .word thread_yield
>
> 1:
> @ We now have the lock
> ldr r2, [r0, #4]
> add r2, r2, r1
> str r2, [r0, #4]
> str r3, [r0] @ Release the lock.
Besides the fact that the label "0:" should *precede* the "mov r2, 1"
instruction, this version too would allow starvation of a low-priority thread.
I suppose that a thread- and multiprocessor-safe implementation of
atomic_add() would look more like this:
void
atomic_add(volatile _Atomic_word *__mem, int __val)
{
  int __lock;
  int __tmp = 1;

  thread_suspend_scheduler();
  __asm__ __volatile__ (
    "\n"
    /* spin while another CPU holds the lock */
    "0:\t"
    "swp %0, %1, [%2]   \n\t"
    "teq %0, #0         \n\t"
    "bne 0b             \n\t"
    /* we now have the lock */
    "ldr %1, [%2, #4]   \n\t"
    "add %1, %1, %3     \n\t"
    "str %1, [%2, #4]   \n\t"
    /* release the lock */
    "str %0, [%2]       \n\t"
    ""
    : "=&r"(__lock), "+&r"(__tmp)  /* __tmp is read (the lock token 1) and written */
    : "r" (__mem), "r"(__val)
    : "cc", "memory");
  thread_resume_scheduler();
}
This routine relies on the structured version of _Atomic_word and on nestable
primitives to suspend/resume thread scheduling, which the POSIX threads API
doesn't provide, if I remember correctly ...
* Re: libstdc++/3584: arm-specific atomic operations not atomic
@ 2001-07-11 11:06 Richard Earnshaw
0 siblings, 0 replies; 8+ messages in thread
From: Richard Earnshaw @ 2001-07-11 11:06 UTC (permalink / raw)
To: nobody; +Cc: gcc-prs
The following reply was made to PR libstdc++/3584; it has been noted by GNATS.
From: Richard Earnshaw <rearnsha@arm.com>
To: Robin Farine <acnrf@dial.eunet.ch>
Cc: Richard.Earnshaw@arm.com, robin.farine@terminus.org, gcc-gnats@gcc.gnu.org
Subject: Re: libstdc++/3584: arm-specific atomic operations not atomic
Date: Wed, 11 Jul 2001 19:04:22 +0100
> This certainly works correctly, but it would introduce another problem.
> If another thread already holds the lock, the current thread would just
> spin hopelessly until the thread that holds the lock gets the CPU and
> releases it. This could result in a deadlock if the spinning thread has
> a higher priority than the lock owner.
>
> With the swap instruction (in contrast to test-and-set ;-)), I think
> that implementations of such routines have to depend on the threading
> model to explicitly release the CPU when they fail to get the lock.
> Something like:
Correct. I was forgetting that the load-locked/store-conditional
operations wouldn't suffer from this problem. That makes the sequence
particularly nasty, especially when called from thumb code... :-(
R.
* Re: libstdc++/3584: arm-specific atomic operations not atomic
@ 2001-07-11 10:36 Robin Farine
0 siblings, 0 replies; 8+ messages in thread
From: Robin Farine @ 2001-07-11 10:36 UTC (permalink / raw)
To: nobody; +Cc: gcc-prs
The following reply was made to PR libstdc++/3584; it has been noted by GNATS.
From: Robin Farine <acnrf@dial.eunet.ch>
To: Richard.Earnshaw@arm.com
Cc: robin.farine@terminus.org, gcc-gnats@gcc.gnu.org
Subject: Re: libstdc++/3584: arm-specific atomic operations not atomic
Date: 11 Jul 2001 19:34:59 +0200
Richard Earnshaw <rearnsha@arm.com> writes:
[...]
> The code for this variant would be:
> @ r0 = address (struct {int lock, int value}), r1 = increment.
> mov r2, #INT_MIN
> 0:
> swp r3, r2, [r0]
> cmp r3, #INT_MIN
> beq 0b @ already locked
> @ We now have the lock
> ldr r2, [r0, #4]
> add r2, r2, r1
> str r2, [r0, #4]
> str r3, [r0] @ Release the lock.
> mov r2
This certainly works correctly, but it would introduce another problem. If
another thread already holds the lock, the current thread would just spin
hopelessly until the thread that holds the lock gets the CPU and releases
it. This could result in a deadlock if the spinning thread has a higher
priority than the lock owner.
With the swap instruction (in contrast to test-and-set ;-)), I think that
implementations of such routines have to depend on the threading model to
explicitly release the CPU when they fail to get the lock. Something like:
@ r0 = address (struct {int lock, int value}), r1 = increment.
... @ prolog
mov r2, 1
0:
swp r3, r2, [r0]
teq r3, #0
beq 1f @ got the lock
mov lr, pc
ldr pc, .thread_yield
b 0b
.thread_yield: .word thread_yield
1:
@ We now have the lock
ldr r2, [r0, #4]
add r2, r2, r1
str r2, [r0, #4]
str r3, [r0] @ Release the lock.
mov r2
* Re: libstdc++/3584: arm-specific atomic operations not atomic
@ 2001-07-11 8:46 Richard Earnshaw
0 siblings, 0 replies; 8+ messages in thread
From: Richard Earnshaw @ 2001-07-11 8:46 UTC (permalink / raw)
To: nobody; +Cc: gcc-prs
The following reply was made to PR libstdc++/3584; it has been noted by GNATS.
From: Richard Earnshaw <rearnsha@arm.com>
To: robin.farine@terminus.org
Cc: gcc-gnats@gcc.gnu.org, acnrf@dial.eunet.ch, Richard.Earnshaw@arm.com
Subject: Re: libstdc++/3584: arm-specific atomic operations not atomic
Date: Wed, 11 Jul 2001 16:38:26 +0100
robin.farine@terminus.org said:
> Routines such as atomic_add() that read a memory location, apply an
> operation to the value read, use the swp instruction to update the
> memory location, and swp again if the value read the second time does
> not equal the initial value, introduce a race condition.
I agree. I can think of two possible solutions; which is best depends on
the potential uses of these operations.
The first one reserves a value (say INT_MIN, chosen because it is a simple
immediate constant on the ARM, and a relatively unlikely value in normal
signed-arithmetic operations); if the atomic memory location ever contains
this value, then another atomic operation is in progress on the memory
location, and the requester must spin until that value is not already
stored there.
The code for atomic add would then be:
@ r0 = address, r1 = increment.
    mov   r2, #INT_MIN
0:
    swp   r3, r2, [r0]
    cmp   r3, #INT_MIN
    beq   0b                @ already locked.
    @ We now have the lock
    add   r3, r3, r1
    cmp   r3, #INT_MIN      @ Paranoia -- INT_MIN is reserved.
    addeq r3, r3, #1
    str   r3, [r0]          @ save, and release the lock.
Similar sequences could be written for the other operations.
The second avoids reserving a value, but requires a larger object for the
"lock". It would also require special code to initialize and read any
value from an _Atomic_word object. This would break the current
include/bits/basic_string.h implementation, which contains
  _Atomic_word _M_references;

  bool
  _M_is_leaked() const
  { return _M_references < 0; }

  bool
  _M_is_shared() const
  { return _M_references > 0; }
  ...
I guess we could fix this by making _Atomic_word a class which defined
these operations.
The code for this variant would be:
@ r0 = address (struct {int lock, int value}), r1 = increment.
    mov   r2, #INT_MIN
0:
    swp   r3, r2, [r0]
    cmp   r3, #INT_MIN
    beq   0b                @ already locked
    @ We now have the lock
    ldr   r2, [r0, #4]
    add   r2, r2, r1
    str   r2, [r0, #4]
    str   r3, [r0]          @ Release the lock.
    mov r2
end of thread, other threads:[~2002-10-03 16:17 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2001-07-06 2:56 libstdc++/3584: arm-specific atomic operations not atomic robin.farine
2001-07-11 8:46 Richard Earnshaw
2001-07-11 10:36 Robin Farine
2001-07-11 11:06 Richard Earnshaw
2001-07-12 7:16 Robin Farine
2001-07-12 7:26 Richard Earnshaw
2002-05-31 18:09 pme
2002-10-03 9:17 rearnsha