* [CFT] i386 sync functions for PR 39677
@ 2009-10-16 23:11 Richard Henderson
2009-10-16 23:27 ` Joseph S. Myers
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Richard Henderson @ 2009-10-16 23:11 UTC (permalink / raw)
To: GCC Patches; +Cc: ro, dannysmith, ubizjak
[-- Attachment #1: Type: text/plain, Size: 1522 bytes --]
Some simple experimentation shows that while adding the lfence insn has
a performance impact on cpus that don't need it, it's approximately the
same as the performance impact of an external function call. Therefore,
my approach for this PR is going to be:
* If sse2 is supported, inline the lfence insn as needed.
This takes care of all 64-bit code, and I believe
all of Darwin, so we don't have to worry about the odd
shared libgcc issues we have there.
* Include 32-bit routines in the shared libgcc.
When possible, use @gnu_indirect_function support
to minimize the overhead of the cpu detection.
* Given that we now have a central location for handling
atomic synchronization, handle 80386 and 80486 via spinlock.
This means that we'll no longer have to inject -march=i586
for compiling some of our runtime libraries.
I've now written the external functions in assembly, primarily because
some of the DImode routines use all 7 registers, which made -fpic
compilation in the compiler very tricky. Secondarily, it's much easier
to implement the indirect function support directly in assembly.
First, I'd like to ask for extra sets of eyes to look over the code and
make sure I haven't made any silly typos.
Second, I'd like to ask different port maintainers (cygwin and solaris
particularly) to try to compile the code and report any portability
problems. Use any relevant combinations of:
-fpic
-DHAVE_GAS_CFI_DIRECTIVE
-DHAVE_GNU_INDIRECT_FUNCTION
r~
[-- Attachment #2: sync.S --]
[-- Type: text/plain, Size: 12420 bytes --]
/* Synchronization functions for i386.
Copyright (C) 2009 Free Software Foundation, Inc.
This file is part of GCC.
GCC is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 3, or (at your option) any later
version.
GCC is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
for more details.
Under Section 7 of GPL version 3, you are granted additional
permissions described in the GCC Runtime Library Exception, version
3.1, as published by the Free Software Foundation.
You should have received a copy of the GNU General Public License and
a copy of the GCC Runtime Library Exception along with this program;
see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
<http://www.gnu.org/licenses/>. */
/* Note that we don't bother with a 64-bit version, as there we know
that cmpxchg and lfence are both supported by the cpu. */
/* Token concatenation macros. */
#define CAT_(A,B) A ## B
#define CAT(A,B) CAT_(A,B)
#define CAT4_(A,B,C,D) A ## B ## C ## D
#define CAT4(A,B,C,D) CAT4_(A,B,C,D)
/* Redefine this to add a prefix to all symbols defined. */
#define PREFIX
#define P(X) CAT(PREFIX,X)
/* Redefine this to change the default alignment of subsequent functions. */
#define ALIGN 4
/* Define the type of a symbol. */
#ifdef __ELF__
# define TYPE(N,T) .type P(N),T
#else
# define TYPE(N,T)
#endif
/* Define a new symbol, with appropriate type and alignment. */
#define _DEFINE(N,T,A) .p2align A; TYPE(N,T); P(N):
/* End the definition of a symbol. */
#ifdef __ELF__
# define END(N) .size P(N), .-P(N)
#else
# define END(N)
#endif
/* Redefine these to generate functions with different prefixes. */
#define FUNC(N) _DEFINE(N,@function,ALIGN)
/* Define an object name. */
#define OBJECT(N,A) _DEFINE(N,@object,A)
/* If gas cfi directives are supported, use them, otherwise do nothing. */
#ifdef HAVE_GAS_CFI_DIRECTIVE
# define cfi_startproc .cfi_startproc
# define cfi_endproc .cfi_endproc
# define cfi_adjust_cfa_offset(O) .cfi_adjust_cfa_offset O
# define cfi_rel_offset(R,O) .cfi_rel_offset R, O
# define cfi_restore(R) .cfi_restore R
#else
# define cfi_startproc
# define cfi_endproc
# define cfi_adjust_cfa_offset(O)
# define cfi_rel_offset(R,O)
# define cfi_restore(R)
#endif
/* Simplify generation of those cfi directives for the common cases.
The PUSHS/POPS pair indicates the register should be saved for unwind;
otherwise we simply adjust the CFA. */
#define PUSH(R) pushl R; cfi_adjust_cfa_offset(4)
#define PUSHS(R) PUSH(R); cfi_rel_offset(R,0)
#define PUSHF pushfl; cfi_adjust_cfa_offset(4)
#define POP(R) popl R; cfi_adjust_cfa_offset(-4)
#define POPS(R) POP(R); cfi_restore(R)
#define POPF popfl; cfi_adjust_cfa_offset(-4)
/* Define a function name and begin a new CFI proc. */
#define FUNC_CFI(N) FUNC(N); cfi_startproc
/* End a function and close its CFI proc. */
#define END_CFI(N) cfi_endproc; END(N)
/* Parameterize the PIC model for the target. */
#ifdef __PIC__
# ifdef __ELF__
# define PIC_INIT(REG) \
call __i686.get_pc_thunk.REG; \
addl $_GLOBAL_OFFSET_TABLE_, %CAT(e,REG)
# define PIC_ADD(P,D) addl P,D
# define PIC_OFFSET(S) S@GOTOFF
# define PIC_ADDRESS(S,P) S@GOTOFF(P)
# else
# error "Unknown PIC model"
# endif
#else
# define PIC_INIT(REG)
# define PIC_ADD(P,D)
# define PIC_OFFSET(S) S
# define PIC_ADDRESS(S,P) S
#endif
\f
/* This variable caches the (relevant) properties of the currently
running cpu. It has the following values:
-1 Uninitialized
0 An LFENCE instruction is required after any sync
function with acquire semantics. Given that
lfence is an SSE2 insn, we can also assume cmpxchg8b.
1 No cmpxchg support. Note that we don't test for
the 80486 XADD instruction, even though we use it:
if we have to use a spinlock for any of the routines
for a data size, we have to use a spinlock for all of
the routines for that data size, so XADD by itself
isn't interesting.
2 CMPXCHG supported
3 CMPXCHG8B supported
*/
.data
OBJECT(cpu_prop_index,2)
.long -1
END(cpu_prop_index)
.text
/* Detect the properties of the currently running cpu, according to
the values listed above. Preserves all registers except EAX, which
holds the return value. */
FUNC_CFI(detect_cpu)
PUSHS(%ebx)
PUSH(%ecx)
PUSH(%edx)
PUSHS(%esi)
PUSHS(%edi)
/* Determine 386 vs 486 and presence of cpuid all at once. */
PUSHF
PUSHF
POP(%eax)
movl %eax, %edx
xorl $0x00200000, %eax
PUSH(%eax)
POPF
PUSHF
POP(%eax)
POPF
xorl %edx, %eax
/* If we weren't able to toggle the ID bit in the flags,
we don't have the cpuid instruction, and also don't
have the cmpxchg instruction. */
movl $1, %esi /* do not have cmpxchg */
jz .Legress
movl $2, %esi /* have cmpxchg */
xorl %eax, %eax
cpuid
/* Check for AuthenticAMD. At the end, %edi is zero if the vendor matched. */
xorl $0x68747541, %ebx
xorl $0x444D4163, %ecx
xorl $0x69746E65, %edx
movl %ebx, %edi
orl %ecx, %edi
orl %edx, %edi
/* If max_cpuid == 0, we can check no further. */
testl %eax, %eax
jz .Legress
movl $1, %eax
cpuid
/* Check for cmpxchg8b support. The CX8 bit is 1<<8 in EDX. */
shr $8, %edx
andl $1, %edx
addl %edx, %esi /* incr iff cmpxchg8b */
/* Check for AMD cpu. */
testl %edi, %edi
jnz .Legress
/* Extract family (%edx) and model (%ecx). */
movl %eax, %edx
movl %eax, %ecx
shrl $8, %edx
shrl $4, %ecx
andl $0xf, %edx
andl $0xf, %ecx
cmpl $0xf, %edx /* if family=15... */
jne 2f
shrl $12, %eax /* ... include extended fields. */
movl %eax, %ebx
andl $0xf0, %ebx
addl %ebx, %ecx
movzbl %ah, %eax
addl %eax, %edx
2:
/* Opteron Rev E has a bug in which on very rare occasions
a locked instruction doesn't act as a read-acquire barrier
if followed by a non-locked read-modify-write instruction.
Rev F has this bug in pre-release versions, but not in
versions released to customers, so we test only for Rev E,
which is family 15, model 32..63 inclusive. */
cmpl $15, %edx
jne .Legress
cmpl $32, %ecx
jb .Legress
cmpl $63, %ecx
ja .Legress
xorl %esi, %esi /* need lfence */
.Legress:
movl %esi, %eax
POPS(%edi)
POPS(%esi)
POP(%edx)
POP(%ecx)
POPS(%ebx)
ret
END_CFI(detect_cpu)
\f
/* Note that this CFI proc covers all of the ifuncs. */
.p2align ALIGN
cfi_startproc
#if defined(HAVE_GNU_INDIRECT_FUNCTION) && defined(__PIC__)
/* If we have indirect function support in the shared libgcc, we wish
to define the entry point symbol such that it returns the address
of the function we wish to execute for this cpu. The result of the
indirect function is cached in the GOT, so future calls through the
PLT proceed directly to the target function.
Each entry point defines a 4-entry table indexed by the values of
cpu_prop_index, and we use a common routine to load the value. */
FUNC(common_indirect_function)
PIC_INIT(cx)
PIC_ADD(%ecx, %edx)
movl PIC_ADDRESS(cpu_prop_index,%ecx), %eax
testl %eax, %eax
jns 1f
call detect_cpu
movl %eax, PIC_ADDRESS(cpu_prop_index,%ecx)
1: movl (%edx,%eax,4), %eax
PIC_ADD(%ecx, %eax)
ret
END(common_indirect_function)
#define _IFUNC(N,P2,P3) \
_DEFINE(CAT(__,N),@gnu_indirect_function,3); \
movl $PIC_OFFSET(CAT(t_,N)), %edx; \
jmp P(common_indirect_function); \
END(CAT(__,N)); \
.globl CAT(__,N); \
.section .rodata; \
OBJECT(CAT(t_,N),2); \
.long PIC_OFFSET(CAT(l_,N)); \
.long PIC_OFFSET(CAT(o_,N)); \
.long PIC_OFFSET(CAT(P2,N)); \
.long PIC_OFFSET(CAT(P3,N)); \
END(CAT(t_,N)); \
.text
#define IFUNC(N) _IFUNC(N,n_,n_)
#define IFUNC8(N) _IFUNC(N,o_,n_)
#else
/* If we don't have (or aren't using) indirect function support, define
functions that dispatch to the correct implementation function. */
/* ??? The question is, what's the best method for the branch predictors?
My guess is that indirect branches are, in general, hardest. Therefore
separate the 3 with compares and use direct branches. Aid the Pentium4
static branch predictor by indicating that the "normal" function is the
one we expect to execute. */
#define _IFUNC(N,CX_IDX) \
.globl CAT(__,N); \
FUNC(CAT(__,N)); \
PIC_INIT(cx); \
movl PIC_ADDRESS(cpu_prop_index,%ecx), %eax; \
testl %eax, %eax; \
jns,pt 1f; \
call detect_cpu; \
movl %eax, PIC_ADDRESS(cpu_prop_index,%ecx); \
1: cmpl $CX_IDX, %eax; \
jge,pt CAT(n_,N); \
testl %eax, %eax; \
jz CAT(l_,N); \
jmp CAT(o_,N); \
END(CAT(__,N))
#define IFUNC(N) _IFUNC(N,2)
#define IFUNC8(N) _IFUNC(N,3)
#endif /* HAVE_GNU_INDIRECT_FUNCTION */
IFUNC(sync_val_compare_and_swap_1)
IFUNC(sync_val_compare_and_swap_2)
IFUNC(sync_val_compare_and_swap_4)
IFUNC8(sync_val_compare_and_swap_8)
IFUNC(sync_bool_compare_and_swap_1)
IFUNC(sync_bool_compare_and_swap_2)
IFUNC(sync_bool_compare_and_swap_4)
IFUNC8(sync_bool_compare_and_swap_8)
IFUNC(sync_fetch_and_add_1)
IFUNC(sync_fetch_and_add_2)
IFUNC(sync_fetch_and_add_4)
IFUNC8(sync_fetch_and_add_8)
IFUNC(sync_add_and_fetch_1)
IFUNC(sync_add_and_fetch_2)
IFUNC(sync_add_and_fetch_4)
IFUNC8(sync_add_and_fetch_8)
IFUNC(sync_fetch_and_sub_1)
IFUNC(sync_fetch_and_sub_2)
IFUNC(sync_fetch_and_sub_4)
IFUNC8(sync_fetch_and_sub_8)
IFUNC(sync_sub_and_fetch_1)
IFUNC(sync_sub_and_fetch_2)
IFUNC(sync_sub_and_fetch_4)
IFUNC8(sync_sub_and_fetch_8)
IFUNC(sync_fetch_and_or_1)
IFUNC(sync_fetch_and_or_2)
IFUNC(sync_fetch_and_or_4)
IFUNC8(sync_fetch_and_or_8)
IFUNC(sync_or_and_fetch_1)
IFUNC(sync_or_and_fetch_2)
IFUNC(sync_or_and_fetch_4)
IFUNC8(sync_or_and_fetch_8)
IFUNC(sync_fetch_and_and_1)
IFUNC(sync_fetch_and_and_2)
IFUNC(sync_fetch_and_and_4)
IFUNC8(sync_fetch_and_and_8)
IFUNC(sync_and_and_fetch_1)
IFUNC(sync_and_and_fetch_2)
IFUNC(sync_and_and_fetch_4)
IFUNC8(sync_and_and_fetch_8)
IFUNC(sync_fetch_and_nand_1)
IFUNC(sync_fetch_and_nand_2)
IFUNC(sync_fetch_and_nand_4)
IFUNC8(sync_fetch_and_nand_8)
IFUNC(sync_nand_and_fetch_1)
IFUNC(sync_nand_and_fetch_2)
IFUNC(sync_nand_and_fetch_4)
IFUNC8(sync_nand_and_fetch_8)
cfi_endproc
\f
/* The actual bodies of the functions are implemented in sync.inc.
Include it 3 times with different parameters to generate the
"normal" (i.e. cmpxchg), "lfence", and "old" (i.e. no cmpxchg)
versions of the code. */
#undef ALIGN
#undef PREFIX
#define ALIGN 4
#define PREFIX n_
#define LFENCE
#include "sync.inc"
#undef PREFIX
#undef LFENCE
#define PREFIX l_
#define LFENCE lfence
#include "sync.inc"
#undef ALIGN
#undef PREFIX
#undef LFENCE
#define ALIGN 2
#define PREFIX o_
#define SPINLOCK 1
.local spinlock
.comm spinlock,4,4
/* Common code for the beginning and end of any spinlock protected function. */
#ifdef __PIC__
#define ARG(N) N+8(%esp) /* Skip saved ebx and return address. */
#define SPINLOCK_LOCK PUSHS(%ebx); call P(spinlock_lock)
FUNC(spinlock_lock)
/* Note that this startproc covers both lock and unlock functions. */
cfi_startproc
PIC_INIT(bx)
1: lock
btsl $0, PIC_ADDRESS(spinlock,%ebx)
jc 1b
ret
END(spinlock_lock)
#define SPINLOCK_UNLOCK_AND_RET jmp P(spinlock_unlock)
FUNC(spinlock_unlock)
cfi_adjust_cfa_offset(4)
cfi_rel_offset(%ebx,0)
xorl %ecx, %ecx
movl %ecx, PIC_ADDRESS(spinlock,%ebx)
POPS(%ebx)
ret
cfi_endproc
END(spinlock_unlock)
#else
#define ARG(N) N+4(%esp) /* Skip return address. */
#define SPINLOCK_LOCK \
1: lock; btsl $0, spinlock; jc 1b
#define SPINLOCK_UNLOCK_AND_RET \
xorl %ecx,%ecx; movl %ecx,spinlock; ret
#endif /* PIC */
#include "sync.inc"
\f
#ifdef __ELF__
#ifdef __PIC__
.section .text.__i686.get_pc_thunk.bx,"axG",@progbits,__i686.get_pc_thunk.bx,comdat
.globl __i686.get_pc_thunk.bx
.hidden __i686.get_pc_thunk.bx
.type __i686.get_pc_thunk.bx, @function
__i686.get_pc_thunk.bx:
movl (%esp), %ebx
ret
.section .text.__i686.get_pc_thunk.cx,"axG",@progbits,__i686.get_pc_thunk.cx,comdat
.globl __i686.get_pc_thunk.cx
.hidden __i686.get_pc_thunk.cx
.type __i686.get_pc_thunk.cx, @function
__i686.get_pc_thunk.cx:
movl (%esp), %ecx
ret
#endif
.section .note.GNU-stack,"",@progbits
#endif
[-- Attachment #3: sync.inc --]
[-- Type: text/plain, Size: 12187 bytes --]
/* This file is logically a part of sync.S; it is included 3 times. */
FUNC_CFI(sync_val_compare_and_swap_1)
#ifdef SPINLOCK
SPINLOCK_LOCK
movl ARG(0), %ecx
movzbl (%ecx), %eax
cmpb %al, ARG(4)
jne 2f
movl ARG(8), %edx
movb %dl, (%ecx)
2: SPINLOCK_UNLOCK_AND_RET
#else
movl 4(%esp), %ecx
movzbl 8(%esp), %eax
movl 12(%esp), %edx
lock; cmpxchgb %dl, (%ecx)
LFENCE
ret
#endif
END_CFI(sync_val_compare_and_swap_1)
FUNC_CFI(sync_val_compare_and_swap_2)
#ifdef SPINLOCK
SPINLOCK_LOCK
movl ARG(0), %ecx
movzwl (%ecx), %eax
cmpw %ax, ARG(4)
jne 2f
movl ARG(8), %edx
movw %dx, (%ecx)
2: SPINLOCK_UNLOCK_AND_RET
#else
movl 4(%esp), %ecx
movzwl 8(%esp), %eax
movl 12(%esp), %edx
lock; cmpxchgw %dx, (%ecx)
LFENCE
ret
#endif
END_CFI(sync_val_compare_and_swap_2)
FUNC_CFI(sync_val_compare_and_swap_4)
#ifdef SPINLOCK
SPINLOCK_LOCK
movl ARG(0), %ecx
movl (%ecx), %eax
cmpl %eax, ARG(4)
jne 2f
movl ARG(8), %edx
movl %edx, (%ecx)
2: SPINLOCK_UNLOCK_AND_RET
#else
movl 4(%esp), %ecx
movl 8(%esp), %eax
movl 12(%esp), %edx
lock; cmpxchgl %edx, (%ecx)
LFENCE
ret
#endif
END_CFI(sync_val_compare_and_swap_4)
FUNC_CFI(sync_val_compare_and_swap_8)
#ifdef SPINLOCK
SPINLOCK_LOCK
movl ARG(0), %ecx
movl (%ecx), %eax
movl 4(%ecx), %edx
cmpl %eax, ARG(4)
jne 2f
cmpl %edx, ARG(8)
jne 2f
PUSHS(%esi)
/* The ARG macro doesn't include the push above, so bias by 4. */
movl ARG(16), %esi
movl %esi, (%ecx)
movl ARG(20), %esi
movl %esi, 4(%ecx)
POPS(%esi)
2: SPINLOCK_UNLOCK_AND_RET
#else
PUSHS(%ebx)
PUSHS(%esi)
movl 12(%esp), %esi
movl 16(%esp), %eax
movl 20(%esp), %edx
movl 24(%esp), %ebx
movl 28(%esp), %ecx
lock; cmpxchg8b (%esi)
LFENCE
POPS(%esi)
POPS(%ebx)
ret
#endif
END_CFI(sync_val_compare_and_swap_8)
\f
FUNC_CFI(sync_bool_compare_and_swap_1)
#ifdef SPINLOCK
SPINLOCK_LOCK
movl ARG(0), %ecx
movl ARG(4), %eax
cmpb %al, (%ecx)
jne 2f
movl ARG(8), %edx
movb %dl, (%ecx)
2: sete %al
movzbl %al, %eax
SPINLOCK_UNLOCK_AND_RET
#else
movl 4(%esp), %ecx
movl 8(%esp), %eax
movl 12(%esp), %edx
lock; cmpxchgb %dl, (%ecx)
LFENCE
setz %al
movzbl %al, %eax
ret
#endif
END_CFI(sync_bool_compare_and_swap_1)
FUNC_CFI(sync_bool_compare_and_swap_2)
#ifdef SPINLOCK
SPINLOCK_LOCK
movl ARG(0), %ecx
movl ARG(4), %eax
cmpw %ax, (%ecx)
jne 2f
movl ARG(8), %edx
movw %dx, (%ecx)
2: sete %al
movzbl %al, %eax
SPINLOCK_UNLOCK_AND_RET
#else
movl 4(%esp), %ecx
movl 8(%esp), %eax
movl 12(%esp), %edx
lock; cmpxchgw %dx, (%ecx)
LFENCE
setz %al
movzbl %al, %eax
ret
#endif
END_CFI(sync_bool_compare_and_swap_2)
FUNC_CFI(sync_bool_compare_and_swap_4)
#ifdef SPINLOCK
SPINLOCK_LOCK
movl ARG(0), %ecx
movl ARG(4), %eax
cmpl %eax, (%ecx)
jne 2f
movl ARG(8), %edx
movl %edx, (%ecx)
2: sete %al
movzbl %al, %eax
SPINLOCK_UNLOCK_AND_RET
#else
movl 4(%esp), %ecx
movl 8(%esp), %eax
movl 12(%esp), %edx
lock; cmpxchgl %edx,(%ecx)
LFENCE
setz %al
movzbl %al,%eax
ret
#endif
END_CFI(sync_bool_compare_and_swap_4)
FUNC_CFI(sync_bool_compare_and_swap_8)
#ifdef SPINLOCK
SPINLOCK_LOCK
movl ARG(0), %ecx
movl ARG(4), %eax
movl ARG(8), %edx
cmpl %eax, (%ecx)
jne 2f
cmpl %edx, 4(%ecx)
jne 2f
movl ARG(12), %eax
movl ARG(16), %edx
movl %eax, (%ecx)
movl %edx, 4(%ecx)
movl $1, %eax
SPINLOCK_UNLOCK_AND_RET
2: xorl %eax, %eax
SPINLOCK_UNLOCK_AND_RET
#else
PUSHS(%ebx)
PUSHS(%esi)
movl 12(%esp), %esi
movl 16(%esp), %eax
movl 20(%esp), %edx
movl 24(%esp), %ebx
movl 28(%esp), %ecx
lock; cmpxchg8b (%esi)
LFENCE
setz %al
movzbl %al,%eax
POPS(%esi)
POPS(%ebx)
ret
#endif
END_CFI(sync_bool_compare_and_swap_8)
\f
#ifndef SPINLOCK
/* This CFI covers all of the add and subtract functions. */
.p2align ALIGN
cfi_startproc
FUNC(sync_fetch_and_add_1)
movl 4(%esp), %ecx
movl 8(%esp), %eax
lock; xaddb %al, (%ecx)
LFENCE
movzbl %al, %eax
ret
END(sync_fetch_and_add_1)
FUNC(sync_fetch_and_add_2)
movl 4(%esp), %ecx
movl 8(%esp), %eax
lock; xaddw %ax, (%ecx)
LFENCE
movzwl %ax, %eax
ret
END(sync_fetch_and_add_2)
FUNC(sync_fetch_and_add_4)
movl 4(%esp), %ecx
movl 8(%esp), %eax
lock; xaddl %eax, (%ecx)
LFENCE
ret
END(sync_fetch_and_add_4)
FUNC(sync_add_and_fetch_1)
movl 4(%esp), %ecx
movl 8(%esp), %eax
movl %eax, %edx
lock; xaddb %dl, (%ecx)
LFENCE
addb %dl, %al
movzbl %al, %eax
ret
END(sync_add_and_fetch_1)
FUNC(sync_add_and_fetch_2)
movl 4(%esp), %ecx
movl 8(%esp), %eax
movl %eax, %edx
lock; xaddw %dx, (%ecx)
LFENCE
addw %dx, %ax
movzwl %ax, %eax
ret
END(sync_add_and_fetch_2)
FUNC(sync_add_and_fetch_4)
movl 4(%esp), %ecx
movl 8(%esp), %eax
movl %eax, %edx
lock; xaddl %edx, (%ecx)
LFENCE
addl %edx, %eax
ret
END(sync_add_and_fetch_4)
FUNC(sync_fetch_and_sub_1)
movl 4(%esp), %ecx
movl 8(%esp), %eax
negl %eax
lock; xaddb %al, (%ecx)
LFENCE
movzbl %al, %eax
ret
END(sync_fetch_and_sub_1)
FUNC(sync_fetch_and_sub_2)
movl 4(%esp), %ecx
movl 8(%esp), %eax
negl %eax
lock; xaddw %ax, (%ecx)
LFENCE
movzwl %ax, %eax
ret
END(sync_fetch_and_sub_2)
FUNC(sync_fetch_and_sub_4)
movl 4(%esp), %ecx
movl 8(%esp), %eax
negl %eax
lock; xaddl %eax, (%ecx)
LFENCE
ret
END(sync_fetch_and_sub_4)
FUNC(sync_sub_and_fetch_1)
movl 4(%esp), %ecx
movl 8(%esp), %eax
negl %eax
movl %eax, %edx
lock; xaddb %dl, (%ecx)
LFENCE
addb %dl, %al
movzbl %al, %eax
ret
END(sync_sub_and_fetch_1)
FUNC(sync_sub_and_fetch_2)
movl 4(%esp), %ecx
movl 8(%esp), %eax
negl %eax
movl %eax, %edx
lock; xaddw %dx, (%ecx)
LFENCE
addw %dx, %ax
movzwl %ax, %eax
ret
END(sync_sub_and_fetch_2)
FUNC(sync_sub_and_fetch_4)
movl 4(%esp), %ecx
movl 8(%esp), %eax
negl %eax
movl %eax, %edx
lock; xaddl %edx, (%ecx)
LFENCE
addl %edx, %eax
ret
END(sync_sub_and_fetch_4)
cfi_endproc
#endif /* SPINLOCK */
\f
#define OR(S,D) or S,D
#define AND(S,D) and S,D
#define NAND(S,D) and S,D; not D /* D = ~(S & D); GCC 4.4 nand semantics. */
#define ADD(S,D) add S,D
#define ADC(S,D) adc S,D
#define SUB(S,D) sub S,D
#define SBB(S,D) sbb S,D
#define NIL(S,D)
#define MOV(S,D) mov S,D
#define MOVZX(S,D) movzx S,D
#ifdef SPINLOCK
#define _SYNC_FETCH_AND_OP(N, S, sax, sbx, sdx, OP, LDEXT, EXT) \
FUNC_CFI(CAT4(sync_fetch_and_,N,_,S)); \
SPINLOCK_LOCK; \
movl ARG(0), %ecx; \
LDEXT((%ecx),%eax); \
mov sax, sdx; \
OP(ARG(4), sdx); \
mov sdx, (%ecx); \
SPINLOCK_UNLOCK_AND_RET; \
END_CFI(CAT4(sync_fetch_and_,N,_,S))
#else
#define _SYNC_FETCH_AND_OP(N, S, sax, sbx, sdx, OP, LDEXT, EXT) \
FUNC_CFI(CAT4(sync_fetch_and_,N,_,S)); \
PUSHS(%ebx); \
movl 8(%esp), %ecx; \
movl 12(%esp), %ebx; \
mov (%ecx), sax; \
1: mov sax, sdx; \
OP(sbx, sdx); \
lock; cmpxchg sdx, (%ecx); \
jnz 1b; \
LFENCE; \
EXT(sax, %eax); \
POPS(%ebx); \
ret; \
END_CFI(CAT4(sync_fetch_and_,N,_,S))
#endif /* SPINLOCK */
#define SYNC_FETCH_AND_OP_1(N,OP) \
_SYNC_FETCH_AND_OP(N, 1, %al, %bl, %dl, OP, MOVZX, MOVZX)
#define SYNC_FETCH_AND_OP_2(N,OP) \
_SYNC_FETCH_AND_OP(N, 2, %ax, %bx, %dx, OP, MOVZX, MOVZX)
#define SYNC_FETCH_AND_OP_4(N, OP) \
_SYNC_FETCH_AND_OP(N, 4, %eax, %ebx, %edx, OP, MOV, NIL)
#ifdef SPINLOCK
#define SYNC_FETCH_AND_OP_8(N, OPLO, OPHI) \
FUNC_CFI(CAT4(sync_fetch_and_,N,_,8)); \
SPINLOCK_LOCK; \
PUSHS(%esi); \
PUSHS(%edi); \
/* Note that the ARG macro doesn't include the two \
pushes that we do above, so need to bias by 8. */ \
movl ARG(8), %ecx; \
movl (%ecx), %eax; \
movl 4(%ecx), %edx; \
movl %eax, %esi; \
movl %edx, %edi; \
OPLO(ARG(12), %esi); \
OPHI(ARG(16), %edi); \
movl %esi, (%ecx); \
movl %edi, 4(%ecx); \
POPS(%edi); \
POPS(%esi); \
SPINLOCK_UNLOCK_AND_RET; \
END_CFI(CAT4(sync_fetch_and_,N,_,8))
#else
#define SYNC_FETCH_AND_OP_8(N, OPLO, OPHI) \
FUNC_CFI(CAT4(sync_fetch_and_,N,_,8)); \
PUSHS(%ebx); \
PUSHS(%esi); \
PUSHS(%edi); \
PUSHS(%ebp); \
movl 20(%esp), %esi; \
movl 24(%esp), %edi; \
movl 28(%esp), %ebp; \
movl (%esi), %eax; \
movl 4(%esi), %edx; \
1: movl %eax, %ebx; \
movl %edx, %ecx; \
OPLO(%edi, %ebx); \
OPHI(%ebp, %ecx); \
lock; cmpxchg8b (%esi); \
jnz 1b; \
LFENCE; \
POPS(%ebp); \
POPS(%edi); \
POPS(%esi); \
POPS(%ebx); \
ret; \
END_CFI(CAT4(sync_fetch_and_,N,_,8))
#endif /* SPINLOCK */
#ifdef SPINLOCK
#define _SYNC_OP_AND_FETCH(N, S, sax, sbx, sdx, OP, LDEXT, EXT) \
FUNC_CFI(CAT4(sync_,N,_and_fetch_,S)); \
SPINLOCK_LOCK; \
movl ARG(0), %ecx; \
LDEXT((%ecx),%eax); \
OP(ARG(4), sax); \
mov sax, (%ecx); \
EXT(sax, %eax); \
SPINLOCK_UNLOCK_AND_RET; \
END_CFI(CAT4(sync_,N,_and_fetch_,S))
#else
#define _SYNC_OP_AND_FETCH(N, S, sax, sbx, sdx, OP, LDEXT, EXT) \
FUNC_CFI(CAT4(sync_,N,_and_fetch_,S)); \
PUSHS(%ebx); \
movl 8(%esp), %ecx; \
movl 12(%esp), %ebx; \
mov (%ecx), sax; \
1: mov sax, sdx; \
OP(sbx, sdx); \
lock; cmpxchg sdx, (%ecx); \
jnz 1b; \
LFENCE; \
LDEXT(sdx, %eax); \
POPS(%ebx); \
ret; \
END_CFI(CAT4(sync_,N,_and_fetch_,S))
#endif /* SPINLOCK */
#define SYNC_OP_AND_FETCH_1(N,OP) \
_SYNC_OP_AND_FETCH(N, 1, %al, %bl, %dl, OP, MOVZX, MOVZX)
#define SYNC_OP_AND_FETCH_2(N,OP) \
_SYNC_OP_AND_FETCH(N, 2, %ax, %bx, %dx, OP, MOVZX, MOVZX)
#define SYNC_OP_AND_FETCH_4(N, OP) \
_SYNC_OP_AND_FETCH(N, 4, %eax, %ebx, %edx, OP, MOV, NIL)
#ifdef SPINLOCK
#define SYNC_OP_AND_FETCH_8(N, OPLO, OPHI) \
FUNC_CFI(CAT4(sync_,N,_and_fetch_,8)); \
SPINLOCK_LOCK; \
movl ARG(0), %ecx; \
movl (%ecx), %eax; \
movl 4(%ecx), %edx; \
OPLO(ARG(4), %eax); \
OPHI(ARG(8), %edx); \
movl %eax, (%ecx); \
movl %edx, 4(%ecx); \
SPINLOCK_UNLOCK_AND_RET; \
END_CFI(CAT4(sync_,N,_and_fetch_,8))
#else
#define SYNC_OP_AND_FETCH_8(N, OPLO, OPHI) \
FUNC_CFI(CAT4(sync_,N,_and_fetch_,8)); \
PUSHS(%ebx); \
PUSHS(%esi); \
PUSHS(%edi); \
PUSHS(%ebp); \
movl 20(%esp), %esi; \
movl 24(%esp), %edi; \
movl 28(%esp), %ebp; \
movl (%esi), %eax; \
movl 4(%esi), %edx; \
1: movl %eax, %ebx; \
movl %edx, %ecx; \
OPLO(%edi, %ebx); \
OPHI(%ebp, %ecx); \
lock; cmpxchg8b (%esi); \
jnz 1b; \
LFENCE; \
movl %ebx, %eax; \
movl %ecx, %edx; \
POPS(%ebp); \
POPS(%edi); \
POPS(%esi); \
POPS(%ebx); \
ret; \
END_CFI(CAT4(sync_,N,_and_fetch_,8))
#endif /* SPINLOCK */
#ifdef SPINLOCK
SYNC_FETCH_AND_OP_1(add, ADD)
SYNC_FETCH_AND_OP_2(add, ADD)
SYNC_FETCH_AND_OP_4(add, ADD)
SYNC_FETCH_AND_OP_1(sub, SUB)
SYNC_FETCH_AND_OP_2(sub, SUB)
SYNC_FETCH_AND_OP_4(sub, SUB)
#endif
SYNC_FETCH_AND_OP_8(add, ADD, ADC)
SYNC_FETCH_AND_OP_8(sub, SUB, SBB)
SYNC_FETCH_AND_OP_1(or, OR)
SYNC_FETCH_AND_OP_2(or, OR)
SYNC_FETCH_AND_OP_4(or, OR)
SYNC_FETCH_AND_OP_8(or, OR, OR)
SYNC_FETCH_AND_OP_1(and, AND)
SYNC_FETCH_AND_OP_2(and, AND)
SYNC_FETCH_AND_OP_4(and, AND)
SYNC_FETCH_AND_OP_8(and, AND, AND)
SYNC_FETCH_AND_OP_1(nand, NAND)
SYNC_FETCH_AND_OP_2(nand, NAND)
SYNC_FETCH_AND_OP_4(nand, NAND)
SYNC_FETCH_AND_OP_8(nand, NAND, NAND)
#ifdef SPINLOCK
SYNC_OP_AND_FETCH_1(add, ADD)
SYNC_OP_AND_FETCH_2(add, ADD)
SYNC_OP_AND_FETCH_4(add, ADD)
SYNC_OP_AND_FETCH_1(sub, SUB)
SYNC_OP_AND_FETCH_2(sub, SUB)
SYNC_OP_AND_FETCH_4(sub, SUB)
#endif
SYNC_OP_AND_FETCH_8(add, ADD, ADC)
SYNC_OP_AND_FETCH_8(sub, SUB, SBB)
SYNC_OP_AND_FETCH_1(or, OR)
SYNC_OP_AND_FETCH_2(or, OR)
SYNC_OP_AND_FETCH_4(or, OR)
SYNC_OP_AND_FETCH_8(or, OR, OR)
SYNC_OP_AND_FETCH_1(and, AND)
SYNC_OP_AND_FETCH_2(and, AND)
SYNC_OP_AND_FETCH_4(and, AND)
SYNC_OP_AND_FETCH_8(and, AND, AND)
SYNC_OP_AND_FETCH_1(nand, NAND)
SYNC_OP_AND_FETCH_2(nand, NAND)
SYNC_OP_AND_FETCH_4(nand, NAND)
SYNC_OP_AND_FETCH_8(nand, NAND, NAND)
/* Undef all macros defined herein for reinclude. */
#undef OR
#undef AND
#undef NAND
#undef ADD
#undef ADC
#undef SUB
#undef NIL
#undef MOV
#undef MOVZX
#undef _SYNC_FETCH_AND_OP
#undef SYNC_FETCH_AND_OP_1
#undef SYNC_FETCH_AND_OP_2
#undef SYNC_FETCH_AND_OP_4
#undef SYNC_FETCH_AND_OP_8
#undef _SYNC_OP_AND_FETCH
#undef SYNC_OP_AND_FETCH_1
#undef SYNC_OP_AND_FETCH_2
#undef SYNC_OP_AND_FETCH_4
#undef SYNC_OP_AND_FETCH_8
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [CFT] i386 sync functions for PR 39677
2009-10-16 23:11 [CFT] i386 sync functions for PR 39677 Richard Henderson
@ 2009-10-16 23:27 ` Joseph S. Myers
2009-10-16 23:48 ` Richard Henderson
2009-10-19 14:24 ` Rainer Orth
2009-10-24 6:51 ` Danny Smith
2 siblings, 1 reply; 11+ messages in thread
From: Joseph S. Myers @ 2009-10-16 23:27 UTC (permalink / raw)
To: Richard Henderson; +Cc: GCC Patches, ro, dannysmith, ubizjak
The uses of __i686 in sync.S look likely to break tools configured
--with-arch=i686 (when __i686 is a macro defined to 1). Building glibc
with such a compiler is notoriously broken (there have been many bug
reports and patches over the years, from
<http://sourceware.org/ml/libc-alpha/2002-10/msg00156.html> through to
<http://sourceware.org/ml/libc-alpha/2009-07/msg00072.html> with many
in between, but none of the patches have been applied), but it's worked to
build GCC that way.
--
Joseph S. Myers
joseph@codesourcery.com
* Re: [CFT] i386 sync functions for PR 39677
2009-10-16 23:27 ` Joseph S. Myers
@ 2009-10-16 23:48 ` Richard Henderson
2009-10-17 18:14 ` Paolo Bonzini
0 siblings, 1 reply; 11+ messages in thread
From: Richard Henderson @ 2009-10-16 23:48 UTC (permalink / raw)
To: Joseph S. Myers; +Cc: Richard Henderson, GCC Patches, ro, dannysmith, ubizjak
[-- Attachment #1: Type: text/plain, Size: 309 bytes --]
On 10/16/2009 04:11 PM, Joseph S. Myers wrote:
> The uses of __i686 in sync.S look likely to break tools configured
> --with-arch=i686 (when __i686 is a macro defined to 1).
Fixed.
I've also re-partitioned into 2 nested include files, simplifying the
macros and shaving 200 lines of code duplication.
r~
[-- Attachment #2: sync.S --]
[-- Type: text/plain, Size: 12960 bytes --]
/* Synchronization functions for i386.
Copyright (C) 2009 Free Software Foundation, Inc.
This file is part of GCC.
GCC is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 3, or (at your option) any later
version.
GCC is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
for more details.
Under Section 7 of GPL version 3, you are granted additional
permissions described in the GCC Runtime Library Exception, version
3.1, as published by the Free Software Foundation.
You should have received a copy of the GNU General Public License and
a copy of the GCC Runtime Library Exception along with this program;
see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
<http://www.gnu.org/licenses/>. */
/* Note that we don't bother with a 64-bit version, as there we know
that cmpxchg and lfence are both supported by the cpu. */
/* Don't break builds configured with --with-arch=i686. */
#undef __i686
/* Token concatenation macros. */
#define CAT_(A,B) A ## B
#define CAT(A,B) CAT_(A,B)
#define CAT4_(A,B,C,D) A ## B ## C ## D
#define CAT4(A,B,C,D) CAT4_(A,B,C,D)
/* Redefine this to add a prefix to all symbols defined. */
#define PREFIX
#define P(X) CAT(PREFIX,X)
/* Redefine this to change the default alignment of subsequent functions. */
#define ALIGN 4
/* Define the type of a symbol. */
#ifdef __ELF__
# define TYPE(N,T) .type P(N),T
#else
# define TYPE(N,T)
#endif
/* Define a new symbol, with appropriate type and alignment. */
#define _DEFINE(N,T,A) .p2align A; TYPE(N,T); P(N):
/* End the definition of a symbol. */
#ifdef __ELF__
# define END(N) .size P(N), .-P(N)
#else
# define END(N)
#endif
/* Redefine these to generate functions with different prefixes. */
#define FUNC(N) _DEFINE(N,@function,ALIGN)
/* Define an object name. */
#define OBJECT(N,A) _DEFINE(N,@object,A)
/* If gas cfi directives are supported, use them, otherwise do nothing. */
#ifdef HAVE_GAS_CFI_DIRECTIVE
# define cfi_startproc .cfi_startproc
# define cfi_endproc .cfi_endproc
# define cfi_adjust_cfa_offset(O) .cfi_adjust_cfa_offset O
# define cfi_rel_offset(R,O) .cfi_rel_offset R, O
# define cfi_restore(R) .cfi_restore R
#else
# define cfi_startproc
# define cfi_endproc
# define cfi_adjust_cfa_offset(O)
# define cfi_rel_offset(R,O)
# define cfi_restore(R)
#endif
/* Simplify generation of those cfi directives for the common cases.
The PUSHS/POPS pair indicates the register should be saved for unwind;
otherwise we simply adjust the CFA. */
#define PUSH(R) pushl R; cfi_adjust_cfa_offset(4)
#define PUSHS(R) PUSH(R); cfi_rel_offset(R,0)
#define PUSHF pushfl; cfi_adjust_cfa_offset(4)
#define POP(R) popl R; cfi_adjust_cfa_offset(-4)
#define POPS(R) POP(R); cfi_restore(R)
#define POPF popfl; cfi_adjust_cfa_offset(-4)
/* Define a function name and begin a new CFI proc. */
#define FUNC_CFI(N) FUNC(N); cfi_startproc
/* End a function and close its CFI proc. */
#define END_CFI(N) cfi_endproc; END(N)
/* Parameterize the PIC model for the target. */
#ifdef __PIC__
# ifdef __ELF__
# define PIC_INIT(REG) \
call __i686.get_pc_thunk.REG; \
addl $_GLOBAL_OFFSET_TABLE_, %CAT(e,REG)
# define PIC_ADD(P,D) addl P,D
# define PIC_OFFSET(S) S@GOTOFF
# define PIC_ADDRESS(S,P) S@GOTOFF(P)
# else
# error "Unknown PIC model"
# endif
#else
# define PIC_INIT(REG)
# define PIC_ADD(P,D)
# define PIC_OFFSET(S) S
# define PIC_ADDRESS(S,P) S
#endif
\f
/* This variable caches the (relevant) properties of the currently
running cpu. It has the following values:
-1 Uninitialized
0 An LFENCE instruction is required after any sync
function with acquire semantics. Given that
lfence is an SSE2 insn, we can also assume cmpxchg8b.
1 No cmpxchg support. Note that we don't test for
the 80486 XADD instruction, even though we use it:
if we have to use a spinlock for any of the routines
for a data size, we have to use a spinlock for all of
the routines for that data size, so XADD by itself
isn't interesting.
2 CMPXCHG supported
3 CMPXCHG8B supported
*/
.data
OBJECT(cpu_prop_index,2)
.long -1
END(cpu_prop_index)
.text
/* Detect the properties of the currently running cpu, according to
the values listed above. Preserves all registers except EAX, which
holds the return value. */
FUNC_CFI(detect_cpu)
PUSHS(%ebx)
PUSH(%ecx)
PUSH(%edx)
PUSHS(%esi)
PUSHS(%edi)
/* Determine 386 vs 486 and presence of cpuid all at once. */
PUSHF
PUSHF
POP(%eax)
movl %eax, %edx
xorl $0x00200000, %eax
PUSH(%eax)
POPF
PUSHF
POP(%eax)
POPF
xorl %edx, %eax
/* If we weren't able to toggle the ID bit in the flags,
we don't have the cpuid instruction, and also don't
have the cmpxchg instruction. */
movl $1, %esi /* do not have cmpxchg */
jz .Legress
movl $2, %esi /* have cmpxchg */
xorl %eax, %eax
cpuid
/* Check for AuthenticAMD. At the end, %edi is zero if the vendor matched. */
xorl $0x68747541, %ebx
xorl $0x444D4163, %ecx
xorl $0x69746E65, %edx
movl %ebx, %edi
orl %ecx, %edi
orl %edx, %edi
/* If max_cpuid == 0, we can check no further. */
testl %eax, %eax
jz .Legress
movl $1, %eax
cpuid
/* Check for cmpxchg8b support. The CX8 bit is 1<<8 in EDX. */
shr $8, %edx
andl $1, %edx
addl %edx, %esi /* incr iff cmpxchg8b */
/* Check for AMD cpu. */
testl %edi, %edi
jnz .Legress
/* Extract family (%edx) and model (%ecx). */
movl %eax, %edx
movl %eax, %ecx
shr $8, %edx
shr $4, %ecx
andl $0xf, %edx
andl $0xf, %ecx
cmpl $0xf, %edx /* if family=15... */
jne 2f
shr $12, %eax /* ... include extended fields. */
movl %eax, %ebx
andl $0xf0, %ebx
addl %ebx, %ecx
movzbl %ah, %eax
addl %eax, %edx
2:
/* Opteron Rev E has a bug in which on very rare occasions
a locked instruction doesn't act as a read-acquire barrier
if followed by a non-locked read-modify-write instruction.
Rev F has this bug in pre-release versions, but not in
versions released to customers, so we test only for Rev E,
which is family 15, model 32..63 inclusive. */
cmpl $15, %edx
jne .Legress
cmpl $32, %ecx
jb .Legress
cmpl $63, %ecx
ja .Legress
xorl %esi, %esi /* need lfence */
.Legress:
movl %esi, %eax
POPS(%edi)
POPS(%esi)
POP(%edx)
POP(%ecx)
POPS(%ebx)
ret
END_CFI(detect_cpu)
\f
/* Note that this CFI proc covers all of the ifuncs. */
.p2align ALIGN
cfi_startproc
#if defined(HAVE_GNU_INDIRECT_FUNCTION) && defined(__PIC__)
/* If we have indirect function support in the shared libgcc, we wish
to define the entry point symbol such that it returns the address
of the function we wish to execute for this cpu. The result of the
indirect function is stored in the PLT so that future invocations
proceed directly to the target function.
Each entry point defines a 4-entry table according to the values
for cpu_prop_index and we use a common routine to load the value. */
FUNC(common_indirect_function)
PIC_INIT(cx)
PIC_ADD(%ecx, %edx)
movl PIC_ADDRESS(cpu_prop_index,%ecx), %eax
testl %eax, %eax
jns 1f
call detect_cpu
movl %eax, PIC_ADDRESS(cpu_prop_index,%ecx)
1: movl (%edx,%eax,4), %eax
PIC_ADD(%ecx, %eax)
ret
END(common_indirect_function)
#define _IFUNC(N,P2,P3) \
_DEFINE(CAT(__,N),@gnu_indirect_function,3); \
movl $PIC_OFFSET(CAT(t_,N)), %edx; \
jmp P(common_indirect_function); \
END(CAT(__,N)); \
.globl CAT(__,N); \
.section .rodata; \
OBJECT(CAT(t_,N),2); \
.long PIC_OFFSET(CAT(l_,N)); \
.long PIC_OFFSET(CAT(o_,N)); \
.long PIC_OFFSET(CAT(P2,N)); \
.long PIC_OFFSET(CAT(P3,N)); \
END(CAT(t_,N)); \
.text
#define IFUNC(N) _IFUNC(N,n_,n_)
#define IFUNC8(N) _IFUNC(N,o_,n_)
#else
/* If we don't have (or aren't using) indirect function support, define
functions that dispatch to the correct implementation function. */
/* ??? The question is, what's the best method for the branch predictors?
My guess is that indirect branches are, in general, hardest. Therefore
separate the 3 with compares and use direct branches. Aid the Pentium4
static branch predictor by indicating that the "normal" function is the
one we expect to execute. */
#define _IFUNC(N,CX_IDX) \
.globl CAT(__,N); \
FUNC(CAT(__,N)); \
PIC_INIT(cx); \
movl PIC_ADDRESS(cpu_prop_index,%ecx), %eax; \
testl %eax, %eax; \
jns,pt 1f; \
call detect_cpu; \
movl %eax, PIC_ADDRESS(cpu_prop_index,%ecx); \
1: cmpl $CX_IDX, %eax; \
jge,pt CAT(n_,N); \
testl %eax, %eax; \
jz CAT(l_,N); \
jmp CAT(o_,N); \
END(CAT(__,N))
#define IFUNC(N) _IFUNC(N,2)
#define IFUNC8(N) _IFUNC(N,3)
#endif /* HAVE_GNU_INDIRECT_FUNCTION */
IFUNC(sync_val_compare_and_swap_1)
IFUNC(sync_val_compare_and_swap_2)
IFUNC(sync_val_compare_and_swap_4)
IFUNC8(sync_val_compare_and_swap_8)
IFUNC(sync_bool_compare_and_swap_1)
IFUNC(sync_bool_compare_and_swap_2)
IFUNC(sync_bool_compare_and_swap_4)
IFUNC8(sync_bool_compare_and_swap_8)
IFUNC(sync_fetch_and_add_1)
IFUNC(sync_fetch_and_add_2)
IFUNC(sync_fetch_and_add_4)
IFUNC8(sync_fetch_and_add_8)
IFUNC(sync_add_and_fetch_1)
IFUNC(sync_add_and_fetch_2)
IFUNC(sync_add_and_fetch_4)
IFUNC8(sync_add_and_fetch_8)
IFUNC(sync_fetch_and_sub_1)
IFUNC(sync_fetch_and_sub_2)
IFUNC(sync_fetch_and_sub_4)
IFUNC8(sync_fetch_and_sub_8)
IFUNC(sync_sub_and_fetch_1)
IFUNC(sync_sub_and_fetch_2)
IFUNC(sync_sub_and_fetch_4)
IFUNC8(sync_sub_and_fetch_8)
IFUNC(sync_fetch_and_or_1)
IFUNC(sync_fetch_and_or_2)
IFUNC(sync_fetch_and_or_4)
IFUNC8(sync_fetch_and_or_8)
IFUNC(sync_or_and_fetch_1)
IFUNC(sync_or_and_fetch_2)
IFUNC(sync_or_and_fetch_4)
IFUNC8(sync_or_and_fetch_8)
IFUNC(sync_fetch_and_and_1)
IFUNC(sync_fetch_and_and_2)
IFUNC(sync_fetch_and_and_4)
IFUNC8(sync_fetch_and_and_8)
IFUNC(sync_and_and_fetch_1)
IFUNC(sync_and_and_fetch_2)
IFUNC(sync_and_and_fetch_4)
IFUNC8(sync_and_and_fetch_8)
IFUNC(sync_fetch_and_nand_1)
IFUNC(sync_fetch_and_nand_2)
IFUNC(sync_fetch_and_nand_4)
IFUNC8(sync_fetch_and_nand_8)
IFUNC(sync_nand_and_fetch_1)
IFUNC(sync_nand_and_fetch_2)
IFUNC(sync_nand_and_fetch_4)
IFUNC8(sync_nand_and_fetch_8)
cfi_endproc
\f
/* Some macros passed to e.g. SYNC_FETCH_AND_OP macro. */
#define OR(S,D) or S,D
#define AND(S,D) and S,D
#define NAND(S,D) not D; and S,D
#define ADD(S,D) add S,D
#define ADC(S,D) adc S,D
#define SUB(S,D) sub S,D
#define SBB(S,D) sbb S,D
/* The actual bodies of the functions are implemented in sync-1.inc.
Include it 3 times with different parameters to generate the
"normal" (i.e. cmpxchg), "lfence", and "old" (i.e. no cmpxchg)
versions of the code. */
#undef ALIGN
#define ALIGN 4
#undef PREFIX
#define PREFIX n_
#define LFENCE
#include "sync-1.inc"
#define PREFIX l_
#define LFENCE lfence
#include "sync-1.inc"
/* Conserve space in the spinlock versions. */
#undef ALIGN
#define ALIGN 2
#define PREFIX o_
#define SPINLOCK 1
/* The spinlock variable. */
/* ??? If this object is not going to be included in shared libgcc,
should we make this variable global, so that it can be unified
across different (potential) copies of this object? */
.local spinlock
.comm spinlock,4,4
/* Common code for the beginning and end of any spinlock protected function. */
#ifdef __PIC__
#define ARG(N) N+8(%esp) /* Skip saved ebx and return address. */
#define SPINLOCK_LOCK PUSHS(%ebx); call P(spinlock_lock)
FUNC(spinlock_lock)
/* Note that this startproc covers both lock and unlock functions. */
cfi_startproc
PIC_INIT(bx)
1: lock
btsl $0, PIC_ADDRESS(spinlock,%ebx)
jc 1b
ret
END(spinlock_lock)
#define SPINLOCK_UNLOCK_AND_RET jmp P(spinlock_unlock)
FUNC(spinlock_unlock)
cfi_adjust_cfa_offset(4)
cfi_rel_offset(%ebx,0)
xorl %ecx, %ecx
movl %ecx, PIC_ADDRESS(spinlock,%ebx)
POPS(%ebx)
ret
cfi_endproc
END(spinlock_unlock)
#else
#define ARG(N) N+4(%esp) /* Skip return address. */
#define SPINLOCK_LOCK \
1: lock; btsl $0, spinlock; jc 1b
#define SPINLOCK_UNLOCK_AND_RET \
xorl %ecx,%ecx; movl %ecx,spinlock; ret
#endif /* PIC */
#include "sync-1.inc"
\f
#ifdef __ELF__
#ifdef __PIC__
.section .text.__i686.get_pc_thunk.bx,"axG",@progbits,__i686.get_pc_thunk.bx,comdat
.globl __i686.get_pc_thunk.bx
.hidden __i686.get_pc_thunk.bx
.type __i686.get_pc_thunk.bx, @function
__i686.get_pc_thunk.bx:
movl (%esp), %ebx
ret
.section .text.__i686.get_pc_thunk.cx,"axG",@progbits,__i686.get_pc_thunk.cx,comdat
.globl __i686.get_pc_thunk.cx
.hidden __i686.get_pc_thunk.cx
.type __i686.get_pc_thunk.cx, @function
__i686.get_pc_thunk.cx:
movl (%esp), %ecx
ret
#endif
.section .note.GNU-stack,"",@progbits
#endif
[-- Attachment #3: sync-1.inc --]
[-- Type: text/plain, Size: 4810 bytes --]
/* This file is logically a part of sync.S. It is included 3 times
with macros set for the different function sets. */
/* Generate all 1 byte operations. */
#define SIZE 1
#define MOVEXT movzbl
#define EXT(S,D) movzbl S,D
#define SAX %al
#define SBX %bl
#define SCX %cl
#define SDX %dl
#include "sync-2.inc"
/* Generate all 2 byte operations. */
#define SIZE 2
#define MOVEXT movzwl
#define EXT(S,D) movzwl S,D
#define SAX %ax
#define SBX %bx
#define SCX %cx
#define SDX %dx
#include "sync-2.inc"
/* Generate all 4 byte operations. */
#define SIZE 4
#define MOVEXT mov
#define EXT(S,D)
#define SAX %eax
#define SBX %ebx
#define SCX %ecx
#define SDX %edx
#include "sync-2.inc"
/* Generate all 8 byte operations. */
FUNC_CFI(sync_val_compare_and_swap_8)
#ifdef SPINLOCK
SPINLOCK_LOCK
movl ARG(0), %ecx
movl (%ecx), %eax
movl 4(%ecx), %edx
cmpl %eax, ARG(4)
jne 2f
cmpl %edx, ARG(8)
jne 2f
PUSHS(%esi)
movl ARG(12), %esi
movl %esi, (%ecx)
movl ARG(16), %esi
movl %esi, 4(%ecx)
POPS(%esi)
2: SPINLOCK_UNLOCK_AND_RET
#else
PUSHS(%ebx)
PUSHS(%esi)
movl 12(%esp), %esi
movl 16(%esp), %eax
movl 20(%esp), %edx
movl 24(%esp), %ebx
movl 28(%esp), %ecx
lock; cmpxchg8b (%esi)
LFENCE
POPS(%esi)
POPS(%ebx)
ret
#endif
END_CFI(sync_val_compare_and_swap_8)
FUNC_CFI(sync_bool_compare_and_swap_8)
#ifdef SPINLOCK
SPINLOCK_LOCK
movl ARG(0), %ecx
movl ARG(4), %eax
movl ARG(8), %edx
cmpl %eax, (%ecx)
jne 2f
cmpl %edx, 4(%ecx)
jne 2f
movl ARG(12), %eax
movl ARG(16), %edx
movl %eax, (%ecx)
movl %edx, 4(%ecx)
movl $1, %eax
SPINLOCK_UNLOCK_AND_RET
2: xorl %eax, %eax
SPINLOCK_UNLOCK_AND_RET
#else
PUSHS(%ebx)
PUSHS(%esi)
movl 12(%esp), %esi
movl 16(%esp), %eax
movl 20(%esp), %edx
movl 24(%esp), %ebx
movl 28(%esp), %ecx
lock; cmpxchg8b (%esi)
LFENCE
setz %al
movzbl %al,%eax
POPS(%esi)
POPS(%ebx)
ret
#endif
END_CFI(sync_bool_compare_and_swap_8)
#ifdef SPINLOCK
#define SYNC_FETCH_AND_OP_8(N, OPLO, OPHI) \
FUNC_CFI(CAT4(sync_fetch_and_,N,_,8)); \
SPINLOCK_LOCK; \
PUSHS(%esi); \
PUSHS(%edi); \
/* Note that the ARG macro doesn't include the two \
pushes that we do above, so need to bias by 8. */ \
movl ARG(8), %ecx; \
movl ARG(12), %esi; \
movl ARG(16), %edi; \
movl (%ecx), %eax; \
movl 4(%ecx), %edx; \
OPLO(%eax, %esi); \
OPHI(%edx, %edi); \
movl %esi, (%ecx); \
movl %edi, 4(%ecx); \
POPS(%edi); \
POPS(%esi); \
SPINLOCK_UNLOCK_AND_RET; \
END_CFI(CAT4(sync_fetch_and_,N,_,8))
#else
#define SYNC_FETCH_AND_OP_8(N, OPLO, OPHI) \
FUNC_CFI(CAT4(sync_fetch_and_,N,_,8)); \
PUSHS(%ebx); \
PUSHS(%esi); \
PUSHS(%edi); \
PUSHS(%ebp); \
movl 20(%esp), %esi; \
movl 24(%esp), %edi; \
movl 28(%esp), %ebp; \
movl (%esi), %eax; \
movl 4(%esi), %edx; \
1: movl %eax, %ebx; \
movl %edx, %ecx; \
OPLO(%edi, %ebx); \
OPHI(%ebp, %ecx); \
lock; cmpxchg8b (%esi); \
jnz 1b; \
LFENCE; \
POPS(%ebp); \
POPS(%edi); \
POPS(%esi); \
POPS(%ebx); \
ret; \
END_CFI(CAT4(sync_fetch_and_,N,_,8))
#endif /* SPINLOCK */
#ifdef SPINLOCK
#define SYNC_OP_AND_FETCH_8(N, OPLO, OPHI) \
FUNC_CFI(CAT4(sync_,N,_and_fetch_,8)); \
SPINLOCK_LOCK; \
movl ARG(0), %ecx; \
movl ARG(4), %eax; \
movl ARG(8), %edx; \
OPLO((%ecx), %eax); \
OPHI(4(%ecx), %edx); \
movl %eax, (%ecx); \
movl %edx, 4(%ecx); \
SPINLOCK_UNLOCK_AND_RET; \
END_CFI(CAT4(sync_,N,_and_fetch_,8))
#else
#define SYNC_OP_AND_FETCH_8(N, OPLO, OPHI) \
FUNC_CFI(CAT4(sync_,N,_and_fetch_,8)); \
PUSHS(%ebx); \
PUSHS(%esi); \
PUSHS(%edi); \
PUSHS(%ebp); \
movl 20(%esp), %esi; \
movl 24(%esp), %edi; \
movl 28(%esp), %ebp; \
movl (%esi), %eax; \
movl 4(%esi), %edx; \
1: movl %eax, %ebx; \
movl %edx, %ecx; \
OPLO(%edi, %ebx); \
OPHI(%ebp, %ecx); \
lock; cmpxchg8b (%esi); \
jnz 1b; \
LFENCE; \
movl %ebx, %eax; \
movl %ecx, %edx; \
POPS(%ebp); \
POPS(%edi); \
POPS(%esi); \
POPS(%ebx); \
ret; \
END_CFI(CAT4(sync_,N,_and_fetch_,8))
#endif /* SPINLOCK */
SYNC_FETCH_AND_OP_8(add, ADD, ADC)
SYNC_FETCH_AND_OP_8(sub, SUB, SBB)
SYNC_FETCH_AND_OP_8(or, OR, OR)
SYNC_FETCH_AND_OP_8(and, AND, AND)
SYNC_FETCH_AND_OP_8(nand, NAND, NAND)
SYNC_OP_AND_FETCH_8(add, ADD, ADC)
SYNC_OP_AND_FETCH_8(sub, SUB, SBB)
SYNC_OP_AND_FETCH_8(or, OR, OR)
SYNC_OP_AND_FETCH_8(and, AND, AND)
SYNC_OP_AND_FETCH_8(nand, NAND, NAND)
/* Undef all macros defined herein for reinclude. */
#undef SYNC_FETCH_AND_OP_8
#undef SYNC_OP_AND_FETCH_8
/* Undef all the parameters to this file for reinclude. */
#undef PREFIX
#undef LFENCE
[-- Attachment #4: sync-2.inc --]
[-- Type: text/plain, Size: 3890 bytes --]
/* This file is logically a part of sync.S. It is included 3 times
into sync-1.inc with macros set for various operand sizes. */
FUNC_CFI(CAT(sync_val_compare_and_swap_,SIZE))
#ifdef SPINLOCK
SPINLOCK_LOCK
movl ARG(0), %ecx
MOVEXT (%ecx), %eax
cmp SAX, ARG(4)
jne 1f
movl ARG(8), %edx
mov SDX, (%ecx)
1: SPINLOCK_UNLOCK_AND_RET
#else
movl 4(%esp), %ecx
MOVEXT 8(%esp), %eax
movl 12(%esp), %edx
lock; cmpxchg SDX, (%ecx)
LFENCE
ret
#endif
END_CFI(CAT(sync_val_compare_and_swap_,SIZE))
FUNC_CFI(CAT(sync_bool_compare_and_swap_,SIZE))
#ifdef SPINLOCK
SPINLOCK_LOCK
movl ARG(0), %ecx
movl ARG(4), %edx
xorl %eax, %eax
cmp SDX, (%ecx)
jne 1f
movl ARG(8), %edx
mov SDX, (%ecx)
incl %eax
1: SPINLOCK_UNLOCK_AND_RET
#else
movl 4(%esp), %ecx
movl 8(%esp), %eax
movl 12(%esp), %edx
lock; cmpxchg SDX, (%ecx)
LFENCE
setz %al
movzbl %al, %eax
ret
#endif
END_CFI(CAT(sync_bool_compare_and_swap_,SIZE))
#ifndef SPINLOCK
/* This CFI covers all of the add and subtract functions. */
.p2align ALIGN
cfi_startproc
FUNC(CAT(sync_fetch_and_add_,SIZE))
movl 4(%esp), %ecx
movl 8(%esp), %eax
lock; xadd SAX, (%ecx)
LFENCE
EXT(SAX, %eax)
ret
END(CAT(sync_fetch_and_add_,SIZE))
FUNC(CAT(sync_add_and_fetch_,SIZE))
movl 4(%esp), %ecx
movl 8(%esp), %eax
movl %eax, %edx
lock; xadd SDX, (%ecx)
LFENCE
add SDX, SAX
EXT(SAX, %eax)
ret
END(CAT(sync_add_and_fetch_,SIZE))
FUNC(CAT(sync_fetch_and_sub_,SIZE))
movl 4(%esp), %ecx
movl 8(%esp), %eax
negl %eax
lock; xadd SAX, (%ecx)
LFENCE
EXT(SAX, %eax)
ret
END(CAT(sync_fetch_and_sub_,SIZE))
FUNC(CAT(sync_sub_and_fetch_,SIZE))
movl 4(%esp), %ecx
movl 8(%esp), %eax
negl %eax
movl %eax, %edx
lock; xadd SDX, (%ecx)
LFENCE
add SDX, SAX
EXT(SAX, %eax)
ret
END(CAT(sync_sub_and_fetch_,SIZE))
cfi_endproc
#endif /* SPINLOCK */
\f
#ifdef SPINLOCK
#define SYNC_FETCH_AND_OP(N, OP) \
FUNC_CFI(CAT4(sync_fetch_and_,N,_,SIZE)); \
SPINLOCK_LOCK; \
movl ARG(0), %ecx; \
movl ARG(4), %edx; \
MOVEXT (%ecx), %eax; \
OP(%eax, %edx); \
mov SDX, (%ecx); \
SPINLOCK_UNLOCK_AND_RET; \
END_CFI(CAT4(sync_fetch_and_,N,_,SIZE))
#else
#define SYNC_FETCH_AND_OP(N, OP) \
FUNC_CFI(CAT4(sync_fetch_and_,N,_,SIZE)); \
PUSHS(%ebx); \
movl 8(%esp), %ecx; \
movl 12(%esp), %ebx; \
mov (%ecx), SAX; \
1: mov SAX, SDX; \
OP(SBX, SDX); \
lock; cmpxchg SDX, (%ecx); \
jnz 1b; \
LFENCE; \
EXT(SAX, %eax); \
POPS(%ebx); \
ret; \
END_CFI(CAT4(sync_fetch_and_,N,_,SIZE))
#endif /* SPINLOCK */
#ifdef SPINLOCK
#define SYNC_OP_AND_FETCH(N, OP) \
FUNC_CFI(CAT4(sync_,N,_and_fetch_,SIZE)); \
SPINLOCK_LOCK; \
movl ARG(0), %ecx; \
movl ARG(4), %eax; \
OP((%ecx), SAX); \
mov SAX, (%ecx); \
EXT(SAX, %eax); \
SPINLOCK_UNLOCK_AND_RET; \
END_CFI(CAT4(sync_,N,_and_fetch_,SIZE))
#else
#define SYNC_OP_AND_FETCH(N, OP) \
FUNC_CFI(CAT4(sync_,N,_and_fetch_,SIZE)); \
PUSHS(%ebx); \
movl 8(%esp), %ecx; \
movl 12(%esp), %ebx; \
mov (%ecx), SAX; \
1: mov SAX, SDX; \
OP(SBX, SDX); \
lock; cmpxchg SDX, (%ecx); \
jnz 1b; \
LFENCE; \
MOVEXT SDX, %eax; \
POPS(%ebx); \
ret; \
END_CFI(CAT4(sync_,N,_and_fetch_,SIZE))
#endif /* SPINLOCK */
#ifdef SPINLOCK
SYNC_FETCH_AND_OP(add, ADD)
SYNC_FETCH_AND_OP(sub, SUB)
#endif
SYNC_FETCH_AND_OP(or, OR)
SYNC_FETCH_AND_OP(and, AND)
SYNC_FETCH_AND_OP(nand, NAND)
#ifdef SPINLOCK
SYNC_OP_AND_FETCH(add, ADD)
SYNC_OP_AND_FETCH(sub, SUB)
#endif
SYNC_OP_AND_FETCH(or, OR)
SYNC_OP_AND_FETCH(and, AND)
SYNC_OP_AND_FETCH(nand, NAND)
/* Undef all macros defined herein for reinclude. */
#undef SYNC_FETCH_AND_OP
#undef SYNC_OP_AND_FETCH
/* Undef all the parameters to this file for reinclude. */
#undef SIZE
#undef MOVEXT
#undef EXT
#undef SAX
#undef SBX
#undef SCX
#undef SDX
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [CFT] i386 sync functions for PR 39677
2009-10-16 23:48 ` Richard Henderson
@ 2009-10-17 18:14 ` Paolo Bonzini
2009-10-17 18:51 ` Uros Bizjak
0 siblings, 1 reply; 11+ messages in thread
From: Paolo Bonzini @ 2009-10-17 18:14 UTC (permalink / raw)
To: Richard Henderson
Cc: Joseph S. Myers, Richard Henderson, GCC Patches, ro, dannysmith, ubizjak
On 10/17/2009 01:27 AM, Richard Henderson wrote:
> #define NAND(S,D) not D; and S,D
I'm not 100% positive, but shouldn't this be "not S; and S,D" (as in
"clear the given bits")?
Paolo
* Re: [CFT] i386 sync functions for PR 39677
2009-10-17 18:14 ` Paolo Bonzini
@ 2009-10-17 18:51 ` Uros Bizjak
2009-10-17 22:31 ` Paolo Bonzini
0 siblings, 1 reply; 11+ messages in thread
From: Uros Bizjak @ 2009-10-17 18:51 UTC (permalink / raw)
To: Paolo Bonzini
Cc: Richard Henderson, Joseph S. Myers, Richard Henderson,
GCC Patches, ro, dannysmith
On 10/17/2009 08:02 PM, Paolo Bonzini wrote:
> On 10/17/2009 01:27 AM, Richard Henderson wrote:
>> #define NAND(S,D) not D; and S,D
>
> I'm not 100% positive, but shouldn't this be "not S; and S,D" (as in
> "clear the given bits")?
This should be "and S,D; not D", since NAND stands for NOT AND.
Uros.
* Re: [CFT] i386 sync functions for PR 39677
2009-10-17 18:51 ` Uros Bizjak
@ 2009-10-17 22:31 ` Paolo Bonzini
2009-10-18 11:58 ` Uros Bizjak
0 siblings, 1 reply; 11+ messages in thread
From: Paolo Bonzini @ 2009-10-17 22:31 UTC (permalink / raw)
To: Uros Bizjak
Cc: Richard Henderson, Joseph S. Myers, Richard Henderson,
GCC Patches, ro, dannysmith
On 10/17/2009 08:22 PM, Uros Bizjak wrote:
> On 10/17/2009 08:02 PM, Paolo Bonzini wrote:
>> On 10/17/2009 01:27 AM, Richard Henderson wrote:
>>> #define NAND(S,D) not D; and S,D
>>
>> I'm not 100% positive, but shouldn't this be "not S; and S,D" (as in
>> "clear the given bits")?
>
> This should be "and S,D; not D", since NAND stands for NOT AND.
Not in sync builtins (nand is actually andn). You fixed that bug IIRC.
Paolo
* Re: [CFT] i386 sync functions for PR 39677
2009-10-17 22:31 ` Paolo Bonzini
@ 2009-10-18 11:58 ` Uros Bizjak
0 siblings, 0 replies; 11+ messages in thread
From: Uros Bizjak @ 2009-10-18 11:58 UTC (permalink / raw)
To: Paolo Bonzini
Cc: Richard Henderson, Joseph S. Myers, Richard Henderson,
GCC Patches, ro, dannysmith
On 10/17/2009 11:13 PM, Paolo Bonzini wrote:
>>>> #define NAND(S,D) not D; and S,D
>>>
>>> I'm not 100% positive, but shouldn't this be "not S; and S,D" (as in
>>> "clear the given bits")?
>>
>> This should be "and S,D; not D", since NAND stands for NOT AND.
>
>
> Not in sync builtins (nand is actually andn). You fixed that bug IIRC.
No! It is actually NAND, that is "not (a and b)". Please look at [1]; my
patch fixed the bug the other way around.
[1] http://gcc.gnu.org/ml/gcc-patches/2008-10/msg01214.html
Uros.
* Re: [CFT] i386 sync functions for PR 39677
2009-10-16 23:11 [CFT] i386 sync functions for PR 39677 Richard Henderson
2009-10-16 23:27 ` Joseph S. Myers
@ 2009-10-19 14:24 ` Rainer Orth
2009-10-24 6:51 ` Danny Smith
2 siblings, 0 replies; 11+ messages in thread
From: Rainer Orth @ 2009-10-19 14:24 UTC (permalink / raw)
To: Richard Henderson; +Cc: GCC Patches, dannysmith, ubizjak
Richard Henderson <rth@twiddle.net> writes:
> Second, I'd like to ask different port maintainers (cygwin and solaris
> particularly) to try to compile the code and report any portability
> problems. Use any relevant combinations of:
>
> -fpic
> -DHAVE_GAS_CFI_DIRECTIVE
> -DHAVE_GNU_INDIRECT_FUNCTION
I've tried the updated versions on Solaris 10/x86, with current mainline
configured to use /usr/sfw/bin/gas (gas 2.15).
It compiles without any additional options, but fails with -fpic:
/var/tmp//ccQxxfbf.s: Assembler messages:
/var/tmp//ccQxxfbf.s:755: Warning: setting incorrect section attributes for .text.__i686.get_pc_thunk.bx
/var/tmp//ccQxxfbf.s:762: Warning: setting incorrect section attributes for .text.__i686.get_pc_thunk.cx
Otherwise, all combinations of HAVE_GAS_CFI_DIRECTIVE and
HAVE_GNU_INDIRECT_FUNCTION work, although the compiler's auto-host.h
defines HAVE_GAS_CFI_DIRECTIVE as 0. The latter means that the test
should be changed from #ifdef HAVE_GAS_CFI_DIRECTIVE to #if
HAVE_GAS_CFI_DIRECTIVE.
Rainer
--
-----------------------------------------------------------------------------
Rainer Orth, Center for Biotechnology, Bielefeld University
* Re: [CFT] i386 sync functions for PR 39677
2009-10-16 23:11 [CFT] i386 sync functions for PR 39677 Richard Henderson
2009-10-16 23:27 ` Joseph S. Myers
2009-10-19 14:24 ` Rainer Orth
@ 2009-10-24 6:51 ` Danny Smith
2 siblings, 0 replies; 11+ messages in thread
From: Danny Smith @ 2009-10-24 6:51 UTC (permalink / raw)
To: Richard Henderson, GCC Patches; +Cc: ro, ubizjak
> Second, I'd like to ask different port maintainers (cygwin and solaris
> particularly) to try to compile the code and report any portability
> problems. Use any relevant combinations of:
>
Sorry for delay
On mingw32:
gcc -c sync.S
sync.S: Assembler messages:
sync.S:416: Error: unknown pseudo-op: `.local'
* Re: [CFT] i386 sync functions for PR 39677
2009-10-17 6:12 Ross Ridge
@ 2009-10-17 17:14 ` Richard Henderson
0 siblings, 0 replies; 11+ messages in thread
From: Richard Henderson @ 2009-10-17 17:14 UTC (permalink / raw)
To: Ross Ridge; +Cc: gcc-patches
On 10/16/2009 11:07 PM, Ross Ridge wrote:
> Is it really worth supporting older CPUs? It wouldn't seem
> unreasonable to me for GCC to require that the CPU support at least the
> CMPXCHG8B instruction (i.e. Pentium or better) to use the __sync* functions.
>
> Not that it really matters, but your CPUID logic seems a bit wrong.
> All Intel 80486 CPUs supported CMPXCHG, but only the later ones supported
> CPUID.
You're quite right -- I'd misremembered cmpxchg as being new in the 586.
And having slept on the problem, I no longer think we should get into
the spinlock business; there are too many ways for that to go wrong.
I'll simplify the cpuid check to look only for the AMD cpus with the
errata, and use the atomic instructions as needed. The resulting SIGILL
will be no different from what we get at present, when we simply compile
the libraries with -march=i586.
r~
* Re: [CFT] i386 sync functions for PR 39677
@ 2009-10-17 6:12 Ross Ridge
2009-10-17 17:14 ` Richard Henderson
0 siblings, 1 reply; 11+ messages in thread
From: Ross Ridge @ 2009-10-17 6:12 UTC (permalink / raw)
To: gcc-patches
Richard Henderson writes:
> Given that we now have a central location for handling
> atomic synchronization, handle 80386 and 80486 via spinlock.
> This means that we'll no longer have to inject -march=i586
> for compiling some of our runtime libraries.
Is it really worth supporting older CPUs? It wouldn't seem
unreasonable to me for GCC to require that the CPU support at least the
CMPXCHG8B instruction (i.e. Pentium or better) to use the __sync* functions.
Not that it really matters, but your CPUID logic seems a bit wrong.
All Intel 80486 CPUs supported CMPXCHG, but only the later ones supported
CPUID.
Ross Ridge