* [RFC][PATCH 0/2] Make sparcv8 work again on cas enabled hardware
@ 2016-11-01 15:08 Andreas Larsson
  2016-11-01 15:08 ` [RFC][PATCH 1/2] sparc32: Mark sendmsg and recvmsg system calls as unsupported Andreas Larsson
                   ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Andreas Larsson @ 2016-11-01 15:08 UTC (permalink / raw)
  To: GNU C Library
  Cc: Adhemerval Zanella, Carlos O'Donell, David Miller, Torvald Riegel, software

This patch series:

1) Fixes a sparcv8 bug that was introduced after the #error was added to
   sysdeps/sparc/sparc32/pthread_barrier_wait.c in 2.23. This fix stops
   incorrect usage of the sendmsg and recvmsg Linux system calls on sparcv8.

2) Makes use of the CASA compare and swap instruction for the atomic_*
   functions on sparcv8. The instruction is available on most LEON3 and
   LEON4 designs and is implied by -mcpu=leon3, but it is not part of the
   sparcv8 standard. To allow for easy kernel emulation on systems that
   lack the instruction, CASA is used for all writing atomic_* functions.
   This approach is discussed in thread [1].

Any comments are most welcome. The spin lock based sparcv8 semaphore
implementation is currently unchanged by this patchset, but I would say
that it should go as well. I will look into that too, but I have not
tested it yet.

[1] https://sourceware.org/ml/libc-alpha/2016-10/msg00344.html

Andreas Larsson (2):
  sparc32: Mark sendmsg and recvmsg system calls as unsupported
  sparc32: Use cas for atomic_* operations and use general
    pthread_barrier_wait

 sysdeps/sparc/sparc32/atomic-machine.h          | 228 +++++------------------
 sysdeps/sparc/sparc32/pthread_barrier_wait.c    |   1 -
 sysdeps/unix/sysv/linux/sparc/kernel-features.h |   4 +-
 3 files changed, 53 insertions(+), 180 deletions(-)
 delete mode 100644 sysdeps/sparc/sparc32/pthread_barrier_wait.c

^ permalink raw reply	[flat|nested] 30+ messages in thread
* [RFC][PATCH 1/2] sparc32: Mark sendmsg and recvmsg system calls as unsupported
  2016-11-01 15:08 [RFC][PATCH 0/2] Make sparcv8 work again on cas enabled hardware Andreas Larsson
@ 2016-11-01 15:08 ` Andreas Larsson
  2016-11-01 17:28   ` Adhemerval Zanella
  2016-11-04 18:36   ` David Miller
  2016-11-01 15:08 ` [RFC][PATCH 2/2] sparc32: Use cas for atomic_* operations and use general pthread_barrier_wait Andreas Larsson
  2016-11-01 16:00 ` [RFC][PATCH 0/2] Make sparcv8 work again on cas enabled hardware Torvald Riegel
  2 siblings, 2 replies; 30+ messages in thread
From: Andreas Larsson @ 2016-11-01 15:08 UTC (permalink / raw)
  To: GNU C Library
  Cc: Adhemerval Zanella, Carlos O'Donell, David Miller, Torvald Riegel, software

This fixes a bug introduced by abf29edd4a3918 that missed fixing up
sparc32 in the change.

	* sysdeps/unix/sysv/linux/sparc/kernel-features.h: Undefine
	__ASSUME_SENDMSG_SYSCALL and __ASSUME_RECVMSG_SYSCALL for 32-bit
	sparcv8
---
 sysdeps/unix/sysv/linux/sparc/kernel-features.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/sysdeps/unix/sysv/linux/sparc/kernel-features.h b/sysdeps/unix/sysv/linux/sparc/kernel-features.h
index 69c9c7c..db3f5cd 100644
--- a/sysdeps/unix/sysv/linux/sparc/kernel-features.h
+++ b/sysdeps/unix/sysv/linux/sparc/kernel-features.h
@@ -32,8 +32,10 @@
 #include_next <kernel-features.h>
 
 /* 32-bit SPARC kernels do not support
-   futex_atomic_cmpxchg_inatomic. */
+   futex_atomic_cmpxchg_inatomic or sendmsg/recvmsg. */
 #if !defined __arch64__ && !defined __sparc_v9__
 # undef __ASSUME_REQUEUE_PI
 # undef __ASSUME_SET_ROBUST_LIST
+# undef __ASSUME_SENDMSG_SYSCALL
+# undef __ASSUME_RECVMSG_SYSCALL
 #endif
* Re: [RFC][PATCH 1/2] sparc32: Mark sendmsg and recvmsg system calls as unsupported
  2016-11-01 15:08 ` [RFC][PATCH 1/2] sparc32: Mark sendmsg and recvmsg system calls as unsupported Andreas Larsson
@ 2016-11-01 17:28   ` Adhemerval Zanella
  2016-11-02 11:38     ` Andreas Larsson
  2016-11-04 18:36   ` David Miller
  1 sibling, 1 reply; 30+ messages in thread
From: Adhemerval Zanella @ 2016-11-01 17:28 UTC (permalink / raw)
  To: Andreas Larsson, GNU C Library
  Cc: Carlos O'Donell, David Miller, Torvald Riegel, software

On 01/11/2016 13:07, Andreas Larsson wrote:
> This fixes a bug introduced by abf29edd4a3918 that missed fixing up
> sparc32 in the change.
>
> 	* sysdeps/unix/sysv/linux/sparc/kernel-features.h: Undefine
> 	__ASSUME_SENDMSG_SYSCALL and __ASSUME_RECVMSG_SYSCALL for 32-bit
> 	sparcv8
> ---
>  sysdeps/unix/sysv/linux/sparc/kernel-features.h | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/sysdeps/unix/sysv/linux/sparc/kernel-features.h b/sysdeps/unix/sysv/linux/sparc/kernel-features.h
> index 69c9c7c..db3f5cd 100644
> --- a/sysdeps/unix/sysv/linux/sparc/kernel-features.h
> +++ b/sysdeps/unix/sysv/linux/sparc/kernel-features.h
> @@ -32,8 +32,10 @@
>  #include_next <kernel-features.h>
>
>  /* 32-bit SPARC kernels do not support
> -   futex_atomic_cmpxchg_inatomic. */
> +   futex_atomic_cmpxchg_inatomic or sendmsg/recvmsg. */
>  #if !defined __arch64__ && !defined __sparc_v9__
>  # undef __ASSUME_REQUEUE_PI
>  # undef __ASSUME_SET_ROBUST_LIST
> +# undef __ASSUME_SENDMSG_SYSCALL
> +# undef __ASSUME_RECVMSG_SYSCALL
>  #endif
>

At least the kernel headers for Linux 3.2 on sparc define both __NR_recvmsg
and __NR_sendmsg. Also, judging from 'arch/sparc/kernel/sys32.S' in 3.2, it
seems that sparc32 has both recvmsg and sendmsg wired up. Am I missing
something here?
* Re: [RFC][PATCH 1/2] sparc32: Mark sendmsg and recvmsg system calls as unsupported 2016-11-01 17:28 ` Adhemerval Zanella @ 2016-11-02 11:38 ` Andreas Larsson 2016-11-02 12:49 ` Adhemerval Zanella 0 siblings, 1 reply; 30+ messages in thread From: Andreas Larsson @ 2016-11-02 11:38 UTC (permalink / raw) To: Adhemerval Zanella, GNU C Library Cc: Carlos O'Donell, David Miller, Torvald Riegel, software On 2016-11-01 18:28, Adhemerval Zanella wrote: > > > On 01/11/2016 13:07, Andreas Larsson wrote: >> This fixes a bug introduced by abf29edd4a3918 that missed fixing up >> sparc32 in the change. >> >> * sysdeps/unix/sysv/linux/sparc/kernel-features.h: Undefine >> __ASSUME_SENDMSG_SYSCALL and __ASSUME_RECVMSG_SYSCALL for 32-bit >> sparcv8 >> --- >> sysdeps/unix/sysv/linux/sparc/kernel-features.h | 4 +++- >> 1 file changed, 3 insertions(+), 1 deletion(-) >> >> diff --git a/sysdeps/unix/sysv/linux/sparc/kernel-features.h b/sysdeps/unix/sysv/linux/sparc/kernel-features.h >> index 69c9c7c..db3f5cd 100644 >> --- a/sysdeps/unix/sysv/linux/sparc/kernel-features.h >> +++ b/sysdeps/unix/sysv/linux/sparc/kernel-features.h >> @@ -32,8 +32,10 @@ >> #include_next <kernel-features.h> >> >> /* 32-bit SPARC kernels do not support >> - futex_atomic_cmpxchg_inatomic. */ >> + futex_atomic_cmpxchg_inatomic or sendmsg/recvmsg. */ >> #if !defined __arch64__ && !defined __sparc_v9__ >> # undef __ASSUME_REQUEUE_PI >> # undef __ASSUME_SET_ROBUST_LIST >> +# undef __ASSUME_SENDMSG_SYSCALL >> +# undef __ASSUME_RECVMSG_SYSCALL >> #endif >> > > At least the kernel headers for Linux 3.2 on sparc defined both __NR_recvmsg > and __NR_sendmsg. Also, checking 'arch/sparc/kernel/sys32.S' on 3.2 does > seems that sparc32 have both recvmsg and sendmsg wire-up. Am I missing > something here? 
[resent to correct topic - I should apparently stay away from my mail client today]

Linux kernel commit 8b30ca73b7cc7f2177cfc4e8274d2ebdba328cd5 added
sys_sendmsg and sys_recvmsg to the sys_call_table in
arch/sparc/kernel/systbls_32.S. So sparc32 kernels prior to Linux 4.4 do
not support them as straight up system calls.

Best regards,
Andreas Larsson
* Re: [RFC][PATCH 1/2] sparc32: Mark sendmsg and recvmsg system calls as unsupported 2016-11-02 11:38 ` Andreas Larsson @ 2016-11-02 12:49 ` Adhemerval Zanella 0 siblings, 0 replies; 30+ messages in thread From: Adhemerval Zanella @ 2016-11-02 12:49 UTC (permalink / raw) To: Andreas Larsson, GNU C Library Cc: Carlos O'Donell, David Miller, Torvald Riegel, software On 02/11/2016 09:35, Andreas Larsson wrote: > On 2016-11-01 18:28, Adhemerval Zanella wrote: >> >> >> On 01/11/2016 13:07, Andreas Larsson wrote: >>> This fixes a bug introduced by abf29edd4a3918 that missed fixing up >>> sparc32 in the change. >>> >>> * sysdeps/unix/sysv/linux/sparc/kernel-features.h: Undefine >>> __ASSUME_SENDMSG_SYSCALL and __ASSUME_RECVMSG_SYSCALL for 32-bit >>> sparcv8 >>> --- >>> sysdeps/unix/sysv/linux/sparc/kernel-features.h | 4 +++- >>> 1 file changed, 3 insertions(+), 1 deletion(-) >>> >>> diff --git a/sysdeps/unix/sysv/linux/sparc/kernel-features.h b/sysdeps/unix/sysv/linux/sparc/kernel-features.h >>> index 69c9c7c..db3f5cd 100644 >>> --- a/sysdeps/unix/sysv/linux/sparc/kernel-features.h >>> +++ b/sysdeps/unix/sysv/linux/sparc/kernel-features.h >>> @@ -32,8 +32,10 @@ >>> #include_next <kernel-features.h> >>> >>> /* 32-bit SPARC kernels do not support >>> - futex_atomic_cmpxchg_inatomic. */ >>> + futex_atomic_cmpxchg_inatomic or sendmsg/recvmsg. */ >>> #if !defined __arch64__ && !defined __sparc_v9__ >>> # undef __ASSUME_REQUEUE_PI >>> # undef __ASSUME_SET_ROBUST_LIST >>> +# undef __ASSUME_SENDMSG_SYSCALL >>> +# undef __ASSUME_RECVMSG_SYSCALL >>> #endif >>> >> >> At least the kernel headers for Linux 3.2 on sparc defined both __NR_recvmsg >> and __NR_sendmsg. Also, checking 'arch/sparc/kernel/sys32.S' on 3.2 does >> seems that sparc32 have both recvmsg and sendmsg wire-up. Am I missing >> something here? 
>
> [resent to correct topic - I should apparently stay away from my mail client today]
>
> Linux kernel commit 8b30ca73b7cc7f2177cfc4e8274d2ebdba328cd5 added
> sys_sendmsg and sys_recvmsg to the sys_call_table in
> arch/sparc/kernel/systbls_32.S. So sparc32 kernels prior to Linux 4.4 do
> not support them as straight up system calls.
>
> Best regards,
> Andreas Larsson

Right, so it is advertised to userland through the __SYS/__NR macros even
though it is not really supported? Even so, I think a better solution
would be:

/* All direct socketcalls are available only with kernel 4.4. */
#if __LINUX_KERNEL_VERSION < 0x040400
# undef __ASSUME_SENDMSG_SYSCALL
# undef __ASSUME_RECVMSG_SYSCALL
#endif
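For reference, the comparison against 0x040400 in the suggestion above relies on how glibc packs a kernel version into one integer: the major number sits above bit 16, and the minor and patch numbers take one byte each. A minimal sketch of that encoding follows. The function names are made up for illustration and are not glibc interfaces, and the 4.4 threshold is simply the one given in this thread.

```c
#include <assert.h>

/* Pack major.minor.patch into one integer, one byte each for minor and
   patch, so that 4.4.0 becomes 0x040400.  Hypothetical helper.  */
static unsigned int
kernel_version_code (unsigned int major, unsigned int minor,
                     unsigned int patch)
{
  assert (minor < 256 && patch < 256);
  return (major << 16) | (minor << 8) | patch;
}

/* Nonzero if a build whose minimum supported kernel is MIN_VERSION could
   assume direct sendmsg/recvmsg system calls on sparc32.  The 4.4
   threshold is taken from the thread; the function is hypothetical.  */
static int
may_assume_direct_msg_syscalls (unsigned int min_version)
{
  return min_version >= kernel_version_code (4, 4, 0);
}
```

With this encoding a plain integer comparison orders versions correctly, which is why a single preprocessor test is enough in kernel-features.h.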
* Re: [RFC][PATCH 1/2] sparc32: Mark sendmsg and recvmsg system calls as unsupported
  2016-11-01 15:08 ` [RFC][PATCH 1/2] sparc32: Mark sendmsg and recvmsg system calls as unsupported Andreas Larsson
  2016-11-01 17:28   ` Adhemerval Zanella
@ 2016-11-04 18:36   ` David Miller
  1 sibling, 0 replies; 30+ messages in thread
From: David Miller @ 2016-11-04 18:36 UTC (permalink / raw)
  To: andreas; +Cc: libc-alpha, adhemerval.zanella, carlos, triegel, software

From: Andreas Larsson <andreas@gaisler.com>
Date: Tue, 1 Nov 2016 16:07:46 +0100

> @@ -32,8 +32,10 @@
>  #include_next <kernel-features.h>
>
>  /* 32-bit SPARC kernels do not support
> -   futex_atomic_cmpxchg_inatomic. */
> +   futex_atomic_cmpxchg_inatomic or sendmsg/recvmsg. */
>  #if !defined __arch64__ && !defined __sparc_v9__
>  # undef __ASSUME_REQUEUE_PI
>  # undef __ASSUME_SET_ROBUST_LIST
> +# undef __ASSUME_SENDMSG_SYSCALL
> +# undef __ASSUME_RECVMSG_SYSCALL
>  #endif

As mentioned elsewhere, these are available on 4.4 and later kernels.
* [RFC][PATCH 2/2] sparc32: Use cas for atomic_* operations and use general pthread_barrier_wait 2016-11-01 15:08 [RFC][PATCH 0/2] Make sparcv8 work again on cas enabled hardware Andreas Larsson 2016-11-01 15:08 ` [RFC][PATCH 1/2] sparc32: Mark sendmsg and recvmsg system calls as unsupported Andreas Larsson @ 2016-11-01 15:08 ` Andreas Larsson 2016-11-04 18:37 ` David Miller 2016-11-01 16:00 ` [RFC][PATCH 0/2] Make sparcv8 work again on cas enabled hardware Torvald Riegel 2 siblings, 1 reply; 30+ messages in thread From: Andreas Larsson @ 2016-11-01 15:08 UTC (permalink / raw) To: GNU C Library Cc: Adhemerval Zanella, Carlos O'Donell, David Miller, Torvald Riegel, software This uses the CASA compare and swap with user space data access ASI 0xa that is present on many LEON3 and LEON4 systems and that is implied by gcc's -mcpu=leon3. The CASA instruction is used not only for atomic compare and exchange functions, but also atomic exchange functions and atomic write functions. This is to allow the OS kernel to emulate that instruction on systems where it is missing and to get atomicity between all atomic writing functions without having to resort to stop all CPU:s in an SMP system. * sysdeps/sparc/sparc32/atomic-machine.h: Use CASA instruction instead of spinlocks for sparcv8. * sysdeps/sparc/sparc32/pthread_barrier_wait.c: Remove file --- sysdeps/sparc/sparc32/atomic-machine.h | 228 ++++++-------------------- sysdeps/sparc/sparc32/pthread_barrier_wait.c | 1 - 2 files changed, 50 insertions(+), 179 deletions(-) delete mode 100644 sysdeps/sparc/sparc32/pthread_barrier_wait.c diff --git a/sysdeps/sparc/sparc32/atomic-machine.h b/sysdeps/sparc/sparc32/atomic-machine.h index d6e68f9..818f4e2 100644 --- a/sysdeps/sparc/sparc32/atomic-machine.h +++ b/sysdeps/sparc/sparc32/atomic-machine.h @@ -50,9 +50,8 @@ typedef uintmax_t uatomic_max_t; #define __HAVE_64B_ATOMICS 0 #define USE_ATOMIC_COMPILER_BUILTINS 0 - -/* We have no compare and swap, just test and set. 
- The following implementation contends on 64 global locks +/* We might have no hardware compare and swap, just test and set. + The following __sparc32_atomic implementation contends on 64 global locks per library and assumes no variable will be accessed using atomic.h macros from two different libraries. */ @@ -110,6 +109,30 @@ volatile unsigned char __sparc32_atomic_locks[64] } \ while (0) +#define __arch_compare_and_exchange_val_8_acq(mem, newval, oldval) \ + (abort (), (__typeof (*mem)) 0) + +#define __arch_compare_and_exchange_val_16_acq(mem, newval, oldval) \ + (abort (), (__typeof (*mem)) 0) + +#define __arch_compare_and_exchange_val_64_acq(mem, newval, oldval) \ + (abort (), (__typeof (*mem)) 0) + +#define __v7_compare_and_exchange_val_32_acq(mem, newval, oldval) \ +({union { __typeof (oldval) a; uint32_t v; } oldval_arg = { .a = (oldval) }; \ + union { __typeof (newval) a; uint32_t v; } newval_arg = { .a = (newval) }; \ + register uint32_t __acev_tmp __asm ("%g6"); \ + register __typeof (mem) __acev_mem __asm ("%g1") = (mem); \ + register uint32_t __acev_oldval __asm ("%g5"); \ + __acev_tmp = newval_arg.v; \ + __acev_oldval = oldval_arg.v; \ + /* .word 0xcde04145 is casa [%g1] 0xa, %g5, %g6. Can't use casa here \ + though because assembler will not allow it for plain V8 arch. */ \ + __asm __volatile (".word 0xcde04145" \ + : "+r" (__acev_tmp), "=m" (*__acev_mem) \ + : "r" (__acev_oldval), "m" (*__acev_mem), \ + "r" (__acev_mem) : "memory"); \ + (__typeof (oldval)) __acev_tmp; }) #ifndef SHARED # define __v9_compare_and_exchange_val_32_acq(mem, newval, oldval) \ @@ -127,82 +150,31 @@ volatile unsigned char __sparc32_atomic_locks[64] : "r" (__acev_oldval), "m" (*__acev_mem), \ "r" (__acev_mem) : "memory"); \ (__typeof (oldval)) __acev_tmp; }) -#endif -/* The only basic operation needed is compare and exchange. 
*/ -#define __v7_compare_and_exchange_val_acq(mem, newval, oldval) \ - ({ __typeof (mem) __acev_memp = (mem); \ - __typeof (*mem) __acev_ret; \ - __typeof (*mem) __acev_newval = (newval); \ - \ - __sparc32_atomic_do_lock (__acev_memp); \ - __acev_ret = *__acev_memp; \ - if (__acev_ret == (oldval)) \ - *__acev_memp = __acev_newval; \ - __sparc32_atomic_do_unlock (__acev_memp); \ - __acev_ret; }) - -#define __v7_compare_and_exchange_bool_acq(mem, newval, oldval) \ - ({ __typeof (mem) __aceb_memp = (mem); \ - int __aceb_ret; \ - __typeof (*mem) __aceb_newval = (newval); \ - \ - __sparc32_atomic_do_lock (__aceb_memp); \ - __aceb_ret = 0; \ - if (*__aceb_memp == (oldval)) \ - *__aceb_memp = __aceb_newval; \ - else \ - __aceb_ret = 1; \ - __sparc32_atomic_do_unlock (__aceb_memp); \ - __aceb_ret; }) - -#define __v7_exchange_acq(mem, newval) \ - ({ __typeof (mem) __acev_memp = (mem); \ - __typeof (*mem) __acev_ret; \ - __typeof (*mem) __acev_newval = (newval); \ - \ - __sparc32_atomic_do_lock (__acev_memp); \ - __acev_ret = *__acev_memp; \ - *__acev_memp = __acev_newval; \ - __sparc32_atomic_do_unlock (__acev_memp); \ - __acev_ret; }) - -#define __v7_exchange_and_add(mem, value) \ - ({ __typeof (mem) __acev_memp = (mem); \ - __typeof (*mem) __acev_ret; \ - \ - __sparc32_atomic_do_lock (__acev_memp); \ - __acev_ret = *__acev_memp; \ - *__acev_memp = __acev_ret + (value); \ - __sparc32_atomic_do_unlock (__acev_memp); \ - __acev_ret; }) - -/* Special versions, which guarantee that top 8 bits of all values - are cleared and use those bits as the ldstub lock. 
*/ -#define __v7_compare_and_exchange_val_24_acq(mem, newval, oldval) \ - ({ __typeof (mem) __acev_memp = (mem); \ - __typeof (*mem) __acev_ret; \ - __typeof (*mem) __acev_newval = (newval); \ - \ - __sparc32_atomic_do_lock24 (__acev_memp); \ - __acev_ret = *__acev_memp & 0xffffff; \ - if (__acev_ret == (oldval)) \ - *__acev_memp = __acev_newval; \ +# define __arch_compare_and_exchange_val_32_acq(mem, newval, oldval) \ + ({ __typeof (oldval) __acev_wret; \ + if (__atomic_is_v9) \ + __acev_wret \ + = __v9_compare_and_exchange_val_32_acq (mem, newval, \ + oldval); \ else \ - __sparc32_atomic_do_unlock24 (__acev_memp); \ - __asm __volatile ("" ::: "memory"); \ - __acev_ret; }) - -#define __v7_exchange_24_rel(mem, newval) \ - ({ __typeof (mem) __acev_memp = (mem); \ - __typeof (*mem) __acev_ret; \ - __typeof (*mem) __acev_newval = (newval); \ - \ - __sparc32_atomic_do_lock24 (__acev_memp); \ - __acev_ret = *__acev_memp & 0xffffff; \ - *__acev_memp = __acev_newval; \ - __asm __volatile ("" ::: "memory"); \ - __acev_ret; }) + __acev_wret \ + = __v7_compare_and_exchange_val_32_acq (mem, newval, \ + oldval); \ + __acev_wret; }) +#else +# define __arch_compare_and_exchange_val_32_acq(mem, newval, oldval) \ + __v7_compare_and_exchange_val_32_acq(mem, newval, oldval) +#endif + +#define atomic_compare_and_exchange_val_24_acq(mem, newval, oldval) \ + atomic_compare_and_exchange_val_acq (mem, newval, oldval) + +#define atomic_exchange_24_rel(mem, newval) \ + atomic_exchange_rel (mem, newval) + +#define atomic_store_relaxed(mem, newval) \ + do { (void) atomic_exchange_rel(mem, newval); } while (0) #ifdef SHARED @@ -210,30 +182,6 @@ volatile unsigned char __sparc32_atomic_locks[64] used on pre-v9 CPU. 
*/ # define __atomic_is_v9 0 -# define atomic_compare_and_exchange_val_acq(mem, newval, oldval) \ - __v7_compare_and_exchange_val_acq (mem, newval, oldval) - -# define atomic_compare_and_exchange_bool_acq(mem, newval, oldval) \ - __v7_compare_and_exchange_bool_acq (mem, newval, oldval) - -# define atomic_exchange_acq(mem, newval) \ - __v7_exchange_acq (mem, newval) - -# define atomic_exchange_and_add(mem, value) \ - __v7_exchange_and_add (mem, value) - -# define atomic_compare_and_exchange_val_24_acq(mem, newval, oldval) \ - ({ \ - if (sizeof (*mem) != 4) \ - abort (); \ - __v7_compare_and_exchange_val_24_acq (mem, newval, oldval); }) - -# define atomic_exchange_24_rel(mem, newval) \ - ({ \ - if (sizeof (*mem) != 4) \ - abort (); \ - __v7_exchange_24_rel (mem, newval); }) - # define atomic_full_barrier() __asm ("" ::: "memory") # define atomic_read_barrier() atomic_full_barrier () # define atomic_write_barrier() atomic_full_barrier () @@ -250,82 +198,6 @@ extern uint64_t _dl_hwcap __attribute__((weak)); (__builtin_expect (&_dl_hwcap != 0, 1) \ && __builtin_expect (_dl_hwcap & HWCAP_SPARC_V9, HWCAP_SPARC_V9)) -# define atomic_compare_and_exchange_val_acq(mem, newval, oldval) \ - ({ \ - __typeof (*mem) __acev_wret; \ - if (sizeof (*mem) != 4) \ - abort (); \ - if (__atomic_is_v9) \ - __acev_wret \ - = __v9_compare_and_exchange_val_32_acq (mem, newval, oldval);\ - else \ - __acev_wret \ - = __v7_compare_and_exchange_val_acq (mem, newval, oldval); \ - __acev_wret; }) - -# define atomic_compare_and_exchange_bool_acq(mem, newval, oldval) \ - ({ \ - int __acev_wret; \ - if (sizeof (*mem) != 4) \ - abort (); \ - if (__atomic_is_v9) \ - { \ - __typeof (oldval) __acev_woldval = (oldval); \ - __acev_wret \ - = __v9_compare_and_exchange_val_32_acq (mem, newval, \ - __acev_woldval) \ - != __acev_woldval; \ - } \ - else \ - __acev_wret \ - = __v7_compare_and_exchange_bool_acq (mem, newval, oldval); \ - __acev_wret; }) - -# define atomic_exchange_rel(mem, newval) \ - ({ \ - 
__typeof (*mem) __acev_wret; \ - if (sizeof (*mem) != 4) \ - abort (); \ - if (__atomic_is_v9) \ - { \ - __typeof (mem) __acev_wmemp = (mem); \ - __typeof (*(mem)) __acev_wval = (newval); \ - do \ - __acev_wret = *__acev_wmemp; \ - while (__builtin_expect \ - (__v9_compare_and_exchange_val_32_acq (__acev_wmemp,\ - __acev_wval, \ - __acev_wret) \ - != __acev_wret, 0)); \ - } \ - else \ - __acev_wret = __v7_exchange_acq (mem, newval); \ - __acev_wret; }) - -# define atomic_compare_and_exchange_val_24_acq(mem, newval, oldval) \ - ({ \ - __typeof (*mem) __acev_wret; \ - if (sizeof (*mem) != 4) \ - abort (); \ - if (__atomic_is_v9) \ - __acev_wret \ - = __v9_compare_and_exchange_val_32_acq (mem, newval, oldval);\ - else \ - __acev_wret \ - = __v7_compare_and_exchange_val_24_acq (mem, newval, oldval);\ - __acev_wret; }) - -# define atomic_exchange_24_rel(mem, newval) \ - ({ \ - __typeof (*mem) __acev_w24ret; \ - if (sizeof (*mem) != 4) \ - abort (); \ - if (__atomic_is_v9) \ - __acev_w24ret = atomic_exchange_rel (mem, newval); \ - else \ - __acev_w24ret = __v7_exchange_24_rel (mem, newval); \ - __acev_w24ret; }) - #define atomic_full_barrier() \ do { \ if (__atomic_is_v9) \ @@ -355,6 +227,6 @@ extern uint64_t _dl_hwcap __attribute__((weak)); #endif -#include <sysdep.h> +#include <sys/auxv.h> #endif /* atomic-machine.h */ diff --git a/sysdeps/sparc/sparc32/pthread_barrier_wait.c b/sysdeps/sparc/sparc32/pthread_barrier_wait.c deleted file mode 100644 index e5ef911..0000000 --- a/sysdeps/sparc/sparc32/pthread_barrier_wait.c +++ /dev/null @@ -1 +0,0 @@ -#error No support for pthread barriers on pre-v9 sparc. ^ permalink raw reply [flat|nested] 30+ messages in thread
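The commit message of this patch says that CASA backs atomic exchange and atomic store as well as compare-and-exchange. A minimal sketch of that layering is given below. It is an illustration only: the GCC builtin __atomic_compare_exchange_n stands in for the hand-encoded CASA instruction, and every ex_ name is hypothetical rather than part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for CASA: returns the value found in *mem and stores newval
   only if that value equals oldval.  The builtin replaces the
   hand-encoded instruction for the purpose of this sketch.  */
static uint32_t
ex_cas32 (volatile uint32_t *mem, uint32_t newval, uint32_t oldval)
{
  /* The operand must be a naturally aligned 32-bit word.  */
  assert (((uintptr_t) mem & 3) == 0);
  /* On failure the builtin writes the observed value into oldval; on
     success oldval already equals the observed value.  */
  __atomic_compare_exchange_n (mem, &oldval, newval, 0,
                               __ATOMIC_ACQUIRE, __ATOMIC_ACQUIRE);
  return oldval;
}

/* Exchange built on CAS: retry until the CAS observes the same value
   that was read just before it.  */
static uint32_t
ex_exchange32 (volatile uint32_t *mem, uint32_t newval)
{
  uint32_t seen;
  do
    seen = __atomic_load_n (mem, __ATOMIC_RELAXED);
  while (ex_cas32 (mem, newval, seen) != seen);
  return seen;
}

/* Store built on exchange, so that every write goes through CAS.  */
static void
ex_store32 (volatile uint32_t *mem, uint32_t newval)
{
  (void) ex_exchange32 (mem, newval);
}
```

Routing the store through the CAS path costs a loop on every write, but it means a kernel that emulates the instruction only has to serialize one kind of access.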
* Re: [RFC][PATCH 2/2] sparc32: Use cas for atomic_* operations and use general pthread_barrier_wait 2016-11-01 15:08 ` [RFC][PATCH 2/2] sparc32: Use cas for atomic_* operations and use general pthread_barrier_wait Andreas Larsson @ 2016-11-04 18:37 ` David Miller 2016-11-04 18:44 ` David Miller 0 siblings, 1 reply; 30+ messages in thread From: David Miller @ 2016-11-04 18:37 UTC (permalink / raw) To: andreas; +Cc: libc-alpha, adhemerval.zanella, carlos, triegel, software From: Andreas Larsson <andreas@gaisler.com> Date: Tue, 1 Nov 2016 16:07:47 +0100 > This uses the CASA compare and swap with user space data access ASI 0xa > that is present on many LEON3 and LEON4 systems and that is implied by > gcc's -mcpu=leon3. > > The CASA instruction is used not only for atomic compare and exchange > functions, but also atomic exchange functions and atomic write > functions. This is to allow the OS kernel to emulate that instruction on > systems where it is missing and to get atomicity between all atomic > writing functions without having to resort to stop all CPU:s in an SMP > system. Ok, this is fine. I'll work on the instruction emulation code for the kernel side. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC][PATCH 2/2] sparc32: Use cas for atomic_* operations and use general pthread_barrier_wait
  2016-11-04 18:37   ` David Miller
@ 2016-11-04 18:44     ` David Miller
  0 siblings, 0 replies; 30+ messages in thread
From: David Miller @ 2016-11-04 18:44 UTC (permalink / raw)
  To: andreas; +Cc: libc-alpha, adhemerval.zanella, carlos, triegel, software

From: David Miller <davem@davemloft.net>
Date: Fri, 04 Nov 2016 14:37:10 -0400 (EDT)

> From: Andreas Larsson <andreas@gaisler.com>
> Date: Tue, 1 Nov 2016 16:07:47 +0100
>
>> This uses the CASA compare and swap with user space data access ASI 0xa
>> that is present on many LEON3 and LEON4 systems and that is implied by
>> gcc's -mcpu=leon3.
>>
>> The CASA instruction is used not only for atomic compare and exchange
>> functions, but also atomic exchange functions and atomic write
>> functions. This is to allow the OS kernel to emulate that instruction on
>> systems where it is missing and to get atomicity between all atomic
>> writing functions without having to resort to stop all CPU:s in an SMP
>> system.
>
> Ok, this is fine. I'll work on the instruction emulation code for the
> kernel side.

Actually, this might cause some problems.

We don't always have access to a proper _dl_hwcap value, which means
that we will sometimes emit the LEON CAS when running on a v9 chip,
where it will not work properly.

I need to think about this a bit more.

Probably what we need to do is have three cases:

1) We explicitly know we are on a v9 chip via _dl_hwcap, emit v9 CAS

2) We explicitly know we are on a v8 LEON chip via _dl_hwcap, emit LEON CAS

3) Else, we emit a special trap instruction which the kernel fixes up

I think this is necessary because we cannot attempt to execute either
of the two CAS variants on the other kind of CPU.
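The three cases listed above amount to a small decision function. The sketch below is only an illustration of that logic. Both EX_HWCAP_* bit values and all names are hypothetical placeholders; in particular, this thread does not establish that a hwcap bit announcing the LEON CAS exists.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical capability bits, chosen for this sketch only.  */
#define EX_HWCAP_V9        0x10u
#define EX_HWCAP_LEON_CAS  0x20u

enum ex_cas_strategy
{
  EX_CAS_V9,           /* case 1: known v9 chip, emit the v9 CAS */
  EX_CAS_LEON,         /* case 2: known LEON chip, emit the LEON CAS */
  EX_CAS_KERNEL_TRAP   /* case 3: unknown, trap and let the kernel fix up */
};

/* HWCAP_AVAILABLE says whether a usable hwcap value was obtained at
   all; without one, neither instruction may be assumed.  */
static enum ex_cas_strategy
ex_select_cas_strategy (int hwcap_available, uint64_t hwcap)
{
  /* Assumption of this sketch: a chip never reports both bits.  */
  assert (!((hwcap & EX_HWCAP_V9) && (hwcap & EX_HWCAP_LEON_CAS)));

  if (hwcap_available && (hwcap & EX_HWCAP_V9))
    return EX_CAS_V9;
  if (hwcap_available && (hwcap & EX_HWCAP_LEON_CAS))
    return EX_CAS_LEON;
  return EX_CAS_KERNEL_TRAP;
}
```

The point of the third case is that the fallback never guesses: when the hwcap value is missing, neither CAS encoding is executed directly.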
* Re: [RFC][PATCH 0/2] Make sparcv8 work again on cas enabled hardware 2016-11-01 15:08 [RFC][PATCH 0/2] Make sparcv8 work again on cas enabled hardware Andreas Larsson 2016-11-01 15:08 ` [RFC][PATCH 1/2] sparc32: Mark sendmsg and recvmsg system calls as unsupported Andreas Larsson 2016-11-01 15:08 ` [RFC][PATCH 2/2] sparc32: Use cas for atomic_* operations and use general pthread_barrier_wait Andreas Larsson @ 2016-11-01 16:00 ` Torvald Riegel 2016-11-01 16:09 ` David Miller 2 siblings, 1 reply; 30+ messages in thread From: Torvald Riegel @ 2016-11-01 16:00 UTC (permalink / raw) To: Andreas Larsson Cc: GNU C Library, Adhemerval Zanella, Carlos O'Donell, David Miller, software On Tue, 2016-11-01 at 16:07 +0100, Andreas Larsson wrote: > This patch series: > > 1) Fixes a sparcv8 bug introduced since the #error was added to > sysdeps/sparc/sparc32/pthread_barrier_wait.c in 2.23. This fix stops > incorrect usage of sendmsg and recvmsg Linux system calls for sparcv8. I don't know whether the changes this patch applies make sense, but otherwise the patch looks okay to me. This could also be a separate patch I think. Do you have a copyright assignment in place for glibc? > 2) Makes use of the CASA compare and swap instruction for atomic_* > functions sparcv8, that is available for most LEON3 and LEON4 designs > and implied by -mcpu=leon3, but not part of the sparcv8 standard. To > allow for easy kernel emulation on systems that lack the instruction, > the CASA instruction is used for all writing atomic_* functions. This > approach is discussed in thread [1]. Before I can review the patch in detail, I think there are a few high-level things that need to be taken care of. We need to document what sparc32 systems we actually support. Your patch looks like for now, CAS is required. If this is true, this should be documented (e.g., I guess this would need a NEWS item too). Is there a way to test this new requirement at build time, or is this just a runtime requirement? 
If this is a build-time requirement, is the assembler actually already
aware that it can use a CAS (which would remove the need to hand-code
the instruction)?

If we require CAS, the 24-bit exchanges should just be removed
altogether; it seems the only remaining user is the low-level lock.

I don't think you need to make all modifying atomic accesses use a CAS
underneath, at least if we require CAS. If we will also allow for
kernel emulation in the future, it would also be possible to check
whether emulation is required and only then route all modifying accesses
through the kernel.

In the future, we will most likely require 8-bit atomic loads and stores
(perhaps we can do without an 8-bit CAS, though doing that with a 32-bit
CAS is possible).

I suggest also looking at whether you can use the new __atomic builtins
(i.e., as noted by the USE_ATOMIC_COMPILER_BUILTINS define). On the sparc
systems that we want to support, will GCC do the right thing for CAS
etc. (e.g., if the requirement is to build with -mcpu=leon3)? In
particular, will it either always emit a CAS instruction or will
libatomic use a CAS instruction? If so, you could also just rely on the
new atomic builtins and define the legacy atomic operations (i.e., those
not in C11 style) on top of those.

> Any comments are most welcome. The spin lock based sparcv8 semaphore
> implementation is currently unchanged by this patchset, but I would say
> that that should go as well.

Agreed.
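The last suggestion, defining the legacy operations on top of the compiler builtins, could look roughly like the sketch below. It assumes that GCC emits a real CAS for the target, which is exactly the open question raised above. The ex_ macro names are hypothetical and are not glibc's; the size check mirrors the 4-byte restriction of the existing sparc32 macros.

```c
#include <assert.h>
#include <stdint.h>

/* Legacy-style "val" CAS: evaluates to the value found in *mem.  */
#define ex_compare_and_exchange_val_acq(mem, newval, oldval)              \
  ({ __typeof (*(mem)) ex_expected_ = (oldval);                           \
     assert (sizeof (*(mem)) == 4);                                       \
     __atomic_compare_exchange_n ((mem), &ex_expected_, (newval), 0,      \
                                  __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);    \
     ex_expected_; })

/* Legacy-style "bool" CAS: zero on success, nonzero on failure.  */
#define ex_compare_and_exchange_bool_acq(mem, newval, oldval)             \
  ({ __typeof (*(mem)) ex_expected_ = (oldval);                           \
     assert (sizeof (*(mem)) == 4);                                       \
     !__atomic_compare_exchange_n ((mem), &ex_expected_, (newval), 0,     \
                                   __ATOMIC_ACQUIRE, __ATOMIC_RELAXED); })

/* Legacy-style exchange with release semantics.  */
#define ex_exchange_rel(mem, newval)                                      \
  ({ assert (sizeof (*(mem)) == 4);                                       \
     __atomic_exchange_n ((mem), (newval), __ATOMIC_RELEASE); })
```

Whether these expand to inline instructions or to libatomic calls depends on the compiler flags, so the sketch does not by itself answer the -mcpu=leon3 question.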
* Re: [RFC][PATCH 0/2] Make sparcv8 work again on cas enabled hardware 2016-11-01 16:00 ` [RFC][PATCH 0/2] Make sparcv8 work again on cas enabled hardware Torvald Riegel @ 2016-11-01 16:09 ` David Miller 2016-11-01 16:46 ` Torvald Riegel 0 siblings, 1 reply; 30+ messages in thread From: David Miller @ 2016-11-01 16:09 UTC (permalink / raw) To: triegel; +Cc: andreas, libc-alpha, adhemerval.zanella, carlos, software From: Torvald Riegel <triegel@redhat.com> Date: Tue, 01 Nov 2016 16:59:44 +0100 > On Tue, 2016-11-01 at 16:07 +0100, Andreas Larsson wrote: >> Any comments are most welcome. The spin lock based sparcv8 semaphore >> implementation is currently unchanged by this patchset, but I would say >> that that should go as well. > > Agreed. I think tossing out all of the ldstub based v8 code is not wise. I was envisioning adding code to use ldstub on v8 when CAS is not available in order to maintain the status quo of what worked and was functional before the changes which introduced this problem for v8 in the first place. Having that in place until the kernel-side atomics could be implemented, propagated, and supported in glibc would be a nice intermediate state compared to what we have now. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC][PATCH 0/2] Make sparcv8 work again on cas enabled hardware 2016-11-01 16:09 ` David Miller @ 2016-11-01 16:46 ` Torvald Riegel 2016-11-01 16:51 ` David Miller 0 siblings, 1 reply; 30+ messages in thread From: Torvald Riegel @ 2016-11-01 16:46 UTC (permalink / raw) To: David Miller; +Cc: andreas, libc-alpha, adhemerval.zanella, carlos, software On Tue, 2016-11-01 at 12:09 -0400, David Miller wrote: > From: Torvald Riegel <triegel@redhat.com> > Date: Tue, 01 Nov 2016 16:59:44 +0100 > > > On Tue, 2016-11-01 at 16:07 +0100, Andreas Larsson wrote: > >> Any comments are most welcome. The spin lock based sparcv8 semaphore > >> implementation is currently unchanged by this patchset, but I would say > >> that that should go as well. > > > > Agreed. > > I think tossing out all of the ldstub based v8 code is not wise. > > I was envisioning adding code to use ldstub on v8 when CAS is not > available in order to maintain the status quo of what worked and > was functional before the changes which introduced this problem > for v8 in the first place. > > Having that in place until the kernel-side atomics could be > implemented, propagated, and supported in glibc would be a nice > intermediate state compared to what we have now. How do you intend to make the synchronization primitives work whose implementation requires a CAS and for which nobody has provided an alternative implementation that does not require CAS? Will they remain unsupported? If so, we're just talking about semaphores and lowlevellock here. Barriers, condvars, rwlock would not be supported anymore. Would we just not support the process-shared of all these? If so, it likely doesn't hurt much to not support process-shared semaphores. The low-level locks can be changed to just use a TAS instead of a CAS, which would remove the need to keep the 24b variant just for these. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC][PATCH 0/2] Make sparcv8 work again on cas enabled hardware 2016-11-01 16:46 ` Torvald Riegel @ 2016-11-01 16:51 ` David Miller 2016-11-02 10:05 ` Torvald Riegel 0 siblings, 1 reply; 30+ messages in thread From: David Miller @ 2016-11-01 16:51 UTC (permalink / raw) To: triegel; +Cc: andreas, libc-alpha, adhemerval.zanella, carlos, software From: Torvald Riegel <triegel@redhat.com> Date: Tue, 01 Nov 2016 17:46:41 +0100 > On Tue, 2016-11-01 at 12:09 -0400, David Miller wrote: >> From: Torvald Riegel <triegel@redhat.com> >> Date: Tue, 01 Nov 2016 16:59:44 +0100 >> >> > On Tue, 2016-11-01 at 16:07 +0100, Andreas Larsson wrote: >> >> Any comments are most welcome. The spin lock based sparcv8 semaphore >> >> implementation is currently unchanged by this patchset, but I would say >> >> that that should go as well. >> > >> > Agreed. >> >> I think tossing out all of the ldstub based v8 code is not wise. >> >> I was envisioning adding code to use ldstub on v8 when CAS is not >> available in order to maintain the status quo of what worked and >> was functional before the changes which introduced this problem >> for v8 in the first place. >> >> Having that in place until the kernel-side atomics could be >> implemented, propagated, and supported in glibc would be a nice >> intermediate state compared to what we have now. > > How do you intend to make the synchronization primitives work whose > implementation requires a CAS and for which nobody has provided an > alternative implementation that does not require CAS? The pure userland version will do what has been done for decades, by using a spinlock that protects the word we want to do atomic operations upon. A hash table of spinlocks is another option. When kernel side support exists that version would do the operation entirely inside of the kernel using whatever internal synchronization primitive it deems appropriate. It should be invisible to the user. ^ permalink raw reply [flat|nested] 30+ messages in thread
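The pure userland approach described above, a hash table of spinlocks protecting the words that are operated on, can be sketched as follows. This is an illustration under stated assumptions and not the removed glibc code: the GCC builtin __atomic_test_and_set plays the role of ldstub, the hash function is arbitrary, and all ex_ names are hypothetical. The table size of 64 matches the __sparc32_atomic_locks array that appears in the patch.

```c
#include <assert.h>
#include <stdint.h>

#define EX_NLOCKS 64

/* One byte per lock, as a test-and-set instruction such as ldstub
   operates on a single byte.  Zero means unlocked.  */
static unsigned char ex_locks[EX_NLOCKS];

/* Map an address to its lock.  The low two bits are identical for all
   aligned words, so they are dropped before hashing.  */
static unsigned char *
ex_lock_for (const volatile void *addr)
{
  return &ex_locks[((uintptr_t) addr >> 2) % EX_NLOCKS];
}

/* Compare-and-exchange emulated under the lock: returns the value found
   in *mem and stores newval only if that value equals oldval.  */
static uint32_t
ex_locked_cas32 (volatile uint32_t *mem, uint32_t newval, uint32_t oldval)
{
  unsigned char *lock = ex_lock_for (mem);
  uint32_t seen;

  assert (((uintptr_t) mem & 3) == 0);

  while (__atomic_test_and_set (lock, __ATOMIC_ACQUIRE))
    ;  /* spin until the previous holder clears the byte */
  seen = *mem;
  if (seen == oldval)
    *mem = newval;
  __atomic_clear (lock, __ATOMIC_RELEASE);
  return seen;
}
```

The scheme is only atomic with respect to other accesses that take the same lock, and the table lives in the memory of one process, so two processes sharing a word would not exclude each other.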
* Re: [RFC][PATCH 0/2] Make sparcv8 work again on cas enabled hardware 2016-11-01 16:51 ` David Miller @ 2016-11-02 10:05 ` Torvald Riegel 2016-11-02 11:29 ` Andreas Larsson 2016-11-02 15:32 ` David Miller 0 siblings, 2 replies; 30+ messages in thread From: Torvald Riegel @ 2016-11-02 10:05 UTC (permalink / raw) To: David Miller; +Cc: andreas, libc-alpha, adhemerval.zanella, carlos, software On Tue, 2016-11-01 at 12:51 -0400, David Miller wrote: > From: Torvald Riegel <triegel@redhat.com> > Date: Tue, 01 Nov 2016 17:46:41 +0100 > > > On Tue, 2016-11-01 at 12:09 -0400, David Miller wrote: > >> From: Torvald Riegel <triegel@redhat.com> > >> Date: Tue, 01 Nov 2016 16:59:44 +0100 > >> > >> > On Tue, 2016-11-01 at 16:07 +0100, Andreas Larsson wrote: > >> >> Any comments are most welcome. The spin lock based sparcv8 semaphore > >> >> implementation is currently unchanged by this patchset, but I would say > >> >> that that should go as well. > >> > > >> > Agreed. > >> > >> I think tossing out all of the ldstub based v8 code is not wise. > >> > >> I was envisioning adding code to use ldstub on v8 when CAS is not > >> available in order to maintain the status quo of what worked and > >> was functional before the changes which introduced this problem > >> for v8 in the first place. > >> > >> Having that in place until the kernel-side atomics could be > >> implemented, propagated, and supported in glibc would be a nice > >> intermediate state compared to what we have now. > > > > How do you intend to make the synchronization primitives work whose > > implementation requires a CAS and for which nobody has provided an > > alternative implementation that does not require CAS? > > The pure userland version will do what has been done for decades, > by using a spinlock that protects the word we want to do atomic > operations upon. A hash table of spinlocks is another option. 
I know about the available techniques; my question was rather aimed at who's going to do the work, in which rough stages, and when.

An external table of locks does not work for process-shared synchronization. Do you plan to not support that, and abort() when someone tries to create a process-shared condvar, for example?

Or do you intend to write sparc-specific versions of all the concurrent data structures that are process-shared? Note that in the new condvar, for example, there's no unused space in pthread_cond_t that could be used for a spinlock. So you'd have to reorganize quite a bit.

If you want sparc-specific versions, who's going to implement them, and when? What do we do in the meantime?
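The lock-table technique under discussion can be sketched as follows. This is a minimal illustration under stated assumptions, not the historical sparc32 glibc code: NLOCKS, lock_table, lock_for and emulated_cas are hypothetical names, and a C11 atomic byte exchange stands in for ldstub.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NLOCKS 64u  /* hypothetical table size; must be a power of two */

/* One byte-sized spinlock per bucket. The table is an ordinary object in
   this process's address space; zero means "free". */
static atomic_uchar lock_table[NLOCKS];

/* Hash the address of the protected word to a bucket. */
static atomic_uchar *lock_for(const volatile void *addr)
{
    return &lock_table[((uintptr_t)addr >> 2) & (NLOCKS - 1u)];
}

/* Emulated compare-and-swap: returns the value found in *p.
   Only correct if every writer of *p goes through this function. */
static uint32_t emulated_cas(volatile uint32_t *p, uint32_t expected,
                             uint32_t desired)
{
    atomic_uchar *l = lock_for(p);
    uint32_t old;

    assert(((uintptr_t)p & 3u) == 0);  /* word must be 4-byte aligned */

    /* Acquire: store 0xff and look at what was there before. */
    while (atomic_exchange_explicit(l, 0xff, memory_order_acquire) != 0)
        ; /* spin */

    old = *p;
    if (old == expected)
        *p = desired;

    atomic_store_explicit(l, 0, memory_order_release);
    return old;
}
```

Because lock_table lives in one process only, two processes that map the same shared word would each take their own, unrelated lock, so the update is no longer atomic between them. That is the process-shared limitation raised above.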
* Re: [RFC][PATCH 0/2] Make sparcv8 work again on cas enabled hardware 2016-11-02 10:05 ` Torvald Riegel @ 2016-11-02 11:29 ` Andreas Larsson 2016-11-02 15:32 ` David Miller 1 sibling, 0 replies; 30+ messages in thread From: Andreas Larsson @ 2016-11-02 11:29 UTC (permalink / raw) To: Torvald Riegel, David Miller Cc: libc-alpha, adhemerval.zanella, carlos, software On 2016-11-02 11:05, Torvald Riegel wrote: > On Tue, 2016-11-01 at 12:51 -0400, David Miller wrote: >> From: Torvald Riegel <triegel@redhat.com> >> Date: Tue, 01 Nov 2016 17:46:41 +0100 >> >>> On Tue, 2016-11-01 at 12:09 -0400, David Miller wrote: >>>> From: Torvald Riegel <triegel@redhat.com> >>>> Date: Tue, 01 Nov 2016 16:59:44 +0100 >>>> >>>>> On Tue, 2016-11-01 at 16:07 +0100, Andreas Larsson wrote: >>>>>> Any comments are most welcome. The spin lock based sparcv8 semaphore >>>>>> implementation is currently unchanged by this patchset, but I would say >>>>>> that that should go as well. >>>>> >>>>> Agreed. >>>> >>>> I think tossing out all of the ldstub based v8 code is not wise. >>>> >>>> I was envisioning adding code to use ldstub on v8 when CAS is not >>>> available in order to maintain the status quo of what worked and >>>> was functional before the changes which introduced this problem >>>> for v8 in the first place. >>>> >>>> Having that in place until the kernel-side atomics could be >>>> implemented, propagated, and supported in glibc would be a nice >>>> intermediate state compared to what we have now. >>> >>> How do you intend to make the synchronization primitives work whose >>> implementation requires a CAS and for which nobody has provided an >>> alternative implementation that does not require CAS? >> >> The pure userland version will do what has been done for decades, >> by using a spinlock that protects the word we want to do atomic >> operations upon. A hash table of spinlocks is another option. 
> > I know about the available techniques; my question was rather aimed at > who's going to do the work, in which rough stages, and when. > > An external table of locks does not work for process-shared > synchronization. Do you plan to not support that, and abort() when > someone tries to create a process-shared condvar, for example? > > Or do you intend to write sparc-specific versions of all the concurrent > data structures that are process-shared? Note that in the new condvar, > for example, there's no unused space in pthread_cond_t that could be > used for a spinlock. So you'd have to reorganize quite a bit. > > If you want sparc-specific versions, who's going to implement them, and > when? What do we do in the meantime? [resent due to failure on my part to remove standard signature] Linux kernel commit 8b30ca73b7cc7f2177cfc4e8274d2ebdba328cd5 added sys_sendmsg and sys_recvmsg to the sys_call_table in arch/sparc/kernel/systbls_32.S. So sparc32 kernels prior to Linux 4.4 do not support them as straight up system calls. Best regards, Andreas Larsson ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC][PATCH 0/2] Make sparcv8 work again on cas enabled hardware 2016-11-02 10:05 ` Torvald Riegel 2016-11-02 11:29 ` Andreas Larsson @ 2016-11-02 15:32 ` David Miller 2016-11-02 22:33 ` Torvald Riegel 1 sibling, 1 reply; 30+ messages in thread From: David Miller @ 2016-11-02 15:32 UTC (permalink / raw) To: triegel; +Cc: andreas, libc-alpha, adhemerval.zanella, carlos, software From: Torvald Riegel <triegel@redhat.com> Date: Wed, 02 Nov 2016 11:05:21 +0100 > I know about the available techniques; my question was rather aimed at > who's going to do the work, in which rough stages, and when. I'm starting to clear up my backlog and find time to work on glibc so it is likely I can do it over the next month or so. > Or do you intend to write sparc-specific versions of all the concurrent > data structures that are process-shared? This would be necessary anyways, if we have two modes. One that does the pure-userland code path and one that does the kernel helper code path. Furthermore, sparc specific versions are needed in any case since we have the v9 detection even in the v8 libraries. Look at all of the code that checks for v9 in the dl_hwcap mask when deciding which atomic operation to use. > If you want sparc-specific versions, who's going to implement them, > and when? What do we do in the meantime? See above. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC][PATCH 0/2] Make sparcv8 work again on cas enabled hardware 2016-11-02 15:32 ` David Miller @ 2016-11-02 22:33 ` Torvald Riegel 2016-11-03 2:52 ` David Miller 0 siblings, 1 reply; 30+ messages in thread From: Torvald Riegel @ 2016-11-02 22:33 UTC (permalink / raw) To: David Miller; +Cc: andreas, libc-alpha, adhemerval.zanella, carlos, software On Wed, 2016-11-02 at 11:32 -0400, David Miller wrote: > From: Torvald Riegel <triegel@redhat.com> > > Or do you intend to write sparc-specific versions of all the concurrent > > data structures that are process-shared? > > This would be necessary anyways, if we have two modes. One that does > the pure-userland code path and one that does the kernel helper code > path. All the other archs that use a kernel helper for CAS don't need it. If you can call the helper in the atomic operations, you won't need a new algorithm except if you wanted to optimize the generic one. > Furthermore, sparc specific versions are needed in any case since we > have the v9 detection even in the v8 libraries. Look at all of the > code that checks for v9 in the dl_hwcap mask when deciding which > atomic operation to use. Or are you talking about the implementation of the atomic operations? ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC][PATCH 0/2] Make sparcv8 work again on cas enabled hardware 2016-11-02 22:33 ` Torvald Riegel @ 2016-11-03 2:52 ` David Miller 2016-11-03 15:39 ` Torvald Riegel 0 siblings, 1 reply; 30+ messages in thread From: David Miller @ 2016-11-03 2:52 UTC (permalink / raw) To: triegel; +Cc: andreas, libc-alpha, adhemerval.zanella, carlos, software From: Torvald Riegel <triegel@redhat.com> Date: Wed, 02 Nov 2016 23:33:03 +0100 > On Wed, 2016-11-02 at 11:32 -0400, David Miller wrote: >> From: Torvald Riegel <triegel@redhat.com> >> > Or do you intend to write sparc-specific versions of all the concurrent >> > data structures that are process-shared? >> >> This would be necessary anyways, if we have two modes. One that does >> the pure-userland code path and one that does the kernel helper code >> path. > > All the other archs that use a kernel helper for CAS don't need it. If > you can call the helper in the atomic operations, you won't need a new > algorithm except if you wanted to optimize the generic one. > >> Furthermore, sparc specific versions are needed in any case since we >> have the v9 detection even in the v8 libraries. Look at all of the >> code that checks for v9 in the dl_hwcap mask when deciding which >> atomic operation to use. > > Or are you talking about the implementation of the atomic operations? Just as the "are we running on a v9 chip" test is a run-time one, whether we are running on a kernel with kernel CAS simulation support will be run time code path check as well. This is why we'll need sparc specific versions of the primitives, and why it would have been the more optimal if the primitives were abstracted to the point where we didn't have to duplicate so much stuff privately just to pull this off. ^ permalink raw reply [flat|nested] 30+ messages in thread
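A run-time path selection of the kind described here might be structured like the sketch below. Everything in it is an assumption made for illustration: the capability bits CAP_HW_CAS and CAP_KERNEL_CAS, the enum, and both functions are hypothetical placeholders, not real sparc hwcap values or glibc interfaces.

```c
#include <assert.h>

/* Hypothetical capability bits; NOT the real sparc hwcap values. */
#define CAP_HW_CAS     (1UL << 0)  /* CPU executes a CAS instruction itself */
#define CAP_KERNEL_CAS (1UL << 1)  /* kernel offers a CAS helper/emulation */

enum cas_path {
    CAS_PATH_HW,        /* use the hardware instruction directly */
    CAS_PATH_KERNEL,    /* call into the kernel helper */
    CAS_PATH_USERLAND   /* fall back to spinlock-protected emulation */
};

/* Pick the best available path from a capability mask, which the caller
   is assumed to have obtained once at startup. */
static enum cas_path select_cas_path(unsigned long caps)
{
    if (caps & CAP_HW_CAS)
        return CAS_PATH_HW;
    if (caps & CAP_KERNEL_CAS)
        return CAS_PATH_KERNEL;
    return CAS_PATH_USERLAND;
}

/* Human-readable name, e.g. for diagnostics. */
static const char *cas_path_name(enum cas_path p)
{
    switch (p) {
    case CAS_PATH_HW:       return "hardware CAS";
    case CAS_PATH_KERNEL:   return "kernel helper";
    case CAS_PATH_USERLAND: return "userland spinlock";
    }
    assert(!"unknown cas_path");
    return "";
}
```

The sketch only shows the selection step; whether the check belongs inside the atomic operations or inside each synchronization primitive is the open question in this exchange.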
* Re: [RFC][PATCH 0/2] Make sparcv8 work again on cas enabled hardware 2016-11-03 2:52 ` David Miller @ 2016-11-03 15:39 ` Torvald Riegel 2016-11-03 17:22 ` David Miller 0 siblings, 1 reply; 30+ messages in thread From: Torvald Riegel @ 2016-11-03 15:39 UTC (permalink / raw) To: David Miller; +Cc: andreas, libc-alpha, adhemerval.zanella, carlos, software On Wed, 2016-11-02 at 22:52 -0400, David Miller wrote: > From: Torvald Riegel <triegel@redhat.com> > Date: Wed, 02 Nov 2016 23:33:03 +0100 > > > On Wed, 2016-11-02 at 11:32 -0400, David Miller wrote: > >> From: Torvald Riegel <triegel@redhat.com> > >> > Or do you intend to write sparc-specific versions of all the concurrent > >> > data structures that are process-shared? > >> > >> This would be necessary anyways, if we have two modes. One that does > >> the pure-userland code path and one that does the kernel helper code > >> path. > > > > All the other archs that use a kernel helper for CAS don't need it. If > > you can call the helper in the atomic operations, you won't need a new > > algorithm except if you wanted to optimize the generic one. > > > >> Furthermore, sparc specific versions are needed in any case since we > >> have the v9 detection even in the v8 libraries. Look at all of the > >> code that checks for v9 in the dl_hwcap mask when deciding which > >> atomic operation to use. > > > > Or are you talking about the implementation of the atomic operations? > > Just as the "are we running on a v9 chip" test is a run-time one, Is there any difference between the additional CAS on a v8 and the CAS on a v9? If there should be none (eg, same instruciton encoding etc.), we wouldn't need a runtime check for this, would we? > whether we are running on a kernel with kernel CAS simulation support > will be run time code path check as well. That depends on whether we want to support sparc HW that does have a CAS. 
It's still not clear to me whether this is a goal, and if it's a goal, whether it's a goal for today or for some time in the future.

> This is why we'll need sparc specific versions of the primitives,

Which primitives are you talking about? The atomic operations in atomic-machine.h / atomic.h, or the synchronization primitives in nptl/?

> and
> why it would have been the more optimal if the primitives were
> abstracted to the point where we didn't have to duplicate so much
> stuff privately just to pull this off.

I can't follow. What do you mean precisely?
* Re: [RFC][PATCH 0/2] Make sparcv8 work again on cas enabled hardware 2016-11-03 15:39 ` Torvald Riegel @ 2016-11-03 17:22 ` David Miller 2016-11-03 18:41 ` Adhemerval Zanella ` (3 more replies) 0 siblings, 4 replies; 30+ messages in thread From: David Miller @ 2016-11-03 17:22 UTC (permalink / raw) To: triegel; +Cc: andreas, libc-alpha, adhemerval.zanella, carlos, software From: Torvald Riegel <triegel@redhat.com> Date: Thu, 03 Nov 2016 16:39:21 +0100 > Is there any difference between the additional CAS on a v8 and the CAS > on a v9? If there should be none (eg, same instruciton encoding etc.), > we wouldn't need a runtime check for this, would we? A quick look at binutils shows that the encoding appears to be the same. > That depends on whether we want to support sparc HW that does have a > CAS. It's still not clear to me whether this is a goal, and if it's a > goal, whether it's a goal for today or for some time in the future. I think there is value in supporting pure-v8, however painful it may be. I personally don't like to see when we drop support for old systems on the floor just because it's too inconvenient or cumbersome to keep them working properly. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC][PATCH 0/2] Make sparcv8 work again on cas enabled hardware 2016-11-03 17:22 ` David Miller @ 2016-11-03 18:41 ` Adhemerval Zanella 2016-11-03 20:33 ` David Miller 2016-11-04 10:28 ` Andreas Larsson ` (2 subsequent siblings) 3 siblings, 1 reply; 30+ messages in thread From: Adhemerval Zanella @ 2016-11-03 18:41 UTC (permalink / raw) To: David Miller, triegel; +Cc: andreas, libc-alpha, carlos, software

On 03/11/2016 15:22, David Miller wrote:
> From: Torvald Riegel <triegel@redhat.com>
> Date: Thu, 03 Nov 2016 16:39:21 +0100
>
>> Is there any difference between the additional CAS on a v8 and the CAS
>> on a v9? If there should be none (eg, same instruction encoding etc.),
>> we wouldn't need a runtime check for this, would we?
>
> A quick look at binutils shows that the encoding appears to be the same.
>
>> That depends on whether we want to support sparc HW that does have a
>> CAS. It's still not clear to me whether this is a goal, and if it's a
>> goal, whether it's a goal for today or for some time in the future.
>
> I think there is value in supporting pure-v8, however painful it may
> be.
>
> I personally don't like to see when we drop support for old systems on
> the floor just because it's too inconvenient or cumbersome to keep
> them working properly.

In fact, I think that should be one of the main reasons for dropping support for old systems. At least for the current topic, it means adding a completely separate implementation for only one arch, which is exactly what the current work aims to avoid. It is more code to audit and test on very specific environments, and it adds complexity when fixing the default implementation (should the patch touch the arch-specific parts as well, or just leave them broken?).
* Re: [RFC][PATCH 0/2] Make sparcv8 work again on cas enabled hardware 2016-11-03 18:41 ` Adhemerval Zanella @ 2016-11-03 20:33 ` David Miller 2016-11-03 21:29 ` Adhemerval Zanella 2016-11-03 22:25 ` Torvald Riegel 0 siblings, 2 replies; 30+ messages in thread From: David Miller @ 2016-11-03 20:33 UTC (permalink / raw) To: adhemerval.zanella; +Cc: triegel, andreas, libc-alpha, carlos, software From: Adhemerval Zanella <adhemerval.zanella@linaro.org> Date: Thu, 3 Nov 2016 16:41:13 -0200 > On 03/11/2016 15:22, David Miller wrote: >> From: Torvald Riegel <triegel@redhat.com> >> Date: Thu, 03 Nov 2016 16:39:21 +0100 >> >>> Is there any difference between the additional CAS on a v8 and the CAS >>> on a v9? If there should be none (eg, same instruciton encoding etc.), >>> we wouldn't need a runtime check for this, would we? >> >> A quick look at binutils shows that the encoding appears to be the same. >> >>> That depends on whether we want to support sparc HW that does have a >>> CAS. It's still not clear to me whether this is a goal, and if it's a >>> goal, whether it's a goal for today or for some time in the future. >> >> I think there is value in supporting pure-v8, however painful it may >> be. >> >> I personally don't like to see when we drop support for old systems on >> the floor just because it's too inconvenient or cumbersome to keep >> them working properly. > > In fact I see it should be one of the main reason for dropping support > for old system. At least for current topic, it means add complete > separate implementation for only one arch, where current work is > aimed exactly to avoid it. It is more code to audit/test on very > specific environments and adds more complexity while fixing the > default implementation (should the patch touch as well the arch > specific parts or just let it broke?). 
But the person creating this generic infrastructure was not asked to fail to accommodate properly architectures such as sparc v8 when implementing this "generic" solution, but that's what happened, right?

So the blame is on both sides.

I'd feel extremely remiss as an architecture maintainer if, simply because someone can't come up with a proper generic mechanism to implement something, my platform might be on the chopping block.

Is that really the kind of policy we want to have?
* Re: [RFC][PATCH 0/2] Make sparcv8 work again on cas enabled hardware 2016-11-03 20:33 ` David Miller @ 2016-11-03 21:29 ` Adhemerval Zanella 2016-11-03 22:25 ` Torvald Riegel 1 sibling, 0 replies; 30+ messages in thread From: Adhemerval Zanella @ 2016-11-03 21:29 UTC (permalink / raw) To: David Miller; +Cc: triegel, andreas, libc-alpha, carlos, software On 03/11/2016 18:33, David Miller wrote: > From: Adhemerval Zanella <adhemerval.zanella@linaro.org> > Date: Thu, 3 Nov 2016 16:41:13 -0200 > >> On 03/11/2016 15:22, David Miller wrote: >>> From: Torvald Riegel <triegel@redhat.com> >>> Date: Thu, 03 Nov 2016 16:39:21 +0100 >>> >>>> Is there any difference between the additional CAS on a v8 and the CAS >>>> on a v9? If there should be none (eg, same instruciton encoding etc.), >>>> we wouldn't need a runtime check for this, would we? >>> >>> A quick look at binutils shows that the encoding appears to be the same. >>> >>>> That depends on whether we want to support sparc HW that does have a >>>> CAS. It's still not clear to me whether this is a goal, and if it's a >>>> goal, whether it's a goal for today or for some time in the future. >>> >>> I think there is value in supporting pure-v8, however painful it may >>> be. >>> >>> I personally don't like to see when we drop support for old systems on >>> the floor just because it's too inconvenient or cumbersome to keep >>> them working properly. >> >> In fact I see it should be one of the main reason for dropping support >> for old system. At least for current topic, it means add complete >> separate implementation for only one arch, where current work is >> aimed exactly to avoid it. It is more code to audit/test on very >> specific environments and adds more complexity while fixing the >> default implementation (should the patch touch as well the arch >> specific parts or just let it broke?). 
>
> But the person creating this generic infrastructure was not asked to
> fail to accommodate properly architectures such as sparc v8 when
> implementing this "generic" solution, but that's what happened right?
>
> So the blame is on both sides.
>
> I'd feel extremely remiss as an architecture maintainer if simply
> because someone can't come up with a proper generic mechanism to
> implement something, my platform might be on the chopping block.
>
> Is that really the kind of policy we want to have?

That was not really what happened. For this specific case, the new pthread_barrier was added to fix a race condition issue (BZ#13065). The first version of the patch was sent about a year ago [1], and Torvald explicitly asked you at the time what a better solution for sparc32 would be. He pinged again about four months later, before the 2.23 release [2]. Since we got no reply about sparc32, I, as release manager for 2.23, decided that the best course of action was to emit a build error [3].

Now, I am fully aware that you or any other sparc maintainer or developer might not have had the time back then to create a working implementation. Nor are we asking for one. However, I think we need to set a proper plan for this specific issue, and that was the point of my email.

I think a good approach would have been to know what kind of plan you had for this specific issue back then:

* Issue a build error?
* Create dummy atomic operations that would result in a successful build, but potentially in runtime failures?
* Continue to use the old implementation and carry BZ#13065 on sparc?
* Implement the kernel-side atomic primitives?

And I suggested removing sparc-v8 just because I saw no movement on trying to at least re-enable its build.

Also, to answer your question: the idea is not to make you fix all the underlying issues on your maintained platform, but rather to help us decide what would be better for such cases. However, for that, we will need your input.
And if you check the release wiki [4], a lot of platforms have various unsolved issues, but that is no reason to just chop them.

[1] https://sourceware.org/ml/libc-alpha/2015-07/msg00585.html
[2] https://sourceware.org/ml/libc-alpha/2015-12/msg00484.html
[3] https://www.sourceware.org/ml/libc-alpha/2016-01/msg00338.html
[4] https://sourceware.org/glibc/wiki/Release/2.24
* Re: [RFC][PATCH 0/2] Make sparcv8 work again on cas enabled hardware 2016-11-03 20:33 ` David Miller 2016-11-03 21:29 ` Adhemerval Zanella @ 2016-11-03 22:25 ` Torvald Riegel 1 sibling, 0 replies; 30+ messages in thread From: Torvald Riegel @ 2016-11-03 22:25 UTC (permalink / raw) To: David Miller; +Cc: adhemerval.zanella, andreas, libc-alpha, carlos, software On Thu, 2016-11-03 at 16:33 -0400, David Miller wrote: > From: Adhemerval Zanella <adhemerval.zanella@linaro.org> > Date: Thu, 3 Nov 2016 16:41:13 -0200 > > > On 03/11/2016 15:22, David Miller wrote: > >> From: Torvald Riegel <triegel@redhat.com> > >> Date: Thu, 03 Nov 2016 16:39:21 +0100 > >> > >>> Is there any difference between the additional CAS on a v8 and the CAS > >>> on a v9? If there should be none (eg, same instruciton encoding etc.), > >>> we wouldn't need a runtime check for this, would we? > >> > >> A quick look at binutils shows that the encoding appears to be the same. > >> > >>> That depends on whether we want to support sparc HW that does have a > >>> CAS. It's still not clear to me whether this is a goal, and if it's a > >>> goal, whether it's a goal for today or for some time in the future. > >> > >> I think there is value in supporting pure-v8, however painful it may > >> be. > >> > >> I personally don't like to see when we drop support for old systems on > >> the floor just because it's too inconvenient or cumbersome to keep > >> them working properly. > > > > In fact I see it should be one of the main reason for dropping support > > for old system. At least for current topic, it means add complete > > separate implementation for only one arch, where current work is > > aimed exactly to avoid it. It is more code to audit/test on very > > specific environments and adds more complexity while fixing the > > default implementation (should the patch touch as well the arch > > specific parts or just let it broke?). 
> > But the person creating this generic infrastructure was not asked to > fail to accomodate properly architectures such as sparc v8 when > implementing this "generic" solution, but that's what happened right? > > So the blame is on both sides. > > I'd feel extremely remiss as an architecture maintainer if simply > because someone can't come up with a proper generic mechanism to > implement something, my platform might be on the chopping block. > > Is that really the kind of policy we want to have? Adding to what Adhemerval said, I want to stress again that lack of support for CAS by the hardware or the kernel is a *serious* problem for anything synchronization-related, especially if one wants to support process-shared synchronization. IMO, expecting availability of CAS is something completely reasonable, and I don't see how concurrent code that relies on CAS could get considered to not be sufficiently generic. I would also categorize support for CAS as something that arch maintainers should take care of. I have repeatedly brought up this topic on libc-alpha and made suggestions for how arch maintainers could take care of it. I'm not aware of any improvement of the situation on the sparcv8 side (until Andreas' recent work); have I missed anything? Also, it's not true that the only solution we offered was to fully remove sparcv8 support. You could just choose to not support process-shared synchronization, for example. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC][PATCH 0/2] Make sparcv8 work again on cas enabled hardware 2016-11-03 17:22 ` David Miller 2016-11-03 18:41 ` Adhemerval Zanella @ 2016-11-04 10:28 ` Andreas Larsson 2016-11-04 15:23 ` David Miller 2016-11-04 13:55 ` Richard Henderson 2016-11-04 14:04 ` Richard Henderson 3 siblings, 1 reply; 30+ messages in thread From: Andreas Larsson @ 2016-11-04 10:28 UTC (permalink / raw) To: David Miller, triegel; +Cc: libc-alpha, adhemerval.zanella, carlos, software

On 2016-11-03 18:22, David Miller wrote:
> From: Torvald Riegel <triegel@redhat.com>
> Date: Thu, 03 Nov 2016 16:39:21 +0100
>
>> Is there any difference between the additional CAS on a v8 and the CAS
>> on a v9? If there should be none (eg, same instruction encoding etc.),
>> we wouldn't need a runtime check for this, would we?
>
> A quick look at binutils shows that the encoding appears to be the same.

The general encoding of the CASA instruction is the same, but on sparcv9 the ASI to use is the primary address space ASI 0x80, whereas on LEON it is the user data space ASI 0xa. So different instruction encodings need to be used.

Unfortunately, there is no general way to detect whether a LEON3 system has CAS, short of trying to use the CASA instruction and taking care of a possible illegal instruction. But CAS support is implied by the -mcpu=leon3 flag, and LEON3 systems without CAS can use -mcpu=v8.

-- 
Best regards,
Andreas Larsson
* Re: [RFC][PATCH 0/2] Make sparcv8 work again on cas enabled hardware 2016-11-04 10:28 ` Andreas Larsson @ 2016-11-04 15:23 ` David Miller 0 siblings, 0 replies; 30+ messages in thread From: David Miller @ 2016-11-04 15:23 UTC (permalink / raw) To: andreas; +Cc: triegel, libc-alpha, adhemerval.zanella, carlos, software From: Andreas Larsson <andreas@gaisler.com> Date: Fri, 04 Nov 2016 11:27:46 +0100 > On 2016-11-03 18:22, David Miller wrote: >> From: Torvald Riegel <triegel@redhat.com> >> Date: Thu, 03 Nov 2016 16:39:21 +0100 >> >>> Is there any difference between the additional CAS on a v8 and the CAS >>> on a v9? If there should be none (eg, same instruciton encoding >>> etc.), >>> we wouldn't need a runtime check for this, would we? >> >> A quick look at binutils shows that the encoding appears to be the >> same. > > The general encoding of the CASA instruction is the same, but on > sparcv9 the ASI to use is the primary address space ASI 0x80 and on > LEON the ASI to use is the user data space ASI 0xa. So different > instruction encodings needs to be used. > > Unfortunately there is no way general way short of trying to use the > CASA instruction and taking a care of a possible illegal instruction > to detect if a LEON3 system has CAS or not. But CAS support is implied > by the the -mcpu=leon3 flag, and LEON3 systems without CAS can use > -mcpu=v8. This really should be advertised in the _dl_hwcaps. We should try as hard as possible to allow dynamic discovery of this. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC][PATCH 0/2] Make sparcv8 work again on cas enabled hardware 2016-11-03 17:22 ` David Miller 2016-11-03 18:41 ` Adhemerval Zanella 2016-11-04 10:28 ` Andreas Larsson @ 2016-11-04 13:55 ` Richard Henderson 2016-11-04 15:31 ` David Miller 2016-11-04 14:04 ` Richard Henderson 3 siblings, 1 reply; 30+ messages in thread From: Richard Henderson @ 2016-11-04 13:55 UTC (permalink / raw) To: David Miller, triegel Cc: andreas, libc-alpha, adhemerval.zanella, carlos, software On 11/03/2016 11:22 AM, David Miller wrote: > From: Torvald Riegel <triegel@redhat.com> > Date: Thu, 03 Nov 2016 16:39:21 +0100 > >> Is there any difference between the additional CAS on a v8 and the CAS >> on a v9? If there should be none (eg, same instruciton encoding etc.), >> we wouldn't need a runtime check for this, would we? > > A quick look at binutils shows that the encoding appears to be the same. Yes and no. The instruction format is the same, but the ASI used is different. The CAS for leon userspace uses ASI_USERDATA (0x0A), not the v9 ASI_P (0x80). It's a really annoying difference that I wish the cpu designers hadn't made. r~ ^ permalink raw reply [flat|nested] 30+ messages in thread
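To make the "same format, different ASI" point concrete, the sketch below builds the 32-bit CASA instruction word for both ASIs. It assumes the SPARC V9 format-3 field layout with i = 0 (op in bits 31:30, rd in 29:25, op3 = 0x3C in 24:19, rs1 in 18:14, imm_asi in 12:5, rs2 in 4:0); encode_casa is a hypothetical helper written for this illustration, not a binutils or glibc function, and the ASI values are the ones quoted in this thread.

```c
#include <assert.h>
#include <stdint.h>

#define ASI_USERDATA 0x0Au  /* LEON user data space, as stated in the thread */
#define ASI_P        0x80u  /* sparcv9 primary address space, as stated */

/* Assemble a CASA instruction word under the assumed format-3 layout.
   Register arguments are plain register numbers (0..31). */
static uint32_t encode_casa(unsigned rd, unsigned rs1, unsigned rs2,
                            unsigned asi)
{
    assert(rd < 32 && rs1 < 32 && rs2 < 32 && asi < 256);
    return (3u << 30)              /* op = 3: load/store class */
         | ((uint32_t)rd << 25)    /* swap value in, old value out */
         | (0x3Cu << 19)           /* op3 for CASA (assumed from V9) */
         | ((uint32_t)rs1 << 14)   /* register holding the address */
         | ((uint32_t)asi << 5)    /* imm_asi field, with i = 0 */
         | (uint32_t)rs2;          /* comparison value */
}
```

With identical register operands, the two instruction words differ only in the imm_asi field, which is why a binary built for one ASI cannot simply run unchanged on the other.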
* Re: [RFC][PATCH 0/2] Make sparcv8 work again on cas enabled hardware 2016-11-04 13:55 ` Richard Henderson @ 2016-11-04 15:31 ` David Miller 2016-11-04 16:10 ` Richard Henderson 0 siblings, 1 reply; 30+ messages in thread From: David Miller @ 2016-11-04 15:31 UTC (permalink / raw) To: rth; +Cc: triegel, andreas, libc-alpha, adhemerval.zanella, carlos, software From: Richard Henderson <rth@twiddle.net> Date: Fri, 4 Nov 2016 07:55:04 -0600 > On 11/03/2016 11:22 AM, David Miller wrote: >> From: Torvald Riegel <triegel@redhat.com> >> Date: Thu, 03 Nov 2016 16:39:21 +0100 >> >>> Is there any difference between the additional CAS on a v8 and the CAS >>> on a v9? If there should be none (eg, same instruciton encoding >>> etc.), >>> we wouldn't need a runtime check for this, would we? >> >> A quick look at binutils shows that the encoding appears to be the >> same. > > Yes and no. The instruction format is the same, but the ASI used is > different. > > The CAS for leon userspace uses ASI_USERDATA (0x0A), not the v9 ASI_P > (0x80). It's a really annoying difference that I wish the cpu > designers hadn't made. I don't think they had much choice in the matter given how the ASIs are doled out in v9 vs. pre-v9. :-/ ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC][PATCH 0/2] Make sparcv8 work again on cas enabled hardware 2016-11-04 15:31 ` David Miller @ 2016-11-04 16:10 ` Richard Henderson 0 siblings, 0 replies; 30+ messages in thread From: Richard Henderson @ 2016-11-04 16:10 UTC (permalink / raw) To: David Miller Cc: triegel, andreas, libc-alpha, adhemerval.zanella, carlos, software On 11/04/2016 09:31 AM, David Miller wrote: > From: Richard Henderson <rth@twiddle.net> >> The CAS for leon userspace uses ASI_USERDATA (0x0A), not the v9 ASI_P >> (0x80). It's a really annoying difference that I wish the cpu >> designers hadn't made. > > I don't think they had much choice in the matter given how the ASIs > are doled out in v9 vs. pre-v9. :-/ > I didn't think 0x80 was used for anything else on v8, or leon specifically. AFAICS it could easily have been made an alias for 0x0A internally. But oh well, it's done. r~ ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [RFC][PATCH 0/2] Make sparcv8 work again on cas enabled hardware
  2016-11-03 17:22 ` David Miller
  ` (2 preceding siblings ...)
  2016-11-04 13:55 ` Richard Henderson
@ 2016-11-04 14:04 ` Richard Henderson
  3 siblings, 0 replies; 30+ messages in thread
From: Richard Henderson @ 2016-11-04 14:04 UTC (permalink / raw)
To: David Miller; +Cc: libc-alpha

On 11/03/2016 11:22 AM, David Miller wrote:
> I personally don't like to see when we drop support for old systems on
> the floor just because it's too inconvenient or cumbersome to keep
> them working properly.

There's also value in simplifying things by requiring upgrades in
lock-step. We've done this for gcc + binutils at times. I don't see
why requiring a new kernel to work with a new glibc should be a burden,
so long as the former is actually released before the latter.

r~
end of thread, other threads:[~2016-11-04 18:44 UTC | newest]

Thread overview: 30+ messages
2016-11-01 15:08 [RFC][PATCH 0/2] Make sparcv8 work again on cas enabled hardware Andreas Larsson
2016-11-01 15:08 ` [RFC][PATCH 1/2] sparc32: Mark sendmsg and recvmsg system calls as unsupported Andreas Larsson
2016-11-01 17:28 ` Adhemerval Zanella
2016-11-02 11:38 ` Andreas Larsson
2016-11-02 12:49 ` Adhemerval Zanella
2016-11-04 18:36 ` David Miller
2016-11-01 15:08 ` [RFC][PATCH 2/2] sparc32: Use cas for atomic_* operations and use general pthread_barrier_wait Andreas Larsson
2016-11-04 18:37 ` David Miller
2016-11-04 18:44 ` David Miller
2016-11-01 16:00 ` [RFC][PATCH 0/2] Make sparcv8 work again on cas enabled hardware Torvald Riegel
2016-11-01 16:09 ` David Miller
2016-11-01 16:46 ` Torvald Riegel
2016-11-01 16:51 ` David Miller
2016-11-02 10:05 ` Torvald Riegel
2016-11-02 11:29 ` Andreas Larsson
2016-11-02 15:32 ` David Miller
2016-11-02 22:33 ` Torvald Riegel
2016-11-03 2:52 ` David Miller
2016-11-03 15:39 ` Torvald Riegel
2016-11-03 17:22 ` David Miller
2016-11-03 18:41 ` Adhemerval Zanella
2016-11-03 20:33 ` David Miller
2016-11-03 21:29 ` Adhemerval Zanella
2016-11-03 22:25 ` Torvald Riegel
2016-11-04 10:28 ` Andreas Larsson
2016-11-04 15:23 ` David Miller
2016-11-04 13:55 ` Richard Henderson
2016-11-04 15:31 ` David Miller
2016-11-04 16:10 ` Richard Henderson
2016-11-04 14:04 ` Richard Henderson