public inbox for gcc-patches@gcc.gnu.org
* Re: [PATCH, HPPA] Atomic builtins using kernel helpers for Linux
@ 2008-07-13 20:34 John David Anglin
  2008-07-13 20:39 ` Helge Deller
  0 siblings, 1 reply; 11+ messages in thread
From: John David Anglin @ 2008-07-13 20:34 UTC (permalink / raw)
  To: gcc-patches; +Cc: deller, carlos

> Below is the equivalent patch for HPPA.
> As mentioned before, there are not many differences between this version
> and the ARM version.

I like the idea of this patch but I believe that this needs careful
thought before the interface is cast in stone.

+/* Determine kernel LWS function call (0=32bit, 1=64bit userspace)  */
+#define LWS_CAS (sizeof(unsigned long) == 4 ? 0 : 1)
+
+/* Kernel helper for compare-and-exchange.  */
+#define __kernel_cmpxchg( oldval, newval, mem )                                \
+  ({                                                                   \
+    register long lws_ret   asm("r28");                                        \
+    register long lws_errno asm("r21");                                        \
+    register unsigned long lws_mem asm("r26") = (unsigned long) (mem); \
+    register long lws_old asm("r25") = (oldval);                       \
+    register long lws_new asm("r24") = (newval);                       \
+    asm volatile(      "ble    0xb0(%%sr2, %%r0)       \n\t"           \
+                       "ldi    %5, %%r20               \n\t"           \
+       : "=r" (lws_ret), "=r" (lws_errno), "=r" (lws_mem),             \
+         "=r" (lws_old), "=r" (lws_new)                                \
+       : "i" (LWS_CAS), "2" (lws_mem), "3" (lws_old), "4" (lws_new)    \
+       : "r1", "r20", "r22", "r23", "r31", "memory"                    \
+    );                                                                         \
+    lws_errno;                                                         \
+   })

From a style standpoint, GCC macro defines and their arguments are usually
in upper case.  There should be no space before or after the arguments.

You are using the 32-bit and 64-bit cmpxchg's in the 32 and 64-bit
runtimes, respectively.  However, the implementation that's currently
in the kernel for lws_compare_and_swap64 operates on a 32-bit object.
Thus, the type for oldval and newval should be int.
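
For concreteness, the helper with the narrower types would be declared
like this (a sketch only; it is the shape later versions in this thread
adopt):

    /* oldval/newval as int, matching the 32-bit object the current
       LWS entry actually operates on.  */
    static inline long __kernel_cmpxchg (int oldval, int newval, int *mem);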

Possibly, the kernel implementation should be modified to add byte,
half and dword operations.  This would avoid some of the shift and
mask operations in the subsequent defines.  There's currently no
64-bit runtime, so changing lws_compare_and_swap64 shouldn't be a
problem.

I would like to see kernel support for lws_compare_and_swap8,
lws_compare_and_swap16, lws_compare_and_swap32 on 32-bit kernels.
I would also like to see lws_compare_and_swap64 changed to do
a 64-bit swap.  Hopefully, this could be done without running
out of space on the gateway page.
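
To illustrate what native subword support would buy on the libgcc side,
here is a sketch of a 16-bit CAS built on a hypothetical 16-bit helper
(__kernel_cmpxchg16 is an invented name; no such LWS entry exists yet).
All of the shift and mask code would collapse to the plain word-sized
pattern:

    /* Hypothetical 16-bit LWS helper; does not exist today.  */
    extern long __kernel_cmpxchg16 (short oldval, short newval, short *mem);

    short HIDDEN
    __sync_val_compare_and_swap_2 (short *ptr, short oldval, short newval)
    {
      short actual_oldval;
      long fail;

      while (1)
        {
          actual_oldval = *ptr;
          if (oldval != actual_oldval)
            return actual_oldval;
          /* Hypothetical entry; returns 0 on success, retry otherwise.  */
          fail = __kernel_cmpxchg16 (actual_oldval, newval, ptr);
          if (!fail)
            return oldval;
        }
    }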

+/* Kernel helper for memory barrier.  */
+#define __kernel_dmb() asm volatile ( "" : : : "memory" );

Comment for above?

+/* Note: we implement byte, short and int versions of atomic operations using
+   the above kernel helpers, but there is no support for "long long" (64-bit)
+   operations as yet.  */

This comment assumes 32-bit runtime.  "long long" and "long" are
both 64 bits in the 64-bit runtime.  This swap could be done easily
with a 64-bit kernel.

+#if __BYTE_ORDER == __LITTLE_ENDIAN
+#define INVERT_MASK_1 0
+#define INVERT_MASK_2 0
+#elif __BYTE_ORDER == __BIG_ENDIAN
+#define INVERT_MASK_1 24
+#define INVERT_MASK_2 16
+#else
+#error "Endianess missing"
+#endif

As far as I know, GCC supports no working little endian implementations
on PA-RISC, so the little endian defines are just additional clutter.

+#define MASK_1 0xffu
+#define MASK_2 0xffffu

With the above kernel support, I am hoping the mask defines will
go away.

+#define SUBWORD_SYNC_OP(OP, PFX_OP, INF_OP, TYPE, WIDTH, RETURN)       \
+  TYPE HIDDEN                                                          \
+  NAME##_##RETURN (OP, WIDTH) (TYPE *ptr, TYPE val)                    \
+  {                                                                    \
+    int *wordptr = (int *) ((unsigned int) ptr & ~3);                  \

The cast is wrong for 64-bit runtime.  Should be unsigned long.
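
That is, the corrected form (which later versions of the patch in this
thread use):

    int *wordptr = (int *) ((unsigned long) ptr & ~3);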

+    unsigned int mask, shift, oldval, newval;                          \
+    int failure;                                                       \
+                                                                       \
+    shift = (((unsigned int) ptr & 3) << 3) ^ INVERT_MASK_##WIDTH;     \

Ditto.

+    int *wordptr = (int *)((unsigned int) ptr & ~3), fail;             \

Ditto.

+    shift = (((unsigned int) ptr & 3) << 3) ^ INVERT_MASK_##WIDTH;     \

Ditto.

+    int *wordptr = (int *) ((unsigned int) ptr & ~3);                  \

Ditto.

+    shift = (((unsigned int) ptr & 3) << 3) ^ INVERT_MASK_##WIDTH;     \

Ditto.

I don't much like the fact that the implementations loop forever
when EFAULT or ENOSYS is returned by the kernel.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)


* Re: [PATCH, HPPA] Atomic builtins using kernel helpers for Linux
  2008-07-13 20:34 [PATCH, HPPA] Atomic builtins using kernel helpers for Linux John David Anglin
@ 2008-07-13 20:39 ` Helge Deller
  2008-07-14  0:02   ` Helge Deller
  2008-07-14  1:46   ` [PATCH, HPPA] Atomic builtins using kernel helpers for Linux John David Anglin
  0 siblings, 2 replies; 11+ messages in thread
From: Helge Deller @ 2008-07-13 20:39 UTC (permalink / raw)
  To: John David Anglin; +Cc: gcc-patches, carlos

John David Anglin wrote:
>> Below is the equivalent patch for HPPA.
>> As mentioned before, there are not many differences between this version
>> and the ARM version.
> 
> I like the idea of this patch but I believe that this needs careful
> thought before the interface is cast in stone.

Thanks a lot for the review John!

First of all I should mention that my initial goal with this patch was to
a) try to change only the parts necessary for HPPA, while keeping 95% of
the code identical to the ARM implementation (just in case it could share
code with it). I assume with your proposals below I should drop this
goal and do an HPPA-only implementation?
b) only support 1, 2 and 4 byte atomic functions, same as the original
code for ARM.

> +/* Determine kernel LWS function call (0=32bit, 1=64bit userspace)  */
> +#define LWS_CAS (sizeof(unsigned long) == 4 ? 0 : 1)
> +
> +/* Kernel helper for compare-and-exchange.  */
> +#define __kernel_cmpxchg( oldval, newval, mem )                                \
> +  ({                                                                   \
> +    register long lws_ret   asm("r28");                                        \
> +    register long lws_errno asm("r21");                                        \
> +    register unsigned long lws_mem asm("r26") = (unsigned long) (mem); \
> +    register long lws_old asm("r25") = (oldval);                       \
> +    register long lws_new asm("r24") = (newval);                       \
> +    asm volatile(      "ble    0xb0(%%sr2, %%r0)       \n\t"           \
> +                       "ldi    %5, %%r20               \n\t"           \
> +       : "=r" (lws_ret), "=r" (lws_errno), "=r" (lws_mem),             \
> +         "=r" (lws_old), "=r" (lws_new)                                \
> +       : "i" (LWS_CAS), "2" (lws_mem), "3" (lws_old), "4" (lws_new)    \
> +       : "r1", "r20", "r22", "r23", "r31", "memory"                    \
> +    );                                                                         \
> +    lws_errno;                                                         \
> +   })
> 
> From a style standpoint, GCC macro defines and their arguments are usually
> in upper case.  

This is due to reason a) (see above).
I could do "#define KERNEL_CMPXCHG" and then a function 
__kernel_cmpxchg() using this macro?

> There should be no space before or after the arguments.

Ok.

> You are using the 32-bit and 64-bit cmpxchg's in the 32 and 64-bit
> runtimes, respectively.  However, the implementation that's currently
> in the kernel for lws_compare_and_swap64 operates on a 32-bit object.
> Thus, the type for oldval and newval should be int.

Ok.

> Possibly, the kernel implementation should be modified to add byte,
> half and dword operations.  This would avoid some of the shift and
> mask operations in the subsequent defines.  There's currently no
> 64-bit runtime, so changing lws_compare_and_swap64 shouldn't be a
> problem.
> 
> I would like to see kernel support for lws_compare_and_swap8,
> lws_compare_and_swap16, lws_compare_and_swap32 on 32-bit kernels.
> I would also like to see lws_compare_and_swap64 changed to do
> a 64-bit swap.  

Hmm. Is it needed that often?
I think atomic ops should normally operate on native types (int and long 
long) only.

> Hopefully, this could be done without running out of space
> on the gateway page.

Kyle mentioned yesterday on IRC that it would probably be better to
implement those additional functions (if they really need to be
implemented!) with a VDSO, and avoid adding more functions to the LWS.

> +/* Kernel helper for memory barrier.  */
> +#define __kernel_dmb() asm volatile ( "" : : : "memory" );
> 
> Comment for above?

Again, reason a).  The ARM implementation used the __kernel_dmb() 
function and I tried not to change the code (at least not yet).

> +/* Note: we implement byte, short and int versions of atomic operations using
> +   the above kernel helpers, but there is no support for "long long" (64-bit)
> +   operations as yet.  */
> 
> This comment assumes 32-bit runtime.  "long long" and "long" are
> both 64 bits in the 64-bit runtime.  This swap could be done easily
> with a 64-bit kernel.

Yes, again reason a).

> +#if __BYTE_ORDER == __LITTLE_ENDIAN
> +#define INVERT_MASK_1 0
> +#define INVERT_MASK_2 0
> +#elif __BYTE_ORDER == __BIG_ENDIAN
> +#define INVERT_MASK_1 24
> +#define INVERT_MASK_2 16
> +#else
> +#error "Endianess missing"
> +#endif
> 
> As far as I know, GCC supports no working little endian implementations
> on PA-RISC, so the little endian defines are just additional clutter.

Reason a).  ARM used #ifdef __ARMEL__

> +#define MASK_1 0xffu
> +#define MASK_2 0xffffu
> 
> With the above kernel support, I am hoping the mask defines will
> go away.
> 
> +#define SUBWORD_SYNC_OP(OP, PFX_OP, INF_OP, TYPE, WIDTH, RETURN)       \
> +  TYPE HIDDEN                                                          \
> +  NAME##_##RETURN (OP, WIDTH) (TYPE *ptr, TYPE val)                    \
> +  {                                                                    \
> +    int *wordptr = (int *) ((unsigned int) ptr & ~3);                  \
> 
> The cast is wrong for 64-bit runtime.  Should be unsigned long.

Ok, but again a).

> +    unsigned int mask, shift, oldval, newval;                          \
> +    int failure;                                                       \
> +                                                                       \
> +    shift = (((unsigned int) ptr & 3) << 3) ^ INVERT_MASK_##WIDTH;     \
> 
> Ditto.
> 
> +    int *wordptr = (int *)((unsigned int) ptr & ~3), fail;             \
> 
> Ditto.
> 
> +    shift = (((unsigned int) ptr & 3) << 3) ^ INVERT_MASK_##WIDTH;     \
> 
> Ditto.
> 
> +    int *wordptr = (int *) ((unsigned int) ptr & ~3);                  \
> 
> Ditto.
> 
> +    shift = (((unsigned int) ptr & 3) << 3) ^ INVERT_MASK_##WIDTH;     \
> 
> Ditto.
> 
> I don't much like the fact that the implementations loop forever
> when EFAULT or ENOSYS is returned by the kernel.

Yes.
Any idea how to implement fault generation or a program abort in libgcc.a?

Helge


* Re: [PATCH, HPPA] Atomic builtins using kernel helpers for Linux
  2008-07-13 20:39 ` Helge Deller
@ 2008-07-14  0:02   ` Helge Deller
  2008-07-15  0:34     ` [PATCH, HPPA] Atomic builtins using kernel helpers for Linux (try #4) Helge Deller
  2008-07-14  1:46   ` [PATCH, HPPA] Atomic builtins using kernel helpers for Linux John David Anglin
  1 sibling, 1 reply; 11+ messages in thread
From: Helge Deller @ 2008-07-14  0:02 UTC (permalink / raw)
  To: gcc-patches

Below is an updated patch, with the exception that it does not yet have a
fix for this:
>> I don't much like the fact that the implementations loop forever
>> when EFAULT or ENOSYS is returned by the kernel.

ChangeLog

    gcc/
    * config/pa/t-linux (LIB2FUNCS_STATIC_EXTRA): Add
    config/pa/linux-atomic.c.
    * config/pa/t-linux64 (LIB2FUNCS_STATIC_EXTRA): Add
    config/pa/linux-atomic.c.
    * config/pa/linux-atomic.c: New.


Index: gcc/config/pa/linux-atomic.c
===================================================================
--- gcc/config/pa/linux-atomic.c        (revision 0)
+++ gcc/config/pa/linux-atomic.c        (revision 0)
@@ -0,0 +1,288 @@
+/* Linux-specific atomic operations for PA Linux.
+   Copyright (C) 2008 Free Software Foundation, Inc.
+   Based on code contributed by CodeSourcery for ARM EABI Linux.
+   Modifications for PA Linux by Helge Deller <deller@gmx.de>
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 2, or (at your option) any later
+version.
+
+In addition to the permissions in the GNU General Public License, the
+Free Software Foundation gives you unlimited permission to link the
+compiled version of this file into combinations with other programs,
+and to distribute those combinations without any restriction coming
+from the use of this file.  (The General Public License restrictions
+do apply in other respects; for example, they cover modification of
+the file, and distribution when not linked into a combine
+executable.)
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING.  If not, write to the Free
+Software Foundation, 51 Franklin Street, Fifth Floor, Boston, MA
+02110-1301, USA.  */
+
+/* Determine kernel LWS function call (0=32bit, 1=64bit userspace)  */
+#define LWS_CAS (sizeof(unsigned long) == 4 ? 0 : 1)
+
+/* Kernel helper for compare-and-exchange of a 32-bit value.  */
+static inline long __kernel_cmpxchg(int oldval, int newval, int *mem)
+{
+    register unsigned long lws_mem asm("r26") = (unsigned long) (mem);
+    register long lws_ret   asm("r28");
+    register long lws_errno asm("r21");
+    register int lws_old asm("r25") = oldval;
+    register int lws_new asm("r24") = newval;
+    asm volatile(      "ble    0xb0(%%sr2, %%r0)       \n\t"
+                       "ldi    %5, %%r20               \n\t"
+       : "=r" (lws_ret), "=r" (lws_errno), "=r" (lws_mem),
+         "=r" (lws_old), "=r" (lws_new)
+       : "i" (LWS_CAS), "2" (lws_mem), "3" (lws_old), "4" (lws_new)
+       : "r1", "r2", "r20", "r22", "r23", "r27", "r29", "r31", "memory"
+    ); 
+    return lws_errno;
+}
+
+/* Note: we implement byte, short and int versions of atomic operations using
+   the above kernel helpers, but there is no support for 64-bit operations as
+   yet.  */
+
+#define HIDDEN __attribute__ ((visibility ("hidden")))
+
+/* Big endian masks  */
+#define INVERT_MASK_1 24
+#define INVERT_MASK_2 16
+
+#define MASK_1 0xffu
+#define MASK_2 0xffffu
+
+#define FETCH_AND_OP_WORD(OP, PFX_OP, INF_OP)                          \
+  int HIDDEN                                                           \
+  __sync_fetch_and_##OP##_4 (int *ptr, int val)                                \
+  {                                                                    \
+    int failure, tmp;                                                  \
+                                                                       \
+    do {                                                               \
+      tmp = *ptr;                                                      \
+      failure = __kernel_cmpxchg (tmp, PFX_OP tmp INF_OP val, ptr);    \
+    } while (failure != 0);                                            \
+                                                                       \
+    return tmp;                                                                \
+  }
+
+FETCH_AND_OP_WORD (add,   , +)
+FETCH_AND_OP_WORD (sub,   , -)
+FETCH_AND_OP_WORD (or,    , |)
+FETCH_AND_OP_WORD (and,   , &)
+FETCH_AND_OP_WORD (xor,   , ^)
+FETCH_AND_OP_WORD (nand, ~, &)
+
+#define NAME_oldval(OP, WIDTH) __sync_fetch_and_##OP##_##WIDTH
+#define NAME_newval(OP, WIDTH) __sync_##OP##_and_fetch_##WIDTH
+
+/* Implement both __sync_<op>_and_fetch and __sync_fetch_and_<op> for
+   subword-sized quantities.  */
+
+#define SUBWORD_SYNC_OP(OP, PFX_OP, INF_OP, TYPE, WIDTH, RETURN)       \
+  TYPE HIDDEN                                                          \
+  NAME##_##RETURN (OP, WIDTH) (TYPE *ptr, TYPE val)                    \
+  {                                                                    \
+    int *wordptr = (int *) ((unsigned long) ptr & ~3);                 \
+    unsigned int mask, shift, oldval, newval;                          \
+    int failure;                                                       \
+                                                                       \
+    shift = (((unsigned long) ptr & 3) << 3) ^ INVERT_MASK_##WIDTH;    \
+    mask = MASK_##WIDTH << shift;                                      \
+                                                                       \
+    do {                                                               \
+      oldval = *wordptr;                                               \
+      newval = ((PFX_OP ((oldval & mask) >> shift)                     \
+                 INF_OP (unsigned int) val) << shift) & mask;          \
+      newval |= oldval & ~mask;                                                \
+      failure = __kernel_cmpxchg (oldval, newval, wordptr);            \
+    } while (failure != 0);                                            \
+                                                                       \
+    return (RETURN & mask) >> shift;                                   \
+  }
+
+SUBWORD_SYNC_OP (add,   , +, short, 2, oldval)
+SUBWORD_SYNC_OP (sub,   , -, short, 2, oldval)
+SUBWORD_SYNC_OP (or,    , |, short, 2, oldval)
+SUBWORD_SYNC_OP (and,   , &, short, 2, oldval)
+SUBWORD_SYNC_OP (xor,   , ^, short, 2, oldval)
+SUBWORD_SYNC_OP (nand, ~, &, short, 2, oldval)
+
+SUBWORD_SYNC_OP (add,   , +, char, 1, oldval)
+SUBWORD_SYNC_OP (sub,   , -, char, 1, oldval)
+SUBWORD_SYNC_OP (or,    , |, char, 1, oldval)
+SUBWORD_SYNC_OP (and,   , &, char, 1, oldval)
+SUBWORD_SYNC_OP (xor,   , ^, char, 1, oldval)
+SUBWORD_SYNC_OP (nand, ~, &, char, 1, oldval)
+
+#define OP_AND_FETCH_WORD(OP, PFX_OP, INF_OP)                          \
+  int HIDDEN                                                           \
+  __sync_##OP##_and_fetch_4 (int *ptr, int val)                                \
+  {                                                                    \
+    int tmp, failure;                                                  \
+                                                                       \
+    do {                                                               \
+      tmp = *ptr;                                                      \
+      failure = __kernel_cmpxchg (tmp, PFX_OP tmp INF_OP val, ptr);    \
+    } while (failure != 0);                                            \
+                                                                       \
+    return PFX_OP tmp INF_OP val;                                      \
+  }
+
+OP_AND_FETCH_WORD (add,   , +)
+OP_AND_FETCH_WORD (sub,   , -)
+OP_AND_FETCH_WORD (or,    , |)
+OP_AND_FETCH_WORD (and,   , &)
+OP_AND_FETCH_WORD (xor,   , ^)
+OP_AND_FETCH_WORD (nand, ~, &)
+
+SUBWORD_SYNC_OP (add,   , +, short, 2, newval)
+SUBWORD_SYNC_OP (sub,   , -, short, 2, newval)
+SUBWORD_SYNC_OP (or,    , |, short, 2, newval)
+SUBWORD_SYNC_OP (and,   , &, short, 2, newval)
+SUBWORD_SYNC_OP (xor,   , ^, short, 2, newval)
+SUBWORD_SYNC_OP (nand, ~, &, short, 2, newval)
+
+SUBWORD_SYNC_OP (add,   , +, char, 1, newval)
+SUBWORD_SYNC_OP (sub,   , -, char, 1, newval)
+SUBWORD_SYNC_OP (or,    , |, char, 1, newval)
+SUBWORD_SYNC_OP (and,   , &, char, 1, newval)
+SUBWORD_SYNC_OP (xor,   , ^, char, 1, newval)
+SUBWORD_SYNC_OP (nand, ~, &, char, 1, newval)
+
+int HIDDEN
+__sync_val_compare_and_swap_4 (int *ptr, int oldval, int newval)
+{
+  int actual_oldval, fail;
+    
+  while (1)
+    {
+      actual_oldval = *ptr;
+
+      if (oldval != actual_oldval)
+       return actual_oldval;
+
+      fail = __kernel_cmpxchg (actual_oldval, newval, ptr);
+  
+      if (!fail)
+        return oldval;
+    }
+}
+
+#define SUBWORD_VAL_CAS(TYPE, WIDTH)                                   \
+  TYPE HIDDEN                                                          \
+  __sync_val_compare_and_swap_##WIDTH (TYPE *ptr, TYPE oldval,         \
+                                      TYPE newval)                     \
+  {                                                                    \
+    int *wordptr = (int *)((unsigned long) ptr & ~3), fail;            \
+    unsigned int mask, shift, actual_oldval, actual_newval;            \
+                                                                       \
+    shift = (((unsigned long) ptr & 3) << 3) ^ INVERT_MASK_##WIDTH;    \
+    mask = MASK_##WIDTH << shift;                                      \
+                                                                       \
+    while (1)                                                          \
+      {                                                                        \
+       actual_oldval = *wordptr;                                       \
+                                                                       \
+       if (((actual_oldval & mask) >> shift) != (unsigned int) oldval) \
+          return (actual_oldval & mask) >> shift;                      \
+                                                                       \
+       actual_newval = (actual_oldval & ~mask)                         \
+                       | (((unsigned int) newval << shift) & mask);    \
+                                                                       \
+       fail = __kernel_cmpxchg (actual_oldval, actual_newval,          \
+                                wordptr);                              \
+                                                                       \
+       if (!fail)                                                      \
+          return oldval;                                               \
+      }                                                                        \
+  }
+
+SUBWORD_VAL_CAS (short, 2)
+SUBWORD_VAL_CAS (char,  1)
+
+typedef unsigned char bool;
+
+bool HIDDEN
+__sync_bool_compare_and_swap_4 (int *ptr, int oldval, int newval)
+{
+  int failure = __kernel_cmpxchg (oldval, newval, ptr);
+  return (failure == 0);
+}
+
+#define SUBWORD_BOOL_CAS(TYPE, WIDTH)                                  \
+  bool HIDDEN                                                          \
+  __sync_bool_compare_and_swap_##WIDTH (TYPE *ptr, TYPE oldval,                \
+                                       TYPE newval)                    \
+  {                                                                    \
+    TYPE actual_oldval                                                 \
+      = __sync_val_compare_and_swap_##WIDTH (ptr, oldval, newval);     \
+    return (oldval == actual_oldval);                                  \
+  }
+
+SUBWORD_BOOL_CAS (short, 2)
+SUBWORD_BOOL_CAS (char,  1)
+
+void HIDDEN
+__sync_synchronize (void)
+{
+}
+
+int HIDDEN
+__sync_lock_test_and_set_4 (int *ptr, int val)
+{
+  int failure, oldval;
+
+  do {
+    oldval = *ptr;
+    failure = __kernel_cmpxchg (oldval, val, ptr);
+  } while (failure != 0);
+
+  return oldval;
+}
+
+#define SUBWORD_TEST_AND_SET(TYPE, WIDTH)                              \
+  TYPE HIDDEN                                                          \
+  __sync_lock_test_and_set_##WIDTH (TYPE *ptr, TYPE val)               \
+  {                                                                    \
+    int failure;                                                       \
+    unsigned int oldval, newval, shift, mask;                          \
+    int *wordptr = (int *) ((unsigned long) ptr & ~3);                 \
+                                                                       \
+    shift = (((unsigned long) ptr & 3) << 3) ^ INVERT_MASK_##WIDTH;    \
+    mask = MASK_##WIDTH << shift;                                      \
+                                                                       \
+    do {                                                               \
+      oldval = *wordptr;                                               \
+      newval = (oldval & ~mask)                                                \
+              | (((unsigned int) val << shift) & mask);                \
+      failure = __kernel_cmpxchg (oldval, newval, wordptr);            \
+    } while (failure != 0);                                            \
+                                                                       \
+    return (oldval & mask) >> shift;                                   \
+  }
+
+SUBWORD_TEST_AND_SET (short, 2)
+SUBWORD_TEST_AND_SET (char,  1)
+
+#define SYNC_LOCK_RELEASE(TYPE, WIDTH)                                 \
+  void HIDDEN                                                          \
+  __sync_lock_release_##WIDTH (TYPE *ptr)                              \
+  {                                                                    \
+    *ptr = 0;                                                          \
+  }
+
+SYNC_LOCK_RELEASE (int,   4)
+SYNC_LOCK_RELEASE (short, 2)
+SYNC_LOCK_RELEASE (char,  1)
Index: gcc/config/pa/t-linux64
===================================================================
--- gcc/config/pa/t-linux64     (revision 137753)
+++ gcc/config/pa/t-linux64     (working copy)
@@ -8,5 +8,7 @@
 # Actually, hppa64 is always PIC but adding -fPIC does no harm.
 CRTSTUFF_T_CFLAGS_S = -fPIC
 
+LIB2FUNCS_STATIC_EXTRA = $(srcdir)/config/pa/linux-atomic.c
+
 # Compile libgcc2.a as PIC.
 TARGET_LIBGCC2_CFLAGS = -fPIC -Dpa64=1 -DELF=1
Index: gcc/config/pa/t-linux
===================================================================
--- gcc/config/pa/t-linux       (revision 137753)
+++ gcc/config/pa/t-linux       (working copy)
@@ -9,6 +9,7 @@
 TARGET_LIBGCC2_CFLAGS = -fPIC -DELF=1 -DLINUX=1
 
 LIB2FUNCS_EXTRA=fptr.c
+LIB2FUNCS_STATIC_EXTRA = $(srcdir)/config/pa/linux-atomic.c
 
 fptr.c: $(srcdir)/config/pa/fptr.c
        rm -f fptr.c


* Re: [PATCH, HPPA] Atomic builtins using kernel helpers for Linux
  2008-07-13 20:39 ` Helge Deller
  2008-07-14  0:02   ` Helge Deller
@ 2008-07-14  1:46   ` John David Anglin
  2008-07-14 13:37     ` Carlos O'Donell
  1 sibling, 1 reply; 11+ messages in thread
From: John David Anglin @ 2008-07-14  1:46 UTC (permalink / raw)
  To: Helge Deller; +Cc: gcc-patches, carlos

> First of all I should mention that my initial goal with this patch was to
> a) try to change only the parts necessary for HPPA, while keeping 95% of
> the code identical to the ARM implementation (just in case it could share
> code with it). I assume with your proposals below I should drop this
> goal and do an HPPA-only implementation?

Yes.  All kinds of code in the PA implementation have been copied and
customized from other targets.

> b) only support 1, 2 and 4 byte atomic functions, same as the original
> code for ARM.

That would be fine as the kernel doesn't support anything more.

> This is due reason a) (see above).
> I could do "#define KERNEL_CMPXCHG" and then a function 
> __kernel_cmpxchg() using this macro?

I think from a performance standpoint we want __kernel_cmpxchg()
to be inlined.  Possibly, it could just be defined as static inline.
I don't think the macro is necessary.  This may help with side effects.
I was wondering if the memory used by the asm should be made explicit.

> Hmm. Is it needed that often?

Probably, they aren't used that often.

> I think atomic ops should normally operate on native types (int and long 
> long) only.

At a minimum, I believe that the kernel support should operate on 32
and 64-bit objects in the 32 and 64-bit runtimes, respectively.  This
allows the exchange of pointers in each runtime.  At the moment, the
64-bit support works on 32 bits.  I don't think it is reasonable to
handle 64-bit objects in the 32-bit runtime.
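
To illustrate the pointer-exchange point, a caller-side swap like the
sketch below (illustration only, using GCC's generic __sync builtin)
needs a CAS as wide as a pointer, i.e. 32 bits in the 32-bit runtime
and 64 bits in the 64-bit runtime:

    void *
    atomic_swap_ptr (void **slot, void *newp)
    {
      void *oldp;
      do
        oldp = *slot;           /* read the current pointer */
      while (!__sync_bool_compare_and_swap ((unsigned long *) slot,
                                            (unsigned long) oldp,
                                            (unsigned long) newp));
      return oldp;              /* previous pointer value */
    }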

> Kyle mentioned yesterday on IRC that it would probably be better to
> implement those additional functions (if they really need to be
> implemented!) with a VDSO, and avoid adding more functions to the LWS.

The reason for using the gateway page is that processes are never
scheduled off the gateway page or sent signals while on the gateway
page.  At the moment, it is unclear what additional overhead would
be needed to do this with VDSO.  It might be better to stay with
shift and mask.

> /* Kernel helper for memory barrier.  */
> #define __kernel_dmb() asm volatile ( "" : : : "memory" );

Another thought about this.  It is my understanding that the above
is a full barrier.  __sync_lock_release only requires a release
barrier.  __sync_lock_test_and_set needs an acquire barrier.  It's
not clear how to achieve this.
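
A sketch of the distinction (assuming, as the comment in the committed
version below notes, that PA-RISC loads and stores are strongly
ordered, so only compiler reordering is at issue):

    /* Release: accesses before the barrier may not sink past the
       releasing store.  */
    #define RELEASE_BARRIER() asm volatile ("" : : : "memory")

    /* Acquire: accesses after the barrier may not hoist above the
       acquiring load.  */
    #define ACQUIRE_BARRIER() asm volatile ("" : : : "memory")

On this hardware both reduce to the same compiler barrier; the
semantic difference is only in where they are placed relative to the
access to the lock word.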

> > I don't much like the fact that the implementations loop forever
> > when EFAULT or ENOSYS is returned by the kernel.
> 
> Yes.
> Any idea how to implement fault generation or a program abort in libgcc.a?

There is an asm in the glibc dynamic loader to generate an insn fault.
Possibly, that's the best solution as it gives the program a chance to
recover.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)


* Re: [PATCH, HPPA] Atomic builtins using kernel helpers for Linux
  2008-07-14  1:46   ` [PATCH, HPPA] Atomic builtins using kernel helpers for Linux John David Anglin
@ 2008-07-14 13:37     ` Carlos O'Donell
  0 siblings, 0 replies; 11+ messages in thread
From: Carlos O'Donell @ 2008-07-14 13:37 UTC (permalink / raw)
  To: John David Anglin; +Cc: Helge Deller, gcc-patches

On Sun, Jul 13, 2008 at 6:08 PM, John David Anglin
<dave@hiauly1.hia.nrc.ca> wrote:
> At a minimum, I believe that the kernel support should operate on 32
> and 64-bit objects in the 32 and 64-bit runtimes, respectively.  This
> allows the exchange of pointers in each runtime.  At the moment, the
> 64-bit support works on 32 bits.  I don't think it is reasonable to
> handle 64-bit objects in the 32-bit runtime.

All that is needed is this:

1. Add a new light-weight syscall number.
2. Point the new syscall at the existing 64-bit entry ""

and

3. Fix up ABI comments for 32-bit and 64-bit.

I'll work on implementing that.

>> Kyle mentioned yesterday on IRC that it would probably be better to
>> implement those additional functions (if they really need to be
>> implemented!) with a VDSO, and avoid adding more functions to the LWS.
>
> The reason for using the gateway page is that processes are never
> scheduled off the gateway page or sent signals while on the gateway
> page.  At the moment, it is unclear what additional overhead would
> be needed to do this with VDSO.  It might be better to stay with
> shift and mask.

The VDSO incurs 2 PLT lookups, and then a jump. The LWS is a single
BLE, but is much less flexible, and without additional annotation,
will confuse the debugger. I think libgcc or a VDSO is the right
solution for all the other primitives.

>> > I don't much like the fact that the implementations loop forever
>> > when EFAULT or ENOSYS is returned by the kernel.
>>
>> Yes.
>> Any idea how to implement fault generation or a program abort in libgcc.a?
>
> There is an asm in the glibc dynamic loader to generate an insn fault.
> Possibly, that's the best solution as it gives the program a chance to
> recover.

You should only loop forever on EAGAIN and EDEADLOCK.

EFAULT and ENOSYS should definitely error out.

GLIBC uses this macro to cause a fault:
~~~
/* A privileged instruction to crash a userspace program.

   We go with iitlbp because it has a history of being used to crash
   programs.  */

#define ABORT_INSTRUCTION asm ("iitlbp %r0,(%sr0, %r0)")
~~~
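
Putting the two together, the helper's exit path would look something
like this (the try #4 patch below adopts essentially this check):

    if (__builtin_expect (lws_errno == -EFAULT || lws_errno == -ENOSYS, 0))
      ABORT_INSTRUCTION;	/* crash with SIGILL */
    /* -EAGAIN (and EDEADLOCK, if it is ever returned) falls through,
       and the caller retries the operation.  */
    return lws_errno;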

Cheers,
Carlos.


* Re: [PATCH, HPPA] Atomic builtins using kernel helpers for Linux (try #4)
  2008-07-14  0:02   ` Helge Deller
@ 2008-07-15  0:34     ` Helge Deller
  2008-10-29 14:00       ` PING " Andrew Haley
  0 siblings, 1 reply; 11+ messages in thread
From: Helge Deller @ 2008-07-15  0:34 UTC (permalink / raw)
  To: gcc-patches

This is the 4th version of the patch for atomic builtin functions for hppa.
It should address all issues that Dave and Carlos brought up:
- 32bit userspace support, 64bit-ready (there is no 64bit userspace on hppa yet)
- abort with SIGILL when the Linux kernel does not include the kernel helpers (-ENOSYS)
- abort with SIGILL when userspace points to an illegal address (-EFAULT)

Ok for mainline?

ChangeLog

    gcc/
    * config/pa/t-linux (LIB2FUNCS_STATIC_EXTRA): Add
    config/pa/linux-atomic.c.
    * config/pa/t-linux64 (LIB2FUNCS_STATIC_EXTRA): Add
    config/pa/linux-atomic.c.
    * config/pa/linux-atomic.c: New.

Index: gcc/config/pa/linux-atomic.c
===================================================================
--- gcc/config/pa/linux-atomic.c        (revision 0)
+++ gcc/config/pa/linux-atomic.c        (revision 0)
@@ -0,0 +1,295 @@
+/* Linux-specific atomic operations for PA Linux.
+   Copyright (C) 2008 Free Software Foundation, Inc.
+   Based on code contributed by CodeSourcery for ARM EABI Linux.
+   Modifications for PA Linux by Helge Deller <deller@gmx.de>
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 2, or (at your option) any later
+version.
+
+In addition to the permissions in the GNU General Public License, the
+Free Software Foundation gives you unlimited permission to link the
+compiled version of this file into combinations with other programs,
+and to distribute those combinations without any restriction coming
+from the use of this file.  (The General Public License restrictions
+do apply in other respects; for example, they cover modification of
+the file, and distribution when not linked into a combine
+executable.)
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING.  If not, write to the Free
+Software Foundation, 51 Franklin Street, Fifth Floor, Boston, MA
+02110-1301, USA.  */
+
+#include <errno.h>
+
+/* A privileged instruction to crash a userspace program with SIGILL.  */
+#define ABORT_INSTRUCTION asm ("iitlbp %r0,(%sr0, %r0)")
+
+/* Determine kernel LWS function call (0=32bit, 1=64bit userspace).  */
+#define LWS_CAS (sizeof(unsigned long) == 4 ? 0 : 1)
+
+/* Kernel helper for compare-and-exchange of a 32-bit value.  */
+static inline long __kernel_cmpxchg(int oldval, int newval, int *mem)
+{
+    register unsigned long lws_mem asm("r26") = (unsigned long) (mem);
+    register long lws_ret   asm("r28");
+    register long lws_errno asm("r21");
+    register int lws_old asm("r25") = oldval;
+    register int lws_new asm("r24") = newval;
+    asm volatile(      "ble    0xb0(%%sr2, %%r0)       \n\t"
+                       "ldi    %5, %%r20               \n\t"
+       : "=r" (lws_ret), "=r" (lws_errno), "=r" (lws_mem),
+         "=r" (lws_old), "=r" (lws_new)
+       : "i" (LWS_CAS), "2" (lws_mem), "3" (lws_old), "4" (lws_new)
+       : "r1", "r2", "r20", "r22", "r23", "r27", "r29", "r31", "memory"
+    );
+    if (__builtin_expect (lws_errno == -EFAULT || lws_errno == -ENOSYS, 0))
+       ABORT_INSTRUCTION;
+    return lws_errno;
+}
+
+/* Note: we implement byte, short and int versions of atomic operations using
+   the above kernel helpers, but there is no support for 64-bit operations as
+   yet.  */
+
+#define HIDDEN __attribute__ ((visibility ("hidden")))
+
+/* Big endian masks  */
+#define INVERT_MASK_1 24
+#define INVERT_MASK_2 16
+
+#define MASK_1 0xffu
+#define MASK_2 0xffffu
+
+#define FETCH_AND_OP_WORD(OP, PFX_OP, INF_OP)                          \
+  int HIDDEN                                                           \
+  __sync_fetch_and_##OP##_4 (int *ptr, int val)                                \
+  {                                                                    \
+    int failure, tmp;                                                  \
+                                                                       \
+    do {                                                               \
+      tmp = *ptr;                                                      \
+      failure = __kernel_cmpxchg (tmp, PFX_OP tmp INF_OP val, ptr);    \
+    } while (failure != 0);                                            \
+                                                                       \
+    return tmp;                                                                \
+  }
+
+FETCH_AND_OP_WORD (add,   , +)
+FETCH_AND_OP_WORD (sub,   , -)
+FETCH_AND_OP_WORD (or,    , |)
+FETCH_AND_OP_WORD (and,   , &)
+FETCH_AND_OP_WORD (xor,   , ^)
+FETCH_AND_OP_WORD (nand, ~, &)
+
+#define NAME_oldval(OP, WIDTH) __sync_fetch_and_##OP##_##WIDTH
+#define NAME_newval(OP, WIDTH) __sync_##OP##_and_fetch_##WIDTH
+
+/* Implement both __sync_<op>_and_fetch and __sync_fetch_and_<op> for
+   subword-sized quantities.  */
+
+#define SUBWORD_SYNC_OP(OP, PFX_OP, INF_OP, TYPE, WIDTH, RETURN)       \
+  TYPE HIDDEN                                                          \
+  NAME##_##RETURN (OP, WIDTH) (TYPE *ptr, TYPE val)                    \
+  {                                                                    \
+    int *wordptr = (int *) ((unsigned long) ptr & ~3);                 \
+    unsigned int mask, shift, oldval, newval;                          \
+    int failure;                                                       \
+                                                                       \
+    shift = (((unsigned long) ptr & 3) << 3) ^ INVERT_MASK_##WIDTH;    \
+    mask = MASK_##WIDTH << shift;                                      \
+                                                                       \
+    do {                                                               \
+      oldval = *wordptr;                                               \
+      newval = ((PFX_OP ((oldval & mask) >> shift)                     \
+                 INF_OP (unsigned int) val) << shift) & mask;          \
+      newval |= oldval & ~mask;                                                \
+      failure = __kernel_cmpxchg (oldval, newval, wordptr);            \
+    } while (failure != 0);                                            \
+                                                                       \
+    return (RETURN & mask) >> shift;                                   \
+  }
+
+SUBWORD_SYNC_OP (add,   , +, short, 2, oldval)
+SUBWORD_SYNC_OP (sub,   , -, short, 2, oldval)
+SUBWORD_SYNC_OP (or,    , |, short, 2, oldval)
+SUBWORD_SYNC_OP (and,   , &, short, 2, oldval)
+SUBWORD_SYNC_OP (xor,   , ^, short, 2, oldval)
+SUBWORD_SYNC_OP (nand, ~, &, short, 2, oldval)
+
+SUBWORD_SYNC_OP (add,   , +, char, 1, oldval)
+SUBWORD_SYNC_OP (sub,   , -, char, 1, oldval)
+SUBWORD_SYNC_OP (or,    , |, char, 1, oldval)
+SUBWORD_SYNC_OP (and,   , &, char, 1, oldval)
+SUBWORD_SYNC_OP (xor,   , ^, char, 1, oldval)
+SUBWORD_SYNC_OP (nand, ~, &, char, 1, oldval)
+
+#define OP_AND_FETCH_WORD(OP, PFX_OP, INF_OP)                          \
+  int HIDDEN                                                           \
+  __sync_##OP##_and_fetch_4 (int *ptr, int val)                                \
+  {                                                                    \
+    int tmp, failure;                                                  \
+                                                                       \
+    do {                                                               \
+      tmp = *ptr;                                                      \
+      failure = __kernel_cmpxchg (tmp, PFX_OP tmp INF_OP val, ptr);    \
+    } while (failure != 0);                                            \
+                                                                       \
+    return PFX_OP tmp INF_OP val;                                      \
+  }
+
+OP_AND_FETCH_WORD (add,   , +)
+OP_AND_FETCH_WORD (sub,   , -)
+OP_AND_FETCH_WORD (or,    , |)
+OP_AND_FETCH_WORD (and,   , &)
+OP_AND_FETCH_WORD (xor,   , ^)
+OP_AND_FETCH_WORD (nand, ~, &)
+
+SUBWORD_SYNC_OP (add,   , +, short, 2, newval)
+SUBWORD_SYNC_OP (sub,   , -, short, 2, newval)
+SUBWORD_SYNC_OP (or,    , |, short, 2, newval)
+SUBWORD_SYNC_OP (and,   , &, short, 2, newval)
+SUBWORD_SYNC_OP (xor,   , ^, short, 2, newval)
+SUBWORD_SYNC_OP (nand, ~, &, short, 2, newval)
+
+SUBWORD_SYNC_OP (add,   , +, char, 1, newval)
+SUBWORD_SYNC_OP (sub,   , -, char, 1, newval)
+SUBWORD_SYNC_OP (or,    , |, char, 1, newval)
+SUBWORD_SYNC_OP (and,   , &, char, 1, newval)
+SUBWORD_SYNC_OP (xor,   , ^, char, 1, newval)
+SUBWORD_SYNC_OP (nand, ~, &, char, 1, newval)
+
+int HIDDEN
+__sync_val_compare_and_swap_4 (int *ptr, int oldval, int newval)
+{
+  int actual_oldval, fail;
+    
+  while (1)
+    {
+      actual_oldval = *ptr;
+
+      if (oldval != actual_oldval)
+       return actual_oldval;
+
+      fail = __kernel_cmpxchg (actual_oldval, newval, ptr);
+  
+      if (!fail)
+        return oldval;
+    }
+}
+
+#define SUBWORD_VAL_CAS(TYPE, WIDTH)                                   \
+  TYPE HIDDEN                                                          \
+  __sync_val_compare_and_swap_##WIDTH (TYPE *ptr, TYPE oldval,         \
+                                      TYPE newval)                     \
+  {                                                                    \
+    int *wordptr = (int *)((unsigned long) ptr & ~3), fail;            \
+    unsigned int mask, shift, actual_oldval, actual_newval;            \
+                                                                       \
+    shift = (((unsigned long) ptr & 3) << 3) ^ INVERT_MASK_##WIDTH;    \
+    mask = MASK_##WIDTH << shift;                                      \
+                                                                       \
+    while (1)                                                          \
+      {                                                                        \
+       actual_oldval = *wordptr;                                       \
+                                                                       \
+       if (((actual_oldval & mask) >> shift) != (unsigned int) oldval) \
+          return (actual_oldval & mask) >> shift;                      \
+                                                                       \
+       actual_newval = (actual_oldval & ~mask)                         \
+                       | (((unsigned int) newval << shift) & mask);    \
+                                                                       \
+       fail = __kernel_cmpxchg (actual_oldval, actual_newval,          \
+                                wordptr);                              \
+                                                                       \
+       if (!fail)                                                      \
+          return oldval;                                               \
+      }                                                                        \
+  }
+
+SUBWORD_VAL_CAS (short, 2)
+SUBWORD_VAL_CAS (char,  1)
+
+typedef unsigned char bool;
+
+bool HIDDEN
+__sync_bool_compare_and_swap_4 (int *ptr, int oldval, int newval)
+{
+  int failure = __kernel_cmpxchg (oldval, newval, ptr);
+  return (failure == 0);
+}
+
+#define SUBWORD_BOOL_CAS(TYPE, WIDTH)                                  \
+  bool HIDDEN                                                          \
+  __sync_bool_compare_and_swap_##WIDTH (TYPE *ptr, TYPE oldval,                \
+                                       TYPE newval)                    \
+  {                                                                    \
+    TYPE actual_oldval                                                 \
+      = __sync_val_compare_and_swap_##WIDTH (ptr, oldval, newval);     \
+    return (oldval == actual_oldval);                                  \
+  }
+
+SUBWORD_BOOL_CAS (short, 2)
+SUBWORD_BOOL_CAS (char,  1)
+
+void HIDDEN
+__sync_synchronize (void)
+{
+}
+
+int HIDDEN
+__sync_lock_test_and_set_4 (int *ptr, int val)
+{
+  int failure, oldval;
+
+  do {
+    oldval = *ptr;
+    failure = __kernel_cmpxchg (oldval, val, ptr);
+  } while (failure != 0);
+
+  return oldval;
+}
+
+#define SUBWORD_TEST_AND_SET(TYPE, WIDTH)                              \
+  TYPE HIDDEN                                                          \
+  __sync_lock_test_and_set_##WIDTH (TYPE *ptr, TYPE val)               \
+  {                                                                    \
+    int failure;                                                       \
+    unsigned int oldval, newval, shift, mask;                          \
+    int *wordptr = (int *) ((unsigned long) ptr & ~3);                 \
+                                                                       \
+    shift = (((unsigned long) ptr & 3) << 3) ^ INVERT_MASK_##WIDTH;    \
+    mask = MASK_##WIDTH << shift;                                      \
+                                                                       \
+    do {                                                               \
+      oldval = *wordptr;                                               \
+      newval = (oldval & ~mask)                                                \
+              | (((unsigned int) val << shift) & mask);                \
+      failure = __kernel_cmpxchg (oldval, newval, wordptr);            \
+    } while (failure != 0);                                            \
+                                                                       \
+    return (oldval & mask) >> shift;                                   \
+  }
+
+SUBWORD_TEST_AND_SET (short, 2)
+SUBWORD_TEST_AND_SET (char,  1)
+
+#define SYNC_LOCK_RELEASE(TYPE, WIDTH)                                 \
+  void HIDDEN                                                          \
+  __sync_lock_release_##WIDTH (TYPE *ptr)                              \
+  {                                                                    \
+    *ptr = 0;                                                          \
+  }
+
+SYNC_LOCK_RELEASE (int,   4)
+SYNC_LOCK_RELEASE (short, 2)
+SYNC_LOCK_RELEASE (char,  1)
Index: gcc/config/pa/t-linux64
===================================================================
--- gcc/config/pa/t-linux64     (revision 137796)
+++ gcc/config/pa/t-linux64     (working copy)
@@ -8,5 +8,7 @@
 # Actually, hppa64 is always PIC but adding -fPIC does no harm.
 CRTSTUFF_T_CFLAGS_S = -fPIC
 
+LIB2FUNCS_STATIC_EXTRA = $(srcdir)/config/pa/linux-atomic.c
+
 # Compile libgcc2.a as PIC.
 TARGET_LIBGCC2_CFLAGS = -fPIC -Dpa64=1 -DELF=1
Index: gcc/config/pa/t-linux
===================================================================
--- gcc/config/pa/t-linux       (revision 137796)
+++ gcc/config/pa/t-linux       (working copy)
@@ -9,6 +9,7 @@
 TARGET_LIBGCC2_CFLAGS = -fPIC -DELF=1 -DLINUX=1
 
 LIB2FUNCS_EXTRA=fptr.c
+LIB2FUNCS_STATIC_EXTRA = $(srcdir)/config/pa/linux-atomic.c
 
 fptr.c: $(srcdir)/config/pa/fptr.c
        rm -f fptr.c


* PING Re: [PATCH, HPPA] Atomic builtins using kernel helpers for Linux  (try #4)
  2008-07-15  0:34     ` [PATCH, HPPA] Atomic builtins using kernel helpers for Linux (try #4) Helge Deller
@ 2008-10-29 14:00       ` Andrew Haley
  2008-10-29 23:50         ` Helge Deller
  0 siblings, 1 reply; 11+ messages in thread
From: Andrew Haley @ 2008-10-29 14:00 UTC (permalink / raw)
  To: Helge Deller; +Cc: gcc-patches

Helge Deller wrote:
> This is the 4th version of the patch for atomic builtin functions for hppa.
> It should address all issues that Dave and Carlos brought up:
> - 32bit userspace support, 64bit-ready (there is no 64bit userspace on hppa yet)
> - abort with SIGILL when the Linux kernel does not include the kernel helpers (-ENOSYS)
> - abort with SIGILL when userspace points to an illegal address (-EFAULT)
> 
> Ok for mainline?

Did this patch die due to lack of interest?  Looked good to me.

Andrew.


* Re: PING Re: [PATCH, HPPA] Atomic builtins using kernel helpers for  Linux (try #4)
  2008-10-29 14:00       ` PING " Andrew Haley
@ 2008-10-29 23:50         ` Helge Deller
  2008-10-30 12:13           ` Andrew Haley
  0 siblings, 1 reply; 11+ messages in thread
From: Helge Deller @ 2008-10-29 23:50 UTC (permalink / raw)
  To: Andrew Haley; +Cc: gcc-patches

Andrew Haley wrote:
> Helge Deller wrote:
>> This is the 4th version of the patch for atomic builtin functions for hppa.
>> It should address all issues that Dave and Carlos brought up:
>> - 32bit userspace support, 64bit-ready (there is no 64bit userspace on hppa yet)
>> - abort with SIGILL when the Linux kernel does not include the kernel helpers (-ENOSYS)
>> - abort with SIGILL when userspace points to an illegal address (-EFAULT)
>>
>> Ok for mainline?
> 
> Did this patch die due to lack of interest?  Looked good to me.

It was applied:
http://gcc.gnu.org/ml/gcc-patches/2008-09/msg00553.html

Helge


* Re: PING Re: [PATCH, HPPA] Atomic builtins using kernel helpers for  Linux (try #4)
  2008-10-29 23:50         ` Helge Deller
@ 2008-10-30 12:13           ` Andrew Haley
  0 siblings, 0 replies; 11+ messages in thread
From: Andrew Haley @ 2008-10-30 12:13 UTC (permalink / raw)
  To: Julian Brown; +Cc: Helge Deller, gcc-patches

Helge Deller wrote:
> Andrew Haley wrote:
>> Helge Deller wrote:
>>> This is the 4th version of the patch for atomic builtin functions for hppa.
>>> It should address all issues that Dave and Carlos brought up:
>>> - 32bit userspace support, 64bit-ready (there is no 64bit userspace on hppa yet)
>>> - abort with SIGILL when the Linux kernel does not include the kernel helpers (-ENOSYS)
>>> - abort with SIGILL when userspace points to an illegal address (-EFAULT)
>>>
>>> Ok for mainline?
>>
>> Did this patch die due to lack of interest?  Looked good to me.
> 
> It was applied:
> http://gcc.gnu.org/ml/gcc-patches/2008-09/msg00553.html

Thanks.  So, the original ARM version seems to have died, but the HP-PA
code -- which was IIRC a derivative of it -- was applied.

svn log svn+ssh://gcc.gnu.org/svn/gcc/trunk/gcc/config/arm/linux-atomic.c
svn: File not found: revision 141459, path '/trunk/gcc/config/arm/linux-atomic.c'

Andrew.


* Re: [PATCH, HPPA] Atomic builtins using kernel helpers for Linux
       [not found] <487BBC76.9000204@gmx.de>
@ 2008-09-07 17:20 ` John David Anglin
  0 siblings, 0 replies; 11+ messages in thread
From: John David Anglin @ 2008-09-07 17:20 UTC (permalink / raw)
  To: Helge Deller; +Cc: carlos, gcc-patches

Helge,

> Attached is my latest version (version 4) of the patch which includes 
> the ABORT_INSTRUCTION as well.

After testing and making a few small changes, I have committed the
patch as shown below to the GCC trunk.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

2008-09-07  Helge Deller  <deller@gmx.de>

        * pa/linux-atomic.c: New file.
	* pa/t-linux (LIB2FUNCS_STATIC_EXTRA): Define.
	* pa/t-linux64 (LIB2FUNCS_STATIC_EXTRA): Define.

--- /dev/null	2008-08-30 12:10:22.758214772 -0400
+++ config/pa/linux-atomic.c	2008-09-07 11:01:38.000000000 -0400
@@ -0,0 +1,300 @@
+/* Linux-specific atomic operations for PA Linux.
+   Copyright (C) 2008 Free Software Foundation, Inc.
+   Based on code contributed by CodeSourcery for ARM EABI Linux.
+   Modifications for PA Linux by Helge Deller <deller@gmx.de>
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 2, or (at your option) any later
+version.
+
+In addition to the permissions in the GNU General Public License, the
+Free Software Foundation gives you unlimited permission to link the
+compiled version of this file into combinations with other programs,
+and to distribute those combinations without any restriction coming
+from the use of this file.  (The General Public License restrictions
+do apply in other respects; for example, they cover modification of
+the file, and distribution when not linked into a combine
+executable.)
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING.  If not, write to the Free
+Software Foundation, 51 Franklin Street, Fifth Floor, Boston, MA
+02110-1301, USA.  */
+
+#include <errno.h>
+
+/* All PA-RISC implementations supported by Linux have strongly
+   ordered loads and stores.  Only cache flushes and purges can be
+   delayed.  The data cache implementations are all globally
+   coherent.  Thus, there is no need to synchronize memory accesses.
+
+   GCC automatically issues an asm memory barrier when it encounters
+   a __sync_synchronize builtin.  Thus, we do not need to define this
+   builtin.
+
+   We implement byte, short and int versions of each atomic operation
+   using the kernel helper defined below.  There is no support for
+   64-bit operations yet.  */
+
+/* A privileged instruction to crash a userspace program with SIGILL.  */
+#define ABORT_INSTRUCTION asm ("iitlbp %r0,(%sr0, %r0)")
+
+/* Determine kernel LWS function call (0=32-bit, 1=64-bit userspace).  */
+#define LWS_CAS (sizeof(unsigned long) == 4 ? 0 : 1)
+
+/* Kernel helper for compare-and-exchange of a 32-bit value.  */
+static inline long
+__kernel_cmpxchg (int oldval, int newval, int *mem)
+{
+  register unsigned long lws_mem asm("r26") = (unsigned long) (mem);
+  register long lws_ret   asm("r28");
+  register long lws_errno asm("r21");
+  register int lws_old asm("r25") = oldval;
+  register int lws_new asm("r24") = newval;
+  asm volatile (	"ble	0xb0(%%sr2, %%r0)	\n\t"
+			"ldi	%5, %%r20		\n\t"
+	: "=r" (lws_ret), "=r" (lws_errno), "=r" (lws_mem),
+	  "=r" (lws_old), "=r" (lws_new)
+	: "i" (LWS_CAS), "2" (lws_mem), "3" (lws_old), "4" (lws_new)
+	: "r1", "r20", "r22", "r23", "r29", "r31", "memory"
+  );
+  if (__builtin_expect (lws_errno == -EFAULT || lws_errno == -ENOSYS, 0))
+    ABORT_INSTRUCTION;
+  return lws_errno;
+}
+
+#define HIDDEN __attribute__ ((visibility ("hidden")))
+
+/* Big endian masks  */
+#define INVERT_MASK_1 24
+#define INVERT_MASK_2 16
+
+#define MASK_1 0xffu
+#define MASK_2 0xffffu
+
+#define FETCH_AND_OP_WORD(OP, PFX_OP, INF_OP)				\
+  int HIDDEN								\
+  __sync_fetch_and_##OP##_4 (int *ptr, int val)				\
+  {									\
+    int failure, tmp;							\
+									\
+    do {								\
+      tmp = *ptr;							\
+      failure = __kernel_cmpxchg (tmp, PFX_OP tmp INF_OP val, ptr);	\
+    } while (failure != 0);						\
+									\
+    return tmp;								\
+  }
+
+FETCH_AND_OP_WORD (add,   , +)
+FETCH_AND_OP_WORD (sub,   , -)
+FETCH_AND_OP_WORD (or,    , |)
+FETCH_AND_OP_WORD (and,   , &)
+FETCH_AND_OP_WORD (xor,   , ^)
+FETCH_AND_OP_WORD (nand, ~, &)
+
+#define NAME_oldval(OP, WIDTH) __sync_fetch_and_##OP##_##WIDTH
+#define NAME_newval(OP, WIDTH) __sync_##OP##_and_fetch_##WIDTH
+
+/* Implement both __sync_<op>_and_fetch and __sync_fetch_and_<op> for
+   subword-sized quantities.  */
+
+#define SUBWORD_SYNC_OP(OP, PFX_OP, INF_OP, TYPE, WIDTH, RETURN)	\
+  TYPE HIDDEN								\
+  NAME##_##RETURN (OP, WIDTH) (TYPE *ptr, TYPE val)			\
+  {									\
+    int *wordptr = (int *) ((unsigned long) ptr & ~3);			\
+    unsigned int mask, shift, oldval, newval;				\
+    int failure;							\
+									\
+    shift = (((unsigned long) ptr & 3) << 3) ^ INVERT_MASK_##WIDTH;	\
+    mask = MASK_##WIDTH << shift;					\
+									\
+    do {								\
+      oldval = *wordptr;						\
+      newval = ((PFX_OP ((oldval & mask) >> shift)			\
+                 INF_OP (unsigned int) val) << shift) & mask;		\
+      newval |= oldval & ~mask;						\
+      failure = __kernel_cmpxchg (oldval, newval, wordptr);		\
+    } while (failure != 0);						\
+									\
+    return (RETURN & mask) >> shift;					\
+  }
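+
+/* A worked example of the shift computation above: on this big-endian
+   target, a short at byte offset 0 within its word gets
+   shift = (0 << 3) ^ INVERT_MASK_2 = 16 (the high halfword), while one
+   at byte offset 2 gets shift = (2 << 3) ^ 16 = 0 (the low halfword).  */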
+
+SUBWORD_SYNC_OP (add,   , +, short, 2, oldval)
+SUBWORD_SYNC_OP (sub,   , -, short, 2, oldval)
+SUBWORD_SYNC_OP (or,    , |, short, 2, oldval)
+SUBWORD_SYNC_OP (and,   , &, short, 2, oldval)
+SUBWORD_SYNC_OP (xor,   , ^, short, 2, oldval)
+SUBWORD_SYNC_OP (nand, ~, &, short, 2, oldval)
+
+SUBWORD_SYNC_OP (add,   , +, char, 1, oldval)
+SUBWORD_SYNC_OP (sub,   , -, char, 1, oldval)
+SUBWORD_SYNC_OP (or,    , |, char, 1, oldval)
+SUBWORD_SYNC_OP (and,   , &, char, 1, oldval)
+SUBWORD_SYNC_OP (xor,   , ^, char, 1, oldval)
+SUBWORD_SYNC_OP (nand, ~, &, char, 1, oldval)
+
+#define OP_AND_FETCH_WORD(OP, PFX_OP, INF_OP)				\
+  int HIDDEN								\
+  __sync_##OP##_and_fetch_4 (int *ptr, int val)				\
+  {									\
+    int tmp, failure;							\
+									\
+    do {								\
+      tmp = *ptr;							\
+      failure = __kernel_cmpxchg (tmp, PFX_OP tmp INF_OP val, ptr);	\
+    } while (failure != 0);						\
+									\
+    return PFX_OP tmp INF_OP val;					\
+  }
+
+OP_AND_FETCH_WORD (add,   , +)
+OP_AND_FETCH_WORD (sub,   , -)
+OP_AND_FETCH_WORD (or,    , |)
+OP_AND_FETCH_WORD (and,   , &)
+OP_AND_FETCH_WORD (xor,   , ^)
+OP_AND_FETCH_WORD (nand, ~, &)
+
+SUBWORD_SYNC_OP (add,   , +, short, 2, newval)
+SUBWORD_SYNC_OP (sub,   , -, short, 2, newval)
+SUBWORD_SYNC_OP (or,    , |, short, 2, newval)
+SUBWORD_SYNC_OP (and,   , &, short, 2, newval)
+SUBWORD_SYNC_OP (xor,   , ^, short, 2, newval)
+SUBWORD_SYNC_OP (nand, ~, &, short, 2, newval)
+
+SUBWORD_SYNC_OP (add,   , +, char, 1, newval)
+SUBWORD_SYNC_OP (sub,   , -, char, 1, newval)
+SUBWORD_SYNC_OP (or,    , |, char, 1, newval)
+SUBWORD_SYNC_OP (and,   , &, char, 1, newval)
+SUBWORD_SYNC_OP (xor,   , ^, char, 1, newval)
+SUBWORD_SYNC_OP (nand, ~, &, char, 1, newval)
+
+int HIDDEN
+__sync_val_compare_and_swap_4 (int *ptr, int oldval, int newval)
+{
+  int actual_oldval, fail;
+    
+  while (1)
+    {
+      actual_oldval = *ptr;
+
+      if (oldval != actual_oldval)
+	return actual_oldval;
+
+      fail = __kernel_cmpxchg (actual_oldval, newval, ptr);
+  
+      if (!fail)
+	return oldval;
+    }
+}
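+
+/* The loop above retries only when the kernel CAS fails although the
+   value we read matched OLDVAL, i.e., when another thread changed the
+   word between our load and the kernel's compare.  */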
+
+#define SUBWORD_VAL_CAS(TYPE, WIDTH)					\
+  TYPE HIDDEN								\
+  __sync_val_compare_and_swap_##WIDTH (TYPE *ptr, TYPE oldval,		\
+				       TYPE newval)			\
+  {									\
+    int *wordptr = (int *)((unsigned long) ptr & ~3), fail;		\
+    unsigned int mask, shift, actual_oldval, actual_newval;		\
+									\
+    shift = (((unsigned long) ptr & 3) << 3) ^ INVERT_MASK_##WIDTH;	\
+    mask = MASK_##WIDTH << shift;					\
+									\
+    while (1)								\
+      {									\
+	actual_oldval = *wordptr;					\
+									\
+	if (((actual_oldval & mask) >> shift) != (unsigned int) oldval)	\
+          return (actual_oldval & mask) >> shift;			\
+									\
+	actual_newval = (actual_oldval & ~mask)				\
+			| (((unsigned int) newval << shift) & mask);	\
+									\
+	fail = __kernel_cmpxchg (actual_oldval, actual_newval,		\
+				 wordptr);				\
+									\
+	if (!fail)							\
+	  return oldval;						\
+      }									\
+  }
+
+SUBWORD_VAL_CAS (short, 2)
+SUBWORD_VAL_CAS (char,  1)
+
+typedef unsigned char bool;
+
+bool HIDDEN
+__sync_bool_compare_and_swap_4 (int *ptr, int oldval, int newval)
+{
+  int failure = __kernel_cmpxchg (oldval, newval, ptr);
+  return (failure == 0);
+}
+
+#define SUBWORD_BOOL_CAS(TYPE, WIDTH)					\
+  bool HIDDEN								\
+  __sync_bool_compare_and_swap_##WIDTH (TYPE *ptr, TYPE oldval,		\
+					TYPE newval)			\
+  {									\
+    TYPE actual_oldval							\
+      = __sync_val_compare_and_swap_##WIDTH (ptr, oldval, newval);	\
+    return (oldval == actual_oldval);					\
+  }
+
+SUBWORD_BOOL_CAS (short, 2)
+SUBWORD_BOOL_CAS (char,  1)
+
+int HIDDEN
+__sync_lock_test_and_set_4 (int *ptr, int val)
+{
+  int failure, oldval;
+
+  do {
+    oldval = *ptr;
+    failure = __kernel_cmpxchg (oldval, val, ptr);
+  } while (failure != 0);
+
+  return oldval;
+}
+
+#define SUBWORD_TEST_AND_SET(TYPE, WIDTH)				\
+  TYPE HIDDEN								\
+  __sync_lock_test_and_set_##WIDTH (TYPE *ptr, TYPE val)		\
+  {									\
+    int failure;							\
+    unsigned int oldval, newval, shift, mask;				\
+    int *wordptr = (int *) ((unsigned long) ptr & ~3);			\
+									\
+    shift = (((unsigned long) ptr & 3) << 3) ^ INVERT_MASK_##WIDTH;	\
+    mask = MASK_##WIDTH << shift;					\
+									\
+    do {								\
+      oldval = *wordptr;						\
+      newval = (oldval & ~mask)						\
+	       | (((unsigned int) val << shift) & mask);		\
+      failure = __kernel_cmpxchg (oldval, newval, wordptr);		\
+    } while (failure != 0);						\
+									\
+    return (oldval & mask) >> shift;					\
+  }
+
+SUBWORD_TEST_AND_SET (short, 2)
+SUBWORD_TEST_AND_SET (char,  1)
+
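+/* Loads and stores are strongly ordered on this target (see the
+   comment at the top of this file), so a plain store is sufficient
+   to implement __sync_lock_release; no barrier is needed.  */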
+#define SYNC_LOCK_RELEASE(TYPE, WIDTH)					\
+  void HIDDEN								\
+  __sync_lock_release_##WIDTH (TYPE *ptr)				\
+  {									\
+    *ptr = 0;								\
+  }
+
+SYNC_LOCK_RELEASE (int,   4)
+SYNC_LOCK_RELEASE (short, 2)
+SYNC_LOCK_RELEASE (char,  1)
Index: gcc/config/pa/t-linux64
===================================================================
--- config/pa/t-linux64	(revision 137974)
+++ config/pa/t-linux64	(working copy)
@@ -8,5 +8,7 @@
 # Actually, hppa64 is always PIC but adding -fPIC does no harm.
 CRTSTUFF_T_CFLAGS_S = -fPIC
 
+LIB2FUNCS_STATIC_EXTRA = $(srcdir)/config/pa/linux-atomic.c
+
 # Compile libgcc2.a as PIC.
 TARGET_LIBGCC2_CFLAGS = -fPIC -Dpa64=1 -DELF=1
Index: gcc/config/pa/t-linux
===================================================================
--- config/pa/t-linux	(revision 137974)
+++ config/pa/t-linux	(working copy)
@@ -9,6 +9,7 @@
 TARGET_LIBGCC2_CFLAGS = -fPIC -DELF=1 -DLINUX=1
 
 LIB2FUNCS_EXTRA=fptr.c
+LIB2FUNCS_STATIC_EXTRA = $(srcdir)/config/pa/linux-atomic.c
 
 fptr.c: $(srcdir)/config/pa/fptr.c
 	rm -f fptr.c

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH, HPPA] Atomic builtins using kernel helpers for Linux
  2008-07-10 19:57         ` Mark Mitchell
@ 2008-07-13 16:26           ` Helge Deller
  0 siblings, 0 replies; 11+ messages in thread
From: Helge Deller @ 2008-07-13 16:26 UTC (permalink / raw)
  To: gcc-patches

Mark Mitchell wrote:

> Daniel Jacobowitz wrote:
>> On Thu, Jul 10, 2008 at 10:36:13AM -0400, Carlos O'Donell wrote:
>>> Do you think that will get messy?
>> 
>> Yes.  I also think it's of limited use.  Out of line calls make sense
>> for the architectures which need kernel helpers, which on Linux I
>> think is ARM pre-v6K, hppa, and maybe coldfire.  Anyone with
>> instructions is going to want to use them inline.
> 
> I think that if Julian's patch is OK for ARM, then we should put it in.
>   If parts of it can be reused on HPPA, ColdFire, etc., then we can
> refactor at that point.

Below is the equivalent patch for HPPA.
As mentioned before, there are not many differences between this
version and the ARM version.

OK for mainline as well?

Helge

ChangeLog

    gcc/
    * config/pa/t-linux (LIB2FUNCS_STATIC_EXTRA): Add
    config/pa/linux-atomic.c.
    * config/pa/t-linux64 (LIB2FUNCS_STATIC_EXTRA): Add
    config/pa/linux-atomic.c.
    * config/pa/linux-atomic.c: New.
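
For illustration, a minimal sketch of user code that these helpers
enable on hppa-linux (not part of the patch; the thread and iteration
counts are arbitrary):

#include <pthread.h>
#include <stdio.h>

static int counter;

static void *worker (void *arg)
{
  int i;
  for (i = 0; i < 100000; i++)
    __sync_fetch_and_add (&counter, 1);  /* calls __sync_fetch_and_add_4 */
  return NULL;
}

int main (void)
{
  pthread_t t[4];
  int i;
  for (i = 0; i < 4; i++)
    pthread_create (&t[i], NULL, worker, NULL);
  for (i = 0; i < 4; i++)
    pthread_join (t[i], NULL);
  printf ("counter = %d (expect 400000)\n", counter);
  return 0;
}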


Index: gcc/config/pa/linux-atomic.c
===================================================================
--- gcc/config/pa/linux-atomic.c        (revision 0)
+++ gcc/config/pa/linux-atomic.c        (revision 0)
@@ -0,0 +1,299 @@
+/* Linux-specific atomic operations for PA Linux.
+   Copyright (C) 2008 Free Software Foundation, Inc.
+   Based on code contributed by CodeSourcery for ARM EABI Linux.
+   Modifications for PA Linux by Helge Deller <deller@gmx.de>
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 2, or (at your option) any later
+version.
+
+In addition to the permissions in the GNU General Public License, the
+Free Software Foundation gives you unlimited permission to link the
+compiled version of this file into combinations with other programs,
+and to distribute those combinations without any restriction coming
+from the use of this file.  (The General Public License restrictions
+do apply in other respects; for example, they cover modification of
+the file, and distribution when not linked into a combined
+executable.)
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING.  If not, write to the Free
+Software Foundation, 51 Franklin Street, Fifth Floor, Boston, MA
+02110-1301, USA.  */
+
+/* Determine kernel LWS function call (0=32bit, 1=64bit userspace)  */
+#define LWS_CAS (sizeof(unsigned long) == 4 ? 0 : 1)
+
+/* Kernel helper for compare-and-exchange.  */
+#define __kernel_cmpxchg( oldval, newval, mem )                                \
+  ({                                                                   \
+    register long lws_ret   asm("r28");                                        \
+    register long lws_errno asm("r21");                                        \
+    register unsigned long lws_mem asm("r26") = (unsigned long) (mem); \
+    register long lws_old asm("r25") = (oldval);                       \
+    register long lws_new asm("r24") = (newval);                       \
+    asm volatile(      "ble    0xb0(%%sr2, %%r0)       \n\t"           \
+                       "ldi    %5, %%r20               \n\t"           \
+       : "=r" (lws_ret), "=r" (lws_errno), "=r" (lws_mem),             \
+         "=r" (lws_old), "=r" (lws_new)                                \
+       : "i" (LWS_CAS), "2" (lws_mem), "3" (lws_old), "4" (lws_new)    \
+       : "r1", "r20", "r22", "r23", "r31", "memory"                    \
+    );                                                                         \
+    lws_errno;                                                         \
+   })
+
+/* Kernel helper for memory barrier.  */
+#define __kernel_dmb() asm volatile ( "" : : : "memory" );
+
+/* Note: we implement byte, short and int versions of atomic operations
+   using the above kernel helpers, but there is no support for
+   "long long" (64-bit) operations as yet.  */
+
+#define HIDDEN __attribute__ ((visibility ("hidden")))
+
+#if __BYTE_ORDER == __LITTLE_ENDIAN
+#define INVERT_MASK_1 0
+#define INVERT_MASK_2 0
+#elif __BYTE_ORDER == __BIG_ENDIAN
+#define INVERT_MASK_1 24
+#define INVERT_MASK_2 16
+#else
+#error "Endianness missing"
+#endif
+
+#define MASK_1 0xffu
+#define MASK_2 0xffffu
+
+#define FETCH_AND_OP_WORD(OP, PFX_OP, INF_OP)                          \
+  int HIDDEN                                                           \
+  __sync_fetch_and_##OP##_4 (int *ptr, int val)                                \
+  {                                                                    \
+    int failure, tmp;                                                  \
+                                                                       \
+    do {                                                               \
+      tmp = *ptr;                                                      \
+      failure = __kernel_cmpxchg (tmp, PFX_OP tmp INF_OP val, ptr);    \
+    } while (failure != 0);                                            \
+                                                                       \
+    return tmp;                                                                \
+  }
+
+FETCH_AND_OP_WORD (add,   , +)
+FETCH_AND_OP_WORD (sub,   , -)
+FETCH_AND_OP_WORD (or,    , |)
+FETCH_AND_OP_WORD (and,   , &)
+FETCH_AND_OP_WORD (xor,   , ^)
+FETCH_AND_OP_WORD (nand, ~, &)
+
+#define NAME_oldval(OP, WIDTH) __sync_fetch_and_##OP##_##WIDTH
+#define NAME_newval(OP, WIDTH) __sync_##OP##_and_fetch_##WIDTH
+
+/* Implement both __sync_<op>_and_fetch and __sync_fetch_and_<op> for
+   subword-sized quantities.  */
+
+#define SUBWORD_SYNC_OP(OP, PFX_OP, INF_OP, TYPE, WIDTH, RETURN)       \
+  TYPE HIDDEN                                                          \
+  NAME##_##RETURN (OP, WIDTH) (TYPE *ptr, TYPE val)                    \
+  {                                                                    \
+    int *wordptr = (int *) ((unsigned int) ptr & ~3);                  \
+    unsigned int mask, shift, oldval, newval;                          \
+    int failure;                                                       \
+                                                                       \
+    shift = (((unsigned int) ptr & 3) << 3) ^ INVERT_MASK_##WIDTH;     \
+    mask = MASK_##WIDTH << shift;                                      \
+                                                                       \
+    do {                                                               \
+      oldval = *wordptr;                                               \
+      newval = ((PFX_OP ((oldval & mask) >> shift)                     \
+                 INF_OP (unsigned int) val) << shift) & mask;          \
+      newval |= oldval & ~mask;                                                \
+      failure = __kernel_cmpxchg (oldval, newval, wordptr);            \
+    } while (failure != 0);                                            \
+                                                                       \
+    return (RETURN & mask) >> shift;                                   \
+  }
+
+SUBWORD_SYNC_OP (add,   , +, short, 2, oldval)
+SUBWORD_SYNC_OP (sub,   , -, short, 2, oldval)
+SUBWORD_SYNC_OP (or,    , |, short, 2, oldval)
+SUBWORD_SYNC_OP (and,   , &, short, 2, oldval)
+SUBWORD_SYNC_OP (xor,   , ^, short, 2, oldval)
+SUBWORD_SYNC_OP (nand, ~, &, short, 2, oldval)
+
+SUBWORD_SYNC_OP (add,   , +, char, 1, oldval)
+SUBWORD_SYNC_OP (sub,   , -, char, 1, oldval)
+SUBWORD_SYNC_OP (or,    , |, char, 1, oldval)
+SUBWORD_SYNC_OP (and,   , &, char, 1, oldval)
+SUBWORD_SYNC_OP (xor,   , ^, char, 1, oldval)
+SUBWORD_SYNC_OP (nand, ~, &, char, 1, oldval)
+
+#define OP_AND_FETCH_WORD(OP, PFX_OP, INF_OP)                          \
+  int HIDDEN                                                           \
+  __sync_##OP##_and_fetch_4 (int *ptr, int val)                                \
+  {                                                                    \
+    int tmp, failure;                                                  \
+                                                                       \
+    do {                                                               \
+      tmp = *ptr;                                                      \
+      failure = __kernel_cmpxchg (tmp, PFX_OP tmp INF_OP val, ptr);    \
+    } while (failure != 0);                                            \
+                                                                       \
+    return PFX_OP tmp INF_OP val;                                      \
+  }
+
+OP_AND_FETCH_WORD (add,   , +)
+OP_AND_FETCH_WORD (sub,   , -)
+OP_AND_FETCH_WORD (or,    , |)
+OP_AND_FETCH_WORD (and,   , &)
+OP_AND_FETCH_WORD (xor,   , ^)
+OP_AND_FETCH_WORD (nand, ~, &)
+
+SUBWORD_SYNC_OP (add,   , +, short, 2, newval)
+SUBWORD_SYNC_OP (sub,   , -, short, 2, newval)
+SUBWORD_SYNC_OP (or,    , |, short, 2, newval)
+SUBWORD_SYNC_OP (and,   , &, short, 2, newval)
+SUBWORD_SYNC_OP (xor,   , ^, short, 2, newval)
+SUBWORD_SYNC_OP (nand, ~, &, short, 2, newval)
+
+SUBWORD_SYNC_OP (add,   , +, char, 1, newval)
+SUBWORD_SYNC_OP (sub,   , -, char, 1, newval)
+SUBWORD_SYNC_OP (or,    , |, char, 1, newval)
+SUBWORD_SYNC_OP (and,   , &, char, 1, newval)
+SUBWORD_SYNC_OP (xor,   , ^, char, 1, newval)
+SUBWORD_SYNC_OP (nand, ~, &, char, 1, newval)
+
+int HIDDEN
+__sync_val_compare_and_swap_4 (int *ptr, int oldval, int newval)
+{
+  int actual_oldval, fail;
+    
+  while (1)
+    {
+      actual_oldval = *ptr;
+
+      if (oldval != actual_oldval)
+       return actual_oldval;
+
+      fail = __kernel_cmpxchg (actual_oldval, newval, ptr);
+  
+      if (!fail)
+        return oldval;
+    }
+}
+
+#define SUBWORD_VAL_CAS(TYPE, WIDTH)                                   \
+  TYPE HIDDEN                                                          \
+  __sync_val_compare_and_swap_##WIDTH (TYPE *ptr, TYPE oldval,         \
+                                      TYPE newval)                     \
+  {                                                                    \
+    int *wordptr = (int *)((unsigned int) ptr & ~3), fail;             \
+    unsigned int mask, shift, actual_oldval, actual_newval;            \
+                                                                       \
+    shift = (((unsigned int) ptr & 3) << 3) ^ INVERT_MASK_##WIDTH;     \
+    mask = MASK_##WIDTH << shift;                                      \
+                                                                       \
+    while (1)                                                          \
+      {                                                                        \
+       actual_oldval = *wordptr;                                       \
+                                                                       \
+       if (((actual_oldval & mask) >> shift) != (unsigned int) oldval) \
+          return (actual_oldval & mask) >> shift;                      \
+                                                                       \
+       actual_newval = (actual_oldval & ~mask)                         \
+                       | (((unsigned int) newval << shift) & mask);    \
+                                                                       \
+       fail = __kernel_cmpxchg (actual_oldval, actual_newval,          \
+                                wordptr);                              \
+                                                                       \
+       if (!fail)                                                      \
+          return oldval;                                               \
+      }                                                                        \
+  }
+
+SUBWORD_VAL_CAS (short, 2)
+SUBWORD_VAL_CAS (char,  1)
+
+typedef unsigned char bool;
+
+bool HIDDEN
+__sync_bool_compare_and_swap_4 (int *ptr, int oldval, int newval)
+{
+  int failure = __kernel_cmpxchg (oldval, newval, ptr);
+  return (failure == 0);
+}
+
+#define SUBWORD_BOOL_CAS(TYPE, WIDTH)                                  \
+  bool HIDDEN                                                          \
+  __sync_bool_compare_and_swap_##WIDTH (TYPE *ptr, TYPE oldval,                \
+                                       TYPE newval)                    \
+  {                                                                    \
+    TYPE actual_oldval                                                 \
+      = __sync_val_compare_and_swap_##WIDTH (ptr, oldval, newval);     \
+    return (oldval == actual_oldval);                                  \
+  }
+
+SUBWORD_BOOL_CAS (short, 2)
+SUBWORD_BOOL_CAS (char,  1)
+
+void HIDDEN
+__sync_synchronize (void)
+{
+  __kernel_dmb ();
+}
+
+int HIDDEN
+__sync_lock_test_and_set_4 (int *ptr, int val)
+{
+  int failure, oldval;
+
+  do {
+    oldval = *ptr;
+    failure = __kernel_cmpxchg (oldval, val, ptr);
+  } while (failure != 0);
+
+  return oldval;
+}
+
+#define SUBWORD_TEST_AND_SET(TYPE, WIDTH)                              \
+  TYPE HIDDEN                                                          \
+  __sync_lock_test_and_set_##WIDTH (TYPE *ptr, TYPE val)               \
+  {                                                                    \
+    int failure;                                                       \
+    unsigned int oldval, newval, shift, mask;                          \
+    int *wordptr = (int *) ((unsigned int) ptr & ~3);                  \
+                                                                       \
+    shift = (((unsigned int) ptr & 3) << 3) ^ INVERT_MASK_##WIDTH;     \
+    mask = MASK_##WIDTH << shift;                                      \
+                                                                       \
+    do {                                                               \
+      oldval = *wordptr;                                               \
+      newval = (oldval & ~mask)                                                \
+              | (((unsigned int) val << shift) & mask);                \
+      failure = __kernel_cmpxchg (oldval, newval, wordptr);            \
+    } while (failure != 0);                                            \
+                                                                       \
+    return (oldval & mask) >> shift;                                   \
+  }
+
+SUBWORD_TEST_AND_SET (short, 2)
+SUBWORD_TEST_AND_SET (char,  1)
+
+#define SYNC_LOCK_RELEASE(TYPE, WIDTH)                                 \
+  void HIDDEN                                                          \
+  __sync_lock_release_##WIDTH (TYPE *ptr)                              \
+  {                                                                    \
+    *ptr = 0;                                                          \
+    __kernel_dmb ();                                                   \
+  }
+
+SYNC_LOCK_RELEASE (int,   4)
+SYNC_LOCK_RELEASE (short, 2)
+SYNC_LOCK_RELEASE (char,  1)
Index: gcc/config/pa/t-linux64
===================================================================
--- gcc/config/pa/t-linux64     (revision 137753)
+++ gcc/config/pa/t-linux64     (working copy)
@@ -8,5 +8,7 @@
 # Actually, hppa64 is always PIC but adding -fPIC does no harm.
 CRTSTUFF_T_CFLAGS_S = -fPIC
 
+LIB2FUNCS_STATIC_EXTRA = $(srcdir)/config/pa/linux-atomic.c
+
 # Compile libgcc2.a as PIC.
 TARGET_LIBGCC2_CFLAGS = -fPIC -Dpa64=1 -DELF=1
Index: gcc/config/pa/t-linux
===================================================================
--- gcc/config/pa/t-linux       (revision 137753)
+++ gcc/config/pa/t-linux       (working copy)
@@ -9,6 +9,7 @@
 TARGET_LIBGCC2_CFLAGS = -fPIC -DELF=1 -DLINUX=1
 
 LIB2FUNCS_EXTRA=fptr.c
+LIB2FUNCS_STATIC_EXTRA = $(srcdir)/config/pa/linux-atomic.c
 
 fptr.c: $(srcdir)/config/pa/fptr.c
        rm -f fptr.c




^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2008-10-30  9:48 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-07-13 20:34 [PATCH, HPPA] Atomic builtins using kernel helpers for Linux John David Anglin
2008-07-13 20:39 ` Helge Deller
2008-07-14  0:02   ` Helge Deller
2008-07-15  0:34     ` [PATCH, HPPA] Atomic builtins using kernel helpers for Linux (try #4) Helge Deller
2008-10-29 14:00       ` PING " Andrew Haley
2008-10-29 23:50         ` Helge Deller
2008-10-30 12:13           ` Andrew Haley
2008-07-14  1:46   ` [PATCH, HPPA] Atomic builtins using kernel helpers for Linux John David Anglin
2008-07-14 13:37     ` Carlos O'Donell
     [not found] <487BBC76.9000204@gmx.de>
2008-09-07 17:20 ` John David Anglin
  -- strict thread matches above, loose matches on Subject: below --
2008-07-01 17:16 [PATCH, ARM] Atomic builtins using kernel helpers for Linux/EABI Julian Brown
2008-07-09 22:36 ` Helge Deller
2008-07-10 14:10   ` Andrew Haley
2008-07-10 15:37     ` Carlos O'Donell
2008-07-10 15:54       ` Daniel Jacobowitz
2008-07-10 19:57         ` Mark Mitchell
2008-07-13 16:26           ` [PATCH, HPPA] Atomic builtins using kernel helpers for Linux Helge Deller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).