public inbox for libc-hacker@sourceware.org
* new syscall stub support for ia64 libc
@ 2003-10-29  4:26 David Mosberger
  2003-10-29  9:51 ` Jakub Jelinek
  2003-10-29 17:54 ` Ulrich Drepper
  0 siblings, 2 replies; 98+ messages in thread
From: David Mosberger @ 2003-10-29  4:26 UTC (permalink / raw)
  To: libc-hacker; +Cc: davidm

Now that NPTL etc. have settled, I'd like to renew the effort in
getting the new system call stubs supported on ia64.  As you may
recall, the new syscall stubs are designed to support light-weight
system calls (by taking advantage of the EPC instruction).  The
light-weight syscalls can yield huge performance improvements.  For
example, gettimeofday() and sigprocmask() run about 3 times faster as
lightweight syscalls.

What I'd like to see is something that makes it possible for
non-threaded and NPTL apps to take advantage of the new syscall stubs.
If LinuxThreads apps don't get the benefit, that's OK, I suppose.

Does this sound reasonable?  If so, how should I go about this?
Should I re-send the (forward-ported) version of the original patch
that I did so you can see what's involved?  Any other suggestions?

Thanks,

	--david


* Re: new syscall stub support for ia64 libc
  2003-10-29  4:26 new syscall stub support for ia64 libc David Mosberger
@ 2003-10-29  9:51 ` Jakub Jelinek
  2003-10-30  8:04   ` David Mosberger
  2003-10-29 17:54 ` Ulrich Drepper
  1 sibling, 1 reply; 98+ messages in thread
From: Jakub Jelinek @ 2003-10-29  9:51 UTC (permalink / raw)
  To: davidm; +Cc: libc-hacker

On Tue, Oct 28, 2003 at 08:26:09PM -0800, David Mosberger wrote:
> Now that NPTL etc. have settled, I'd like to renew the effort in
> getting the new system call stubs supported on ia64.  As you may
> recall, the new syscall stubs are designed to support light-weight
> system calls (by taking advantage of the EPC instruction).  The
> light-weight syscalls can yield huge performance improvements.  For
> example, gettimeofday() and sigprocmask() run about 3 times faster as
> lightweight syscalls.
> 
> What I'd like to see is something that makes it possible for
> non-threaded and NPTL apps to take advantage of the new syscall stubs.
> If LinuxThreads apps don't get the benefit, that's OK, I suppose.
> 
> Does this sound reasonable?  If so, how should I go about this?
> Should I re-send the (forward-ported) version of the original patch
> that I did so you can see what's involved?  Any other suggestions?

Yes, please post a forward-ported complete tested patch (the last
version of the patch Ulrich and I saw was incomplete (but led to
discovery of a linker bug)).
NPTL is now in sources CVS, so things are way easier for diffing...
Thanks.

	Jakub


* Re: new syscall stub support for ia64 libc
  2003-10-29  4:26 new syscall stub support for ia64 libc David Mosberger
  2003-10-29  9:51 ` Jakub Jelinek
@ 2003-10-29 17:54 ` Ulrich Drepper
  1 sibling, 0 replies; 98+ messages in thread
From: Ulrich Drepper @ 2003-10-29 17:54 UTC (permalink / raw)
  To: davidm; +Cc: libc-hacker, davidm

David Mosberger wrote:

> What I'd like to see is something that makes it possible for
> non-threaded and NPTL apps to take advantage of the new syscall stubs.

We do this for x86 as well.  The thread register is simply set up for
every program, not only threaded programs.  And this is done on x86
inside ld.so.  I assume you want something like this.  Otherwise I don't
understand the reference to NPTL.


> If LinuxThreads apps don't get the benefit, that's OK, I suppose.

I couldn't care less about LT.

-- 
--------------.                        ,-.            444 Castro Street
Ulrich Drepper \    ,-----------------'   \ Mountain View, CA 94041 USA
Red Hat         `--' drepper at redhat.com `---------------------------


* Re: new syscall stub support for ia64 libc
  2003-10-29  9:51 ` Jakub Jelinek
@ 2003-10-30  8:04   ` David Mosberger
  2003-10-30  9:09     ` Jakub Jelinek
  2003-10-31  8:45     ` Richard Henderson
  0 siblings, 2 replies; 98+ messages in thread
From: David Mosberger @ 2003-10-30  8:04 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: davidm, libc-hacker

Hi Jakub,

>>>>> On Wed, 29 Oct 2003 08:44:36 +0100, Jakub Jelinek <jakub@redhat.com> said:

  Jakub> Yes, please post a forward-ported complete tested patch (the last
  Jakub> version of the patch Ulrich and I saw was incomplete (but led to
  Jakub> discovery of a linker bug)).
  Jakub> NPTL is now in sources CVS, so things are way easier for diffing...

OK, here is a preliminary patch.  Some comments:

 - The "assert (ph->p_vaddr == GL(dl_sysinfo_dso))" check in elf/rtld.c
   is too strict.  On ia64, we have two LOAD segments, so the check
   can't possibly succeed:

    $ readelf -l arch/ia64/kernel/gate.so |grep LOAD
    LOAD           0x0000000000000000 0xa000000000010000 0xa000000000010000
    LOAD           0x0000000000000000 0xa000000000020000 0xa000000000020000

   The patch below simply #if's out the code, but perhaps a better
   fix would be to check ph->p_vaddr only for the first LOAD segment?

 - The changes to linuxthreads/{manager,pthread}.c are almost certainly
   wrong, but I'm not sure I understand how you want things set up to
   ensure that single-threaded apps use the new stub but linuxthread
   apps use the old one.

 - The ia64-specific vfork.S files use the old-style stub for now.  I'm
   not sure whether this is still necessary and will look into it.

 - The libc-start.c change is also "wrong" but since DL_SYSDEP_OSCHECK()
   may do syscalls, it is necessary to do __pthread_initialize_minimal()
   first, as otherwise the minimal thread descriptor isn't set up.

The rest should be good.  With the patch applied, "make check" gets
through all the tests except the linuxthread ones.  I wasn't sure
whether it's worth tracking those down, since the linuxthreads part is
likely to change anyhow.

	--david

Index: elf/rtld.c
===================================================================
RCS file: /cvs/glibc/libc/elf/rtld.c,v
retrieving revision 1.299
diff -u -r1.299 rtld.c
--- elf/rtld.c	27 Oct 2003 20:08:32 -0000	1.299
+++ elf/rtld.c	30 Oct 2003 07:55:02 -0000
@@ -1169,8 +1169,10 @@
 		  l->l_ldnum = ph->p_memsz / sizeof (ElfW(Dyn));
 		  break;
 		}
+#if 0
 	      if (ph->p_type == PT_LOAD)
 		assert ((void *) ph->p_vaddr == GL(dl_sysinfo_dso));
+#endif
 	    }
 	  elf_get_dynamic_info (l, dyn_temp);
 	  _dl_setup_hash (l);
Index: linuxthreads/descr.h
===================================================================
RCS file: /cvs/glibc/libc/linuxthreads/descr.h,v
retrieving revision 1.14
diff -u -r1.14 descr.h
--- linuxthreads/descr.h	17 Sep 2003 09:39:00 -0000	1.14
+++ linuxthreads/descr.h	30 Oct 2003 07:55:02 -0000
@@ -189,7 +189,23 @@
 #endif
   size_t p_alloca_cutoff;	/* Maximum size which should be allocated
 				   using alloca() instead of malloc().  */
+#if TLS_TCB_AT_TP
   /* New elements must be added at the end.  */
+#else
+  union {
+    struct {
+      void *reserved[11];	/* reserve for future use */
+      void *tcb;		/* XXX do we really need this? */
+      union dtv *dtvp;		/* XXX do we really need this? */
+      pthread_descr self;	/* XXX do we really need this? */
+      int multiple_threads;
+#ifdef NEED_DL_SYSINFO
+      uintptr_t sysinfo;
+#endif
+    } data;
+    void *__padding[16];
+  } p_header __attribute__ ((aligned(32)));
+#endif
 } __attribute__ ((aligned(32))); /* We need to align the structure so that
 				    doubles are aligned properly.  This is 8
 				    bytes on MIPS and 16 bytes on MIPS64.
Index: linuxthreads/manager.c
===================================================================
RCS file: /cvs/glibc/libc/linuxthreads/manager.c,v
retrieving revision 1.96
diff -u -r1.96 manager.c
--- linuxthreads/manager.c	15 Oct 2003 05:53:06 -0000	1.96
+++ linuxthreads/manager.c	30 Oct 2003 07:55:02 -0000
@@ -650,6 +650,10 @@
 #if !defined USE_TLS || !TLS_DTV_AT_TP
   new_thread->p_header.data.tcb = new_thread;
   new_thread->p_header.data.self = new_thread;
+# if 1
+  /* XXX why isn't this done already??? */
+  new_thread->p_header.data.sysinfo = GL(dl_sysinfo);
+# endif
 #endif
 #if TLS_MULTIPLE_THREADS_IN_TCB || !defined USE_TLS || !TLS_DTV_AT_TP
   new_thread->p_multiple_threads = 1;
Index: linuxthreads/pthread.c
===================================================================
RCS file: /cvs/glibc/libc/linuxthreads/pthread.c,v
retrieving revision 1.132
diff -u -r1.132 pthread.c
--- linuxthreads/pthread.c	15 Oct 2003 05:53:44 -0000	1.132
+++ linuxthreads/pthread.c	30 Oct 2003 07:55:02 -0000
@@ -357,6 +357,11 @@
 
   self = THREAD_SELF;
 
+#if 1
+  /* XXX why isn't this done already??? */
+  self->p_header.data.sysinfo = GL(dl_sysinfo);
+#endif
+
   /* The memory for the thread descriptor was allocated elsewhere as
      part of the TLS allocation.  We have to initialize the data
      structure by hand.  This initialization must mirror the struct
@@ -676,6 +681,10 @@
   mgr->p_header.data.tcb = tcbp;
   mgr->p_header.data.self = mgr;
   mgr->p_header.data.multiple_threads = 1;
+# if 1
+  /* XXX why isn't this done already??? */
+  mgr->p_header.data.sysinfo = GL(dl_sysinfo);
+# endif
 #elif TLS_MULTIPLE_THREADS_IN_TCB
   mgr->p_multiple_threads = 1;
 #endif
Index: linuxthreads/sysdeps/ia64/tcb-offsets.sym
===================================================================
RCS file: /cvs/glibc/libc/linuxthreads/sysdeps/ia64/tcb-offsets.sym,v
retrieving revision 1.6
diff -u -r1.6 tcb-offsets.sym
--- linuxthreads/sysdeps/ia64/tcb-offsets.sym	25 Apr 2003 22:04:27 -0000	1.6
+++ linuxthreads/sysdeps/ia64/tcb-offsets.sym	30 Oct 2003 07:55:02 -0000
@@ -4,6 +4,7 @@
 --
 #ifdef USE_TLS
 MULTIPLE_THREADS_OFFSET offsetof (struct _pthread_descr_struct, p_multiple_threads) - sizeof (struct _pthread_descr_struct)
+SYSINFO_OFFSET		offsetof (struct _pthread_descr_struct, p_header.data.sysinfo) - sizeof (struct _pthread_descr_struct)
 #else
 MULTIPLE_THREADS_OFFSET offsetof (tcbhead_t, multiple_threads)
 #endif
Index: linuxthreads/sysdeps/ia64/tls.h
===================================================================
RCS file: /cvs/glibc/libc/linuxthreads/sysdeps/ia64/tls.h,v
retrieving revision 1.6
diff -u -r1.6 tls.h
--- linuxthreads/sysdeps/ia64/tls.h	31 Jul 2003 19:16:34 -0000	1.6
+++ linuxthreads/sysdeps/ia64/tls.h	30 Oct 2003 07:55:02 -0000
@@ -20,10 +20,13 @@
 #ifndef _TLS_H
 #define _TLS_H
 
+#include <dl-sysdep.h>
+
 #ifndef __ASSEMBLER__
 
 # include <pt-machine.h>
 # include <stddef.h>
+# include <stdint.h>
 
 /* Type for the dtv.  */
 typedef union dtv
@@ -83,8 +86,10 @@
 /* Code to initially initialize the thread pointer.  This might need
    special attention since 'errno' is not yet available and if the
    operation can cause a failure 'errno' must not be touched.  */
-#  define TLS_INIT_TP(tcbp, secondcall) \
-  (__thread_self = (__typeof (__thread_self)) (tcbp), NULL)
+#  define TLS_INIT_TP(tcbp, secondcall)			\
+  (__thread_self = (__typeof (__thread_self)) (tcbp),	\
+   THREAD_SELF->p_header.data.sysinfo = GL(dl_sysinfo),	\
+   NULL)
 
 /* Return the address of the dtv for the current thread.  */
 #  define THREAD_DTV() \
Index: linuxthreads/sysdeps/unix/sysv/linux/ia64/vfork.S
===================================================================
RCS file: /cvs/glibc/libc/linuxthreads/sysdeps/unix/sysv/linux/ia64/vfork.S,v
retrieving revision 1.4
diff -u -r1.4 vfork.S
--- linuxthreads/sysdeps/unix/sysv/linux/ia64/vfork.S	11 Feb 2003 06:27:53 -0000	1.4
+++ linuxthreads/sysdeps/unix/sysv/linux/ia64/vfork.S	30 Oct 2003 07:55:02 -0000
@@ -43,9 +43,13 @@
 	mov out0=CLONE_VM+CLONE_VFORK+SIGCHLD
 	mov out1=0		/* Standard sp value.			*/
 	;;
+#if 0
 	DO_CALL (SYS_ify (clone))
+#else
+	mov r15=SYS_ify(clone)
+	break __BREAK_SYSCALL
+#endif
 	cmp.eq p6,p0=-1,r10
-	;;
 (p6)	br.cond.spnt.few __syscall_error
 	ret
 PSEUDO_END(__vfork)
Index: sysdeps/generic/libc-start.c
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/generic/libc-start.c,v
retrieving revision 1.46
diff -u -r1.46 libc-start.c
--- sysdeps/generic/libc-start.c	31 Jul 2003 19:20:39 -0000	1.46
+++ sysdeps/generic/libc-start.c	30 Oct 2003 07:55:04 -0000
@@ -123,14 +123,6 @@
 #  endif
   _dl_aux_init (auxvec);
 # endif
-# ifdef DL_SYSDEP_OSCHECK
-  if (!__libc_multiple_libcs)
-    {
-      /* This needs to run to initiliaze _dl_osversion before TLS
-	 setup might check it.  */
-      DL_SYSDEP_OSCHECK (__libc_fatal);
-    }
-# endif
 
   /* Initialize the thread library at least a bit since the libgcc
      functions are using thread functions if these are available and
@@ -142,6 +134,15 @@
 # endif
     __pthread_initialize_minimal ();
 #endif
+
+# ifdef DL_SYSDEP_OSCHECK
+  if (!__libc_multiple_libcs)
+    {
+      /* This needs to run to initiliaze _dl_osversion before TLS
+	 setup might check it.  */
+      DL_SYSDEP_OSCHECK (__libc_fatal);
+    }
+# endif
 
   /* Register the destructor of the dynamic linker if there is any.  */
   if (__builtin_expect (rtld_fini != NULL, 1))
Index: sysdeps/unix/sysv/linux/ia64/clone2.S
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/unix/sysv/linux/ia64/clone2.S,v
retrieving revision 1.7
diff -u -r1.7 clone2.S
--- sysdeps/unix/sysv/linux/ia64/clone2.S	13 Mar 2003 04:36:59 -0000	1.7
+++ sysdeps/unix/sysv/linux/ia64/clone2.S	30 Oct 2003 07:55:07 -0000
@@ -25,49 +25,56 @@
 /* 	         size_t child_stack_size, int flags, void *arg,		*/
 /*	         pid_t *parent_tid, void *tls, pid_t *child_tid)	*/
 
+#define CHILD	p8
+#define PARENT	p9
+
 ENTRY(__clone2)
-	alloc r2=ar.pfs,8,2,6,0
+	.prologue
+	alloc r2=ar.pfs,8,0,6,0
 	cmp.eq p6,p0=0,in0
 	mov r8=EINVAL
-(p6)	br.cond.spnt.few __syscall_error
-	;;
-	flushrs			/* This is necessary, since the child	*/
-				/* will be running with the same 	*/
-				/* register backing store for a few 	*/
-				/* instructions.  We need to ensure	*/
-				/* that it will not read or write the	*/
-				/* backing store.			*/
-	mov loc0=in0		/* save fn	*/
-	mov loc1=in4		/* save arg	*/
 	mov out0=in3		/* Flags are first syscall argument.	*/
 	mov out1=in1		/* Stack address.			*/
+(p6)	br.cond.spnt.many __syscall_error
+	;;
 	mov out2=in2		/* Stack size.				*/
 	mov out3=in5		/* Parent TID Pointer			*/
 	mov out4=in7		/* Child TID Pointer			*/
  	mov out5=in6		/* TLS pointer				*/
-        DO_CALL (SYS_ify (clone2))
+	/*
+	 * clone2() is special: the child cannot execute br.ret right
+	 * after the system call returns, because it starts out
+	 * executing on an empty stack.  Because of this, we can't use
+	 * the new (lightweight) syscall convention here.  Instead, we
+	 * just fall back on always using "break".
+	 *
+	 * Furthermore, since the child starts with an empty stack, we
+	 * need to avoid unwinding past invalid memory.  To that end,
+	 * we'll pretend now that __clone2() is the end of the
+	 * call-chain.  This is wrong for the parent, but only until
+	 * it returns from clone2() but it's better than the
+	 * alternative.
+	 */
+	mov r15=SYS_ify (clone2)
+	.save rp, r0
+	break __BREAK_SYSCALL
+	.body
         cmp.eq p6,p0=-1,r10
+	cmp.eq CHILD,PARENT=0,r8 /* Are we the child?   */
+(p6)	br.cond.spnt.many __syscall_error
 	;;
-(p6)	br.cond.spnt.few __syscall_error
-
-#	define CHILD p6
-#	define PARENT p7
-	cmp.eq CHILD,PARENT=0,r8 /* Are we the child?	*/
-	;;
-(CHILD)	ld8 out1=[loc0],8	/* Retrieve code pointer.	*/
-(CHILD)	mov out0=loc1		/* Pass proper argument	to fn */
+(CHILD)	ld8 out1=[in0],8	/* Retrieve code pointer.	*/
+(CHILD)	mov out0=in4		/* Pass proper argument	to fn */
 (PARENT) ret
 	;;
-	ld8 gp=[loc0]		/* Load function gp.		*/
+	ld8 gp=[in0]		/* Load function gp.		*/
 	mov b6=out1
-	;;
-	br.call.dptk.few rp=b6	/* Call fn(arg) in the child 	*/
+	br.call.dptk.many rp=b6	/* Call fn(arg) in the child 	*/
 	;;
 	mov out0=r8		/* Argument to _exit		*/
 	.globl _exit
-	br.call.dpnt.few rp=_exit /* call _exit with result from fn.	*/
+	br.call.dpnt.many rp=_exit /* call _exit with result from fn.	*/
 	ret			/* Not reached.		*/
-
 PSEUDO_END(__clone2)
 
 /* For now we leave __clone undefined.  This is unlikely to be a	*/
Index: sysdeps/unix/sysv/linux/ia64/sysdep.h
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/unix/sysv/linux/ia64/sysdep.h,v
retrieving revision 1.17
diff -u -r1.17 sysdep.h
--- sysdeps/unix/sysv/linux/ia64/sysdep.h	16 Aug 2003 08:00:24 -0000	1.17
+++ sysdeps/unix/sysv/linux/ia64/sysdep.h	30 Oct 2003 07:55:08 -0000
@@ -23,6 +23,7 @@
 
 #include <sysdeps/unix/sysdep.h>
 #include <sysdeps/ia64/sysdep.h>
+#include <tls.h>
 
 /* For Linux we can use the system call table in the header file
 	/usr/include/asm/unistd.h
@@ -95,9 +96,32 @@
 	cmp.eq p6,p0=-1,r10;			\
 (p6)	br.cond.spnt.few __syscall_error;
 
-#define DO_CALL(num)				\
+#define DO_CALL_VIA_BREAK(num)			\
 	mov r15=num;				\
-	break __BREAK_SYSCALL;
+	break __BREAK_SYSCALL
+
+#if defined HAVE_TLS_SUPPORT && (!defined NOT_IN_libc || defined IS_IN_libpthread)
+
+/* Use the lightweight stub only if (a) we have a suitably modern
+   thread-control block (HAVE_TLS_SUPPORT) and (b) we're not compiling
+   the runtime loader (which might do syscalls before being fully
+   relocated). */
+
+#define DO_CALL(num)				\
+	.prologue;				\
+        adds r2 = SYSINFO_OFFSET, r13;;		\
+        ld8 r2 = [r2];				\
+	.save ar.pfs, r11;			\
+        mov r11 = ar.pfs;;			\
+	.body;					\
+	mov r15 = num;				\
+        mov b7 = r2;				\
+        br.call.sptk.many b6 = b7;;		\
+	.restore sp;				\
+        mov ar.pfs = r11
+#else
+#define DO_CALL(num)				DO_CALL_VIA_BREAK(num)
+#endif
 
 #undef PSEUDO_END
 #define PSEUDO_END(name)	.endp C_SYMBOL_NAME(name);
@@ -144,6 +168,47 @@
    (non-negative) errno on error or the return value on success.
  */
 #undef INLINE_SYSCALL
+#undef INTERNAL_SYSCALL
+#if defined HAVE_TLS_SUPPORT && (!defined NOT_IN_libc || defined IS_IN_libpthread)
+
+#define DO_INLINE_SYSCALL(name, nr, args...)							\
+    register long _r8 __asm ("r8");								\
+    register long _r10 __asm ("r10");								\
+    register long _r15 __asm ("r15") = __NR_##name;						\
+    long _retval;										\
+    LOAD_ARGS_##nr (args);									\
+    /*												\
+     * Don't specify any unwind info here.  We mark ar.pfs as clobbered.  This will force	\
+     * the compiler to save ar.pfs somewhere and emit appropriate unwind info for that		\
+     * save.											\
+     */												\
+    __asm __volatile ("adds r2 = -8, r13;;\n"							\
+		      "ld8 r2 = [r2];;\n"							\
+		      "mov b7=r2;\n"								\
+		      "br.call.sptk.many b6=b7;;\n"						\
+                      : "=r" (_r8), "=r" (_r10), "=r" (_r15) ASM_OUTARGS_##nr			\
+                      : "2" (_r15) ASM_ARGS_##nr						\
+		      : "memory", "ar.pfs" ASM_CLOBBERS_##nr);					\
+    _retval = _r8;
+
+#define INLINE_SYSCALL(name, nr, args...)	\
+  ({						\
+    DO_INLINE_SYSCALL(name, nr, args)		\
+    if (_r10 == -1)				\
+      {						\
+        __set_errno (_retval);			\
+        _retval = -1;				\
+      }						\
+    _retval; })
+
+#define INTERNAL_SYSCALL(name, err, nr, args...)	\
+  ({							\
+    DO_INLINE_SYSCALL(name, nr, args)			\
+    err = _r10;						\
+    _retval; })
+
+#else /* !new syscall-stub */
+
 #define INLINE_SYSCALL(name, nr, args...)			\
   ({								\
     register long _r8 asm ("r8");				\
@@ -164,10 +229,6 @@
       }								\
     _retval; })
 
-#undef INTERNAL_SYSCALL_DECL
-#define INTERNAL_SYSCALL_DECL(err) long int err
-
-#undef INTERNAL_SYSCALL
 #define INTERNAL_SYSCALL(name, err, nr, args...)		\
   ({								\
     register long _r8 asm ("r8");				\
@@ -183,6 +244,11 @@
     _retval = _r8;						\
     err = _r10;							\
     _retval; })
+
+#endif /* !new syscall-stub */
+
+#undef INTERNAL_SYSCALL_DECL
+#define INTERNAL_SYSCALL_DECL(err) long int err
 
 #undef INTERNAL_SYSCALL_ERROR_P
 #define INTERNAL_SYSCALL_ERROR_P(val, err)	(err == -1)
Index: sysdeps/unix/sysv/linux/ia64/vfork.S
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/unix/sysv/linux/ia64/vfork.S,v
retrieving revision 1.4
diff -u -r1.4 vfork.S
--- sysdeps/unix/sysv/linux/ia64/vfork.S	31 Dec 2002 20:37:30 -0000	1.4
+++ sysdeps/unix/sysv/linux/ia64/vfork.S	30 Oct 2003 07:55:08 -0000
@@ -34,9 +34,13 @@
 	mov out0=CLONE_VM+CLONE_VFORK+SIGCHLD
 	mov out1=0		/* Standard sp value.			*/
 	;;
+#if 0
 	DO_CALL (SYS_ify (clone))
+#else
+	mov r15=SYS_ify(clone)
+	break __BREAK_SYSCALL
+#endif
 	cmp.eq p6,p0=-1,r10
-	;;
 (p6)	br.cond.spnt.few __syscall_error
 	ret
 PSEUDO_END(__vfork)


* Re: new syscall stub support for ia64 libc
  2003-10-30  8:04   ` David Mosberger
@ 2003-10-30  9:09     ` Jakub Jelinek
  2003-10-30 19:38       ` Roland McGrath
  2003-10-30 19:59       ` David Mosberger
  2003-10-31  8:45     ` Richard Henderson
  1 sibling, 2 replies; 98+ messages in thread
From: Jakub Jelinek @ 2003-10-30  9:09 UTC (permalink / raw)
  To: davidm; +Cc: libc-hacker

On Thu, Oct 30, 2003 at 12:04:22AM -0800, David Mosberger wrote:
> Hi Jakub,
> 
> >>>>> On Wed, 29 Oct 2003 08:44:36 +0100, Jakub Jelinek <jakub@redhat.com> said:
> 
>   Jakub> Yes, please post a forward-ported complete tested patch (the last
>   Jakub> version of the patch Ulrich and I saw was incomplete (but led to
>   Jakub> discovery of a linker bug)).
>   Jakub> NPTL is now in sources CVS, so things are way easier for diffing...
> 
> OK, here is a preliminary patch.  Some comments:
> 
>  - The "assert (ph->p_vaddr == GL(dl_sysinfo_dso))" check in elf/rtld.c
>    is too strict.  On ia64, we have two LOAD segments, so the check
>    can't possibly succeed:
> 
>     $ readelf -l arch/ia64/kernel/gate.so |grep LOAD
>     LOAD           0x0000000000000000 0xa000000000010000 0xa000000000010000
>     LOAD           0x0000000000000000 0xa000000000020000 0xa000000000020000

Can you run readelf -Wl arch/ia64/kernel/gate.so |grep LOAD
(or grep -A1 LOAD instead), or better yet readelf -Wa arch/ia64/kernel/gate.so?
I'd like to understand why you need the second LOAD segment, what it
contains, etc.

>    The patch below simply #if's out the code, but perhaps a better
>    fix would be to check ph->p_vaddr only for the first LOAD segment?

Yeah, that's doable: add a counter variable under #ifndef NDEBUG, increment
it for each PT_LOAD, and use it in the assert.
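
A minimal sketch of that counter-based check, written as a standalone
helper rather than as the actual elf/rtld.c change.  The function name
check_first_load and its parameters are hypothetical; only PT_LOAD,
p_vaddr, and the idea of asserting on the first segment alone come from
the discussion above.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not glibc code: walk the program headers of the
   kernel-provided DSO and verify that the first PT_LOAD segment (and
   only the first) starts at the address the kernel advertised.
   Returns the number of PT_LOAD segments seen.  */
static size_t
check_first_load (const Elf64_Phdr *phdr, size_t phnum, uint64_t sysinfo_dso)
{
  size_t nloads = 0;

  for (size_t i = 0; i < phnum; ++i)
    if (phdr[i].p_type == PT_LOAD)
      {
        /* Later PT_LOAD segments (such as a second mapping of the
           same page) may legitimately live at another address.  */
        if (nloads == 0)
          assert (phdr[i].p_vaddr == sysinfo_dso);
        ++nloads;
      }

  return nloads;
}
```

In the real loader the counter would presumably only exist when
assertions are compiled in; the sketch returns it so that it stays used
even with NDEBUG defined.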

>  - The changes to linuxthreads/{manager,pthread}.c are almost certainly
>    wrong, but I'm not sure I understand how you want things set up to
>    ensure that single-threaded apps use the new stub but linuxthread
>    apps use the old one.

IMHO NEED_DL_SYSINFO should be defined in both NPTL and
Linuxthread ia64/dl-sysdep.h, while USE_DL_SYSINFO only in NPTL.
And sysdep.h should use sysinfo only if USE_DL_SYSINFO is defined.
Then linuxthreads will work just fine (use break always, who cares)
and NPTL will use VDSO.
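
A rough sketch of how the two macros could divide the work.  It only
models the preprocessor gating; the enum and the two helper functions
are made up for illustration and are not part of glibc.

```c
/* Hypothetical per-build configuration, standing in for what each
   build's ia64/dl-sysdep.h would define.  An NPTL build defines both
   macros; a LinuxThreads build would define only NEED_DL_SYSINFO.  */
#define NEED_DL_SYSINFO 1
#define USE_DL_SYSINFO 1        /* Omit this line for LinuxThreads.  */

enum syscall_path { SYSCALL_VIA_BREAK, SYSCALL_VIA_SYSINFO };

/* Which entry path would the syscall macros in sysdep.h pick?  */
static enum syscall_path
selected_syscall_path (void)
{
#ifdef USE_DL_SYSINFO
  return SYSCALL_VIA_SYSINFO;   /* Indirect call through sysinfo.  */
#else
  return SYSCALL_VIA_BREAK;     /* Traditional "break" instruction.  */
#endif
}

/* Does this build carry a sysinfo slot at all?  The slot is needed
   even when unused, so that the loader can hand it to a libc that
   does use it.  */
static int
has_sysinfo_slot (void)
{
#ifdef NEED_DL_SYSINFO
  return 1;
#else
  return 0;
#endif
}
```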

>  - The libc-start.c change is also "wrong" but since DL_SYSDEP_OSCHECK()
>    may do syscalls, it is necessary to do __pthread_initialize_minimal()
>    first, as otherwise the minimal thread descriptor isn't setup.

This is handled on IA-32 by providing DL_SYSINFO_DEFAULT (and defining
USE_DL_SYSINFO).  This means the few syscalls in DL_SYSDEP_OSCHECK will
use the break insn and the rest will use VDSO if available.
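
The idea can be sketched in portable C as a function pointer with a
safe default.  Everything below is a stand-in: slow_entry and
fast_entry only tag their results, SYSINFO_DEFAULT plays the role of
DL_SYSINFO_DEFAULT, and sysinfo_setup is a hypothetical startup hook.

```c
#include <stddef.h>

typedef long (*syscall_entry_t) (long nr);

/* Stand-ins for the two entry paths.  On real hardware these would be
   the "break"-based stub and the kernel-provided entry point; here
   they only tag their result so the dispatch is observable.  */
static long slow_entry (long nr) { return nr * 10 + 1; }
static long fast_entry (long nr) { return nr * 10 + 2; }

/* Plays the role of DL_SYSINFO_DEFAULT: a value that is always safe
   to call through, even before any startup code has run.  */
#define SYSINFO_DEFAULT slow_entry

static syscall_entry_t sysinfo = SYSINFO_DEFAULT;

/* Hypothetical startup hook: switch to the fast entry only if the
   kernel advertised one.  */
static void
sysinfo_setup (syscall_entry_t from_kernel)
{
  if (from_kernel != NULL)
    sysinfo = from_kernel;
}

/* Every syscall goes through the pointer, whatever it holds.  */
static long
do_syscall (long nr)
{
  return sysinfo (nr);
}
```

Calls made before sysinfo_setup runs take the slow path, and later
calls take the fast one if it exists, which is the behavior described
above for DL_SYSDEP_OSCHECK.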

	Jakub


* Re: new syscall stub support for ia64 libc
  2003-10-30  9:09     ` Jakub Jelinek
@ 2003-10-30 19:38       ` Roland McGrath
  2003-10-30 19:59       ` David Mosberger
  1 sibling, 0 replies; 98+ messages in thread
From: Roland McGrath @ 2003-10-30 19:38 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: davidm, libc-hacker

> Can you run readelf -Wl arch/ia64/kernel/gate.so |grep LOAD (or grep -A1
> LOAD instead), or better yet readelf -Wa arch/ia64/kernel/gate.so?  I'd
> like to understand why you need the second LOAD segment, what it
> contains, etc.

There are two LOAD segments for the same one page of file data mapped into
two consecutive pages with different permissions.  The first segment is
read-only and the second is execute-only.  The execute-only permission
(cannot be read by user mode) is required for the magic EPC instruction.
Hence we need two mappings to see the DSO info and to execute.

> IMHO NEED_DL_SYSINFO should be defined in both NPTL and
> Linuxthread ia64/dl-sysdep.h, while USE_DL_SYSINFO only in NPTL.
> And sysdep.h should use sysinfo only if USE_DL_SYSINFO is defined.
> Then linuxthreads will work just fine (use break always, who cares)
> and NPTL will use VDSO.

Agreed.  This is what i386 does.


* Re: new syscall stub support for ia64 libc
  2003-10-30  9:09     ` Jakub Jelinek
  2003-10-30 19:38       ` Roland McGrath
@ 2003-10-30 19:59       ` David Mosberger
  2003-10-30 20:23         ` Jakub Jelinek
  1 sibling, 1 reply; 98+ messages in thread
From: David Mosberger @ 2003-10-30 19:59 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: davidm, libc-hacker

>>>>> On Thu, 30 Oct 2003 08:03:02 +0100, Jakub Jelinek <jakub@redhat.com> said:

  >> - The "assert (ph->p_vaddr == GL(dl_sysinfo_dso))" check in
  >> elf/rtld.c is too strict.  On ia64, we have two LOAD segments, so
  >> the check can't possibly succeed:

  >> $ readelf -l arch/ia64/kernel/gate.so |grep LOAD
  >> LOAD           0x0000000000000000 0xa000000000010000 0xa000000000010000
  >> LOAD           0x0000000000000000 0xa000000000020000 0xa000000000020000

  Jakub> Can you run readelf -Wl arch/ia64/kernel/gate.so |grep LOAD
  Jakub> (or grep -A1 LOAD instead), or better yet readelf -Wa
  Jakub> arch/ia64/kernel/gate.so?  I'd like to understand why you
  Jakub> need the second LOAD segment, what it contains, etc.

How about I try to explain?  The reason there are two segments is that
the privilege-promote page used to enter the kernel is executable only
at the user-level.  To export the DSO headers etc., we thus create a
second, read-only mapping.  Here is the expanded readelf output:

$ readelf -Wl arch/ia64/kernel/gate.so |grep LOAD
LOAD           0x000000 0xa000000000010000 0xa000000000010000 0x000628 0x000628 R   0x10000
LOAD           0x000000 0xa000000000020000 0xa000000000020000 0x0009e0 0x0009e0   E 0x10000
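
For readers who want to check such flags programmatically, here is a
small sketch that decodes the p_flags field the way readelf prints it
("R  " for the first segment, "  E" for the second).  The helper names
are invented for this example; PF_R, PF_W and PF_X are the standard
constants from <elf.h>.

```c
#include <elf.h>
#include <stdint.h>

/* Render ELF segment permission bits in readelf's three-column style,
   e.g. "R  " or "  E".  BUF must have room for four bytes.  */
static const char *
phdr_flags_string (uint32_t p_flags, char buf[4])
{
  buf[0] = (p_flags & PF_R) ? 'R' : ' ';
  buf[1] = (p_flags & PF_W) ? 'W' : ' ';
  buf[2] = (p_flags & PF_X) ? 'E' : ' ';
  buf[3] = '\0';
  return buf;
}

/* True for a segment that may be executed but neither read nor
   written, like the second LOAD segment shown above.  */
static int
is_execute_only (uint32_t p_flags)
{
  return (p_flags & (PF_R | PF_W | PF_X)) == PF_X;
}
```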

  >> - The changes to linuxthreads/{manager,pthread}.c are almost
  >> certainly wrong, but I'm not sure I understand how you want
  >> things set up to ensure that single-threaded apps use the new
  >> stub but linuxthread apps use the old one.

  Jakub> IMHO NEED_DL_SYSINFO should be defined in both NPTL and
  Jakub> Linuxthread ia64/dl-sysdep.h, while USE_DL_SYSINFO only in
  Jakub> NPTL. And sysdep.h should use sysinfo only if USE_DL_SYSINFO is
  Jakub> defined.  Then linuxthreads will work just fine (use break
  Jakub> always, who cares) and NPTL will use VDSO.

But wouldn't this imply that non-threaded apps won't use the new
stubs?  I'm probably missing something here.

  >> - The libc-start.c change is also "wrong" but since
  >> DL_SYSDEP_OSCHECK() may do syscalls, it is necessary to do
  >> __pthread_initialize_minimal() first, as otherwise the minimal
  >> thread descriptor isn't setup.

  Jakub> This is handled on IA-32 by providing DL_SYSINFO_DEFAULT (and
  Jakub> defining USE_DL_SYSINFO).  This means the few syscalls in
  Jakub> DL_SYSDEP_OSCHECK will use the break insn and the rest will
  Jakub> use VDSO if available.

OK, that behavior should be fine (especially if the DL_SYSDEP_OSCHECK
can go away completely with the right libc config option).

	--david


* Re: new syscall stub support for ia64 libc
  2003-10-30 19:59       ` David Mosberger
@ 2003-10-30 20:23         ` Jakub Jelinek
  2003-10-30 22:35           ` David Mosberger
  0 siblings, 1 reply; 98+ messages in thread
From: Jakub Jelinek @ 2003-10-30 20:23 UTC (permalink / raw)
  To: davidm; +Cc: libc-hacker

On Thu, Oct 30, 2003 at 11:59:28AM -0800, David Mosberger wrote:
>   Jakub> IMHO NEED_DL_SYSINFO should be defined in both NPTL and
>   Jakub> Linuxthread ia64/dl-sysdep.h, while USE_DL_SYSINFO only in
>   Jakub> NPTL. And sysdep.h should use sysinfo only if USE_DL_SYSINFO is
>   Jakub> defined.  Then linuxthreads will work just fine (use break
>   Jakub> always, who cares) and NPTL will use VDSO.
> 
> But wouldn't this imply that non-threaded apps won't use the new
> stubs?  I'm probably missing something here.

No. In recent glibcs, you always have to use a matching
libc.so+libpthread.so+librt.so, so you use either the
libc.so+libpthread.so+librt.so from an NPTL build or the set from a
Linuxthreads build. In a Linuxthreads build, neither non-threaded nor
threaded apps will use the new stubs; only the dynamic linker will have
support for them (but won't use them itself). That is so that ld.so from
a Linuxthreads build can load NPTL libc.so+libpthread.so+librt.so.
In an NPTL build, both non-threaded and threaded apps will use the new
stubs after TLS is set up.

>   Jakub> This is handled on IA-32 by providing DL_SYSINFO_DEFAULT (and
>   Jakub> defining USE_DL_SYSINFO).  This means the few syscalls in
>   Jakub> DL_SYSDEP_OSCHECK will use the break insn and the rest will
>   Jakub> use VDSO if available.
> 
> OK, that behavior should be fine (especially if the DL_SYSDEP_OSCHECK
> can go away completely with the right libc config option).

DL_SYSDEP_OSCHECK goes away only if you configure glibc without
--enable-kernel=x.y.z, which is something you almost never want to do
in an NPTL build.

	Jakub


* Re: new syscall stub support for ia64 libc
  2003-10-30 20:23         ` Jakub Jelinek
@ 2003-10-30 22:35           ` David Mosberger
  0 siblings, 0 replies; 98+ messages in thread
From: David Mosberger @ 2003-10-30 22:35 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: davidm, libc-hacker

>>>>> On Thu, 30 Oct 2003 19:17:25 +0100, Jakub Jelinek <jakub@redhat.com> said:

  >> But wouldn't this imply that non-threaded apps won't use the new
  >> stubs?  I'm probably missing something here.

  Jakub> No. In recent glibc's, you always have to use matching
  Jakub> libc.so+libpthread.so+librt.so, so you either use
  Jakub> libc.so+libpthread.so+librt.so from NPTL build, or from Linuxthreads
  Jakub> build.

Oh, thanks for explaining that.  Yes, then things should be fine.

Let me see if I can come up with an updated patch.

	--david


* Re: new syscall stub support for ia64 libc
  2003-10-30  8:04   ` David Mosberger
  2003-10-30  9:09     ` Jakub Jelinek
@ 2003-10-31  8:45     ` Richard Henderson
  2003-10-31  9:07       ` Jakub Jelinek
  2003-10-31 16:43       ` new syscall stub support for ia64 libc David Mosberger
  1 sibling, 2 replies; 98+ messages in thread
From: Richard Henderson @ 2003-10-31  8:45 UTC (permalink / raw)
  To: davidm; +Cc: Jakub Jelinek, libc-hacker

On Thu, Oct 30, 2003 at 12:04:22AM -0800, David Mosberger wrote:
> +    __asm __volatile ("adds r2 = -8, r13;;\n"
> +		      "ld8 r2 = [r2];;\n"
> +		      "mov b7=r2;\n"
> +		      "br.call.sptk.many b6=b7;;\n"
> +                      : "=r" (_r8), "=r" (_r10),
> +			   "=r" (_r15) ASM_OUTARGS_##nr
> +                      : "2" (_r15) ASM_ARGS_##nr
> +		      : "memory", "ar.pfs" ASM_CLOBBERS_##nr);

Any particular reason why you're managing the memory load
from assembly?  Seems to me you could do

	__asm __volatile ("br.call.sptk.many b6=%0"
			  : ...
			  : "b" (__thread_self->whatever)
			  : ...);

Anyway, the magic -8 there certainly looks dangerous.


r~

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: new syscall stub support for ia64 libc
  2003-10-31  8:45     ` Richard Henderson
@ 2003-10-31  9:07       ` Jakub Jelinek
  2003-10-31 16:45         ` David Mosberger
  2003-10-31 16:43       ` new syscall stub support for ia64 libc David Mosberger
  1 sibling, 1 reply; 98+ messages in thread
From: Jakub Jelinek @ 2003-10-31  9:07 UTC (permalink / raw)
  To: davidm, libc-hacker

On Fri, Oct 31, 2003 at 12:42:49AM -0800, Richard Henderson wrote:
> On Thu, Oct 30, 2003 at 12:04:22AM -0800, David Mosberger wrote:
> > +    __asm __volatile ("adds r2 = -8, r13;;\n"
> > +		      "ld8 r2 = [r2];;\n"
> > +		      "mov b7=r2;\n"
> > +		      "br.call.sptk.many b6=b7;;\n"
> > +                      : "=r" (_r8), "=r" (_r10),
> > +			   "=r" (_r15) ASM_OUTARGS_##nr
> > +                      : "2" (_r15) ASM_ARGS_##nr
> > +		      : "memory", "ar.pfs" ASM_CLOBBERS_##nr);
> 
> Any particular reason why you're managing the memory load
> from assembly?  Seems to me you could do
> 
> 	__asm __volatile ("br.call.sptk.many b6=%0"
> 			  : ...
> 			  : "b" (__thread_self->whatever)

Cannot tcbhead_t's private field be reused for the sysinfo pointer
actually on IA-64? That way 32 bytes wouldn't have to be wasted
at end of struct pthread, it would be at the same location in
linuxthreads as well as NPTL build (so that linuxthreads ld.so
can load NPTL libc/libpthread).
It would need some small changes in generic code, particularly
macroizing sysinfo access.
On IA-32 this could be:
#define THREAD_SELF_SYSINFO THREAD_GETMEM (THREAD_SELF, header.sysinfo)
#define THREAD_SYSINFO(pd) ((pd)->header.sysinfo)
and on IA-64:
#define THREAD_SELF_SYSINFO (((tcbhead_t *) __thread_self)->private)
#define THREAD_SYSINFO(pd) (((tcbhead_t *) ((pd) + 1))->private)

	Jakub

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: new syscall stub support for ia64 libc
  2003-10-31  8:45     ` Richard Henderson
  2003-10-31  9:07       ` Jakub Jelinek
@ 2003-10-31 16:43       ` David Mosberger
  1 sibling, 0 replies; 98+ messages in thread
From: David Mosberger @ 2003-10-31 16:43 UTC (permalink / raw)
  To: Richard Henderson; +Cc: davidm, Jakub Jelinek, libc-hacker

>>>>> On Fri, 31 Oct 2003 00:42:49 -0800, Richard Henderson <rth@twiddle.net> said:

  Richard> On Thu, Oct 30, 2003 at 12:04:22AM -0800, David Mosberger wrote:
  >> +    __asm __volatile ("adds r2 = -8, r13;;\n"
  >> +		      "ld8 r2 = [r2];;\n"
  >> +		      "mov b7=r2;\n"
  >> +		      "br.call.sptk.many b6=b7;;\n"
  >> +                      : "=r" (_r8), "=r" (_r10),
  >> +			   "=r" (_r15) ASM_OUTARGS_##nr
  >> +                      : "2" (_r15) ASM_ARGS_##nr
  >> +		      : "memory", "ar.pfs" ASM_CLOBBERS_##nr);

  Richard> Any particular reason why you're managing the memory load
  Richard> from assembly?  Seems to me you could do

  Richard> __asm __volatile ("br.call.sptk.many b6=%0"
  Richard> : ...
  Richard> : "b" (__thread_self->whatever)
  Richard> : ...);

That's probably worth trying.

  Richard> Anyway, the magic -8 there certainly looks dangerous.

Not really.  The offset must be architected (effectively, it's part of
the ia64 linux abi), because otherwise you can't use the new syscall
stubs outside of the C library.  But I agree it would be cleaner to
use the offset macro in this particular case (for consistency); I just
didn't want to sort out the include file dependency mess that made
this difficult at the time.

Regardless, we should probably add an assertion somewhere that
SYSINFO_OFFSET == -8.
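
One possible shape for such an assertion, as a sketch only: struct
mock_pthread and MOCK_SYSINFO_OFFSET are hypothetical stand-ins for the
real descriptor and for SYSINFO_OFFSET, _Static_assert assumes a C11
compiler, and the layout assumes an LP64 target such as ia64.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the real thread descriptor; only the tail
   of the structure matters for this check.  */
struct mock_pthread
{
  void *other_members[14];
  int multiple_threads;
  uintptr_t sysinfo;		/* must be the last word */
};

/* Offset of sysinfo relative to a thread pointer that points just past
   the end of the descriptor.  */
#define MOCK_SYSINFO_OFFSET \
  ((long) offsetof (struct mock_pthread, sysinfo) \
   - (long) sizeof (struct mock_pthread))

/* Compilation fails if the layout ever drifts away from -8.  */
_Static_assert (MOCK_SYSINFO_OFFSET == -8,
		"sysinfo must sit 8 bytes below the thread pointer");

/* Run-time view of the same value, handy for a test program.  */
long
mock_sysinfo_offset (void)
{
  return MOCK_SYSINFO_OFFSET;
}
```

With a check like this next to the structure definition, the literal -8
in the inline asm and the generated offset cannot silently disagree.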

	--david

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: new syscall stub support for ia64 libc
  2003-10-31  9:07       ` Jakub Jelinek
@ 2003-10-31 16:45         ` David Mosberger
  2003-10-31 16:54           ` Jakub Jelinek
  0 siblings, 1 reply; 98+ messages in thread
From: David Mosberger @ 2003-10-31 16:45 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: davidm, libc-hacker

>>>>> On Fri, 31 Oct 2003 08:01:14 +0100, Jakub Jelinek <jakub@redhat.com> said:

  Jakub> Cannot tcbhead_t's private field be reused for the sysinfo pointer
  Jakub> actually on IA-64?

As long as we can make it appear at offset -8, that would be OK.

  Jakub> That way 32 bytes wouldn't have to be wasted at end of struct
  Jakub> pthread, it would be at the same location in linuxthreads as
  Jakub> well as NPTL build

That's not sufficient.  The offset must be the same across _all_ libcs,
so that non-libc code can use the new syscall stubs as well.

	--david

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: new syscall stub support for ia64 libc
  2003-10-31 16:45         ` David Mosberger
@ 2003-10-31 16:54           ` Jakub Jelinek
  2003-10-31 18:29             ` David Mosberger
                               ` (2 more replies)
  0 siblings, 3 replies; 98+ messages in thread
From: Jakub Jelinek @ 2003-10-31 16:54 UTC (permalink / raw)
  To: davidm; +Cc: libc-hacker

On Fri, Oct 31, 2003 at 08:45:22AM -0800, David Mosberger wrote:
> >>>>> On Fri, 31 Oct 2003 08:01:14 +0100, Jakub Jelinek <jakub@redhat.com> said:
> 
>   Jakub> Cannot tcbhead_t's private field be reused for the sysinfo pointer
>   Jakub> actually on IA-64?
> 
> As long as we can make it appear at offset -8, that would be OK.

It would be offset 8, not -8 actually.

>   Jakub> That way 32 bytes wouldn't have to be wasted at end of struct
>   Jakub> pthread, it would be at the same location in linuxthreads as
>   Jakub> well as NPTL build
> 
> That's not sufficient.  The offset must be the same across _all_ libcs,
> so that non-libc code can use the new syscall stubs as well.

I don't think it is a good idea to give access to glibc internals (which
these are) to outside code.  There is syscall(3) function for a reason.
Also, dereferencing r13[-8] (or r13[8]) in various syscall stub macros
is not going to work if you then run the program against an older glibc,
and there won't be any symbol versioning to catch it.
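
For illustration only, a minimal sketch of what going through syscall(3)
looks like from outside libc.  The function name getpid_via_syscall is
made up for this example; syscall() and SYS_getpid are the usual
glibc/Linux interfaces.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

/* Ask the kernel for the process ID through the generic syscall(3)
   wrapper instead of a hand-written stub.  The caller never needs to
   know how libc actually enters the kernel.  */
long
getpid_via_syscall (void)
{
  return syscall (SYS_getpid);
}
```

Code written this way does not depend on the TCB layout at all, so it
behaves the same whichever entry mechanism the installed libc uses.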

	Jakub

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: new syscall stub support for ia64 libc
  2003-10-31 16:54           ` Jakub Jelinek
@ 2003-10-31 18:29             ` David Mosberger
  2003-11-03 21:46             ` David Mosberger
  2003-11-12 22:53             ` David Mosberger
  2 siblings, 0 replies; 98+ messages in thread
From: David Mosberger @ 2003-10-31 18:29 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: davidm, libc-hacker

>>>>> On Fri, 31 Oct 2003 15:47:24 +0100, Jakub Jelinek <jakub@redhat.com> said:

  Jakub> On Fri, Oct 31, 2003 at 08:45:22AM -0800, David Mosberger wrote:
  >> >>>>> On Fri, 31 Oct 2003 08:01:14 +0100, Jakub Jelinek <jakub@redhat.com> said:

  Jakub> Cannot tcbhead_t's private field be reused for the sysinfo pointer
  Jakub> actually on IA-64?

  >> As long as we can make it appear at offset -8, that would be OK.

  Jakub> It would be offset 8, not -8 actually.

Ah, you mean _that_ one.  It's defined by the psABI and since the new
syscall stub was defined much later, I considered it off limits.  I'd
hate to change the offset now, many months after an open discussion on
this topic.  On the other hand, if we do want to change it, now is
probably the last chance.

  Jakub> That way 32 bytes wouldn't have to be wasted at end of struct
  Jakub> pthread, it would be at the same location in linuxthreads as
  Jakub> well as NPTL build

I don't think you have to waste 32 bytes.  You could do something like:

  struct pthread {
	  :
	long reserved;
  };

and then access sys_info via ((long *) tcb_pointer)[-1].  That way, no
matter what the alignment, you'll always access the right word and will
only use up an extra 8 bytes.
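
A small sketch of that access pattern, with a hypothetical struct
mock_descr standing in for struct pthread (the names mock_descr,
read_last_word and roundtrip are invented here).  It assumes the
structure carries no padding after its last member.

```c
#include <stddef.h>

/* Hypothetical descriptor: only the trailing word matters.  */
struct mock_descr
{
  char other_members[40];
  long reserved;		/* last member, would hold sysinfo */
};

/* Given a pointer just past the end of the descriptor (where the thread
   pointer would point in this layout), read the word right below it.  */
long
read_last_word (const void *tcb_pointer)
{
  return ((const long *) tcb_pointer)[-1];
}

/* Store a value in the last member, then fetch it back through the
   end-of-descriptor pointer.  */
long
roundtrip (long value)
{
  struct mock_descr d;
  d.reserved = value;
  return read_last_word (&d + 1);
}
```

The reader never needs offsetof() or the size of the descriptor; it only
relies on the word sitting immediately below the pointer it is handed.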

  >> That's not sufficient.  The offset must be the same across _all_ libcs,
  >> so that non-libc code can use the new syscall stubs as well.

  Jakub> I don't think it is a good idea to give access to glibc
  Jakub> internals (which these are) to outside code.  There is
  Jakub> syscall(3) function for a reason.

It's not glibc internals, it's ia64 linux abi.  I certainly do not
encourage the use of handcrafted syscall stubs, but sometimes, they're
needed.

  Jakub> Also, dereferencing r13[-8] (or r13[8]) in various syscall stub
  Jakub> macros is not going to work if you run the program then
  Jakub> against older glibc and there won't be any symbol versioning
  Jakub> which would catch it up.

If someone chooses to use the new syscall stubs, they'll have to live
with that.

	--david

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: new syscall stub support for ia64 libc
  2003-10-31 16:54           ` Jakub Jelinek
  2003-10-31 18:29             ` David Mosberger
@ 2003-11-03 21:46             ` David Mosberger
  2003-11-12 22:53             ` David Mosberger
  2 siblings, 0 replies; 98+ messages in thread
From: David Mosberger @ 2003-11-03 21:46 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: davidm, libc-hacker

Here is an updated patch.  I'm going to be on travel for a good part
of this week, so I thought I'd send out the latest patch in case
someone wants to try it out or has some feedback.  I haven't
incorporated Rich's suggestion yet and the current "struct pthread"
version still is bigger than strictly needed, but with the patch "make
check" for NPTL-enabled libc succeeds for the most part (modulo
tst-tls1, tst-local{1,2}, and tst-cancel{6,9,17}).  Also, the number
of "break 0x100000" calls inside libc is down to about 4 (primarily
due to clone2(), syscall(), and vfork()).

Note that there was a relatively large change to lowlevellock.h.
Apart from include-file headaches, I didn't see why this code
couldn't/shouldn't use Jakub's syscall inline macros so I changed them
accordingly (perhaps the code was originally written before Jakub's
inline macros existed?).  Obviously, the include-file issue still needs
to be worked out (for now, I just copied the relevant definitions from
sysdep.h).

Oh, one more thing: I think my change to nptl/descr.h may have broken
all other platforms which define TLS_DTV_AT_TP.  I thought there was
an odd asymmetry between platforms which do or do not define
TLS_DTV_AT_TP and I thought it would be cleaner if each platform were
to define a tcbhead_t (plus it allows moving the multiple_threads
member close to the sysinfo member, which means cancellable syscall
stubs will touch only one extra cache line), but for now, I only
changed the ia64 header, so the other platforms would break (alpha,
powerpc, and sh appear to be affected by this).

	--david

Index: elf/rtld.c
===================================================================
RCS file: /cvs/glibc/libc/elf/rtld.c,v
retrieving revision 1.299
diff -u -r1.299 rtld.c
--- elf/rtld.c	27 Oct 2003 20:08:32 -0000	1.299
+++ elf/rtld.c	3 Nov 2003 21:26:41 -0000
@@ -1169,8 +1169,10 @@
 		  l->l_ldnum = ph->p_memsz / sizeof (ElfW(Dyn));
 		  break;
 		}
+#if 0
 	      if (ph->p_type == PT_LOAD)
 		assert ((void *) ph->p_vaddr == GL(dl_sysinfo_dso));
+#endif
 	    }
 	  elf_get_dynamic_info (l, dyn_temp);
 	  _dl_setup_hash (l);
Index: linuxthreads/descr.h
===================================================================
RCS file: /cvs/glibc/libc/linuxthreads/descr.h,v
retrieving revision 1.14
diff -u -r1.14 descr.h
--- linuxthreads/descr.h	17 Sep 2003 09:39:00 -0000	1.14
+++ linuxthreads/descr.h	3 Nov 2003 21:26:42 -0000
@@ -189,7 +189,23 @@
 #endif
   size_t p_alloca_cutoff;	/* Maximum size which should be allocated
 				   using alloca() instead of malloc().  */
+#if TLS_TCB_AT_TP
   /* New elements must be added at the end.  */
+#else
+  union {
+    struct {
+      void *reserved[11];	/* reserve for future use */
+      void *tcb;		/* XXX do we really need this? */
+      union dtv *dtvp;		/* XXX do we really need this? */
+      pthread_descr self;	/* XXX do we really need this? */
+      int multiple_threads;
+#ifdef NEED_DL_SYSINFO
+      uintptr_t sysinfo;
+#endif
+    } data;
+    void *__padding[16];
+  } p_header __attribute__ ((aligned(32)));
+#endif
 } __attribute__ ((aligned(32))); /* We need to align the structure so that
 				    doubles are aligned properly.  This is 8
 				    bytes on MIPS and 16 bytes on MIPS64.
Index: linuxthreads/sysdeps/ia64/tls.h
===================================================================
RCS file: /cvs/glibc/libc/linuxthreads/sysdeps/ia64/tls.h,v
retrieving revision 1.6
diff -u -r1.6 tls.h
--- linuxthreads/sysdeps/ia64/tls.h	31 Jul 2003 19:16:34 -0000	1.6
+++ linuxthreads/sysdeps/ia64/tls.h	3 Nov 2003 21:26:43 -0000
@@ -20,10 +20,13 @@
 #ifndef _TLS_H
 #define _TLS_H
 
+#include <dl-sysdep.h>
+
 #ifndef __ASSEMBLER__
 
 # include <pt-machine.h>
 # include <stddef.h>
+# include <stdint.h>
 
 /* Type for the dtv.  */
 typedef union dtv
@@ -83,8 +86,10 @@
 /* Code to initially initialize the thread pointer.  This might need
    special attention since 'errno' is not yet available and if the
    operation can cause a failure 'errno' must not be touched.  */
-#  define TLS_INIT_TP(tcbp, secondcall) \
-  (__thread_self = (__typeof (__thread_self)) (tcbp), NULL)
+#  define TLS_INIT_TP(tcbp, secondcall)			\
+  (__thread_self = (__typeof (__thread_self)) (tcbp),	\
+   THREAD_SELF->p_header.data.sysinfo = GL(dl_sysinfo),	\
+   NULL)
 
 /* Return the address of the dtv for the current thread.  */
 #  define THREAD_DTV() \
Index: linuxthreads/sysdeps/unix/sysv/linux/ia64/vfork.S
===================================================================
RCS file: /cvs/glibc/libc/linuxthreads/sysdeps/unix/sysv/linux/ia64/vfork.S,v
retrieving revision 1.4
diff -u -r1.4 vfork.S
--- linuxthreads/sysdeps/unix/sysv/linux/ia64/vfork.S	11 Feb 2003 06:27:53 -0000	1.4
+++ linuxthreads/sysdeps/unix/sysv/linux/ia64/vfork.S	3 Nov 2003 21:26:43 -0000
@@ -43,9 +43,8 @@
 	mov out0=CLONE_VM+CLONE_VFORK+SIGCHLD
 	mov out1=0		/* Standard sp value.			*/
 	;;
-	DO_CALL (SYS_ify (clone))
+	DO_CALL_VIA_BREAK (SYS_ify (clone))
 	cmp.eq p6,p0=-1,r10
-	;;
 (p6)	br.cond.spnt.few __syscall_error
 	ret
 PSEUDO_END(__vfork)
Index: nptl/descr.h
===================================================================
RCS file: /cvs/glibc/libc/nptl/descr.h,v
retrieving revision 1.20
diff -u -r1.20 descr.h
--- nptl/descr.h	22 Jul 2003 23:04:00 -0000	1.20
+++ nptl/descr.h	3 Nov 2003 21:26:44 -0000
@@ -95,17 +95,11 @@
 /* Thread descriptor data structure.  */
 struct pthread
 {
+#if !TLS_DTV_AT_TP
   union
   {
-#if !TLS_DTV_AT_TP
     /* This overlaps the TCB as used for TLS without threads (see tls.h).  */
     tcbhead_t header;
-#else
-    struct
-    {
-      int multiple_threads;
-    } header;
-#endif
 
     /* This extra padding has no special purpose, and this structure layout
        is private and subject to change without affecting the official ABI.
@@ -113,6 +107,7 @@
        implementation-specific instrumentation hack or suchlike.  */
     void *__padding[16];
   };
+#endif
 
   /* This descriptor's link on the `stack_used' or `__stack_user' list.  */
   list_t list;
@@ -239,6 +234,19 @@
 
   /* Resolver state.  */
   struct __res_state res;
+
+#if TLS_DTV_AT_TP
+  union
+  {
+    tcbhead_t header;
+
+    /* This extra padding has no special purpose, and this structure layout
+       is private and subject to change without affecting the official ABI.
+       We just have it here in case it might be convenient for some
+       implementation-specific instrumentation hack or suchlike.  */
+    void *__padding[16];
+  };
+#endif
 } __attribute ((aligned (TCB_ALIGNMENT)));
 
 
Index: nptl/sysdeps/ia64/tcb-offsets.sym
===================================================================
RCS file: /cvs/glibc/libc/nptl/sysdeps/ia64/tcb-offsets.sym,v
retrieving revision 1.4
diff -u -r1.4 tcb-offsets.sym
--- nptl/sysdeps/ia64/tcb-offsets.sym	25 Apr 2003 22:15:55 -0000	1.4
+++ nptl/sysdeps/ia64/tcb-offsets.sym	3 Nov 2003 21:26:44 -0000
@@ -2,3 +2,4 @@
 #include <tls.h>
 
 MULTIPLE_THREADS_OFFSET offsetof (struct pthread, header.multiple_threads) - sizeof (struct pthread)
+SYSINFO_OFFSET		offsetof (struct pthread, header.sysinfo) - sizeof (struct pthread)
Index: nptl/sysdeps/ia64/tls.h
===================================================================
RCS file: /cvs/glibc/libc/nptl/sysdeps/ia64/tls.h,v
retrieving revision 1.4
diff -u -r1.4 tls.h
--- nptl/sysdeps/ia64/tls.h	9 Sep 2003 07:00:21 -0000	1.4
+++ nptl/sysdeps/ia64/tls.h	3 Nov 2003 21:26:44 -0000
@@ -36,10 +36,15 @@
 } dtv_t;
 
 
+/* tcbhead_t must be exactly 16 64-bit words, such that sysinfo lines
+   up with the end of "struct pthread". */
 typedef struct
 {
+  void *reserved[12];
   dtv_t *dtv;
   void *private;
+  int multiple_threads;
+  uintptr_t sysinfo;	/* must be the last word in the tcb! */
 } tcbhead_t;
 
 # define TLS_MULTIPLE_THREADS_IN_TCB 1
@@ -100,11 +105,19 @@
 #  define GET_DTV(descr) \
   (((tcbhead_t *) (descr))->dtv)
 
+#if defined NEED_DL_SYSINFO
+# define INIT_SYSINFO   THREAD_SELF->header.sysinfo = GL(dl_sysinfo)
+#else
+# define INIT_SYSINFO   NULL
+#endif
+
 /* Code to initially initialize the thread pointer.  This might need
    special attention since 'errno' is not yet available and if the
    operation can cause a failure 'errno' must not be touched.  */
-# define TLS_INIT_TP(thrdescr, secondcall) \
-  (__thread_self = (thrdescr), NULL)
+#  define TLS_INIT_TP(thrdescr, secondcall)	\
+  (__thread_self = (thrdescr),			\
+   INIT_SYSINFO,				\
+   NULL)
 
 /* Return the address of the dtv for the current thread.  */
 #  define THREAD_DTV() \
Index: nptl/sysdeps/unix/sysv/linux/ia64/lowlevellock.h
===================================================================
RCS file: /cvs/glibc/libc/nptl/sysdeps/unix/sysv/linux/ia64/lowlevellock.h,v
retrieving revision 1.14
diff -u -r1.14 lowlevellock.h
--- nptl/sysdeps/unix/sysv/linux/ia64/lowlevellock.h	22 Sep 2003 21:27:26 -0000	1.14
+++ nptl/sysdeps/unix/sysv/linux/ia64/lowlevellock.h	3 Nov 2003 21:26:45 -0000
@@ -26,7 +26,7 @@
 #include <ia64intrin.h>
 #include <atomic.h>
 
-#define SYS_futex		1230
+#define __NR_futex		1230
 #define FUTEX_WAIT		0
 #define FUTEX_WAKE		1
 #define FUTEX_REQUEUE		3
@@ -34,6 +34,102 @@
 /* Initializer for compatibility lock.	*/
 #define LLL_MUTEX_LOCK_INITIALIZER (0)
 
+/* XXX share this with sysdep.h! */
+
+#define DO_INLINE_SYSCALL(name, nr, args...)				\
+    register long _r8 __asm ("r8");					\
+    register long _r10 __asm ("r10");					\
+    register long _r15 __asm ("r15") = __NR_##name;			\
+    long _retval;							\
+    LOAD_ARGS_##nr (args);						\
+    /*									\
+     * Don't specify any unwind info here.  We mark ar.pfs as		\
+     * clobbered.  This will force the compiler to save ar.pfs		\
+     * somewhere and emit appropriate unwind info for that save.	\
+     */									\
+    __asm __volatile ("adds r2 = -8, r13;;\n"				\
+		      "ld8 r2 = [r2];;\n"				\
+		      "mov b7=r2;\n"					\
+		      "br.call.sptk.many b6=b7;;\n"			\
+                      : "=r" (_r8), "=r" (_r10), "=r" (_r15)		\
+		        ASM_OUTARGS_##nr				\
+                      : "2" (_r15) ASM_ARGS_##nr			\
+		      : "memory", "ar.pfs" ASM_CLOBBERS_##nr);		\
+    _retval = _r8;
+
+#define LOAD_ARGS_0()   do { } while (0)
+#define LOAD_ARGS_1(out0)				\
+  register long _out0 asm ("out0") = (long) (out0);	\
+  LOAD_ARGS_0 ()
+#define LOAD_ARGS_2(out0, out1)				\
+  register long _out1 asm ("out1") = (long) (out1);	\
+  LOAD_ARGS_1 (out0)
+#define LOAD_ARGS_3(out0, out1, out2)			\
+  register long _out2 asm ("out2") = (long) (out2);	\
+  LOAD_ARGS_2 (out0, out1)
+#define LOAD_ARGS_4(out0, out1, out2, out3)		\
+  register long _out3 asm ("out3") = (long) (out3);	\
+  LOAD_ARGS_3 (out0, out1, out2)
+#define LOAD_ARGS_5(out0, out1, out2, out3, out4)	\
+  register long _out4 asm ("out4") = (long) (out4);	\
+  LOAD_ARGS_4 (out0, out1, out2, out3)
+
+#define ASM_OUTARGS_0
+#define ASM_OUTARGS_1	ASM_OUTARGS_0, "=r" (_out0)
+#define ASM_OUTARGS_2	ASM_OUTARGS_1, "=r" (_out1)
+#define ASM_OUTARGS_3	ASM_OUTARGS_2, "=r" (_out2)
+#define ASM_OUTARGS_4	ASM_OUTARGS_3, "=r" (_out3)
+#define ASM_OUTARGS_5	ASM_OUTARGS_4, "=r" (_out4)
+
+#define ASM_ARGS_0
+#define ASM_ARGS_1	ASM_ARGS_0, "3" (_out0)
+#define ASM_ARGS_2	ASM_ARGS_1, "4" (_out1)
+#define ASM_ARGS_3	ASM_ARGS_2, "5" (_out2)
+#define ASM_ARGS_4	ASM_ARGS_3, "6" (_out3)
+#define ASM_ARGS_5	ASM_ARGS_4, "7" (_out4)
+
+#define ASM_CLOBBERS_0	ASM_CLOBBERS_1, "out0"
+#define ASM_CLOBBERS_1	ASM_CLOBBERS_2, "out1"
+#define ASM_CLOBBERS_2	ASM_CLOBBERS_3, "out2"
+#define ASM_CLOBBERS_3	ASM_CLOBBERS_4, "out3"
+#define ASM_CLOBBERS_4	ASM_CLOBBERS_5, "out4"
+#define ASM_CLOBBERS_5	ASM_CLOBBERS_6, "out5"
+#define ASM_CLOBBERS_6	, "out6", "out7",				\
+  /* Non-stacked integer registers, minus r8, r10, r15.  */		\
+  "r2", "r3", "r9", "r11", "r12", "r13", "r14", "r16", "r17", "r18",	\
+  "r19", "r20", "r21", "r22", "r23", "r24", "r25", "r26", "r27",	\
+  "r28", "r29", "r30", "r31",						\
+  /* Predicate registers.  */						\
+  "p6", "p7", "p8", "p9", "p10", "p11", "p12", "p13", "p14", "p15",	\
+  /* Non-rotating fp registers.  */					\
+  "f6", "f7", "f8", "f9", "f10", "f11", "f12", "f13", "f14", "f15",	\
+  /* Branch registers.  */						\
+  "b6", "b7"
+
+/* XXX end sharable stuff */
+
+#define lll_futex_wait(futex, val) lll_futex_timed_wait (futex, val, 0)
+
+#define lll_futex_timed_wait(ftx, val, timespec)						\
+({												\
+   DO_INLINE_SYSCALL(futex, 4, (long) (ftx), FUTEX_WAIT, (int) (val), (long) (timespec));	\
+   _r10 == -1 ? -_retval : _retval;								\
+})
+
+#define lll_futex_wake(ftx, nr)									\
+({												\
+   DO_INLINE_SYSCALL(futex, 3, (long) (ftx), FUTEX_WAKE, (int) (nr));				\
+   _r10 == -1 ? -_retval : _retval;								\
+})
+
+#define lll_futex_requeue(ftx, nr_wake, nr_move, mutex)						\
+({												\
+   DO_INLINE_SYSCALL(futex, 5, (long) (ftx), FUTEX_REQUEUE, (int) (nr_wake),			\
+		     (int) (nr_move), (long) (mutex));						\
+   _r10 == -1 ? -_retval : _retval;								\
+})
+
+#if 0
 #define lll_futex_clobbers \
   "out5", "out6", "out7",						      \
   /* Non-stacked integer registers, minus r8, r10, r15.  */		      \
@@ -48,9 +144,8 @@
   "b6", "b7",								      \
   "memory"
 
-#define lll_futex_wait(futex, val) lll_futex_timed_wait (futex, val, 0)
 
-#define lll_futex_timed_wait(futex, val, timespec) \
+#define lll_futex_timed_wait(futex, val, timespec)			      \
   ({									      \
      register long int __o0 asm ("out0") = (long int) (futex);		      \
      register long int __o1 asm ("out1") = FUTEX_WAIT;			      \
@@ -69,7 +164,6 @@
      __r10 == -1 ? -__r8 : __r8;					      \
   })
 
-
 #define lll_futex_wake(futex, nr) \
   ({									      \
      register long int __o0 asm ("out0") = (long int) (futex);		      \
@@ -109,6 +203,7 @@
 		       : lll_futex_clobbers);				      \
      __r10 == -1 ? -__r8 : __r8;					      \
   })
+#endif
 
 
 static inline int
Index: nptl/sysdeps/unix/sysv/linux/ia64/sysdep-cancel.h
===================================================================
RCS file: /cvs/glibc/libc/nptl/sysdeps/unix/sysv/linux/ia64/sysdep-cancel.h,v
retrieving revision 1.7
diff -u -r1.7 sysdep-cancel.h
--- nptl/sysdeps/unix/sysv/linux/ia64/sysdep-cancel.h	8 Jul 2003 03:47:52 -0000	1.7
+++ nptl/sysdeps/unix/sysv/linux/ia64/sysdep-cancel.h	3 Nov 2003 21:26:45 -0000
@@ -29,13 +29,21 @@
 # define PSEUDO(name, syscall_name, args)				      \
 .text;									      \
 ENTRY (name)								      \
-     adds r14 = MULTIPLE_THREADS_OFFSET, r13;;				      \
+     .prologue;								      \
+     adds r2 = SYSINFO_OFFSET, r13;					      \
+     adds r14 = MULTIPLE_THREADS_OFFSET, r13;				      \
+     .save ar.pfs, r11;							      \
+     mov r11 = ar.pfs;;							      \
+     .body;								      \
      ld4 r14 = [r14];							      \
+     ld8 r2 = [r2];							      \
      mov r15 = SYS_ify(syscall_name);;					      \
      cmp4.ne p6, p7 = 0, r14;						      \
-(p6) br.cond.spnt .Lpseudo_cancel;;					      \
-     break __BREAK_SYSCALL;;						      \
-     cmp.eq p6,p0=-1,r10;						      \
+     mov b7 = r2;							      \
+(p6) br.cond.spnt .Lpseudo_cancel;					      \
+     br.call.sptk.many b6 = b7;;					      \
+     mov ar.pfs = r11;							      \
+     cmp.eq p6,p0 = -1, r10;						      \
 (p6) br.cond.spnt.few __syscall_error;					      \
      ret;;								      \
      .endp name;							      \
@@ -45,17 +53,20 @@
 __GC_##name:								      \
 .Lpseudo_cancel:							      \
      .prologue;								      \
-     .regstk args, 5, args, 0;						      \
+     .regstk args, 6, args, 0;						      \
      .save ar.pfs, loc0;						      \
-     alloc loc0 = ar.pfs, args, 5, args, 0;				      \
+     alloc loc0 = ar.pfs, args, 6, args, 0;				      \
+     adds loc5 = SYSINFO_OFFSET, r13;					      \
      .save rp, loc1;							      \
      mov loc1 = rp;;							      \
      .body;								      \
+     ld8 loc5 = [loc5];							      \
      CENABLE;;								      \
      mov loc2 = r8;							      \
+     mov b7 = loc5;							      \
      COPY_ARGS_##args							      \
      mov r15 = SYS_ify(syscall_name);					      \
-     break __BREAK_SYSCALL;;						      \
+     br.call.sptk.many b6 = b7;;					      \
      mov loc3 = r8;							      \
      mov loc4 = r10;							      \
      mov out0 = loc2;							      \
Index: sysdeps/ia64/elf/start.S
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/ia64/elf/start.S,v
retrieving revision 1.12
diff -u -r1.12 start.S
--- sysdeps/ia64/elf/start.S	29 Mar 2003 19:18:27 -0000	1.12
+++ sysdeps/ia64/elf/start.S	3 Nov 2003 21:26:47 -0000
@@ -19,6 +19,8 @@
 
 #include <sysdep.h>
 
+#undef ret
+
 #include <asm/unistd.h>
 #include <asm/fpu.h>
 
Index: sysdeps/unix/sysv/linux/ia64/brk.S
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/unix/sysv/linux/ia64/brk.S,v
retrieving revision 1.4
diff -u -r1.4 brk.S
--- sysdeps/unix/sysv/linux/ia64/brk.S	3 Mar 2003 07:11:46 -0000	1.4
+++ sysdeps/unix/sysv/linux/ia64/brk.S	3 Nov 2003 21:26:52 -0000
@@ -35,19 +35,17 @@
 weak_alias (__curbrk, ___brk_addr)
 
 LEAF(__brk)
-	mov	r15=__NR_brk
-	break.i	__BREAK_SYSCALL
+	.regstk 1, 0, 0, 0
+	DO_CALL(__NR_brk)
+	cmp.ltu	p6, p0 = ret0, in0
+	addl r9 = @ltoff(__curbrk), gp
 	;;
-	cmp.ltu	p6,p0=ret0,r32	/* r32 is the input register, even though we
-				   haven't allocated a frame */
-	addl	r9=@ltoff(__curbrk),gp
-	;;
-	ld8	r9=[r9]
-(p6) 	mov	ret0=ENOMEM
+	ld8 r9 = [r9]
+(p6) 	mov ret0 = ENOMEM
 (p6)	br.cond.spnt.few __syscall_error
 	;;
-	st8	[r9]=ret0
-	mov 	ret0=0
+	st8 [r9] = ret0
+	mov ret0 = 0
 	ret
 END(__brk)
 
Index: sysdeps/unix/sysv/linux/ia64/clone2.S
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/unix/sysv/linux/ia64/clone2.S,v
retrieving revision 1.7
diff -u -r1.7 clone2.S
--- sysdeps/unix/sysv/linux/ia64/clone2.S	13 Mar 2003 04:36:59 -0000	1.7
+++ sysdeps/unix/sysv/linux/ia64/clone2.S	3 Nov 2003 21:26:52 -0000
@@ -25,49 +25,56 @@
 /* 	         size_t child_stack_size, int flags, void *arg,		*/
 /*	         pid_t *parent_tid, void *tls, pid_t *child_tid)	*/
 
+#define CHILD	p8
+#define PARENT	p9
+
 ENTRY(__clone2)
-	alloc r2=ar.pfs,8,2,6,0
+	.prologue
+	alloc r2=ar.pfs,8,0,6,0
 	cmp.eq p6,p0=0,in0
 	mov r8=EINVAL
-(p6)	br.cond.spnt.few __syscall_error
-	;;
-	flushrs			/* This is necessary, since the child	*/
-				/* will be running with the same 	*/
-				/* register backing store for a few 	*/
-				/* instructions.  We need to ensure	*/
-				/* that it will not read or write the	*/
-				/* backing store.			*/
-	mov loc0=in0		/* save fn	*/
-	mov loc1=in4		/* save arg	*/
 	mov out0=in3		/* Flags are first syscall argument.	*/
 	mov out1=in1		/* Stack address.			*/
+(p6)	br.cond.spnt.many __syscall_error
+	;;
 	mov out2=in2		/* Stack size.				*/
 	mov out3=in5		/* Parent TID Pointer			*/
 	mov out4=in7		/* Child TID Pointer			*/
  	mov out5=in6		/* TLS pointer				*/
-        DO_CALL (SYS_ify (clone2))
+	/*
+	 * clone2() is special: the child cannot execute br.ret right
+	 * after the system call returns, because it starts out
+	 * executing on an empty stack.  Because of this, we can't use
+	 * the new (lightweight) syscall convention here.  Instead, we
+	 * just fall back on always using "break".
+	 *
+	 * Furthermore, since the child starts with an empty stack, we
+	 * need to avoid unwinding past invalid memory.  To that end,
+	 * we'll pretend now that __clone2() is the end of the
+	 * call-chain.  This is wrong for the parent, but only until
+	 * it returns from clone2() but it's better than the
+	 * alternative.
+	 */
+	mov r15=SYS_ify (clone2)
+	.save rp, r0
+	break __BREAK_SYSCALL
+	.body
         cmp.eq p6,p0=-1,r10
+	cmp.eq CHILD,PARENT=0,r8 /* Are we the child?   */
+(p6)	br.cond.spnt.many __syscall_error
 	;;
-(p6)	br.cond.spnt.few __syscall_error
-
-#	define CHILD p6
-#	define PARENT p7
-	cmp.eq CHILD,PARENT=0,r8 /* Are we the child?	*/
-	;;
-(CHILD)	ld8 out1=[loc0],8	/* Retrieve code pointer.	*/
-(CHILD)	mov out0=loc1		/* Pass proper argument	to fn */
+(CHILD)	ld8 out1=[in0],8	/* Retrieve code pointer.	*/
+(CHILD)	mov out0=in4		/* Pass proper argument	to fn */
 (PARENT) ret
 	;;
-	ld8 gp=[loc0]		/* Load function gp.		*/
+	ld8 gp=[in0]		/* Load function gp.		*/
 	mov b6=out1
-	;;
-	br.call.dptk.few rp=b6	/* Call fn(arg) in the child 	*/
+	br.call.dptk.many rp=b6	/* Call fn(arg) in the child 	*/
 	;;
 	mov out0=r8		/* Argument to _exit		*/
 	.globl _exit
-	br.call.dpnt.few rp=_exit /* call _exit with result from fn.	*/
+	br.call.dpnt.many rp=_exit /* call _exit with result from fn.	*/
 	ret			/* Not reached.		*/
-
 PSEUDO_END(__clone2)
 
 /* For now we leave __clone undefined.  This is unlikely to be a	*/
Index: sysdeps/unix/sysv/linux/ia64/getcontext.S
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/unix/sysv/linux/ia64/getcontext.S,v
retrieving revision 1.8
diff -u -r1.8 getcontext.S
--- sysdeps/unix/sysv/linux/ia64/getcontext.S	26 Sep 2003 08:41:51 -0000	1.8
+++ sysdeps/unix/sysv/linux/ia64/getcontext.S	3 Nov 2003 21:26:53 -0000
@@ -35,26 +35,27 @@
 
 ENTRY(__getcontext)
 	.prologue
-	alloc r16 = ar.pfs, 1, 0, 4, 0
+	.body
+	alloc r11 = ar.pfs, 1, 0, 4, 0
 
 	// sigprocmask (SIG_BLOCK, NULL, &sc->sc_mask):
 
-	mov r2 = SC_MASK
-	mov r15 = __NR_rt_sigprocmask
-	;;
+	mov r3 = SC_MASK
 	mov out0 = SIG_BLOCK
-	mov out1 = 0
-	add out2 = r2, in0
-	mov out3 = 8	// sizeof kernel sigset_t
 
-	break __BREAK_SYSCALL
 	flushrs					// save dirty partition on rbs
+	mov out1 = 0
+	add out2 = r3, in0
+
+	mov out3 = 8	// sizeof kernel sigset_t
+	DO_CALL(__NR_rt_sigprocmask)
 
 	mov.m rFPSR = ar.fpsr
 	mov.m rRSC = ar.rsc
 	add r2 = SC_GR+1*8, r32
 	;;
 	mov.m rBSP = ar.bsp
+	.prologue
 	.save ar.unat, rUNAT
 	mov.m rUNAT = ar.unat
 	.body
@@ -63,7 +64,7 @@
 
 .mem.offset 0,0; st8.spill [r2] = r1, (5*8 - 1*8)
 .mem.offset 8,0; st8.spill [r3] = r4, 16
-	mov.i rPFS = ar.pfs
+	mov rPFS = r11
 	;;
 .mem.offset 0,0; st8.spill [r2] = r5, 16
 .mem.offset 8,0; st8.spill [r3] = r6, 48
Index: sysdeps/unix/sysv/linux/ia64/setcontext.S
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/unix/sysv/linux/ia64/setcontext.S,v
retrieving revision 1.4
diff -u -r1.4 setcontext.S
--- sysdeps/unix/sysv/linux/ia64/setcontext.S	28 May 2003 20:45:25 -0000	1.4
+++ sysdeps/unix/sysv/linux/ia64/setcontext.S	3 Nov 2003 21:26:53 -0000
@@ -32,20 +32,21 @@
   other than the PRESERVED state.  */
 
 ENTRY(__setcontext)
-	alloc r16 = ar.pfs, 1, 0, 4, 0
+	.prologue
+	.body
+	alloc r11 = ar.pfs, 1, 0, 4, 0
 
 	// sigprocmask (SIG_SETMASK, &sc->sc_mask, NULL):
 
-	mov r2 = SC_MASK
-	mov r15 = __NR_rt_sigprocmask
-	;;
+	mov r3 = SC_MASK
 	mov out0 = SIG_SETMASK
-	add out1 = r2, in0
+	;;
+	add out1 = r3, in0
 	mov out2 = 0
 	mov out3 = 8	// sizeof kernel sigset_t
 
 	invala
-	break __BREAK_SYSCALL
+	DO_CALL(__NR_rt_sigprocmask)
 	add r2 = SC_NAT, r32
 
 	add r3 = SC_RNAT, r32			// r3 <- &sc_ar_rnat
Index: sysdeps/unix/sysv/linux/ia64/sysdep.h
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/unix/sysv/linux/ia64/sysdep.h,v
retrieving revision 1.17
diff -u -r1.17 sysdep.h
--- sysdeps/unix/sysv/linux/ia64/sysdep.h	16 Aug 2003 08:00:24 -0000	1.17
+++ sysdeps/unix/sysv/linux/ia64/sysdep.h	3 Nov 2003 21:26:53 -0000
@@ -23,6 +23,8 @@
 
 #include <sysdeps/unix/sysdep.h>
 #include <sysdeps/ia64/sysdep.h>
+#include <dl-sysdep.h>
+#include <tls.h>
 
 /* For Linux we can use the system call table in the header file
 	/usr/include/asm/unistd.h
@@ -51,6 +53,13 @@
 # define __NR_semtimedop 1247
 #endif
 
+#if defined USE_DL_SYSINFO \
+	&& (!defined NOT_IN_libc || defined IS_IN_libpthread)
+# define IA64_USE_NEW_STUB
+#else
+# undef IA64_USE_NEW_STUB
+#endif
+
 #ifdef __ASSEMBLER__
 
 #undef CALL_MCOUNT
@@ -95,9 +104,41 @@
 	cmp.eq p6,p0=-1,r10;			\
 (p6)	br.cond.spnt.few __syscall_error;
 
-#define DO_CALL(num)				\
+#define DO_CALL_VIA_BREAK(num)			\
 	mov r15=num;				\
-	break __BREAK_SYSCALL;
+	break __BREAK_SYSCALL
+
+#ifdef IA64_USE_NEW_STUB
+# ifdef SHARED
+#  define DO_CALL(num)				\
+	.prologue;				\
+        adds r2 = SYSINFO_OFFSET, r13;;		\
+        ld8 r2 = [r2];				\
+	.save ar.pfs, r11;			\
+        mov r11 = ar.pfs;;			\
+	.body;					\
+	mov r15 = num;				\
+        mov b7 = r2;				\
+        br.call.sptk.many b6 = b7;;		\
+	.restore sp;				\
+        mov ar.pfs = r11
+# else /* !SHARED */
+#  define DO_CALL(num)				\
+	.prologue;				\
+	movl r2 = _dl_sysinfo;;			\
+        ld8 r2 = [r2];				\
+	.save ar.pfs, r11;			\
+        mov r11 = ar.pfs;;			\
+	.body;					\
+	mov r15 = num;				\
+        mov b7 = r2;				\
+        br.call.sptk.many b6 = b7;;		\
+	.restore sp;				\
+        mov ar.pfs = r11
+# endif
+#else
+# define DO_CALL(num)				DO_CALL_VIA_BREAK(num)
+#endif
 
 #undef PSEUDO_END
 #define PSEUDO_END(name)	.endp C_SYMBOL_NAME(name);
@@ -144,6 +185,48 @@
    (non-negative) errno on error or the return value on success.
  */
 #undef INLINE_SYSCALL
+#undef INTERNAL_SYSCALL
+
+#ifdef IA64_USE_NEW_STUB
+#define DO_INLINE_SYSCALL(name, nr, args...)				\
+    register long _r8 __asm ("r8");					\
+    register long _r10 __asm ("r10");					\
+    register long _r15 __asm ("r15") = __NR_##name;			\
+    long _retval;							\
+    LOAD_ARGS_##nr (args);						\
+    /*									\
+     * Don't specify any unwind info here.  We mark ar.pfs as		\
+     * clobbered.  This will force the compiler to save ar.pfs		\
+     * somewhere and emit appropriate unwind info for that save.	\
+     */									\
+    __asm __volatile ("adds r2 = -8, r13;;\n"				\
+		      "ld8 r2 = [r2];;\n"				\
+		      "mov b7=r2;\n"					\
+		      "br.call.sptk.many b6=b7;;\n"			\
+                      : "=r" (_r8), "=r" (_r10), "=r" (_r15)		\
+		        ASM_OUTARGS_##nr				\
+                      : "2" (_r15) ASM_ARGS_##nr			\
+		      : "memory", "ar.pfs" ASM_CLOBBERS_##nr);		\
+    _retval = _r8;
+
+#define INLINE_SYSCALL(name, nr, args...)	\
+  ({						\
+    DO_INLINE_SYSCALL(name, nr, args)		\
+    if (_r10 == -1)				\
+      {						\
+        __set_errno (_retval);			\
+        _retval = -1;				\
+      }						\
+    _retval; })
+
+#define INTERNAL_SYSCALL(name, err, nr, args...)	\
+  ({							\
+    DO_INLINE_SYSCALL(name, nr, args)			\
+    err = _r10;						\
+    _retval; })
+
+#else /* !IA64_USE_NEW_STUB */
+
 #define INLINE_SYSCALL(name, nr, args...)			\
   ({								\
     register long _r8 asm ("r8");				\
@@ -164,10 +247,6 @@
       }								\
     _retval; })
 
-#undef INTERNAL_SYSCALL_DECL
-#define INTERNAL_SYSCALL_DECL(err) long int err
-
-#undef INTERNAL_SYSCALL
 #define INTERNAL_SYSCALL(name, err, nr, args...)		\
   ({								\
     register long _r8 asm ("r8");				\
@@ -184,6 +263,11 @@
     err = _r10;							\
     _retval; })
 
+#endif /* !IA64_USE_NEW_STUB */
+
+#undef INTERNAL_SYSCALL_DECL
+#define INTERNAL_SYSCALL_DECL(err) long int err
+
 #undef INTERNAL_SYSCALL_ERROR_P
 #define INTERNAL_SYSCALL_ERROR_P(val, err)	(err == -1)
 
@@ -226,12 +310,6 @@
 #define ASM_ARGS_5	ASM_ARGS_4, "7" (_out4)
 #define ASM_ARGS_6	ASM_ARGS_5, "8" (_out5)
 
-#define ASM_CLOBBERS_0	ASM_CLOBBERS_1, "out0"
-#define ASM_CLOBBERS_1	ASM_CLOBBERS_2, "out1"
-#define ASM_CLOBBERS_2	ASM_CLOBBERS_3, "out2"
-#define ASM_CLOBBERS_3	ASM_CLOBBERS_4, "out3"
-#define ASM_CLOBBERS_4	ASM_CLOBBERS_5, "out4"
-#define ASM_CLOBBERS_5	ASM_CLOBBERS_6, "out5"
 #define ASM_CLOBBERS_6	, "out6", "out7",				\
   /* Non-stacked integer registers, minus r8, r10, r15.  */		\
   "r2", "r3", "r9", "r11", "r12", "r13", "r14", "r16", "r17", "r18",	\
Index: sysdeps/unix/sysv/linux/ia64/vfork.S
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/unix/sysv/linux/ia64/vfork.S,v
retrieving revision 1.4
diff -u -r1.4 vfork.S
--- sysdeps/unix/sysv/linux/ia64/vfork.S	31 Dec 2002 20:37:30 -0000	1.4
+++ sysdeps/unix/sysv/linux/ia64/vfork.S	3 Nov 2003 21:26:53 -0000
@@ -34,9 +34,8 @@
 	mov out0=CLONE_VM+CLONE_VFORK+SIGCHLD
 	mov out1=0		/* Standard sp value.			*/
 	;;
-	DO_CALL (SYS_ify (clone))
+	DO_CALL_VIA_BREAK (SYS_ify (clone))
 	cmp.eq p6,p0=-1,r10
-	;;
 (p6)	br.cond.spnt.few __syscall_error
 	ret
 PSEUDO_END(__vfork)
Index: nptl/sysdeps/unix/sysv/linux/ia64/dl-sysdep.h
===================================================================
--- /dev/null	2003-08-25 16:34:40.000000000 -0700
+++ nptl/sysdeps/unix/sysv/linux/ia64/dl-sysdep.h	2003-10-30 14:50:53.000000000 -0800
@@ -0,0 +1,64 @@
+/* System-specific settings for dynamic linker code.  IA-64 version.
+   Copyright (C) 2003 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+   02111-1307 USA.  */
+
+#ifndef _DL_SYSDEP_H
+#define _DL_SYSDEP_H	1
+
+/* This macro must be defined to either 0 or 1.
+
+   If 1, then an errno global variable hidden in ld.so will work right with
+   all the errno-using libc code compiled for ld.so, and there is never a
+   need to share the errno location with libc.  This is appropriate only if
+   all the libc functions that ld.so uses are called without PLT and always
+   get the versions linked into ld.so rather than the libc ones.  */
+
+#ifdef IS_IN_rtld
+# define RTLD_PRIVATE_ERRNO 1
+#else
+# define RTLD_PRIVATE_ERRNO 0
+#endif
+
+/* Traditionally system calls have been made using break 0x100000.  A
+   second method was introduced which, if possible, will use the EPC
+   instruction.  To signal the presence and where to find the code the
+   kernel passes an AT_SYSINFO_EHDR pointer in the auxiliary vector to
+   the application.  */
+#define NEED_DL_SYSINFO	1
+#define USE_DL_SYSINFO	1
+
+#if defined NEED_DL_SYSINFO && !defined __ASSEMBLER__
+/* Don't declare this as a function---we want its entry-point, not
+   its function descriptor... */
+extern int _dl_sysinfo_break attribute_hidden;
+# define DL_SYSINFO_DEFAULT ((uintptr_t) &_dl_sysinfo_break)
+# define DL_SYSINFO_IMPLEMENTATION		\
+  asm (".text\n\t"				\
+       ".hidden _dl_sysinfo_break\n\t"		\
+       ".proc _dl_sysinfo_break\n\t"		\
+       "_dl_sysinfo_break:\n\t"			\
+       ".prologue\n\t"				\
+       ".altrp b6\n\t"				\
+       ".body\n\t"				\
+       "break 0x100000;\n\t"			\
+       "br.ret.sptk.many b6;\n\t"		\
+       ".endp _dl_sysinfo_break\n\t"		\
+       ".previous");
+#endif
+
+#endif	/* dl-sysdep.h */


* Re: new syscall stub support for ia64 libc
  2003-10-31 16:54           ` Jakub Jelinek
  2003-10-31 18:29             ` David Mosberger
  2003-11-03 21:46             ` David Mosberger
@ 2003-11-12 22:53             ` David Mosberger
  2003-11-12 23:10               ` Ulrich Drepper
  2003-11-13  7:32               ` David Mosberger
  2 siblings, 2 replies; 98+ messages in thread
From: David Mosberger @ 2003-11-12 22:53 UTC (permalink / raw)
  To: drepper, Jakub Jelinek; +Cc: davidm, libc-hacker

OK, here is the latest patch to add the new syscall stub support for
ia64.  I hope this will be the second-last version.  Assuming there
are no huge complaints, I hope the next patch will be the final one
that can be applied.  The attached patch seems to work reasonably
well:

 - with linuxthreads, make check shows one failure in tst-numeric
   (probably an existing issue)
 - with nptl, make check shows failures in tst-cancel{6,9,17};
   I'll need to look into these as it's likely that I botched
   something in sysdep-cancel.h.

What's changed:

 - The sysinfo pointer is now stored in tcbhead_t.private, as per
   Jakub's suggestion (this means the offset changes from -8 to +8).
   The new THREAD_SELF_SYSINFO and THREAD_SYSINFO macros were added
   to both x86 and ia64, so the patch shouldn't break x86 (or any other
   platform for that matter).

 - As per Rich's suggestion, I tweaked DO_INLINE_SYSCALL so GCC
   takes care of loading the sysinfo-pointer into the branch-register.
   Unfortunately, GCC 3.3.2 inserts an unnecessary stop-bit between
   the load to the branch register and the indirect branch, but that's
   a compiler bug, so I'm not going to worry about it as far as libc
   is concerned.

 - The inline-syscall macros are now in a separate file,
   sysdep-inline.h, which makes it possible for lowlevellock.h and
   sysdep.h to share the necessary macros.  Also, the thread-pointer
   variable __thread_self is now declared in sysdep-inline.h as a
   "void *".  We can argue whether this is the perfect place, but
   it's clear to me that we need to declare the variable in _some_
   header file that can be included anywhere, as anything else causes
   lots of headaches.
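
As a rough illustration of the first point (a sketch only, not part of
the patch): the sysinfo pointer sits one pointer past the start of the
TCB header, and the TCB header directly follows the thread descriptor.
All sketch_* names below are hypothetical stand-ins, the field the
patch calls "private" is named "priv" here, and the thread pointer is
passed explicitly instead of being read from r13.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for the ia64 TCB header: dtv first, then the slot that now
   holds the sysinfo pointer ("private" in the patch).  */
typedef struct
{
  void *dtv;
  void *priv;
} sketch_tcbhead_t;

/* Stand-in for struct pthread; the real contents do not matter here.  */
struct sketch_pthread
{
  long pad[4];
};

/* Models THREAD_SYSINFO(pd): the TCB header follows the descriptor.  */
#define SKETCH_THREAD_SYSINFO(pd) \
  (((sketch_tcbhead_t *) ((pd) + 1))->priv)

/* Models THREAD_SELF_SYSINFO, with the thread pointer given explicitly.  */
#define SKETCH_SELF_SYSINFO(tp) \
  (((sketch_tcbhead_t *) (tp))->priv)

/* Models the assembly sequence
     adds r2 = SYSINFO_OFFSET, r13;;  ld8 r2 = [r2]
   where SYSINFO_OFFSET is offsetof (tcbhead_t, private).  */
static void *
sketch_load_sysinfo (void *tp)
{
  return *(void **) ((char *) tp + offsetof (sketch_tcbhead_t, priv));
}

/* Returns 1 if all three access paths see the same slot, 0 if they
   disagree, and -1 if the allocation fails.  */
static int
sketch_sysinfo_demo (void)
{
  static int fake_stub;	/* stands in for the syscall entry point */
  struct sketch_pthread *pd
    = malloc (sizeof *pd + sizeof (sketch_tcbhead_t));
  void *tp;
  int ok;

  if (pd == NULL)
    return -1;

  tp = pd + 1;		/* thread pointer = start of the TCB header */
  SKETCH_THREAD_SYSINFO (pd) = &fake_stub;
  ok = (SKETCH_SELF_SYSINFO (tp) == (void *) &fake_stub
	&& sketch_load_sysinfo (tp) == (void *) &fake_stub);
  free (pd);
  return ok;
}
```

With 8-byte pointers the slot lands at offset +8 from the thread
pointer, which matches the offset change mentioned in the first item.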

Please take a good look and let me know if you find any issues.

Thanks!

	--david

Index: elf/rtld.c
--- elf/rtld.c
+++ elf/rtld.c
@@ -1169,7 +1169,7 @@
 		  l->l_ldnum = ph->p_memsz / sizeof (ElfW(Dyn));
 		  break;
 		}
-	      if (ph->p_type == PT_LOAD)
+	      if (i == 0 && ph->p_type == PT_LOAD)
 		assert ((void *) ph->p_vaddr == GL(dl_sysinfo_dso));
 	    }
 	  elf_get_dynamic_info (l, dyn_temp);
Index: linuxthreads/sysdeps/ia64/pt-machine.h
--- linuxthreads/sysdeps/ia64/pt-machine.h
+++ linuxthreads/sysdeps/ia64/pt-machine.h
@@ -22,6 +22,7 @@
 #define _PT_MACHINE_H   1
 
 #include <ia64intrin.h>
+#include <sysdep-inline.h>
 
 #ifndef PT_EI
 # define PT_EI extern inline __attribute__ ((always_inline))
@@ -51,11 +52,6 @@
 #define CURRENT_STACK_FRAME  stack_pointer
 register char *stack_pointer __asm__ ("sp");
 
-
-/* Register r13 (tp) is reserved by the ABI as "thread pointer". */
-struct _pthread_descr_struct;
-register struct _pthread_descr_struct *__thread_self __asm__("r13");
-
 /* Return the thread descriptor for the current thread.  */
 #define THREAD_SELF  __thread_self
 
Index: linuxthreads/sysdeps/unix/sysv/linux/ia64/dl-sysdep.h
--- /dev/null
+++ linuxthreads/sysdeps/unix/sysv/linux/ia64/dl-sysdep.h
@@ -0,0 +1,45 @@
+/* System-specific settings for dynamic linker code.  IA-64 version.
+   Copyright (C) 2003 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+   02111-1307 USA.  */
+
+#ifndef _DL_SYSDEP_H
+#define _DL_SYSDEP_H	1
+
+#define NEED_DL_SYSINFO	1
+#undef USE_DL_SYSINFO
+
+#if defined NEED_DL_SYSINFO && !defined __ASSEMBLER__
+/* Don't declare this as a function---we want its entry-point, not
+   its function descriptor... */
+extern int _dl_sysinfo_break attribute_hidden;
+# define DL_SYSINFO_DEFAULT ((uintptr_t) &_dl_sysinfo_break)
+# define DL_SYSINFO_IMPLEMENTATION		\
+  asm (".text\n\t"				\
+       ".hidden _dl_sysinfo_break\n\t"		\
+       ".proc _dl_sysinfo_break\n\t"		\
+       "_dl_sysinfo_break:\n\t"			\
+       ".prologue\n\t"				\
+       ".altrp b6\n\t"				\
+       ".body\n\t"				\
+       "break 0x100000;\n\t"			\
+       "br.ret.sptk.many b6;\n\t"		\
+       ".endp _dl_sysinfo_break\n\t"		\
+       ".previous");
+#endif
+
+#endif	/* dl-sysdep.h */
Index: nptl/ChangeLog
--- nptl/ChangeLog
+++ nptl/ChangeLog
@@ -1 +1,9 @@
+2003-11-12  David Mosberger  <davidm@hpl.hp.com>
+
+	* sysdeps/i386/tls.h (THREAD_SELF_SYSINFO): New macro.
+	(THREAD_SYSINFO): Ditto.
+
+	* sysdeps/ia64/tls.h (THREAD_SELF_SYSINFO): New macro.
+	(THREAD_SYSINFO): Ditto.
+
 2003-11-06  Ulrich Drepper  <drepper@redhat.com>
Index: nptl/allocatestack.c
--- nptl/allocatestack.c
+++ nptl/allocatestack.c
@@ -352,7 +352,7 @@
 
 #ifdef NEED_DL_SYSINFO
       /* Copy the sysinfo value from the parent.  */
-      pd->header.sysinfo = THREAD_GETMEM (THREAD_SELF, header.sysinfo);
+      THREAD_SYSINFO(pd) = THREAD_SELF_SYSINFO;
 #endif
 
       /* The process ID is also the same as that of the caller.  */
@@ -488,7 +488,7 @@
 
 #ifdef NEED_DL_SYSINFO
 	  /* Copy the sysinfo value from the parent.  */
-	  pd->header.sysinfo = THREAD_GETMEM (THREAD_SELF, header.sysinfo);
+	  THREAD_SYSINFO(pd) = THREAD_SELF_SYSINFO;
 #endif
 
 	  /* The process ID is also the same as that of the caller.  */
Index: nptl/descr.h
Index: nptl/sysdeps/i386/tls.h
--- nptl/sysdeps/i386/tls.h
+++ nptl/sysdeps/i386/tls.h
@@ -128,6 +128,8 @@
 # define GET_DTV(descr) \
   (((tcbhead_t *) (descr))->dtv)
 
+#define THREAD_SELF_SYSINFO	THREAD_GETMEM (THREAD_SELF, header.sysinfo)
+#define THREAD_SYSINFO(pd)	((pd)->header.sysinfo)
 
 /* Macros to load from and store into segment registers.  */
 # ifndef TLS_GET_GS
Index: nptl/sysdeps/ia64/tcb-offsets.sym
--- nptl/sysdeps/ia64/tcb-offsets.sym
+++ nptl/sysdeps/ia64/tcb-offsets.sym
@@ -2,3 +2,4 @@
 #include <tls.h>
 
 MULTIPLE_THREADS_OFFSET offsetof (struct pthread, header.multiple_threads) - sizeof (struct pthread)
+SYSINFO_OFFSET		offsetof (tcbhead_t, private)
Index: nptl/sysdeps/ia64/tls.h
--- nptl/sysdeps/ia64/tls.h
+++ nptl/sysdeps/ia64/tls.h
@@ -64,8 +64,6 @@
 /* Get system call information.  */
 # include <sysdep.h>
 
-register struct pthread *__thread_self __asm__("r13");
-
 /* This is the size of the initial TCB.  */
 # define TLS_INIT_TCB_SIZE sizeof (tcbhead_t)
 
@@ -100,18 +98,27 @@
 #  define GET_DTV(descr) \
   (((tcbhead_t *) (descr))->dtv)
 
+#define THREAD_SELF_SYSINFO	(((tcbhead_t *) __thread_self)->private)
+#define THREAD_SYSINFO(pd)	(((tcbhead_t *) ((pd) + 1))->private)
+
+#if defined NEED_DL_SYSINFO
+# define INIT_SYSINFO   THREAD_SELF_SYSINFO = GL(dl_sysinfo)
+#else
+# define INIT_SYSINFO   NULL
+#endif
+
 /* Code to initially initialize the thread pointer.  This might need
    special attention since 'errno' is not yet available and if the
    operation can cause a failure 'errno' must not be touched.  */
 # define TLS_INIT_TP(thrdescr, secondcall) \
-  (__thread_self = (thrdescr), NULL)
+  (__thread_self = (thrdescr), INIT_SYSINFO, NULL)
 
 /* Return the address of the dtv for the current thread.  */
 #  define THREAD_DTV() \
   (((tcbhead_t *)__thread_self)->dtv)
 
 /* Return the thread descriptor for the current thread.  */
-# define THREAD_SELF (__thread_self - 1)
+# define THREAD_SELF ((struct pthread *) __thread_self - 1)
 
 /* Magic for libthread_db to know how to do THREAD_SELF.  */
 # define DB_THREAD_SELF REGISTER (64, 13 * 8, -sizeof (struct pthread))
Index: nptl/sysdeps/pthread/createthread.c
--- nptl/sysdeps/pthread/createthread.c
+++ nptl/sysdeps/pthread/createthread.c
@@ -226,7 +226,7 @@
     }
 
 #ifdef NEED_DL_SYSINFO
-  assert (THREAD_GETMEM (THREAD_SELF, header.sysinfo) == pd->header.sysinfo);
+  assert (THREAD_SELF_SYSINFO == THREAD_SYSINFO(pd));
 #endif
 
   /* Actually create the thread.  */
Index: nptl/sysdeps/unix/sysv/linux/ia64/lowlevellock.h
--- nptl/sysdeps/unix/sysv/linux/ia64/lowlevellock.h
+++ nptl/sysdeps/unix/sysv/linux/ia64/lowlevellock.h
@@ -26,7 +26,7 @@
 #include <ia64intrin.h>
 #include <atomic.h>
 
-#define SYS_futex		1230
+#define __NR_futex		1230
 #define FUTEX_WAIT		0
 #define FUTEX_WAKE		1
 #define FUTEX_REQUEUE		3
@@ -34,81 +34,30 @@
 /* Initializer for compatibility lock.	*/
 #define LLL_MUTEX_LOCK_INITIALIZER (0)
 
-#define lll_futex_clobbers \
-  "out5", "out6", "out7",						      \
-  /* Non-stacked integer registers, minus r8, r10, r15.  */		      \
-  "r2", "r3", "r9", "r11", "r12", "r13", "r14", "r16", "r17", "r18",	      \
-  "r19", "r20", "r21", "r22", "r23", "r24", "r25", "r26", "r27",	      \
-  "r28", "r29", "r30", "r31",						      \
-  /* Predicate registers.  */						      \
-  "p6", "p7", "p8", "p9", "p10", "p11", "p12", "p13", "p14", "p15",	      \
-  /* Non-rotating fp registers.  */					      \
-  "f6", "f7", "f8", "f9", "f10", "f11", "f12", "f13", "f14", "f15",	      \
-  /* Branch registers.  */						      \
-  "b6", "b7",								      \
-  "memory"
+#define IA64_USE_NEW_STUB
+#include <sysdep-inline.h>
 
 #define lll_futex_wait(futex, val) lll_futex_timed_wait (futex, val, 0)
 
-#define lll_futex_timed_wait(futex, val, timespec) \
-  ({									      \
-     register long int __o0 asm ("out0") = (long int) (futex);		      \
-     register long int __o1 asm ("out1") = FUTEX_WAIT;			      \
-     register int __o2 asm ("out2") = (int) (val);			      \
-     register long int __o3 asm ("out3") = (long int) (timespec);	      \
-     register long int __r8 asm ("r8");					      \
-     register long int __r10 asm ("r10");				      \
-     register long int __r15 asm ("r15") = SYS_futex;			      \
-									      \
-     __asm __volatile ("break %7;;"					      \
-		       : "=r" (__r8), "=r" (__r10), "=r" (__r15),	      \
-			 "=r" (__o0), "=r" (__o1), "=r" (__o2), "=r" (__o3)   \
-		       : "i" (0x100000), "2" (__r15), "3" (__o0), "4" (__o1), \
-		 	 "5" (__o2), "6" (__o3)				      \
-		       : "out4", lll_futex_clobbers);			      \
-     __r10 == -1 ? -__r8 : __r8;					      \
-  })
-
-
-#define lll_futex_wake(futex, nr) \
-  ({									      \
-     register long int __o0 asm ("out0") = (long int) (futex);		      \
-     register long int __o1 asm ("out1") = FUTEX_WAKE;			      \
-     register int __o2 asm ("out2") = (int) (nr);			      \
-     register long int __r8 asm ("r8");					      \
-     register long int __r10 asm ("r10");				      \
-     register long int __r15 asm ("r15") = SYS_futex;			      \
-									      \
-     __asm __volatile ("break %6;;"					      \
-		       : "=r" (__r8), "=r" (__r10), "=r" (__r15),	      \
-			 "=r" (__o0), "=r" (__o1), "=r" (__o2)		      \
-		       : "i" (0x100000), "2" (__r15), "3" (__o0), "4" (__o1), \
-		 	 "5" (__o2)					      \
-		       : "out3", "out4", lll_futex_clobbers);		      \
-     __r10 == -1 ? -__r8 : __r8;					      \
-  })
-
-
-#define lll_futex_requeue(futex, nr_wake, nr_move, mutex) \
-  ({									      \
-     register long int __o0 asm ("out0") = (long int) (futex);		      \
-     register long int __o1 asm ("out1") = FUTEX_REQUEUE;		      \
-     register int __o2 asm ("out2") = (int) (nr_wake);			      \
-     register int __o3 asm ("out3") = (int) (nr_move);			      \
-     register long int __o4 asm ("out4") = (long int) (mutex);		      \
-     register long int __r8 asm ("r8");					      \
-     register long int __r10 asm ("r10");				      \
-     register long int __r15 asm ("r15") = SYS_futex;			      \
-									      \
-     __asm __volatile ("break %8;;"					      \
-		       : "=r" (__r8), "=r" (__r10), "=r" (__r15),	      \
-			 "=r" (__o0), "=r" (__o1), "=r" (__o2), "=r" (__o3),  \
-			 "=r" (__o4)					      \
-		       : "i" (0x100000), "2" (__r15), "3" (__o0), "4" (__o1), \
-			 "5" (__o2), "6" (__o3), "7" (__o4)		      \
-		       : lll_futex_clobbers);				      \
-     __r10 == -1 ? -__r8 : __r8;					      \
-  })
+#define lll_futex_timed_wait(ftx, val, timespec)			\
+({									\
+   DO_INLINE_SYSCALL(futex, 4, (long) (ftx), FUTEX_WAIT, (int) (val),	\
+		     (long) (timespec));				\
+   _r10 == -1 ? -_retval : _retval;					\
+})
+
+#define lll_futex_wake(ftx, nr)						\
+({									\
+   DO_INLINE_SYSCALL(futex, 3, (long) (ftx), FUTEX_WAKE, (int) (nr));	\
+   _r10 == -1 ? -_retval : _retval;					\
+})
+
+#define lll_futex_requeue(ftx, nr_wake, nr_move, mutex)			     \
+({									     \
+   DO_INLINE_SYSCALL(futex, 5, (long) (ftx), FUTEX_REQUEUE, (int) (nr_wake), \
+		     (int) (nr_move), (long) (mutex));			     \
+   _r10 == -1 ? -_retval : _retval;					     \
+})
 
 
 static inline int
Index: nptl/sysdeps/unix/sysv/linux/ia64/sysdep-cancel.h
--- nptl/sysdeps/unix/sysv/linux/ia64/sysdep-cancel.h
+++ nptl/sysdeps/unix/sysv/linux/ia64/sysdep-cancel.h
@@ -29,13 +29,21 @@
 # define PSEUDO(name, syscall_name, args)				      \
 .text;									      \
 ENTRY (name)								      \
-     adds r14 = MULTIPLE_THREADS_OFFSET, r13;;				      \
+     .prologue;								      \
+     adds r2 = SYSINFO_OFFSET, r13;					      \
+     adds r14 = MULTIPLE_THREADS_OFFSET, r13;				      \
+     .save ar.pfs, r11;							      \
+     mov r11 = ar.pfs;;							      \
+     .body;								      \
      ld4 r14 = [r14];							      \
+     ld8 r2 = [r2];							      \
      mov r15 = SYS_ify(syscall_name);;					      \
      cmp4.ne p6, p7 = 0, r14;						      \
-(p6) br.cond.spnt .Lpseudo_cancel;;					      \
-     break __BREAK_SYSCALL;;						      \
-     cmp.eq p6,p0=-1,r10;						      \
+     mov b7 = r2;							      \
+(p6) br.cond.spnt .Lpseudo_cancel;					      \
+     br.call.sptk.many b6 = b7;;					      \
+     mov ar.pfs = r11;							      \
+     cmp.eq p6,p0 = -1, r10;						      \
 (p6) br.cond.spnt.few __syscall_error;					      \
      ret;;								      \
      .endp name;							      \
@@ -45,17 +53,20 @@
 __GC_##name:								      \
 .Lpseudo_cancel:							      \
      .prologue;								      \
-     .regstk args, 5, args, 0;						      \
+     .regstk args, 6, args, 0;						      \
      .save ar.pfs, loc0;						      \
-     alloc loc0 = ar.pfs, args, 5, args, 0;				      \
+     alloc loc0 = ar.pfs, args, 6, args, 0;				      \
+     adds loc5 = SYSINFO_OFFSET, r13;					      \
      .save rp, loc1;							      \
      mov loc1 = rp;;							      \
      .body;								      \
+     ld8 loc5 = [loc5];							      \
      CENABLE;;								      \
      mov loc2 = r8;							      \
+     mov b7 = loc5;							      \
      COPY_ARGS_##args							      \
      mov r15 = SYS_ify(syscall_name);					      \
-     break __BREAK_SYSCALL;;						      \
+     br.call.sptk.many b6 = b7;;					      \
      mov loc3 = r8;							      \
      mov loc4 = r10;							      \
      mov out0 = loc2;							      \
Index: sysdeps/ia64/elf/start.S
Index: sysdeps/unix/sysv/linux/ia64/brk.S
--- sysdeps/unix/sysv/linux/ia64/brk.S
+++ sysdeps/unix/sysv/linux/ia64/brk.S
@@ -35,19 +35,17 @@
 weak_alias (__curbrk, ___brk_addr)
 
 LEAF(__brk)
-	mov	r15=__NR_brk
-	break.i	__BREAK_SYSCALL
+	.regstk 1, 0, 0, 0
+	DO_CALL(__NR_brk)
+	cmp.ltu	p6, p0 = ret0, in0
+	addl r9 = @ltoff(__curbrk), gp
 	;;
-	cmp.ltu	p6,p0=ret0,r32	/* r32 is the input register, even though we
-				   haven't allocated a frame */
-	addl	r9=@ltoff(__curbrk),gp
-	;;
-	ld8	r9=[r9]
-(p6) 	mov	ret0=ENOMEM
+	ld8 r9 = [r9]
+(p6) 	mov ret0 = ENOMEM
 (p6)	br.cond.spnt.few __syscall_error
 	;;
-	st8	[r9]=ret0
-	mov 	ret0=0
+	st8 [r9] = ret0
+	mov ret0 = 0
 	ret
 END(__brk)
 
Index: sysdeps/unix/sysv/linux/ia64/getcontext.S
--- sysdeps/unix/sysv/linux/ia64/getcontext.S
+++ sysdeps/unix/sysv/linux/ia64/getcontext.S
@@ -35,26 +35,27 @@
 
 ENTRY(__getcontext)
 	.prologue
-	alloc r16 = ar.pfs, 1, 0, 4, 0
+	.body
+	alloc r11 = ar.pfs, 1, 0, 4, 0
 
 	// sigprocmask (SIG_BLOCK, NULL, &sc->sc_mask):
 
-	mov r2 = SC_MASK
-	mov r15 = __NR_rt_sigprocmask
-	;;
+	mov r3 = SC_MASK
 	mov out0 = SIG_BLOCK
-	mov out1 = 0
-	add out2 = r2, in0
-	mov out3 = 8	// sizeof kernel sigset_t
 
-	break __BREAK_SYSCALL
 	flushrs					// save dirty partition on rbs
+	mov out1 = 0
+	add out2 = r3, in0
+
+	mov out3 = 8	// sizeof kernel sigset_t
+	DO_CALL(__NR_rt_sigprocmask)
 
 	mov.m rFPSR = ar.fpsr
 	mov.m rRSC = ar.rsc
 	add r2 = SC_GR+1*8, r32
 	;;
 	mov.m rBSP = ar.bsp
+	.prologue
 	.save ar.unat, rUNAT
 	mov.m rUNAT = ar.unat
 	.body
@@ -63,7 +64,7 @@
 
 .mem.offset 0,0; st8.spill [r2] = r1, (5*8 - 1*8)
 .mem.offset 8,0; st8.spill [r3] = r4, 16
-	mov.i rPFS = ar.pfs
+	mov rPFS = r11
 	;;
 .mem.offset 0,0; st8.spill [r2] = r5, 16
 .mem.offset 8,0; st8.spill [r3] = r6, 48
Index: sysdeps/unix/sysv/linux/ia64/setcontext.S
--- sysdeps/unix/sysv/linux/ia64/setcontext.S
+++ sysdeps/unix/sysv/linux/ia64/setcontext.S
@@ -32,20 +32,21 @@
   other than the PRESERVED state.  */
 
 ENTRY(__setcontext)
-	alloc r16 = ar.pfs, 1, 0, 4, 0
+	.prologue
+	.body
+	alloc r11 = ar.pfs, 1, 0, 4, 0
 
 	// sigprocmask (SIG_SETMASK, &sc->sc_mask, NULL):
 
-	mov r2 = SC_MASK
-	mov r15 = __NR_rt_sigprocmask
-	;;
+	mov r3 = SC_MASK
 	mov out0 = SIG_SETMASK
-	add out1 = r2, in0
+	;;
+	add out1 = r3, in0
 	mov out2 = 0
 	mov out3 = 8	// sizeof kernel sigset_t
 
 	invala
-	break __BREAK_SYSCALL
+	DO_CALL(__NR_rt_sigprocmask)
 	add r2 = SC_NAT, r32
 
 	add r3 = SC_RNAT, r32			// r3 <- &sc_ar_rnat
Index: sysdeps/unix/sysv/linux/ia64/sysdep-inline.h
--- /dev/null
+++ sysdeps/unix/sysv/linux/ia64/sysdep-inline.h
@@ -0,0 +1,155 @@
+#ifndef _LINUX_IA64_SYSDEP_INLINE_H
+#define _LINUX_IA64_SYSDEP_INLINE_H 1
+
+/* On IA-64 we have stacked registers for passing arguments.  The
+   "out" registers end up being the called function's "in"
+   registers.
+
+   Also, since we have plenty of registers we have two return values
+   from a syscall.  r10 is set to -1 on error, whilst r8 contains the
+   (non-negative) errno on error or the return value on success.
+ */
+
+#include <dl-sysdep.h>
+
+#if defined USE_DL_SYSINFO \
+	&& (!defined NOT_IN_libc || defined IS_IN_libpthread)
+# define IA64_USE_NEW_STUB
+#else
+# undef IA64_USE_NEW_STUB
+#endif
+
+#ifndef __ASSEMBLER__
+
+#include <asm/unistd.h>
+
+#define BREAK_INSN_1(num)	"break " #num ";;\n\t"
+#define BREAK_INSN(num)		BREAK_INSN_1(num)
+
+register void *__thread_self __asm__("r13");
+
+#ifdef IA64_USE_NEW_STUB
+
+struct _ia64_tcb {
+  void *dtv;
+  void *private;
+};
+
+#define DO_INLINE_SYSCALL(name, nr, args...)				\
+    register long _r8 __asm ("r8");					\
+    register long _r10 __asm ("r10");					\
+    register long _r15 __asm ("r15") = __NR_##name;			\
+    register void *_b7 __asm ("b7") =					\
+	((struct _ia64_tcb *) __thread_self)->private;			\
+    long _retval;							\
+    LOAD_ARGS_##nr (args);						\
+    /*									\
+     * Don't specify any unwind info here.  We mark ar.pfs as		\
+     * clobbered.  This will force the compiler to save ar.pfs		\
+     * somewhere and emit appropriate unwind info for that save.	\
+     */									\
+    __asm __volatile ("br.call.sptk.many b6=%0;;\n"			\
+		      : "=b"(_b7), "=r" (_r8), "=r" (_r10), "=r" (_r15)	\
+			ASM_OUTARGS_##nr				\
+		      : "0" (_b7), "3" (_r15) ASM_ARGS_##nr		\
+		      : "memory", "ar.pfs" ASM_CLOBBERS_##nr);		\
+    _retval = _r8;
+
+#else /* !IA64_USE_NEW_STUB */
+
+#define DO_INLINE_SYSCALL(name, nr, args...)			\
+    register long _r8 asm ("r8");				\
+    register long _r10 asm ("r10");				\
+    register long _r15 asm ("r15") = __NR_##name;		\
+    long _retval;						\
+    LOAD_ARGS_##nr (args);					\
+    __asm __volatile (BREAK_INSN (__BREAK_SYSCALL)		\
+		      : "=r" (_r8), "=r" (_r10), "=r" (_r15)	\
+			ASM_OUTARGS_##nr			\
+		      : "2" (_r15) ASM_ARGS_##nr		\
+		      : "memory" ASM_CLOBBERS_##nr);		\
+    _retval = _r8;
+
+#endif /* !IA64_USE_NEW_STUB */
+
+#undef INTERNAL_SYSCALL_DECL
+#define INTERNAL_SYSCALL_DECL(err) long int err
+
+#undef INTERNAL_SYSCALL_ERROR_P
+#define INTERNAL_SYSCALL_ERROR_P(val, err)	(err == -1)
+
+#undef INTERNAL_SYSCALL_ERRNO
+#define INTERNAL_SYSCALL_ERRNO(val, err)	(val)
+
+#define LOAD_ARGS_0()   do { } while (0)
+#define LOAD_ARGS_1(out0)				\
+  register long _out0 asm ("out0") = (long) (out0);	\
+  LOAD_ARGS_0 ()
+#define LOAD_ARGS_2(out0, out1)				\
+  register long _out1 asm ("out1") = (long) (out1);	\
+  LOAD_ARGS_1 (out0)
+#define LOAD_ARGS_3(out0, out1, out2)			\
+  register long _out2 asm ("out2") = (long) (out2);	\
+  LOAD_ARGS_2 (out0, out1)
+#define LOAD_ARGS_4(out0, out1, out2, out3)		\
+  register long _out3 asm ("out3") = (long) (out3);	\
+  LOAD_ARGS_3 (out0, out1, out2)
+#define LOAD_ARGS_5(out0, out1, out2, out3, out4)	\
+  register long _out4 asm ("out4") = (long) (out4);	\
+  LOAD_ARGS_4 (out0, out1, out2, out3)
+#define LOAD_ARGS_6(out0, out1, out2, out3, out4, out5)	\
+  register long _out5 asm ("out5") = (long) (out5);	\
+  LOAD_ARGS_5 (out0, out1, out2, out3, out4)
+
+#define ASM_OUTARGS_0
+#define ASM_OUTARGS_1	ASM_OUTARGS_0, "=r" (_out0)
+#define ASM_OUTARGS_2	ASM_OUTARGS_1, "=r" (_out1)
+#define ASM_OUTARGS_3	ASM_OUTARGS_2, "=r" (_out2)
+#define ASM_OUTARGS_4	ASM_OUTARGS_3, "=r" (_out3)
+#define ASM_OUTARGS_5	ASM_OUTARGS_4, "=r" (_out4)
+#define ASM_OUTARGS_6	ASM_OUTARGS_5, "=r" (_out5)
+
+#ifdef IA64_USE_NEW_STUB
+# define ASM_ARGS_0
+# define ASM_ARGS_1	ASM_ARGS_0, "4" (_out0)
+# define ASM_ARGS_2	ASM_ARGS_1, "5" (_out1)
+# define ASM_ARGS_3	ASM_ARGS_2, "6" (_out2)
+# define ASM_ARGS_4	ASM_ARGS_3, "7" (_out3)
+# define ASM_ARGS_5	ASM_ARGS_4, "8" (_out4)
+# define ASM_ARGS_6	ASM_ARGS_5, "9" (_out5)
+#else
+# define ASM_ARGS_0
+# define ASM_ARGS_1	ASM_ARGS_0, "3" (_out0)
+# define ASM_ARGS_2	ASM_ARGS_1, "4" (_out1)
+# define ASM_ARGS_3	ASM_ARGS_2, "5" (_out2)
+# define ASM_ARGS_4	ASM_ARGS_3, "6" (_out3)
+# define ASM_ARGS_5	ASM_ARGS_4, "7" (_out4)
+# define ASM_ARGS_6	ASM_ARGS_5, "8" (_out5)
+#endif
+
+#define ASM_CLOBBERS_0	ASM_CLOBBERS_1, "out0"
+#define ASM_CLOBBERS_1	ASM_CLOBBERS_2, "out1"
+#define ASM_CLOBBERS_2	ASM_CLOBBERS_3, "out2"
+#define ASM_CLOBBERS_3	ASM_CLOBBERS_4, "out3"
+#define ASM_CLOBBERS_4	ASM_CLOBBERS_5, "out4"
+#define ASM_CLOBBERS_5	ASM_CLOBBERS_6, "out5"
+#define ASM_CLOBBERS_6_COMMON	, "out6", "out7",			\
+  /* Non-stacked integer registers, minus r8, r10, r15.  */		\
+  "r2", "r3", "r9", "r11", "r12", "r13", "r14", "r16", "r17", "r18",	\
+  "r19", "r20", "r21", "r22", "r23", "r24", "r25", "r26", "r27",	\
+  "r28", "r29", "r30", "r31",						\
+  /* Predicate registers.  */						\
+  "p6", "p7", "p8", "p9", "p10", "p11", "p12", "p13", "p14", "p15",	\
+  /* Non-rotating fp registers.  */					\
+  "f6", "f7", "f8", "f9", "f10", "f11", "f12", "f13", "f14", "f15",	\
+  /* Branch registers.  */						\
+  "b6"
+
+#ifdef IA64_USE_NEW_STUB
+# define ASM_CLOBBERS_6	ASM_CLOBBERS_6_COMMON
+#else
+# define ASM_CLOBBERS_6	ASM_CLOBBERS_6_COMMON , "b7"
+#endif
+
+#endif /* __ASSEMBLER__ */
+#endif /* _LINUX_IA64_SYSDEP_INLINE_H */
Index: sysdeps/unix/sysv/linux/ia64/vfork.S
--- sysdeps/unix/sysv/linux/ia64/vfork.S
+++ sysdeps/unix/sysv/linux/ia64/vfork.S
@@ -34,9 +34,8 @@
 	mov out0=CLONE_VM+CLONE_VFORK+SIGCHLD
 	mov out1=0		/* Standard sp value.			*/
 	;;
-	DO_CALL (SYS_ify (clone))
+	DO_CALL_VIA_BREAK (SYS_ify (clone))
 	cmp.eq p6,p0=-1,r10
-	;;
 (p6)	br.cond.spnt.few __syscall_error
 	ret
 PSEUDO_END(__vfork)

* Re: new syscall stub support for ia64 libc
  2003-11-12 22:53             ` David Mosberger
@ 2003-11-12 23:10               ` Ulrich Drepper
  2003-11-12 23:47                 ` David Mosberger
  2003-11-13  7:32               ` David Mosberger
  1 sibling, 1 reply; 98+ messages in thread
From: Ulrich Drepper @ 2003-11-12 23:10 UTC (permalink / raw)
  To: davidm; +Cc: Jakub Jelinek, libc-hacker

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

David Mosberger wrote:

>  - The inline-syscall macros are now in a separate file,

I don't like this one bit.  This makes ia64 sources different from the
rest which is not necessary.

The rest looks OK as far as I can see from inspection.

- -- 
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/srwr2ijCOnn/RHQRAht8AJ98hRAqf+PI0Vop6cEA41IvmrFrMwCfc9Ud
4xCWEpwRFwsGzAj5+CuxjKk=
=Ysm0
-----END PGP SIGNATURE-----

* Re: new syscall stub support for ia64 libc
  2003-11-12 23:10               ` Ulrich Drepper
@ 2003-11-12 23:47                 ` David Mosberger
  2003-11-12 23:57                   ` Jakub Jelinek
  0 siblings, 1 reply; 98+ messages in thread
From: David Mosberger @ 2003-11-12 23:47 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: davidm, Jakub Jelinek, libc-hacker

>>>>> On Wed, 12 Nov 2003 15:03:07 -0800, Ulrich Drepper <drepper@redhat.com> said:

  Uli> David Mosberger wrote:

  >> - The inline-syscall macros are now in a separate file,

  Uli> I don't like this one bit.  This makes ia64 sources different from the
  Uli> rest which is not necessary.

So what do you suggest?  I don't think lowlevellock.h can include sysdep.h:

 - sysdep.h includes tls.h
 - tls.h includes descr.h
 - descr.h includes lowlevellock.h

	--david

* Re: new syscall stub support for ia64 libc
  2003-11-12 23:47                 ` David Mosberger
@ 2003-11-12 23:57                   ` Jakub Jelinek
  2003-11-13  2:38                     ` David Mosberger
  0 siblings, 1 reply; 98+ messages in thread
From: Jakub Jelinek @ 2003-11-12 23:57 UTC (permalink / raw)
  To: davidm; +Cc: Ulrich Drepper, libc-hacker

On Wed, Nov 12, 2003 at 03:47:37PM -0800, David Mosberger wrote:
> >>>>> On Wed, 12 Nov 2003 15:03:07 -0800, Ulrich Drepper <drepper@redhat.com> said:
> 
>   Uli> David Mosberger wrote:
> 
>   >> - The inline-syscall macros are now in a separate file,
> 
>   Uli> I don't like this one bit.  This makes ia64 sources different from the
>   Uli> rest which is not necessary.
> 
> So what do you suggest?  I don't think lowlevellock.h can include sysdep.h:
> 
>  - sysdep.h includes tls.h
>  - tls.h includes descr.h

2x yes

>  - descr.h includes lowlevellock.h

No.  lowlevellock.h doesn't need it, see what e.g. SPARC or PPC are doing
in lowlevellock.h.  Just a few inlines have to be replaced with macros,
that's all.

	Jakub

* Re: new syscall stub support for ia64 libc
  2003-11-12 23:57                   ` Jakub Jelinek
@ 2003-11-13  2:38                     ` David Mosberger
  2003-11-13  3:46                       ` Ulrich Drepper
  2003-11-13  8:23                       ` Jakub Jelinek
  0 siblings, 2 replies; 98+ messages in thread
From: David Mosberger @ 2003-11-13  2:38 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: davidm, Ulrich Drepper, libc-hacker

>>>>> On Wed, 12 Nov 2003 22:50:46 +0100, Jakub Jelinek <jakub@redhat.com> said:

  >> - sysdep.h includes tls.h
  >> - tls.h includes descr.h

  Jakub> 2x yes

  >> - descr.h includes lowlevellock.h

  Jakub> No.  lowlevellock.h doesn't need it, see what e.g. SPARC or
  Jakub> PPC are doing in lowlevellock.h.  Just a few inlines have to
  Jakub> be replaced with macros, that's all.

I see what you mean.  It's a bit icky to depend on the other
header-files including the pre-requisites of lowlevellock.h, but yes,
it makes the problem _much_ simpler to handle, so thanks for the tip.

Next stupid question: what's the purpose of librt?  I'm asking since
sysdep.h now says:

#if defined USE_DL_SYSINFO \
	&& (!defined NOT_IN_libc || defined IS_IN_libpthread)
# define IA64_USE_NEW_STUB
#else
# undef IA64_USE_NEW_STUB
#endif

and in <sysdep-cancel.h>, it says:

#if !defined NOT_IN_libc || defined IS_IN_libpthread || defined IS_IN_librt

# undef PSEUDO
# define PSEUDO(name, syscall_name, args)				      \
     :
   <code that uses the IA64_USE_NEW_STUB convention>
     :

That is, anything in librt that uses the PSEUDO() macro from
sysdep-cancel.h will try to use the new stub, but nothing else will.
Will this inconsistency cause problems?

	--david

* Re: new syscall stub support for ia64 libc
  2003-11-13  2:38                     ` David Mosberger
@ 2003-11-13  3:46                       ` Ulrich Drepper
  2003-11-13  3:53                         ` David Mosberger
  2003-11-13  8:23                       ` Jakub Jelinek
  1 sibling, 1 reply; 98+ messages in thread
From: Ulrich Drepper @ 2003-11-13  3:46 UTC (permalink / raw)
  To: davidm; +Cc: Jakub Jelinek, libc-hacker

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

David Mosberger wrote:

> Next stupid question: what's the purpose of librt?  I'm asking since
> sysdep.h now says:

It's just another POSIX-defined library.  It needs some special
treatment since it also can contain system calls.

> 
> #if defined USE_DL_SYSINFO \
> 	&& (!defined NOT_IN_libc || defined IS_IN_libpthread)
> # define IA64_USE_NEW_STUB
> #else
> # undef IA64_USE_NEW_STUB
> #endif
> 
> and in <sysdep-cancel.h>, it says:
> 
> #if !defined NOT_IN_libc || defined IS_IN_libpthread || defined IS_IN_librt
> 
> # undef PSEUDO
> # define PSEUDO(name, syscall_name, args)				      \
>      :
>    <code that uses the IA64_USE_NEW_STUB convention>
>      :
> 
> That is, anything in librt that uses the PSEUDO() macro from
> sysdep-cancel.h will try to use the new stub, but nothing else will.

I don't understand.  The code you quoted shows that the new stub is used
for libc and libpthread so far.  The cancellation handling is in
addition also used for librt.  The two are independent, but you should
add || defined IS_IN_librt to the code in the ia64 header.

- -- 
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/svy+2ijCOnn/RHQRAiMXAJ9qf8m0eBQ0rBvY4alPhswD/whwvQCfbPjI
tSm11v+b8h3eX5T4tYh0cAw=
=K/5U
-----END PGP SIGNATURE-----

* Re: new syscall stub support for ia64 libc
  2003-11-13  3:46                       ` Ulrich Drepper
@ 2003-11-13  3:53                         ` David Mosberger
  0 siblings, 0 replies; 98+ messages in thread
From: David Mosberger @ 2003-11-13  3:53 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: davidm, Jakub Jelinek, libc-hacker

>>>>> On Wed, 12 Nov 2003 19:38:38 -0800, Ulrich Drepper <drepper@redhat.com> said:

  >> Next stupid question: what's the purpose of librt?  I'm asking since
  >> sysdep.h now says:

  Uli> It's just another POSIX-defined library.  It needs some special
  Uli> treatment since it also can contain system calls.

Ah, thanks.

  >> #if defined USE_DL_SYSINFO \
  >> && (!defined NOT_IN_libc || defined IS_IN_libpthread)
  >> # define IA64_USE_NEW_STUB
  >> #else
  >> # undef IA64_USE_NEW_STUB
  >> #endif

  >> and in <sysdep-cancel.h>, it says:

  >> #if !defined NOT_IN_libc || defined IS_IN_libpthread || defined IS_IN_librt

  >> # undef PSEUDO
  >> # define PSEUDO(name, syscall_name, args)				      \
  >> :
  >> <code that uses the IA64_USE_NEW_STUB convention>
  >> :

  >> That is, anything in librt that uses the PSEUDO() macro from
  >> sysdep-cancel.h will try to use the new stub, but nothing else will.

  Uli> I don't understand.  The code you quoted shows that the new stub is used
  Uli> for libc and libpthread so far.

My example wasn't very clear: the new definition of PSEUDO() open-codes
the new-stub convention, so that I can get a better instruction
schedule.

  Uli> The cancellation handling is in addition also used for librt.
  Uli> The two are independent, but you should add || defined
  Uli> IS_IN_librt to the code in the ia64 header.

OK, if I can turn on IA64_USE_NEW_STUB for IS_IN_librt in sysdep.h
then all is fine.
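
A minimal sketch of what the adjusted sysdep.h condition might look
like once IS_IN_librt is added, as suggested above.  The three defines
at the top only simulate a librt compilation unit with sysinfo support
and are assumptions; uses_new_stub() is a hypothetical helper that
exists purely to make the outcome observable.

```c
/* Simulate compiling a file that belongs to librt with sysinfo support.
   These defines are assumptions standing in for what the build would
   normally provide.  */
#define USE_DL_SYSINFO 1
#define NOT_IN_libc 1
#define IS_IN_librt 1

/* The condition from sysdep.h with the librt case added.  */
#if defined USE_DL_SYSINFO \
	&& (!defined NOT_IN_libc || defined IS_IN_libpthread \
	    || defined IS_IN_librt)
# define IA64_USE_NEW_STUB
#else
# undef IA64_USE_NEW_STUB
#endif

/* Hypothetical helper, not part of the patch: reports which stub
   convention this compilation unit selected.  */
static int
uses_new_stub (void)
{
#ifdef IA64_USE_NEW_STUB
  return 1;
#else
  return 0;
#endif
}
```

With this form the sysdep.h condition covers the same three libraries
as the one in sysdep-cancel.h, so PSEUDO() and the rest of librt would
agree on the convention.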

Thanks,

	--david

* Re: new syscall stub support for ia64 libc
  2003-11-12 22:53             ` David Mosberger
  2003-11-12 23:10               ` Ulrich Drepper
@ 2003-11-13  7:32               ` David Mosberger
  2003-11-13  9:24                 ` Ulrich Drepper
  1 sibling, 1 reply; 98+ messages in thread
From: David Mosberger @ 2003-11-13  7:32 UTC (permalink / raw)
  To: drepper, Jakub Jelinek, libc-hacker

>>>>> On Wed, 12 Nov 2003 14:53:32 -0800, David Mosberger <davidm@linux.hpl.hp.com> said:


  David> - with nptl, make check shows failures in tst-cancel{6,9,17};
  David> I'll need to look into these as it's likely that I botched
  David> something in sysdep-cancel.h.

OK, I believe tst-cancel6 may be failing because cancelling the
victim thread fails to release the lock on the file that it's trying
to read from.  Is this test known to work for ia64 at all?  Is there a
good way to trace what's going on during cancellation?

	--david

* Re: new syscall stub support for ia64 libc
  2003-11-13  2:38                     ` David Mosberger
  2003-11-13  3:46                       ` Ulrich Drepper
@ 2003-11-13  8:23                       ` Jakub Jelinek
  1 sibling, 0 replies; 98+ messages in thread
From: Jakub Jelinek @ 2003-11-13  8:23 UTC (permalink / raw)
  To: davidm; +Cc: Ulrich Drepper, libc-hacker

On Wed, Nov 12, 2003 at 06:38:10PM -0800, David Mosberger wrote:
> I see what you mean.  It's a bit icky to depend on the other
> header-files including the pre-requisites of lowlevellock.h, but yes,
> it makes the problem _much_ simpler to handle, so thanks for the tip.

When using the syscalls in macros only and not inlines, it doesn't
depend on the order of pre-requisites of lowlevellock.h, just
on sysdep.h being included in all the files which use lll_futex_ macros
(only a handful of sources in NPTL sysdeps/ which all include sysdep.h).
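
A minimal sketch of the macro-versus-inline point, not the actual glibc
headers.  All names (demo_futex_wake, DEMO_SYSCALL, demo_syscall_impl,
wake_one) are hypothetical, and the "syscall" is a fake that just
returns a sum so the call can be observed.

```c
/* --- stands in for lowlevellock.h, which is seen first ------------- */

/* A macro body is only expanded at the point of use, so it may name
   DEMO_SYSCALL before any header has defined it.  */
#define demo_futex_wake(addr, nr) \
  DEMO_SYSCALL (1230, (long) (addr), 1L, (long) (nr))

/* An inline function placed here would have its body parsed on the
   spot, before DEMO_SYSCALL exists:
     static inline long demo_wake (int *a, int n)
     { return DEMO_SYSCALL (1230, (long) a, 1L, (long) n); }  */

/* --- stands in for sysdep.h, which is seen later ------------------- */

static long
demo_syscall_impl (long nr, long a0, long a1, long a2)
{
  (void) a0;
  return nr + a1 + a2;	/* Fake result so the call is observable.  */
}
#define DEMO_SYSCALL(nr, a0, a1, a2) demo_syscall_impl (nr, a0, a1, a2)

/* --- a source file that has seen both headers ---------------------- */

static long
wake_one (int *futex)
{
  return demo_futex_wake (futex, 1);
}
```

The only requirement is that DEMO_SYSCALL is defined by the time
wake_one() is compiled, which mirrors the requirement that sysdep.h is
included in every file that uses the lll_futex_ macros.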

	Jakub

* Re: new syscall stub support for ia64 libc
  2003-11-13  7:32               ` David Mosberger
@ 2003-11-13  9:24                 ` Ulrich Drepper
  2003-11-13 17:30                   ` David Mosberger
  0 siblings, 1 reply; 98+ messages in thread
From: Ulrich Drepper @ 2003-11-13  9:24 UTC (permalink / raw)
  To: davidm; +Cc: Jakub Jelinek, libc-hacker

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

David Mosberger wrote:

> OK, I believe tst-cancel6 may be failing because cancelling the
> victim thread fails to release the lock on the file that it's trying
> to read from.  Is this test known to work for ia64 at all?

All tests work on ia64.


> Is there a good way to trace what's going on during cancellation?

There isn't any although rth was going to look into adding something
like this.

- -- 
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/s0we2ijCOnn/RHQRAuoJAKDHh7yzisV5zkN1bhROad3E+2e7vACdFPuJ
8MAd9DyxUYFjlVkmhfO9Vwo=
=3lFH
-----END PGP SIGNATURE-----

* Re: new syscall stub support for ia64 libc
  2003-11-13  9:24                 ` Ulrich Drepper
@ 2003-11-13 17:30                   ` David Mosberger
  2003-11-13 17:56                     ` Ulrich Drepper
  0 siblings, 1 reply; 98+ messages in thread
From: David Mosberger @ 2003-11-13 17:30 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: davidm, Jakub Jelinek, libc-hacker

>>>>> On Thu, 13 Nov 2003 01:17:18 -0800, Ulrich Drepper <drepper@redhat.com> said:

  >> OK, I believe tst-cancel6 may be failing because cancelling the
  >> victim thread fails to release the lock on the file that it's trying
  >> to read from.  Is this test known to work for ia64 at all?

  Uli> All tests work on ia64.

OK.  Which compiler?  I'm using gcc-3.3.2 (provided by Debian).

  >> Is there a good way to trace what's going on during cancellation?

  Uli> There isn't any although rth was going to look into adding something
  Uli> like this.

I can build libunwind with debugging enabled, but I don't think it's
gonna do me any good, since all fprintf calls will get cancelled, once
cancellation has begun.  Would it perhaps be possible for glibc to
provide a non-cancellable version of fprintf?

Where can I find the definitive list of cancellable routines?  I'm a
bit worried about libunwind.  It may call mmap (for memory allocation)
and pthread_mutex_{lock,unlock} (when global caching is in effect).

	--david

* Re: new syscall stub support for ia64 libc
  2003-11-13 17:30                   ` David Mosberger
@ 2003-11-13 17:56                     ` Ulrich Drepper
  2003-11-13 18:47                       ` David Mosberger
  2003-11-13 21:34                       ` David Mosberger
  0 siblings, 2 replies; 98+ messages in thread
From: Ulrich Drepper @ 2003-11-13 17:56 UTC (permalink / raw)
  To: davidm; +Cc: Jakub Jelinek, libc-hacker

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

David Mosberger wrote:

> OK.  Which compiler?  I'm using gcc-3.3.2 (provided by Debian).

What do you guess?  Of course one of our compilers.  The latest official
ia64 compiler is the one from RHEL3.  I'm sure I've also tested at some
point the compiler which went finally into FC1.


>  Would it perhaps be possible for glibc to
> provide a non-cancellable version of fprintf?

Definitely not.  How can you expect fprintf to work for low-level
debugging like that of cancellation handling?

There is one way, but it's not official and might just change underneath
you.  You can add 'c' to the mode string of fopen() to request no
cancellation.  But this requires opening a new stream.
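
A minimal sketch of that approach, assuming a glibc that accepts the
'c' mode flag; other C libraries may ignore or reject it.  The helper
names open_trace_stream and trace_msg are hypothetical, as is the idea
of appending to a caller-chosen path.

```c
#include <stdio.h>

/* Open a private trace stream.  The 'c' in the mode string is the
   glibc-specific flag described above; it is an assumption that the
   C library in use honours it.  */
static FILE *
open_trace_stream (const char *path)
{
  FILE *fp = fopen (path, "ac");
  if (fp != NULL)
    /* Line-buffer so each message reaches the file promptly.  */
    setvbuf (fp, NULL, _IOLBF, 0);
  return fp;
}

/* Write one trace line; returns 0 on success, -1 on failure.  */
static int
trace_msg (FILE *fp, const char *msg)
{
  if (fp == NULL)
    return -1;
  return fprintf (fp, "trace: %s\n", msg) < 0 ? -1 : 0;
}
```

A debugging build could call open_trace_stream() once at start-up and
route its messages through trace_msg() instead of writing to stderr.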


> Where can I find the definitive list of cancellable routines?

POSIX standard.

- -- 
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/s8P52ijCOnn/RHQRAvdwAJ4i/o9pYAiK4PJge0hIKIsW/x7XpgCcCqjJ
zk3WYpsX9F3CKc8RgxgM2rM=
=aWtA
-----END PGP SIGNATURE-----

* Re: new syscall stub support for ia64 libc
  2003-11-13 17:56                     ` Ulrich Drepper
@ 2003-11-13 18:47                       ` David Mosberger
  2003-11-13 20:16                         ` Ulrich Drepper
  2003-11-13 21:34                       ` David Mosberger
  1 sibling, 1 reply; 98+ messages in thread
From: David Mosberger @ 2003-11-13 18:47 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: davidm, Jakub Jelinek, libc-hacker

>>>>> On Thu, 13 Nov 2003 09:48:41 -0800, Ulrich Drepper <drepper@redhat.com> said:

  >> Would it perhaps be possible for glibc to provide a
  >> non-cancellable version of fprintf?

  Uli> Definitely not.  How can you expect fprintf to work for
  Uli> low-level debugging like that of cancellation handling?

Didn't you answer your own question in the paragraph below?

  Uli> There is one way, but it's not official and might just change
  Uli> underneath you.  You can add 'c' to the mode string of fopen()
  Uli> to request no cancellation.  But this requires opening a new
  Uli> stream.

OK, I'll look into adding that to libunwind.

  >> Where can I find the definitive list of cancellable routines?

  Uli> POSIX standard.

Thanks.  In case there are others on this list who are not intimately
familiar with the POSIX standard, here is the relevant URL:

 http://www.opengroup.org/onlinepubs/007904975/functions/xsh_chap02_09.html

	--david

* Re: new syscall stub support for ia64 libc
  2003-11-13 18:47                       ` David Mosberger
@ 2003-11-13 20:16                         ` Ulrich Drepper
  0 siblings, 0 replies; 98+ messages in thread
From: Ulrich Drepper @ 2003-11-13 20:16 UTC (permalink / raw)
  To: davidm; +Cc: Jakub Jelinek, libc-hacker

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

David Mosberger wrote:

> Didn't you answer your own question in the paragraph below?

No.  The non-cancelable interfaces are used internally to provide
interfaces which have to behave like using streams and which either
mustn't be canceled or the behavior is undefined.  In any case, they are
all part of the implementation.  Everything else should use the official
interfaces which are cancelable.

- -- 
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/s+QN2ijCOnn/RHQRAhkYAKCUfQ+5SG198yPqulEadp5/hFtpFQCfcDK2
Cs1wYWwuBSlZgmSLS2Ocf0s=
=M+m5
-----END PGP SIGNATURE-----

* Re: new syscall stub support for ia64 libc
  2003-11-13 17:56                     ` Ulrich Drepper
  2003-11-13 18:47                       ` David Mosberger
@ 2003-11-13 21:34                       ` David Mosberger
  2003-11-13 21:44                         ` Jakub Jelinek
  1 sibling, 1 reply; 98+ messages in thread
From: David Mosberger @ 2003-11-13 21:34 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: davidm, Jakub Jelinek, libc-hacker

Attached is what I hope to be the final patch.  I think it's taking
care of all the feedback I received so far.  I'm still looking into
the tst-cancel6 failure, but I'm fairly confident that the issue is not
due to the patch, but due to my environment (probably a
compiler/libgcc_s issue).  I did try the unmodified CVS tree and
tst-cancel6 also fails in the same way, giving further credibility to
this explanation.

Thanks,

	--david

libc/ChangeLog

2003-11-12  David Mosberger  <davidm@hpl.hp.com>

	* elf/rtld.c (dl_main): Restrict dl_sysinfo_dso check to first
	program header.  On ia64, the check failed previously because
	there are two program headers.

	* sysdeps/unix/sysv/linux/ia64/brk.S (__curbrk): Restructure it
	to take advantage of DO_CALL() macro.
	* sysdeps/unix/sysv/linux/ia64/clone2.S: Ditto.
	* sysdeps/unix/sysv/linux/ia64/getcontext.S: Ditto.
	* sysdeps/unix/sysv/linux/ia64/setcontext.S: Ditto.

	* sysdeps/unix/sysv/linux/ia64/sysdep.h: Add include of
	<dl-sysdep.h> and <tls.h>.
	(IA64_USE_NEW_STUB): New macro.
	(DO_CALL_VIA_BREAK): Ditto.
	(DO_CALL): Add new variants for IA64_USE_NEW_STUB.
	(DO_INLINE_SYSCALL): New macro.
	(INLINE_SYSCALL): Define in terms of DO_INLINE_SYSCALL.
	(INTERNAL_SYSCALL): Ditto.
	(ASM_ARGS_0, ASM_ARGS_1, ASM_ARGS_2, ASM_ARGS_3, ASM_ARGS_4,
	ASM_ARGS_5, ASM_ARGS_6): Add new variant for IA64_USE_NEW_STUB.
	(ASM_CLOBBERS_6_COMMON): New macro.
	(ASM_CLOBBERS_6): Add new variant for IA64_USE_NEW_STUB.

	* sysdeps/unix/sysv/linux/ia64/vfork.S: Use DO_CALL_VIA_BREAK()
	instead of DO_CALL().

linuxthreads/ChangeLog

2003-11-12  David Mosberger  <davidm@hpl.hp.com>

	* sysdeps/unix/sysv/linux/ia64/dl-sysdep.h: New file.

nptl/ChangeLog

2003-11-12  David Mosberger  <davidm@hpl.hp.com>

	* allocatestack.c (allocate_stack): Use THREAD_SYSINFO and
	THREAD_SELF_SYSINFO instead of open code.

	* sysdeps/i386/tls.h (THREAD_SELF_SYSINFO): New macro.
	(THREAD_SYSINFO): Ditto.

	* sysdeps/ia64/tcb-offsets.sym: Add SYSINFO_OFFSET.

	* sysdeps/ia64/tls.h: Move declaration of __thread_self up so it
	comes before the include of <sysdep.h>.
	(THREAD_SELF_SYSINFO): New macro.
	(THREAD_SYSINFO): Ditto.
	(INIT_SYSINFO): New macro.
	(TLS_INIT_TP): Call INIT_SYSINFO.

	* sysdeps/pthread/createthread.c (create_thread): Use
	THREAD_SELF_SYSINFO and THREAD_SYSINFO instead of open code.

	* sysdeps/unix/sysv/linux/ia64/dl-sysdep.h: New file.

	* sysdeps/unix/sysv/linux/ia64/lowlevellock.h (__NR_futex): Rename
	from SYS_futex, to match expectations of
	sysdep.h:DO_INLINE_SYSCALL.
	(lll_futex_clobbers): Remove.
	(lll_futex_timed_wait): Rewrite in terms of DO_INLINE_SYSCALL.
	(lll_futex_wake): Ditto.
	(lll_futex_requeue): Ditto.
	(__lll_mutex_trylock): Rewrite to a macro, so we can include this
	file before DO_INLINE_SYSCALL is defined (proposed by Jakub
	Jelinek).
	(__lll_mutex_lock): Ditto.
	(__lll_mutex_cond_lock): Ditto.
	(__lll_mutex_timed_lock): Ditto.
	(__lll_mutex_unlock): Ditto.
	(__lll_mutex_unlock_force): Ditto.

	* sysdeps/unix/sysv/linux/ia64/sysdep-cancel.h (PSEUDO): Take
	advantage of new syscall stub and optimize accordingly.

Index: elf/rtld.c
--- elf/rtld.c
+++ elf/rtld.c
@@ -1169,7 +1169,7 @@
 		  l->l_ldnum = ph->p_memsz / sizeof (ElfW(Dyn));
 		  break;
 		}
-	      if (ph->p_type == PT_LOAD)
+	      if (i == 0 && ph->p_type == PT_LOAD)
 		assert ((void *) ph->p_vaddr == GL(dl_sysinfo_dso));
 	    }
 	  elf_get_dynamic_info (l, dyn_temp);
Index: linuxthreads/sysdeps/unix/sysv/linux/ia64/dl-sysdep.h
--- /dev/null
+++ linuxthreads/sysdeps/unix/sysv/linux/ia64/dl-sysdep.h
@@ -0,0 +1,45 @@
+/* System-specific settings for dynamic linker code.  IA-64 version.
+   Copyright (C) 2003 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+   02111-1307 USA.  */
+
+#ifndef _DL_SYSDEP_H
+#define _DL_SYSDEP_H	1
+
+#define NEED_DL_SYSINFO	1
+#undef USE_DL_SYSINFO
+
+#if defined NEED_DL_SYSINFO && !defined __ASSEMBLER__
+/* Don't declare this as a function---we want its entry-point, not
+   its function descriptor... */
+extern int _dl_sysinfo_break attribute_hidden;
+# define DL_SYSINFO_DEFAULT ((uintptr_t) &_dl_sysinfo_break)
+# define DL_SYSINFO_IMPLEMENTATION		\
+  asm (".text\n\t"				\
+       ".hidden _dl_sysinfo_break\n\t"		\
+       ".proc _dl_sysinfo_break\n\t"		\
+       "_dl_sysinfo_break:\n\t"			\
+       ".prologue\n\t"				\
+       ".altrp b6\n\t"				\
+       ".body\n\t"				\
+       "break 0x100000;\n\t"			\
+       "br.ret.sptk.many b6;\n\t"		\
+       ".endp _dl_sysinfo_break\n\t"		\
+       ".previous");
+#endif
+
+#endif	/* dl-sysdep.h */
Index: nptl/allocatestack.c
--- nptl/allocatestack.c
+++ nptl/allocatestack.c
@@ -352,7 +352,7 @@
 
 #ifdef NEED_DL_SYSINFO
       /* Copy the sysinfo value from the parent.  */
-      pd->header.sysinfo = THREAD_GETMEM (THREAD_SELF, header.sysinfo);
+      THREAD_SYSINFO(pd) = THREAD_SELF_SYSINFO;
 #endif
 
       /* The process ID is also the same as that of the caller.  */
@@ -488,7 +488,7 @@
 
 #ifdef NEED_DL_SYSINFO
 	  /* Copy the sysinfo value from the parent.  */
-	  pd->header.sysinfo = THREAD_GETMEM (THREAD_SELF, header.sysinfo);
+	  THREAD_SYSINFO(pd) = THREAD_SELF_SYSINFO;
 #endif
 
 	  /* The process ID is also the same as that of the caller.  */
Index: nptl/sysdeps/i386/tls.h
--- nptl/sysdeps/i386/tls.h
+++ nptl/sysdeps/i386/tls.h
@@ -128,6 +128,8 @@
 # define GET_DTV(descr) \
   (((tcbhead_t *) (descr))->dtv)
 
+#define THREAD_SELF_SYSINFO	THREAD_GETMEM (THREAD_SELF, header.sysinfo)
+#define THREAD_SYSINFO(pd)	((pd)->header.sysinfo)
 
 /* Macros to load from and store into segment registers.  */
 # ifndef TLS_GET_GS
Index: nptl/sysdeps/ia64/tcb-offsets.sym
--- nptl/sysdeps/ia64/tcb-offsets.sym
+++ nptl/sysdeps/ia64/tcb-offsets.sym
@@ -2,3 +2,4 @@
 #include <tls.h>
 
 MULTIPLE_THREADS_OFFSET offsetof (struct pthread, header.multiple_threads) - sizeof (struct pthread)
+SYSINFO_OFFSET		offsetof (tcbhead_t, private)
Index: nptl/sysdeps/ia64/tls.h
--- nptl/sysdeps/ia64/tls.h
+++ nptl/sysdeps/ia64/tls.h
@@ -42,6 +42,8 @@
   void *private;
 } tcbhead_t;
 
+register struct pthread *__thread_self __asm__("r13");
+
 # define TLS_MULTIPLE_THREADS_IN_TCB 1
 
 #else /* __ASSEMBLER__ */
@@ -64,8 +66,6 @@
 /* Get system call information.  */
 # include <sysdep.h>
 
-register struct pthread *__thread_self __asm__("r13");
-
 /* This is the size of the initial TCB.  */
 # define TLS_INIT_TCB_SIZE sizeof (tcbhead_t)
 
@@ -100,11 +100,20 @@
 #  define GET_DTV(descr) \
   (((tcbhead_t *) (descr))->dtv)
 
+#define THREAD_SELF_SYSINFO	(((tcbhead_t *) __thread_self)->private)
+#define THREAD_SYSINFO(pd)	(((tcbhead_t *) ((pd) + 1))->private)
+
+#if defined NEED_DL_SYSINFO
+# define INIT_SYSINFO   THREAD_SELF_SYSINFO = (void *) GL(dl_sysinfo)
+#else
+# define INIT_SYSINFO   NULL
+#endif
+
 /* Code to initially initialize the thread pointer.  This might need
    special attention since 'errno' is not yet available and if the
    operation can cause a failure 'errno' must not be touched.  */
 # define TLS_INIT_TP(thrdescr, secondcall) \
-  (__thread_self = (thrdescr), NULL)
+  (__thread_self = (thrdescr), INIT_SYSINFO, NULL)
 
 /* Return the address of the dtv for the current thread.  */
 #  define THREAD_DTV() \
Index: nptl/sysdeps/pthread/createthread.c
--- nptl/sysdeps/pthread/createthread.c
+++ nptl/sysdeps/pthread/createthread.c
@@ -226,7 +226,7 @@
     }
 
 #ifdef NEED_DL_SYSINFO
-  assert (THREAD_GETMEM (THREAD_SELF, header.sysinfo) == pd->header.sysinfo);
+  assert (THREAD_SELF_SYSINFO == THREAD_SYSINFO(pd));
 #endif
 
   /* Actually create the thread.  */
Index: nptl/sysdeps/unix/sysv/linux/ia64/dl-sysdep.h
--- /dev/null
+++ nptl/sysdeps/unix/sysv/linux/ia64/dl-sysdep.h
@@ -0,0 +1,64 @@
+/* System-specific settings for dynamic linker code.  IA-64 version.
+   Copyright (C) 2003 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+   02111-1307 USA.  */
+
+#ifndef _DL_SYSDEP_H
+#define _DL_SYSDEP_H	1
+
+/* This macro must be defined to either 0 or 1.
+
+   If 1, then an errno global variable hidden in ld.so will work right with
+   all the errno-using libc code compiled for ld.so, and there is never a
+   need to share the errno location with libc.  This is appropriate only if
+   all the libc functions that ld.so uses are called without PLT and always
+   get the versions linked into ld.so rather than the libc ones.  */
+
+#ifdef IS_IN_rtld
+# define RTLD_PRIVATE_ERRNO 1
+#else
+# define RTLD_PRIVATE_ERRNO 0
+#endif
+
+/* Traditionally system calls have been made using break 0x100000.  A
+   second method was introduced which, if possible, will use the EPC
+   instruction.  To signal the presence and where to find the code the
+   kernel passes an AT_SYSINFO_EHDR pointer in the auxiliary vector to
+   the application.  */
+#define NEED_DL_SYSINFO	1
+#define USE_DL_SYSINFO	1
+
+#if defined NEED_DL_SYSINFO && !defined __ASSEMBLER__
+/* Don't declare this as a function---we want its entry-point, not
+   its function descriptor... */
+extern int _dl_sysinfo_break attribute_hidden;
+# define DL_SYSINFO_DEFAULT ((uintptr_t) &_dl_sysinfo_break)
+# define DL_SYSINFO_IMPLEMENTATION		\
+  asm (".text\n\t"				\
+       ".hidden _dl_sysinfo_break\n\t"		\
+       ".proc _dl_sysinfo_break\n\t"		\
+       "_dl_sysinfo_break:\n\t"			\
+       ".prologue\n\t"				\
+       ".altrp b6\n\t"				\
+       ".body\n\t"				\
+       "break 0x100000;\n\t"			\
+       "br.ret.sptk.many b6;\n\t"		\
+       ".endp _dl_sysinfo_break\n\t"		\
+       ".previous");
+#endif
+
+#endif	/* dl-sysdep.h */
Index: nptl/sysdeps/unix/sysv/linux/ia64/lowlevellock.h
--- nptl/sysdeps/unix/sysv/linux/ia64/lowlevellock.h
+++ nptl/sysdeps/unix/sysv/linux/ia64/lowlevellock.h
@@ -26,7 +26,7 @@
 #include <ia64intrin.h>
 #include <atomic.h>
 
-#define SYS_futex		1230
+#define __NR_futex		1230
 #define FUTEX_WAIT		0
 #define FUTEX_WAKE		1
 #define FUTEX_REQUEUE		3
@@ -34,112 +34,52 @@
 /* Initializer for compatibility lock.	*/
 #define LLL_MUTEX_LOCK_INITIALIZER (0)
 
-#define lll_futex_clobbers \
-  "out5", "out6", "out7",						      \
-  /* Non-stacked integer registers, minus r8, r10, r15.  */		      \
-  "r2", "r3", "r9", "r11", "r12", "r13", "r14", "r16", "r17", "r18",	      \
-  "r19", "r20", "r21", "r22", "r23", "r24", "r25", "r26", "r27",	      \
-  "r28", "r29", "r30", "r31",						      \
-  /* Predicate registers.  */						      \
-  "p6", "p7", "p8", "p9", "p10", "p11", "p12", "p13", "p14", "p15",	      \
-  /* Non-rotating fp registers.  */					      \
-  "f6", "f7", "f8", "f9", "f10", "f11", "f12", "f13", "f14", "f15",	      \
-  /* Branch registers.  */						      \
-  "b6", "b7",								      \
-  "memory"
-
 #define lll_futex_wait(futex, val) lll_futex_timed_wait (futex, val, 0)
 
-#define lll_futex_timed_wait(futex, val, timespec) \
-  ({									      \
-     register long int __o0 asm ("out0") = (long int) (futex);		      \
-     register long int __o1 asm ("out1") = FUTEX_WAIT;			      \
-     register int __o2 asm ("out2") = (int) (val);			      \
-     register long int __o3 asm ("out3") = (long int) (timespec);	      \
-     register long int __r8 asm ("r8");					      \
-     register long int __r10 asm ("r10");				      \
-     register long int __r15 asm ("r15") = SYS_futex;			      \
-									      \
-     __asm __volatile ("break %7;;"					      \
-		       : "=r" (__r8), "=r" (__r10), "=r" (__r15),	      \
-			 "=r" (__o0), "=r" (__o1), "=r" (__o2), "=r" (__o3)   \
-		       : "i" (0x100000), "2" (__r15), "3" (__o0), "4" (__o1), \
-		 	 "5" (__o2), "6" (__o3)				      \
-		       : "out4", lll_futex_clobbers);			      \
-     __r10 == -1 ? -__r8 : __r8;					      \
-  })
-
-
-#define lll_futex_wake(futex, nr) \
-  ({									      \
-     register long int __o0 asm ("out0") = (long int) (futex);		      \
-     register long int __o1 asm ("out1") = FUTEX_WAKE;			      \
-     register int __o2 asm ("out2") = (int) (nr);			      \
-     register long int __r8 asm ("r8");					      \
-     register long int __r10 asm ("r10");				      \
-     register long int __r15 asm ("r15") = SYS_futex;			      \
-									      \
-     __asm __volatile ("break %6;;"					      \
-		       : "=r" (__r8), "=r" (__r10), "=r" (__r15),	      \
-			 "=r" (__o0), "=r" (__o1), "=r" (__o2)		      \
-		       : "i" (0x100000), "2" (__r15), "3" (__o0), "4" (__o1), \
-		 	 "5" (__o2)					      \
-		       : "out3", "out4", lll_futex_clobbers);		      \
-     __r10 == -1 ? -__r8 : __r8;					      \
-  })
+#define lll_futex_timed_wait(ftx, val, timespec)			\
+({									\
+   DO_INLINE_SYSCALL(futex, 4, (long) (ftx), FUTEX_WAIT, (int) (val),	\
+		     (long) (timespec));				\
+   _r10 == -1 ? -_retval : _retval;					\
+})
+
+#define lll_futex_wake(ftx, nr)						\
+({									\
+   DO_INLINE_SYSCALL(futex, 3, (long) (ftx), FUTEX_WAKE, (int) (nr));	\
+   _r10 == -1 ? -_retval : _retval;					\
+})
+
+#define lll_futex_requeue(ftx, nr_wake, nr_move, mutex)			     \
+({									     \
+   DO_INLINE_SYSCALL(futex, 5, (long) (ftx), FUTEX_REQUEUE, (int) (nr_wake), \
+		     (int) (nr_move), (long) (mutex));			     \
+   _r10 == -1 ? -_retval : _retval;					     \
+})
 
 
-#define lll_futex_requeue(futex, nr_wake, nr_move, mutex) \
-  ({									      \
-     register long int __o0 asm ("out0") = (long int) (futex);		      \
-     register long int __o1 asm ("out1") = FUTEX_REQUEUE;		      \
-     register int __o2 asm ("out2") = (int) (nr_wake);			      \
-     register int __o3 asm ("out3") = (int) (nr_move);			      \
-     register long int __o4 asm ("out4") = (long int) (mutex);		      \
-     register long int __r8 asm ("r8");					      \
-     register long int __r10 asm ("r10");				      \
-     register long int __r15 asm ("r15") = SYS_futex;			      \
-									      \
-     __asm __volatile ("break %8;;"					      \
-		       : "=r" (__r8), "=r" (__r10), "=r" (__r15),	      \
-			 "=r" (__o0), "=r" (__o1), "=r" (__o2), "=r" (__o3),  \
-			 "=r" (__o4)					      \
-		       : "i" (0x100000), "2" (__r15), "3" (__o0), "4" (__o1), \
-			 "5" (__o2), "6" (__o3), "7" (__o4)		      \
-		       : lll_futex_clobbers);				      \
-     __r10 == -1 ? -__r8 : __r8;					      \
-  })
-
-
-static inline int
-__attribute__ ((always_inline))
-__lll_mutex_trylock (int *futex)
-{
-  return atomic_compare_and_exchange_val_acq (futex, 1, 0) != 0;
-}
+#define __lll_mutex_trylock(futex) \
+  (atomic_compare_and_exchange_val_acq (futex, 1, 0) != 0)
 #define lll_mutex_trylock(futex) __lll_mutex_trylock (&(futex))
 
 
 extern void __lll_lock_wait (int *futex) attribute_hidden;
 
 
-static inline void
-__attribute__ ((always_inline))
-__lll_mutex_lock (int *futex)
-{
-  if (atomic_compare_and_exchange_bool_acq (futex, 1, 0) != 0)
-    __lll_lock_wait (futex);
-}
+#define __lll_mutex_lock(futex)						\
+  ((void) ({								\
+    int *__futex = (futex);						\
+    if (atomic_compare_and_exchange_bool_acq (__futex, 1, 0) != 0)	\
+      __lll_lock_wait (__futex);					\
+  }))
 #define lll_mutex_lock(futex) __lll_mutex_lock (&(futex))
 
 
-static inline void
-__attribute__ ((always_inline))
-__lll_mutex_cond_lock (int *futex)
-{
-  if (atomic_compare_and_exchange_bool_acq (futex, 2, 0) != 0)
-    __lll_lock_wait (futex);
-}
+#define __lll_mutex_cond_lock(futex)					\
+  ((void) ({								\
+    int *__futex = (futex);						\
+    if (atomic_compare_and_exchange_bool_acq (__futex, 2, 0) != 0)	\
+      __lll_lock_wait (__futex);					\
+  }))
 #define lll_mutex_cond_lock(futex) __lll_mutex_cond_lock (&(futex))
 
 
@@ -147,41 +87,37 @@
      attribute_hidden;
 
 
-static inline int
-__attribute__ ((always_inline))
-__lll_mutex_timedlock (int *futex, const struct timespec *abstime)
-{
-  int result = 0;
-
-  if (atomic_compare_and_exchange_bool_acq (futex, 1, 0) != 0)
-    result = __lll_timedlock_wait (futex, abstime);
-
-  return result;
-}
+#define __lll_mutex_timedlock(futex, abstime)				\
+  ({									\
+     int *__futex = (futex);						\
+     int __val = 0;							\
+									\
+     if (atomic_compare_and_exchange_bool_acq (__futex, 1, 0) != 0)	\
+       __val = __lll_timedlock_wait (__futex, abstime);			\
+     __val;								\
+  })
 #define lll_mutex_timedlock(futex, abstime) \
   __lll_mutex_timedlock (&(futex), abstime)
 
 
-static inline void
-__attribute__ ((always_inline))
-__lll_mutex_unlock (int *futex)
-{
-  int val = atomic_exchange_rel (futex, 0);
-
-  if (__builtin_expect (val > 1, 0))
-    lll_futex_wake (futex, 1);
-}
+#define __lll_mutex_unlock(futex)			\
+  ((void) ({						\
+    int *__futex = (futex);				\
+    int __val = atomic_exchange_rel (__futex, 0);	\
+							\
+    if (__builtin_expect (__val > 1, 0))		\
+      lll_futex_wake (__futex, 1);			\
+  }))
 #define lll_mutex_unlock(futex) \
   __lll_mutex_unlock(&(futex))
 
 
-static inline void
-__attribute__ ((always_inline))
-__lll_mutex_unlock_force (int *futex)
-{
-  (void) atomic_exchange_rel (futex, 0);
-  lll_futex_wake (futex, 1);
-}
+#define __lll_mutex_unlock_force(futex)		\
+  ((void) ({					\
+    int *__futex = (futex);			\
+    (void) atomic_exchange_rel (__futex, 0);	\
+    lll_futex_wake (__futex, 1);		\
+  }))
 #define lll_mutex_unlock_force(futex) \
   __lll_mutex_unlock_force(&(futex))
 
Index: nptl/sysdeps/unix/sysv/linux/ia64/sysdep-cancel.h
--- nptl/sysdeps/unix/sysv/linux/ia64/sysdep-cancel.h
+++ nptl/sysdeps/unix/sysv/linux/ia64/sysdep-cancel.h
@@ -29,13 +29,21 @@
 # define PSEUDO(name, syscall_name, args)				      \
 .text;									      \
 ENTRY (name)								      \
-     adds r14 = MULTIPLE_THREADS_OFFSET, r13;;				      \
+     .prologue;								      \
+     adds r2 = SYSINFO_OFFSET, r13;					      \
+     adds r14 = MULTIPLE_THREADS_OFFSET, r13;				      \
+     .save ar.pfs, r11;							      \
+     mov r11 = ar.pfs;;							      \
+     .body;								      \
      ld4 r14 = [r14];							      \
+     ld8 r2 = [r2];							      \
      mov r15 = SYS_ify(syscall_name);;					      \
      cmp4.ne p6, p7 = 0, r14;						      \
-(p6) br.cond.spnt .Lpseudo_cancel;;					      \
-     break __BREAK_SYSCALL;;						      \
-     cmp.eq p6,p0=-1,r10;						      \
+     mov b7 = r2;							      \
+(p6) br.cond.spnt .Lpseudo_cancel;					      \
+     br.call.sptk.many b6 = b7;;					      \
+     mov ar.pfs = r11;							      \
+     cmp.eq p6,p0 = -1, r10;						      \
 (p6) br.cond.spnt.few __syscall_error;					      \
      ret;;								      \
      .endp name;							      \
@@ -48,14 +56,17 @@
      .regstk args, 5, args, 0;						      \
      .save ar.pfs, loc0;						      \
      alloc loc0 = ar.pfs, args, 5, args, 0;				      \
+     adds loc4 = SYSINFO_OFFSET, r13;					      \
      .save rp, loc1;							      \
      mov loc1 = rp;;							      \
      .body;								      \
+     ld8 loc4 = [loc4];							      \
      CENABLE;;								      \
      mov loc2 = r8;							      \
+     mov b7 = loc4;							      \
      COPY_ARGS_##args							      \
      mov r15 = SYS_ify(syscall_name);					      \
-     break __BREAK_SYSCALL;;						      \
+     br.call.sptk.many b6 = b7;;					      \
      mov loc3 = r8;							      \
      mov loc4 = r10;							      \
      mov out0 = loc2;							      \
Index: sysdeps/unix/sysv/linux/ia64/brk.S
--- sysdeps/unix/sysv/linux/ia64/brk.S
+++ sysdeps/unix/sysv/linux/ia64/brk.S
@@ -35,19 +35,17 @@
 weak_alias (__curbrk, ___brk_addr)
 
 LEAF(__brk)
-	mov	r15=__NR_brk
-	break.i	__BREAK_SYSCALL
+	.regstk 1, 0, 0, 0
+	DO_CALL(__NR_brk)
+	cmp.ltu	p6, p0 = ret0, in0
+	addl r9 = @ltoff(__curbrk), gp
 	;;
-	cmp.ltu	p6,p0=ret0,r32	/* r32 is the input register, even though we
-				   haven't allocated a frame */
-	addl	r9=@ltoff(__curbrk),gp
-	;;
-	ld8	r9=[r9]
-(p6) 	mov	ret0=ENOMEM
+	ld8 r9 = [r9]
+(p6) 	mov ret0 = ENOMEM
 (p6)	br.cond.spnt.few __syscall_error
 	;;
-	st8	[r9]=ret0
-	mov 	ret0=0
+	st8 [r9] = ret0
+	mov ret0 = 0
 	ret
 END(__brk)
 
Index: sysdeps/unix/sysv/linux/ia64/clone2.S
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/unix/sysv/linux/ia64/clone2.S,v
retrieving revision 1.7
diff -u -r1.7 clone2.S
--- sysdeps/unix/sysv/linux/ia64/clone2.S	13 Mar 2003 04:36:59 -0000	1.7
+++ sysdeps/unix/sysv/linux/ia64/clone2.S	13 Nov 2003 20:02:48 -0000
@@ -25,49 +25,56 @@
 /* 	         size_t child_stack_size, int flags, void *arg,		*/
 /*	         pid_t *parent_tid, void *tls, pid_t *child_tid)	*/
 
+#define CHILD	p8
+#define PARENT	p9
+
 ENTRY(__clone2)
-	alloc r2=ar.pfs,8,2,6,0
+	.prologue
+	alloc r2=ar.pfs,8,0,6,0
 	cmp.eq p6,p0=0,in0
 	mov r8=EINVAL
-(p6)	br.cond.spnt.few __syscall_error
-	;;
-	flushrs			/* This is necessary, since the child	*/
-				/* will be running with the same 	*/
-				/* register backing store for a few 	*/
-				/* instructions.  We need to ensure	*/
-				/* that it will not read or write the	*/
-				/* backing store.			*/
-	mov loc0=in0		/* save fn	*/
-	mov loc1=in4		/* save arg	*/
 	mov out0=in3		/* Flags are first syscall argument.	*/
 	mov out1=in1		/* Stack address.			*/
+(p6)	br.cond.spnt.many __syscall_error
+	;;
 	mov out2=in2		/* Stack size.				*/
 	mov out3=in5		/* Parent TID Pointer			*/
 	mov out4=in7		/* Child TID Pointer			*/
  	mov out5=in6		/* TLS pointer				*/
-        DO_CALL (SYS_ify (clone2))
+	/*
+	 * clone2() is special: the child cannot execute br.ret right
+	 * after the system call returns, because it starts out
+	 * executing on an empty stack.  Because of this, we can't use
+	 * the new (lightweight) syscall convention here.  Instead, we
+	 * just fall back on always using "break".
+	 *
+	 * Furthermore, since the child starts with an empty stack, we
+	 * need to avoid unwinding past invalid memory.  To that end,
+	 * we'll pretend now that __clone2() is the end of the
+	 * call-chain.  This is wrong for the parent, but only until
+	 * it returns from clone2(), and it's better than the
+	 * alternative.
+	 */
+	mov r15=SYS_ify (clone2)
+	.save rp, r0
+	break __BREAK_SYSCALL
+	.body
         cmp.eq p6,p0=-1,r10
+	cmp.eq CHILD,PARENT=0,r8 /* Are we the child?   */
+(p6)	br.cond.spnt.many __syscall_error
 	;;
-(p6)	br.cond.spnt.few __syscall_error
-
-#	define CHILD p6
-#	define PARENT p7
-	cmp.eq CHILD,PARENT=0,r8 /* Are we the child?	*/
-	;;
-(CHILD)	ld8 out1=[loc0],8	/* Retrieve code pointer.	*/
-(CHILD)	mov out0=loc1		/* Pass proper argument	to fn */
+(CHILD)	ld8 out1=[in0],8	/* Retrieve code pointer.	*/
+(CHILD)	mov out0=in4		/* Pass proper argument	to fn */
 (PARENT) ret
 	;;
-	ld8 gp=[loc0]		/* Load function gp.		*/
+	ld8 gp=[in0]		/* Load function gp.		*/
 	mov b6=out1
-	;;
-	br.call.dptk.few rp=b6	/* Call fn(arg) in the child 	*/
+	br.call.dptk.many rp=b6	/* Call fn(arg) in the child 	*/
 	;;
 	mov out0=r8		/* Argument to _exit		*/
 	.globl _exit
-	br.call.dpnt.few rp=_exit /* call _exit with result from fn.	*/
+	br.call.dpnt.many rp=_exit /* call _exit with result from fn.	*/
 	ret			/* Not reached.		*/
-
 PSEUDO_END(__clone2)
 
 /* For now we leave __clone undefined.  This is unlikely to be a	*/
Index: sysdeps/unix/sysv/linux/ia64/getcontext.S
--- sysdeps/unix/sysv/linux/ia64/getcontext.S
+++ sysdeps/unix/sysv/linux/ia64/getcontext.S
@@ -35,26 +35,27 @@
 
 ENTRY(__getcontext)
 	.prologue
-	alloc r16 = ar.pfs, 1, 0, 4, 0
+	.body
+	alloc r11 = ar.pfs, 1, 0, 4, 0
 
 	// sigprocmask (SIG_BLOCK, NULL, &sc->sc_mask):
 
-	mov r2 = SC_MASK
-	mov r15 = __NR_rt_sigprocmask
-	;;
+	mov r3 = SC_MASK
 	mov out0 = SIG_BLOCK
-	mov out1 = 0
-	add out2 = r2, in0
-	mov out3 = 8	// sizeof kernel sigset_t
 
-	break __BREAK_SYSCALL
 	flushrs					// save dirty partition on rbs
+	mov out1 = 0
+	add out2 = r3, in0
+
+	mov out3 = 8	// sizeof kernel sigset_t
+	DO_CALL(__NR_rt_sigprocmask)
 
 	mov.m rFPSR = ar.fpsr
 	mov.m rRSC = ar.rsc
 	add r2 = SC_GR+1*8, r32
 	;;
 	mov.m rBSP = ar.bsp
+	.prologue
 	.save ar.unat, rUNAT
 	mov.m rUNAT = ar.unat
 	.body
@@ -63,7 +64,7 @@
 
 .mem.offset 0,0; st8.spill [r2] = r1, (5*8 - 1*8)
 .mem.offset 8,0; st8.spill [r3] = r4, 16
-	mov.i rPFS = ar.pfs
+	mov rPFS = r11
 	;;
 .mem.offset 0,0; st8.spill [r2] = r5, 16
 .mem.offset 8,0; st8.spill [r3] = r6, 48
Index: sysdeps/unix/sysv/linux/ia64/setcontext.S
--- sysdeps/unix/sysv/linux/ia64/setcontext.S
+++ sysdeps/unix/sysv/linux/ia64/setcontext.S
@@ -32,20 +32,21 @@
   other than the PRESERVED state.  */
 
 ENTRY(__setcontext)
-	alloc r16 = ar.pfs, 1, 0, 4, 0
+	.prologue
+	.body
+	alloc r11 = ar.pfs, 1, 0, 4, 0
 
 	// sigprocmask (SIG_SETMASK, &sc->sc_mask, NULL):
 
-	mov r2 = SC_MASK
-	mov r15 = __NR_rt_sigprocmask
-	;;
+	mov r3 = SC_MASK
 	mov out0 = SIG_SETMASK
-	add out1 = r2, in0
+	;;
+	add out1 = r3, in0
 	mov out2 = 0
 	mov out3 = 8	// sizeof kernel sigset_t
 
 	invala
-	break __BREAK_SYSCALL
+	DO_CALL(__NR_rt_sigprocmask)
 	add r2 = SC_NAT, r32
 
 	add r3 = SC_RNAT, r32			// r3 <- &sc_ar_rnat
Index: sysdeps/unix/sysv/linux/ia64/sysdep.h
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/unix/sysv/linux/ia64/sysdep.h,v
retrieving revision 1.17
diff -u -r1.17 sysdep.h
--- sysdeps/unix/sysv/linux/ia64/sysdep.h	16 Aug 2003 08:00:24 -0000	1.17
+++ sysdeps/unix/sysv/linux/ia64/sysdep.h	13 Nov 2003 20:02:48 -0000
@@ -23,6 +23,8 @@
 
 #include <sysdeps/unix/sysdep.h>
 #include <sysdeps/ia64/sysdep.h>
+#include <dl-sysdep.h>
+#include <tls.h>
 
 /* For Linux we can use the system call table in the header file
 	/usr/include/asm/unistd.h
@@ -51,6 +53,14 @@
 # define __NR_semtimedop 1247
 #endif
 
+#if defined USE_DL_SYSINFO \
+	&& (!defined NOT_IN_libc \
+	    || defined IS_IN_libpthread || defined IS_IN_librt)
+# define IA64_USE_NEW_STUB
+#else
+# undef IA64_USE_NEW_STUB
+#endif
+
 #ifdef __ASSEMBLER__
 
 #undef CALL_MCOUNT
@@ -95,9 +105,41 @@
 	cmp.eq p6,p0=-1,r10;			\
 (p6)	br.cond.spnt.few __syscall_error;
 
-#define DO_CALL(num)				\
+#define DO_CALL_VIA_BREAK(num)			\
 	mov r15=num;				\
-	break __BREAK_SYSCALL;
+	break __BREAK_SYSCALL
+
+#ifdef IA64_USE_NEW_STUB
+# ifdef SHARED
+#  define DO_CALL(num)				\
+	.prologue;				\
+	adds r2 = SYSINFO_OFFSET, r13;;		\
+	ld8 r2 = [r2];				\
+	.save ar.pfs, r11;			\
+	mov r11 = ar.pfs;;			\
+	.body;					\
+	mov r15 = num;				\
+	mov b7 = r2;				\
+	br.call.sptk.many b6 = b7;;		\
+	.restore sp;				\
+	mov ar.pfs = r11
+# else /* !SHARED */
+#  define DO_CALL(num)				\
+	.prologue;				\
+	mov r15 = num;				\
+	movl r2 = _dl_sysinfo;;			\
+	ld8 r2 = [r2];				\
+	.save ar.pfs, r11;			\
+	mov r11 = ar.pfs;;			\
+	.body;					\
+	mov b7 = r2;				\
+	br.call.sptk.many b6 = b7;;		\
+	.restore sp;				\
+	mov ar.pfs = r11
+# endif
+#else
+# define DO_CALL(num)				DO_CALL_VIA_BREAK(num)
+#endif
 
 #undef PSEUDO_END
 #define PSEUDO_END(name)	.endp C_SYMBOL_NAME(name);
@@ -143,45 +185,64 @@
    from a syscall.  r10 is set to -1 on error, whilst r8 contains the
    (non-negative) errno on error or the return value on success.
  */
-#undef INLINE_SYSCALL
-#define INLINE_SYSCALL(name, nr, args...)			\
-  ({								\
+
+#ifdef IA64_USE_NEW_STUB
+
+#define DO_INLINE_SYSCALL(name, nr, args...)					\
+    register long _r8 __asm ("r8");						\
+    register long _r10 __asm ("r10");						\
+    register long _r15 __asm ("r15") = __NR_##name;				\
+    register void *_b7 __asm ("b7") = ((tcbhead_t *) __thread_self)->private;	\
+    long _retval;								\
+    LOAD_ARGS_##nr (args);							\
+    /*										\
+     * Don't specify any unwind info here.  We mark ar.pfs as			\
+     * clobbered.  This will force the compiler to save ar.pfs			\
+     * somewhere and emit appropriate unwind info for that save.		\
+     */										\
+    __asm __volatile ("br.call.sptk.many b6=%0;;\n"				\
+		      : "=b"(_b7), "=r" (_r8), "=r" (_r10), "=r" (_r15)		\
+			ASM_OUTARGS_##nr					\
+		      : "0" (_b7), "3" (_r15) ASM_ARGS_##nr			\
+		      : "memory", "ar.pfs" ASM_CLOBBERS_##nr);			\
+    _retval = _r8;
+
+#else /* !IA64_USE_NEW_STUB */
+
+#define DO_INLINE_SYSCALL(name, nr, args...)			\
     register long _r8 asm ("r8");				\
     register long _r10 asm ("r10");				\
     register long _r15 asm ("r15") = __NR_##name;		\
     long _retval;						\
     LOAD_ARGS_##nr (args);					\
     __asm __volatile (BREAK_INSN (__BREAK_SYSCALL)		\
-                      : "=r" (_r8), "=r" (_r10), "=r" (_r15)	\
+		      : "=r" (_r8), "=r" (_r10), "=r" (_r15)	\
 			ASM_OUTARGS_##nr			\
-                      : "2" (_r15) ASM_ARGS_##nr		\
-                      : "memory" ASM_CLOBBERS_##nr);		\
-    _retval = _r8;						\
-    if (_r10 == -1)						\
-      {								\
-        __set_errno (_retval);					\
-        _retval = -1;						\
-      }								\
+		      : "2" (_r15) ASM_ARGS_##nr		\
+		      : "memory" ASM_CLOBBERS_##nr);		\
+    _retval = _r8;
+
+#endif /* !IA64_USE_NEW_STUB */
+
+#undef INLINE_SYSCALL
+#define INLINE_SYSCALL(name, nr, args...)	\
+  ({						\
+    DO_INLINE_SYSCALL(name, nr, args)		\
+    if (_r10 == -1)				\
+      {						\
+	__set_errno (_retval);			\
+	_retval = -1;				\
+      }						\
     _retval; })
 
 #undef INTERNAL_SYSCALL_DECL
 #define INTERNAL_SYSCALL_DECL(err) long int err
 
 #undef INTERNAL_SYSCALL
-#define INTERNAL_SYSCALL(name, err, nr, args...)		\
-  ({								\
-    register long _r8 asm ("r8");				\
-    register long _r10 asm ("r10");				\
-    register long _r15 asm ("r15") = __NR_##name;		\
-    long _retval;						\
-    LOAD_ARGS_##nr (args);					\
-    __asm __volatile (BREAK_INSN (__BREAK_SYSCALL)		\
-                      : "=r" (_r8), "=r" (_r10), "=r" (_r15)	\
-			ASM_OUTARGS_##nr			\
-                      : "2" (_r15) ASM_ARGS_##nr		\
-                      : "memory" ASM_CLOBBERS_##nr);		\
-    _retval = _r8;						\
-    err = _r10;							\
+#define INTERNAL_SYSCALL(name, err, nr, args...)	\
+  ({							\
+    DO_INLINE_SYSCALL(name, nr, args)			\
+    err = _r10;						\
     _retval; })
 
 #undef INTERNAL_SYSCALL_ERROR_P
@@ -218,6 +279,15 @@
 #define ASM_OUTARGS_5	ASM_OUTARGS_4, "=r" (_out4)
 #define ASM_OUTARGS_6	ASM_OUTARGS_5, "=r" (_out5)
 
+#ifdef IA64_USE_NEW_STUB
+#define ASM_ARGS_0
+#define ASM_ARGS_1	ASM_ARGS_0, "4" (_out0)
+#define ASM_ARGS_2	ASM_ARGS_1, "5" (_out1)
+#define ASM_ARGS_3	ASM_ARGS_2, "6" (_out2)
+#define ASM_ARGS_4	ASM_ARGS_3, "7" (_out3)
+#define ASM_ARGS_5	ASM_ARGS_4, "8" (_out4)
+#define ASM_ARGS_6	ASM_ARGS_5, "9" (_out5)
+#else
 #define ASM_ARGS_0
 #define ASM_ARGS_1	ASM_ARGS_0, "3" (_out0)
 #define ASM_ARGS_2	ASM_ARGS_1, "4" (_out1)
@@ -225,6 +295,7 @@
 #define ASM_ARGS_4	ASM_ARGS_3, "6" (_out3)
 #define ASM_ARGS_5	ASM_ARGS_4, "7" (_out4)
 #define ASM_ARGS_6	ASM_ARGS_5, "8" (_out5)
+#endif
 
 #define ASM_CLOBBERS_0	ASM_CLOBBERS_1, "out0"
 #define ASM_CLOBBERS_1	ASM_CLOBBERS_2, "out1"
@@ -232,7 +303,7 @@
 #define ASM_CLOBBERS_3	ASM_CLOBBERS_4, "out3"
 #define ASM_CLOBBERS_4	ASM_CLOBBERS_5, "out4"
 #define ASM_CLOBBERS_5	ASM_CLOBBERS_6, "out5"
-#define ASM_CLOBBERS_6	, "out6", "out7",				\
+#define ASM_CLOBBERS_6_COMMON	, "out6", "out7",			\
   /* Non-stacked integer registers, minus r8, r10, r15.  */		\
   "r2", "r3", "r9", "r11", "r12", "r13", "r14", "r16", "r17", "r18",	\
   "r19", "r20", "r21", "r22", "r23", "r24", "r25", "r26", "r27",	\
@@ -242,7 +313,13 @@
   /* Non-rotating fp registers.  */					\
   "f6", "f7", "f8", "f9", "f10", "f11", "f12", "f13", "f14", "f15",	\
   /* Branch registers.  */						\
-  "b6", "b7"
+  "b6"
+
+#ifdef IA64_USE_NEW_STUB
+# define ASM_CLOBBERS_6	ASM_CLOBBERS_6_COMMON
+#else
+# define ASM_CLOBBERS_6	ASM_CLOBBERS_6_COMMON , "b7"
+#endif
 
 #endif /* not __ASSEMBLER__ */
 
Index: sysdeps/unix/sysv/linux/ia64/vfork.S
--- sysdeps/unix/sysv/linux/ia64/vfork.S
+++ sysdeps/unix/sysv/linux/ia64/vfork.S
@@ -34,9 +34,8 @@
 	mov out0=CLONE_VM+CLONE_VFORK+SIGCHLD
 	mov out1=0		/* Standard sp value.			*/
 	;;
-	DO_CALL (SYS_ify (clone))
+	DO_CALL_VIA_BREAK (SYS_ify (clone))
 	cmp.eq p6,p0=-1,r10
-	;;
 (p6)	br.cond.spnt.few __syscall_error
 	ret
 PSEUDO_END(__vfork)
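
As a reading aid for the sysdep.h and lowlevellock.h hunks above, the three ways the patch turns the r8/r10 pair into a C result can be modeled in plain C.  This is only a sketch: `struct sk_regs` and the `sk_*` helpers are hypothetical names that do not appear in the patch, and no real system call is made.

```c
#include <errno.h>

/* Hypothetical model of what the kernel leaves in r8/r10 after a
   system call; not part of the patch.  */
struct sk_regs
{
  long r8;	/* return value, or non-negative errno on error */
  long r10;	/* -1 on error */
};

/* Mirrors INLINE_SYSCALL: on error, store r8 in errno and return -1.  */
static long
sk_inline_syscall_result (struct sk_regs regs)
{
  long retval = regs.r8;
  if (regs.r10 == -1)
    {
      errno = (int) retval;
      retval = -1;
    }
  return retval;
}

/* Mirrors INTERNAL_SYSCALL: hand r10 back through *err and leave
   errno untouched.  */
static long
sk_internal_syscall_result (struct sk_regs regs, long *err)
{
  *err = regs.r10;
  return regs.r8;
}

/* Mirrors the lll_futex_* macros: fold the error into a negative
   return value.  */
static long
sk_futex_result (struct sk_regs regs)
{
  return regs.r10 == -1 ? -regs.r8 : regs.r8;
}
```

All three start from the same DO_INLINE_SYSCALL expansion; only the post-processing of r8 and r10 differs.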

* Re: new syscall stub support for ia64 libc
  2003-11-13 21:34                       ` David Mosberger
@ 2003-11-13 21:44                         ` Jakub Jelinek
  2003-11-13 21:58                           ` David Mosberger
  2003-11-13 23:45                           ` David Mosberger
  0 siblings, 2 replies; 98+ messages in thread
From: Jakub Jelinek @ 2003-11-13 21:44 UTC (permalink / raw)
  To: davidm; +Cc: Ulrich Drepper, libc-hacker

On Thu, Nov 13, 2003 at 01:34:13PM -0800, David Mosberger wrote:
> --- elf/rtld.c
> +++ elf/rtld.c
> @@ -1169,7 +1169,7 @@
>  		  l->l_ldnum = ph->p_memsz / sizeof (ElfW(Dyn));
>  		  break;
>  		}
> -	      if (ph->p_type == PT_LOAD)
> +	      if (i == 0 && ph->p_type == PT_LOAD)
>  		assert ((void *) ph->p_vaddr == GL(dl_sysinfo_dso));
>  	    }
>  	  elf_get_dynamic_info (l, dyn_temp);

Shouldn't this be:

+#ifndef NDEBUG
+	  uint_fast16_t pt_load_num = 0;
+#endif
          for (uint_fast16_t i = 0; i < l->l_phnum; ++i)
            {
              const ElfW(Phdr) *const ph = &l->l_phdr[i];
              if (ph->p_type == PT_DYNAMIC)
                {
                  l->l_ld = (void *) ph->p_vaddr;
                  l->l_ldnum = ph->p_memsz / sizeof (ElfW(Dyn));
                  break;
                }
+#ifndef NDEBUG
              if (ph->p_type == PT_LOAD)
-		assert ((void *) ph->p_vaddr == GL(dl_sysinfo_dso));
+		{
+		  assert (pt_load_num
+			  || (void *) ph->p_vaddr == GL(dl_sysinfo_dso));
+		  pt_load_num++;
+		}
+#endif
            }

?

	Jakub

* Re: new syscall stub support for ia64 libc
  2003-11-13 21:44                         ` Jakub Jelinek
@ 2003-11-13 21:58                           ` David Mosberger
  2003-11-13 23:45                           ` David Mosberger
  1 sibling, 0 replies; 98+ messages in thread
From: David Mosberger @ 2003-11-13 21:58 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: davidm, Ulrich Drepper, libc-hacker

>>>>> On Thu, 13 Nov 2003 20:38:17 +0100, Jakub Jelinek <jakub@redhat.com> said:
  Jakub> Shouldn't this be:

Does it make a difference in practice?  AFAIK, PT_LOAD is the first
program header in the kernel DSO and that's unlikely to change.  But
yes, your version covers a wider range of possibilities.
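
For illustration only, the difference between the two checks can be sketched in plain C.  The struct, the `SK_*` constants and the function names are hypothetical stand-ins (not the real ElfW(Phdr) definitions), and the functions return 0 where the real code would trip the assert:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for ElfW(Phdr) and the PT_* constants, so
   the sketch compiles without <elf.h>.  */
enum { SK_PT_LOAD = 1, SK_PT_DYNAMIC = 2 };

struct sk_phdr
{
  uint32_t p_type;
  uintptr_t p_vaddr;
};

/* "i == 0" variant: only program header 0 is checked, and only if it
   is a PT_LOAD.  Returns 1 if the check passes.  */
static int
check_index_zero (const struct sk_phdr *ph, size_t n, uintptr_t base)
{
  for (size_t i = 0; i < n; ++i)
    {
      if (ph[i].p_type == SK_PT_DYNAMIC)
	break;
      if (i == 0 && ph[i].p_type == SK_PT_LOAD && ph[i].p_vaddr != base)
	return 0;
    }
  return 1;
}

/* Jakub's variant: the first PT_LOAD is checked wherever it appears;
   later PT_LOAD entries are only counted.  */
static int
check_first_load (const struct sk_phdr *ph, size_t n, uintptr_t base)
{
  unsigned int pt_load_num = 0;

  for (size_t i = 0; i < n; ++i)
    {
      if (ph[i].p_type == SK_PT_DYNAMIC)
	break;
      if (ph[i].p_type == SK_PT_LOAD)
	{
	  if (pt_load_num == 0 && ph[i].p_vaddr != base)
	    return 0;
	  pt_load_num++;
	}
    }
  return 1;
}
```

The two only disagree when the first PT_LOAD is not program header 0.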

	--david

* Re: new syscall stub support for ia64 libc
  2003-11-13 21:44                         ` Jakub Jelinek
  2003-11-13 21:58                           ` David Mosberger
@ 2003-11-13 23:45                           ` David Mosberger
  2003-11-14  1:44                             ` Ulrich Drepper
  1 sibling, 1 reply; 98+ messages in thread
From: David Mosberger @ 2003-11-13 23:45 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: davidm, Jakub Jelinek, libc-hacker

To cover all bases, here is an updated patch with Jakub's version of
the elf/rtld.c check.

	--david

libc/ChangeLog

2003-11-12  David Mosberger  <davidm@hpl.hp.com>

	* elf/rtld.c (dl_main): Restrict dl_sysinfo_dso check to first
	PT_LOAD program header.  On ia64, the check failed previously
	because there is more than one PT_LOAD program header.

	* sysdeps/unix/sysv/linux/ia64/brk.S (__brk): Restructure it
	to take advantage of DO_CALL() macro.
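	* sysdeps/unix/sysv/linux/ia64/clone2.S (__clone2): Always use
	"break" for the system call, since the child starts out on an
	empty stack.  Mark __clone2 as the end of the call-chain for
	unwinding.
	* sysdeps/unix/sysv/linux/ia64/getcontext.S: Restructure it to
	take advantage of DO_CALL() macro.
	* sysdeps/unix/sysv/linux/ia64/setcontext.S: Ditto.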

	* sysdeps/unix/sysv/linux/ia64/sysdep.h: Add include of
	<dl-sysdep.h> and <tls.h>.
	(IA64_USE_NEW_STUB): New macro.
	(DO_CALL_VIA_BREAK): Ditto.
	(DO_CALL): Add new variants for IA64_USE_NEW_STUB.
	(DO_INLINE_SYSCALL): New macro.
	(INLINE_SYSCALL): Define in terms of DO_INLINE_SYSCALL.
	(INTERNAL_SYSCALL): Ditto.
	(ASM_ARGS_0, ASM_ARGS_1, ASM_ARGS_2, ASM_ARGS_3, ASM_ARGS_4,
	ASM_ARGS_5, ASM_ARGS_6): Add new variant for IA64_USE_NEW_STUB.
	(ASM_CLOBBERS_6_COMMON): New macro.
	(ASM_CLOBBERS_6): Add new variant for IA64_USE_NEW_STUB.

	* sysdeps/unix/sysv/linux/ia64/vfork.S: Use DO_CALL_VIA_BREAK()
	instead of DO_CALL().

linuxthreads/ChangeLog

2003-11-12  David Mosberger  <davidm@hpl.hp.com>

	* sysdeps/unix/sysv/linux/ia64/dl-sysdep.h: New file.

nptl/ChangeLog

2003-11-12  David Mosberger  <davidm@hpl.hp.com>

	* allocatestack.c (allocate_stack): Use THREAD_SYSINFO and
	THREAD_SELF_SYSINFO instead of open code.

	* sysdeps/i386/tls.h (THREAD_SELF_SYSINFO): New macro.
	(THREAD_SYSINFO): Ditto.

	* sysdeps/ia64/tcb-offsets.sym: Add SYSINFO_OFFSET.

	* sysdeps/ia64/tls.h: Move declaration of __thread_self up so it
	comes before the include of <sysdep.h>.
	(THREAD_SELF_SYSINFO): New macro.
	(THREAD_SYSINFO): Ditto.
	(INIT_SYSINFO): New macro.
	(TLS_INIT_TP): Call INIT_SYSINFO.

	* sysdeps/pthread/createthread.c (create_thread): Use
	THREAD_SELF_SYSINFO and THREAD_SYSINFO instead of open code.

	* sysdeps/unix/sysv/linux/ia64/dl-sysdep.h: New file.

	* sysdeps/unix/sysv/linux/ia64/lowlevellock.h (__NR_futex): Rename
	from SYS_futex, to match expectations of
	sysdep.h:DO_INLINE_SYSCALL.
	(lll_futex_clobbers): Remove.
	(lll_futex_timed_wait): Rewrite in terms of DO_INLINE_SYSCALL.
	(lll_futex_wake): Ditto.
	(lll_futex_requeue): Ditto.
	(__lll_mutex_trylock): Rewrite to a macro, so we can include this
	file before DO_INLINE_SYSCALL is defined (proposed by Jakub
	Jelinek).
	(__lll_mutex_lock): Ditto.
	(__lll_mutex_cond_lock): Ditto.
	(__lll_mutex_timedlock): Ditto.
	(__lll_mutex_unlock): Ditto.
	(__lll_mutex_unlock_force): Ditto.

	* sysdeps/unix/sysv/linux/ia64/sysdep-cancel.h (PSEUDO): Take
	advantage of new syscall stub and optimize accordingly.

Index: elf/rtld.c
===================================================================
RCS file: /cvs/glibc/libc/elf/rtld.c,v
retrieving revision 1.299
diff -u -r1.299 rtld.c
--- elf/rtld.c	27 Oct 2003 20:08:32 -0000	1.299
+++ elf/rtld.c	13 Nov 2003 23:14:00 -0000
@@ -1156,6 +1156,9 @@
       if (__builtin_expect (l != NULL, 1))
 	{
 	  static ElfW(Dyn) dyn_temp[DL_RO_DYN_TEMP_CNT];
+#ifndef NDEBUG
+	  uint_fast16_t pt_load_num = 0;
+#endif
 
 	  l->l_phdr = ((const void *) GL(dl_sysinfo_dso)
 		       + GL(dl_sysinfo_dso)->e_phoff);
@@ -1169,8 +1172,14 @@
 		  l->l_ldnum = ph->p_memsz / sizeof (ElfW(Dyn));
 		  break;
 		}
+#ifndef NDEBUG
 	      if (ph->p_type == PT_LOAD)
-		assert ((void *) ph->p_vaddr == GL(dl_sysinfo_dso));
+		{
+		  assert (pt_load_num
+			  || (void *) ph->p_vaddr == GL(dl_sysinfo_dso));
+		  pt_load_num++;
+		}
+#endif
 	    }
 	  elf_get_dynamic_info (l, dyn_temp);
 	  _dl_setup_hash (l);
Index: linuxthreads/sysdeps/unix/sysv/linux/ia64/dl-sysdep.h
--- /dev/null
+++ linuxthreads/sysdeps/unix/sysv/linux/ia64/dl-sysdep.h
@@ -0,0 +1,45 @@
+/* System-specific settings for dynamic linker code.  IA-64 version.
+   Copyright (C) 2003 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+   02111-1307 USA.  */
+
+#ifndef _DL_SYSDEP_H
+#define _DL_SYSDEP_H	1
+
+#define NEED_DL_SYSINFO	1
+#undef USE_DL_SYSINFO
+
+#if defined NEED_DL_SYSINFO && !defined __ASSEMBLER__
+/* Don't declare this as a function---we want its entry-point, not
+   its function descriptor... */
+extern int _dl_sysinfo_break attribute_hidden;
+# define DL_SYSINFO_DEFAULT ((uintptr_t) &_dl_sysinfo_break)
+# define DL_SYSINFO_IMPLEMENTATION		\
+  asm (".text\n\t"				\
+       ".hidden _dl_sysinfo_break\n\t"		\
+       ".proc _dl_sysinfo_break\n\t"		\
+       "_dl_sysinfo_break:\n\t"			\
+       ".prologue\n\t"				\
+       ".altrp b6\n\t"				\
+       ".body\n\t"				\
+       "break 0x100000;\n\t"			\
+       "br.ret.sptk.many b6;\n\t"		\
+       ".endp _dl_sysinfo_break\n\t"		\
+       ".previous");
+#endif
+
+#endif	/* dl-sysdep.h */
Index: nptl/allocatestack.c
--- nptl/allocatestack.c
+++ nptl/allocatestack.c
@@ -352,7 +352,7 @@
 
 #ifdef NEED_DL_SYSINFO
       /* Copy the sysinfo value from the parent.  */
-      pd->header.sysinfo = THREAD_GETMEM (THREAD_SELF, header.sysinfo);
+      THREAD_SYSINFO(pd) = THREAD_SELF_SYSINFO;
 #endif
 
       /* The process ID is also the same as that of the caller.  */
@@ -488,7 +488,7 @@
 
 #ifdef NEED_DL_SYSINFO
 	  /* Copy the sysinfo value from the parent.  */
-	  pd->header.sysinfo = THREAD_GETMEM (THREAD_SELF, header.sysinfo);
+	  THREAD_SYSINFO(pd) = THREAD_SELF_SYSINFO;
 #endif
 
 	  /* The process ID is also the same as that of the caller.  */
Index: nptl/sysdeps/i386/tls.h
--- nptl/sysdeps/i386/tls.h
+++ nptl/sysdeps/i386/tls.h
@@ -128,6 +128,8 @@
 # define GET_DTV(descr) \
   (((tcbhead_t *) (descr))->dtv)
 
+#define THREAD_SELF_SYSINFO	THREAD_GETMEM (THREAD_SELF, header.sysinfo)
+#define THREAD_SYSINFO(pd)	((pd)->header.sysinfo)
 
 /* Macros to load from and store into segment registers.  */
 # ifndef TLS_GET_GS
Index: nptl/sysdeps/ia64/tcb-offsets.sym
--- nptl/sysdeps/ia64/tcb-offsets.sym
+++ nptl/sysdeps/ia64/tcb-offsets.sym
@@ -2,3 +2,4 @@
 #include <tls.h>
 
 MULTIPLE_THREADS_OFFSET offsetof (struct pthread, header.multiple_threads) - sizeof (struct pthread)
+SYSINFO_OFFSET		offsetof (tcbhead_t, private)
Index: nptl/sysdeps/ia64/tls.h
--- nptl/sysdeps/ia64/tls.h
+++ nptl/sysdeps/ia64/tls.h
@@ -42,6 +42,8 @@
   void *private;
 } tcbhead_t;
 
+register struct pthread *__thread_self __asm__("r13");
+
 # define TLS_MULTIPLE_THREADS_IN_TCB 1
 
 #else /* __ASSEMBLER__ */
@@ -64,8 +66,6 @@
 /* Get system call information.  */
 # include <sysdep.h>
 
-register struct pthread *__thread_self __asm__("r13");
-
 /* This is the size of the initial TCB.  */
 # define TLS_INIT_TCB_SIZE sizeof (tcbhead_t)
 
@@ -100,11 +100,20 @@
 #  define GET_DTV(descr) \
   (((tcbhead_t *) (descr))->dtv)
 
+#define THREAD_SELF_SYSINFO	(((tcbhead_t *) __thread_self)->private)
+#define THREAD_SYSINFO(pd)	(((tcbhead_t *) ((pd) + 1))->private)
+
+#if defined NEED_DL_SYSINFO
+# define INIT_SYSINFO   THREAD_SELF_SYSINFO = (void *) GL(dl_sysinfo)
+#else
+# define INIT_SYSINFO   NULL
+#endif
+
 /* Code to initially initialize the thread pointer.  This might need
    special attention since 'errno' is not yet available and if the
    operation can cause a failure 'errno' must not be touched.  */
 # define TLS_INIT_TP(thrdescr, secondcall) \
-  (__thread_self = (thrdescr), NULL)
+  (__thread_self = (thrdescr), INIT_SYSINFO, NULL)
 
 /* Return the address of the dtv for the current thread.  */
 #  define THREAD_DTV() \
Index: nptl/sysdeps/pthread/createthread.c
--- nptl/sysdeps/pthread/createthread.c
+++ nptl/sysdeps/pthread/createthread.c
@@ -226,7 +226,7 @@
     }
 
 #ifdef NEED_DL_SYSINFO
-  assert (THREAD_GETMEM (THREAD_SELF, header.sysinfo) == pd->header.sysinfo);
+  assert (THREAD_SELF_SYSINFO == THREAD_SYSINFO(pd));
 #endif
 
   /* Actually create the thread.  */
Index: nptl/sysdeps/unix/sysv/linux/ia64/dl-sysdep.h
--- /dev/null
+++ nptl/sysdeps/unix/sysv/linux/ia64/dl-sysdep.h
@@ -0,0 +1,64 @@
+/* System-specific settings for dynamic linker code.  IA-64 version.
+   Copyright (C) 2003 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+   02111-1307 USA.  */
+
+#ifndef _DL_SYSDEP_H
+#define _DL_SYSDEP_H	1
+
+/* This macro must be defined to either 0 or 1.
+
+   If 1, then an errno global variable hidden in ld.so will work right with
+   all the errno-using libc code compiled for ld.so, and there is never a
+   need to share the errno location with libc.  This is appropriate only if
+   all the libc functions that ld.so uses are called without PLT and always
+   get the versions linked into ld.so rather than the libc ones.  */
+
+#ifdef IS_IN_rtld
+# define RTLD_PRIVATE_ERRNO 1
+#else
+# define RTLD_PRIVATE_ERRNO 0
+#endif
+
+/* Traditionally system calls have been made using break 0x100000.  A
+   second method was introduced which, if possible, will use the EPC
+   instruction.  To signal the presence and where to find the code the
+   kernel passes an AT_SYSINFO_EHDR pointer in the auxiliary vector to
+   the application.  */
+#define NEED_DL_SYSINFO	1
+#define USE_DL_SYSINFO	1
+
+#if defined NEED_DL_SYSINFO && !defined __ASSEMBLER__
+/* Don't declare this as a function---we want its entry-point, not
+   its function descriptor... */
+extern int _dl_sysinfo_break attribute_hidden;
+# define DL_SYSINFO_DEFAULT ((uintptr_t) &_dl_sysinfo_break)
+# define DL_SYSINFO_IMPLEMENTATION		\
+  asm (".text\n\t"				\
+       ".hidden _dl_sysinfo_break\n\t"		\
+       ".proc _dl_sysinfo_break\n\t"		\
+       "_dl_sysinfo_break:\n\t"			\
+       ".prologue\n\t"				\
+       ".altrp b6\n\t"				\
+       ".body\n\t"				\
+       "break 0x100000;\n\t"			\
+       "br.ret.sptk.many b6;\n\t"		\
+       ".endp _dl_sysinfo_break\n\t"		\
+       ".previous");
+#endif
+
+#endif	/* dl-sysdep.h */
Index: nptl/sysdeps/unix/sysv/linux/ia64/lowlevellock.h
--- nptl/sysdeps/unix/sysv/linux/ia64/lowlevellock.h
+++ nptl/sysdeps/unix/sysv/linux/ia64/lowlevellock.h
@@ -26,7 +26,7 @@
 #include <ia64intrin.h>
 #include <atomic.h>
 
-#define SYS_futex		1230
+#define __NR_futex		1230
 #define FUTEX_WAIT		0
 #define FUTEX_WAKE		1
 #define FUTEX_REQUEUE		3
@@ -34,112 +34,52 @@
 /* Initializer for compatibility lock.	*/
 #define LLL_MUTEX_LOCK_INITIALIZER (0)
 
-#define lll_futex_clobbers \
-  "out5", "out6", "out7",						      \
-  /* Non-stacked integer registers, minus r8, r10, r15.  */		      \
-  "r2", "r3", "r9", "r11", "r12", "r13", "r14", "r16", "r17", "r18",	      \
-  "r19", "r20", "r21", "r22", "r23", "r24", "r25", "r26", "r27",	      \
-  "r28", "r29", "r30", "r31",						      \
-  /* Predicate registers.  */						      \
-  "p6", "p7", "p8", "p9", "p10", "p11", "p12", "p13", "p14", "p15",	      \
-  /* Non-rotating fp registers.  */					      \
-  "f6", "f7", "f8", "f9", "f10", "f11", "f12", "f13", "f14", "f15",	      \
-  /* Branch registers.  */						      \
-  "b6", "b7",								      \
-  "memory"
-
 #define lll_futex_wait(futex, val) lll_futex_timed_wait (futex, val, 0)
 
-#define lll_futex_timed_wait(futex, val, timespec) \
-  ({									      \
-     register long int __o0 asm ("out0") = (long int) (futex);		      \
-     register long int __o1 asm ("out1") = FUTEX_WAIT;			      \
-     register int __o2 asm ("out2") = (int) (val);			      \
-     register long int __o3 asm ("out3") = (long int) (timespec);	      \
-     register long int __r8 asm ("r8");					      \
-     register long int __r10 asm ("r10");				      \
-     register long int __r15 asm ("r15") = SYS_futex;			      \
-									      \
-     __asm __volatile ("break %7;;"					      \
-		       : "=r" (__r8), "=r" (__r10), "=r" (__r15),	      \
-			 "=r" (__o0), "=r" (__o1), "=r" (__o2), "=r" (__o3)   \
-		       : "i" (0x100000), "2" (__r15), "3" (__o0), "4" (__o1), \
-		 	 "5" (__o2), "6" (__o3)				      \
-		       : "out4", lll_futex_clobbers);			      \
-     __r10 == -1 ? -__r8 : __r8;					      \
-  })
-
-
-#define lll_futex_wake(futex, nr) \
-  ({									      \
-     register long int __o0 asm ("out0") = (long int) (futex);		      \
-     register long int __o1 asm ("out1") = FUTEX_WAKE;			      \
-     register int __o2 asm ("out2") = (int) (nr);			      \
-     register long int __r8 asm ("r8");					      \
-     register long int __r10 asm ("r10");				      \
-     register long int __r15 asm ("r15") = SYS_futex;			      \
-									      \
-     __asm __volatile ("break %6;;"					      \
-		       : "=r" (__r8), "=r" (__r10), "=r" (__r15),	      \
-			 "=r" (__o0), "=r" (__o1), "=r" (__o2)		      \
-		       : "i" (0x100000), "2" (__r15), "3" (__o0), "4" (__o1), \
-		 	 "5" (__o2)					      \
-		       : "out3", "out4", lll_futex_clobbers);		      \
-     __r10 == -1 ? -__r8 : __r8;					      \
-  })
+#define lll_futex_timed_wait(ftx, val, timespec)			\
+({									\
+   DO_INLINE_SYSCALL(futex, 4, (long) (ftx), FUTEX_WAIT, (int) (val),	\
+		     (long) (timespec));				\
+   _r10 == -1 ? -_retval : _retval;					\
+})
+
+#define lll_futex_wake(ftx, nr)						\
+({									\
+   DO_INLINE_SYSCALL(futex, 3, (long) (ftx), FUTEX_WAKE, (int) (nr));	\
+   _r10 == -1 ? -_retval : _retval;					\
+})
+
+#define lll_futex_requeue(ftx, nr_wake, nr_move, mutex)			     \
+({									     \
+   DO_INLINE_SYSCALL(futex, 5, (long) (ftx), FUTEX_REQUEUE, (int) (nr_wake), \
+		     (int) (nr_move), (long) (mutex));			     \
+   _r10 == -1 ? -_retval : _retval;					     \
+})
 
 
-#define lll_futex_requeue(futex, nr_wake, nr_move, mutex) \
-  ({									      \
-     register long int __o0 asm ("out0") = (long int) (futex);		      \
-     register long int __o1 asm ("out1") = FUTEX_REQUEUE;		      \
-     register int __o2 asm ("out2") = (int) (nr_wake);			      \
-     register int __o3 asm ("out3") = (int) (nr_move);			      \
-     register long int __o4 asm ("out4") = (long int) (mutex);		      \
-     register long int __r8 asm ("r8");					      \
-     register long int __r10 asm ("r10");				      \
-     register long int __r15 asm ("r15") = SYS_futex;			      \
-									      \
-     __asm __volatile ("break %8;;"					      \
-		       : "=r" (__r8), "=r" (__r10), "=r" (__r15),	      \
-			 "=r" (__o0), "=r" (__o1), "=r" (__o2), "=r" (__o3),  \
-			 "=r" (__o4)					      \
-		       : "i" (0x100000), "2" (__r15), "3" (__o0), "4" (__o1), \
-			 "5" (__o2), "6" (__o3), "7" (__o4)		      \
-		       : lll_futex_clobbers);				      \
-     __r10 == -1 ? -__r8 : __r8;					      \
-  })
-
-
-static inline int
-__attribute__ ((always_inline))
-__lll_mutex_trylock (int *futex)
-{
-  return atomic_compare_and_exchange_val_acq (futex, 1, 0) != 0;
-}
+#define __lll_mutex_trylock(futex) \
+  (atomic_compare_and_exchange_val_acq (futex, 1, 0) != 0)
 #define lll_mutex_trylock(futex) __lll_mutex_trylock (&(futex))
 
 
 extern void __lll_lock_wait (int *futex) attribute_hidden;
 
 
-static inline void
-__attribute__ ((always_inline))
-__lll_mutex_lock (int *futex)
-{
-  if (atomic_compare_and_exchange_bool_acq (futex, 1, 0) != 0)
-    __lll_lock_wait (futex);
-}
+#define __lll_mutex_lock(futex)						\
+  ((void) ({								\
+    int *__futex = (futex);						\
+    if (atomic_compare_and_exchange_bool_acq (__futex, 1, 0) != 0)	\
+      __lll_lock_wait (__futex);					\
+  }))
 #define lll_mutex_lock(futex) __lll_mutex_lock (&(futex))
 
 
-static inline void
-__attribute__ ((always_inline))
-__lll_mutex_cond_lock (int *futex)
-{
-  if (atomic_compare_and_exchange_bool_acq (futex, 2, 0) != 0)
-    __lll_lock_wait (futex);
-}
+#define __lll_mutex_cond_lock(futex)					\
+  ((void) ({								\
+    int *__futex = (futex);						\
+    if (atomic_compare_and_exchange_bool_acq (__futex, 2, 0) != 0)	\
+      __lll_lock_wait (__futex);					\
+  }))
 #define lll_mutex_cond_lock(futex) __lll_mutex_cond_lock (&(futex))
 
 
@@ -147,41 +87,37 @@
      attribute_hidden;
 
 
-static inline int
-__attribute__ ((always_inline))
-__lll_mutex_timedlock (int *futex, const struct timespec *abstime)
-{
-  int result = 0;
-
-  if (atomic_compare_and_exchange_bool_acq (futex, 1, 0) != 0)
-    result = __lll_timedlock_wait (futex, abstime);
-
-  return result;
-}
+#define __lll_mutex_timedlock(futex, abstime)				\
+  ({									\
+     int *__futex = (futex);						\
+     int __val = 0;							\
+									\
+     if (atomic_compare_and_exchange_bool_acq (__futex, 1, 0) != 0)	\
+       __val = __lll_timedlock_wait (__futex, abstime);			\
+     __val;								\
+  })
 #define lll_mutex_timedlock(futex, abstime) \
   __lll_mutex_timedlock (&(futex), abstime)
 
 
-static inline void
-__attribute__ ((always_inline))
-__lll_mutex_unlock (int *futex)
-{
-  int val = atomic_exchange_rel (futex, 0);
-
-  if (__builtin_expect (val > 1, 0))
-    lll_futex_wake (futex, 1);
-}
+#define __lll_mutex_unlock(futex)			\
+  ((void) ({						\
+    int *__futex = (futex);				\
+    int __val = atomic_exchange_rel (__futex, 0);	\
+							\
+    if (__builtin_expect (__val > 1, 0))		\
+      lll_futex_wake (__futex, 1);			\
+  }))
 #define lll_mutex_unlock(futex) \
   __lll_mutex_unlock(&(futex))
 
 
-static inline void
-__attribute__ ((always_inline))
-__lll_mutex_unlock_force (int *futex)
-{
-  (void) atomic_exchange_rel (futex, 0);
-  lll_futex_wake (futex, 1);
-}
+#define __lll_mutex_unlock_force(futex)		\
+  ((void) ({					\
+    int *__futex = (futex);			\
+    (void) atomic_exchange_rel (__futex, 0);	\
+    lll_futex_wake (__futex, 1);		\
+  }))
 #define lll_mutex_unlock_force(futex) \
   __lll_mutex_unlock_force(&(futex))
 
Index: nptl/sysdeps/unix/sysv/linux/ia64/sysdep-cancel.h
--- nptl/sysdeps/unix/sysv/linux/ia64/sysdep-cancel.h
+++ nptl/sysdeps/unix/sysv/linux/ia64/sysdep-cancel.h
@@ -29,13 +29,21 @@
 # define PSEUDO(name, syscall_name, args)				      \
 .text;									      \
 ENTRY (name)								      \
-     adds r14 = MULTIPLE_THREADS_OFFSET, r13;;				      \
+     .prologue;								      \
+     adds r2 = SYSINFO_OFFSET, r13;					      \
+     adds r14 = MULTIPLE_THREADS_OFFSET, r13;				      \
+     .save ar.pfs, r11;							      \
+     mov r11 = ar.pfs;;							      \
+     .body;								      \
      ld4 r14 = [r14];							      \
+     ld8 r2 = [r2];							      \
      mov r15 = SYS_ify(syscall_name);;					      \
      cmp4.ne p6, p7 = 0, r14;						      \
-(p6) br.cond.spnt .Lpseudo_cancel;;					      \
-     break __BREAK_SYSCALL;;						      \
-     cmp.eq p6,p0=-1,r10;						      \
+     mov b7 = r2;							      \
+(p6) br.cond.spnt .Lpseudo_cancel;					      \
+     br.call.sptk.many b6 = b7;;					      \
+     mov ar.pfs = r11;							      \
+     cmp.eq p6,p0 = -1, r10;						      \
 (p6) br.cond.spnt.few __syscall_error;					      \
      ret;;								      \
      .endp name;							      \
@@ -48,14 +56,17 @@
      .regstk args, 5, args, 0;						      \
      .save ar.pfs, loc0;						      \
      alloc loc0 = ar.pfs, args, 5, args, 0;				      \
+     adds loc4 = SYSINFO_OFFSET, r13;					      \
      .save rp, loc1;							      \
      mov loc1 = rp;;							      \
      .body;								      \
+     ld8 loc4 = [loc4];							      \
      CENABLE;;								      \
      mov loc2 = r8;							      \
+     mov b7 = loc4;							      \
      COPY_ARGS_##args							      \
      mov r15 = SYS_ify(syscall_name);					      \
-     break __BREAK_SYSCALL;;						      \
+     br.call.sptk.many b6 = b7;;					      \
      mov loc3 = r8;							      \
      mov loc4 = r10;							      \
      mov out0 = loc2;							      \
Index: sysdeps/unix/sysv/linux/ia64/brk.S
--- sysdeps/unix/sysv/linux/ia64/brk.S
+++ sysdeps/unix/sysv/linux/ia64/brk.S
@@ -35,19 +35,17 @@
 weak_alias (__curbrk, ___brk_addr)
 
 LEAF(__brk)
-	mov	r15=__NR_brk
-	break.i	__BREAK_SYSCALL
+	.regstk 1, 0, 0, 0
+	DO_CALL(__NR_brk)
+	cmp.ltu	p6, p0 = ret0, in0
+	addl r9 = @ltoff(__curbrk), gp
 	;;
-	cmp.ltu	p6,p0=ret0,r32	/* r32 is the input register, even though we
-				   haven't allocated a frame */
-	addl	r9=@ltoff(__curbrk),gp
-	;;
-	ld8	r9=[r9]
-(p6) 	mov	ret0=ENOMEM
+	ld8 r9 = [r9]
+(p6) 	mov ret0 = ENOMEM
 (p6)	br.cond.spnt.few __syscall_error
 	;;
-	st8	[r9]=ret0
-	mov 	ret0=0
+	st8 [r9] = ret0
+	mov ret0 = 0
 	ret
 END(__brk)
 
Index: sysdeps/unix/sysv/linux/ia64/clone2.S
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/unix/sysv/linux/ia64/clone2.S,v
retrieving revision 1.7
diff -u -r1.7 clone2.S
--- sysdeps/unix/sysv/linux/ia64/clone2.S	13 Mar 2003 04:36:59 -0000	1.7
+++ sysdeps/unix/sysv/linux/ia64/clone2.S	13 Nov 2003 20:02:48 -0000
@@ -25,49 +25,56 @@
 /* 	         size_t child_stack_size, int flags, void *arg,		*/
 /*	         pid_t *parent_tid, void *tls, pid_t *child_tid)	*/
 
+#define CHILD	p8
+#define PARENT	p9
+
 ENTRY(__clone2)
-	alloc r2=ar.pfs,8,2,6,0
+	.prologue
+	alloc r2=ar.pfs,8,0,6,0
 	cmp.eq p6,p0=0,in0
 	mov r8=EINVAL
-(p6)	br.cond.spnt.few __syscall_error
-	;;
-	flushrs			/* This is necessary, since the child	*/
-				/* will be running with the same 	*/
-				/* register backing store for a few 	*/
-				/* instructions.  We need to ensure	*/
-				/* that it will not read or write the	*/
-				/* backing store.			*/
-	mov loc0=in0		/* save fn	*/
-	mov loc1=in4		/* save arg	*/
 	mov out0=in3		/* Flags are first syscall argument.	*/
 	mov out1=in1		/* Stack address.			*/
+(p6)	br.cond.spnt.many __syscall_error
+	;;
 	mov out2=in2		/* Stack size.				*/
 	mov out3=in5		/* Parent TID Pointer			*/
 	mov out4=in7		/* Child TID Pointer			*/
  	mov out5=in6		/* TLS pointer				*/
-        DO_CALL (SYS_ify (clone2))
+	/*
+	 * clone2() is special: the child cannot execute br.ret right
+	 * after the system call returns, because it starts out
+	 * executing on an empty stack.  Because of this, we can't use
+	 * the new (lightweight) syscall convention here.  Instead, we
+	 * just fall back on always using "break".
+	 *
+	 * Furthermore, since the child starts with an empty stack, we
+	 * need to avoid unwinding past invalid memory.  To that end,
+	 * we'll pretend now that __clone2() is the end of the
+	 * call-chain.  This is wrong for the parent, but only until
+	 * it returns from clone2() but it's better than the
+	 * alternative.
+	 */
+	mov r15=SYS_ify (clone2)
+	.save rp, r0
+	break __BREAK_SYSCALL
+	.body
         cmp.eq p6,p0=-1,r10
+	cmp.eq CHILD,PARENT=0,r8 /* Are we the child?   */
+(p6)	br.cond.spnt.many __syscall_error
 	;;
-(p6)	br.cond.spnt.few __syscall_error
-
-#	define CHILD p6
-#	define PARENT p7
-	cmp.eq CHILD,PARENT=0,r8 /* Are we the child?	*/
-	;;
-(CHILD)	ld8 out1=[loc0],8	/* Retrieve code pointer.	*/
-(CHILD)	mov out0=loc1		/* Pass proper argument	to fn */
+(CHILD)	ld8 out1=[in0],8	/* Retrieve code pointer.	*/
+(CHILD)	mov out0=in4		/* Pass proper argument	to fn */
 (PARENT) ret
 	;;
-	ld8 gp=[loc0]		/* Load function gp.		*/
+	ld8 gp=[in0]		/* Load function gp.		*/
 	mov b6=out1
-	;;
-	br.call.dptk.few rp=b6	/* Call fn(arg) in the child 	*/
+	br.call.dptk.many rp=b6	/* Call fn(arg) in the child 	*/
 	;;
 	mov out0=r8		/* Argument to _exit		*/
 	.globl _exit
-	br.call.dpnt.few rp=_exit /* call _exit with result from fn.	*/
+	br.call.dpnt.many rp=_exit /* call _exit with result from fn.	*/
 	ret			/* Not reached.		*/
-
 PSEUDO_END(__clone2)
 
 /* For now we leave __clone undefined.  This is unlikely to be a	*/
Index: sysdeps/unix/sysv/linux/ia64/getcontext.S
--- sysdeps/unix/sysv/linux/ia64/getcontext.S
+++ sysdeps/unix/sysv/linux/ia64/getcontext.S
@@ -35,26 +35,27 @@
 
 ENTRY(__getcontext)
 	.prologue
-	alloc r16 = ar.pfs, 1, 0, 4, 0
+	.body
+	alloc r11 = ar.pfs, 1, 0, 4, 0
 
 	// sigprocmask (SIG_BLOCK, NULL, &sc->sc_mask):
 
-	mov r2 = SC_MASK
-	mov r15 = __NR_rt_sigprocmask
-	;;
+	mov r3 = SC_MASK
 	mov out0 = SIG_BLOCK
-	mov out1 = 0
-	add out2 = r2, in0
-	mov out3 = 8	// sizeof kernel sigset_t
 
-	break __BREAK_SYSCALL
 	flushrs					// save dirty partition on rbs
+	mov out1 = 0
+	add out2 = r3, in0
+
+	mov out3 = 8	// sizeof kernel sigset_t
+	DO_CALL(__NR_rt_sigprocmask)
 
 	mov.m rFPSR = ar.fpsr
 	mov.m rRSC = ar.rsc
 	add r2 = SC_GR+1*8, r32
 	;;
 	mov.m rBSP = ar.bsp
+	.prologue
 	.save ar.unat, rUNAT
 	mov.m rUNAT = ar.unat
 	.body
@@ -63,7 +64,7 @@
 
 .mem.offset 0,0; st8.spill [r2] = r1, (5*8 - 1*8)
 .mem.offset 8,0; st8.spill [r3] = r4, 16
-	mov.i rPFS = ar.pfs
+	mov rPFS = r11
 	;;
 .mem.offset 0,0; st8.spill [r2] = r5, 16
 .mem.offset 8,0; st8.spill [r3] = r6, 48
Index: sysdeps/unix/sysv/linux/ia64/setcontext.S
--- sysdeps/unix/sysv/linux/ia64/setcontext.S
+++ sysdeps/unix/sysv/linux/ia64/setcontext.S
@@ -32,20 +32,21 @@
   other than the PRESERVED state.  */
 
 ENTRY(__setcontext)
-	alloc r16 = ar.pfs, 1, 0, 4, 0
+	.prologue
+	.body
+	alloc r11 = ar.pfs, 1, 0, 4, 0
 
 	// sigprocmask (SIG_SETMASK, &sc->sc_mask, NULL):
 
-	mov r2 = SC_MASK
-	mov r15 = __NR_rt_sigprocmask
-	;;
+	mov r3 = SC_MASK
 	mov out0 = SIG_SETMASK
-	add out1 = r2, in0
+	;;
+	add out1 = r3, in0
 	mov out2 = 0
 	mov out3 = 8	// sizeof kernel sigset_t
 
 	invala
-	break __BREAK_SYSCALL
+	DO_CALL(__NR_rt_sigprocmask)
 	add r2 = SC_NAT, r32
 
 	add r3 = SC_RNAT, r32			// r3 <- &sc_ar_rnat
Index: sysdeps/unix/sysv/linux/ia64/sysdep.h
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/unix/sysv/linux/ia64/sysdep.h,v
retrieving revision 1.17
diff -u -r1.17 sysdep.h
--- sysdeps/unix/sysv/linux/ia64/sysdep.h	16 Aug 2003 08:00:24 -0000	1.17
+++ sysdeps/unix/sysv/linux/ia64/sysdep.h	13 Nov 2003 20:02:48 -0000
@@ -23,6 +23,8 @@
 
 #include <sysdeps/unix/sysdep.h>
 #include <sysdeps/ia64/sysdep.h>
+#include <dl-sysdep.h>
+#include <tls.h>
 
 /* For Linux we can use the system call table in the header file
 	/usr/include/asm/unistd.h
@@ -51,6 +53,14 @@
 # define __NR_semtimedop 1247
 #endif
 
+#if defined USE_DL_SYSINFO \
+	&& (!defined NOT_IN_libc \
+	    || defined IS_IN_libpthread || defined IS_IN_librt)
+# define IA64_USE_NEW_STUB
+#else
+# undef IA64_USE_NEW_STUB
+#endif
+
 #ifdef __ASSEMBLER__
 
 #undef CALL_MCOUNT
@@ -95,9 +105,41 @@
 	cmp.eq p6,p0=-1,r10;			\
 (p6)	br.cond.spnt.few __syscall_error;
 
-#define DO_CALL(num)				\
+#define DO_CALL_VIA_BREAK(num)			\
 	mov r15=num;				\
-	break __BREAK_SYSCALL;
+	break __BREAK_SYSCALL
+
+#ifdef IA64_USE_NEW_STUB
+# ifdef SHARED
+#  define DO_CALL(num)				\
+	.prologue;				\
+	adds r2 = SYSINFO_OFFSET, r13;;		\
+	ld8 r2 = [r2];				\
+	.save ar.pfs, r11;			\
+	mov r11 = ar.pfs;;			\
+	.body;					\
+	mov r15 = num;				\
+	mov b7 = r2;				\
+	br.call.sptk.many b6 = b7;;		\
+	.restore sp;				\
+	mov ar.pfs = r11
+# else /* !SHARED */
+#  define DO_CALL(num)				\
+	.prologue;				\
+	mov r15 = num;				\
+	movl r2 = _dl_sysinfo;;			\
+	ld8 r2 = [r2];				\
+	.save ar.pfs, r11;			\
+	mov r11 = ar.pfs;;			\
+	.body;					\
+	mov b7 = r2;				\
+	br.call.sptk.many b6 = b7;;		\
+	.restore sp;				\
+	mov ar.pfs = r11
+# endif
+#else
+# define DO_CALL(num)				DO_CALL_VIA_BREAK(num)
+#endif
 
 #undef PSEUDO_END
 #define PSEUDO_END(name)	.endp C_SYMBOL_NAME(name);
@@ -143,45 +185,64 @@
    from a syscall.  r10 is set to -1 on error, whilst r8 contains the
    (non-negative) errno on error or the return value on success.
  */
-#undef INLINE_SYSCALL
-#define INLINE_SYSCALL(name, nr, args...)			\
-  ({								\
+
+#ifdef IA64_USE_NEW_STUB
+
+#define DO_INLINE_SYSCALL(name, nr, args...)					\
+    register long _r8 __asm ("r8");						\
+    register long _r10 __asm ("r10");						\
+    register long _r15 __asm ("r15") = __NR_##name;				\
+    register void *_b7 __asm ("b7") = ((tcbhead_t *) __thread_self)->private;	\
+    long _retval;								\
+    LOAD_ARGS_##nr (args);							\
+    /*										\
+     * Don't specify any unwind info here.  We mark ar.pfs as			\
+     * clobbered.  This will force the compiler to save ar.pfs			\
+     * somewhere and emit appropriate unwind info for that save.		\
+     */										\
+    __asm __volatile ("br.call.sptk.many b6=%0;;\n"				\
+		      : "=b"(_b7), "=r" (_r8), "=r" (_r10), "=r" (_r15)		\
+			ASM_OUTARGS_##nr					\
+		      : "0" (_b7), "3" (_r15) ASM_ARGS_##nr			\
+		      : "memory", "ar.pfs" ASM_CLOBBERS_##nr);			\
+    _retval = _r8;
+
+#else /* !IA64_USE_NEW_STUB */
+
+#define DO_INLINE_SYSCALL(name, nr, args...)			\
     register long _r8 asm ("r8");				\
     register long _r10 asm ("r10");				\
     register long _r15 asm ("r15") = __NR_##name;		\
     long _retval;						\
     LOAD_ARGS_##nr (args);					\
     __asm __volatile (BREAK_INSN (__BREAK_SYSCALL)		\
-                      : "=r" (_r8), "=r" (_r10), "=r" (_r15)	\
+		      : "=r" (_r8), "=r" (_r10), "=r" (_r15)	\
 			ASM_OUTARGS_##nr			\
-                      : "2" (_r15) ASM_ARGS_##nr		\
-                      : "memory" ASM_CLOBBERS_##nr);		\
-    _retval = _r8;						\
-    if (_r10 == -1)						\
-      {								\
-        __set_errno (_retval);					\
-        _retval = -1;						\
-      }								\
+		      : "2" (_r15) ASM_ARGS_##nr		\
+		      : "memory" ASM_CLOBBERS_##nr);		\
+    _retval = _r8;
+
+#endif /* !IA64_USE_NEW_STUB */
+
+#undef INLINE_SYSCALL
+#define INLINE_SYSCALL(name, nr, args...)	\
+  ({						\
+    DO_INLINE_SYSCALL(name, nr, args)		\
+    if (_r10 == -1)				\
+      {						\
+	__set_errno (_retval);			\
+	_retval = -1;				\
+      }						\
     _retval; })
 
 #undef INTERNAL_SYSCALL_DECL
 #define INTERNAL_SYSCALL_DECL(err) long int err
 
 #undef INTERNAL_SYSCALL
-#define INTERNAL_SYSCALL(name, err, nr, args...)		\
-  ({								\
-    register long _r8 asm ("r8");				\
-    register long _r10 asm ("r10");				\
-    register long _r15 asm ("r15") = __NR_##name;		\
-    long _retval;						\
-    LOAD_ARGS_##nr (args);					\
-    __asm __volatile (BREAK_INSN (__BREAK_SYSCALL)		\
-                      : "=r" (_r8), "=r" (_r10), "=r" (_r15)	\
-			ASM_OUTARGS_##nr			\
-                      : "2" (_r15) ASM_ARGS_##nr		\
-                      : "memory" ASM_CLOBBERS_##nr);		\
-    _retval = _r8;						\
-    err = _r10;							\
+#define INTERNAL_SYSCALL(name, err, nr, args...)	\
+  ({							\
+    DO_INLINE_SYSCALL(name, nr, args)			\
+    err = _r10;						\
     _retval; })
 
 #undef INTERNAL_SYSCALL_ERROR_P
@@ -218,6 +279,15 @@
 #define ASM_OUTARGS_5	ASM_OUTARGS_4, "=r" (_out4)
 #define ASM_OUTARGS_6	ASM_OUTARGS_5, "=r" (_out5)
 
+#ifdef IA64_USE_NEW_STUB
+#define ASM_ARGS_0
+#define ASM_ARGS_1	ASM_ARGS_0, "4" (_out0)
+#define ASM_ARGS_2	ASM_ARGS_1, "5" (_out1)
+#define ASM_ARGS_3	ASM_ARGS_2, "6" (_out2)
+#define ASM_ARGS_4	ASM_ARGS_3, "7" (_out3)
+#define ASM_ARGS_5	ASM_ARGS_4, "8" (_out4)
+#define ASM_ARGS_6	ASM_ARGS_5, "9" (_out5)
+#else
 #define ASM_ARGS_0
 #define ASM_ARGS_1	ASM_ARGS_0, "3" (_out0)
 #define ASM_ARGS_2	ASM_ARGS_1, "4" (_out1)
@@ -225,6 +295,7 @@
 #define ASM_ARGS_4	ASM_ARGS_3, "6" (_out3)
 #define ASM_ARGS_5	ASM_ARGS_4, "7" (_out4)
 #define ASM_ARGS_6	ASM_ARGS_5, "8" (_out5)
+#endif
 
 #define ASM_CLOBBERS_0	ASM_CLOBBERS_1, "out0"
 #define ASM_CLOBBERS_1	ASM_CLOBBERS_2, "out1"
@@ -232,7 +303,7 @@
 #define ASM_CLOBBERS_3	ASM_CLOBBERS_4, "out3"
 #define ASM_CLOBBERS_4	ASM_CLOBBERS_5, "out4"
 #define ASM_CLOBBERS_5	ASM_CLOBBERS_6, "out5"
-#define ASM_CLOBBERS_6	, "out6", "out7",				\
+#define ASM_CLOBBERS_6_COMMON	, "out6", "out7",			\
   /* Non-stacked integer registers, minus r8, r10, r15.  */		\
   "r2", "r3", "r9", "r11", "r12", "r13", "r14", "r16", "r17", "r18",	\
   "r19", "r20", "r21", "r22", "r23", "r24", "r25", "r26", "r27",	\
@@ -242,7 +313,13 @@
   /* Non-rotating fp registers.  */					\
   "f6", "f7", "f8", "f9", "f10", "f11", "f12", "f13", "f14", "f15",	\
   /* Branch registers.  */						\
-  "b6", "b7"
+  "b6"
+
+#ifdef IA64_USE_NEW_STUB
+# define ASM_CLOBBERS_6	ASM_CLOBBERS_6_COMMON
+#else
+# define ASM_CLOBBERS_6	ASM_CLOBBERS_6_COMMON , "b7"
+#endif
 
 #endif /* not __ASSEMBLER__ */
 
Index: sysdeps/unix/sysv/linux/ia64/vfork.S
--- sysdeps/unix/sysv/linux/ia64/vfork.S
+++ sysdeps/unix/sysv/linux/ia64/vfork.S
@@ -34,9 +34,8 @@
 	mov out0=CLONE_VM+CLONE_VFORK+SIGCHLD
 	mov out1=0		/* Standard sp value.			*/
 	;;
-	DO_CALL (SYS_ify (clone))
+	DO_CALL_VIA_BREAK (SYS_ify (clone))
 	cmp.eq p6,p0=-1,r10
-	;;
 (p6)	br.cond.spnt.few __syscall_error
 	ret
 PSEUDO_END(__vfork)
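
For readers following the patch: both DO_INLINE_SYSCALL variants rely on
the same return convention (r10 is -1 on error, r8 holds either the
result or the positive errno value).  The portable C sketch below only
models that convention.  The names struct raw_result, decode_lll and
decode_inline are hypothetical and not part of glibc, and the register
pair is simulated with a plain struct.

```c
#include <errno.h>

/* Hypothetical stand-in for the r8/r10 register pair after a syscall.  */
struct raw_result
{
  long r8;   /* result on success, positive errno value on failure */
  long r10;  /* -1 on failure, 0 on success */
};

/* Mirrors "_r10 == -1 ? -_retval : _retval" from the lll_futex_*
   macros: a failure is reported as a negative errno value.  */
long
decode_lll (struct raw_result r)
{
  return r.r10 == -1 ? -r.r8 : r.r8;
}

/* Mirrors INLINE_SYSCALL: a failure sets errno and yields -1.  */
long
decode_inline (struct raw_result r)
{
  if (r.r10 == -1)
    {
      errno = (int) r.r8;
      return -1;
    }
  return r.r8;
}
```

INTERNAL_SYSCALL differs from both: it hands r10 back through its err
argument and leaves the decoding to the caller.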

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: new syscall stub support for ia64 libc
  2003-11-13 23:45                           ` David Mosberger
@ 2003-11-14  1:44                             ` Ulrich Drepper
  2003-11-14  1:54                               ` David Mosberger
  2003-11-14  2:18                               ` David Mosberger
  0 siblings, 2 replies; 98+ messages in thread
From: Ulrich Drepper @ 2003-11-14  1:44 UTC (permalink / raw)
  To: davidm; +Cc: Jakub Jelinek, libc-hacker

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

David Mosberger wrote:
> To cover all bases, here is an updated patch with Jakub's version of
> the elf/rtld.c check.

The patch fails on kernels without sysinfo support.  Compilation
stops when rpcgen is used.

- -- 
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/tDGz2ijCOnn/RHQRAv4NAKC91f1diYXnK7gTS44tl5SKiFBZuQCffC3R
DUQiqVou65EO+jEy3D+YMsE=
=Rozj
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: new syscall stub support for ia64 libc
  2003-11-14  1:44                             ` Ulrich Drepper
@ 2003-11-14  1:54                               ` David Mosberger
  2003-11-14  2:18                               ` David Mosberger
  1 sibling, 0 replies; 98+ messages in thread
From: David Mosberger @ 2003-11-14  1:54 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: davidm, Jakub Jelinek, libc-hacker

>>>>> On Thu, 13 Nov 2003 17:36:51 -0800, Ulrich Drepper <drepper@redhat.com> said:

  Uli> David Mosberger wrote:
  >> To cover all bases, here is an updated patch with Jakub's version
  >> of the elf/rtld.c check.

  Uli> The patch fails on kernels without sysinfo support.
  Uli> Compilation stops when rpcgen is used.

Hmmh, that's odd.  Let me check.

	--david

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: new syscall stub support for ia64 libc
  2003-11-14  1:44                             ` Ulrich Drepper
  2003-11-14  1:54                               ` David Mosberger
@ 2003-11-14  2:18                               ` David Mosberger
  2003-11-14  2:57                                 ` Ulrich Drepper
  1 sibling, 1 reply; 98+ messages in thread
From: David Mosberger @ 2003-11-14  2:18 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: davidm, Jakub Jelinek, libc-hacker

>>>>> On Thu, 13 Nov 2003 17:36:51 -0800, Ulrich Drepper <drepper@redhat.com> said:

  Uli> David Mosberger wrote:
  >> To cover all bases, here is an updated patch with Jakub's version
  >> of the elf/rtld.c check.

  Uli> The patch fails on kernels without sysinfo support.
  Uli> Compilation stops when rpcgen is used.

Is this with linuxthreads?  I don't think nptl can work on 2.4
kernels.  Can you send me the configure-line you're using?

	--david

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: new syscall stub support for ia64 libc
  2003-11-14  2:18                               ` David Mosberger
@ 2003-11-14  2:57                                 ` Ulrich Drepper
  2003-11-14  3:22                                   ` David Mosberger
  0 siblings, 1 reply; 98+ messages in thread
From: Ulrich Drepper @ 2003-11-14  2:57 UTC (permalink / raw)
  To: davidm; +Cc: Jakub Jelinek, libc-hacker

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

David Mosberger wrote:

> Is this with linuxthreads?  I don't think nptl can work on 2.4
> kernels.  Can you send me the configure-line you're using?

The RH kernels have NPTL support.

configure --prefix=/usr --enable-add-ons=nptl --disable-profile
- --enable-kernel=current --with-tls

This line always works.

- -- 
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/tELF2ijCOnn/RHQRAnyPAJ9+P134prm1AmZ8tiyRue8Yb4m4VwCfb4Jo
fOIcpsqz+H4HUd9OZJ1daIU=
=OFNW
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: new syscall stub support for ia64 libc
  2003-11-14  2:57                                 ` Ulrich Drepper
@ 2003-11-14  3:22                                   ` David Mosberger
  2003-11-14  3:39                                     ` Ulrich Drepper
  2003-11-14  5:29                                     ` Ulrich Drepper
  0 siblings, 2 replies; 98+ messages in thread
From: David Mosberger @ 2003-11-14  3:22 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: davidm, Jakub Jelinek, libc-hacker

>>>>> On Thu, 13 Nov 2003 18:49:41 -0800, Ulrich Drepper <drepper@redhat.com> said:

  Ulrich> David Mosberger wrote:

  >> Is this with linuxthreads?  I don't think nptl can work on 2.4
  >> kernels.  Can you send me the configure-line you're using?

  Ulrich> The RH kernels have NPTL support.

How could I forget... ;-)

  Ulrich> configure --prefix=/usr --enable-add-ons=nptl
  Ulrich> --disable-profile --enable-kernel=current --with-tls

Thanks.

I hacked a 2.6 kernel to not pass the sysinfo aux vector entries.  The
build worked fine for me.  There is nothing unusual on your end, I
assume?
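
Whether a given kernel passes the sysinfo entry can also be checked
from user space.  The sketch below assumes a current Linux system with
glibc 2.16 or later, where getauxval() is available (it postdates this
thread); kernel_passes_sysinfo_ehdr is a hypothetical helper name.

```c
#include <elf.h>        /* AT_SYSINFO_EHDR */
#include <sys/auxv.h>   /* getauxval, glibc 2.16 and later */

/* Return 1 if the kernel put an AT_SYSINFO_EHDR entry into the
   auxiliary vector of this process, 0 otherwise.  getauxval()
   returns 0 when the requested entry is absent.  */
int
kernel_passes_sysinfo_ehdr (void)
{
  return getauxval (AT_SYSINFO_EHDR) != 0;
}
```

On a kernel hacked as described above this would be expected to
return 0, in which case libc has to fall back to the break-based stub.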

	--david

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: new syscall stub support for ia64 libc
  2003-11-14  3:22                                   ` David Mosberger
@ 2003-11-14  3:39                                     ` Ulrich Drepper
  2003-11-14  5:29                                     ` Ulrich Drepper
  1 sibling, 0 replies; 98+ messages in thread
From: Ulrich Drepper @ 2003-11-14  3:39 UTC (permalink / raw)
  To: davidm; +Cc: Jakub Jelinek, libc-hacker

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

David Mosberger wrote:

> I hacked a 2.6 kernel to not pass the sysinfo aux vector entries.  The
> build worked fine for me.  There is nothing unusual on your end, I
> assume?

The kernel works fine with NPTL and the current libc before the patch.
That should qualify as nothing unusual.  There certainly is no code
which sets AT_SYSINFO etc.

What I found so far is that it fails inside libc, not ld.so.  I.e.,
the startup has already left ld.so when it bombs.  Specifically, the test
program I have (io/pwd, one of the simplest programs there is) fails in
the printf code.  The fstat(1,...) is executed, then it fails.  In a
correct program the next step would be an mmap() syscall.  mmap itself
isn't broken since ld.so already performed a ton of those calls (though
not with exactly the same code; it uses the copy in ld.so itself).

- -- 
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/tEyj2ijCOnn/RHQRAhW6AKDF6gLufh9j1aDildXCxHRcqKzDBgCfVnRW
zJIH1hY/S0/0ryyasVegiVk=
=aKCO
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: new syscall stub support for ia64 libc
  2003-11-14  3:22                                   ` David Mosberger
  2003-11-14  3:39                                     ` Ulrich Drepper
@ 2003-11-14  5:29                                     ` Ulrich Drepper
  2003-11-14  5:49                                       ` David Mosberger
  1 sibling, 1 reply; 98+ messages in thread
From: Ulrich Drepper @ 2003-11-14  5:29 UTC (permalink / raw)
  To: davidm; +Cc: Jakub Jelinek, libc-hacker

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

The problem is that r32 in _IO_file_doallocate gets corrupted when
_IO_file_stat is called.  The latter makes a syscall.  Seems some of the
asm wizardry associated with INLINE_SYSCALL is going wrong.

- -- 
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/tGZm2ijCOnn/RHQRAi6WAJ4rHKu4oxFmtXnVHde8/cJWv6ZS9gCgv83K
2ac9BuG/4jSYdW+pfhZgG6s=
=QlEk
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: new syscall stub support for ia64 libc
  2003-11-14  5:29                                     ` Ulrich Drepper
@ 2003-11-14  5:49                                       ` David Mosberger
  2003-11-14  6:04                                         ` Ulrich Drepper
  0 siblings, 1 reply; 98+ messages in thread
From: David Mosberger @ 2003-11-14  5:49 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: davidm, Jakub Jelinek, libc-hacker

>>>>> On Thu, 13 Nov 2003 21:21:42 -0800, Ulrich Drepper <drepper@redhat.com> said:

  Uli> The problem is that r32 in _IO_file_doallocate gets corrupted
  Uli> when _IO_file_stat is called.  The latter makes a syscall.
  Uli> Seems some of the asm wizardry associated with INLINE_SYSCALL
  Uli> is going wrong.

Ah, perhaps ar.pfs didn't get preserved.  Earlier versions of GCC
ignored clobbers to ar.pfs, but I don't remember exactly when it was
fixed.
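
For illustration, here is a minimal sketch of what such a clobber looks
like at the source level.  It is not the glibc INLINE_SYSCALL macro: the
function name call_stub_placeholder is hypothetical, and the empty asm
body only stands in for the branch into the syscall stub.  The point is
the "ar.pfs" entry in the clobber list, which a compiler that honors it
should treat as a request to preserve ar.pfs around the statement.

```c
/* Hypothetical helper, not taken from glibc.  The asm body is empty on
   purpose; a real stub call would place its instructions there.  */
static long call_stub_placeholder(long arg)
{
#if defined(__ia64__)
    /* Declaring "ar.pfs" as clobbered tells GCC that the asm statement
       may overwrite it, so the compiler has to save and restore it.  */
    __asm__ __volatile__("" : "+r"(arg) : : "ar.pfs", "memory");
#else
    /* Other targets have no ar.pfs register; keep the sketch buildable.  */
    __asm__ __volatile__("" : "+r"(arg) : : "memory");
#endif
    return arg;
}
```

Since the asm body is empty, the helper simply returns its argument; any
difference shows up only in the prologue and epilogue the compiler emits.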

	--david

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: new syscall stub support for ia64 libc
  2003-11-14  5:49                                       ` David Mosberger
@ 2003-11-14  6:04                                         ` Ulrich Drepper
  2003-11-14  6:43                                           ` David Mosberger
  0 siblings, 1 reply; 98+ messages in thread
From: Ulrich Drepper @ 2003-11-14  6:04 UTC (permalink / raw)
  To: davidm; +Cc: Jakub Jelinek, libc-hacker

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

David Mosberger wrote:

> Ah, perhaps ar.pfs didn't get preserved.  Earlier versions of GCC
> ignored clobbers to ar.pfs, but I don't remember exactly when it was
> fixed.

Looks like it.  This is the entire fxstat.os file:

0000000000000000 <__GI___fxstat>:
   0:   0c 80 2c 06 80 05       [MFI]       alloc r16=ar.pfs,11,3,0
   6:   00 00 00 02 00 c0                   nop.f 0x0
   c:   81 68 00 84                         adds r14=8,r13
  10:   02 20 01 44 00 21       [MII]       mov r36=r34
  16:   30 02 84 2c 00 00                   sxt4 r35=r33;;
  1c:   00 00 04 00                         nop.i 0x0
  20:   03 70 00 1c 18 10       [MII]       ld8 r14=[r14]
  26:   f0 e0 01 12 48 e0                   mov r15=1212;;
  2c:   e0 08 00 07                         mov b7=r14;;
  30:   1d 00 00 00 01 00       [MFB]       nop.m 0x0
  36:   00 00 00 02 00 c0                   nop.f 0x0
  3c:   78 00 80 10                         br.call.sptk.many b6=b7;;
  40:   0d 38 fc 15 06 3b       [MFI]       cmp.eq p7,p6=-1,r10
                        42: LTOFF_TPREL22       __libc_errno
  46:   00 00 00 02 00 c0                   nop.f 0x0
  4c:   01 08 00 90                         addl r14=0,r1;;
  50:   eb 70 00 1c 18 d0       [MMI] (p07) ld8 r14=[r14];;
  56:   e1 70 34 00 40 00             (p07) add r14=r14,r13
  5c:   00 00 04 00                         nop.i 0x0;;
  60:   f1 00 20 1c 90 d1       [MIB] (p07) st4 [r14]=r8
  66:   81 f8 f3 ff 4f 80             (p07) mov r8=-1
  6c:   08 00 84 00                         br.ret.sptk.many b0;;



The compiler used is gcc 3.2 based.
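
The tail of the listing above appears to follow the ia64 Linux syscall
return convention: r10 is -1 on failure, in which case r8 holds the
error code, which is stored into errno before -1 is returned.  A sketch
of that logic in C, with hypothetical names (ia64_syscall_regs,
ia64_syscall_result) and the registers modeled as plain struct fields:

```c
#include <errno.h>

/* Hypothetical model of the two result registers after a syscall.  */
struct ia64_syscall_regs {
    long r8;   /* return value, or error code on failure */
    long r10;  /* -1 on failure, 0 on success */
};

/* Mirrors "cmp.eq p7,p6=-1,r10" followed by the predicated errno store
   and "mov r8=-1" in the listing.  */
static long ia64_syscall_result(struct ia64_syscall_regs regs)
{
    if (regs.r10 == -1) {
        errno = (int) regs.r8;
        return -1;
    }
    return regs.r8;
}
```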

- -- 
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/tG382ijCOnn/RHQRAnQTAKDJB281wJ0x+5ou42icJi/quyf0MgCeNTVY
XnJzUJBVUYudG0CypDEEonw=
=ltUp
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: new syscall stub support for ia64 libc
  2003-11-14  6:04                                         ` Ulrich Drepper
@ 2003-11-14  6:43                                           ` David Mosberger
  2003-11-14 19:53                                             ` Ulrich Drepper
  0 siblings, 1 reply; 98+ messages in thread
From: David Mosberger @ 2003-11-14  6:43 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: davidm, Jakub Jelinek, libc-hacker

>>>>> On Thu, 13 Nov 2003 21:54:04 -0800, Ulrich Drepper <drepper@redhat.com> said:

  >> Ah, perhaps ar.pfs didn't get preserved.  Earlier versions of GCC
  >> ignored clobbers to ar.pfs, but I don't remember exactly when it
  >> was fixed.

  Uli> Looks like it.

Bingo!
With gcc-3.3.2, it looks like this:

0000000000000000 <__GI___fxstat>:
   0:   00 18 31 08 80 05       [MII]       alloc r35=ar.pfs,12,4,0
   6:   f0 e0 01 12 48 c0                   mov r15=1212
   c:   81 68 00 84                         adds r14=8,r13
  10:   02 28 01 44 00 21       [MII]       mov r37=r34
  16:   40 02 84 2c 00 00                   sxt4 r36=r33;;
  1c:   00 00 04 00                         nop.i 0x0
  20:   0b 70 00 1c 18 10       [MMI]       ld8 r14=[r14];;
  26:   00 00 00 02 00 e0                   nop.m 0x0
  2c:   e0 08 00 07                         mov b7=r14;;
  30:   1d 00 00 00 01 00       [MFB]       nop.m 0x0
  36:   00 00 00 02 00 c0                   nop.f 0x0
  3c:   78 00 80 10                         br.call.sptk.many b6=b7;;
  40:   03 70 00 02 00 24       [MII]       addl r14=0,r1
  46:   00 18 01 55 00 e0                   mov.i ar.pfs=r35;;
  4c:   f0 57 18 ec                         cmp.eq p7,p6=-1,r10;;
  50:   eb 70 00 1c 18 d0       [MMI] (p07) ld8 r14=[r14];;
  56:   e1 70 34 00 40 00             (p07) add r14=r14,r13
  5c:   00 00 04 00                         nop.i 0x0;;
  60:   f1 00 20 1c 90 d1       [MIB] (p07) st4 [r14]=r8
  66:   81 f8 f3 ff 4f 80             (p07) mov r8=-1
  6c:   08 00 84 00                         br.ret.sptk.many b0;;

  Uli> The compiler used is gcc 3.2 based.

OK, it may well be that it was fixed between 3.2 and 3.3.  Let me see
if I can track down the exact patch.

Ah, it's this one:

2003-04-25  Richard Henderson  <rth@redhat.com>

        * config/ia64/ia64.c (ia64_compute_frame_size): Allow inline asm
        to clobber ar.pfs and ar.unat.
        (ia64_expand_prologue): Force alloc instruction if ar.pfs saved;
        fix test for spilling ar.pfs to the stack.

I attached the diff below.

	--david

Index: config/ia64/ia64.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/config/ia64/ia64.c,v
retrieving revision 1.220
retrieving revision 1.221
diff -u -r1.220 -r1.221
--- config/ia64/ia64.c	24 Apr 2003 17:23:52 -0000	1.220
+++ config/ia64/ia64.c	25 Apr 2003 21:02:25 -0000	1.221
@@ -1878,6 +1878,17 @@
 	  spill_size += 8;
 	  n_spilled += 1;
 	}
+
+      if (regs_ever_live[AR_PFS_REGNUM])
+	{
+	  SET_HARD_REG_BIT (mask, AR_PFS_REGNUM);
+	  current_frame_info.reg_save_ar_pfs = find_gr_spill (1);
+	  if (current_frame_info.reg_save_ar_pfs == 0)
+	    {
+	      extra_spill_size += 8;
+	      n_spilled += 1;
+	    }
+	}
     }
 
   /* Unwind descriptor hackery: things are most efficient if we allocate
@@ -1916,8 +1927,10 @@
     }
 
   /* If we're forced to use st8.spill, we're forced to save and restore
-     ar.unat as well.  */
-  if (spilled_gr_p || cfun->machine->n_varargs)
+     ar.unat as well.  The check for existing liveness allows inline asm
+     to touch ar.unat.  */
+  if (spilled_gr_p || cfun->machine->n_varargs
+      || regs_ever_live[AR_UNAT_REGNUM])
     {
       regs_ever_live[AR_UNAT_REGNUM] = 1;
       SET_HARD_REG_BIT (mask, AR_UNAT_REGNUM);
@@ -2378,7 +2391,8 @@
   /* We don't need an alloc instruction if we've used no outputs or locals.  */
   if (current_frame_info.n_local_regs == 0
       && current_frame_info.n_output_regs == 0
-      && current_frame_info.n_input_regs <= current_function_args_info.int_regs)
+      && current_frame_info.n_input_regs <= current_function_args_info.int_regs
+      && !TEST_HARD_REG_BIT (current_frame_info.mask, AR_PFS_REGNUM))
     {
       /* If there is no alloc, but there are input registers used, then we
 	 need a .regstk directive.  */
@@ -2540,8 +2554,8 @@
   /* The alloc insn already copied ar.pfs into a general register.  The
      only thing we have to do now is copy that register to a stack slot
      if we'd not allocated a local register for the job.  */
-  if (current_frame_info.reg_save_ar_pfs == 0
-      && ! current_function_is_leaf)
+  if (TEST_HARD_REG_BIT (current_frame_info.mask, AR_PFS_REGNUM)
+      && current_frame_info.reg_save_ar_pfs == 0)
     {
       reg = gen_rtx_REG (DImode, AR_PFS_REGNUM);
       do_spill (gen_movdi_x, ar_pfs_save_reg, cfa_off, reg);

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: new syscall stub support for ia64 libc
  2003-11-14  6:43                                           ` David Mosberger
@ 2003-11-14 19:53                                             ` Ulrich Drepper
  2003-11-14 19:56                                               ` David Mosberger
                                                                 ` (2 more replies)
  0 siblings, 3 replies; 98+ messages in thread
From: Ulrich Drepper @ 2003-11-14 19:53 UTC (permalink / raw)
  To: davidm; +Cc: Jakub Jelinek, libc-hacker

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

With a new compiler I get past that point and programs in general work.
But several NPTL tests fail now.

- -- 
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/tTET2ijCOnn/RHQRAsaVAJ44Mz28wVQLadx5uJFkVVk3dLZvjQCfR1OZ
J+RyJgE6Fo+jeAUUKRXQyOI=
=F3pa
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: new syscall stub support for ia64 libc
  2003-11-14 19:53                                             ` Ulrich Drepper
@ 2003-11-14 19:56                                               ` David Mosberger
  2003-11-14 20:36                                                 ` Ulrich Drepper
  2003-11-14 20:13                                               ` patch to fix unwind info for ia64 David Mosberger
  2003-11-14 20:21                                               ` David Mosberger
  2 siblings, 1 reply; 98+ messages in thread
From: David Mosberger @ 2003-11-14 19:56 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: davidm, Jakub Jelinek, libc-hacker

>>>>> On Fri, 14 Nov 2003 11:46:27 -0800, Ulrich Drepper <drepper@redhat.com> said:

  Uli> With a new compiler I get past that point and programs in
  Uli> general work.

Great!

  Uli> But several NPTL tests fail now.

Which tests specifically?

	--david

^ permalink raw reply	[flat|nested] 98+ messages in thread

* patch to fix unwind info for ia64
  2003-11-14 19:53                                             ` Ulrich Drepper
  2003-11-14 19:56                                               ` David Mosberger
@ 2003-11-14 20:13                                               ` David Mosberger
  2003-11-14 20:21                                               ` David Mosberger
  2 siblings, 0 replies; 98+ messages in thread
From: David Mosberger @ 2003-11-14 20:13 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: libc-hacker

I ran Harish Patil's unwcheck script on libc.so and found several
routines which had bad unwind info (number of instructions covered by
unwind info didn't match actual length of function).  The problems
were due to a known bug in GAS which causes bad unwind info when
.align is used between a .proc/.endp-pair.  Unfortunately, this is not
an easy bug to fix, so, for now, we may just have to work around the
problem in the few assembly files that have this issue.  The patch
below does that.  The change to pipe.S may seem unrelated but besides
resulting in better code, the change also fixes a similar kind of bad
unwind-info problem for pipe.S that would show up once the new
syscall-stub patch is applied.
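
The kind of consistency check described above can be sketched as
follows.  This is not the unwcheck script itself; the function names are
hypothetical, and the sketch assumes that unwind region lengths are
counted in instruction slots and that every 16-byte ia64 bundle holds
three slots.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { IA64_BUNDLE_BYTES = 16, IA64_SLOTS_PER_BUNDLE = 3 };

/* Number of instruction slots in the address range [start, end).  */
static uint64_t ia64_slots_in_range(uint64_t start, uint64_t end)
{
    return (end - start) / IA64_BUNDLE_BYTES * IA64_SLOTS_PER_BUNDLE;
}

/* True if the region lengths recorded in the unwind info add up to the
   actual length of the function, measured in instruction slots.  */
static bool unwind_covers_function(uint64_t start, uint64_t end,
                                   const uint64_t *rlen, size_t n)
{
    uint64_t covered = 0;
    for (size_t i = 0; i < n; i++)
        covered += rlen[i];
    return covered == ia64_slots_in_range(start, end);
}
```

For a function spanning 0x0 to 0x70 (seven bundles, 21 slots), region
lengths of 3 and 18 would pass, while 3 and 15 would be flagged.  Padding
that the assembler adds for .align but does not count in the unwind info
produces exactly that kind of shortfall.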

	--david

ChangeLog

2003-11-14 David Mosberger   <davidm@hpl.hp.com>

	* sysdeps/ia64/memccpy.S: Work around GAS_ALIGN_BREAKS_UNWIND_INFO bug.
	* sysdeps/ia64/memcpy.S: Ditto.
	* sysdeps/ia64/memset.S: Ditto.
	* sysdeps/ia64/memmove.S: Ditto.  Also move the jump-table
	out of .text into .rodata, where it belongs.

	* sysdeps/unix/sysv/linux/ia64/sysdep.h
	(GAS_ALIGN_BREAKS_UNWIND_INFO): Define this macro to indicate
	that all existing GAS versions have a problem with .align inside
	a function.

	* sysdeps/unix/sysv/linux/ia64/pipe.S: There is no need to
	save/restore input-arguments, because they're necessarily
	preserved by the kernel to support syscall-restart.

Index: sysdeps/ia64/memccpy.S
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/ia64/memccpy.S,v
retrieving revision 1.6
diff -u -r1.6 memccpy.S
--- sysdeps/ia64/memccpy.S	9 Sep 2003 20:15:59 -0000	1.6
+++ sysdeps/ia64/memccpy.S	14 Nov 2003 20:05:30 -0000
@@ -52,6 +52,15 @@
 #define loopcnt		r30
 #define	value		r31
 
+#ifdef GAS_ALIGN_BREAKS_UNWIND_INFO
+/* Manually force proper loop-alignment.  Note: be sure to
+   double-check the code-layout after making any changes to
+   this routine! */
+# define ALIGN(n)	{ nop 0 }
+#else
+# define ALIGN(n)	.align 32
+#endif
+
 ENTRY(memccpy)
 	.prologue
 	alloc 	r2 = ar.pfs, 4, 40 - 4, 0, 40
@@ -110,7 +119,7 @@
 	mov	ar.ec = MEMLAT + 6 + 1 	// six more passes needed
 	ld8	r[1] = [asrc], 8 	// r[1] = w0
 	cmp.ne	p6, p0 = r0, r0	;;	// clear p6
-	.align	32
+	ALIGN(32)
 .l2:
 (p[0])		ld8.s	r[0] = [asrc], 8		// r[0] = w1
 (p[MEMLAT])	shr.u	tmp1[0] = r[1 + MEMLAT], sh1	// tmp1 = w0 >> sh1
Index: sysdeps/ia64/memcpy.S
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/ia64/memcpy.S,v
retrieving revision 1.10
diff -u -r1.10 memcpy.S
--- sysdeps/ia64/memcpy.S	29 Apr 2003 22:47:19 -0000	1.10
+++ sysdeps/ia64/memcpy.S	14 Nov 2003 20:05:30 -0000
@@ -103,14 +103,22 @@
 #define the_z		z
 #endif
 
+#ifdef GAS_ALIGN_BREAKS_UNWIND_INFO
+/* Manually force proper loop-alignment.  Note: be sure to
+   double-check the code-layout after making any changes to
+   this routine! */
+# define ALIGN(n)	{ nop 0 }
+#else
+# define ALIGN(n)	.align 32
+#endif
 
 #if defined(USE_LFETCH)
 #define LOOP(shift)						\
-		.align	32 ;					\
+		ALIGN(32);					\
 .loop##shift##:							\
 { .mmb								\
 (p[0])	ld8.nt1	r[0] = [asrc], 8 ;				\
-(p[0])	lfetch.nt1 [ptr1], 16 ;				\
+(p[0])	lfetch.nt1 [ptr1], 16 ;					\
 	nop.b 0 ;						\
 } { .mib							\
 (p[MEMLAT+1]) st8 [dest] = tmp3, 8 ;				\
@@ -118,7 +126,7 @@
  	nop.b 0 ;;						\
  } { .mmb							\
 (p[0])	ld8.nt1	s[0] = [asrc], 8 ;				\
-(p[0])	lfetch.nt1	[ptr2], 16 ;			\
+(p[0])	lfetch.nt1	[ptr2], 16 ;				\
 	nop.b 0 ;						\
 } { .mib							\
 (p[MEMLAT+1]) st8 [dest] = tmp4, 8 ;				\
@@ -130,7 +138,7 @@
 }
 #else
 #define LOOP(shift)						\
-		.align	32 ;					\
+		ALIGN(32);					\
 .loop##shift##:							\
 { .mmb								\
 (p[0])	ld8.nt1	r[0] = [asrc], 8 ;				\
@@ -254,7 +262,11 @@
 	movi0	ar.lc = loopcnt 	// set the loop counter
 ;; }
 
+#ifdef  GAS_ALIGN_BREAKS_UNWIND_INFO
+	{ nop 0 }
+#else
 	.align	32
+#endif
 #if defined(USE_FLP)
 .l1: // ------------------------------- // L1: Everything a multiple of 8
 { .mmi
Index: sysdeps/ia64/memmove.S
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/ia64/memmove.S,v
retrieving revision 1.6
diff -u -r1.6 memmove.S
--- sysdeps/ia64/memmove.S	29 Apr 2003 22:47:19 -0000	1.6
+++ sysdeps/ia64/memmove.S	14 Nov 2003 20:05:30 -0000
@@ -56,12 +56,18 @@
 #define loopcnt		r30
 #define	value		r31
 
+#ifdef GAS_ALIGN_BREAKS_UNWIND_INFO
+# define ALIGN(n)	{ nop 0 }
+#else
+# define ALIGN(n)	.align 32
+#endif
+
 #define LOOP(shift)							\
-		.align	32 ;						\
+		ALIGN(32);						\
 .loop##shift##:								\
 (p[0])		ld8	r[0] = [asrc], 8 ;	/* w1 */		\
 (p[MEMLAT+1])	st8	[dest] = value, 8 ;				\
-(p[MEMLAT])	shrp	value = r[MEMLAT], r[MEMLAT+1], shift ;	\
+(p[MEMLAT])	shrp	value = r[MEMLAT], r[MEMLAT+1], shift ;		\
 		nop.b	0 ;						\
 		nop.b	0 ;						\
 		br.ctop.sptk .loop##shift ;				\
@@ -228,6 +234,10 @@
 (p[MEMLAT])	st1	[dest] = r[MEMLAT], -1
 		br.ctop.dptk .l6
 		br.cond.sptk .restore_and_exit
+END(memmove)
+
+	.rodata
+	.align 8
 .table:
 	data8	0			// dummy entry
 	data8 	.loop56 - .loop8
@@ -238,5 +248,4 @@
 	data8	.loop56 - .loop48
 	data8	.loop56 - .loop56
 
-END(memmove)
 libc_hidden_builtin_def (memmove)
Index: sysdeps/ia64/memset.S
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/ia64/memset.S,v
retrieving revision 1.6
diff -u -r1.6 memset.S
--- sysdeps/ia64/memset.S	29 Apr 2003 22:47:19 -0000	1.6
+++ sysdeps/ia64/memset.S	14 Nov 2003 20:05:30 -0000
@@ -153,7 +153,9 @@
 (p_zr)	br.cond.dptk.many .l1b			// Jump to use stf.spill
 ;; }
 
+#ifndef GAS_ALIGN_BREAKS_UNWIND_INFO
 	.align 32 // -------- //  L1A: store ahead into cache lines; fill later
+#endif
 { .mmi
 	and	tmp = -(LINE_SIZE), cnt		// compute end of range
 	mov	ptr9 = ptr1			// used for prefetching
@@ -222,7 +224,11 @@
 	br.cond.dpnt.many  .move_bytes_from_alignment	// Branch no. 3
 ;; }
 
+#ifdef GAS_ALIGN_BREAKS_UNWIND_INFO
+	{ nop 0 }
+#else
 	.align 32
+#endif
 .l1b:	// ------------------ //  L1B: store ahead into cache lines; fill later
 { .mmi
 	and	tmp = -(LINE_SIZE), cnt		// compute end of range
@@ -283,13 +289,15 @@
 { .mib
 	cmp.eq	p_scr, p0 = loopcnt, r0
 	add	loopcnt = -1, loopcnt
-(p_scr)	br.cond.dpnt.many .store_words
+(p_scr)	br.cond.dpnt.many store_words
 ;; }
 { .mib
 	and	cnt = 0x1f, cnt		// compute the remaining cnt
 	movi0   ar.lc = loopcnt
 ;; }
+#ifndef GAS_ALIGN_BREAKS_UNWIND_INFO
 	.align 32
+#endif
 .l2:	// ---------------------------- //  L2A:  store 32B in 2 cycles
 { .mmb
 	store	[ptr1] = myval, 8
@@ -299,7 +307,7 @@
 	store	[ptr2] = myval, 24
 	br.cloop.dptk.many .l2
 ;; }
-.store_words:
+store_words:
 { .mib
 	cmp.gt	p_scr, p0 = 8, cnt		// just a few bytes left ?
 (p_scr)	br.cond.dpnt.many .move_bytes_from_alignment	// Branch
Index: sysdeps/unix/sysv/linux/ia64/pipe.S
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/unix/sysv/linux/ia64/pipe.S,v
retrieving revision 1.4
diff -u -r1.4 pipe.S
--- sysdeps/unix/sysv/linux/ia64/pipe.S	3 Aug 2002 06:57:51 -0000	1.4
+++ sysdeps/unix/sysv/linux/ia64/pipe.S	14 Nov 2003 20:05:31 -0000
@@ -22,15 +22,14 @@
 #include <sysdep.h>
 
 ENTRY(__pipe)
-       st8 [sp]=r32		// save ptr across system call
+       .regstk 1,0,0
        DO_CALL (SYS_ify (pipe))
-       ld8 r2=[sp]
        cmp.ne p6,p0=-1,r10
        ;;
-(p6)   st4 [r2]=r8,4
+(p6)   st4 [in0]=r8,4
 (p6)   mov ret0=0
        ;;
-(p6)   st4 [r2]=r9
+(p6)   st4 [in0]=r9
 (p6)   ret
        br.cond.spnt.few __syscall_error
 PSEUDO_END(__pipe)
Index: sysdeps/unix/sysv/linux/ia64/sysdep.h
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/unix/sysv/linux/ia64/sysdep.h,v
retrieving revision 1.17
diff -u -r1.17 sysdep.h
--- sysdeps/unix/sysv/linux/ia64/sysdep.h	16 Aug 2003 08:00:24 -0000	1.17
+++ sysdeps/unix/sysv/linux/ia64/sysdep.h	14 Nov 2003 20:05:31 -0000
@@ -24,6 +24,13 @@
 #include <sysdeps/unix/sysdep.h>
 #include <sysdeps/ia64/sysdep.h>
 
+/* As of GAS v2.4.90.0.7, including a ".align" directive inside a
+   function will cause bad unwind info to be emitted (GAS doesn't know
+   how to account for the padding introduced by the .align directive).
+   Turning on this macro will work around this bug by introducing the
+   necessary padding explicitly. */
+#define GAS_ALIGN_BREAKS_UNWIND_INFO
+
 /* For Linux we can use the system call table in the header file
 	/usr/include/asm/unistd.h
    of the kernel.  But these symbols do not follow the SYS_* syntax

^ permalink raw reply	[flat|nested] 98+ messages in thread

* patch to fix unwind info for ia64
  2003-11-14 19:53                                             ` Ulrich Drepper
  2003-11-14 19:56                                               ` David Mosberger
  2003-11-14 20:13                                               ` patch to fix unwind info for ia64 David Mosberger
@ 2003-11-14 20:21                                               ` David Mosberger
  2003-11-14 20:24                                                 ` Roland McGrath
  2003-11-15 17:42                                                 ` Andreas Schwab
  2 siblings, 2 replies; 98+ messages in thread
From: David Mosberger @ 2003-11-14 20:21 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: libc-hacker

[Oops, there was a stupid 2-character typo in the patch for pipe.S
 which my testing didn't catch initially.  Corrected in the patch
 below.]

I ran Harish Patil's unwcheck script on libc.so and found several
routines which had bad unwind info (number of instructions covered by
unwind info didn't match actual length of function).  The problems
were due to a known bug in GAS which causes bad unwind info when
.align is used between a .proc/.endp-pair.  Unfortunately, this is not
an easy bug to fix, so, for now, we may just have to work around the
problem in the few assembly files that have this issue.  The patch
below does that.  The change to pipe.S may seem unrelated but besides
resulting in better code, the change also fixes a similar kind of bad
unwind-info problem for pipe.S that would show up once the new
syscall-stub patch is applied.

	--david

ChangeLog

2003-11-14 David Mosberger   <davidm@hpl.hp.com>

	* sysdeps/ia64/memccpy.S: Work around GAS_ALIGN_BREAKS_UNWIND_INFO bug.
	* sysdeps/ia64/memcpy.S: Ditto.
	* sysdeps/ia64/memset.S: Ditto.
	* sysdeps/ia64/memmove.S: Ditto.  Also move the jump-table
	out of .text into .rodata, where it belongs.

	* sysdeps/unix/sysv/linux/ia64/sysdep.h
	(GAS_ALIGN_BREAKS_UNWIND_INFO): Define this macro to indicate
	that all existing GAS versions have a problem with .align inside
	a function.

	* sysdeps/unix/sysv/linux/ia64/pipe.S: There is no need to
	save/restore input-arguments, because they're necessarily
	preserved by the kernel to support syscall-restart.

Index: sysdeps/ia64/memccpy.S
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/ia64/memccpy.S,v
retrieving revision 1.6
diff -u -r1.6 memccpy.S
--- sysdeps/ia64/memccpy.S	9 Sep 2003 20:15:59 -0000	1.6
+++ sysdeps/ia64/memccpy.S	14 Nov 2003 20:05:30 -0000
@@ -52,6 +52,15 @@
 #define loopcnt		r30
 #define	value		r31
 
+#ifdef GAS_ALIGN_BREAKS_UNWIND_INFO
+/* Manually force proper loop-alignment.  Note: be sure to
+   double-check the code-layout after making any changes to
+   this routine! */
+# define ALIGN(n)	{ nop 0 }
+#else
+# define ALIGN(n)	.align 32
+#endif
+
 ENTRY(memccpy)
 	.prologue
 	alloc 	r2 = ar.pfs, 4, 40 - 4, 0, 40
@@ -110,7 +119,7 @@
 	mov	ar.ec = MEMLAT + 6 + 1 	// six more passes needed
 	ld8	r[1] = [asrc], 8 	// r[1] = w0
 	cmp.ne	p6, p0 = r0, r0	;;	// clear p6
-	.align	32
+	ALIGN(32)
 .l2:
 (p[0])		ld8.s	r[0] = [asrc], 8		// r[0] = w1
 (p[MEMLAT])	shr.u	tmp1[0] = r[1 + MEMLAT], sh1	// tmp1 = w0 >> sh1
Index: sysdeps/ia64/memcpy.S
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/ia64/memcpy.S,v
retrieving revision 1.10
diff -u -r1.10 memcpy.S
--- sysdeps/ia64/memcpy.S	29 Apr 2003 22:47:19 -0000	1.10
+++ sysdeps/ia64/memcpy.S	14 Nov 2003 20:05:30 -0000
@@ -103,14 +103,22 @@
 #define the_z		z
 #endif
 
+#ifdef GAS_ALIGN_BREAKS_UNWIND_INFO
+/* Manually force proper loop-alignment.  Note: be sure to
+   double-check the code-layout after making any changes to
+   this routine! */
+# define ALIGN(n)	{ nop 0 }
+#else
+# define ALIGN(n)	.align 32
+#endif
 
 #if defined(USE_LFETCH)
 #define LOOP(shift)						\
-		.align	32 ;					\
+		ALIGN(32);					\
 .loop##shift##:							\
 { .mmb								\
 (p[0])	ld8.nt1	r[0] = [asrc], 8 ;				\
-(p[0])	lfetch.nt1 [ptr1], 16 ;				\
+(p[0])	lfetch.nt1 [ptr1], 16 ;					\
 	nop.b 0 ;						\
 } { .mib							\
 (p[MEMLAT+1]) st8 [dest] = tmp3, 8 ;				\
@@ -118,7 +126,7 @@
  	nop.b 0 ;;						\
  } { .mmb							\
 (p[0])	ld8.nt1	s[0] = [asrc], 8 ;				\
-(p[0])	lfetch.nt1	[ptr2], 16 ;			\
+(p[0])	lfetch.nt1	[ptr2], 16 ;				\
 	nop.b 0 ;						\
 } { .mib							\
 (p[MEMLAT+1]) st8 [dest] = tmp4, 8 ;				\
@@ -130,7 +138,7 @@
 }
 #else
 #define LOOP(shift)						\
-		.align	32 ;					\
+		ALIGN(32);					\
 .loop##shift##:							\
 { .mmb								\
 (p[0])	ld8.nt1	r[0] = [asrc], 8 ;				\
@@ -254,7 +262,11 @@
 	movi0	ar.lc = loopcnt 	// set the loop counter
 ;; }
 
+#ifdef  GAS_ALIGN_BREAKS_UNWIND_INFO
+	{ nop 0 }
+#else
 	.align	32
+#endif
 #if defined(USE_FLP)
 .l1: // ------------------------------- // L1: Everything a multiple of 8
 { .mmi
Index: sysdeps/ia64/memmove.S
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/ia64/memmove.S,v
retrieving revision 1.6
diff -u -r1.6 memmove.S
--- sysdeps/ia64/memmove.S	29 Apr 2003 22:47:19 -0000	1.6
+++ sysdeps/ia64/memmove.S	14 Nov 2003 20:05:30 -0000
@@ -56,12 +56,18 @@
 #define loopcnt		r30
 #define	value		r31
 
+#ifdef GAS_ALIGN_BREAKS_UNWIND_INFO
+# define ALIGN(n)	{ nop 0 }
+#else
+# define ALIGN(n)	.align 32
+#endif
+
 #define LOOP(shift)							\
-		.align	32 ;						\
+		ALIGN(32);						\
 .loop##shift##:								\
 (p[0])		ld8	r[0] = [asrc], 8 ;	/* w1 */		\
 (p[MEMLAT+1])	st8	[dest] = value, 8 ;				\
-(p[MEMLAT])	shrp	value = r[MEMLAT], r[MEMLAT+1], shift ;	\
+(p[MEMLAT])	shrp	value = r[MEMLAT], r[MEMLAT+1], shift ;		\
 		nop.b	0 ;						\
 		nop.b	0 ;						\
 		br.ctop.sptk .loop##shift ;				\
@@ -228,6 +234,10 @@
 (p[MEMLAT])	st1	[dest] = r[MEMLAT], -1
 		br.ctop.dptk .l6
 		br.cond.sptk .restore_and_exit
+END(memmove)
+
+	.rodata
+	.align 8
 .table:
 	data8	0			// dummy entry
 	data8 	.loop56 - .loop8
@@ -238,5 +248,4 @@
 	data8	.loop56 - .loop48
 	data8	.loop56 - .loop56
 
-END(memmove)
 libc_hidden_builtin_def (memmove)
Index: sysdeps/ia64/memset.S
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/ia64/memset.S,v
retrieving revision 1.6
diff -u -r1.6 memset.S
--- sysdeps/ia64/memset.S	29 Apr 2003 22:47:19 -0000	1.6
+++ sysdeps/ia64/memset.S	14 Nov 2003 20:05:30 -0000
@@ -153,7 +153,9 @@
 (p_zr)	br.cond.dptk.many .l1b			// Jump to use stf.spill
 ;; }
 
+#ifndef GAS_ALIGN_BREAKS_UNWIND_INFO
 	.align 32 // -------- //  L1A: store ahead into cache lines; fill later
+#endif
 { .mmi
 	and	tmp = -(LINE_SIZE), cnt		// compute end of range
 	mov	ptr9 = ptr1			// used for prefetching
@@ -222,7 +224,11 @@
 	br.cond.dpnt.many  .move_bytes_from_alignment	// Branch no. 3
 ;; }
 
+#ifdef GAS_ALIGN_BREAKS_UNWIND_INFO
+	{ nop 0 }
+#else
 	.align 32
+#endif
 .l1b:	// ------------------ //  L1B: store ahead into cache lines; fill later
 { .mmi
 	and	tmp = -(LINE_SIZE), cnt		// compute end of range
@@ -283,13 +289,15 @@
 { .mib
 	cmp.eq	p_scr, p0 = loopcnt, r0
 	add	loopcnt = -1, loopcnt
-(p_scr)	br.cond.dpnt.many .store_words
+(p_scr)	br.cond.dpnt.many store_words
 ;; }
 { .mib
 	and	cnt = 0x1f, cnt		// compute the remaining cnt
 	movi0   ar.lc = loopcnt
 ;; }
+#ifndef GAS_ALIGN_BREAKS_UNWIND_INFO
 	.align 32
+#endif
 .l2:	// ---------------------------- //  L2A:  store 32B in 2 cycles
 { .mmb
 	store	[ptr1] = myval, 8
@@ -299,7 +307,7 @@
 	store	[ptr2] = myval, 24
 	br.cloop.dptk.many .l2
 ;; }
-.store_words:
+store_words:
 { .mib
 	cmp.gt	p_scr, p0 = 8, cnt		// just a few bytes left ?
 (p_scr)	br.cond.dpnt.many .move_bytes_from_alignment	// Branch
Index: sysdeps/unix/sysv/linux/ia64/pipe.S
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/unix/sysv/linux/ia64/pipe.S,v
retrieving revision 1.4
diff -u -r1.4 pipe.S
--- sysdeps/unix/sysv/linux/ia64/pipe.S	3 Aug 2002 06:57:51 -0000	1.4
+++ sysdeps/unix/sysv/linux/ia64/pipe.S	14 Nov 2003 20:05:31 -0000
@@ -22,15 +22,14 @@
 #include <sysdep.h>
 
 ENTRY(__pipe)
-       st8 [sp]=r32		// save ptr across system call
+       .regstk 1,0,0,0
        DO_CALL (SYS_ify (pipe))
-       ld8 r2=[sp]
        cmp.ne p6,p0=-1,r10
        ;;
-(p6)   st4 [r2]=r8,4
+(p6)   st4 [in0]=r8,4
 (p6)   mov ret0=0
        ;;
-(p6)   st4 [r2]=r9
+(p6)   st4 [in0]=r9
 (p6)   ret
        br.cond.spnt.few __syscall_error
 PSEUDO_END(__pipe)
Index: sysdeps/unix/sysv/linux/ia64/sysdep.h
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/unix/sysv/linux/ia64/sysdep.h,v
retrieving revision 1.17
diff -u -r1.17 sysdep.h
--- sysdeps/unix/sysv/linux/ia64/sysdep.h	16 Aug 2003 08:00:24 -0000	1.17
+++ sysdeps/unix/sysv/linux/ia64/sysdep.h	14 Nov 2003 20:05:31 -0000
@@ -24,6 +24,13 @@
 #include <sysdeps/unix/sysdep.h>
 #include <sysdeps/ia64/sysdep.h>
 
+/* As of GAS v2.4.90.0.7, including a ".align" directive inside a
+   function will cause bad unwind info to be emitted (GAS doesn't know
+   how to account for the padding introduced by the .align directive).
+   Turning on this macro will work around this bug by introducing the
+   necessary padding explicitly. */
+#define GAS_ALIGN_BREAKS_UNWIND_INFO
+
 /* For Linux we can use the system call table in the header file
 	/usr/include/asm/unistd.h
    of the kernel.  But these symbols do not follow the SYS_* syntax

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: patch to fix unwind info for ia64
  2003-11-14 20:21                                               ` David Mosberger
@ 2003-11-14 20:24                                                 ` Roland McGrath
  2003-11-14 21:12                                                   ` David Mosberger
  2003-11-15 17:42                                                 ` Andreas Schwab
  1 sibling, 1 reply; 98+ messages in thread
From: Roland McGrath @ 2003-11-14 20:24 UTC (permalink / raw)
  To: davidm; +Cc: Ulrich Drepper, libc-hacker

> 	* sysdeps/unix/sysv/linux/ia64/sysdep.h
> 	(GAS_ALIGN_BREAKS_UNWIND_INFO): Define this macro to indicate
> 	that all existing GAS versions have a problem with .align inside
> 	a function.

Is this fixed in mainline binutils?  Can you write a configure test to
check for the bug rather than assuming it?

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: new syscall stub support for ia64 libc
  2003-11-14 19:56                                               ` David Mosberger
@ 2003-11-14 20:36                                                 ` Ulrich Drepper
  2003-11-15  0:51                                                   ` David Mosberger
                                                                     ` (2 more replies)
  0 siblings, 3 replies; 98+ messages in thread
From: Ulrich Drepper @ 2003-11-14 20:36 UTC (permalink / raw)
  To: davidm; +Cc: Jakub Jelinek, libc-hacker

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

David Mosberger wrote:

> Which tests specifically?

Tons of them.  tst-mutex8, tst-cond7, ...  I don't have the complete
list.  The interesting part is that they abort, not crash.

- -- 
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/tTso2ijCOnn/RHQRApC8AJ4qOkjSG1cVeqA3WRjYRzOj0+4BYgCfSkmY
h4T82BsiP7yK+2AxH2lex0Y=
=PhFo
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: patch to fix unwind info for ia64
  2003-11-14 20:24                                                 ` Roland McGrath
@ 2003-11-14 21:12                                                   ` David Mosberger
  0 siblings, 0 replies; 98+ messages in thread
From: David Mosberger @ 2003-11-14 21:12 UTC (permalink / raw)
  To: Roland McGrath; +Cc: davidm, Ulrich Drepper, libc-hacker

>>>>> On Fri, 14 Nov 2003 12:24:21 -0800, Roland McGrath <roland@redhat.com> said:

  >> * sysdeps/unix/sysv/linux/ia64/sysdep.h
  >> (GAS_ALIGN_BREAKS_UNWIND_INFO): Define this macro to indicate
  >> that all existing GAS versions have a problem with .align inside
  >> a function.

  Roland> Is this fixed in mainline binutils?

No.

  Roland> Can you write a configure test to check for the bug rather
  Roland> than assuming it?

If you don't mind relying on readelf being installed, an automated
test would be possible.  But I'm not sure this issue will ever get
fixed.  Perhaps the "fix" will be just to issue an error when someone
uses .align inside .proc/.endp, so I'm not sure going to great lengths
with an automated test makes much sense at this point.

	--david

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: new syscall stub support for ia64 libc
  2003-11-14 20:36                                                 ` Ulrich Drepper
@ 2003-11-15  0:51                                                   ` David Mosberger
  2003-11-15  9:38                                                   ` David Mosberger
  2003-11-15 19:05                                                   ` David Mosberger
  2 siblings, 0 replies; 98+ messages in thread
From: David Mosberger @ 2003-11-15  0:51 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: davidm, Jakub Jelinek, libc-hacker

>>>>> On Fri, 14 Nov 2003 12:29:28 -0800, Ulrich Drepper <drepper@redhat.com> said:

  Ulrich> David Mosberger wrote:

  >> Which tests specifically?

  Ulrich> Tons of them.  tst-mutex8, tst-cond7, ...  Don't have the
  Ulrich> complete list.  The interesting part is that they abort, not
  Ulrich> crash.

OK, I can reproduce this now.  I didn't see it before because the
Debian gcc-3.3.2 package failed to install unwind.h and as a result,
NPTL was built with HAVE_FORCED_UNWIND turned off.  Now that it's on,
I'm also seeing lots of NPTL failures (starting with the ones you
mention).  Let me take a look.

	--david

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: new syscall stub support for ia64 libc
  2003-11-14 20:36                                                 ` Ulrich Drepper
  2003-11-15  0:51                                                   ` David Mosberger
@ 2003-11-15  9:38                                                   ` David Mosberger
  2003-11-17 18:21                                                     ` Ulrich Drepper
  2003-11-17 22:15                                                     ` David Mosberger
  2003-11-15 19:05                                                   ` David Mosberger
  2 siblings, 2 replies; 98+ messages in thread
From: David Mosberger @ 2003-11-15  9:38 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: davidm, Jakub Jelinek, libc-hacker

>>>>> On Fri, 14 Nov 2003 12:29:28 -0800, Ulrich Drepper <drepper@redhat.com> said:

  >> Which tests specifically?

  Uli> Tons of them.  tst-mutex8, tst-cond7, ...  Don't have the
  Uli> complete list.  The interesting part is that they abort, not
  Uli> crash.

OK, it looks to me like the unwinder is failing.  I suspect it's
choking either on the .altrp directive or on the unwind directives for
the signal trampoline.  I plugged libunwind into libgcc_s.so.1 and now
tst-mutex8, tst-cancel6, and many other tests are working fine.  I
still have some others failing (basic4, stack{1,2},
cancelx{4,5,10,16,17,18}, cleanupx{0,1,3,4}, and oncex{3,4}), but I
don't understand the problem there yet and I'll need to look into
those next week.

	--david

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: patch to fix unwind info for ia64
  2003-11-14 20:21                                               ` David Mosberger
  2003-11-14 20:24                                                 ` Roland McGrath
@ 2003-11-15 17:42                                                 ` Andreas Schwab
  2003-11-15 18:52                                                   ` David Mosberger
  1 sibling, 1 reply; 98+ messages in thread
From: Andreas Schwab @ 2003-11-15 17:42 UTC (permalink / raw)
  To: davidm; +Cc: Ulrich Drepper, libc-hacker

David Mosberger <davidm@napali.hpl.hp.com> writes:

> +#ifdef GAS_ALIGN_BREAKS_UNWIND_INFO
> +/* Manually force proper loop-alignment.  Note: be sure to
> +   double-check the code-layout after making any changes to
> +   this routine! */
> +# define ALIGN(n)	{ nop 0 }
> +#else
> +# define ALIGN(n)	.align 32
> +#endif

Why is ALIGN taking a parameter if it's always ignored anyway?

> +/* As of GAS v2.4.90.0.7, including a ".align" directive inside a

s/2.4/2.14/

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux AG, Deutschherrnstr. 15-19, D-90429 Nürnberg
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: patch to fix unwind info for ia64
  2003-11-15 17:42                                                 ` Andreas Schwab
@ 2003-11-15 18:52                                                   ` David Mosberger
  2003-11-19  6:19                                                     ` David Mosberger
  2003-11-19 15:25                                                     ` Ulrich Drepper
  0 siblings, 2 replies; 98+ messages in thread
From: David Mosberger @ 2003-11-15 18:52 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: davidm, Ulrich Drepper, libc-hacker

>>>>> On Sat, 15 Nov 2003 18:41:34 +0100, Andreas Schwab <schwab@suse.de> said:

  Andreas> Why is ALIGN taking a parameter if it's always ignored
  Andreas> anyway?

Thanks for catching this typo.  Updated patch is attached.

	--david

ChangeLog

2003-11-14 David Mosberger   <davidm@hpl.hp.com>

	* sysdeps/ia64/memccpy.S: Work around GAS_ALIGN_BREAKS_UNWIND_INFO bug.
	* sysdeps/ia64/memcpy.S: Ditto.
	* sysdeps/ia64/memset.S: Ditto.
	* sysdeps/ia64/memmove.S: Ditto.  Also move the jump-table
	out of .text into .rodata, where it belongs.

	* sysdeps/unix/sysv/linux/ia64/sysdep.h
	(GAS_ALIGN_BREAKS_UNWIND_INFO): Define this macro to indicate
	that all existing GAS versions have a problem with .align inside
	a function.

	* sysdeps/unix/sysv/linux/ia64/pipe.S: There is no need to
	save/restore input-arguments, because they're necessarily
	preserved by the kernel to support syscall-restart.

Index: sysdeps/ia64/memccpy.S
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/ia64/memccpy.S,v
retrieving revision 1.6
diff -u -r1.6 memccpy.S
--- sysdeps/ia64/memccpy.S	9 Sep 2003 20:15:59 -0000	1.6
+++ sysdeps/ia64/memccpy.S	14 Nov 2003 20:05:30 -0000
@@ -52,6 +52,15 @@
 #define loopcnt		r30
 #define	value		r31
 
+#ifdef GAS_ALIGN_BREAKS_UNWIND_INFO
+/* Manually force proper loop-alignment.  Note: be sure to
+   double-check the code-layout after making any changes to
+   this routine! */
+# define ALIGN(n)	{ nop 0 }
+#else
+# define ALIGN(n)	.align n
+#endif
+
 ENTRY(memccpy)
 	.prologue
 	alloc 	r2 = ar.pfs, 4, 40 - 4, 0, 40
@@ -110,7 +119,7 @@
 	mov	ar.ec = MEMLAT + 6 + 1 	// six more passes needed
 	ld8	r[1] = [asrc], 8 	// r[1] = w0
 	cmp.ne	p6, p0 = r0, r0	;;	// clear p6
-	.align	32
+	ALIGN(32)
 .l2:
 (p[0])		ld8.s	r[0] = [asrc], 8		// r[0] = w1
 (p[MEMLAT])	shr.u	tmp1[0] = r[1 + MEMLAT], sh1	// tmp1 = w0 >> sh1
Index: sysdeps/ia64/memcpy.S
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/ia64/memcpy.S,v
retrieving revision 1.10
diff -u -r1.10 memcpy.S
--- sysdeps/ia64/memcpy.S	29 Apr 2003 22:47:19 -0000	1.10
+++ sysdeps/ia64/memcpy.S	14 Nov 2003 20:05:30 -0000
@@ -103,14 +103,22 @@
 #define the_z		z
 #endif
 
+#ifdef GAS_ALIGN_BREAKS_UNWIND_INFO
+/* Manually force proper loop-alignment.  Note: be sure to
+   double-check the code-layout after making any changes to
+   this routine! */
+# define ALIGN(n)	{ nop 0 }
+#else
+# define ALIGN(n)	.align n
+#endif
 
 #if defined(USE_LFETCH)
 #define LOOP(shift)						\
-		.align	32 ;					\
+		ALIGN(32);					\
 .loop##shift##:							\
 { .mmb								\
 (p[0])	ld8.nt1	r[0] = [asrc], 8 ;				\
-(p[0])	lfetch.nt1 [ptr1], 16 ;				\
+(p[0])	lfetch.nt1 [ptr1], 16 ;					\
 	nop.b 0 ;						\
 } { .mib							\
 (p[MEMLAT+1]) st8 [dest] = tmp3, 8 ;				\
@@ -118,7 +126,7 @@
  	nop.b 0 ;;						\
  } { .mmb							\
 (p[0])	ld8.nt1	s[0] = [asrc], 8 ;				\
-(p[0])	lfetch.nt1	[ptr2], 16 ;			\
+(p[0])	lfetch.nt1	[ptr2], 16 ;				\
 	nop.b 0 ;						\
 } { .mib							\
 (p[MEMLAT+1]) st8 [dest] = tmp4, 8 ;				\
@@ -130,7 +138,7 @@
 }
 #else
 #define LOOP(shift)						\
-		.align	32 ;					\
+		ALIGN(32);					\
 .loop##shift##:							\
 { .mmb								\
 (p[0])	ld8.nt1	r[0] = [asrc], 8 ;				\
@@ -254,7 +262,11 @@
 	movi0	ar.lc = loopcnt 	// set the loop counter
 ;; }
 
+#ifdef  GAS_ALIGN_BREAKS_UNWIND_INFO
+	{ nop 0 }
+#else
 	.align	32
+#endif
 #if defined(USE_FLP)
 .l1: // ------------------------------- // L1: Everything a multiple of 8
 { .mmi
Index: sysdeps/ia64/memmove.S
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/ia64/memmove.S,v
retrieving revision 1.6
diff -u -r1.6 memmove.S
--- sysdeps/ia64/memmove.S	29 Apr 2003 22:47:19 -0000	1.6
+++ sysdeps/ia64/memmove.S	14 Nov 2003 20:05:30 -0000
@@ -56,12 +56,18 @@
 #define loopcnt		r30
 #define	value		r31
 
+#ifdef GAS_ALIGN_BREAKS_UNWIND_INFO
+# define ALIGN(n)	{ nop 0 }
+#else
+# define ALIGN(n)	.align n
+#endif
+
 #define LOOP(shift)							\
-		.align	32 ;						\
+		ALIGN(32);						\
 .loop##shift##:								\
 (p[0])		ld8	r[0] = [asrc], 8 ;	/* w1 */		\
 (p[MEMLAT+1])	st8	[dest] = value, 8 ;				\
-(p[MEMLAT])	shrp	value = r[MEMLAT], r[MEMLAT+1], shift ;	\
+(p[MEMLAT])	shrp	value = r[MEMLAT], r[MEMLAT+1], shift ;		\
 		nop.b	0 ;						\
 		nop.b	0 ;						\
 		br.ctop.sptk .loop##shift ;				\
@@ -228,6 +234,10 @@
 (p[MEMLAT])	st1	[dest] = r[MEMLAT], -1
 		br.ctop.dptk .l6
 		br.cond.sptk .restore_and_exit
+END(memmove)
+
+	.rodata
+	.align 8
 .table:
 	data8	0			// dummy entry
 	data8 	.loop56 - .loop8
@@ -238,5 +248,4 @@
 	data8	.loop56 - .loop48
 	data8	.loop56 - .loop56
 
-END(memmove)
 libc_hidden_builtin_def (memmove)
Index: sysdeps/ia64/memset.S
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/ia64/memset.S,v
retrieving revision 1.6
diff -u -r1.6 memset.S
--- sysdeps/ia64/memset.S	29 Apr 2003 22:47:19 -0000	1.6
+++ sysdeps/ia64/memset.S	14 Nov 2003 20:05:30 -0000
@@ -153,7 +153,9 @@
 (p_zr)	br.cond.dptk.many .l1b			// Jump to use stf.spill
 ;; }
 
+#ifndef GAS_ALIGN_BREAKS_UNWIND_INFO
 	.align 32 // -------- //  L1A: store ahead into cache lines; fill later
+#endif
 { .mmi
 	and	tmp = -(LINE_SIZE), cnt		// compute end of range
 	mov	ptr9 = ptr1			// used for prefetching
@@ -222,7 +224,11 @@
 	br.cond.dpnt.many  .move_bytes_from_alignment	// Branch no. 3
 ;; }
 
+#ifdef GAS_ALIGN_BREAKS_UNWIND_INFO
+	{ nop 0 }
+#else
 	.align 32
+#endif
 .l1b:	// ------------------ //  L1B: store ahead into cache lines; fill later
 { .mmi
 	and	tmp = -(LINE_SIZE), cnt		// compute end of range
@@ -283,13 +289,15 @@
 { .mib
 	cmp.eq	p_scr, p0 = loopcnt, r0
 	add	loopcnt = -1, loopcnt
-(p_scr)	br.cond.dpnt.many .store_words
+(p_scr)	br.cond.dpnt.many store_words
 ;; }
 { .mib
 	and	cnt = 0x1f, cnt		// compute the remaining cnt
 	movi0   ar.lc = loopcnt
 ;; }
+#ifndef GAS_ALIGN_BREAKS_UNWIND_INFO
 	.align 32
+#endif
 .l2:	// ---------------------------- //  L2A:  store 32B in 2 cycles
 { .mmb
 	store	[ptr1] = myval, 8
@@ -299,7 +307,7 @@
 	store	[ptr2] = myval, 24
 	br.cloop.dptk.many .l2
 ;; }
-.store_words:
+store_words:
 { .mib
 	cmp.gt	p_scr, p0 = 8, cnt		// just a few bytes left ?
 (p_scr)	br.cond.dpnt.many .move_bytes_from_alignment	// Branch
Index: sysdeps/unix/sysv/linux/ia64/pipe.S
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/unix/sysv/linux/ia64/pipe.S,v
retrieving revision 1.4
diff -u -r1.4 pipe.S
--- sysdeps/unix/sysv/linux/ia64/pipe.S	3 Aug 2002 06:57:51 -0000	1.4
+++ sysdeps/unix/sysv/linux/ia64/pipe.S	14 Nov 2003 20:05:31 -0000
@@ -22,15 +22,14 @@
 #include <sysdep.h>
 
 ENTRY(__pipe)
-       st8 [sp]=r32		// save ptr across system call
+       .regstk 1,0,0,0
        DO_CALL (SYS_ify (pipe))
-       ld8 r2=[sp]
        cmp.ne p6,p0=-1,r10
        ;;
-(p6)   st4 [r2]=r8,4
+(p6)   st4 [in0]=r8,4
 (p6)   mov ret0=0
        ;;
-(p6)   st4 [r2]=r9
+(p6)   st4 [in0]=r9
 (p6)   ret
        br.cond.spnt.few __syscall_error
 PSEUDO_END(__pipe)
Index: sysdeps/unix/sysv/linux/ia64/sysdep.h
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/unix/sysv/linux/ia64/sysdep.h,v
retrieving revision 1.17
diff -u -r1.17 sysdep.h
--- sysdeps/unix/sysv/linux/ia64/sysdep.h	16 Aug 2003 08:00:24 -0000	1.17
+++ sysdeps/unix/sysv/linux/ia64/sysdep.h	14 Nov 2003 20:05:31 -0000
@@ -24,6 +24,13 @@
 #include <sysdeps/unix/sysdep.h>
 #include <sysdeps/ia64/sysdep.h>
 
+/* As of GAS v2.14.90.0.7, including a ".align" directive inside a
+   function will cause bad unwind info to be emitted (GAS doesn't know
+   how to account for the padding introduced by the .align directive).
+   Turning on this macro will work around this bug by introducing the
+   necessary padding explicitly. */
+#define GAS_ALIGN_BREAKS_UNWIND_INFO
+
 /* For Linux we can use the system call table in the header file
 	/usr/include/asm/unistd.h
    of the kernel.  But these symbols do not follow the SYS_* syntax

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: new syscall stub support for ia64 libc
  2003-11-14 20:36                                                 ` Ulrich Drepper
  2003-11-15  0:51                                                   ` David Mosberger
  2003-11-15  9:38                                                   ` David Mosberger
@ 2003-11-15 19:05                                                   ` David Mosberger
  2003-11-17 18:14                                                     ` Ulrich Drepper
  2 siblings, 1 reply; 98+ messages in thread
From: David Mosberger @ 2003-11-15 19:05 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: davidm, Jakub Jelinek, libc-hacker

>>>>> On Fri, 14 Nov 2003 12:29:28 -0800, Ulrich Drepper <drepper@redhat.com> said:

  >> Which tests specifically?

  Uli> Tons of them.  tst-mutex8, tst-cond7, ...  Don't have the
  Uli> complete list.  The interesting part is that they abort, not
  Uli> crash.

Related question: it seems to me tst-cancel6 cannot possibly succeed
with NPTL _unless_ HAVE_FORCED_UNWIND is true?  The reason is that the
only way the file lock will be released is through a cleanup handler:

# ifdef __EXCEPTIONS
#  define _IO_acquire_lock(_fp) \
  do {									      \
    _IO_FILE *_IO_acquire_lock_file					      \
	__attribute__((cleanup (_IO_acquire_lock_fct)))			      \
	= (_fp);							      \
    _IO_flockfile (_IO_acquire_lock_file);

# else
#  define _IO_acquire_lock(_fp) _IO_acquire_lock_needs_exceptions_enabled
# endif

So here, if __EXCEPTIONS is defined, but HAVE_FORCED_UNWIND is false,
the compilation will succeed, but at runtime, the cleanup handler
won't get invoked.  Perhaps #ifdef __EXCEPTIONS should be changed to
#if defined(__EXCEPTIONS) && defined(HAVE_FORCED_UNWIND) ?  (Yeah,
it's a bit weird to have one but not the other defined, but it seems
to be the case on all existing Debian/unstable systems, so it's
perhaps not so rare.)

	--david

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: new syscall stub support for ia64 libc
  2003-11-15 19:05                                                   ` David Mosberger
@ 2003-11-17 18:14                                                     ` Ulrich Drepper
  2003-11-18  0:47                                                       ` David Mosberger
  0 siblings, 1 reply; 98+ messages in thread
From: Ulrich Drepper @ 2003-11-17 18:14 UTC (permalink / raw)
  To: davidm; +Cc: Jakub Jelinek, libc-hacker

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

David Mosberger wrote:

> Related question: it seems to me tst-cancel6 cannot possibly succeed
> with NPTL _unless_ HAVE_FORCED_UNWIND is true?

That might be.  It's not a supported configuration anyway.  I think it's
time to make configure fail in all these non-standard environments.  It
was mainly useful when developing.  But now all the tools are there.

- -- 
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/uQ5m2ijCOnn/RHQRAngvAKC7YoY5A6aNwJb25QtbPeVU40DHZwCfbEcY
2OXO/JIp9dbqYJlT88yO/P8=
=/HFc
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: new syscall stub support for ia64 libc
  2003-11-15  9:38                                                   ` David Mosberger
@ 2003-11-17 18:21                                                     ` Ulrich Drepper
  2003-11-17 18:35                                                       ` David Mosberger
  2003-11-18  7:54                                                       ` David Mosberger
  2003-11-17 22:15                                                     ` David Mosberger
  1 sibling, 2 replies; 98+ messages in thread
From: Ulrich Drepper @ 2003-11-17 18:21 UTC (permalink / raw)
  To: davidm; +Cc: Jakub Jelinek, libc-hacker

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

David Mosberger wrote:

> OK, it looks to me like the unwinder is failing.  I suspect it's
> choking either on the .altrp directive or on the unwind directives for
> the signal trampoline.

But we already did unwind through signal handlers.  It must be something
new your patch adds.  So likely the code in _dl_sysinfo_break, the
.altrp etc.  If a signal is received the thread is usually in a syscall
so we have to unwind through the signal handler frame, the sigreturn
stuff, and then the frame around the break instruction.  It must be the
step from sigreturn to _dl_sysinfo_break, everything else is the same.

- -- 
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/uQ+G2ijCOnn/RHQRAo1NAJ0aF0EYnOiEfGCJId1GEzRd83bvDwCeMZFY
ZQR4Jc+29Se2JNc/wCu4Jlg=
=Gvc+
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: new syscall stub support for ia64 libc
  2003-11-17 18:21                                                     ` Ulrich Drepper
@ 2003-11-17 18:35                                                       ` David Mosberger
  2003-11-18  7:54                                                       ` David Mosberger
  1 sibling, 0 replies; 98+ messages in thread
From: David Mosberger @ 2003-11-17 18:35 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: davidm, Jakub Jelinek, libc-hacker

>>>>> On Mon, 17 Nov 2003 10:12:22 -0800, Ulrich Drepper <drepper@redhat.com> said:

  Uli> But we already did unwind through signal handlers.  It must be
  Uli> something new your patch adds.  So likely the code in
  Uli> _dl_sysinfo_break, the .altrp etc.  If a signal is received the
  Uli> thread is usually in a syscall so we have to unwind through the
  Uli> signal handler frame, the sigreturn stuff, and then the frame
  Uli> around the break instruction.  It must be the step from
  Uli> sigreturn to _dl_sysinfo_break, everything else is the same.

It's likely altrp but the ia64 unwinder in libgcc is rather broken so
it could be a more complicated interaction (e.g., the unwinder won't
work completely correctly with label_state/copy_state which is
something the signal trampoline uses).  In any case, my first priority
is to get it to work with libunwind, since it's much easier to debug
the code with it.  Once that's working, I'll try to figure out what's
holding up GCC's unwinder.  If it's reasonably easy to fix, I'll make
a patch.  Otherwise, we may just want to give up on the builtin
unwinder (it already has at least one or two serious bugs, which
would be hard to fix).

	--david

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: new syscall stub support for ia64 libc
  2003-11-15  9:38                                                   ` David Mosberger
  2003-11-17 18:21                                                     ` Ulrich Drepper
@ 2003-11-17 22:15                                                     ` David Mosberger
  1 sibling, 0 replies; 98+ messages in thread
From: David Mosberger @ 2003-11-17 22:15 UTC (permalink / raw)
  To: Ulrich Drepper, Jakub Jelinek; +Cc: libc-hacker, davidm

>>>>> On Sat, 15 Nov 2003 01:38:05 -0800, David Mosberger <davidm@linux.hpl.hp.com> said:

  >> I still have some others failing (basic4, stack{1,2},
  >> cancelx{4,5,10,16,17,18}, cleanupx{0,1,3,4}, and oncex{3,4}), but
  >> I don't understand the problem there yet and I'll need to look
  >> into those next week.

The first problem was a stupid kernel bug in the light-weight handler
for sigprocmask.  I already pushed a fix for this to Linus and now
tst-basic4 as well as many other tests are working again.

The next genuine failure is in tst-cancelx4.  The problem here seems
to be due to the fact that unwind-c.c:__gcc_personality_v0 gets linked
statically into the test application, which pulls in the unwinder-code
from libgcc_eh.a (which, in my case, is the old unwinder, not the
libunwind-based one).  But unwind-forcedunwind.c picks the unwinder up
via libgcc_s.so.1, which ends up using the libunwind-based unwinder,
so now I end up with two conflicting unwinders and when a context gets
passed from one unwinder to the other, bad things happen, of course.

Now I suppose you could argue that it's my inconsistent setup that's
the root-cause of the problem, but it worries me a bit to have
binaries that work correctly only if libgcc_s.so.1 matches with the
unwinder that was built statically into the application.  Effectively,
it would mean we could never change the contents of "struct
_Unwind_Context" without breaking binary compatibility.  I don't think
that was intended?
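
To illustrate the kind of breakage I mean, here is a minimal sketch.
It is not the real unwinder code: the two struct layouts and the
accessors old_get_ip()/new_get_ip() are made up for illustration, and
the real "struct _Unwind_Context" is private to each unwinder.  The
point is only that an accessor built against one layout reads the
wrong field when handed a context created by the other unwinder.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical context layout used by the "old" unwinder. */
struct ctx_old {
    unsigned long ip;
    unsigned long sp;
};

/* Hypothetical layout used by the "new" unwinder: one field was added
   in front, so every later field moved. */
struct ctx_new {
    unsigned long flags;
    unsigned long ip;
    unsigned long sp;
};

/* Accessor compiled against the old layout. */
unsigned long old_get_ip(const void *opaque)
{
    unsigned long ip;

    assert(opaque != NULL);
    memcpy(&ip, (const char *)opaque + offsetof(struct ctx_old, ip),
           sizeof ip);
    return ip;
}

/* Accessor compiled against the new layout. */
unsigned long new_get_ip(const void *opaque)
{
    unsigned long ip;

    assert(opaque != NULL);
    memcpy(&ip, (const char *)opaque + offsetof(struct ctx_new, ip),
           sizeof ip);
    return ip;
}
```

Handing a ctx_new object to old_get_ip() returns the flags word
instead of the instruction pointer, which is roughly what happens when
a context crosses from one unwinder into the other.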

Was the idea, perhaps, that the application's references to
_Unwind_*/__gcc_personality_v0 also get satisfied from libpthread?  If
so, I think those symbols would have to be exported.

Could you shed some light on how this is supposed to work?

Thanks,

	--david

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: new syscall stub support for ia64 libc
  2003-11-17 18:14                                                     ` Ulrich Drepper
@ 2003-11-18  0:47                                                       ` David Mosberger
  2003-11-18  1:02                                                         ` Ulrich Drepper
  0 siblings, 1 reply; 98+ messages in thread
From: David Mosberger @ 2003-11-18  0:47 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: davidm, Jakub Jelinek, libc-hacker

>>>>> On Mon, 17 Nov 2003 10:07:34 -0800, Ulrich Drepper <drepper@redhat.com> said:

  >> Related question: it seems to me tst-cancel6 cannot possibly
  >> succeed with NPTL _unless_ HAVE_FORCED_UNWIND is true?

  Uli> That might be.  It's no supported configuration anyway.  I
  Uli> think it's time to make configure fail in all these
  Uli> non-standard environments.  It was mainly useful when
  Uli> developing.  But now all the tools are there.

Looking some more into the cleanup-handler-based locking, I'm not sure
I understand why it is race-free.  If I'm reading the code right,
we've got:

nptl/sysdeps/pthread/bits/stdio-lock.h:

	#  define _IO_acquire_lock(_fp)				\
	  do {							\
	    _IO_FILE *_IO_acquire_lock_file			\
	        __attribute__((cleanup (_IO_acquire_lock_fct)))	\
	        = (_fp);					\
	    _IO_flockfile (_IO_acquire_lock_file);

libio/libioP.h:

	static inline void
	__attribute__ ((__always_inline__))
	_IO_acquire_lock_fct (_IO_FILE **p)
	{
	  _IO_FILE *fp = *p;
	  if ((fp->_flags & _IO_USER_LOCK) == 0)
	    _IO_funlockfile (fp);
	}

What happens if a thread gets canceled right after declaring
_IO_acquire_lock_file, but before the _IO_flockfile() call?  Wouldn't
that cause the cleanup-handler to be invoked and then
_IO_funlockfile() to be called erroneously?
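
To make the window concrete, here is a minimal sketch; it is not glibc
code.  The names demo_file, demo_unlock_cleanup and demo_locked_op are
made up, a plain counter stands in for the stream lock, and an early
return stands in for a forced unwind that arrives after the cleanup
variable is declared but before the lock is taken.  It relies on the
GCC/Clang cleanup attribute.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a FILE with a lock count. */
struct demo_file {
    int lock_depth;
};

/* Cleanup handler: runs whenever the guarded variable leaves scope. */
static void demo_unlock_cleanup(struct demo_file **p)
{
    (*p)->lock_depth--;
}

int demo_locked_op(struct demo_file *fp, bool unwind_before_lock)
{
    assert(fp != NULL);

    /* From here on the cleanup handler is armed... */
    struct demo_file *guard
        __attribute__((cleanup(demo_unlock_cleanup))) = fp;

    /* ...but the lock has not been taken yet. */
    if (unwind_before_lock)
        return -1;

    guard->lock_depth++;    /* stands in for _IO_flockfile() */
    return 0;               /* cleanup drops the lock again */
}
```

With unwind_before_lock false the count ends up balanced at 0; with it
true the handler still runs and the count drops to -1, i.e. an unlock
without a matching lock.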

Or is NPTL simply not making any correctness-guarantees when
PTHREAD_CANCEL_ASYNCHRONOUS is in effect?  If so, I'm still wondering
what would happen if a signal occurred at that point and then the
signal handler called some cancellable routine (such as write())?

	--david

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: new syscall stub support for ia64 libc
  2003-11-18  0:47                                                       ` David Mosberger
@ 2003-11-18  1:02                                                         ` Ulrich Drepper
  2003-11-18  1:22                                                           ` David Mosberger
  0 siblings, 1 reply; 98+ messages in thread
From: Ulrich Drepper @ 2003-11-18  1:02 UTC (permalink / raw)
  To: davidm; +Cc: Jakub Jelinek, libc-hacker

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

David Mosberger wrote:

> Or is NPTL simply not making any correctness-guarantees when
> PTHREAD_CANCEL_ASYNCHRONOUS is in effect?

It is not allowed to call any library function except for
pthread_cancel(), pthread_setcancelstate(), and pthread_setcanceltype()
(XSH 2.9.5.4).

If you want to discuss correctness issues I request you first read the
POSIX standard.

- -- 
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/uW132ijCOnn/RHQRAgK7AJ95lZAmjyjh9IhYIHETqfwWU2/HRQCgwke7
JJCRxJO0z5D+hlzwMKLv6j0=
=z5VA
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: new syscall stub support for ia64 libc
  2003-11-18  1:02                                                         ` Ulrich Drepper
@ 2003-11-18  1:22                                                           ` David Mosberger
  2003-11-18  1:37                                                             ` Ulrich Drepper
  0 siblings, 1 reply; 98+ messages in thread
From: David Mosberger @ 2003-11-18  1:22 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: davidm, Jakub Jelinek, libc-hacker

>>>>> On Mon, 17 Nov 2003 16:53:10 -0800, Ulrich Drepper <drepper@redhat.com> said:

  Uli> David Mosberger wrote:

  >> Or is NPTL simply not making any correctness-guarantees when
  >> PTHREAD_CANCEL_ASYNCHRONOUS is in effect?

  Uli> It is not allowed to call any library function except for
  Uli> pthread_cancel(), pthread_setcancelstate(), and
  Uli> pthread_setcanceltype() (XSH 2.9.5.4).

The HTML version doesn't have section numbers, but I think you're
referring to this section:

  Async-Cancel Safety

  The pthread_cancel(), pthread_setcancelstate(), and
  pthread_setcanceltype() functions are defined to be async-cancel
  safe.

  No other functions in this volume of IEEE Std 1003.1-2001 are
  required to be async-cancel-safe.

I don't see anything that would prevent NPTL from providing better
async-cancel-safety, but judging by your response, it doesn't.

  Uli> If you want to discuss correctness issues I request you first
  Uli> read the POSIX standard.

Hey, I'm trying...

I still don't understand the interaction between signals and
thread-cancellation and I couldn't find where this is being discussed
in the standard.  Any pointers?

BTW: All NPTL-tests now succeed with libunwind, except that
tst-cancel10 occasionally segfaults due to some sort of memory
corruption/race.  I'll see now if I can fix GCC's unwind-ia64.c to
work also.

Thanks,

	--david

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: new syscall stub support for ia64 libc
  2003-11-18  1:22                                                           ` David Mosberger
@ 2003-11-18  1:37                                                             ` Ulrich Drepper
  2003-11-18  1:46                                                               ` David Mosberger
  0 siblings, 1 reply; 98+ messages in thread
From: Ulrich Drepper @ 2003-11-18  1:37 UTC (permalink / raw)
  To: davidm; +Cc: Jakub Jelinek, libc-hacker

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

David Mosberger wrote:

> I don't see anything that would prevent NPTL from providing better
> async-cancel-safety, but judging by your response, it doesn't.

Certainly not.  Since all this special handling makes the normal case
slower.  Asynchronous cancellation is bad and fortunately rarely used,
so no effort which slows down general code and which is necessary to
support async cancel is worth it.


> I still don't understand the interaction between signals and
> thread-cancellation and I couldn't find where this is being discussed
> in the standard.  Any pointers?

What interaction?  Cancellation is implemented via signals.  That should
be obvious.

- -- 
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/uXY42ijCOnn/RHQRAqA1AJ44VvAwgqTBZzhBFVUfzNDSROTfkgCfW33k
u1Ut+fy8b8x8z6qjoDLGfiA=
=ncVK
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: new syscall stub support for ia64 libc
  2003-11-18  1:37                                                             ` Ulrich Drepper
@ 2003-11-18  1:46                                                               ` David Mosberger
  2003-11-18  2:17                                                                 ` Ulrich Drepper
  0 siblings, 1 reply; 98+ messages in thread
From: David Mosberger @ 2003-11-18  1:46 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: davidm, Jakub Jelinek, libc-hacker

>>>>> On Mon, 17 Nov 2003 17:30:32 -0800, Ulrich Drepper <drepper@redhat.com> said:

  >> I don't see anything that would prevent NPTL from providing
  >> better async-cancel-safety, but judging by your response, it
  >> doesn't.

  Uli> Certainly not.  Since all this special handling makes the
  Uli> normal case slower.  Asynchronous cancellation is bad and
  Uli> fortunately rarely used, so no effort which slows down general
  Uli> code and which is necessary to support async cancel is worth
  Uli> it.

That makes sense.  Thanks for confirming.

  >> I still don't understand the interaction between signals and
  >> thread-cancellation and I couldn't find where this is being
  >> discussed in the standard.  Any pointers?

  Uli> What interaction?  Cancellation is implemented via signals.
  Uli> That should be obvious.

The one I mentioned: signal handler gets called in this code right
before the _IO_flockfile():

	    _IO_FILE *_IO_acquire_lock_file			\
	        __attribute__((cleanup (_IO_acquire_lock_fct)))	\
	        = (_fp);					\
	    _IO_flockfile (_IO_acquire_lock_file);

and then the signal handler calls write(), which ends up getting
cancelled.  What prevents this from happening?

	--david

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: new syscall stub support for ia64 libc
  2003-11-18  1:46                                                               ` David Mosberger
@ 2003-11-18  2:17                                                                 ` Ulrich Drepper
  2003-11-18  5:44                                                                   ` David Mosberger
  2003-11-18 19:18                                                                   ` David Mosberger
  0 siblings, 2 replies; 98+ messages in thread
From: Ulrich Drepper @ 2003-11-18  2:17 UTC (permalink / raw)
  To: davidm; +Cc: Jakub Jelinek, libc-hacker

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

David Mosberger wrote:

> The one I mentioned: signal handler gets called in this code right
> before the _IO_flockfile():
> 
> 	    _IO_FILE *_IO_acquire_lock_file			\
> 	        __attribute__((cleanup (_IO_acquire_lock_fct)))	\
> 	        = (_fp);					\
> 	    _IO_flockfile (_IO_acquire_lock_file);
> 
> and then the signal handler calls write(), which ends up getting
> cancelled.  What prevents this from happening?

Why should it be prevented?  If you call write in a signal handler you
either disable cancellation or live with it.

- -- 
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/uX8D2ijCOnn/RHQRAhSHAJ9sssJ94YNIqdAxmdbegxPykfZ4VgCgpl9p
xi4QRm0VQGAwAS2TDQq5Re0=
=TAQ0
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: new syscall stub support for ia64 libc
  2003-11-18  2:17                                                                 ` Ulrich Drepper
@ 2003-11-18  5:44                                                                   ` David Mosberger
  2003-11-18 19:18                                                                   ` David Mosberger
  1 sibling, 0 replies; 98+ messages in thread
From: David Mosberger @ 2003-11-18  5:44 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: davidm, Jakub Jelinek, libc-hacker

>>>>> On Mon, 17 Nov 2003 18:08:03 -0800, Ulrich Drepper <drepper@redhat.com> said:

  Uli> David Mosberger wrote:

  >> The one I mentioned: signal handler gets called in this code
  >> right before the _IO_flockfile():

  >>     _IO_FILE *_IO_acquire_lock_file			\
  >>         __attribute__((cleanup (_IO_acquire_lock_fct)))	\
  >>         = (_fp);					\
  >>     _IO_flockfile (_IO_acquire_lock_file);

  >> and then the signal handler calls write(), which ends up getting
  >> cancelled.  What prevents this from happening?

  Uli> Why should it be prevented?  If you call write in a signal
  Uli> handler you either disable cancellation or live with it.

Ah, I see.  I guess that's OK as long as programmers using
pthread_cancel() are aware of this.  Reading the POSIX description, it
would not have occurred to me that calling a cancellable function from
a (asynchronous) signal handler is effectively equivalent to enabling
asynchronous cancellation.

	--david

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: new syscall stub support for ia64 libc
  2003-11-17 18:21                                                     ` Ulrich Drepper
  2003-11-17 18:35                                                       ` David Mosberger
@ 2003-11-18  7:54                                                       ` David Mosberger
  2003-11-18  8:22                                                         ` Ulrich Drepper
  1 sibling, 1 reply; 98+ messages in thread
From: David Mosberger @ 2003-11-18  7:54 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: davidm, Jakub Jelinek, libc-hacker

>>>>> On Mon, 17 Nov 2003 10:12:22 -0800, Ulrich Drepper <drepper@redhat.com> said:

  Uli> David Mosberger wrote:

  >> OK, it looks to me like the unwinder is failing.  I suspect it's
  >> choking either on the .altrp directive or on the unwind
  >> directives for the signal trampoline.

  Uli> But we already did unwind through signal handlers.  It must be
  Uli> something new your patch adds.  So likely the code in
  Uli> _dl_sysinfo_break, the .altrp etc.  If a signal is received the
  Uli> thread is usually in a syscall so we have to unwind through the
  Uli> signal handler frame, the sigreturn stuff, and then the frame
  Uli> around the break instruction.  It must be the step from
  Uli> sigreturn to _dl_sysinfo_break, everything else is the same.

I tracked this down now and unfortunately it's a fundamental
limitation of the GCC ia64 unwinder: it assumes that unwinding
happens only at procedure-call-boundaries.  It never tracks any
scratch registers, which means it simply cannot properly unwind across
signal-handlers.  You may get lucky at times, but it can't work in
general.  The reason the new syscall stubs trigger this problem is
that the return-pointer gets saved in register b6, which is a scratch
register.  My suggestion is to deprecate the built-in GCC ia64
unwinder and to use libunwind instead.  For NPTL, I suppose that would
mean to check whether the available unwinder can properly unwind
across signal-handlers and, if so, enable the new syscall stubs.

Does this sound reasonable?

	--david

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: new syscall stub support for ia64 libc
  2003-11-18  7:54                                                       ` David Mosberger
@ 2003-11-18  8:22                                                         ` Ulrich Drepper
  2003-11-18 16:45                                                           ` David Mosberger
                                                                             ` (2 more replies)
  0 siblings, 3 replies; 98+ messages in thread
From: Ulrich Drepper @ 2003-11-18  8:22 UTC (permalink / raw)
  To: davidm; +Cc: Jakub Jelinek, libc-hacker

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

David Mosberger wrote:

> I tracked this down now and unfortunately it's a fundamental
> limitation of the GCC ia64 unwinder:

Well, I have to trust you on this.  No time to dive into all these details.


> For NPTL, I suppose that would
> mean to check whether the available unwinder can properly unwind
> across signal-handlers and, if so, enable the new syscall stubs.

It's also a runtime issue.  We can certainly require a fixed libgcc_s
with support for the external libunwind at compile time.  But it'll be a
while until things are fixed correctly everywhere.

Jakub will know better how to handle the gcc side.  I'll defer to him
providing such a gcc.  If it works and is acceptable, we can look at
transitioning over to it.

- -- 
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/udR12ijCOnn/RHQRAgRgAJ9S3zorsUt9QDIfWjZQYmHi5RVMjgCgrgM5
m0wvCdfNVte+WiUfDn6GK3E=
=atBx
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: new syscall stub support for ia64 libc
  2003-11-18  8:22                                                         ` Ulrich Drepper
@ 2003-11-18 16:45                                                           ` David Mosberger
  2003-11-19 23:37                                                           ` unwind failures due to __pthread_initialize_minimal David Mosberger
  2003-11-26  9:40                                                           ` new syscall stub support for ia64 libc David Mosberger
  2 siblings, 0 replies; 98+ messages in thread
From: David Mosberger @ 2003-11-18 16:45 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: davidm, Jakub Jelinek, libc-hacker

>>>>> On Tue, 18 Nov 2003 00:12:37 -0800, Ulrich Drepper <drepper@redhat.com> said:

  >> I tracked this down now and unfortunately it's a fundamental
  >> limitation of the GCC ia64 unwinder:

  Uli> Well, I have to trust you on this.  No time to dive into all
  Uli> these details.

Sure.  For the record, here are some pointers to why this can't
work: the first clue is this line:

  fs->when_target = (context->rp - context->region_start) / 16 * 3;

This ignores the slot-number, which is OK as long as you do
synchronous unwinding only.  Of course, this by itself would
be trivial to fix.  Something like this would do:

  fs->when_target = ((context->rp & ~0xfUL) - context->region_start) / 16 * 3
	  + (context->rp & 0xf);
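
To make the slot arithmetic concrete, here is a minimal standalone
sketch (not the GCC unwinder code itself).  It assumes the usual ia64
layout of 16-byte bundles holding three instruction slots, with the
slot number carried in the low bits of the address, and a region start
that is bundle-aligned.  The function names and the sample addresses
are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Index as computed by the existing line: the integer division by 16
   discards the low bits, so the slot number is ignored.  */
static uint64_t when_sync_only(uint64_t ip, uint64_t region_start)
{
  return (ip - region_start) / 16 * 3;
}

/* Index as computed by the suggested fix: three slots per bundle,
   plus the slot number taken from the low four bits.  */
static uint64_t when_with_slot(uint64_t ip, uint64_t region_start)
{
  return ((ip & ~UINT64_C(0xf)) - region_start) / 16 * 3 + (ip & 0xf);
}

/* Small sanity check using hypothetical addresses.  */
static void when_target_self_check(void)
{
  /* Third bundle of the region (offset 0x20), slot 2.  */
  assert(when_sync_only(0x1022, 0x1000) == 6);
  assert(when_with_slot(0x1022, 0x1000) == 8);
  /* At a bundle boundary (slot 0) both forms agree.  */
  assert(when_sync_only(0x1020, 0x1000) == when_with_slot(0x1020, 0x1000));
}
```

The two forms differ only when the slot number is non-zero, which is
the case that matters when a signal interrupts a bundle part-way
through.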

The next clue is in uw_update_reg_address(), where there is
stuff along the lines of:

    case UNW_WHERE_BR:
      /* Note that while RVAL can only be 1-5 from normal descriptors,
	 we can want to look at B0 due to having manually unwound a
	 signal frame.  */
      if (rval <= 5)
	addr = context->br_loc[rval];
      else
	abort ();
      break;

Here, you see that branch registers b6 and b7 (which are scratch regs)
are not handled at all.

Then, we can look at desc_abi(): it doesn't do anything, meaning the
unwinder won't be able to detect signal frames at all.

Other issues: that unwinder isn't able to resume execution at a point
that was interrupted by a signal (scratch regs won't get restored), is
known to be broken w.r.t. sigstack(), and cannot support dynamically
generated code.

So I think it's just time to give up on it (unless someone wants to
volunteer and fix those things up; but given that they already work
with libunwind, that seems like a wasted effort to me).

  >> For NPTL, I suppose that would mean to check whether the
  >> available unwinder can properly unwind across signal-handlers
  >> and, if so, enable the new syscall stubs.

  Uli> It's also a runtime issue.  We can certainly require a fixed
  Uli> libgcc_s with support for the external libunwind at compile
  Uli> time.  But it'll be a while until things are fixed correctly
  Uli> everywhere.

Hopefully it won't take too long.

  Uli> Jakub will know better how to handle the gcc side.  I'll defer
  Uli> to him providing such a gcc.  If it works and is acceptable, we
  Uli> can look at transitioning over to it.

I submitted some fix ups for GCC just yesterday.  With those, GCC
should be all set.

The only outstanding issue for libc is the fact that it's linking in
libgcc_eh.a (even for dynamically linked apps).  I'm not convinced it
should and if it really does, it probably needs a way to figure out
what dependent libraries there might be (e.g., libunwind-enabled GCC
would need -lunwind along with -lgcc_eh).

	--david

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: new syscall stub support for ia64 libc
  2003-11-18  2:17                                                                 ` Ulrich Drepper
  2003-11-18  5:44                                                                   ` David Mosberger
@ 2003-11-18 19:18                                                                   ` David Mosberger
  2003-11-18 19:35                                                                     ` Ulrich Drepper
  1 sibling, 1 reply; 98+ messages in thread
From: David Mosberger @ 2003-11-18 19:18 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: davidm, Jakub Jelinek, libc-hacker

>>>>> On Mon, 17 Nov 2003 18:08:03 -0800, Ulrich Drepper <drepper@redhat.com> said:

  Uli> David Mosberger wrote:

  >> The one I mentioned: signal handler gets called in this code
  >> right before the _IO_flockfile():

  >>     _IO_FILE *_IO_acquire_lock_file			\
  >>         __attribute__((cleanup (_IO_acquire_lock_fct)))	\
  >>         = (_fp);					\
  >>     _IO_flockfile (_IO_acquire_lock_file);

  >> and then the signal handler calls write(), which ends up getting
  >> cancelled.  What prevents this from happening?

  Uli> Why should it be prevented?  If you call write in a signal
  Uli> handler you either disable cancellation or live with it.

Hans Boehm pointed out that pthread_setcancelstate() isn't
asynch-signal-safe, so cancellation would have to be turned off for
the entire app, not just while a signal-handler is running.

	--david

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: new syscall stub support for ia64 libc
  2003-11-18 19:18                                                                   ` David Mosberger
@ 2003-11-18 19:35                                                                     ` Ulrich Drepper
  2003-11-18 20:08                                                                       ` David Mosberger
  0 siblings, 1 reply; 98+ messages in thread
From: Ulrich Drepper @ 2003-11-18 19:35 UTC (permalink / raw)
  To: davidm; +Cc: Jakub Jelinek, libc-hacker

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

David Mosberger wrote:

> Hans Boehm pointed out that pthread_setcancelstate() isn't
> asynch-signal-safe,

Where does he get this from?  pthread_setcancelstate() is one of the
three functions which are guaranteed to be async-cancel safe.

- -- 
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/unMX2ijCOnn/RHQRAu+hAJ4q6sIrnEiC5awtd+mb5EN3DDoO7QCdFlnD
WBEdMuj6xlqA0DkbQC/r1mY=
=H8sc
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: new syscall stub support for ia64 libc
  2003-11-18 19:35                                                                     ` Ulrich Drepper
@ 2003-11-18 20:08                                                                       ` David Mosberger
  0 siblings, 0 replies; 98+ messages in thread
From: David Mosberger @ 2003-11-18 20:08 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: davidm, Jakub Jelinek, libc-hacker

>>>>> On Tue, 18 Nov 2003 11:29:27 -0800, Ulrich Drepper <drepper@redhat.com> said:

  Uli> David Mosberger wrote:

  >> Hans Boehm pointed out that pthread_setcancelstate() isn't
  >> asynch-signal-safe,

  Uli> Where does he get this from?

I assume from here (table of async-safe routines in the "Signal
Concepts" section):

 http://www.opengroup.org/onlinepubs/007904975/functions/xsh_chap02_09.html

  Uli> pthread_setcancelstate() is one of the three functions which
  Uli> are guaranteed to be async-cancel safe.

Yes, that's why I _assumed_ it would be async-signal-safe, but it
seems now that was an incorrect assumption.

	--david

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: patch to fix unwind info for ia64
  2003-11-15 18:52                                                   ` David Mosberger
@ 2003-11-19  6:19                                                     ` David Mosberger
  2003-11-19 15:25                                                     ` Ulrich Drepper
  1 sibling, 0 replies; 98+ messages in thread
From: David Mosberger @ 2003-11-19  6:19 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: davidm, Ulrich Drepper, libc-hacker

Could this patch please be applied?  Roland suggested to add a
configure test, but I'm not sure it's OK to rely on readelf in
configure and I also think it's total overkill given that it's
unlikely that the GAS bug will be fixed any time soon (or ever).

Thanks,

	--david

>>>>> On Sat, 15 Nov 2003 18:41:34 +0100, Andreas Schwab <schwab@suse.de> said:

  Andreas> Why is ALIGN taking a parameter if it's always ignored
  Andreas> anyway?

Thanks for catching this typo.  Updated patch is attached.

	--david

ChangeLog

2003-11-14 David Mosberger   <davidm@hpl.hp.com>

	* sysdeps/ia64/memccpy.S: Work around GAS_ALIGN_BREAKS_UNWIND_INFO bug.
	* sysdeps/ia64/memcpy.S: Ditto.
	* sysdeps/ia64/memset.S: Ditto.
	* sysdeps/ia64/memmove.S: Ditto.  Also move the jump-table
	out of .text into .rodata, where it belongs.

	* sysdeps/unix/sysv/linux/ia64/sysdep.h
	(GAS_ALIGN_BREAKS_UNWIND_INFO): Define this macro to indicate
	that all existing GAS versions have a problem with .align inside
	a function.

	* sysdeps/unix/sysv/linux/ia64/pipe.S: There is no need to
	save/restore input-arguments, because they're necessarily
	preserved by the kernel to support syscall-restart.

Index: sysdeps/ia64/memccpy.S
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/ia64/memccpy.S,v
retrieving revision 1.6
diff -u -r1.6 memccpy.S
--- sysdeps/ia64/memccpy.S	9 Sep 2003 20:15:59 -0000	1.6
+++ sysdeps/ia64/memccpy.S	14 Nov 2003 20:05:30 -0000
@@ -52,6 +52,15 @@
 #define loopcnt		r30
 #define	value		r31
 
+#ifdef GAS_ALIGN_BREAKS_UNWIND_INFO
+/* Manually force proper loop-alignment.  Note: be sure to
+   double-check the code-layout after making any changes to
+   this routine! */
+# define ALIGN(n)	{ nop 0 }
+#else
+# define ALIGN(n)	.align n
+#endif
+
 ENTRY(memccpy)
 	.prologue
 	alloc 	r2 = ar.pfs, 4, 40 - 4, 0, 40
@@ -110,7 +119,7 @@
 	mov	ar.ec = MEMLAT + 6 + 1 	// six more passes needed
 	ld8	r[1] = [asrc], 8 	// r[1] = w0
 	cmp.ne	p6, p0 = r0, r0	;;	// clear p6
-	.align	32
+	ALIGN(32)
 .l2:
 (p[0])		ld8.s	r[0] = [asrc], 8		// r[0] = w1
 (p[MEMLAT])	shr.u	tmp1[0] = r[1 + MEMLAT], sh1	// tmp1 = w0 >> sh1
Index: sysdeps/ia64/memcpy.S
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/ia64/memcpy.S,v
retrieving revision 1.10
diff -u -r1.10 memcpy.S
--- sysdeps/ia64/memcpy.S	29 Apr 2003 22:47:19 -0000	1.10
+++ sysdeps/ia64/memcpy.S	14 Nov 2003 20:05:30 -0000
@@ -103,14 +103,22 @@
 #define the_z		z
 #endif
 
+#ifdef GAS_ALIGN_BREAKS_UNWIND_INFO
+/* Manually force proper loop-alignment.  Note: be sure to
+   double-check the code-layout after making any changes to
+   this routine! */
+# define ALIGN(n)	{ nop 0 }
+#else
+# define ALIGN(n)	.align n
+#endif
 
 #if defined(USE_LFETCH)
 #define LOOP(shift)						\
-		.align	32 ;					\
+		ALIGN(32);					\
 .loop##shift##:							\
 { .mmb								\
 (p[0])	ld8.nt1	r[0] = [asrc], 8 ;				\
-(p[0])	lfetch.nt1 [ptr1], 16 ;				\
+(p[0])	lfetch.nt1 [ptr1], 16 ;					\
 	nop.b 0 ;						\
 } { .mib							\
 (p[MEMLAT+1]) st8 [dest] = tmp3, 8 ;				\
@@ -118,7 +126,7 @@
  	nop.b 0 ;;						\
  } { .mmb							\
 (p[0])	ld8.nt1	s[0] = [asrc], 8 ;				\
-(p[0])	lfetch.nt1	[ptr2], 16 ;			\
+(p[0])	lfetch.nt1	[ptr2], 16 ;				\
 	nop.b 0 ;						\
 } { .mib							\
 (p[MEMLAT+1]) st8 [dest] = tmp4, 8 ;				\
@@ -130,7 +138,7 @@
 }
 #else
 #define LOOP(shift)						\
-		.align	32 ;					\
+		ALIGN(32);					\
 .loop##shift##:							\
 { .mmb								\
 (p[0])	ld8.nt1	r[0] = [asrc], 8 ;				\
@@ -254,7 +262,11 @@
 	movi0	ar.lc = loopcnt 	// set the loop counter
 ;; }
 
+#ifdef  GAS_ALIGN_BREAKS_UNWIND_INFO
+	{ nop 0 }
+#else
 	.align	32
+#endif
 #if defined(USE_FLP)
 .l1: // ------------------------------- // L1: Everything a multiple of 8
 { .mmi
Index: sysdeps/ia64/memmove.S
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/ia64/memmove.S,v
retrieving revision 1.6
diff -u -r1.6 memmove.S
--- sysdeps/ia64/memmove.S	29 Apr 2003 22:47:19 -0000	1.6
+++ sysdeps/ia64/memmove.S	14 Nov 2003 20:05:30 -0000
@@ -56,12 +56,18 @@
 #define loopcnt		r30
 #define	value		r31
 
+#ifdef GAS_ALIGN_BREAKS_UNWIND_INFO
+# define ALIGN(n)	{ nop 0 }
+#else
+# define ALIGN(n)	.align n
+#endif
+
 #define LOOP(shift)							\
-		.align	32 ;						\
+		ALIGN(32);						\
 .loop##shift##:								\
 (p[0])		ld8	r[0] = [asrc], 8 ;	/* w1 */		\
 (p[MEMLAT+1])	st8	[dest] = value, 8 ;				\
-(p[MEMLAT])	shrp	value = r[MEMLAT], r[MEMLAT+1], shift ;	\
+(p[MEMLAT])	shrp	value = r[MEMLAT], r[MEMLAT+1], shift ;		\
 		nop.b	0 ;						\
 		nop.b	0 ;						\
 		br.ctop.sptk .loop##shift ;				\
@@ -228,6 +234,10 @@
 (p[MEMLAT])	st1	[dest] = r[MEMLAT], -1
 		br.ctop.dptk .l6
 		br.cond.sptk .restore_and_exit
+END(memmove)
+
+	.rodata
+	.align 8
 .table:
 	data8	0			// dummy entry
 	data8 	.loop56 - .loop8
@@ -238,5 +248,4 @@
 	data8	.loop56 - .loop48
 	data8	.loop56 - .loop56
 
-END(memmove)
 libc_hidden_builtin_def (memmove)
Index: sysdeps/ia64/memset.S
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/ia64/memset.S,v
retrieving revision 1.6
diff -u -r1.6 memset.S
--- sysdeps/ia64/memset.S	29 Apr 2003 22:47:19 -0000	1.6
+++ sysdeps/ia64/memset.S	14 Nov 2003 20:05:30 -0000
@@ -153,7 +153,9 @@
 (p_zr)	br.cond.dptk.many .l1b			// Jump to use stf.spill
 ;; }
 
+#ifndef GAS_ALIGN_BREAKS_UNWIND_INFO
 	.align 32 // -------- //  L1A: store ahead into cache lines; fill later
+#endif
 { .mmi
 	and	tmp = -(LINE_SIZE), cnt		// compute end of range
 	mov	ptr9 = ptr1			// used for prefetching
@@ -222,7 +224,11 @@
 	br.cond.dpnt.many  .move_bytes_from_alignment	// Branch no. 3
 ;; }
 
+#ifdef GAS_ALIGN_BREAKS_UNWIND_INFO
+	{ nop 0 }
+#else
 	.align 32
+#endif
 .l1b:	// ------------------ //  L1B: store ahead into cache lines; fill later
 { .mmi
 	and	tmp = -(LINE_SIZE), cnt		// compute end of range
@@ -283,13 +289,15 @@
 { .mib
 	cmp.eq	p_scr, p0 = loopcnt, r0
 	add	loopcnt = -1, loopcnt
-(p_scr)	br.cond.dpnt.many .store_words
+(p_scr)	br.cond.dpnt.many store_words
 ;; }
 { .mib
 	and	cnt = 0x1f, cnt		// compute the remaining cnt
 	movi0   ar.lc = loopcnt
 ;; }
+#ifndef GAS_ALIGN_BREAKS_UNWIND_INFO
 	.align 32
+#endif
 .l2:	// ---------------------------- //  L2A:  store 32B in 2 cycles
 { .mmb
 	store	[ptr1] = myval, 8
@@ -299,7 +307,7 @@
 	store	[ptr2] = myval, 24
 	br.cloop.dptk.many .l2
 ;; }
-.store_words:
+store_words:
 { .mib
 	cmp.gt	p_scr, p0 = 8, cnt		// just a few bytes left ?
 (p_scr)	br.cond.dpnt.many .move_bytes_from_alignment	// Branch
Index: sysdeps/unix/sysv/linux/ia64/pipe.S
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/unix/sysv/linux/ia64/pipe.S,v
retrieving revision 1.4
diff -u -r1.4 pipe.S
--- sysdeps/unix/sysv/linux/ia64/pipe.S	3 Aug 2002 06:57:51 -0000	1.4
+++ sysdeps/unix/sysv/linux/ia64/pipe.S	14 Nov 2003 20:05:31 -0000
@@ -22,15 +22,14 @@
 #include <sysdep.h>
 
 ENTRY(__pipe)
-       st8 [sp]=r32		// save ptr across system call
+       .regstk 1,0,0,0
        DO_CALL (SYS_ify (pipe))
-       ld8 r2=[sp]
        cmp.ne p6,p0=-1,r10
        ;;
-(p6)   st4 [r2]=r8,4
+(p6)   st4 [in0]=r8,4
 (p6)   mov ret0=0
        ;;
-(p6)   st4 [r2]=r9
+(p6)   st4 [in0]=r9
 (p6)   ret
        br.cond.spnt.few __syscall_error
 PSEUDO_END(__pipe)
Index: sysdeps/unix/sysv/linux/ia64/sysdep.h
===================================================================
RCS file: /cvs/glibc/libc/sysdeps/unix/sysv/linux/ia64/sysdep.h,v
retrieving revision 1.17
diff -u -r1.17 sysdep.h
--- sysdeps/unix/sysv/linux/ia64/sysdep.h	16 Aug 2003 08:00:24 -0000	1.17
+++ sysdeps/unix/sysv/linux/ia64/sysdep.h	14 Nov 2003 20:05:31 -0000
@@ -24,6 +24,13 @@
 #include <sysdeps/unix/sysdep.h>
 #include <sysdeps/ia64/sysdep.h>
 
+/* As of GAS v2.4.90.0.7, including a ".align" directive inside a
+   function will cause bad unwind info to be emitted (GAS doesn't know
+   how to account for the padding introduced by the .align directive).
+   Turning on this macro will work around this bug by introducing the
+   necessary padding explicitly. */
+#define GAS_ALIGN_BREAKS_UNWIND_INFO
+
 /* For Linux we can use the system call table in the header file
 	/usr/include/asm/unistd.h
    of the kernel.  But these symbols do not follow the SYS_* syntax

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: patch to fix unwind info for ia64
  2003-11-15 18:52                                                   ` David Mosberger
  2003-11-19  6:19                                                     ` David Mosberger
@ 2003-11-19 15:25                                                     ` Ulrich Drepper
  1 sibling, 0 replies; 98+ messages in thread
From: Ulrich Drepper @ 2003-11-19 15:25 UTC (permalink / raw)
  To: davidm; +Cc: libc-hacker

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

David Mosberger wrote:
> Could this patch please be applied?  Roland suggested to add a
> configure test, but I'm not sure it's OK to rely on readelf in
> configure and I also think it's total overkill given that it's
> unlikely that the GAS bug will be fixed any time soon (or ever).

I've applied the patch.  Thanks,

- -- 
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/uwnk2ijCOnn/RHQRAim3AJ0UsyVK4HBZlYk6qMarzvF+m5RXyACgtf5n
rJSCVTtws8/a9PXrfLlcBXo=
=dJ08
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 98+ messages in thread

* unwind failures due to __pthread_initialize_minimal
  2003-11-18  8:22                                                         ` Ulrich Drepper
  2003-11-18 16:45                                                           ` David Mosberger
@ 2003-11-19 23:37                                                           ` David Mosberger
  2003-11-19 23:54                                                             ` Ulrich Drepper
  2003-11-26  9:40                                                           ` new syscall stub support for ia64 libc David Mosberger
  2 siblings, 1 reply; 98+ messages in thread
From: David Mosberger @ 2003-11-19 23:37 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: libc-hacker

I'm seeing unwind errors because __pthread_initialize_minimal is being
called through the init-section rather than the init_array mechanism.
Is there a reason why nptl/sysdeps/pthread/pt-initfini.c can't be
re-written to use the constructor/destructor attributes instead?  That
would do the right thing on each platform (i.e., use .init/.fini where
necessary and .init_array/.fini_array where available).  If there is
nothing that prevents this, I'd be willing to create the necessary
patch.
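
For reference, the constructor attribute mentioned above looks like
this in plain C.  This is only an illustration of the GCC extension
with made-up names, not the proposed pt-initfini.c change; depending
on how the toolchain is configured, the function is registered through
.init_array or through an older mechanism.

```c
#include <assert.h>

/* Set by the constructor before main() runs.  */
static int initialized;

/* GCC extension: run this function during program start-up.  */
__attribute__((constructor))
static void example_init(void)
{
  initialized = 1;
}

/* Report whether the constructor has already run.  */
static int is_initialized(void)
{
  return initialized;
}

/* Sanity check, valid once start-up has completed.  */
static void constructor_self_check(void)
{
  assert(is_initialized() == 1);
}
```

The point of the attribute is that the compiler, rather than
hand-written assembly, picks the registration mechanism for the
target.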

	--david

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: unwind failures due to __pthread_initialize_minimal
  2003-11-19 23:37                                                           ` unwind failures due to __pthread_initialize_minimal David Mosberger
@ 2003-11-19 23:54                                                             ` Ulrich Drepper
  2003-11-20  0:30                                                               ` Roland McGrath
  0 siblings, 1 reply; 98+ messages in thread
From: Ulrich Drepper @ 2003-11-19 23:54 UTC (permalink / raw)
  To: davidm; +Cc: libc-hacker

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

David Mosberger wrote:
> I'm seeing unwind errors because __pthread_initialize_minimal is being
> called through the init-section rather than the init_array mechanism.

Just write your own pt-initfini file for ia64.  The requirement is that
the code up to the __pthread_initialize_minimal functions depends on
nothing but basic ld.so functionality.

- -- 
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (GNU/Linux)

iD8DBQE/u/0k2ijCOnn/RHQRAgI1AKC2j9A/7qgByt2/YQmjGzpa+4C54QCePB0a
e5hEmcDhTklydjEOBu3WpSA=
=9kwy
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: unwind failures due to __pthread_initialize_minimal
  2003-11-19 23:54                                                             ` Ulrich Drepper
@ 2003-11-20  0:30                                                               ` Roland McGrath
  2003-11-20  2:35                                                                 ` David Mosberger
  0 siblings, 1 reply; 98+ messages in thread
From: Roland McGrath @ 2003-11-20  0:30 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: davidm, libc-hacker

> Just write your own pt-initfini file for ia64.  The requirement is that
> the code up to the __pthread_initialize_minimal functions depends on
> nothing but basic ld.so functionality.

Remind me what goes wrong if the normal constructor mechanism is used in
libpthread.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: unwind failures due to __pthread_initialize_minimal
  2003-11-20  0:30                                                               ` Roland McGrath
@ 2003-11-20  2:35                                                                 ` David Mosberger
  2003-11-20  4:01                                                                   ` Ulrich Drepper
  0 siblings, 1 reply; 98+ messages in thread
From: David Mosberger @ 2003-11-20  2:35 UTC (permalink / raw)
  To: Ulrich Drepper, Roland McGrath; +Cc: davidm, libc-hacker

>>>>> On Wed, 19 Nov 2003 15:54:49 -0800, Roland McGrath <roland@redhat.com> said:

  >> Just write your own pt-initfini file for ia64.  The requirement
  >> is that the code up to the __pthread_initialize_minimal functions
  >> depends on nothing but basic ld.so functionality.

  Roland> Remind me what goes wrong if the normal constructor
  Roland> mechanism is used in libpthread.

The problem is that the final function is constructed only at
link-time, so the assembler cannot emit the proper
function-length-information needed for the unwind-info.

>>>>> On Wed, 19 Nov 2003 15:30:44 -0800, Ulrich Drepper <drepper@redhat.com> said:

  Uli> Just write your own pt-initfini file for ia64.  The requirement
  Uli> is that the code up to the __pthread_initialize_minimal
  Uli> functions depends on nothing but basic ld.so functionality.

Ah, in fact there is already an ia64-specific version.  I missed it at
first.  Turns out the problem really was worse: not only was it using
.init, but the unwind directives for both _init and _fini were
missing!  With the proper unwind info for _init and _fini, the old
code was lucky enough to work fine, but I still think it's better to
switch to .init_array both for cleanliness and consistency reasons.
LinuxThreads had the same problem.  Patch to fix both is attached.
Please apply.

Thanks,

	--david

linuxthreads/ChangeLog

2003-11-19 David Mosberger  <davidm@hpl.hp.com>

	* sysdeps/unix/sysv/linux/ia64/pt-initfini.c (INIT_NEW_WAY): New
	macro.
	(INIT_OLD_WAY): Likewise.  Define these macros depending on
	whether or not HAVE_INITFINI_ARRAY is defined.  If it is, use
	.init_array to invoke __pthread_initialize_minimal_internal.
	Also, add proper unwind-directives for _init and _fini.

nptl/ChangeLog

2003-11-19 David Mosberger  <davidm@hpl.hp.com>

	* sysdeps/unix/sysv/linux/ia64/pt-initfini.c (INIT_NEW_WAY): New
	macro.
	(INIT_OLD_WAY): Likewise.  Define these macros depending on
	whether or not HAVE_INITFINI_ARRAY is defined.  If it is, use
	.init_array to invoke __pthread_initialize_minimal_internal.
	Also, add proper unwind-directives for _init and _fini.

Index: linuxthreads/sysdeps/unix/sysv/linux/ia64/pt-initfini.c
===================================================================
RCS file: /cvs/glibc/libc/linuxthreads/sysdeps/unix/sysv/linux/ia64/pt-initfini.c,v
retrieving revision 1.5
diff -u -r1.5 pt-initfini.c
--- linuxthreads/sysdeps/unix/sysv/linux/ia64/pt-initfini.c	14 Nov 2002 10:49:22 -0000	1.5
+++ linuxthreads/sysdeps/unix/sysv/linux/ia64/pt-initfini.c	20 Nov 2003 00:24:14 -0000
@@ -1,5 +1,5 @@
 /* Special .init and .fini section support for ia64. LinuxThreads version.
-   Copyright (C) 2000, 2001, 2002 Free Software Foundation, Inc.
+   Copyright (C) 2000, 2001, 2002, 2003 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
 
    The GNU C Library is free software; you can redistribute it
@@ -36,34 +36,51 @@
    * crtn.s puts the corresponding function epilogues
    in the .init and .fini sections. */
 
+#include <stddef.h>
+
+#ifdef HAVE_INITFINI_ARRAY
+
+# define INIT_NEW_WAY \
+    ".xdata8 \".init_array\", @fptr(__pthread_initialize_minimal_internal)\n"
+# define INIT_OLD_WAY ""
+#else
+# define INIT_NEW_WAY ""
+# define INIT_OLD_WAY \
+	"\n\
+	st8 [r12] = gp, -16\n\
+	br.call.sptk.many b0 = __pthread_initialize_minimal_internal# ;;\n\
+	;;\n\
+	adds r12 = 16, r12\n\
+	;;\n\
+	ld8 gp = [r12]\n\
+	;;\n"
+#endif
+
 __asm__ ("\n\
 \n\
 #include \"defs.h\"\n\
 \n\
 /*@HEADER_ENDS*/\n\
 \n\
-/*@_init_PROLOG_BEGINS*/\n\
-	.section .init\n\
+/*@_init_PROLOG_BEGINS*/\n"
+	INIT_NEW_WAY
+	".section .init\n\
 	.align 16\n\
 	.global _init#\n\
 	.proc _init#\n\
 _init:\n\
+	.prologue\n\
+	.save ar.pfs, r34\n\
 	alloc r34 = ar.pfs, 0, 3, 0, 0\n\
+	.vframe r32\n\
 	mov r32 = r12\n\
+	.save rp, r33\n\
 	mov r33 = b0\n\
+	.body\n\
 	adds r12 = -16, r12\n\
-	;;\n\
-/* we could use r35 to save gp, but we use the stack since that's what\n\
- * all the other init routines will do --davidm 00/04/05 */\n\
-	st8 [r12] = gp, -16\n\
-	br.call.sptk.many b0 = __pthread_initialize_minimal# ;;\n\
-	;;\n\
-	adds r12 = 16, r12\n\
-	;;\n\
-	ld8 gp = [r12]\n\
-	;;\n\
-	.align 16\n\
-	.endp _init#\n\
+	;;\n"
+	INIT_OLD_WAY
+	".endp _init#\n\
 \n\
 /*@_init_PROLOG_ENDS*/\n\
 \n\
@@ -83,12 +100,16 @@
 	.global _fini#\n\
 	.proc _fini#\n\
 _fini:\n\
+	.prologue\n\
+	.save ar.pfs, r34\n\
 	alloc r34 = ar.pfs, 0, 3, 0, 0\n\
+	.vframe r32\n\
 	mov r32 = r12\n\
+	.save rp, r33\n\
 	mov r33 = b0\n\
+	.body\n\
 	adds r12 = -16, r12\n\
 	;;\n\
-	.align 16\n\
 	.endp _fini#\n\
 \n\
 /*@_fini_PROLOG_ENDS*/\n\
Index: nptl/sysdeps/unix/sysv/linux/ia64/pt-initfini.c
===================================================================
RCS file: /cvs/glibc/libc/nptl/sysdeps/unix/sysv/linux/ia64/pt-initfini.c,v
retrieving revision 1.1
diff -u -r1.1 pt-initfini.c
--- nptl/sysdeps/unix/sysv/linux/ia64/pt-initfini.c	11 Mar 2003 09:20:41 -0000	1.1
+++ nptl/sysdeps/unix/sysv/linux/ia64/pt-initfini.c	20 Nov 2003 00:24:15 -0000
@@ -36,34 +36,51 @@
    * crtn.s puts the corresponding function epilogues
    in the .init and .fini sections. */
 
+#include <stddef.h>
+
+#ifdef HAVE_INITFINI_ARRAY
+
+# define INIT_NEW_WAY \
+    ".xdata8 \".init_array\", @fptr(__pthread_initialize_minimal_internal)\n"
+# define INIT_OLD_WAY ""
+#else
+# define INIT_NEW_WAY ""
+# define INIT_OLD_WAY \
+	"\n\
+	st8 [r12] = gp, -16\n\
+	br.call.sptk.many b0 = __pthread_initialize_minimal_internal# ;;\n\
+	;;\n\
+	adds r12 = 16, r12\n\
+	;;\n\
+	ld8 gp = [r12]\n\
+	;;\n"
+#endif
+
 __asm__ ("\n\
 \n\
 #include \"defs.h\"\n\
 \n\
 /*@HEADER_ENDS*/\n\
 \n\
-/*@_init_PROLOG_BEGINS*/\n\
-	.section .init\n\
+/*@_init_PROLOG_BEGINS*/\n"
+	INIT_NEW_WAY
+	".section .init\n\
 	.align 16\n\
 	.global _init#\n\
 	.proc _init#\n\
 _init:\n\
+	.prologue\n\
+	.save ar.pfs, r34\n\
 	alloc r34 = ar.pfs, 0, 3, 0, 0\n\
+	.vframe r32\n\
 	mov r32 = r12\n\
+	.save rp, r33\n\
 	mov r33 = b0\n\
+	.body\n\
 	adds r12 = -16, r12\n\
-	;;\n\
-/* we could use r35 to save gp, but we use the stack since that's what\n\
- * all the other init routines will do --davidm 00/04/05 */\n\
-	st8 [r12] = gp, -16\n\
-	br.call.sptk.many b0 = __pthread_initialize_minimal_internal# ;;\n\
-	;;\n\
-	adds r12 = 16, r12\n\
-	;;\n\
-	ld8 gp = [r12]\n\
-	;;\n\
-	.align 16\n\
-	.endp _init#\n\
+	;;\n"
+	INIT_OLD_WAY
+	".endp _init#\n\
 \n\
 /*@_init_PROLOG_ENDS*/\n\
 \n\
@@ -83,12 +100,16 @@
 	.global _fini#\n\
 	.proc _fini#\n\
 _fini:\n\
+	.prologue\n\
+	.save ar.pfs, r34\n\
 	alloc r34 = ar.pfs, 0, 3, 0, 0\n\
+	.vframe r32\n\
 	mov r32 = r12\n\
+	.save rp, r33\n\
 	mov r33 = b0\n\
+	.body\n\
 	adds r12 = -16, r12\n\
 	;;\n\
-	.align 16\n\
 	.endp _fini#\n\
 \n\
 /*@_fini_PROLOG_ENDS*/\n\


* Re: unwind failures due to __pthread_initialize_minimal
  2003-11-20  2:35                                                                 ` David Mosberger
@ 2003-11-20  4:01                                                                   ` Ulrich Drepper
  2003-11-20 21:20                                                                     ` David Mosberger
  0 siblings, 1 reply; 98+ messages in thread
From: Ulrich Drepper @ 2003-11-20  4:01 UTC (permalink / raw)
  To: davidm; +Cc: Roland McGrath, libc-hacker

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

David Mosberger wrote:

> Ah, in fact there is already an ia64-specific version.  I missed it at
> first.  Turns out the problem really was worse: not only was it using
> .init, but the unwind directives for both _init and _fini were
> missing!  With the proper unwind info for _init and _fini, the old
> code was lucky enough to work fine, but I still think it's better to
> switch to .init_array both for cleanliness and consistency reasons.
> LinuxThreads had the same problem.  Patch to fix both is attached.

Mostly OK.  But there still is a .init section and a DT_INIT entry.  The
latter must go.  I tried to simply remove the code for the .init section
but this causes some tests to fail.

- -- 
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (GNU/Linux)

iD8DBQE/vCYh2ijCOnn/RHQRAi6bAJ9WwaTAGxJiqiio3+/W6sPIvv5cIgCeNVy6
lwmB3p/+lDBOSJVsPDCuSk4=
=NY5F
-----END PGP SIGNATURE-----


* Re: unwind failures due to __pthread_initialize_minimal
  2003-11-20  4:01                                                                   ` Ulrich Drepper
@ 2003-11-20 21:20                                                                     ` David Mosberger
  2003-12-07  1:46                                                                       ` Ulrich Drepper
  0 siblings, 1 reply; 98+ messages in thread
From: David Mosberger @ 2003-11-20 21:20 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: davidm, Roland McGrath, libc-hacker

>>>>> On Wed, 19 Nov 2003 18:25:37 -0800, Ulrich Drepper <drepper@redhat.com> said:

  Uli> David Mosberger wrote:

  >> Ah, in fact there is already an ia64-specific version.  I missed
  >> it at first.  Turns out the problem really was worse: not only
  >> was it using .init, but the unwind directives for both _init and
  >> _fini were missing!  With the proper unwind info for _init and
  >> _fini, the old code was lucky enough to work fine, but I still
  >> think it's better to switch to .init_array both for cleanliness
  >> and consistency reasons.  LinuxThreads had the same problem.
  >> Patch to fix both is attached.

  Uli> Mostly OK.  But there still is a .init section and a DT_INIT
  Uli> entry.  The latter must go.  I tried to simply remove the code
  Uli> for the .init section but this causes some tests to fail.

I don't have the complete picture of how pt-initfini.c is being used,
but if the goal is simply (a) to ensure that
__pthread_initialize_minimal_internal gets called first and (b) to
enable other code to specify its own constructors/destructors, then
I think the attached patch might work.  It certainly would be much
cleaner this way and it does get rid of the DT_INIT and DT_FINI
entries.

The patch seems to work fine for me (same "make check" results as
before).  I didn't update the linuxthreads version yet, but if this
patch looks fine, that's trivial to do.
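
As a rough illustration of the .init_array approach (not part of the
patch): the sketch below assumes a GCC-compatible toolchain built with
.init_array support, where functions marked __attribute__((constructor))
are registered in .init_array and run by the loader before main().  The
names early_init, early_init_ran and require_early_init are
hypothetical; early_init merely stands in for
__pthread_initialize_minimal_internal.

```c
#include <assert.h>

/* Set by the constructor; stands in for whatever state the real
   early initializer would set up.  */
static int initialized;

/* Hypothetical early initializer.  On toolchains with .init_array
   support, GCC emits a pointer to this function into .init_array
   rather than calling it from the body of _init.  */
__attribute__ ((constructor))
static void
early_init (void)
{
  initialized = 1;
}

/* Returns nonzero once the constructor has run.  */
int
early_init_ran (void)
{
  return initialized;
}

/* Aborts unless the constructor already ran; meant to be called
   from main() or later.  */
void
require_early_init (void)
{
  assert (initialized == 1);
}
```

Called from main(), early_init_ran() should return 1, because the
loader processes .init_array before control reaches the program.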

	--david

Index: nptl/sysdeps/unix/sysv/linux/ia64/pt-initfini.c
===================================================================
RCS file: /cvs/glibc/libc/nptl/sysdeps/unix/sysv/linux/ia64/pt-initfini.c,v
retrieving revision 1.1
diff -u -r1.1 pt-initfini.c
--- nptl/sysdeps/unix/sysv/linux/ia64/pt-initfini.c	11 Mar 2003 09:20:41 -0000	1.1
+++ nptl/sysdeps/unix/sysv/linux/ia64/pt-initfini.c	20 Nov 2003 04:01:06 -0000
@@ -36,6 +36,22 @@
    * crtn.s puts the corresponding function epilogues
    in the .init and .fini sections. */
 
+#include <stddef.h>
+
+#ifdef HAVE_INITFINI_ARRAY
+
+__asm__ ("\n\
+#include \"defs.h\"\n\
+\n\
+/*@HEADER_ENDS*/\n\
+\n\
+/*@_init_PROLOG_BEGINS*/\n\
+	.xdata8 \".init_array\",@fptr(__pthread_initialize_minimal_internal)\n\
+/*@_init_PROLOG_ENDS*/\n\
+");
+
+#else
+
 __asm__ ("\n\
 \n\
 #include \"defs.h\"\n\
@@ -48,13 +64,16 @@
 	.global _init#\n\
 	.proc _init#\n\
 _init:\n\
+	.prologue\n\
+	.save ar.pfs, r34\n\
 	alloc r34 = ar.pfs, 0, 3, 0, 0\n\
+	.vframe r32\n\
 	mov r32 = r12\n\
+	.save rp, r33\n\
 	mov r33 = b0\n\
+	.body\n\
 	adds r12 = -16, r12\n\
 	;;\n\
-/* we could use r35 to save gp, but we use the stack since that's what\n\
- * all the other init routines will do --davidm 00/04/05 */\n\
 	st8 [r12] = gp, -16\n\
 	br.call.sptk.many b0 = __pthread_initialize_minimal_internal# ;;\n\
 	;;\n\
@@ -62,7 +81,6 @@
 	;;\n\
 	ld8 gp = [r12]\n\
 	;;\n\
-	.align 16\n\
 	.endp _init#\n\
 \n\
 /*@_init_PROLOG_ENDS*/\n\
@@ -83,12 +101,16 @@
 	.global _fini#\n\
 	.proc _fini#\n\
 _fini:\n\
+	.prologue\n\
+	.save ar.pfs, r34\n\
 	alloc r34 = ar.pfs, 0, 3, 0, 0\n\
+	.vframe r32\n\
 	mov r32 = r12\n\
+	.save rp, r33\n\
 	mov r33 = b0\n\
+	.body\n\
 	adds r12 = -16, r12\n\
 	;;\n\
-	.align 16\n\
 	.endp _fini#\n\
 \n\
 /*@_fini_PROLOG_ENDS*/\n\
@@ -106,3 +128,5 @@
 /*@TRAILER_BEGINS*/\n\
 	.weak	__gmon_start__#\n\
 ");
+
+#endif


* Re: new syscall stub support for ia64 libc
  2003-11-18  8:22                                                         ` Ulrich Drepper
  2003-11-18 16:45                                                           ` David Mosberger
  2003-11-19 23:37                                                           ` unwind failures due to __pthread_initialize_minimal David Mosberger
@ 2003-11-26  9:40                                                           ` David Mosberger
  2003-12-03  7:25                                                             ` David Mosberger
  2 siblings, 1 reply; 98+ messages in thread
From: David Mosberger @ 2003-11-26  9:40 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: davidm, Jakub Jelinek, libc-hacker

Just a quick status update: the new-syscall-stub-enabled libc now
passes all checks (apart from a previously existing locale-related
failure) and it all seems to work nice and stable.  The NPTL test
suite combined with the new syscall stubs turned out to be a really
tough test case, because it stressed lots of things in ways that
weren't possible before.  As a result, bugs got fixed in libc, gcc (one
still pending), libunwind, and the kernel.

The only outstanding issue now is that of libgcc_eh.a.  The problem is
that since libgcc_eh.a references the unwinder, you'll have to link in
-lunwind when linking in libgcc_eh.a.  My understanding is that
libgcc_eh.a should only be linked into statically-linked applications.
When doing so, linking in libunwind.a seems reasonable enough.
However, at the moment, the libc build environment seems to sometimes
link libgcc_eh.a even into dynamically linked binaries and in that
case, you really don't want to have to link in libunwind.a.  For now,
I hacked around the problem by putting the libunwind.a members into
libgcc_eh.a, but that's of course not a clean solution.

A related issue: when building a statically linked binary, shouldn't
the NPTL unwinder also use the static libgcc_eh.a facilities, rather
than loading libgcc_s.so at runtime?  Doing the latter can result in
rather difficult to track down bugs when there is a version conflict
between libgcc_eh.a and libgcc_s.so.

Comments?

	--david


* Re: new syscall stub support for ia64 libc
  2003-11-26  9:40                                                           ` new syscall stub support for ia64 libc David Mosberger
@ 2003-12-03  7:25                                                             ` David Mosberger
  2003-12-08 18:16                                                               ` Jakub Jelinek
  2003-12-10 23:22                                                               ` Ulrich Drepper
  0 siblings, 2 replies; 98+ messages in thread
From: David Mosberger @ 2003-12-03  7:25 UTC (permalink / raw)
  To: Ulrich Drepper, Jakub Jelinek, libc-hacker

Executive Summary:

 Please apply this patch.  It's Good.

Long version:

The patch below adds new syscall support for ia64 Linux.  Compared to
the earlier versions, it has a new autoconf test which ensures that
USE_DL_SYSINFO only gets defined if the compiler uses an unwinder that
is based on libunwind.  As explained earlier, the built-in unwinder
for GCC is hopeless and so there is no point trying to support it.

I tested these 4 configurations:

 (1) NPTL with a libunwind-enabled GCC
	(gcc 3.3.2 with libunwind fixes, kernel v2.6.0-test11, libunwind v0.95)
 (2) linuxthreads with a libunwind-enabled GCC
	(gcc 3.3.2 with libunwind fixes, kernel v2.6.0-test11, libunwind v0.95)
 (3) NPTL without a libunwind-enabled GCC
	(gcc 3.3.2 from Debian/unstable, kernel v2.6.0-test11)
 (4) linuxthreads without a libunwind-enabled GCC
	(gcc 3.3.2 from Debian/unstable, kernel v2.6.0-test11)

The linuxthreads versions (configs 2 and 4) have no failures (apart
from the normal tst-numeric failure, which is due to some locale files
that aren't installed on my machine).

NPTL with libunwind-enabled GCC (config 1) likewise has no failures.

NPTL *without* libunwind-enabled GCC (config 3) has several failures
(cancel{6,17}, cancelx{4,5,6,16,17,18}, oncex4), but the behavior is
identical to that of stock CVS libc (as of today), provided the
kernel does _not_ pass the gate DSO address via AT_SYSINFO_EHDR.  If
the kernel _does_ pass this address, then there are a couple of
additional failures (e.g., cancel2 and cancel3 would also fail).
These are due to the fact that the GCC built-in unwinder can only
unwind across the signal trampoline if there is _no_ unwind info for
the trampoline.  Since AT_SYSINFO_EHDR registers the unwind-info for
the signal trampoline and the built-in unwinder is not capable of
properly handling this info (e.g., it ignores the UNWABI directive),
this causes the additional failures.  If somebody _really_ cared, this
particular issue could be fixed relatively easily (e.g., if the Linux
sigtramp UNWABI directive is found, MD_FALLBACK_FRAME_STATE_FOR()
could be applied to step over the signal trampoline).  However, given
all the other problems with the built-in unwinder, I suspect it's just
not worth bothering with.
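
For anyone who wants to check whether their kernel passes the gate DSO
address, here is a minimal sketch.  It assumes a Linux system whose
libc provides getauxval() in <sys/auxv.h> (glibc 2.16 and later, so not
the libc under discussion here); probe_sysinfo_dso is a hypothetical
helper name.

```c
#include <elf.h>       /* AT_SYSINFO_EHDR, ELFMAG, SELFMAG */
#include <string.h>    /* memcmp */
#include <sys/auxv.h>  /* getauxval */

/* Returns 1 if the kernel advertised a DSO via AT_SYSINFO_EHDR and
   the address points at an ELF header, 0 if no such entry was
   passed, and -1 if the entry does not look like an ELF image.  */
int
probe_sysinfo_dso (void)
{
  unsigned long addr = getauxval (AT_SYSINFO_EHDR);

  if (addr == 0)
    return 0;
  return memcmp ((const void *) addr, ELFMAG, SELFMAG) == 0 ? 1 : -1;
}
```

A return value of 0 corresponds to the "kernel does _not_ pass the
address" case described above.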

	--david

-------------------------------------------------------------------
ChangeLog

2003-12-02  David Mosberger  <davidm@hpl.hp.com>

	* sysdeps/ia64/elf/initfini.c: Add missing unwind directives.

	* sysdeps/ia64/dl-machine.h (elf_machine_matches_host): Mark with
	attribute "unused".
	(elf_machine_dynamic): Mark with attributes "unused" and "const".
	(elf_machine_runtime_setup): Likewise.

	* sysdeps/generic/dl-fptr.c (make_fptr_table): Mark with
	attribute "always_inline".
	* sysdeps/ia64/dl-machine.h (__ia64_init_bootstrap_fdesc_table):
	Likewise.

	* configure.in: Check whether compiler has libunwind support.

	* config.make.in (have-cc-with-libunwind): New variable.

	* config.h.in (HAVE_CC_WITH_LIBUNWIND): New macro.

	* Makeconfig (gnulib): If have-cc-with-libunwind is "yes", also
	mention -lunwind.

2003-11-12  David Mosberger  <davidm@hpl.hp.com>

	* sysdeps/unix/sysv/linux/ia64/vfork.S: Use DO_CALL_VIA_BREAK()
	instead of DO_CALL().

	* sysdeps/unix/sysv/linux/ia64/brk.S (__curbrk): Restructure it
	to take advantage of DO_CALL() macro.
	* sysdeps/unix/sysv/linux/ia64/setcontext.S: Ditto.
	* sysdeps/unix/sysv/linux/ia64/getcontext.S: Ditto.

	* elf/rtld.c (dl_main): Restrict dl_sysinfo_dso check to first
	program header.  On ia64, the check failed previously because
	there are two program headers.
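
The intent of the elf/rtld.c entry can be sketched in isolation as
follows.  This is only an illustration under stated assumptions, not
the rtld code: it uses Elf64_Phdr from <elf.h>, and the helper names
first_load_matches and demo_two_load_segments, as well as the
addresses, are hypothetical.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the first PT_LOAD entry in PHDR[0..N) has a p_vaddr
   equal to BASE, and 0 otherwise (including when there is no PT_LOAD
   entry).  Later PT_LOAD entries are deliberately ignored.  */
int
first_load_matches (const Elf64_Phdr *phdr, size_t n, uintptr_t base)
{
  for (size_t i = 0; i < n; ++i)
    if (phdr[i].p_type == PT_LOAD)
      return phdr[i].p_vaddr == base;
  return 0;
}

/* Hypothetical table with two loadable segments: only the first one
   starts at the DSO base address, so checking every PT_LOAD entry
   against the base would fail.  */
int
demo_two_load_segments (void)
{
  const uintptr_t base = 0x10000;
  Elf64_Phdr phdr[2] = {
    { .p_type = PT_LOAD, .p_vaddr = base },
    { .p_type = PT_LOAD, .p_vaddr = base + 0x4000 },
  };

  return first_load_matches (phdr, 2, base);
}
```

With the two-segment table, demo_two_load_segments() returns 1 because
only the first loadable segment is compared against the base.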

-------------------------------------------------------------------
linuxthreads/ChangeLog

2003-11-19  David Mosberger  <davidm@hpl.hp.com>

	* sysdeps/unix/sysv/linux/ia64/dl-sysdep.h: New file.

	* sysdeps/unix/sysv/linux/ia64/pt-initfini.c (INIT_NEW_WAY): New
	macro.
	(INIT_OLD_WAY): Likewise.  Define these macros depending on
	whether or not HAVE_INITFINI_ARRAY is defined.  If it is, use
	.init_array to invoke __pthread_initialize_minimal.  Also, add
	proper unwind-directives for _init and _fini.

-------------------------------------------------------------------
nptl/ChangeLog

2003-12-02  David Mosberger  <davidm@hpl.hp.com>

	* Makefile (link-libc-static): Remove -lgcc_eh---it's already mentioned
	in $(gnulib).  Also, remove stale comment.

2003-11-19  David Mosberger  <davidm@hpl.hp.com>

	* sysdeps/unix/sysv/linux/ia64/pt-initfini.c (INIT_NEW_WAY): New
	macro.
	(INIT_OLD_WAY): Likewise.  Define these macros depending on
	whether or not HAVE_INITFINI_ARRAY is defined.  If it is, use
	.init_array to invoke __pthread_initialize_minimal_internal.
	Also, add proper unwind-directives for _init and _fini.

2003-11-12  David Mosberger  <davidm@hpl.hp.com>

	* sysdeps/unix/sysv/linux/ia64/sysdep-cancel.h (PSEUDO): Take
	advantage of new syscall stub and optimize accordingly.

	* sysdeps/unix/sysv/linux/ia64/lowlevellock.h (__NR_futex): Rename
	from SYS_futex, to match expectations of
	sysdep.h:DO_INLINE_SYSCALL.
	(lll_futex_clobbers): Remove.
	(lll_futex_timed_wait): Rewrite in terms of DO_INLINE_SYSCALL.
	(lll_futex_wake): Ditto.
	(lll_futex_requeue): Ditto.
	(__lll_mutex_trylock): Rewrite to a macro, so we can include this
	file before DO_INLINE_SYSCALL is defined (proposed by Jakub
	Jelinek).
	(__lll_mutex_lock): Ditto.
	(__lll_mutex_cond_lock): Ditto.
	(__lll_mutex_timed_lock): Ditto.
	(__lll_mutex_unlock): Ditto.
	(__lll_mutex_unlock_force): Ditto.

	* sysdeps/pthread/createthread.c (create_thread): Use
	THREAD_SELF_SYSINFO and THREAD_SYSINFO instead of open code.

	* sysdeps/ia64/tls.h: Move declaration of __thread_self up so it
	comes before the include of <sysdep.h>.
	(THREAD_SELF_SYSINFO): New macro.
	(THREAD_SYSINFO): Ditto.
	(INIT_SYSINFO): New macro.
	(TLS_INIT_TP): Call INIT_SYSINFO.

	* sysdeps/ia64/tcb-offsets.sym: Add SYSINFO_OFFSET.

	* allocatestack.c (allocate_stack): Use THREAD_SYSINFO and
	THREAD_SELF_SYSINFO instead of open code.

	* sysdeps/unix/sysv/linux/ia64/dl-sysdep.h: New file.

	* sysdeps/i386/tls.h (THREAD_SELF_SYSINFO): New macro.
	(THREAD_SYSINFO): Ditto.

-------------------------------------------------------------------
Index: Makeconfig
--- Makeconfig
+++ Makeconfig
@@ -511,7 +511,11 @@
 link-extra-libs-bounded = $(foreach lib,$(LDLIBS-$(@F:%-bp=%)),$(common-objpfx)$(lib)_b.a)
 
 ifndef gnulib
-gnulib := -lgcc -lgcc_eh
+ifneq ($(have-cc-with-libunwind),yes)
+ gnulib := -lgcc -lgcc_eh
+else
+ gnulib := -lgcc -lgcc_eh -lunwind
+endif
 endif
 ifeq ($(elf),yes)
 +preinit = $(addprefix $(csu-objpfx),crti.o)
Index: config.h.in
--- config.h.in
+++ config.h.in
@@ -153,6 +153,9 @@
    sections.  */
 #undef	HAVE_INITFINI_ARRAY
 
+/* Define if the compiler's exception support is based on libunwind.  */
+#undef	HAVE_CC_WITH_LIBUNWIND
+
 /* Define if the access to static and hidden variables is position independent
    and does not need relocations.  */
 #undef	PI_STATIC_AND_HIDDEN
Index: config.make.in
--- config.make.in
+++ config.make.in
@@ -55,6 +55,7 @@
 enable-check-abi = @enable_check_abi@
 have-forced-unwind = @libc_cv_forced_unwind@
 have-fpie = @libc_cv_fpie@
+have-cc-with-libunwind = @libc_cv_cc_with_libunwind@
 fno-unit-at-a-time = @fno_unit_at_a_time@
 
 static-libgcc = @libc_cv_gcc_static_libgcc@
Index: configure.in
--- configure.in
+++ configure.in
@@ -1219,6 +1219,19 @@
     AC_DEFINE(HAVE_INITFINI_ARRAY)
   fi
 
+  AC_CACHE_CHECK(for libunwind-support in compiler,
+		 libc_cv_cc_with_libunwind, [dnl
+    AC_TRY_LINK([#include <libunwind.h>], [
+      unw_context_t uc;
+      unw_cursor_t c;
+      unw_getcontext (&uc);
+      unw_init_local (&c, &uc)],
+        libc_cv_cc_with_libunwind=yes, libc_cv_cc_with_libunwind=no)])
+  AC_SUBST(libc_cv_cc_with_libunwind)
+  if test $libc_cv_cc_with_libunwind = yes; then
+    AC_DEFINE(HAVE_CC_WITH_LIBUNWIND)
+  fi
+
   AC_CACHE_CHECK(for -z nodelete option,
 		 libc_cv_z_nodelete, [dnl
   cat > conftest.c <<EOF
Index: elf/rtld.c
--- elf/rtld.c
+++ elf/rtld.c
@@ -1163,6 +1163,9 @@
       if (__builtin_expect (l != NULL, 1))
 	{
 	  static ElfW(Dyn) dyn_temp[DL_RO_DYN_TEMP_CNT];
+#ifndef NDEBUG
+	  uint_fast16_t pt_load_num = 0;
+#endif
 
 	  l->l_phdr = ((const void *) GL(dl_sysinfo_dso)
 		       + GL(dl_sysinfo_dso)->e_phoff);
@@ -1176,8 +1179,14 @@
 		  l->l_ldnum = ph->p_memsz / sizeof (ElfW(Dyn));
 		  break;
 		}
+#ifndef NDEBUG
 	      if (ph->p_type == PT_LOAD)
-		assert ((void *) ph->p_vaddr == GL(dl_sysinfo_dso));
+		{
+		  assert (pt_load_num
+			  || (void *) ph->p_vaddr == GL(dl_sysinfo_dso));
+		  pt_load_num++;
+		}
+#endif
 	    }
 	  elf_get_dynamic_info (l, dyn_temp);
 	  _dl_setup_hash (l);
Index: linuxthreads/sysdeps/unix/sysv/linux/ia64/dl-sysdep.h
--- /dev/null
+++ linuxthreads/sysdeps/unix/sysv/linux/ia64/dl-sysdep.h
@@ -0,0 +1,45 @@
+/* System-specific settings for dynamic linker code.  IA-64 version.
+   Copyright (C) 2003 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+   02111-1307 USA.  */
+
+#ifndef _DL_SYSDEP_H
+#define _DL_SYSDEP_H	1
+
+#define NEED_DL_SYSINFO	1
+#undef USE_DL_SYSINFO
+
+#if defined NEED_DL_SYSINFO && !defined __ASSEMBLER__
+/* Don't declare this as a function---we want its entry-point, not
+   its function descriptor... */
+extern int _dl_sysinfo_break attribute_hidden;
+# define DL_SYSINFO_DEFAULT ((uintptr_t) &_dl_sysinfo_break)
+# define DL_SYSINFO_IMPLEMENTATION		\
+  asm (".text\n\t"				\
+       ".hidden _dl_sysinfo_break\n\t"		\
+       ".proc _dl_sysinfo_break\n\t"		\
+       "_dl_sysinfo_break:\n\t"			\
+       ".prologue\n\t"				\
+       ".altrp b6\n\t"				\
+       ".body\n\t"				\
+       "break 0x100000;\n\t"			\
+       "br.ret.sptk.many b6;\n\t"		\
+       ".endp _dl_sysinfo_break\n\t"		\
+       ".previous");
+#endif
+
+#endif	/* dl-sysdep.h */
Index: linuxthreads/sysdeps/unix/sysv/linux/ia64/pt-initfini.c
--- linuxthreads/sysdeps/unix/sysv/linux/ia64/pt-initfini.c
+++ linuxthreads/sysdeps/unix/sysv/linux/ia64/pt-initfini.c
@@ -1,5 +1,5 @@
 /* Special .init and .fini section support for ia64. LinuxThreads version.
-   Copyright (C) 2000, 2001, 2002 Free Software Foundation, Inc.
+   Copyright (C) 2000, 2001, 2002, 2003 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
 
    The GNU C Library is free software; you can redistribute it
@@ -36,34 +36,51 @@
    * crtn.s puts the corresponding function epilogues
    in the .init and .fini sections. */
 
+#include <stddef.h>
+
+#ifdef HAVE_INITFINI_ARRAY
+
+# define INIT_NEW_WAY \
+    ".xdata8 \".init_array\", @fptr(__pthread_initialize_minimal)\n"
+# define INIT_OLD_WAY ""
+#else
+# define INIT_NEW_WAY ""
+# define INIT_OLD_WAY \
+	"\n\
+	st8 [r12] = gp, -16\n\
+	br.call.sptk.many b0 = __pthread_initialize_minimal# ;;\n\
+	;;\n\
+	adds r12 = 16, r12\n\
+	;;\n\
+	ld8 gp = [r12]\n\
+	;;\n"
+#endif
+
 __asm__ ("\n\
 \n\
 #include \"defs.h\"\n\
 \n\
 /*@HEADER_ENDS*/\n\
 \n\
-/*@_init_PROLOG_BEGINS*/\n\
-	.section .init\n\
+/*@_init_PROLOG_BEGINS*/\n"
+	INIT_NEW_WAY
+	".section .init\n\
 	.align 16\n\
 	.global _init#\n\
 	.proc _init#\n\
 _init:\n\
+	.prologue\n\
+	.save ar.pfs, r34\n\
 	alloc r34 = ar.pfs, 0, 3, 0, 0\n\
+	.vframe r32\n\
 	mov r32 = r12\n\
+	.save rp, r33\n\
 	mov r33 = b0\n\
+	.body\n\
 	adds r12 = -16, r12\n\
-	;;\n\
-/* we could use r35 to save gp, but we use the stack since that's what\n\
- * all the other init routines will do --davidm 00/04/05 */\n\
-	st8 [r12] = gp, -16\n\
-	br.call.sptk.many b0 = __pthread_initialize_minimal# ;;\n\
-	;;\n\
-	adds r12 = 16, r12\n\
-	;;\n\
-	ld8 gp = [r12]\n\
-	;;\n\
-	.align 16\n\
-	.endp _init#\n\
+	;;\n"
+	INIT_OLD_WAY
+	".endp _init#\n\
 \n\
 /*@_init_PROLOG_ENDS*/\n\
 \n\
@@ -83,12 +100,16 @@
 	.global _fini#\n\
 	.proc _fini#\n\
 _fini:\n\
+	.prologue\n\
+	.save ar.pfs, r34\n\
 	alloc r34 = ar.pfs, 0, 3, 0, 0\n\
+	.vframe r32\n\
 	mov r32 = r12\n\
+	.save rp, r33\n\
 	mov r33 = b0\n\
+	.body\n\
 	adds r12 = -16, r12\n\
 	;;\n\
-	.align 16\n\
 	.endp _fini#\n\
 \n\
 /*@_fini_PROLOG_ENDS*/\n\
Index: nptl/Makefile
--- nptl/Makefile
+++ nptl/Makefile
@@ -319,8 +319,7 @@
 CFLAGS-ftrylockfile.c = -D_IO_MTSAFE_IO
 CFLAGS-funlockfile.c = -D_IO_MTSAFE_IO
 
-# Ugly, ugly.  We have to link with libgcc_eh but how?
-link-libc-static := $(common-objpfx)libc.a $(gnulib) -lgcc_eh $(common-objpfx)libc.a
+link-libc-static := $(common-objpfx)libc.a $(gnulib) $(common-objpfx)libc.a
 
 ifeq ($(build-static),yes)
 tests-static += tst-locale1 tst-locale2
Index: nptl/allocatestack.c
--- nptl/allocatestack.c
+++ nptl/allocatestack.c
@@ -352,7 +352,7 @@
 
 #ifdef NEED_DL_SYSINFO
       /* Copy the sysinfo value from the parent.  */
-      pd->header.sysinfo = THREAD_GETMEM (THREAD_SELF, header.sysinfo);
+      THREAD_SYSINFO(pd) = THREAD_SELF_SYSINFO;
 #endif
 
       /* The process ID is also the same as that of the caller.  */
@@ -488,7 +488,7 @@
 
 #ifdef NEED_DL_SYSINFO
 	  /* Copy the sysinfo value from the parent.  */
-	  pd->header.sysinfo = THREAD_GETMEM (THREAD_SELF, header.sysinfo);
+	  THREAD_SYSINFO(pd) = THREAD_SELF_SYSINFO;
 #endif
 
 	  /* The process ID is also the same as that of the caller.  */
Index: nptl/sysdeps/i386/tls.h
--- nptl/sysdeps/i386/tls.h
+++ nptl/sysdeps/i386/tls.h
@@ -128,6 +128,8 @@
 # define GET_DTV(descr) \
   (((tcbhead_t *) (descr))->dtv)
 
+#define THREAD_SELF_SYSINFO	THREAD_GETMEM (THREAD_SELF, header.sysinfo)
+#define THREAD_SYSINFO(pd)	((pd)->header.sysinfo)
 
 /* Macros to load from and store into segment registers.  */
 # ifndef TLS_GET_GS
Index: nptl/sysdeps/ia64/tcb-offsets.sym
--- nptl/sysdeps/ia64/tcb-offsets.sym
+++ nptl/sysdeps/ia64/tcb-offsets.sym
@@ -2,3 +2,4 @@
 #include <tls.h>
 
 MULTIPLE_THREADS_OFFSET offsetof (struct pthread, header.multiple_threads) - sizeof (struct pthread)
+SYSINFO_OFFSET		offsetof (tcbhead_t, private)
Index: nptl/sysdeps/ia64/tls.h
--- nptl/sysdeps/ia64/tls.h
+++ nptl/sysdeps/ia64/tls.h
@@ -42,6 +42,8 @@
   void *private;
 } tcbhead_t;
 
+register struct pthread *__thread_self __asm__("r13");
+
 # define TLS_MULTIPLE_THREADS_IN_TCB 1
 
 #else /* __ASSEMBLER__ */
@@ -64,8 +66,6 @@
 /* Get system call information.  */
 # include <sysdep.h>
 
-register struct pthread *__thread_self __asm__("r13");
-
 /* This is the size of the initial TCB.  */
 # define TLS_INIT_TCB_SIZE sizeof (tcbhead_t)
 
@@ -100,11 +100,20 @@
 #  define GET_DTV(descr) \
   (((tcbhead_t *) (descr))->dtv)
 
+#define THREAD_SELF_SYSINFO	(((tcbhead_t *) __thread_self)->private)
+#define THREAD_SYSINFO(pd)	(((tcbhead_t *) ((pd) + 1))->private)
+
+#if defined NEED_DL_SYSINFO
+# define INIT_SYSINFO   THREAD_SELF_SYSINFO = (void *) GL(dl_sysinfo)
+#else
+# define INIT_SYSINFO   NULL
+#endif
+
 /* Code to initially initialize the thread pointer.  This might need
    special attention since 'errno' is not yet available and if the
    operation can cause a failure 'errno' must not be touched.  */
 # define TLS_INIT_TP(thrdescr, secondcall) \
-  (__thread_self = (thrdescr), NULL)
+  (__thread_self = (thrdescr), INIT_SYSINFO, NULL)
 
 /* Return the address of the dtv for the current thread.  */
 #  define THREAD_DTV() \
Index: nptl/sysdeps/pthread/createthread.c
--- nptl/sysdeps/pthread/createthread.c
+++ nptl/sysdeps/pthread/createthread.c
@@ -226,7 +226,7 @@
     }
 
 #ifdef NEED_DL_SYSINFO
-  assert (THREAD_GETMEM (THREAD_SELF, header.sysinfo) == pd->header.sysinfo);
+  assert (THREAD_SELF_SYSINFO == THREAD_SYSINFO(pd));
 #endif
 
   /* Actually create the thread.  */
Index: nptl/sysdeps/unix/sysv/linux/ia64/dl-sysdep.h
--- /dev/null
+++ nptl/sysdeps/unix/sysv/linux/ia64/dl-sysdep.h
@@ -0,0 +1,70 @@
+/* System-specific settings for dynamic linker code.  IA-64 version.
+   Copyright (C) 2003 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+   02111-1307 USA.  */
+
+#ifndef _DL_SYSDEP_H
+#define _DL_SYSDEP_H	1
+
+/* This macro must be defined to either 0 or 1.
+
+   If 1, then an errno global variable hidden in ld.so will work right with
+   all the errno-using libc code compiled for ld.so, and there is never a
+   need to share the errno location with libc.  This is appropriate only if
+   all the libc functions that ld.so uses are called without PLT and always
+   get the versions linked into ld.so rather than the libc ones.  */
+
+#ifdef IS_IN_rtld
+# define RTLD_PRIVATE_ERRNO 1
+#else
+# define RTLD_PRIVATE_ERRNO 0
+#endif
+
+/* Traditionally system calls have been made using break 0x100000.  A
+   second method was introduced which, if possible, will use the EPC
+   instruction.  To signal the presence and where to find the code the
+   kernel passes an AT_SYSINFO_EHDR pointer in the auxiliary vector to
+   the application.  */
+#define NEED_DL_SYSINFO	1
+#ifdef HAVE_CC_WITH_LIBUNWIND
+# define USE_DL_SYSINFO	1
+#else
+  /* GCC's built-in unwinder is too broken for the new syscall stubs
+     to work properly.  */
+# undef USE_DL_SYSINFO
+#endif
+
+#if defined NEED_DL_SYSINFO && !defined __ASSEMBLER__
+/* Don't declare this as a function---we want its entry-point, not
+   its function descriptor... */
+extern int _dl_sysinfo_break attribute_hidden;
+# define DL_SYSINFO_DEFAULT ((uintptr_t) &_dl_sysinfo_break)
+# define DL_SYSINFO_IMPLEMENTATION		\
+  asm (".text\n\t"				\
+       ".hidden _dl_sysinfo_break\n\t"		\
+       ".proc _dl_sysinfo_break\n\t"		\
+       "_dl_sysinfo_break:\n\t"			\
+       ".prologue\n\t"				\
+       ".altrp b6\n\t"				\
+       ".body\n\t"				\
+       "break 0x100000;\n\t"			\
+       "br.ret.sptk.many b6;\n\t"		\
+       ".endp _dl_sysinfo_break\n\t"		\
+       ".previous");
+#endif
+
+#endif	/* dl-sysdep.h */
Index: nptl/sysdeps/unix/sysv/linux/ia64/lowlevellock.h
--- nptl/sysdeps/unix/sysv/linux/ia64/lowlevellock.h
+++ nptl/sysdeps/unix/sysv/linux/ia64/lowlevellock.h
@@ -26,7 +26,7 @@
 #include <ia64intrin.h>
 #include <atomic.h>
 
-#define SYS_futex		1230
+#define __NR_futex		1230
 #define FUTEX_WAIT		0
 #define FUTEX_WAKE		1
 #define FUTEX_REQUEUE		3
@@ -34,112 +34,52 @@
 /* Initializer for compatibility lock.	*/
 #define LLL_MUTEX_LOCK_INITIALIZER (0)
 
-#define lll_futex_clobbers \
-  "out5", "out6", "out7",						      \
-  /* Non-stacked integer registers, minus r8, r10, r15.  */		      \
-  "r2", "r3", "r9", "r11", "r12", "r13", "r14", "r16", "r17", "r18",	      \
-  "r19", "r20", "r21", "r22", "r23", "r24", "r25", "r26", "r27",	      \
-  "r28", "r29", "r30", "r31",						      \
-  /* Predicate registers.  */						      \
-  "p6", "p7", "p8", "p9", "p10", "p11", "p12", "p13", "p14", "p15",	      \
-  /* Non-rotating fp registers.  */					      \
-  "f6", "f7", "f8", "f9", "f10", "f11", "f12", "f13", "f14", "f15",	      \
-  /* Branch registers.  */						      \
-  "b6", "b7",								      \
-  "memory"
-
 #define lll_futex_wait(futex, val) lll_futex_timed_wait (futex, val, 0)
 
-#define lll_futex_timed_wait(futex, val, timespec) \
-  ({									      \
-     register long int __o0 asm ("out0") = (long int) (futex);		      \
-     register long int __o1 asm ("out1") = FUTEX_WAIT;			      \
-     register int __o2 asm ("out2") = (int) (val);			      \
-     register long int __o3 asm ("out3") = (long int) (timespec);	      \
-     register long int __r8 asm ("r8");					      \
-     register long int __r10 asm ("r10");				      \
-     register long int __r15 asm ("r15") = SYS_futex;			      \
-									      \
-     __asm __volatile ("break %7;;"					      \
-		       : "=r" (__r8), "=r" (__r10), "=r" (__r15),	      \
-			 "=r" (__o0), "=r" (__o1), "=r" (__o2), "=r" (__o3)   \
-		       : "i" (0x100000), "2" (__r15), "3" (__o0), "4" (__o1), \
-		 	 "5" (__o2), "6" (__o3)				      \
-		       : "out4", lll_futex_clobbers);			      \
-     __r10 == -1 ? -__r8 : __r8;					      \
-  })
-
-
-#define lll_futex_wake(futex, nr) \
-  ({									      \
-     register long int __o0 asm ("out0") = (long int) (futex);		      \
-     register long int __o1 asm ("out1") = FUTEX_WAKE;			      \
-     register int __o2 asm ("out2") = (int) (nr);			      \
-     register long int __r8 asm ("r8");					      \
-     register long int __r10 asm ("r10");				      \
-     register long int __r15 asm ("r15") = SYS_futex;			      \
-									      \
-     __asm __volatile ("break %6;;"					      \
-		       : "=r" (__r8), "=r" (__r10), "=r" (__r15),	      \
-			 "=r" (__o0), "=r" (__o1), "=r" (__o2)		      \
-		       : "i" (0x100000), "2" (__r15), "3" (__o0), "4" (__o1), \
-		 	 "5" (__o2)					      \
-		       : "out3", "out4", lll_futex_clobbers);		      \
-     __r10 == -1 ? -__r8 : __r8;					      \
-  })
+#define lll_futex_timed_wait(ftx, val, timespec)			\
+({									\
+   DO_INLINE_SYSCALL(futex, 4, (long) (ftx), FUTEX_WAIT, (int) (val),	\
+		     (long) (timespec));				\
+   _r10 == -1 ? -_retval : _retval;					\
+})
+
+#define lll_futex_wake(ftx, nr)						\
+({									\
+   DO_INLINE_SYSCALL(futex, 3, (long) (ftx), FUTEX_WAKE, (int) (nr));	\
+   _r10 == -1 ? -_retval : _retval;					\
+})
+
+#define lll_futex_requeue(ftx, nr_wake, nr_move, mutex)			     \
+({									     \
+   DO_INLINE_SYSCALL(futex, 5, (long) (ftx), FUTEX_REQUEUE, (int) (nr_wake), \
+		     (int) (nr_move), (long) (mutex));			     \
+   _r10 == -1 ? -_retval : _retval;					     \
+})
 
 
-#define lll_futex_requeue(futex, nr_wake, nr_move, mutex) \
-  ({									      \
-     register long int __o0 asm ("out0") = (long int) (futex);		      \
-     register long int __o1 asm ("out1") = FUTEX_REQUEUE;		      \
-     register int __o2 asm ("out2") = (int) (nr_wake);			      \
-     register int __o3 asm ("out3") = (int) (nr_move);			      \
-     register long int __o4 asm ("out4") = (long int) (mutex);		      \
-     register long int __r8 asm ("r8");					      \
-     register long int __r10 asm ("r10");				      \
-     register long int __r15 asm ("r15") = SYS_futex;			      \
-									      \
-     __asm __volatile ("break %8;;"					      \
-		       : "=r" (__r8), "=r" (__r10), "=r" (__r15),	      \
-			 "=r" (__o0), "=r" (__o1), "=r" (__o2), "=r" (__o3),  \
-			 "=r" (__o4)					      \
-		       : "i" (0x100000), "2" (__r15), "3" (__o0), "4" (__o1), \
-			 "5" (__o2), "6" (__o3), "7" (__o4)		      \
-		       : lll_futex_clobbers);				      \
-     __r10 == -1 ? -__r8 : __r8;					      \
-  })
-
-
-static inline int
-__attribute__ ((always_inline))
-__lll_mutex_trylock (int *futex)
-{
-  return atomic_compare_and_exchange_val_acq (futex, 1, 0) != 0;
-}
+#define __lll_mutex_trylock(futex) \
+  (atomic_compare_and_exchange_val_acq (futex, 1, 0) != 0)
 #define lll_mutex_trylock(futex) __lll_mutex_trylock (&(futex))
 
 
 extern void __lll_lock_wait (int *futex) attribute_hidden;
 
 
-static inline void
-__attribute__ ((always_inline))
-__lll_mutex_lock (int *futex)
-{
-  if (atomic_compare_and_exchange_bool_acq (futex, 1, 0) != 0)
-    __lll_lock_wait (futex);
-}
+#define __lll_mutex_lock(futex)						\
+  ((void) ({								\
+    int *__futex = (futex);						\
+    if (atomic_compare_and_exchange_bool_acq (__futex, 1, 0) != 0)	\
+      __lll_lock_wait (__futex);					\
+  }))
 #define lll_mutex_lock(futex) __lll_mutex_lock (&(futex))
 
 
-static inline void
-__attribute__ ((always_inline))
-__lll_mutex_cond_lock (int *futex)
-{
-  if (atomic_compare_and_exchange_bool_acq (futex, 2, 0) != 0)
-    __lll_lock_wait (futex);
-}
+#define __lll_mutex_cond_lock(futex)					\
+  ((void) ({								\
+    int *__futex = (futex);						\
+    if (atomic_compare_and_exchange_bool_acq (__futex, 2, 0) != 0)	\
+      __lll_lock_wait (__futex);					\
+  }))
 #define lll_mutex_cond_lock(futex) __lll_mutex_cond_lock (&(futex))
 
 
@@ -147,41 +87,37 @@
      attribute_hidden;
 
 
-static inline int
-__attribute__ ((always_inline))
-__lll_mutex_timedlock (int *futex, const struct timespec *abstime)
-{
-  int result = 0;
-
-  if (atomic_compare_and_exchange_bool_acq (futex, 1, 0) != 0)
-    result = __lll_timedlock_wait (futex, abstime);
-
-  return result;
-}
+#define __lll_mutex_timedlock(futex, abstime)				\
+  ({									\
+     int *__futex = (futex);						\
+     int __val = 0;							\
+									\
+     if (atomic_compare_and_exchange_bool_acq (__futex, 1, 0) != 0)	\
+       __val = __lll_timedlock_wait (__futex, abstime);			\
+     __val;								\
+  })
 #define lll_mutex_timedlock(futex, abstime) \
   __lll_mutex_timedlock (&(futex), abstime)
 
 
-static inline void
-__attribute__ ((always_inline))
-__lll_mutex_unlock (int *futex)
-{
-  int val = atomic_exchange_rel (futex, 0);
-
-  if (__builtin_expect (val > 1, 0))
-    lll_futex_wake (futex, 1);
-}
+#define __lll_mutex_unlock(futex)			\
+  ((void) ({						\
+    int *__futex = (futex);				\
+    int __val = atomic_exchange_rel (__futex, 0);	\
+							\
+    if (__builtin_expect (__val > 1, 0))		\
+      lll_futex_wake (__futex, 1);			\
+  }))
 #define lll_mutex_unlock(futex) \
   __lll_mutex_unlock(&(futex))
 
 
-static inline void
-__attribute__ ((always_inline))
-__lll_mutex_unlock_force (int *futex)
-{
-  (void) atomic_exchange_rel (futex, 0);
-  lll_futex_wake (futex, 1);
-}
+#define __lll_mutex_unlock_force(futex)		\
+  ((void) ({					\
+    int *__futex = (futex);			\
+    (void) atomic_exchange_rel (__futex, 0);	\
+    lll_futex_wake (__futex, 1);		\
+  }))
 #define lll_mutex_unlock_force(futex) \
   __lll_mutex_unlock_force(&(futex))
 
Index: nptl/sysdeps/unix/sysv/linux/ia64/pt-initfini.c
--- nptl/sysdeps/unix/sysv/linux/ia64/pt-initfini.c
+++ nptl/sysdeps/unix/sysv/linux/ia64/pt-initfini.c
@@ -36,6 +36,22 @@
    * crtn.s puts the corresponding function epilogues
    in the .init and .fini sections. */
 
+#include <stddef.h>
+
+#ifdef HAVE_INITFINI_ARRAY
+
+__asm__ ("\n\
+#include \"defs.h\"\n\
+\n\
+/*@HEADER_ENDS*/\n\
+\n\
+/*@_init_PROLOG_BEGINS*/\n\
+	.xdata8 \".init_array\",@fptr(__pthread_initialize_minimal_internal)\n\
+/*@_init_PROLOG_ENDS*/\n\
+");
+
+#else
+
 __asm__ ("\n\
 \n\
 #include \"defs.h\"\n\
@@ -48,13 +64,16 @@
 	.global _init#\n\
 	.proc _init#\n\
 _init:\n\
+	.prologue\n\
+	.save ar.pfs, r34\n\
 	alloc r34 = ar.pfs, 0, 3, 0, 0\n\
+	.vframe r32\n\
 	mov r32 = r12\n\
+	.save rp, r33\n\
 	mov r33 = b0\n\
+	.body\n\
 	adds r12 = -16, r12\n\
 	;;\n\
-/* we could use r35 to save gp, but we use the stack since that's what\n\
- * all the other init routines will do --davidm 00/04/05 */\n\
 	st8 [r12] = gp, -16\n\
 	br.call.sptk.many b0 = __pthread_initialize_minimal_internal# ;;\n\
 	;;\n\
@@ -62,13 +81,18 @@
 	;;\n\
 	ld8 gp = [r12]\n\
 	;;\n\
-	.align 16\n\
 	.endp _init#\n\
 \n\
 /*@_init_PROLOG_ENDS*/\n\
 \n\
 /*@_init_EPILOG_BEGINS*/\n\
 	.section .init\n\
+	.proc _init#\n\
+	.prologue\n\
+	.save ar.pfs, r34\n\
+	.vframe r32\n\
+	.save rp, r33\n\
+	.body\n\
 	.regstk 0,2,0,0\n\
 	mov r12 = r32\n\
 	mov ar.pfs = r34\n\
@@ -83,18 +107,28 @@
 	.global _fini#\n\
 	.proc _fini#\n\
 _fini:\n\
+	.prologue\n\
+	.save ar.pfs, r34\n\
 	alloc r34 = ar.pfs, 0, 3, 0, 0\n\
+	.vframe r32\n\
 	mov r32 = r12\n\
+	.save rp, r33\n\
 	mov r33 = b0\n\
+	.body\n\
 	adds r12 = -16, r12\n\
 	;;\n\
-	.align 16\n\
 	.endp _fini#\n\
 \n\
 /*@_fini_PROLOG_ENDS*/\n\
 \n\
 /*@_fini_EPILOG_BEGINS*/\n\
 	.section .fini\n\
+	.proc _fini#\n\
+	.prologue\n\
+	.save ar.pfs, r34\n\
+	.vframe r32\n\
+	.save rp, r33\n\
+	.body\n\
 	mov r12 = r32\n\
 	mov ar.pfs = r34\n\
 	mov b0 = r33\n\
@@ -106,3 +140,5 @@
 /*@TRAILER_BEGINS*/\n\
 	.weak	__gmon_start__#\n\
 ");
+
+#endif
Index: nptl/sysdeps/unix/sysv/linux/ia64/pt-vfork.S
--- nptl/sysdeps/unix/sysv/linux/ia64/pt-vfork.S
+++ nptl/sysdeps/unix/sysv/linux/ia64/pt-vfork.S
@@ -30,6 +30,8 @@
 /* Implemented as __clone_syscall(CLONE_VFORK | CLONE_VM | SIGCHLD, 0)	*/
 
 ENTRY(__vfork)
+	.prologue	// work around a GAS bug which triggers if
+	.body		// first .prologue is not at the beginning of proc.
 	alloc r2=ar.pfs,0,0,2,0
 	mov out0=CLONE_VM+CLONE_VFORK+SIGCHLD
 	mov out1=0		/* Standard sp value.			*/
Index: nptl/sysdeps/unix/sysv/linux/ia64/sysdep-cancel.h
--- nptl/sysdeps/unix/sysv/linux/ia64/sysdep-cancel.h
+++ nptl/sysdeps/unix/sysv/linux/ia64/sysdep-cancel.h
@@ -26,6 +26,9 @@
 #if !defined NOT_IN_libc || defined IS_IN_libpthread || defined IS_IN_librt
 
 # undef PSEUDO
+
+#ifndef USE_DL_SYSINFO
+
 # define PSEUDO(name, syscall_name, args)				      \
 .text;									      \
 ENTRY (name)								      \
@@ -88,6 +91,83 @@
      mov r8 = -1;							      \
      mov ar.pfs = loc0
 
+#else /* USE_DL_SYSINFO */
+
+# define PSEUDO(name, syscall_name, args)				      \
+.text;									      \
+ENTRY (name)								      \
+     .prologue;								      \
+     adds r2 = SYSINFO_OFFSET, r13;					      \
+     adds r14 = MULTIPLE_THREADS_OFFSET, r13;				      \
+     .save ar.pfs, r11;							      \
+     mov r11 = ar.pfs;;							      \
+     .body;								      \
+     ld4 r14 = [r14];							      \
+     ld8 r2 = [r2];							      \
+     mov r15 = SYS_ify(syscall_name);;					      \
+     cmp4.ne p6, p7 = 0, r14;						      \
+     mov b7 = r2;							      \
+(p6) br.cond.spnt .Lpseudo_cancel;					      \
+     br.call.sptk.many b6 = b7;;					      \
+     mov ar.pfs = r11;							      \
+     cmp.eq p6,p0 = -1, r10;						      \
+(p6) br.cond.spnt.few __syscall_error;					      \
+     ret;;								      \
+     .endp name;							      \
+     .proc __GC_##name;							      \
+     .globl __GC_##name;						      \
+     .hidden __GC_##name;						      \
+__GC_##name:								      \
+.Lpseudo_cancel:							      \
+     .prologue;								      \
+     .regstk args, 5, args, 0;						      \
+     .save ar.pfs, loc0;						      \
+     alloc loc0 = ar.pfs, args, 5, args, 0;				      \
+     adds loc4 = SYSINFO_OFFSET, r13;					      \
+     .save rp, loc1;							      \
+     mov loc1 = rp;;							      \
+     .body;								      \
+     ld8 loc4 = [loc4];							      \
+     CENABLE;;								      \
+     mov loc2 = r8;							      \
+     mov b7 = loc4;							      \
+     COPY_ARGS_##args							      \
+     mov r15 = SYS_ify(syscall_name);					      \
+     br.call.sptk.many b6 = b7;;					      \
+     mov loc3 = r8;							      \
+     mov loc4 = r10;							      \
+     mov out0 = loc2;							      \
+     CDISABLE;;								      \
+     cmp.eq p6,p0=-1,loc4;						      \
+(p6) br.cond.spnt.few __syscall_error_##args;				      \
+     mov r8 = loc3;							      \
+     mov rp = loc1;							      \
+     mov ar.pfs = loc0;							      \
+.Lpseudo_end:								      \
+     ret;								      \
+     .endp __GC_##name;							      \
+.section .gnu.linkonce.t.__syscall_error_##args, "ax";			      \
+     .align 32;								      \
+     .proc __syscall_error_##args;					      \
+     .global __syscall_error_##args;					      \
+     .hidden __syscall_error_##args;					      \
+     .size __syscall_error_##args, 64;					      \
+__syscall_error_##args:							      \
+     .prologue;								      \
+     .regstk args, 5, args, 0;						      \
+     .save ar.pfs, loc0;						      \
+     .save rp, loc1;							      \
+     .body;								      \
+     mov loc4 = r1;;							      \
+     br.call.sptk.many b0 = __errno_location;;				      \
+     st4 [r8] = loc3;							      \
+     mov r1 = loc4;							      \
+     mov rp = loc1;							      \
+     mov r8 = -1;							      \
+     mov ar.pfs = loc0
+
+#endif /* USE_DL_SYSINFO */
+
 #undef PSEUDO_END
 #define PSEUDO_END(name) .endp
 
Index: sysdeps/generic/dl-fptr.c
--- sysdeps/generic/dl-fptr.c
+++ sysdeps/generic/dl-fptr.c
@@ -163,7 +163,7 @@
 }
 
 
-static inline ElfW(Addr) *
+static inline ElfW(Addr) * __attribute__ ((always_inline))
 make_fptr_table (struct link_map *map)
 {
   const ElfW(Sym) *symtab
Index: sysdeps/ia64/dl-machine.h
--- sysdeps/ia64/dl-machine.h
+++ sysdeps/ia64/dl-machine.h
@@ -33,7 +33,7 @@
    in l_info array.  */
 #define DT_IA_64(x) (DT_IA_64_##x - DT_LOPROC + DT_NUM)
 
-static inline void
+static inline void __attribute__ ((always_inline))
 __ia64_init_bootstrap_fdesc_table (struct link_map *map)
 {
   Elf64_Addr *boot_table;
@@ -49,7 +49,7 @@
 	__ia64_init_bootstrap_fdesc_table (&bootstrap_map);
 
 /* Return nonzero iff ELF header is compatible with the running host.  */
-static inline int
+static inline int __attribute__ ((unused))
 elf_machine_matches_host (const Elf64_Ehdr *ehdr)
 {
   return ehdr->e_machine == EM_IA_64;
@@ -57,7 +57,7 @@
 
 
 /* Return the link-time address of _DYNAMIC.  */
-static inline Elf64_Addr
+static inline Elf64_Addr __attribute__ ((unused, const))
 elf_machine_dynamic (void)
 {
   Elf64_Addr *p;
@@ -77,7 +77,7 @@
 
 
 /* Return the run-time load address of the shared object.  */
-static inline Elf64_Addr
+static inline Elf64_Addr __attribute__ ((unused))
 elf_machine_load_address (void)
 {
   Elf64_Addr ip;
@@ -98,7 +98,7 @@
 /* Set up the loaded object described by L so its unrelocated PLT
    entries will jump to the on-demand fixup code in dl-runtime.c.  */
 
-static inline int __attribute__ ((always_inline))
+static inline int __attribute__ ((unused, always_inline))
 elf_machine_runtime_setup (struct link_map *l, int lazy, int profile)
 {
   extern void _dl_runtime_resolve (void);
Index: sysdeps/ia64/elf/initfini.c
--- sysdeps/ia64/elf/initfini.c
+++ sysdeps/ia64/elf/initfini.c
@@ -61,16 +61,20 @@
 #endif
 
 __asm__ (".section .init\n"
-"	.align 16\n"
 "	.global _init#\n"
 "	.proc _init#\n"
 "_init:\n"
+"	.prologue\n"
+"	.save ar.pfs, r34\n"
 "	alloc r34 = ar.pfs, 0, 3, 0, 0\n"
+"	.vframe r32\n"
 "	mov r32 = r12\n"
+"	.save rp, r33\n"
 "	mov r33 = b0\n"
+"	.body\n"
 "	adds r12 = -16, r12\n"
 #ifdef HAVE_INITFINI_ARRAY
- "	;;\n"		/* see gmon_initializer() below */
+"	;;\n"		/* see gmon_initializer() above */
 #else
 "	.weak	__gmon_start__#\n"
 "	addl r14 = @ltoff(@fptr(__gmon_start__#)), gp\n"
@@ -90,12 +94,17 @@
 "	;;\n"
 ".L5:\n"
 #endif
-"	.align 16\n"
 "	.endp _init#\n"
 "\n"
 "/*@_init_PROLOG_ENDS*/\n"
 "\n"
 "/*@_init_EPILOG_BEGINS*/\n"
+"	.proc _init#\n"
+"	.prologue\n"
+"	.save ar.pfs, r34\n"
+"	.vframe r32\n"
+"	.save rp, r33\n"
+"	.body\n"
 "	.section .init\n"
 "	.regstk 0,2,0,0\n"
 "	mov r12 = r32\n"
@@ -107,16 +116,19 @@
 "\n"
 "/*@_fini_PROLOG_BEGINS*/\n"
 "	.section .fini\n"
-"	.align 16\n"
 "	.global _fini#\n"
 "	.proc _fini#\n"
 "_fini:\n"
+"	.prologue\n"
+"	.save ar.pfs, r34\n"
 "	alloc r34 = ar.pfs, 0, 3, 0, 0\n"
+"	.vframe r32\n"
 "	mov r32 = r12\n"
+"	.save rp, r33\n"
 "	mov r33 = b0\n"
+"	.body\n"
 "	adds r12 = -16, r12\n"
 "	;;\n"
-"	.align 16\n"
 "	.endp _fini#\n"
 "\n"
 "/*@_fini_PROLOG_ENDS*/\n"
@@ -125,6 +137,12 @@
 "\n"
 "/*@_fini_EPILOG_BEGINS*/\n"
 "	.section .fini\n"
+"	.proc _fini#\n"
+"	.prologue\n"
+"	.save ar.pfs, r34\n"
+"	.vframe r32\n"
+"	.save rp, r33\n"
+"	.body\n"
 "	mov r12 = r32\n"
 "	mov ar.pfs = r34\n"
 "	mov b0 = r33\n"
Index: sysdeps/unix/sysv/linux/ia64/brk.S
--- sysdeps/unix/sysv/linux/ia64/brk.S
+++ sysdeps/unix/sysv/linux/ia64/brk.S
@@ -35,19 +35,17 @@
 weak_alias (__curbrk, ___brk_addr)
 
 LEAF(__brk)
-	mov	r15=__NR_brk
-	break.i	__BREAK_SYSCALL
+	.regstk 1, 0, 0, 0
+	DO_CALL(__NR_brk)
+	cmp.ltu	p6, p0 = ret0, in0
+	addl r9 = @ltoff(__curbrk), gp
 	;;
-	cmp.ltu	p6,p0=ret0,r32	/* r32 is the input register, even though we
-				   haven't allocated a frame */
-	addl	r9=@ltoff(__curbrk),gp
-	;;
-	ld8	r9=[r9]
-(p6) 	mov	ret0=ENOMEM
+	ld8 r9 = [r9]
+(p6) 	mov ret0 = ENOMEM
 (p6)	br.cond.spnt.few __syscall_error
 	;;
-	st8	[r9]=ret0
-	mov 	ret0=0
+	st8 [r9] = ret0
+	mov ret0 = 0
 	ret
 END(__brk)
 
Index: sysdeps/unix/sysv/linux/ia64/clone2.S
--- sysdeps/unix/sysv/linux/ia64/clone2.S
+++ sysdeps/unix/sysv/linux/ia64/clone2.S
@@ -25,49 +25,56 @@
 /* 	         size_t child_stack_size, int flags, void *arg,		*/
 /*	         pid_t *parent_tid, void *tls, pid_t *child_tid)	*/
 
+#define CHILD	p8
+#define PARENT	p9
+
 ENTRY(__clone2)
-	alloc r2=ar.pfs,8,2,6,0
+	.prologue
+	alloc r2=ar.pfs,8,0,6,0
 	cmp.eq p6,p0=0,in0
 	mov r8=EINVAL
-(p6)	br.cond.spnt.few __syscall_error
-	;;
-	flushrs			/* This is necessary, since the child	*/
-				/* will be running with the same 	*/
-				/* register backing store for a few 	*/
-				/* instructions.  We need to ensure	*/
-				/* that it will not read or write the	*/
-				/* backing store.			*/
-	mov loc0=in0		/* save fn	*/
-	mov loc1=in4		/* save arg	*/
 	mov out0=in3		/* Flags are first syscall argument.	*/
 	mov out1=in1		/* Stack address.			*/
+(p6)	br.cond.spnt.many __syscall_error
+	;;
 	mov out2=in2		/* Stack size.				*/
 	mov out3=in5		/* Parent TID Pointer			*/
 	mov out4=in7		/* Child TID Pointer			*/
  	mov out5=in6		/* TLS pointer				*/
-        DO_CALL (SYS_ify (clone2))
+	/*
+	 * clone2() is special: the child cannot execute br.ret right
+	 * after the system call returns, because it starts out
+	 * executing on an empty stack.  Because of this, we can't use
+	 * the new (lightweight) syscall convention here.  Instead, we
+	 * just fall back on always using "break".
+	 *
+	 * Furthermore, since the child starts with an empty stack, we
+	 * need to avoid unwinding past invalid memory.  To that end,
+	 * we'll pretend now that __clone2() is the end of the
+	 * call-chain.  This is wrong for the parent, but only until
+	 * it returns from clone2() but it's better than the
+	 * alternative.
+	 */
+	mov r15=SYS_ify (clone2)
+	.save rp, r0
+	break __BREAK_SYSCALL
+	.body
         cmp.eq p6,p0=-1,r10
+	cmp.eq CHILD,PARENT=0,r8 /* Are we the child?   */
+(p6)	br.cond.spnt.many __syscall_error
 	;;
-(p6)	br.cond.spnt.few __syscall_error
-
-#	define CHILD p6
-#	define PARENT p7
-	cmp.eq CHILD,PARENT=0,r8 /* Are we the child?	*/
-	;;
-(CHILD)	ld8 out1=[loc0],8	/* Retrieve code pointer.	*/
-(CHILD)	mov out0=loc1		/* Pass proper argument	to fn */
+(CHILD)	ld8 out1=[in0],8	/* Retrieve code pointer.	*/
+(CHILD)	mov out0=in4		/* Pass proper argument	to fn */
 (PARENT) ret
 	;;
-	ld8 gp=[loc0]		/* Load function gp.		*/
+	ld8 gp=[in0]		/* Load function gp.		*/
 	mov b6=out1
-	;;
-	br.call.dptk.few rp=b6	/* Call fn(arg) in the child 	*/
+	br.call.dptk.many rp=b6	/* Call fn(arg) in the child 	*/
 	;;
 	mov out0=r8		/* Argument to _exit		*/
 	.globl _exit
-	br.call.dpnt.few rp=_exit /* call _exit with result from fn.	*/
+	br.call.dpnt.many rp=_exit /* call _exit with result from fn.	*/
 	ret			/* Not reached.		*/
-
 PSEUDO_END(__clone2)
 
 /* For now we leave __clone undefined.  This is unlikely to be a	*/
Index: sysdeps/unix/sysv/linux/ia64/getcontext.S
--- sysdeps/unix/sysv/linux/ia64/getcontext.S
+++ sysdeps/unix/sysv/linux/ia64/getcontext.S
@@ -35,26 +35,27 @@
 
 ENTRY(__getcontext)
 	.prologue
-	alloc r16 = ar.pfs, 1, 0, 4, 0
+	.body
+	alloc r11 = ar.pfs, 1, 0, 4, 0
 
 	// sigprocmask (SIG_BLOCK, NULL, &sc->sc_mask):
 
-	mov r2 = SC_MASK
-	mov r15 = __NR_rt_sigprocmask
-	;;
+	mov r3 = SC_MASK
 	mov out0 = SIG_BLOCK
-	mov out1 = 0
-	add out2 = r2, in0
-	mov out3 = 8	// sizeof kernel sigset_t
 
-	break __BREAK_SYSCALL
 	flushrs					// save dirty partition on rbs
+	mov out1 = 0
+	add out2 = r3, in0
+
+	mov out3 = 8	// sizeof kernel sigset_t
+	DO_CALL(__NR_rt_sigprocmask)
 
 	mov.m rFPSR = ar.fpsr
 	mov.m rRSC = ar.rsc
 	add r2 = SC_GR+1*8, r32
 	;;
 	mov.m rBSP = ar.bsp
+	.prologue
 	.save ar.unat, rUNAT
 	mov.m rUNAT = ar.unat
 	.body
@@ -63,7 +64,7 @@
 
 .mem.offset 0,0; st8.spill [r2] = r1, (5*8 - 1*8)
 .mem.offset 8,0; st8.spill [r3] = r4, 16
-	mov.i rPFS = ar.pfs
+	mov rPFS = r11
 	;;
 .mem.offset 0,0; st8.spill [r2] = r5, 16
 .mem.offset 8,0; st8.spill [r3] = r6, 48
Index: sysdeps/unix/sysv/linux/ia64/setcontext.S
--- sysdeps/unix/sysv/linux/ia64/setcontext.S
+++ sysdeps/unix/sysv/linux/ia64/setcontext.S
@@ -32,20 +32,21 @@
   other than the PRESERVED state.  */
 
 ENTRY(__setcontext)
-	alloc r16 = ar.pfs, 1, 0, 4, 0
+	.prologue
+	.body
+	alloc r11 = ar.pfs, 1, 0, 4, 0
 
 	// sigprocmask (SIG_SETMASK, &sc->sc_mask, NULL):
 
-	mov r2 = SC_MASK
-	mov r15 = __NR_rt_sigprocmask
-	;;
+	mov r3 = SC_MASK
 	mov out0 = SIG_SETMASK
-	add out1 = r2, in0
+	;;
+	add out1 = r3, in0
 	mov out2 = 0
 	mov out3 = 8	// sizeof kernel sigset_t
 
 	invala
-	break __BREAK_SYSCALL
+	DO_CALL(__NR_rt_sigprocmask)
 	add r2 = SC_NAT, r32
 
 	add r3 = SC_RNAT, r32			// r3 <- &sc_ar_rnat
Index: sysdeps/unix/sysv/linux/ia64/sysdep.h
--- sysdeps/unix/sysv/linux/ia64/sysdep.h
+++ sysdeps/unix/sysv/linux/ia64/sysdep.h
@@ -23,6 +23,8 @@
 
 #include <sysdeps/unix/sysdep.h>
 #include <sysdeps/ia64/sysdep.h>
+#include <dl-sysdep.h>
+#include <tls.h>
 
 /* As of GAS v2.4.90.0.7, including a ".align" directive inside a
    function will cause bad unwind info to be emitted (GAS doesn't know
@@ -58,6 +60,14 @@
 # define __NR_semtimedop 1247
 #endif
 
+#if defined USE_DL_SYSINFO \
+	&& (!defined NOT_IN_libc \
+	    || defined IS_IN_libpthread || defined IS_IN_librt)
+# define IA64_USE_NEW_STUB
+#else
+# undef IA64_USE_NEW_STUB
+#endif
+
 #ifdef __ASSEMBLER__
 
 #undef CALL_MCOUNT
@@ -102,9 +112,45 @@
 	cmp.eq p6,p0=-1,r10;			\
 (p6)	br.cond.spnt.few __syscall_error;
 
-#define DO_CALL(num)				\
+#define DO_CALL_VIA_BREAK(num)			\
 	mov r15=num;				\
-	break __BREAK_SYSCALL;
+	break __BREAK_SYSCALL
+
+#ifdef IA64_USE_NEW_STUB
+# ifdef SHARED
+#  define DO_CALL(num)				\
+	.prologue;				\
+	adds r2 = SYSINFO_OFFSET, r13;;		\
+	ld8 r2 = [r2];				\
+	.save ar.pfs, r11;			\
+	mov r11 = ar.pfs;;			\
+	.body;					\
+	mov r15 = num;				\
+	mov b7 = r2;				\
+	br.call.sptk.many b6 = b7;;		\
+	.restore sp;				\
+	mov ar.pfs = r11;			\
+	.prologue;				\
+	.body
+# else /* !SHARED */
+#  define DO_CALL(num)				\
+	.prologue;				\
+	mov r15 = num;				\
+	movl r2 = _dl_sysinfo;;			\
+	ld8 r2 = [r2];				\
+	.save ar.pfs, r11;			\
+	mov r11 = ar.pfs;;			\
+	.body;					\
+	mov b7 = r2;				\
+	br.call.sptk.many b6 = b7;;		\
+	.restore sp;				\
+	mov ar.pfs = r11;			\
+	.prologue;				\
+	.body
+# endif
+#else
+# define DO_CALL(num)				DO_CALL_VIA_BREAK(num)
+#endif
 
 #undef PSEUDO_END
 #define PSEUDO_END(name)	.endp C_SYMBOL_NAME(name);
@@ -150,45 +196,64 @@
    from a syscall.  r10 is set to -1 on error, whilst r8 contains the
    (non-negative) errno on error or the return value on success.
  */
-#undef INLINE_SYSCALL
-#define INLINE_SYSCALL(name, nr, args...)			\
-  ({								\
+
+#ifdef IA64_USE_NEW_STUB
+
+#define DO_INLINE_SYSCALL(name, nr, args...)					\
+    register long _r8 __asm ("r8");						\
+    register long _r10 __asm ("r10");						\
+    register long _r15 __asm ("r15") = __NR_##name;				\
+    register void *_b7 __asm ("b7") = ((tcbhead_t *) __thread_self)->private;	\
+    long _retval;								\
+    LOAD_ARGS_##nr (args);							\
+    /*										\
+     * Don't specify any unwind info here.  We mark ar.pfs as			\
+     * clobbered.  This will force the compiler to save ar.pfs			\
+     * somewhere and emit appropriate unwind info for that save.		\
+     */										\
+    __asm __volatile ("br.call.sptk.many b6=%0;;\n"				\
+		      : "=b"(_b7), "=r" (_r8), "=r" (_r10), "=r" (_r15)		\
+			ASM_OUTARGS_##nr					\
+		      : "0" (_b7), "3" (_r15) ASM_ARGS_##nr			\
+		      : "memory", "ar.pfs" ASM_CLOBBERS_##nr);			\
+    _retval = _r8;
+
+#else /* !IA64_USE_NEW_STUB */
+
+#define DO_INLINE_SYSCALL(name, nr, args...)			\
     register long _r8 asm ("r8");				\
     register long _r10 asm ("r10");				\
     register long _r15 asm ("r15") = __NR_##name;		\
     long _retval;						\
     LOAD_ARGS_##nr (args);					\
     __asm __volatile (BREAK_INSN (__BREAK_SYSCALL)		\
-                      : "=r" (_r8), "=r" (_r10), "=r" (_r15)	\
+		      : "=r" (_r8), "=r" (_r10), "=r" (_r15)	\
 			ASM_OUTARGS_##nr			\
-                      : "2" (_r15) ASM_ARGS_##nr		\
-                      : "memory" ASM_CLOBBERS_##nr);		\
-    _retval = _r8;						\
-    if (_r10 == -1)						\
-      {								\
-        __set_errno (_retval);					\
-        _retval = -1;						\
-      }								\
+		      : "2" (_r15) ASM_ARGS_##nr		\
+		      : "memory" ASM_CLOBBERS_##nr);		\
+    _retval = _r8;
+
+#endif /* !IA64_USE_NEW_STUB */
+
+#undef INLINE_SYSCALL
+#define INLINE_SYSCALL(name, nr, args...)	\
+  ({						\
+    DO_INLINE_SYSCALL(name, nr, args)		\
+    if (_r10 == -1)				\
+      {						\
+	__set_errno (_retval);			\
+	_retval = -1;				\
+      }						\
     _retval; })
 
 #undef INTERNAL_SYSCALL_DECL
 #define INTERNAL_SYSCALL_DECL(err) long int err
 
 #undef INTERNAL_SYSCALL
-#define INTERNAL_SYSCALL(name, err, nr, args...)		\
-  ({								\
-    register long _r8 asm ("r8");				\
-    register long _r10 asm ("r10");				\
-    register long _r15 asm ("r15") = __NR_##name;		\
-    long _retval;						\
-    LOAD_ARGS_##nr (args);					\
-    __asm __volatile (BREAK_INSN (__BREAK_SYSCALL)		\
-                      : "=r" (_r8), "=r" (_r10), "=r" (_r15)	\
-			ASM_OUTARGS_##nr			\
-                      : "2" (_r15) ASM_ARGS_##nr		\
-                      : "memory" ASM_CLOBBERS_##nr);		\
-    _retval = _r8;						\
-    err = _r10;							\
+#define INTERNAL_SYSCALL(name, err, nr, args...)	\
+  ({							\
+    DO_INLINE_SYSCALL(name, nr, args)			\
+    err = _r10;						\
     _retval; })
 
 #undef INTERNAL_SYSCALL_ERROR_P
@@ -225,6 +290,15 @@
 #define ASM_OUTARGS_5	ASM_OUTARGS_4, "=r" (_out4)
 #define ASM_OUTARGS_6	ASM_OUTARGS_5, "=r" (_out5)
 
+#ifdef IA64_USE_NEW_STUB
+#define ASM_ARGS_0
+#define ASM_ARGS_1	ASM_ARGS_0, "4" (_out0)
+#define ASM_ARGS_2	ASM_ARGS_1, "5" (_out1)
+#define ASM_ARGS_3	ASM_ARGS_2, "6" (_out2)
+#define ASM_ARGS_4	ASM_ARGS_3, "7" (_out3)
+#define ASM_ARGS_5	ASM_ARGS_4, "8" (_out4)
+#define ASM_ARGS_6	ASM_ARGS_5, "9" (_out5)
+#else
 #define ASM_ARGS_0
 #define ASM_ARGS_1	ASM_ARGS_0, "3" (_out0)
 #define ASM_ARGS_2	ASM_ARGS_1, "4" (_out1)
@@ -232,6 +306,7 @@
 #define ASM_ARGS_4	ASM_ARGS_3, "6" (_out3)
 #define ASM_ARGS_5	ASM_ARGS_4, "7" (_out4)
 #define ASM_ARGS_6	ASM_ARGS_5, "8" (_out5)
+#endif
 
 #define ASM_CLOBBERS_0	ASM_CLOBBERS_1, "out0"
 #define ASM_CLOBBERS_1	ASM_CLOBBERS_2, "out1"
@@ -239,7 +314,7 @@
 #define ASM_CLOBBERS_3	ASM_CLOBBERS_4, "out3"
 #define ASM_CLOBBERS_4	ASM_CLOBBERS_5, "out4"
 #define ASM_CLOBBERS_5	ASM_CLOBBERS_6, "out5"
-#define ASM_CLOBBERS_6	, "out6", "out7",				\
+#define ASM_CLOBBERS_6_COMMON	, "out6", "out7",			\
   /* Non-stacked integer registers, minus r8, r10, r15.  */		\
   "r2", "r3", "r9", "r11", "r12", "r13", "r14", "r16", "r17", "r18",	\
   "r19", "r20", "r21", "r22", "r23", "r24", "r25", "r26", "r27",	\
@@ -249,7 +324,13 @@
   /* Non-rotating fp registers.  */					\
   "f6", "f7", "f8", "f9", "f10", "f11", "f12", "f13", "f14", "f15",	\
   /* Branch registers.  */						\
-  "b6", "b7"
+  "b6"
+
+#ifdef IA64_USE_NEW_STUB
+# define ASM_CLOBBERS_6	ASM_CLOBBERS_6_COMMON
+#else
+# define ASM_CLOBBERS_6	ASM_CLOBBERS_6_COMMON , "b7"
+#endif
 
 #endif /* not __ASSEMBLER__ */
 
Index: sysdeps/unix/sysv/linux/ia64/vfork.S
--- sysdeps/unix/sysv/linux/ia64/vfork.S
+++ sysdeps/unix/sysv/linux/ia64/vfork.S
@@ -34,9 +34,8 @@
 	mov out0=CLONE_VM+CLONE_VFORK+SIGCHLD
 	mov out1=0		/* Standard sp value.			*/
 	;;
-	DO_CALL (SYS_ify (clone))
+	DO_CALL_VIA_BREAK (SYS_ify (clone))
 	cmp.eq p6,p0=-1,r10
-	;;
 (p6)	br.cond.spnt.few __syscall_error
 	ret
 PSEUDO_END(__vfork)
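
A note on the return convention that the sysdep.h and lowlevellock.h hunks
above rely on: r10 is -1 on error, and r8 holds either the errno value or
the result.  The following is only a portable sketch of the two ways the
macros consume that pair.  The names raw_result, to_libc_result and
to_futex_result are hypothetical and merely stand in for the registers;
this is not the glibc code itself.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the (r8, r10) register pair that an ia64
   Linux syscall leaves behind.  */
struct raw_result
{
  long r8;	/* errno value on error, return value on success */
  long r10;	/* -1 on error */
};

/* Roughly what INLINE_SYSCALL does after DO_INLINE_SYSCALL: on error,
   store the code in errno and return -1.  */
static long
to_libc_result (struct raw_result r)
{
  if (r.r10 == -1)
    {
      /* Sanity check: the error code is expected to be positive.  */
      assert (r.r8 > 0);
      errno = (int) r.r8;
      return -1;
    }
  return r.r8;
}

/* Roughly what the lll_futex_* macros do: fold the error into a
   negative return value and leave errno untouched.  */
static long
to_futex_result (struct raw_result r)
{
  return r.r10 == -1 ? -r.r8 : r.r8;
}
```

The point of splitting DO_INLINE_SYSCALL out is that both consumers share
the same register setup and differ only in this last step.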


* Re: unwind failures due to __pthread_initialize_minimal
  2003-11-20 21:20                                                                     ` David Mosberger
@ 2003-12-07  1:46                                                                       ` Ulrich Drepper
  2003-12-08 17:40                                                                         ` David Mosberger
  0 siblings, 1 reply; 98+ messages in thread
From: Ulrich Drepper @ 2003-12-07  1:46 UTC (permalink / raw)
  To: davidm; +Cc: libc-hacker

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

David Mosberger wrote:

> The patch seems to work fine for me (same "make check" results as
> before).  I didn't update the linuxthreads version yet, but if this
> patch looks fine, that's trivial to do.

I've applied this patch, it looks fine.  The DSO now only has
.init_array and .fini_array con/destructors.

- -- 
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (GNU/Linux)

iD8DBQE/0oM62ijCOnn/RHQRAqoKAKDM/jwrl25e1G/LGLcXE6pb2ay17QCeNRC4
pM8wywLlOdjIilxGIrTe5y8=
=rKU7
-----END PGP SIGNATURE-----


* Re: unwind failures due to __pthread_initialize_minimal
  2003-12-07  1:46                                                                       ` Ulrich Drepper
@ 2003-12-08 17:40                                                                         ` David Mosberger
  2003-12-08 19:27                                                                           ` Ulrich Drepper
  0 siblings, 1 reply; 98+ messages in thread
From: David Mosberger @ 2003-12-08 17:40 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: davidm, libc-hacker

>>>>> On Sat, 06 Dec 2003 17:32:37 -0800, Ulrich Drepper <drepper@redhat.com> said:

  Uli> -----BEGIN PGP SIGNED MESSAGE-----
  Uli> Hash: SHA1

  Uli> David Mosberger wrote:

  >> The patch seems to work fine for me (same "make check" results as
  >> before).  I didn't update the linuxthreads version yet, but if this
  >> patch looks fine, that's trivial to do.

  Uli> I've applied this patch, it looks fine.  The DSO now only has
  Uli> .init_array and .fini_array con/destructors.

Thanks!

It will mean, however, that the new syscall-stub patch will conflict
with this change (I thought you had ignored this previous patch).  Do
I need to send an updated syscall-stub patch?

	--david


* Re: new syscall stub support for ia64 libc
  2003-12-03  7:25                                                             ` David Mosberger
@ 2003-12-08 18:16                                                               ` Jakub Jelinek
  2003-12-08 19:23                                                                 ` David Mosberger
  2003-12-08 22:17                                                                 ` David Mosberger
  2003-12-10 23:22                                                               ` Ulrich Drepper
  1 sibling, 2 replies; 98+ messages in thread
From: Jakub Jelinek @ 2003-12-08 18:16 UTC (permalink / raw)
  To: davidm; +Cc: Ulrich Drepper, libc-hacker

On Tue, Dec 02, 2003 at 11:25:32PM -0800, David Mosberger wrote:
> Executive Summary:
> 
>  Please apply this patch.  It's Good.
> 
> Long version:
> 
> The patch below adds new syscall support for ia64 Linux.  Compared to
> the earlier versions, it has a new autoconf test which ensures that
> USE_DL_SYSINFO only gets defined if the compiler uses an unwinder that
> is based on libunwind.  As explained earlier, the built-in unwinder
> for GCC is hopeless and so there is no point trying to support it.

The current IA-64 AT_SYSINFO_EHDR virtual DSO unfortunately seems to be
binary incompatible with older GCCs, which is IMHO a bad thing.
When the kernel provides AT_SYSINFO_EHDR but userland doesn't grok it yet,
things should work the old way.
I think simply swapping the 2 PT_LOAD segments in the virtual DSO would help,
i.e. put the PF_X segment before the PF_R one.
AT_SYSINFO_EHDR would point to the Elf64_Ehdr (followed by the Elf64_Phdrs)
in the PF_R segment, i.e. at 0xa000000000020000.
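
To make the layout question concrete, here is a rough sketch (not part of
any patch) of how userland could inspect the segment order of whatever
AT_SYSINFO_EHDR points to.  The helper names kernel_dso_ehdr and
inspect_load_segments are made up for illustration, and getauxval() is an
assumption: it exists in later glibc releases (2.16 and up), not in the
tree discussed here.

```c
#include <assert.h>
#include <elf.h>
#include <link.h>	/* ElfW() */
#include <stddef.h>
#include <string.h>
#include <sys/auxv.h>	/* getauxval() */

/* Return the address the kernel passed in AT_SYSINFO_EHDR, or NULL if
   the kernel did not supply a virtual DSO.  */
static const ElfW(Ehdr) *
kernel_dso_ehdr (void)
{
  return (const ElfW(Ehdr) *) getauxval (AT_SYSINFO_EHDR);
}

/* Count the PT_LOAD segments of the ELF image starting at EHDR and
   record in *FIRST_IS_EXEC whether the first one is executable.
   Returns the count, or -1 if EHDR does not carry the ELF magic.  */
static int
inspect_load_segments (const ElfW(Ehdr) *ehdr, int *first_is_exec)
{
  assert (first_is_exec != NULL);
  *first_is_exec = 0;
  if (ehdr == NULL || memcmp (ehdr->e_ident, ELFMAG, SELFMAG) != 0)
    return -1;

  /* The program headers sit e_phoff bytes past the ELF header.  */
  const ElfW(Phdr) *phdr
    = (const ElfW(Phdr) *) ((const char *) ehdr + ehdr->e_phoff);
  int nload = 0;

  for (int i = 0; i < ehdr->e_phnum; i++)
    if (phdr[i].p_type == PT_LOAD)
      {
	if (nload == 0)
	  *first_is_exec = (phdr[i].p_flags & PF_X) != 0;
	nload++;
      }
  return nload;
}
```

Calling inspect_load_segments (kernel_dso_ehdr (), &flag) reports how many
PT_LOAD segments the virtual DSO has and whether the executable one comes
first, which is the property the swap above is meant to change.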

I've briefly looked at .unwabi and will fix it in unwind-ia64.c, along
with any other issues I come across.

	Jakub


* Re: new syscall stub support for ia64 libc
  2003-12-08 18:16                                                               ` Jakub Jelinek
@ 2003-12-08 19:23                                                                 ` David Mosberger
  2003-12-08 21:17                                                                   ` Jakub Jelinek
  2003-12-08 22:17                                                                 ` David Mosberger
  1 sibling, 1 reply; 98+ messages in thread
From: David Mosberger @ 2003-12-08 19:23 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: davidm, Ulrich Drepper, libc-hacker

>>>>> On Mon, 8 Dec 2003 17:08:44 +0100, Jakub Jelinek <jakub@redhat.com> said:

  Jakub> The current IA-64 AT_SYSINFO_EHDR virtual DSO unfortunately
  Jakub> seems to be binary incompatible with older GCCs, which is
  Jakub> IMHO a bad thing.  When the kernel provides AT_SYSINFO_EHDR
  Jakub> but userland doesn't grok it yet, things should work the old
  Jakub> way.  I think simply swapping the 2 PT_LOAD segments in the
  Jakub> virtual DSO would help, i.e. put the PF_X segment before the
  Jakub> PF_R one.  AT_SYSINFO_EHDR would point to the Elf64_Ehdr
  Jakub> (followed by the Elf64_Phdrs) in the PF_R segment, i.e. at
  Jakub> 0xa000000000020000.

One possibility would be for glibc NOT to register the
AT_SYSINFO_EHDR DSO on ia64 for now (i.e., dl_iterate_phdr() wouldn't
find the kernel DSO).  libunwind would then fall back on using the
getunwind() system call.  Not pretty, but it would be backwards compatible.

Of course, longer term, we'd want to get rid of this workaround.  The
question is whether there could be a quick and reliable test to
discover whether the unwinder can handle .unwabi.  Unfortunately, this
is complicated by the fact that we may be dealing with multiple
unwinders (statically linked in and one loaded dynamically via
libgcc_s).
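
As a rough illustration of what dl_iterate_phdr() reports, the sketch
below walks the loaded objects and prints their PT_LOAD segments with
permission flags.  It is only a minimal sketch assuming glibc on Linux;
print_object and count_loaded_objects are names made up for this
example and are not glibc or libunwind code.  Note that <elf.h> spells
the executable-segment flag PF_X.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <link.h>
#include <stdio.h>

/* Callback for dl_iterate_phdr(): prints the PT_LOAD segments of one
   loaded object and counts the objects seen.  DATA points to an int.  */
static int
print_object (struct dl_phdr_info *info, size_t size, void *data)
{
  int *count = data;

  (void) size;
  assert (info != NULL && count != NULL);
  ++*count;

  printf ("%s (load base %#llx)\n",
          info->dlpi_name[0] != '\0' ? info->dlpi_name : "[unnamed object]",
          (unsigned long long) info->dlpi_addr);

  for (int i = 0; i < info->dlpi_phnum; ++i)
    {
      const ElfW(Phdr) *ph = &info->dlpi_phdr[i];

      if (ph->p_type != PT_LOAD)
        continue;
      printf ("  PT_LOAD at %#llx, flags %c%c%c\n",
              (unsigned long long) (info->dlpi_addr + ph->p_vaddr),
              (ph->p_flags & PF_R) ? 'r' : '-',
              (ph->p_flags & PF_W) ? 'w' : '-',
              (ph->p_flags & PF_X) ? 'x' : '-');
    }
  return 0;                     /* 0 means: keep iterating */
}

/* Returns how many objects the dynamic loader knows about.  */
int
count_loaded_objects (void)
{
  int count = 0;

  dl_iterate_phdr (print_object, &count);
  return count;
}
```

If ld.so registers the kernel DSO, it shows up here as one more
object; if it does not, an unwinder that relies on dl_iterate_phdr()
never sees it.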

	--david

* Re: unwind failures due to __pthread_initialize_minimal
  2003-12-08 17:40                                                                         ` David Mosberger
@ 2003-12-08 19:27                                                                           ` Ulrich Drepper
  2003-12-08 22:22                                                                             ` David Mosberger
  0 siblings, 1 reply; 98+ messages in thread
From: Ulrich Drepper @ 2003-12-08 19:27 UTC (permalink / raw)
  To: davidm; +Cc: libc-hacker

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

David Mosberger wrote:

> It will mean, however, that the new syscall-stub patch will conflict
> with this change (I thought you had ignored this previous patch).  Do
> I need to send an updated syscall-stub patch?

Would be good.  But I'm nevertheless waiting until Jakub has analyzed
the situation of the stock gcc.

- -- 
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (GNU/Linux)

iD8DBQE/1M1x2ijCOnn/RHQRApV+AJ93B3Meq+AU4LDXMj0gaqTh5x+GFgCgjgO/
D/wOaYS0CgsV21H1dCq3aTM=
=se5X
-----END PGP SIGNATURE-----

* Re: new syscall stub support for ia64 libc
  2003-12-08 19:23                                                                 ` David Mosberger
@ 2003-12-08 21:17                                                                   ` Jakub Jelinek
  2003-12-08 22:10                                                                     ` David Mosberger
  2003-12-09  4:41                                                                     ` David Mosberger
  0 siblings, 2 replies; 98+ messages in thread
From: Jakub Jelinek @ 2003-12-08 21:17 UTC (permalink / raw)
  To: davidm; +Cc: Ulrich Drepper, libc-hacker

On Mon, Dec 08, 2003 at 11:09:59AM -0800, David Mosberger wrote:
> >>>>> On Mon, 8 Dec 2003 17:08:44 +0100, Jakub Jelinek <jakub@redhat.com> said:
> 
>   Jakub> The current IA-64 AT_SYSINFO_EHDR virtual DSO seems to be
>   Jakub> unfortunately binary incompatible with older GCCs, which is
>   Jakub> IMHO a bad thing.  When kernel provides AT_SYSINFO_EHDR but
>   Jakub> userland doesn't grok it yet, things should work the old way.
>   Jakub> I think simply swapping the 2 PT_LOAD segments in virtual DSO
>   Jakub> would help, ie. put PF_E segment before PF_R.
>   Jakub> AT_SYSINFO_EHDR would point to the Elf64_Ehdr (followed by
>   Jakub> Elf64_Phdrs) in the PF_R, ie. 0xa000000000020000.
> 
> One possibility would be for glibc NOT to register the
> AT_SYSINFO_EHDR on ia64 for now (i.e., dl_iterate_phdr() wouldn't find
> the kernel DSO).  libunwind will then fall back on using the
> getunwind() system call.  Not pretty, but it would be backwards compatible.

It will not.  The thing is that MD_FALLBACK_FRAME_STATE_FOR special-cases
the signal trampoline only if it lies between:

#define IA64_GATE_AREA_START 0xa000000000000100LL
#define IA64_GATE_AREA_END   0xa000000000020000LL

#define MD_FALLBACK_FRAME_STATE_FOR(CONTEXT, FS, SUCCESS)               \
  if ((CONTEXT)->rp >= IA64_GATE_AREA_START                             \
      && (CONTEXT)->rp < IA64_GATE_AREA_END)                            \
    {                                                                   \
...

(that's still in current GCC).
If ld.so is not aware of AT_SYSINFO_EHDR, then
MD_FALLBACK_FRAME_STATE_FOR is where the signal trampoline should be
handled.
In 2.4.x kernels the signal trampoline indeed lived between
0xa000000000010000 and 0xa000000000020000, but with the VDSO changes
it is now at 0xa0000000000207e0.
That's why I suggested swapping the 2 segments: the signal
trampoline would then be at 0xa0000000000107e0 and an unmodified
libgcc_s.so.1 would keep working (as long as ld.so has not included
AT_SYSINFO_EHDR).
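
For illustration, the range test can be reduced to a small stand-alone
C sketch.  This is not the GCC code: in_old_gate_area and
check_trampoline_addresses are hypothetical names, and only the two
constants and the two trampoline addresses are taken from the text
above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same bounds as the macros quoted above; written with a ULL suffix
   here because the values do not fit in a signed 64-bit type.  */
#define IA64_GATE_AREA_START 0xa000000000000100ULL
#define IA64_GATE_AREA_END   0xa000000000020000ULL

/* True if RP would take the signal-trampoline fallback path.  */
bool
in_old_gate_area (uint64_t rp)
{
  return rp >= IA64_GATE_AREA_START && rp < IA64_GATE_AREA_END;
}

/* Checks the two trampoline addresses mentioned in the message.  */
void
check_trampoline_addresses (void)
{
  /* Current VDSO layout: the trampoline is past the end of the range.  */
  assert (!in_old_gate_area (0xa0000000000207e0ULL));
  /* With the two segments swapped: back inside the range.  */
  assert (in_old_gate_area (0xa0000000000107e0ULL));
}
```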

	Jakub

* Re: new syscall stub support for ia64 libc
  2003-12-08 21:17                                                                   ` Jakub Jelinek
@ 2003-12-08 22:10                                                                     ` David Mosberger
  2003-12-09  4:41                                                                     ` David Mosberger
  1 sibling, 0 replies; 98+ messages in thread
From: David Mosberger @ 2003-12-08 22:10 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: davidm, Ulrich Drepper, libc-hacker

>>>>> On Mon, 8 Dec 2003 20:09:58 +0100, Jakub Jelinek <jakub@redhat.com> said:

  Jakub> It will not.  The thing is that MD_FALLBACK_FRAME_STATE_FOR
  Jakub> special-cases the signal trampoline only if it lies between:

  Jakub> #define IA64_GATE_AREA_START 0xa000000000000100LL
  Jakub> #define IA64_GATE_AREA_END   0xa000000000020000LL

Ah, I didn't realize it was testing for such a narrow range.
Switching the read-only and execute-only pages should be OK.  At least
I can't think of anything that would depend on the order of those two
pages.

	--david

* Re: new syscall stub support for ia64 libc
  2003-12-08 18:16                                                               ` Jakub Jelinek
  2003-12-08 19:23                                                                 ` David Mosberger
@ 2003-12-08 22:17                                                                 ` David Mosberger
  2003-12-08 22:46                                                                   ` Jakub Jelinek
  1 sibling, 1 reply; 98+ messages in thread
From: David Mosberger @ 2003-12-08 22:17 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: davidm, Ulrich Drepper, libc-hacker

>>>>> On Mon, 8 Dec 2003 17:08:44 +0100, Jakub Jelinek <jakub@redhat.com> said:

  Jakub> I've briefly looked at .unwabi and will fix it in
  Jakub> unwind-ia64.c, along with any other issues I come across.

If you really want to keep the old unwinder alive, could you at least
put it in a separate (shared) library called libunwind.so?  I'm
concerned about version-mismatches that can occur when libgcc_eh.a
gets linked into a program and then libgcc_s.so gets loaded as well.
The version conflict arises because of potential differences in the
_Unwind_Context structure.  If libgcc_eh.a doesn't implement any part
of the _Unwind_*() routines, such version conflicts should be much
less likely.  This is the approach I'm now pursuing with libunwind: it
implements all the _Unwind_*() routines needed by GCC so no part of
that interface gets implemented in libgcc_eh.a anymore.  libunwind has
been updated already, but I haven't submitted a GCC patch yet, because
I can't get through to the CVS server (due to the savannah break-in, I
assume).

	--david

* Re: unwind failures due to __pthread_initialize_minimal
  2003-12-08 19:27                                                                           ` Ulrich Drepper
@ 2003-12-08 22:22                                                                             ` David Mosberger
  0 siblings, 0 replies; 98+ messages in thread
From: David Mosberger @ 2003-12-08 22:22 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: davidm, libc-hacker

>>>>> On Mon, 08 Dec 2003 11:13:53 -0800, Ulrich Drepper <drepper@redhat.com> said:

  Uli> -----BEGIN PGP SIGNED MESSAGE-----
  Uli> Hash: SHA1

  Uli> David Mosberger wrote:

  >> It will mean, however, that the new syscall-stub patch will conflict
  >> with this change (I thought you had ignored this previous patch).  Do
  >> I need to send an updated syscall-stub patch?

  Uli> Would be good.  But I'm nevertheless waiting until Jakub has analyzed
  Uli> the situation of the stock gcc.

OK, that's fine.  As long as I know that it's being worked on.
I'll wait to hear from Jakub before making a new patch, then.

	--david

* Re: new syscall stub support for ia64 libc
  2003-12-08 22:17                                                                 ` David Mosberger
@ 2003-12-08 22:46                                                                   ` Jakub Jelinek
  2003-12-08 23:03                                                                     ` David Mosberger
  0 siblings, 1 reply; 98+ messages in thread
From: Jakub Jelinek @ 2003-12-08 22:46 UTC (permalink / raw)
  To: davidm; +Cc: Ulrich Drepper, libc-hacker

On Mon, Dec 08, 2003 at 02:17:39PM -0800, David Mosberger wrote:
>   Jakub> I've briefly looked at .unwabi and will fix it in
>   Jakub> unwind-ia64.c, along with any other issues I come across.
> 
> If you really want to keep the old unwinder alive, could you at least
> put it in a separate (shared) library called libunwind.so?  I'm
> concerned about version-mismatches that can occur when libgcc_eh.a
> gets linked into a program and then libgcc_s.so gets loaded as well.
> The version conflict arises because of potential differences in the
> _Unwind_Context structure.  If libgcc_eh.a doesn't implement any part
> of the _Unwind_*() routines, such version conflicts should be much
> less likely.  This is the approach I'm now pursuing with libunwind: it
> implements all the _Unwind_*() routines needed by GCC so no part of
> that interface gets implemented in libgcc_eh.a anymore.  libunwind has
> been updated already, but I haven't submitted a GCC patch yet, because
> I can't get through to the CVS server (due to the savannah break-in, I
> assume).

Why a separate shared library and not libgcc_s.so?
The code certainly needs to be in libgcc_eh.a too for statically linked
apps.  Similarly C++ apps and C apps linked with -shared-libgcc.
The only problematic case in this regard is C app using -fexceptions
linked without -shared-libgcc, right?

Now, the thing is, no matter which shared library you want, solving
that case is quite hard.  Implying -shared-libgcc for all programs
is unnecessary bloat for the 99% of C apps which don't need it.
But libgcc_eh.a used for non-static links cannot easily use dlopen either
- that would bloat all apps with -ldl.

	Jakub

* Re: new syscall stub support for ia64 libc
  2003-12-08 22:46                                                                   ` Jakub Jelinek
@ 2003-12-08 23:03                                                                     ` David Mosberger
  0 siblings, 0 replies; 98+ messages in thread
From: David Mosberger @ 2003-12-08 23:03 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: davidm, Ulrich Drepper, libc-hacker

>>>>> On Mon, 8 Dec 2003 21:38:45 +0100, Jakub Jelinek <jakub@redhat.com> said:

  Jakub> Why a separate shared library and not libgcc_s.so?

libunwind is the standard library in the ia64 world for implementing
the _Unwind_*() interface.  Intel ships libunwind.so with their
compiler, HP-UX does too, etc.

  Jakub> The code certainly needs to be in libgcc_eh.a too for
  Jakub> statically linked apps.

I don't think so.  Just link in libunwind.a when necessary.  GCC
already has code to handle this.

	--david

* Re: new syscall stub support for ia64 libc
  2003-12-08 21:17                                                                   ` Jakub Jelinek
  2003-12-08 22:10                                                                     ` David Mosberger
@ 2003-12-09  4:41                                                                     ` David Mosberger
  1 sibling, 0 replies; 98+ messages in thread
From: David Mosberger @ 2003-12-09  4:41 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: davidm, Ulrich Drepper, libc-hacker

>>>>> On Mon, 8 Dec 2003 20:09:58 +0100, Jakub Jelinek <jakub@redhat.com> said:

  Jakub> In 2.4.x kernels the signal trampoline indeed lived between
  Jakub> 0xa000000000010000 and 0xa000000000020000, but with the VDSO
  Jakub> changes it is now at 0xa0000000000207e0.

It occurred to me that another (much simpler) approach would be to
simply remove the guard-page and then move the DSO mappings down by
64KB.  The combination of READONLY and EXECUTEONLY DSO mappings should
have the same effect as the single NOPROT guard-page, so it's not
strictly needed anymore.
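
The arithmetic behind this can be checked with a few lines of C.  This
is only a sketch: the helper name moved_down_by_guard_page is invented
here, the 64KB shift and the current trampoline address come from this
thread, and the upper bound is the IA64_GATE_AREA_END value quoted
earlier.

```c
#include <assert.h>
#include <stdint.h>

#define GUARD_PAGE_SIZE     0x10000ULL              /* 64KB */
#define OLD_GATE_AREA_END   0xa000000000020000ULL   /* IA64_GATE_AREA_END */

/* Address of something in the DSO after the mappings move down by
   the size of the removed guard page.  */
uint64_t
moved_down_by_guard_page (uint64_t addr)
{
  assert (addr >= GUARD_PAGE_SIZE);
  return addr - GUARD_PAGE_SIZE;
}
```

Applied to the current trampoline address 0xa0000000000207e0, this
gives 0xa0000000000107e0, which is below 0xa000000000020000 and so
falls back inside the range that existing unwinders test for.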

	--david

* Re: new syscall stub support for ia64 libc
  2003-12-03  7:25                                                             ` David Mosberger
  2003-12-08 18:16                                                               ` Jakub Jelinek
@ 2003-12-10 23:22                                                               ` Ulrich Drepper
  2003-12-11  0:37                                                                 ` David Mosberger
  1 sibling, 1 reply; 98+ messages in thread
From: Ulrich Drepper @ 2003-12-10 23:22 UTC (permalink / raw)
  To: davidm; +Cc: libc-hacker

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I've applied this patch now.  With the gcc changes Jakub posted today
I can build and run the code.

I made some changes, though.  Obviously the initfini code.  Please look
through all three files and send a patch with the remaining changes.
And we use the new syscall code now also for gcc so the appropriate
definition in the sysdep.h header has been changed.

Since the new glibc is needed at runtime it doesn't make much sense to
add configure tests for the new functionality.  We'll just have to have
a canned response like "get a new gcc" for people with problems.

Also, the ChangeLog was quite incomplete.

Thanks,

- -- 
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (GNU/Linux)

iD8DBQE/16af2ijCOnn/RHQRAk9+AJ9izhCJbAjBo4RmO20tmLPDJf7iSQCgwgt8
nTnsN4nm+uW5P4qADGIjAaw=
=g0LV
-----END PGP SIGNATURE-----

* Re: new syscall stub support for ia64 libc
  2003-12-10 23:22                                                               ` Ulrich Drepper
@ 2003-12-11  0:37                                                                 ` David Mosberger
  2003-12-11 21:00                                                                   ` Ulrich Drepper
  0 siblings, 1 reply; 98+ messages in thread
From: David Mosberger @ 2003-12-11  0:37 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: davidm, libc-hacker

>>>>> On Wed, 10 Dec 2003 15:05:03 -0800, Ulrich Drepper <drepper@redhat.com> said:

  Ulrich> I've applied this patch now.

Yippee!

  Ulrich> I made some changes, though.  Obviously the initfini code.
  Ulrich> Please look through all three files and send a patch with
  Ulrich> the remaining changes.

Attached is a patch for the (slightly updated) left-overs.

Thanks,

	--david

nptl/ChangeLog

2003-12-10  David Mosberger  <davidm@hpl.hp.com>

	* sysdeps/unix/sysv/linux/ia64/pt-initfini.c (_init_EPILOG_BEGINS):
	Add unwind directives.  Drop unused .regstk directive.
	(_fini_EPILOG_BEGINS): Add unwind directives.

linuxthreads/ChangeLog

2003-12-10  David Mosberger  <davidm@hpl.hp.com>

	* sysdeps/unix/sysv/linux/ia64/pt-initfini.c: Update copyright
	message.  Add include of <stddef.h>.
	(INIT_NEW_WAY): New macro.
	(INIT_OLD_WAY): Likewise.
	(_init): Add unwind directives.  Invoke
	__pthread_initialize_minimal() via INIT_NEW_WAY or INIT_OLD_WAY,
	respectively.
	(_init_EPILOG_BEGINS): Add unwind-directives.  Drop unused .regstk
	directive.
	(_fini): Add unwind directives.  Drop unnecessary .align 16
	directive (bundles are always 16-byte aligned).
	(_fini_EPILOG_BEGINS): Add unwind-directives.

Index: linuxthreads/sysdeps/unix/sysv/linux/ia64/pt-initfini.c
--- linuxthreads/sysdeps/unix/sysv/linux/ia64/pt-initfini.c
+++ linuxthreads/sysdeps/unix/sysv/linux/ia64/pt-initfini.c
@@ -1,5 +1,5 @@
 /* Special .init and .fini section support for ia64. LinuxThreads version.
-   Copyright (C) 2000, 2001, 2002 Free Software Foundation, Inc.
+   Copyright (C) 2000, 2001, 2002, 2003 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
 
    The GNU C Library is free software; you can redistribute it
@@ -36,40 +36,62 @@
    * crtn.s puts the corresponding function epilogues
    in the .init and .fini sections. */
 
+#include <stddef.h>
+
+#ifdef HAVE_INITFINI_ARRAY
+
+# define INIT_NEW_WAY \
+    ".xdata8 \".init_array\", @fptr(__pthread_initialize_minimal)\n"
+# define INIT_OLD_WAY ""
+#else
+# define INIT_NEW_WAY ""
+# define INIT_OLD_WAY \
+	"\n\
+	st8 [r12] = gp, -16\n\
+	br.call.sptk.many b0 = __pthread_initialize_minimal# ;;\n\
+	;;\n\
+	adds r12 = 16, r12\n\
+	;;\n\
+	ld8 gp = [r12]\n\
+	;;\n"
+#endif
+
 __asm__ ("\n\
 \n\
 #include \"defs.h\"\n\
 \n\
 /*@HEADER_ENDS*/\n\
 \n\
-/*@_init_PROLOG_BEGINS*/\n\
-	.section .init\n\
+/*@_init_PROLOG_BEGINS*/\n"
+	INIT_NEW_WAY
+	".section .init\n\
 	.align 16\n\
 	.global _init#\n\
 	.proc _init#\n\
 _init:\n\
+	.prologue\n\
+	.save ar.pfs, r34\n\
 	alloc r34 = ar.pfs, 0, 3, 0, 0\n\
+	.vframe r32\n\
 	mov r32 = r12\n\
+	.save rp, r33\n\
 	mov r33 = b0\n\
+	.body\n\
 	adds r12 = -16, r12\n\
-	;;\n\
-/* we could use r35 to save gp, but we use the stack since that's what\n\
- * all the other init routines will do --davidm 00/04/05 */\n\
-	st8 [r12] = gp, -16\n\
-	br.call.sptk.many b0 = __pthread_initialize_minimal# ;;\n\
-	;;\n\
-	adds r12 = 16, r12\n\
-	;;\n\
-	ld8 gp = [r12]\n\
-	;;\n\
-	.align 16\n\
-	.endp _init#\n\
+	;;\n"
+	INIT_OLD_WAY
+	".endp _init#\n\
 \n\
 /*@_init_PROLOG_ENDS*/\n\
 \n\
 /*@_init_EPILOG_BEGINS*/\n\
 	.section .init\n\
-	.regstk 0,2,0,0\n\
+	.proc _init#\n\
+	.prologue\n\
+	.save ar.pfs, r34\n\
+	.vframe r32\n\
+	.save rp, r33\n\
+	.body\n\
 	mov r12 = r32\n\
 	mov ar.pfs = r34\n\
 	mov b0 = r33\n\
@@ -83,18 +105,28 @@
 	.global _fini#\n\
 	.proc _fini#\n\
 _fini:\n\
+	.prologue\n\
+	.save ar.pfs, r34\n\
 	alloc r34 = ar.pfs, 0, 3, 0, 0\n\
+	.vframe r32\n\
 	mov r32 = r12\n\
+	.save rp, r33\n\
 	mov r33 = b0\n\
+	.body\n\
 	adds r12 = -16, r12\n\
 	;;\n\
-	.align 16\n\
 	.endp _fini#\n\
 \n\
 /*@_fini_PROLOG_ENDS*/\n\
 \n\
 /*@_fini_EPILOG_BEGINS*/\n\
 	.section .fini\n\
+	.proc _fini#\n\
+	.prologue\n\
+	.save ar.pfs, r34\n\
+	.vframe r32\n\
+	.save rp, r33\n\
+	.body\n\
 	mov r12 = r32\n\
 	mov ar.pfs = r34\n\
 	mov b0 = r33\n\
Index: nptl/sysdeps/unix/sysv/linux/ia64/pt-initfini.c
--- nptl/sysdeps/unix/sysv/linux/ia64/pt-initfini.c
+++ nptl/sysdeps/unix/sysv/linux/ia64/pt-initfini.c
@@ -87,7 +87,12 @@
 \n\
 /*@_init_EPILOG_BEGINS*/\n\
 	.section .init\n\
-	.regstk 0,2,0,0\n\
+	.proc _init#\n\
+	.prologue\n\
+	.save ar.pfs, r34\n\
+	.vframe r32\n\
+	.save rp, r33\n\
+	.body\n\
 	mov r12 = r32\n\
 	mov ar.pfs = r34\n\
 	mov b0 = r33\n\
@@ -117,6 +122,12 @@
 \n\
 /*@_fini_EPILOG_BEGINS*/\n\
 	.section .fini\n\
+	.proc _fini#\n\
+	.prologue\n\
+	.save ar.pfs, r34\n\
+	.vframe r32\n\
+	.save rp, r33\n\
+	.body\n\
 	mov r12 = r32\n\
 	mov ar.pfs = r34\n\
 	mov b0 = r33\n\
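
For readers unfamiliar with the mechanism behind INIT_NEW_WAY: rather
than calling the initializer from the body of _init, a pointer to it
is placed in .init_array and the startup code calls it from there.  A
plain C sketch of the same idea uses the constructor attribute.  This
assumes GCC or Clang on an ELF target where constructors are emitted
into .init_array (older toolchains may use .ctors instead); early_init
and was_initialized are names made up for this example.

```c
#include <assert.h>

/* Set by the constructor below, before main() starts running.  */
static int initialized;

/* The compiler records a pointer to this function in the object's
   constructor table; the startup code runs it before main().  */
__attribute__ ((constructor))
static void
early_init (void)
{
  initialized = 1;
}

/* Reports whether the constructor has already run.  */
int
was_initialized (void)
{
  return initialized;
}
```

Calling was_initialized() from main() returns 1, since the constructor
has already run by then.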

* Re: new syscall stub support for ia64 libc
  2003-12-11  0:37                                                                 ` David Mosberger
@ 2003-12-11 21:00                                                                   ` Ulrich Drepper
  0 siblings, 0 replies; 98+ messages in thread
From: Ulrich Drepper @ 2003-12-11 21:00 UTC (permalink / raw)
  To: davidm; +Cc: libc-hacker

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

David Mosberger wrote:

> Attached is a patch for the (slightly updated) left-overs.

Applied.  Thanks,

- -- 
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (GNU/Linux)

iD8DBQE/2Nen2ijCOnn/RHQRAhqGAKCrFvwKxorHVgGmBLVnxRmGYs0MUwCghuFM
NeEMKmMkKvh/LfU1UUcKYdA=
=xpqJ
-----END PGP SIGNATURE-----

Thread overview: 98+ messages
2003-10-29  4:26 new syscall stub support for ia64 libc David Mosberger
2003-10-29  9:51 ` Jakub Jelinek
2003-10-30  8:04   ` David Mosberger
2003-10-30  9:09     ` Jakub Jelinek
2003-10-30 19:38       ` Roland McGrath
2003-10-30 19:59       ` David Mosberger
2003-10-30 20:23         ` Jakub Jelinek
2003-10-30 22:35           ` David Mosberger
2003-10-31  8:45     ` Richard Henderson
2003-10-31  9:07       ` Jakub Jelinek
2003-10-31 16:45         ` David Mosberger
2003-10-31 16:54           ` Jakub Jelinek
2003-10-31 18:29             ` David Mosberger
2003-11-03 21:46             ` David Mosberger
2003-11-12 22:53             ` David Mosberger
2003-11-12 23:10               ` Ulrich Drepper
2003-11-12 23:47                 ` David Mosberger
2003-11-12 23:57                   ` Jakub Jelinek
2003-11-13  2:38                     ` David Mosberger
2003-11-13  3:46                       ` Ulrich Drepper
2003-11-13  3:53                         ` David Mosberger
2003-11-13  8:23                       ` Jakub Jelinek
2003-11-13  7:32               ` David Mosberger
2003-11-13  9:24                 ` Ulrich Drepper
2003-11-13 17:30                   ` David Mosberger
2003-11-13 17:56                     ` Ulrich Drepper
2003-11-13 18:47                       ` David Mosberger
2003-11-13 20:16                         ` Ulrich Drepper
2003-11-13 21:34                       ` David Mosberger
2003-11-13 21:44                         ` Jakub Jelinek
2003-11-13 21:58                           ` David Mosberger
2003-11-13 23:45                           ` David Mosberger
2003-11-14  1:44                             ` Ulrich Drepper
2003-11-14  1:54                               ` David Mosberger
2003-11-14  2:18                               ` David Mosberger
2003-11-14  2:57                                 ` Ulrich Drepper
2003-11-14  3:22                                   ` David Mosberger
2003-11-14  3:39                                     ` Ulrich Drepper
2003-11-14  5:29                                     ` Ulrich Drepper
2003-11-14  5:49                                       ` David Mosberger
2003-11-14  6:04                                         ` Ulrich Drepper
2003-11-14  6:43                                           ` David Mosberger
2003-11-14 19:53                                             ` Ulrich Drepper
2003-11-14 19:56                                               ` David Mosberger
2003-11-14 20:36                                                 ` Ulrich Drepper
2003-11-15  0:51                                                   ` David Mosberger
2003-11-15  9:38                                                   ` David Mosberger
2003-11-17 18:21                                                     ` Ulrich Drepper
2003-11-17 18:35                                                       ` David Mosberger
2003-11-18  7:54                                                       ` David Mosberger
2003-11-18  8:22                                                         ` Ulrich Drepper
2003-11-18 16:45                                                           ` David Mosberger
2003-11-19 23:37                                                           ` unwind failures due to __pthread_initialize_minimal David Mosberger
2003-11-19 23:54                                                             ` Ulrich Drepper
2003-11-20  0:30                                                               ` Roland McGrath
2003-11-20  2:35                                                                 ` David Mosberger
2003-11-20  4:01                                                                   ` Ulrich Drepper
2003-11-20 21:20                                                                     ` David Mosberger
2003-12-07  1:46                                                                       ` Ulrich Drepper
2003-12-08 17:40                                                                         ` David Mosberger
2003-12-08 19:27                                                                           ` Ulrich Drepper
2003-12-08 22:22                                                                             ` David Mosberger
2003-11-26  9:40                                                           ` new syscall stub support for ia64 libc David Mosberger
2003-12-03  7:25                                                             ` David Mosberger
2003-12-08 18:16                                                               ` Jakub Jelinek
2003-12-08 19:23                                                                 ` David Mosberger
2003-12-08 21:17                                                                   ` Jakub Jelinek
2003-12-08 22:10                                                                     ` David Mosberger
2003-12-09  4:41                                                                     ` David Mosberger
2003-12-08 22:17                                                                 ` David Mosberger
2003-12-08 22:46                                                                   ` Jakub Jelinek
2003-12-08 23:03                                                                     ` David Mosberger
2003-12-10 23:22                                                               ` Ulrich Drepper
2003-12-11  0:37                                                                 ` David Mosberger
2003-12-11 21:00                                                                   ` Ulrich Drepper
2003-11-17 22:15                                                     ` David Mosberger
2003-11-15 19:05                                                   ` David Mosberger
2003-11-17 18:14                                                     ` Ulrich Drepper
2003-11-18  0:47                                                       ` David Mosberger
2003-11-18  1:02                                                         ` Ulrich Drepper
2003-11-18  1:22                                                           ` David Mosberger
2003-11-18  1:37                                                             ` Ulrich Drepper
2003-11-18  1:46                                                               ` David Mosberger
2003-11-18  2:17                                                                 ` Ulrich Drepper
2003-11-18  5:44                                                                   ` David Mosberger
2003-11-18 19:18                                                                   ` David Mosberger
2003-11-18 19:35                                                                     ` Ulrich Drepper
2003-11-18 20:08                                                                       ` David Mosberger
2003-11-14 20:13                                               ` patch to fix unwind info for ia64 David Mosberger
2003-11-14 20:21                                               ` David Mosberger
2003-11-14 20:24                                                 ` Roland McGrath
2003-11-14 21:12                                                   ` David Mosberger
2003-11-15 17:42                                                 ` Andreas Schwab
2003-11-15 18:52                                                   ` David Mosberger
2003-11-19  6:19                                                     ` David Mosberger
2003-11-19 15:25                                                     ` Ulrich Drepper
2003-10-31 16:43       ` new syscall stub support for ia64 libc David Mosberger
2003-10-29 17:54 ` Ulrich Drepper
