public inbox for libc-alpha@sourceware.org
* [PATCH] mips/o32: fix internal_syscall5/6/7
@ 2017-08-15 11:53 Aurelien Jarno
  2017-08-15 12:03 ` Andreas Schwab
  2017-08-15 12:17 ` Florian Weimer
  0 siblings, 2 replies; 53+ messages in thread
From: Aurelien Jarno @ 2017-08-15 11:53 UTC (permalink / raw)
  To: libc-alpha; +Cc: Aurelien Jarno

The internal_syscall5/6/7 functions use the stack pointer to store
the 5th and following arguments on the stack. In some cases GCC optimizes
out the stack pointer, and thus storing the data to the stack causes a
segmentation fault.

Fix that by declaring the sp register as clobbered. Not sure it is the
best way to do that, but it seems to be enough to force GCC to not
optimize it out.

This fixes the nptl/tst-rwlock15 test on mips/o32.

ChangeLog:
	* sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
	(internal_syscall5): Add "$29" to the clobber list.
	(internal_syscall6): Likewise.
	(internal_syscall7): Likewise.
---
 ChangeLog                                    | 7 +++++++
 sysdeps/unix/sysv/linux/mips/mips32/sysdep.h | 6 +++---
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 56540f55a1..5d1a088431 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,10 @@
+2017-08-15  Aurelien Jarno <aurelien@aurel32.net>
+
+	* sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
+	(internal_syscall5): Add "$29" to the clobber list.
+	(internal_syscall6): Likewise.
+	(internal_syscall7): Likewise.
+
 2017-08-14  Joseph Myers  <joseph@codesourcery.com>
 
 	* conform/data/sys/wait.h-data (si_value): Do not expect for
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h b/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
index e9e3ee7e82..0df32c186f 100644
--- a/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
+++ b/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
@@ -294,7 +294,7 @@
 	: "=r" (__v0), "+r" (__a3)					\
 	: input, "r" (__a0), "r" (__a1), "r" (__a2),			\
 	  "r" ((long) (arg5))						\
-	: __SYSCALL_CLOBBERS);						\
+	: __SYSCALL_CLOBBERS, "$29");					\
 	err = __a3;							\
 	_sys_result = __v0;						\
 	}								\
@@ -327,7 +327,7 @@
 	: "=r" (__v0), "+r" (__a3)					\
 	: input, "r" (__a0), "r" (__a1), "r" (__a2),			\
 	  "r" ((long) (arg5)), "r" ((long) (arg6))			\
-	: __SYSCALL_CLOBBERS);						\
+	: __SYSCALL_CLOBBERS, "$29");					\
 	err = __a3;							\
 	_sys_result = __v0;						\
 	}								\
@@ -361,7 +361,7 @@
 	: "=r" (__v0), "+r" (__a3)					\
 	: input, "r" (__a0), "r" (__a1), "r" (__a2),			\
 	  "r" ((long) (arg5)), "r" ((long) (arg6)), "r" ((long) (arg7))	\
-	: __SYSCALL_CLOBBERS);						\
+	: __SYSCALL_CLOBBERS, "$29");					\
 	err = __a3;							\
 	_sys_result = __v0;						\
 	}								\
-- 
2.13.2

* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-15 11:53 [PATCH] mips/o32: fix internal_syscall5/6/7 Aurelien Jarno
@ 2017-08-15 12:03 ` Andreas Schwab
  2017-08-15 13:06   ` Adhemerval Zanella
  2017-08-15 16:16   ` Aurelien Jarno
  2017-08-15 12:17 ` Florian Weimer
  1 sibling, 2 replies; 53+ messages in thread
From: Andreas Schwab @ 2017-08-15 12:03 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: libc-alpha

On Aug 15 2017, Aurelien Jarno <aurelien@aurel32.net> wrote:

> The internal_syscall5/6/7 functions use the stack pointer to store
> the 5th and following arguments on the stack. In some cases GCC optimize
> out the stack pointer, and thus storing the data to the stack causes a
> segmentation fault.

FORCE_FRAME_POINTER does not work any more?

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-15 11:53 [PATCH] mips/o32: fix internal_syscall5/6/7 Aurelien Jarno
  2017-08-15 12:03 ` Andreas Schwab
@ 2017-08-15 12:17 ` Florian Weimer
  1 sibling, 0 replies; 53+ messages in thread
From: Florian Weimer @ 2017-08-15 12:17 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: libc-alpha

On 08/15/2017 01:50 PM, Aurelien Jarno wrote:
> The internal_syscall5/6/7 functions use the stack pointer to store
> the 5th and following arguments on the stack. In some cases GCC optimize
> out the stack pointer, and thus storing the data to the stack causes a
> segmentation fault.
> 
> Fix that by declaring the sp register as clobbered. Not sure it is the
> best way to do that, but it seems to be enough to force GCC to not
> optimize it out.

Doesn't the inline assembly *require* that r29 is the stack pointer?  Is
there a way to express this explicitly?

Thanks,
Florian

* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-15 12:03 ` Andreas Schwab
@ 2017-08-15 13:06   ` Adhemerval Zanella
  2017-08-15 16:18     ` Aurelien Jarno
  2017-08-15 16:26     ` Joseph Myers
  2017-08-15 16:16   ` Aurelien Jarno
  1 sibling, 2 replies; 53+ messages in thread
From: Adhemerval Zanella @ 2017-08-15 13:06 UTC (permalink / raw)
  To: libc-alpha



On 15/08/2017 09:00, Andreas Schwab wrote:
> On Aug 15 2017, Aurelien Jarno <aurelien@aurel32.net> wrote:
> 
>> The internal_syscall5/6/7 functions use the stack pointer to store
>> the 5th and following arguments on the stack. In some cases GCC optimize
>> out the stack pointer, and thus storing the data to the stack causes a
>> segmentation fault.
> 
> FORCE_FRAME_POINTER does not work any more?

Wouldn't a better and more optimization-proof option be to route
syscall5/6/7 through an out-of-line symbol that handles the stack
pointer properly, as ARM and i386 do (__libc_do_syscall)?

* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-15 12:03 ` Andreas Schwab
  2017-08-15 13:06   ` Adhemerval Zanella
@ 2017-08-15 16:16   ` Aurelien Jarno
  1 sibling, 0 replies; 53+ messages in thread
From: Aurelien Jarno @ 2017-08-15 16:16 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: libc-alpha

On 2017-08-15 14:00, Andreas Schwab wrote:
> On Aug 15 2017, Aurelien Jarno <aurelien@aurel32.net> wrote:
> 
> > The internal_syscall5/6/7 functions use the stack pointer to store
> > the 5th and following arguments on the stack. In some cases GCC optimize
> > out the stack pointer, and thus storing the data to the stack causes a
> > segmentation fault.
> 
> FORCE_FRAME_POINTER does not work any more?

From what I understand of the generated code, it seems to work at the
function level, but not at the asm code level. The pthread_rwlock_rdlock
adds a loop around the syscall, and it seems that's the code path causing
the issue. Adding $sp as clobbered changes the code to reload $sp
from the saved value on each loop iteration.

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net

* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-15 13:06   ` Adhemerval Zanella
@ 2017-08-15 16:18     ` Aurelien Jarno
  2017-08-15 16:26     ` Joseph Myers
  1 sibling, 0 replies; 53+ messages in thread
From: Aurelien Jarno @ 2017-08-15 16:18 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: libc-alpha

On 2017-08-15 10:06, Adhemerval Zanella wrote:
> 
> 
> On 15/08/2017 09:00, Andreas Schwab wrote:
> > On Aug 15 2017, Aurelien Jarno <aurelien@aurel32.net> wrote:
> > 
> >> The internal_syscall5/6/7 functions use the stack pointer to store
> >> the 5th and following arguments on the stack. In some cases GCC optimize
> >> out the stack pointer, and thus storing the data to the stack causes a
> >> segmentation fault.
> > 
> > FORCE_FRAME_POINTER does not work any more?
> 
> Wouldn't a better option and more compiler optimization proof to route
> syscall5/6/7 to a out of line symbol call to proper handle the stack
> pointer as for ARM and i386 (__libc_do_syscall)?

Hmm interesting indeed, though that implies an additional call to a
function instead of being fully inline. Not sure it makes a big
difference performance wise given the syscall a few instructions later.

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net

* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-15 13:06   ` Adhemerval Zanella
  2017-08-15 16:18     ` Aurelien Jarno
@ 2017-08-15 16:26     ` Joseph Myers
  2017-08-15 19:34       ` Aurelien Jarno
  1 sibling, 1 reply; 53+ messages in thread
From: Joseph Myers @ 2017-08-15 16:26 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: libc-alpha

On Tue, 15 Aug 2017, Adhemerval Zanella wrote:

> Wouldn't a better option and more compiler optimization proof to route
> syscall5/6/7 to a out of line symbol call to proper handle the stack
> pointer as for ARM and i386 (__libc_do_syscall)?

Indeed (and with a bug filed in Bugzilla as usual since this issue was 
user-visible in a release).

-- 
Joseph S. Myers
joseph@codesourcery.com

* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-15 16:26     ` Joseph Myers
@ 2017-08-15 19:34       ` Aurelien Jarno
  2017-08-15 19:54         ` Joseph Myers
  0 siblings, 1 reply; 53+ messages in thread
From: Aurelien Jarno @ 2017-08-15 19:34 UTC (permalink / raw)
  To: Joseph Myers; +Cc: Adhemerval Zanella, libc-alpha

On 2017-08-15 16:24, Joseph Myers wrote:
> On Tue, 15 Aug 2017, Adhemerval Zanella wrote:
> 
> > Wouldn't a better option and more compiler optimization proof to route
> > syscall5/6/7 to a out of line symbol call to proper handle the stack
> > pointer as for ARM and i386 (__libc_do_syscall)?
> 
> Indeed (and with a bug filed in Bugzilla as usual since this issue was 
> user-visible in a release).

That's one way to do that, however it does seem correct to me that there
is no way to force the stack pointer to be valid in an asm code. The
stack pointer is used in other asm codes in the glibc, so we need a more
global solution.

For the record, here is the corresponding generated code showing the
issue:

 174:	8e020000 	lw	v0,0(s0)
 178:	30420004 	andi	v0,v0,0x4
 17c:	104000a6 	beqz	v0,418 <__GI___pthread_rwlock_rdlock+0x418>
 180:	02c01825 	move	v1,s6
 184:	92050019 	lbu	a1,25(s0)
 188:	27bdfff0 	addiu	sp,sp,-16

 here the stack pointer is changed

 18c:	8fc60024 	lw	a2,36(s8)
 190:	02002025 	move	a0,s0
 194:	02e5180a 	movz	v1,s7,a1
 198:	27a20010 	addiu	v0,sp,16
 19c:	00003825 	move	a3,zero
 1a0:	afc20020 	sw	v0,32(s8)

 and the original value stored in the fp.

 1a4:	00001025 	move	v0,zero
 1a8:	00602825 	move	a1,v1

 --- begin of asm code

 1ac:	27bdffe0 	addiu	sp,sp,-32
 1b0:	afa20010 	sw	v0,16(sp)
 1b4:	afb20014 	sw	s2,20(sp)
 1b8:	2402108e 	li	v0,4238
 1bc:	0000000c 	syscall
 1c0:	27bd0020 	addiu	sp,sp,32

 --- end of asm code

 1c4:	10e0ffeb 	beqz	a3,174 <__GI___pthread_rwlock_rdlock+0x174>
 1c8:	00021823 	negu	v1,v0


When specifying the stack pointer as clobbered, we end up with the
following code:

 174:	8e020000 	lw	v0,0(s0)
 178:	30420004 	andi	v0,v0,0x4
 17c:	104000a6 	beqz	v0,418 <__GI___pthread_rwlock_rdlock+0x418>
 180:	8fc60024 	lw	a2,36(s8)
 184:	02a0e825 	move	sp,s5

 here the stack pointer is reloaded at each loop (the decrease by 16 is
 done earlier before saving it in s5).

 188:	92050019 	lbu	a1,25(s0)
 18c:	27a20010 	addiu	v0,sp,16
 190:	02002025 	move	a0,s0
 194:	afc20020 	sw	v0,32(s8)
 198:	02c01025 	move	v0,s6
 19c:	02e5100a 	movz	v0,s7,a1
 1a0:	00003825 	move	a3,zero
 1a4:	00402825 	move	a1,v0
 1a8:	00001025 	move	v0,zero

 --- begin of asm code

 1ac:	27bdffe0 	addiu	sp,sp,-32
 1b0:	afa20010 	sw	v0,16(sp)
 1b4:	afb20014 	sw	s2,20(sp)
 1b8:	2402108e 	li	v0,4238
 1bc:	0000000c 	syscall
 1c0:	27bd0020 	addiu	sp,sp,32

 --- end of asm code

 1c4:	10e0ffeb 	beqz	a3,174 <__GI___pthread_rwlock_rdlock+0x174>
 1c8:	00021823 	negu	v1,v0
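
The two listings differ only in how $sp evolves across loop passes, which
can be modelled with plain arithmetic. The sketch below is an illustration,
not glibc code: the function names are hypothetical, offsets are relative to
the value of $sp on loop entry, and `saved` stands in for the copy kept in s5.

```c
/* Model of the first listing: the loop body lowers sp by 16 on every
   pass and nothing restores it.  The asm block's own -32/+32 pair
   cancels out, so it contributes nothing to the drift.  */
long
sp_offset_without_clobber (int passes)
{
  long sp = 0;
  for (int i = 0; i < passes; i++)
    {
      sp -= 16;	/* addiu sp,sp,-16 at 0x188 */
      sp -= 32;	/* addiu sp,sp,-32 at 0x1ac (start of asm) */
      sp += 32;	/* addiu sp,sp,32 at 0x1c0 (end of asm) */
    }
  return sp;
}

/* Model of the second listing: sp is reloaded from the saved value at
   the top of every pass, so the offset stays bounded.  */
long
sp_offset_with_clobber (int passes)
{
  long saved = -16;	/* the decrease by 16 happens once, before the loop */
  long sp = saved;
  for (int i = 0; i < passes; i++)
    {
      sp = saved;	/* move sp,s5 at 0x184 */
      sp -= 32;	/* start of asm */
      sp += 32;	/* end of asm */
    }
  return sp;
}
```

In this model 1000 passes leave $sp 16000 bytes lower in the first case and
16 bytes lower in the second, which matches the stack-guard crash only
showing up after many iterations.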

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net

* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-15 19:34       ` Aurelien Jarno
@ 2017-08-15 19:54         ` Joseph Myers
  2017-08-15 20:09           ` Aurelien Jarno
  0 siblings, 1 reply; 53+ messages in thread
From: Joseph Myers @ 2017-08-15 19:54 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: Adhemerval Zanella, libc-alpha

On Tue, 15 Aug 2017, Aurelien Jarno wrote:

> That's one way to do that, however it does seem correct to me that there
> is no way to force the stack pointer to be valid in an asm code. The
> stack pointer is used in other asm codes in the glibc, so we need a more
> global solution.

What's the basis for saying the stack pointer is invalid (as opposed to 
unwind information referring to the original stack pointer, so being 
invalid at the point of the syscall, causing unwinding to crash)?  The 
stack pointer should be unconditionally valid for all asms, on all 
architectures; after all, it's certainly OK to make a function call from 
inside an asm, or for a signal handler (without sigaltstack) to interrupt 
an asm.

I don't think modifying the stack pointer inside an asm can ever be safe 
*in glibc's context* because unwind info might refer to it (even if 
there's a frame pointer), and I don't think making an asm clobber the 
stack pointer is safe either.

-- 
Joseph S. Myers
joseph@codesourcery.com

* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-15 19:54         ` Joseph Myers
@ 2017-08-15 20:09           ` Aurelien Jarno
  2017-08-15 20:21             ` Joseph Myers
  0 siblings, 1 reply; 53+ messages in thread
From: Aurelien Jarno @ 2017-08-15 20:09 UTC (permalink / raw)
  To: Joseph Myers; +Cc: Adhemerval Zanella, libc-alpha

On 2017-08-15 19:51, Joseph Myers wrote:
> On Tue, 15 Aug 2017, Aurelien Jarno wrote:
> 
> > That's one way to do that, however it does seem correct to me that there
> > is no way to force the stack pointer to be valid in an asm code. The
> > stack pointer is used in other asm codes in the glibc, so we need a more
> > global solution.
> 
> What's the basis for saying the stack pointer is invalid (as opposed to 
> unwind information referring to the original stack pointer, so being 
> invalid at the point of the syscall, causing unwinding to crash)?  The 

1) Looking at the assembly code, the value of the stack pointer around
   the syscall depends on the number of times the loop is executed.
2) The crash happens when reaching the stack guard, with a very simple
   test case not using recursive functions.

> stack pointer should be unconditionally valid for all asms, on all 
> architectures; after all, it's certainly OK to make a function call from 
> inside an asm, or for a signal handler (without sigaltstack) to interrupt 
> an asm.
> 
> I don't think modifying the stack pointer inside an asm can ever be safe 
> *in glibc's context* because unwind info might refer to it (even if 
> there's a frame pointer), and I don't think making an asm clobber the 
> stack pointer is safe either.

That's indeed correct, so in that specific case it does make sense
to use an out-of-line symbol call. I am still worried about other uses of
$sp in other asm code.

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net

* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-15 20:09           ` Aurelien Jarno
@ 2017-08-15 20:21             ` Joseph Myers
  2017-08-15 20:41               ` Aurelien Jarno
  2017-08-16 13:26               ` Maciej W. Rozycki
  0 siblings, 2 replies; 53+ messages in thread
From: Joseph Myers @ 2017-08-15 20:21 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: Adhemerval Zanella, libc-alpha

On Tue, 15 Aug 2017, Aurelien Jarno wrote:

> On 2017-08-15 19:51, Joseph Myers wrote:
> > On Tue, 15 Aug 2017, Aurelien Jarno wrote:
> > 
> > > That's one way to do that, however it does seem correct to me that there
> > > is no way to force the stack pointer to be valid in an asm code. The
> > > stack pointer is used in other asm codes in the glibc, so we need a more
> > > global solution.
> > 
> > What's the basis for saying the stack pointer is invalid (as opposed to 
> > unwind information referring to the original stack pointer, so being 
> > invalid at the point of the syscall, causing unwinding to crash)?  The 
> 
> 1) Looking at the assembly code, the value of the stack pointer around
>    the syscall depends on the number of time the loop is executed.
> 2) The crash happens when reaching the stack guard, with a very simple
>    test case not using recursive functions.

What that says to me is that the alloca (to ensure frame pointer creation) 
is fundamentally problematic if the syscall macro can be used many times 
in a loop within a function, because it will allocate unbounded amounts of 
stack.

In which case having a volatile integer variable with value 4, declaring a 
VLA whose size is that variable, and storing a pointer to that VLA in a 
variable, would be an alternative to alloca to force a frame pointer, but 
with deallocation happening when the scope ends rather than the function 
ending (and the syscall macro has its own scope, so using it inside a loop 
wouldn't be a problem).
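
The difference between the two lifetimes can be observed from plain C. This
is a minimal sketch, assuming a GCC or Clang toolchain on a system that
provides <alloca.h>; the function names are hypothetical and the code is not
taken from glibc. Each function reports how far apart the first and last
allocations made inside a loop ended up.

```c
#include <alloca.h>
#include <stddef.h>
#include <stdint.h>

/* alloca memory lives until the function returns, so every pass of the
   loop claims a fresh block and the stack keeps growing.  */
size_t
alloca_drift (int passes)
{
  volatile char *first = NULL;
  volatile char *last = NULL;

  for (int i = 0; i < passes; i++)
    {
      volatile char *p = alloca (4);
      p[0] = (char) i;		/* keep the allocation observable */
      if (first == NULL)
	first = p;
      last = p;
    }

  uintptr_t a = (uintptr_t) first;
  uintptr_t b = (uintptr_t) last;
  return a > b ? a - b : b - a;
}

/* A VLA is released when its enclosing block ends, so each pass can
   reuse the space of the previous one.  The volatile size keeps the
   compiler from turning the array into a fixed-size one.  */
size_t
vla_drift (int passes)
{
  volatile size_t size = 4;
  uintptr_t first = 0;
  uintptr_t last = 0;

  for (int i = 0; i < passes; i++)
    {
      char vla[size];
      volatile char *p = vla;
      p[0] = (char) i;		/* keep the array observable */
      if (first == 0)
	first = (uintptr_t) p;
      last = (uintptr_t) p;
    }

  return first > last ? first - last : last - first;
}
```

With 1000 passes, alloca_drift has to report at least 999 * 4 bytes because
all the blocks are live at once, while vla_drift is expected to stay at or
near zero on common compilers.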

-- 
Joseph S. Myers
joseph@codesourcery.com

* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-15 20:21             ` Joseph Myers
@ 2017-08-15 20:41               ` Aurelien Jarno
  2017-08-16 13:26               ` Maciej W. Rozycki
  1 sibling, 0 replies; 53+ messages in thread
From: Aurelien Jarno @ 2017-08-15 20:41 UTC (permalink / raw)
  To: Joseph Myers; +Cc: Adhemerval Zanella, libc-alpha

On 2017-08-15 20:21, Joseph Myers wrote:
> On Tue, 15 Aug 2017, Aurelien Jarno wrote:
> 
> > On 2017-08-15 19:51, Joseph Myers wrote:
> > > On Tue, 15 Aug 2017, Aurelien Jarno wrote:
> > > 
> > > > That's one way to do that, however it does seem correct to me that there
> > > > is no way to force the stack pointer to be valid in an asm code. The
> > > > stack pointer is used in other asm codes in the glibc, so we need a more
> > > > global solution.
> > > 
> > > What's the basis for saying the stack pointer is invalid (as opposed to 
> > > unwind information referring to the original stack pointer, so being 
> > > invalid at the point of the syscall, causing unwinding to crash)?  The 
> > 
> > 1) Looking at the assembly code, the value of the stack pointer around
> >    the syscall depends on the number of time the loop is executed.
> > 2) The crash happens when reaching the stack guard, with a very simple
> >    test case not using recursive functions.
> 
> What that says to me is that the alloca (to ensure frame pointer creation) 
> is fundamentally problematic if the syscall macro can be used many times 
> in a loop within a function, because it will allocate unbounded amounts of 
> stack.

Oh, I hadn't thought about that, thanks for the explanation.

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net

* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-15 20:21             ` Joseph Myers
  2017-08-15 20:41               ` Aurelien Jarno
@ 2017-08-16 13:26               ` Maciej W. Rozycki
  2017-08-16 13:44                 ` Joseph Myers
  1 sibling, 1 reply; 53+ messages in thread
From: Maciej W. Rozycki @ 2017-08-16 13:26 UTC (permalink / raw)
  To: Joseph Myers; +Cc: Aurelien Jarno, Adhemerval Zanella, libc-alpha

On Tue, 15 Aug 2017, Joseph Myers wrote:

> In which case having a volatile integer variable with value 4, declaring a 
> VLA whose size is that variable, and storing a pointer to that VLA in a 
> variable, would be an alternative to alloca to force a frame pointer, but 
> with deallocation happening when the scope ends rather than the function 
> ending (and the syscall macro has its own scope, so using it inside a loop 
> wouldn't be a problem).

 I suspect using volatile variables will cause unnecessary memory traffic.  
Passing the size specifier through an empty `asm' might give better code; 
also I think we can use 0 as the size requested, not to decrease the stack 
pointer unnecessarily, e.g.:

  {
    size_t s = 0;

    asm ("" : "+r" (s));
    {
      char vla[s << 3];

      asm ("" : : "p" (vla));
      /* ... */

This seems to produce reasonable code with GCC 8, taking the necessity to 
align stack as per the ABI requirement into account already, and wasting 
two instructions only in addition to the $sp adjustment itself:

	move	$4,$0
	sll	$4,$4,3
# ...
	subu	$sp,$sp,$4
	
 Also I wonder if there's actually a dependable way to have GCC itself 
allocate the argument space we require.  For example if we set `s' to 1 
above instead for `internal_syscall6', then would `0($sp)' and `4($sp)' be 
valid to place arguments #5 and #6 at respectively without the subsequent 
$sp adjustment we currently have in the syscall `asm' or would it be UB?
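
For reference, the empty-asm fragment above can be completed into a function
that compiles on its own. This is a sketch under stated assumptions, not the
proposed glibc change: the function name is hypothetical, the size is 1
rather than 0 so that the VLA length stays positive as ISO C requires, and
the "r" constraint replaces "p" on the assumption that it is accepted more
widely. GCC-style extended asm is required.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the block-scoped VLA sits at the same address on every
   pass, i.e. its space is released at the end of each pass.  */
int
scoped_vla_is_stable (int passes)
{
  uintptr_t first = 0;
  int stable = 1;

  for (int i = 0; i < passes; i++)
    {
      size_t s = 1;

      /* Read-write operand on an empty asm: the compiler must assume
	 `s' may have changed, so the array size is not a known constant.  */
      __asm__ ("" : "+r" (s));
      {
	char vla[s << 3];

	/* Hand the address to an empty asm so the array counts as used.  */
	__asm__ volatile ("" : : "r" (vla));

	if (i == 0)
	  first = (uintptr_t) vla;
	else if ((uintptr_t) vla != first)
	  stable = 0;
      }
    }

  return stable;
}
```

A result of 1 is what one would expect from GCC and Clang, which save and
restore the stack pointer around a block that declares a VLA; the C standard
itself does not promise identical addresses.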

  Maciej

* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-16 13:26               ` Maciej W. Rozycki
@ 2017-08-16 13:44                 ` Joseph Myers
  2017-08-16 14:13                   ` Adhemerval Zanella
  2017-08-16 14:32                   ` Maciej W. Rozycki
  0 siblings, 2 replies; 53+ messages in thread
From: Joseph Myers @ 2017-08-16 13:44 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Aurelien Jarno, Adhemerval Zanella, libc-alpha

On Wed, 16 Aug 2017, Maciej W. Rozycki wrote:

> On Tue, 15 Aug 2017, Joseph Myers wrote:
> 
> > In which case having a volatile integer variable with value 4, declaring a 
> > VLA whose size is that variable, and storing a pointer to that VLA in a 
> > variable, would be an alternative to alloca to force a frame pointer, but 
> > with deallocation happening when the scope ends rather than the function 
> > ending (and the syscall macro has its own scope, so using it inside a loop 
> > wouldn't be a problem).
> 
>  I suspect using volatile variables will cause unnecessary memory traffic.  
> Passing the size specifier through an empty `asm' might give better code; 
> also I think we can use 0 as the size requested, not to decrease the stack 
> pointer unnecessarily, e.g.:

Sure, as long as (a) the compiler can't know the size is actually constant 
and (b) it can't know the VLA isn't actually used, as if it can tell 
either of those things it can optimize away the variable stack allocation.

>  Also I wonder if there's actually a dependable way to have GCC itself 
> allocate the argument space we require.  For example if we set `s' to 1 
> above instead for `internal_syscall6', then would `0($sp)' and `4($sp)' be 
> valid to place arguments #5 and #6 at respectively without the subsequent 
> $sp adjustment we currently have in the syscall `asm' or would it be UB?

You can't tell whether the compiler might have allocated other variables 
on the stack after the dynamic adjustment - that is, whether any 
particular offset from sp is in fact unused or not.

-- 
Joseph S. Myers
joseph@codesourcery.com

* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-16 13:44                 ` Joseph Myers
@ 2017-08-16 14:13                   ` Adhemerval Zanella
  2017-08-16 14:47                     ` Maciej W. Rozycki
                                       ` (2 more replies)
  2017-08-16 14:32                   ` Maciej W. Rozycki
  1 sibling, 3 replies; 53+ messages in thread
From: Adhemerval Zanella @ 2017-08-16 14:13 UTC (permalink / raw)
  To: Joseph Myers, Maciej W. Rozycki; +Cc: Aurelien Jarno, libc-alpha



On 16/08/2017 10:44, Joseph Myers wrote:
> On Wed, 16 Aug 2017, Maciej W. Rozycki wrote:
> 
>> On Tue, 15 Aug 2017, Joseph Myers wrote:
>>
>>> In which case having a volatile integer variable with value 4, declaring a 
>>> VLA whose size is that variable, and storing a pointer to that VLA in a 
>>> variable, would be an alternative to alloca to force a frame pointer, but 
>>> with deallocation happening when the scope ends rather than the function 
>>> ending (and the syscall macro has its own scope, so using it inside a loop 
>>> wouldn't be a problem).
>>
>>  I suspect using volatile variables will cause unnecessary memory traffic.  
>> Passing the size specifier through an empty `asm' might give better code; 
>> also I think we can use 0 as the size requested, not to decrease the stack 
>> pointer unnecessarily, e.g.:
> 
> Sure, as long as (a) the compiler can't know the size is actually constant 
> and (b) it can't know the VLA isn't actually used, as if it can tell 
> either of those things it can optimize away the variable stack allocation.
> 
>>  Also I wonder if there's actually a dependable way to have GCC itself 
>> allocate the argument space we require.  For example if we set `s' to 1 
>> above instead for `internal_syscall6', then would `0($sp)' and `4($sp)' be 
>> valid to place arguments #5 and #6 at respectively without the subsequent 
>> $sp adjustment we currently have in the syscall `asm' or would it be UB?
> 
> You can't tell whether the compiler might have allocated other variables 
> on the stack after the dynamic adjustment - that is, whether any 
> particular offset from sp is in fact unused or not.
> 

What about the below? I could use some help checking that I am handling all
the ABI requirements for __libc_do_syscall, but on a QEMU-emulated
system I see no regressions on basic tests (including some cancellation ones
from glibc, to check that the syscall is correctly unwound), and tst-rwlock15
no longer fails.
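
The replacement macros in the patch below recover the error flag from the
helper's return value with an unsigned comparison against -4096UL. A minimal
sketch of that convention, assuming the helper returns a negated errno in
the range [-4095, -1] on failure; the two function names are hypothetical
and exist only for this illustration.

```c
/* Nonzero if `raw' lies in [-4095, -1], the range reserved for negated
   errno values.  Converting to unsigned long maps that range onto the
   top 4095 values, so one comparison covers it.  */
int
syscall_failed (long raw)
{
  return (unsigned long) raw > -4096UL;
}

/* The positive errno on failure, the unmodified result otherwise.  */
long
syscall_value (long raw)
{
  return syscall_failed (raw) ? -raw : raw;
}
```

For example, a raw value of -2 is reported as a failure with value 2, while
-4096 and any non-negative value pass through unchanged as successes.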


diff --git a/sysdeps/unix/sysv/linux/mips/mips32/Makefile b/sysdeps/unix/sysv/linux/mips/mips32/Makefile
index 33b4615..cbdf032 100644
--- a/sysdeps/unix/sysv/linux/mips/mips32/Makefile
+++ b/sysdeps/unix/sysv/linux/mips/mips32/Makefile
@@ -1,8 +1,26 @@
+ifeq ($(subdir),elf)
+sysdep-dl-routines += libc-do-syscall
+endif
+
 ifeq ($(subdir),conform)
 # For bugs 17786 and 21278.
 conformtest-xfail-conds += mips-o32-linux
 endif
 
+ifeq ($(subdir),io)
+sysdep_routines += libc-do-syscall
+endif
+
+ifeq ($(subdir),nptl)
+libpthread-sysdep_routines += libc-do-syscall
+libpthread-shared-only-routines += libc-do-syscall
+endif
+
+ifeq ($(subdir),rt)
+librt-sysdep_routines += libc-do-syscall
+librt-shared-only-routines += libc-do-syscall
+endif
+
 ifeq ($(subdir),stdlib)
 tests += bug-getcontext-mips-gp
 endif
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/libc-do-syscall.S b/sysdeps/unix/sysv/linux/mips/mips32/libc-do-syscall.S
new file mode 100644
index 0000000..a7184d9
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/mips/mips32/libc-do-syscall.S
@@ -0,0 +1,54 @@
+/* Copyright (C) 2017 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sys/asm.h>
+#include <sysdep.h>
+#include <asm/unistd.h>
+#include <sgidefs.h>
+
+
+/* long int __libc_do_syscall (long int, ...)  */
+
+#define FRAMESZ 32
+
+        .text
+        .set    nomips16
+	.hidden __libc_do_syscall
+ENTRY(__libc_do_syscall)
+        move    $2, $4
+        move    $4, $5
+        move    $5, $6
+        move    $6, $7
+        lw      $7, 16(sp)
+        lw      $8, 20(sp)
+        lw      $9, 24(sp)
+        lw      $10,28(sp)
+	.set 	noreorder
+	PTR_SUBU sp, FRAMESZ
+	cfi_adjust_cfa_offset (FRAMESZ)
+        sw      $8, 16(sp)
+        sw      $9, 20(sp)
+        sw      $10,24(sp)
+        syscall
+	PTR_ADDU sp, FRAMESZ
+	cfi_adjust_cfa_offset (-FRAMESZ)
+	.set	reorder
+        beq     $7, $0, 1f
+        subu    $2, $0, $2
+1:      jr      ra
+        nop
+END (__libc_do_syscall)
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h b/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
index e9e3ee7..3a8920a 100644
--- a/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
+++ b/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
@@ -121,13 +121,13 @@
 # define INTERNAL_SYSCALL_MIPS16(number, err, nr, args...)		\
 	internal_syscall##nr ("lw\t%0, %2\n\t",				\
 			      "R" (number),				\
-			      0, err, args)
+			      number, err, args)
 
 #else /* !__mips16 */
 # define INTERNAL_SYSCALL(name, err, nr, args...)			\
 	internal_syscall##nr ("li\t%0, %2\t\t\t# " #name "\n\t",	\
 			      "IK" (SYS_ify (name)),			\
-			      0, err, args)
+			      SYS_ify(name), err, args)
 
 # define INTERNAL_SYSCALL_NCS(number, err, nr, args...)			\
 	internal_syscall##nr (MOVE32 "\t%0, %2\n\t",			\
@@ -136,6 +136,7 @@
 
 #endif /* !__mips16 */
 
+
 #define internal_syscall0(v0_init, input, number, err, dummy...)	\
 ({									\
 	long _sys_result;						\
@@ -262,109 +263,41 @@
 	_sys_result;							\
 })
 
-/* We need to use a frame pointer for the functions in which we
-   adjust $sp around the syscall, or debug information and unwind
-   information will be $sp relative and thus wrong during the syscall.  As
-   of GCC 4.7, this is sufficient.  */
-#define FORCE_FRAME_POINTER						\
-  void *volatile __fp_force __attribute__ ((unused)) = alloca (4)
+long int __libc_do_syscall (long int, ...) attribute_hidden;
 
 #define internal_syscall5(v0_init, input, number, err,			\
 			  arg1, arg2, arg3, arg4, arg5)			\
 ({									\
-	long _sys_result;						\
-									\
-	FORCE_FRAME_POINTER;						\
-	{								\
-	register long __s0 asm ("$16") __attribute__ ((unused))		\
-	  = (number);							\
-	register long __v0 asm ("$2");					\
-	register long __a0 asm ("$4") = (long) (arg1);			\
-	register long __a1 asm ("$5") = (long) (arg2);			\
-	register long __a2 asm ("$6") = (long) (arg3);			\
-	register long __a3 asm ("$7") = (long) (arg4);			\
-	__asm__ volatile (						\
-	".set\tnoreorder\n\t"						\
-	"subu\t$29, 32\n\t"						\
-	"sw\t%6, 16($29)\n\t"						\
-	v0_init								\
-	"syscall\n\t"							\
-	"addiu\t$29, 32\n\t"						\
-	".set\treorder"							\
-	: "=r" (__v0), "+r" (__a3)					\
-	: input, "r" (__a0), "r" (__a1), "r" (__a2),			\
-	  "r" ((long) (arg5))						\
-	: __SYSCALL_CLOBBERS);						\
-	err = __a3;							\
-	_sys_result = __v0;						\
-	}								\
+	long int _sys_result;						\
+	_sys_result = __libc_do_syscall (number, arg1, arg2, arg3,	\
+					 arg4, arg5);			\
+	err = _sys_result > -4096UL ? 1 : 0;				\
+	if (err)							\
+	  _sys_result = -_sys_result;					\
 	_sys_result;							\
 })
 
 #define internal_syscall6(v0_init, input, number, err,			\
 			  arg1, arg2, arg3, arg4, arg5, arg6)		\
 ({									\
-	long _sys_result;						\
-									\
-	FORCE_FRAME_POINTER;						\
-	{								\
-	register long __s0 asm ("$16") __attribute__ ((unused))		\
-	  = (number);							\
-	register long __v0 asm ("$2");					\
-	register long __a0 asm ("$4") = (long) (arg1);			\
-	register long __a1 asm ("$5") = (long) (arg2);			\
-	register long __a2 asm ("$6") = (long) (arg3);			\
-	register long __a3 asm ("$7") = (long) (arg4);			\
-	__asm__ volatile (						\
-	".set\tnoreorder\n\t"						\
-	"subu\t$29, 32\n\t"						\
-	"sw\t%6, 16($29)\n\t"						\
-	"sw\t%7, 20($29)\n\t"						\
-	v0_init								\
-	"syscall\n\t"							\
-	"addiu\t$29, 32\n\t"						\
-	".set\treorder"							\
-	: "=r" (__v0), "+r" (__a3)					\
-	: input, "r" (__a0), "r" (__a1), "r" (__a2),			\
-	  "r" ((long) (arg5)), "r" ((long) (arg6))			\
-	: __SYSCALL_CLOBBERS);						\
-	err = __a3;							\
-	_sys_result = __v0;						\
-	}								\
+	long int _sys_result;						\
+	_sys_result = __libc_do_syscall (number, arg1, arg2, arg3,	\
+					 arg4, arg5, arg6);		\
+	err = _sys_result > -4096UL ? 1 : 0;				\
+	if (err)							\
+	  _sys_result = -_sys_result;					\
 	_sys_result;							\
 })
 
 #define internal_syscall7(v0_init, input, number, err,			\
 			  arg1, arg2, arg3, arg4, arg5, arg6, arg7)	\
 ({									\
-	long _sys_result;						\
-									\
-	FORCE_FRAME_POINTER;						\
-	{								\
-	register long __s0 asm ("$16") __attribute__ ((unused))		\
-	  = (number);							\
-	register long __v0 asm ("$2");					\
-	register long __a0 asm ("$4") = (long) (arg1);			\
-	register long __a1 asm ("$5") = (long) (arg2);			\
-	register long __a2 asm ("$6") = (long) (arg3);			\
-	register long __a3 asm ("$7") = (long) (arg4);			\
-	__asm__ volatile (						\
-	".set\tnoreorder\n\t"						\
-	"subu\t$29, 32\n\t"						\
-	"sw\t%6, 16($29)\n\t"						\
-	"sw\t%7, 20($29)\n\t"						\
-	"sw\t%8, 24($29)\n\t"						\
-	v0_init								\
-	"syscall\n\t"							\
-	"addiu\t$29, 32\n\t"						\
-	".set\treorder"							\
-	: "=r" (__v0), "+r" (__a3)					\
-	: input, "r" (__a0), "r" (__a1), "r" (__a2),			\
-	  "r" ((long) (arg5)), "r" ((long) (arg6)), "r" ((long) (arg7))	\
-	: __SYSCALL_CLOBBERS);						\
-	err = __a3;							\
-	_sys_result = __v0;						\
-	}								\
+	long int _sys_result;						\
+	_sys_result = __libc_do_syscall (number, arg1, arg2, arg3,	\
+					 arg4, arg5, arg6, arg7);	\
+	err = _sys_result > -4096UL ? 1 : 0;				\
+	if (err)							\
+	  _sys_result = -_sys_result;					\
 	_sys_result;							\
 })
 

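The error check used by the new internal_syscall5/6/7 macros above can be
looked at in isolation.  The sketch below is only an illustration:
`decode_result' is a hypothetical name that does not exist in glibc, and it
assumes, as the macros do, that the helper returns a value in the range
[-4095, -1] on failure.

```c
/* Hypothetical helper, not part of glibc: mirrors the check in the
   macros above.  RAW is what the syscall helper returned.  */
static long int
decode_result (long int raw, long int *err)
{
  /* Converting RAW to unsigned long maps -4095..-1 to the top 4095
     values of the range, which are the only ones above -4096UL.  */
  *err = (unsigned long int) raw > -4096UL ? 1 : 0;

  /* On error the caller gets the positive errno value.  */
  return *err ? -raw : raw;
}
```

With this, a raw value of -2 yields 2 with the error flag set, while -4096 and
any non-negative value are passed through unchanged with the flag clear.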
^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-16 13:44                 ` Joseph Myers
  2017-08-16 14:13                   ` Adhemerval Zanella
@ 2017-08-16 14:32                   ` Maciej W. Rozycki
  2017-08-16 14:47                     ` Joseph Myers
  1 sibling, 1 reply; 53+ messages in thread
From: Maciej W. Rozycki @ 2017-08-16 14:32 UTC (permalink / raw)
  To: Joseph Myers; +Cc: Aurelien Jarno, Adhemerval Zanella, libc-alpha

On Wed, 16 Aug 2017, Joseph Myers wrote:

> >  I suspect using volatile variables will cause unnecessary memory traffic.  
> > Passing the size specifier through an empty `asm' might give better code; 
> > also I think we can use 0 as the size requested, not to decrease the stack 
> > pointer unnecessarily, e.g.:
> 
> Sure, as long as (a) the compiler can't know the size is actually constant 
> and (b) it can't know the VLA isn't actually used, as if it can tell 
> either of those things it can optimize away the variable stack allocation.

 Well, an `asm' is a black box, unless it is known -- under the conditions 
set out in GCC documentation -- to be safe to optimise away regardless.  
Neither `asm' I proposed matches the conditions.
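
 For reference, the two empty `asm' statements can be tried out standalone 
along these lines.  This is only a sketch: `opaque_zero' and `vla_demo' are 
made-up names, the "r" constraint is used for the array address to keep it 
portable, and the `+ 1' is added here only so that the array has an element 
that can be written to.

```c
#include <stddef.h>

/* Hypothetical helper: returns 0, but passes the value through an
   empty asm so that the compiler cannot treat it as a constant.  */
static inline size_t
opaque_zero (void)
{
  size_t s = 0;
  __asm__ ("" : "+r" (s));
  return s;
}

/* Hypothetical demo: a VLA whose size the compiler cannot prove to be
   constant, and whose address is marked as used by a second asm.  */
static int
vla_demo (void)
{
  size_t s = opaque_zero ();
  int ok;
  {
    char vla[(s << 3) + 1];	/* "+ 1" is an addition for this demo.  */
    __asm__ ("" : : "r" (vla));
    vla[0] = 1;
    ok = vla[0];
  }				/* The VLA is deallocated here.  */
  return ok;
}
```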

> >  Also I wonder if there's actually a dependable way to have GCC itself 
> > allocate the argument space we require.  For example if we set `s' to 1 
> > above instead for `internal_syscall6', then would `0($sp)' and `4($sp)' be 
> > valid to place arguments #5 and #6 at respectively without the subsequent 
> > $sp adjustment we currently have in the syscall `asm' or would it be UB?
> 
> You can't tell whether the compiler might have allocated other variables 
> on the stack after the dynamic adjustment - that is, whether any 
> particular offset from sp is in fact unused or not.

 Hmm, taking the requirement to deallocate the space arranged for a VLA at 
the exit of the containing block into account I think we can eliminate the 
possibility for the compiler to have allocated space for other variables 
as long as no other variable has been declared within the same block (or 
any nested one).

 Or am I missing anything here?  E.g. is the compiler allowed to spill 
random data to the stack at any time even in the absence of a matching 
variable declaration?  Or is it allowed to allocate space for a non-VLA 
variable that has been declared outside the block concerned, but the 
lifespan of which is contained within the block in this stack space rather 
than in the local frame?

 If the answer to any of these questions is "yes", then would factoring 
out the syscall `asm' along with the associated VLA declaration to a 
helper `always_inline' function help or would it not?

 I mean it is a tiny optimisation, but some syscalls are frequently 
called, so if we can avoid a waste of resources, then why not?

  Maciej

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-16 14:32                   ` Maciej W. Rozycki
@ 2017-08-16 14:47                     ` Joseph Myers
  2017-08-17 16:17                       ` Maciej W. Rozycki
  0 siblings, 1 reply; 53+ messages in thread
From: Joseph Myers @ 2017-08-16 14:47 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Aurelien Jarno, Adhemerval Zanella, libc-alpha

On Wed, 16 Aug 2017, Maciej W. Rozycki wrote:

>  Or am I missing anything here?  E.g. is the compiler allowed to spill 
> random data to the stack at any time even in the absence of a matching 
> variable declaration?  Or is it allowed to allocate space for a non-VLA 
> variable that has been declared outside the block concerned, but the 
> lifespan of which is contained within the block in this stack space rather 
> than in the local frame?

Yes, it's allowed to do both of those.

>  If the answer to any of these questions is "yes", then would factoring 
> out the syscall `asm' along with the associated VLA declaration to a 
> helper `always_inline' function help or would it not?

I don't think that would help.  An asm can never make assumptions about 
which parts of the stack are used for what, only use its operands.

>  I mean it is a tiny optimisation, but some syscalls are frequently 
> called, so if we can avoid a waste of resources, then why not?

Are any 5/6/7-argument syscalls frequently called?

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-16 14:13                   ` Adhemerval Zanella
@ 2017-08-16 14:47                     ` Maciej W. Rozycki
  2017-08-16 14:54                       ` Adhemerval Zanella
  2017-08-16 15:18                     ` Aurelien Jarno
  2017-08-16 21:15                     ` Aurelien Jarno
  2 siblings, 1 reply; 53+ messages in thread
From: Maciej W. Rozycki @ 2017-08-16 14:47 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: Joseph Myers, Aurelien Jarno, libc-alpha

On Wed, 16 Aug 2017, Adhemerval Zanella wrote:

> +ENTRY(__libc_do_syscall)
> +        move    $2, $4
> +        move    $4, $5
> +        move    $5, $6
> +        move    $6, $7

 I'm not very keen on having a nested syscall function call, but if you do 
that, then please at least arrange the wrapper's arguments such that you 
don't have to shuffle them, i.e. I suggest placing the syscall number 
last.

 For historical reasons you may want to initialise $2 right before the 
SYSCALL instruction, although I take it we no longer support Linux 
kernels old enough to require it for the syscall restart convention (so it 
would mainly serve as a reference for those who need to write their own 
code supporting those old kernels, as people often blindly copy & paste 
existing pieces).

 Also the MIPS16 wrappers may require adjustment then in order not to 
execute a doubly nested function call unnecessarily, i.e. call 
`__libc_do_syscall' directly rather than through another wrapper.

  Maciej

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-16 14:47                     ` Maciej W. Rozycki
@ 2017-08-16 14:54                       ` Adhemerval Zanella
  2017-08-16 16:12                         ` Aurelien Jarno
  2017-08-16 21:08                         ` Aurelien Jarno
  0 siblings, 2 replies; 53+ messages in thread
From: Adhemerval Zanella @ 2017-08-16 14:54 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Joseph Myers, Aurelien Jarno, libc-alpha



On 16/08/2017 11:46, Maciej W. Rozycki wrote:
> On Wed, 16 Aug 2017, Adhemerval Zanella wrote:
> 
>> +ENTRY(__libc_do_syscall)
>> +        move    $2, $4
>> +        move    $4, $5
>> +        move    $5, $6
>> +        move    $6, $7
> 
>  I'm not very keen on having a nested syscall function call, but if you do 
> that, then please at least arrange the wrapper's arguments such that you 
> don't have to shuffle them, i.e. I suggest placing the syscall number 
> last.

I aimed for simplicity here, since avoiding the shuffle would require three
specialized wrappers, one for each syscall convention (5/6/7).  I can do it,
but I still prefer to have only one entry point, since I think the possible
performance gains are not worth the extra maintenance burden.

> 
>  For historical reasons you may want to initialise $2 right before the 
> SYSCALL instruction, although I take it we don't anymore support Linux 
> kernels old enough to require it for the syscall restart convention (so it 
> would mainly serve as a reference for those who need to write their own 
> code supporting those old kernels, as people often blindly copy & paste 
> existing pieces).

Do you know from which kernel version on this is no longer required?
I actually tested on a 3.2 kernel on qemu (as it is the minimum one supported
currently).

> 
>  Also the MIPS16 wrappers may require adjustment then in order not to 
> execute a doubly nested function call unnecessarily, i.e. call 
> `__libc_do_syscall' directly rather than through another wrapper.

I did not actually test MIPS16, nor build for it.  I would appreciate
any help here, since my MIPS ABI knowledge is limited.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-16 14:13                   ` Adhemerval Zanella
  2017-08-16 14:47                     ` Maciej W. Rozycki
@ 2017-08-16 15:18                     ` Aurelien Jarno
  2017-08-16 21:15                     ` Aurelien Jarno
  2 siblings, 0 replies; 53+ messages in thread
From: Aurelien Jarno @ 2017-08-16 15:18 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: Joseph Myers, Maciej W. Rozycki, libc-alpha

On 2017-08-16 11:13, Adhemerval Zanella wrote:
> 
> 
> On 16/08/2017 10:44, Joseph Myers wrote:
> > On Wed, 16 Aug 2017, Maciej W. Rozycki wrote:
> > 
> >> On Tue, 15 Aug 2017, Joseph Myers wrote:
> >>
> >>> In which case having a volatile integer variable with value 4, declaring a 
> >>> VLA whose size is that variable, and storing a pointer to that VLA in a 
> >>> variable, would be an alternative to alloca to force a frame pointer, but 
> >>> with deallocation happening when the scope ends rather than the function 
> >>> ending (and the syscall macro has its own scope, so using it inside a loop 
> >>> wouldn't be a problem).
> >>
> >>  I suspect using volatile variables will cause unnecessary memory traffic.  
> >> Passing the size specifier through an empty `asm' might give better code; 
> >> also I think we can use 0 as the size requested, not to decrease the stack 
> >> pointer unnecessarily, e.g.:
> > 
> > Sure, as long as (a) the compiler can't know the size is actually constant 
> > and (b) it can't know the VLA isn't actually used, as if it can tell 
> > either of those things it can optimize away the variable stack allocation.
> > 
> >>  Also I wonder if there's actually a dependable way to have GCC itself 
> >> allocate the argument space we require.  For example if we set `s' to 1 
> >> above instead for `internal_syscall6', then would `0($sp)' and `4($sp)' be 
> >> valid to place arguments #5 and #6 at respectively without the subsequent 
> >> $sp adjustment we currently have in the syscall `asm' or would it be UB?
> > 
> > You can't tell whether the compiler might have allocated other variables 
> > on the stack after the dynamic adjustment - that is, whether any 
> > particular offset from sp is in fact unused or not.
> > 
> 
> What about the below? I can use some help to see if I am handling all the
> required ABI requirements for the __libc_do_syscall, but on an qemu emulated
> system I see no regression on basic tests (including some cancellation one
> from glibc to see the syscall is correctly unwinded) and tst-rwlock15 also
> does not fail anymore.

Thanks for this patch, I'll give it a try. I have been working on
something similar, however I only routed the syscalls with 5, 6 or 7
arguments to __libc_do_syscall. That way there is no performance
penalty for the syscalls with fewer arguments, which are the most used
ones.

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-16 14:54                       ` Adhemerval Zanella
@ 2017-08-16 16:12                         ` Aurelien Jarno
  2017-08-16 21:08                         ` Aurelien Jarno
  1 sibling, 0 replies; 53+ messages in thread
From: Aurelien Jarno @ 2017-08-16 16:12 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: Maciej W. Rozycki, Joseph Myers, libc-alpha

On 2017-08-16 11:54, Adhemerval Zanella wrote:
> 
> 
> On 16/08/2017 11:46, Maciej W. Rozycki wrote:
> > On Wed, 16 Aug 2017, Adhemerval Zanella wrote:
> > 
> >> +ENTRY(__libc_do_syscall)
> >> +        move    $2, $4
> >> +        move    $4, $5
> >> +        move    $5, $6
> >> +        move    $6, $7
> > 
> >  I'm not very keen on having a nested syscall function call, but if you do 
> > that, then please at least arrange the wrapper's arguments such that you 
> > don't have to shuffle them, i.e. I suggest placing the syscall number 
> > last.
> 
> I aimed for simplicity here since to avoid shuffle it would require three
> specialized wrapper, one for each syscall convention (5/6/7).  I can do it,
> but I still prefer to have only one entry point, since I think the possible
> performance gains are not worth the extra maintenance burden.

Thinking about that, if the __libc_do_syscall routine is only used for
syscalls with 5/6/7 arguments, the syscall number can be passed as the
5th argument (the first on the stack), between arguments 4 and 5. That
way arguments 1 to 4 are already in the right registers and the others
need to be copied anyway.
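
The argument order described above can be illustrated in plain C. This is
only a sketch under assumptions: `do_syscall_stub' is a hypothetical
stand-in that records its arguments instead of entering the kernel, and
`stub_syscall5' is a made-up macro name. On o32 the first four integer
arguments are passed in $4-$7, so with this prototype they would need no
shuffling.

```c
/* Hypothetical record of the last call, for illustration only.  */
struct call_record
{
  long int number;
  long int args[7];
};

static struct call_record last_call;

/* Hypothetical stand-in for the assembly routine: arguments 1 to 4
   come first, then the syscall number, then arguments 5 to 7.  */
static long int
do_syscall_stub (long int a1, long int a2, long int a3, long int a4,
		 long int number, long int a5, long int a6, long int a7)
{
  last_call.number = number;
  last_call.args[0] = a1;
  last_call.args[1] = a2;
  last_call.args[2] = a3;
  last_call.args[3] = a4;
  last_call.args[4] = a5;
  last_call.args[5] = a6;
  last_call.args[6] = a7;
  return number;
}

/* The caller keeps the usual "number first" interface; the macro
   moves the number behind argument 4 and pads unused arguments.  */
#define stub_syscall5(number, a1, a2, a3, a4, a5)			\
  do_syscall_stub ((long int) (a1), (long int) (a2), (long int) (a3),	\
		   (long int) (a4), (long int) (number),		\
		   (long int) (a5), 0, 0)
```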

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-16 14:54                       ` Adhemerval Zanella
  2017-08-16 16:12                         ` Aurelien Jarno
@ 2017-08-16 21:08                         ` Aurelien Jarno
  2017-08-16 22:11                           ` Maciej W. Rozycki
  1 sibling, 1 reply; 53+ messages in thread
From: Aurelien Jarno @ 2017-08-16 21:08 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: Maciej W. Rozycki, Joseph Myers, libc-alpha

On 2017-08-16 11:54, Adhemerval Zanella wrote:
> Do you know which is the kernel version which this was not really required?
> I actually tested on a 3.2 on qemu (as it is the minimum one supported
> currently).

According to https://www.linux-mips.org/wiki/Syscall it's required up to
kernel 2.6.35.

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-16 14:13                   ` Adhemerval Zanella
  2017-08-16 14:47                     ` Maciej W. Rozycki
  2017-08-16 15:18                     ` Aurelien Jarno
@ 2017-08-16 21:15                     ` Aurelien Jarno
  2017-08-17 13:33                       ` Adhemerval Zanella
  2 siblings, 1 reply; 53+ messages in thread
From: Aurelien Jarno @ 2017-08-16 21:15 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: Joseph Myers, Maciej W. Rozycki, libc-alpha

On 2017-08-16 11:13, Adhemerval Zanella wrote:
> 
> 
> On 16/08/2017 10:44, Joseph Myers wrote:
> > On Wed, 16 Aug 2017, Maciej W. Rozycki wrote:
> > 
> >> On Tue, 15 Aug 2017, Joseph Myers wrote:
> >>
> >>> In which case having a volatile integer variable with value 4, declaring a 
> >>> VLA whose size is that variable, and storing a pointer to that VLA in a 
> >>> variable, would be an alternative to alloca to force a frame pointer, but 
> >>> with deallocation happening when the scope ends rather than the function 
> >>> ending (and the syscall macro has its own scope, so using it inside a loop 
> >>> wouldn't be a problem).
> >>
> >>  I suspect using volatile variables will cause unnecessary memory traffic.  
> >> Passing the size specifier through an empty `asm' might give better code; 
> >> also I think we can use 0 as the size requested, not to decrease the stack 
> >> pointer unnecessarily, e.g.:
> > 
> > Sure, as long as (a) the compiler can't know the size is actually constant 
> > and (b) it can't know the VLA isn't actually used, as if it can tell 
> > either of those things it can optimize away the variable stack allocation.
> > 
> >>  Also I wonder if there's actually a dependable way to have GCC itself 
> >> allocate the argument space we require.  For example if we set `s' to 1 
> >> above instead for `internal_syscall6', then would `0($sp)' and `4($sp)' be 
> >> valid to place arguments #5 and #6 at respectively without the subsequent 
> >> $sp adjustment we currently have in the syscall `asm' or would it be UB?
> > 
> > You can't tell whether the compiler might have allocated other variables 
> > on the stack after the dynamic adjustment - that is, whether any 
> > particular offset from sp is in fact unused or not.
> > 
> 
> What about the below? I can use some help to see if I am handling all the
> required ABI requirements for the __libc_do_syscall, but on an qemu emulated

Do we actually have to follow the ABI requirements if we control both
the caller of __libc_do_syscall and the function itself? The i386 and
arm versions seem to pass as much as possible in the right registers and
the other values another way.

For MIPS, it means we can pass v0, a0-a3 in the correct registers and
use __libc_do_syscall to just set up the values on the stack. Something
like this, for example:

ENTRY(__libc_do_syscall)
       PTR_SUBU sp, 32
       cfi_adjust_cfa_offset(32)

       .set noreorder
       REG_S s2, 16(sp)
       REG_S s3, 20(sp)
       REG_S s4, 24(sp)
       syscall
       .set reorder

       PTR_SUBU sp, -32
       cfi_adjust_cfa_offset(-32)
       ret
END (__libc_do_syscall)


On the caller side the 5th and following arguments should be passed in
s2, s3, s4. s1 can be used to save ra around the subroutine call.

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-16 21:08                         ` Aurelien Jarno
@ 2017-08-16 22:11                           ` Maciej W. Rozycki
  0 siblings, 0 replies; 53+ messages in thread
From: Maciej W. Rozycki @ 2017-08-16 22:11 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: Adhemerval Zanella, Joseph Myers, libc-alpha

On Wed, 16 Aug 2017, Aurelien Jarno wrote:

> > Do you know which is the kernel version which this was not really required?
> > I actually tested on a 3.2 on qemu (as it is the minimum one supported
> > currently).
> 
> According to https://www.linux-mips.org/wiki/Syscall it's required up to
> kernel 2.6.35.

 Well, there is this very comment in the source file concerned:

   The convention was relaxed in Linux with a change applied to the kernel
   GIT repository as commit 96187fb0bc30cd7919759d371d810e928048249d, that
   first appeared in the 2.6.36 release.  Since then the kernel has had
   code that reloads $v0 upon syscall restart and resumes right at the
   SYSCALL instruction, so no special arrangement is needed anymore.

(which is "MIPS: Sanitize restart logics", dated Sep 28, 2010).

  Maciej

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-16 21:15                     ` Aurelien Jarno
@ 2017-08-17 13:33                       ` Adhemerval Zanella
  0 siblings, 0 replies; 53+ messages in thread
From: Adhemerval Zanella @ 2017-08-17 13:33 UTC (permalink / raw)
  To: Joseph Myers, Maciej W. Rozycki, libc-alpha



On 16/08/2017 18:15, Aurelien Jarno wrote:
> On 2017-08-16 11:13, Adhemerval Zanella wrote:
>>
>>
>> On 16/08/2017 10:44, Joseph Myers wrote:
>>> On Wed, 16 Aug 2017, Maciej W. Rozycki wrote:
>>>
>>>> On Tue, 15 Aug 2017, Joseph Myers wrote:
>>>>
>>>>> In which case having a volatile integer variable with value 4, declaring a 
>>>>> VLA whose size is that variable, and storing a pointer to that VLA in a 
>>>>> variable, would be an alternative to alloca to force a frame pointer, but 
>>>>> with deallocation happening when the scope ends rather than the function 
>>>>> ending (and the syscall macro has its own scope, so using it inside a loop 
>>>>> wouldn't be a problem).
>>>>
>>>>  I suspect using volatile variables will cause unnecessary memory traffic.  
>>>> Passing the size specifier through an empty `asm' might give better code; 
>>>> also I think we can use 0 as the size requested, not to decrease the stack 
>>>> pointer unnecessarily, e.g.:
>>>
>>> Sure, as long as (a) the compiler can't know the size is actually constant 
>>> and (b) it can't know the VLA isn't actually used, as if it can tell 
>>> either of those things it can optimize away the variable stack allocation.
>>>
>>>>  Also I wonder if there's actually a dependable way to have GCC itself 
>>>> allocate the argument space we require.  For example if we set `s' to 1 
>>>> above instead for `internal_syscall6', then would `0($sp)' and `4($sp)' be 
>>>> valid to place arguments #5 and #6 at respectively without the subsequent 
>>>> $sp adjustment we currently have in the syscall `asm' or would it be UB?
>>>
>>> You can't tell whether the compiler might have allocated other variables 
>>> on the stack after the dynamic adjustment - that is, whether any 
>>> particular offset from sp is in fact unused or not.
>>>
>>
>> What about the below? I can use some help to see if I am handling all the
>> required ABI requirements for the __libc_do_syscall, but on an qemu emulated
> 
> Do we actually have to follow the ABI requirements if we control both
> the caller of __libc_do_syscall and the function itself? The i386 and
> arm version seem to pass as much as possible in the right registers and
> the other values and other way.
> 
> For MIPS, it means we can pass v0, a0-a3 in the correct registers and
> use __libc_do_syscall to just setup the values on the stack. Something
> like that for example:
> 

We do not really need to follow the ABI requirements; the only requirement is
to unwind the backtrace correctly for cancellation to work.  However, to allow
this optimization we would need to take care of a different calling convention
for internal symbols (I noted that for PIC code MIPS adds a GOT reference plus
a R_MIPS_JALR, which the linker might relax later).

I think we should aim for simplicity and use as much C support as we can,
and optimize this with more asm hackery only if we really need to squeeze
specific cycles out of the syscall (which I really think is overkill for
most if not all of them).

Currently with this patch __libc_do_syscall is called on pread, pwrite,
lseek, llseek, ppoll, posix_fadvise, posix_fallocate, sync_file_range,
fallocate, preadv, pwritev, preadv2, pwritev2, select, pselect, mmap,
readahead, epoll_pwait, splice, recvfrom, sendto, recvmmsg, msgsnd, msgrcv,
msgget, msgctl, semop, semget, semctl, semtimedop, shmat, shmdt, shmget,
and shmctl.  All of them, with the possible exception of posix_fadvise and
the sysv ctl calls, are blocking calls, for which trying to save some cycles
really won't make any difference IMHO.  Also, the context switch is usually
the larger factor in latency.

> ENTRY(__libc_do_syscall)
>        PTR_SUBU sp, 32
>        cfi_adjust_cfa_offset(32)
> 
>        .set noreorder
>        REG_S s2, 16(sp)
>        REG_S s3, 20(sp)
>        REG_S s4, 24(sp)
>        syscall
>        .set reorder
> 
>        PTR_SUBU sp, -32
>        cfi_adjust_cfa_offset(-32)
>        ret
> END (__libc_do_syscall)
> 
> 
> On the caller side the 5th and following arguments should be passed in
> s2, s3, s4. s1 can be used to save ra around the subroutine call.
> 

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-16 14:47                     ` Joseph Myers
@ 2017-08-17 16:17                       ` Maciej W. Rozycki
  2017-08-17 17:25                         ` Adhemerval Zanella
  2017-08-17 18:18                         ` Aurelien Jarno
  0 siblings, 2 replies; 53+ messages in thread
From: Maciej W. Rozycki @ 2017-08-17 16:17 UTC (permalink / raw)
  To: Joseph Myers; +Cc: Aurelien Jarno, Adhemerval Zanella, libc-alpha

On Wed, 16 Aug 2017, Joseph Myers wrote:

> >  If the answer to any of these questions is "yes", then would factoring 
> > out the syscall `asm' along with the associated VLA declaration to a 
> > helper `always_inline' function help or would it not?
> 
> I don't think that would help.  An asm can never make assumptions about 
> which parts of the stack are used for what, only use its operands.

 There may be ABI restrictions however, which could provide guarantees 
beyond those resulting from the lone `asm' operands.  And it would be 
enough if we could prove that a certain arrangement has to be done in 
order not to break the ABI.  I can't think of anything right now though 
and if neither you nor anyone else can, then we'll have to live with what 
we have right now.

> >  I mean it is a tiny optimisation, but some syscalls are frequently 
> > called, so if we can avoid a waste of resources, then why not?
> 
> Are any 5/6/7-argument syscalls frequently called?

 Good question, however I have no data available.

 Anyway, here's my counter-proposal implementing the approach previously 
outlined.  I have passed it through regular MIPS o32 testing with these 
changes in test outputs resulting:

@@ -2575,7 +2575,7 @@
 PASS: nptl/tst-cond22
 PASS: nptl/tst-cond23
 PASS: nptl/tst-cond24
-FAIL: nptl/tst-cond25
+PASS: nptl/tst-cond25
 PASS: nptl/tst-cond3
 PASS: nptl/tst-cond4
 PASS: nptl/tst-cond5
@@ -2704,7 +2704,7 @@
 PASS: nptl/tst-rwlock12
 PASS: nptl/tst-rwlock13
 PASS: nptl/tst-rwlock14
-FAIL: nptl/tst-rwlock15
+PASS: nptl/tst-rwlock15
 PASS: nptl/tst-rwlock16
 PASS: nptl/tst-rwlock17
 PASS: nptl/tst-rwlock18

 The drawback is it adds a bit to code generated, e.g. `__libc_pwrite' 
(from nptl/pwrite.o and nptl/pwrite.os) grows by 4 and 6 instructions 
for non-PIC and PIC code respectively, and the whole libraries grow as 
well:

   text    data     bss     dec     hex filename
1483315   21129   11560 1516004  1721e4 libc.so
 105482     960    8448  114890   1c0ca nptl/libpthread.so

vs:

   text    data     bss     dec     hex filename
1484295   21133   11560 1516988  1725bc libc.so
 105974     960    8448  115382   1c2b6 nptl/libpthread.so

due to the insertion of the VLA size calculation (although GCC is smart 
enough to reuse a value of 0 already available, e.g.:

  38:	7c03e83b 	rdhwr	v1,$29
  3c:	8c638b70 	lw	v1,-29840(v1)
  40:	14600018 	bnez	v1,a4 <__libc_pwrite+0xa4>
  44:	000787c3 	sra	s0,a3,0x1f
  48:	000318c0 	sll	v1,v1,0x3
  4c:	03a08825 	move	s1,sp
  50:	03a3e823 	subu	sp,sp,v1

and save an instruction) and the use of an extra register to preserve the 
value of $sp across the block containing the VLA (as also seen with $s1 in 
the disassembly above) even though it could use $fp that holds the same 
value instead (e.g. continuing from the above:

  74:	0220e825 	move	sp,s1
  78:	03c0e825 	move	sp,s8

).  It would be good to know how this compares to Adhemerval's proposal.

  Maciej

	* sysdeps/unix/sysv/linux/mips/mips32/sysdep.h 
	(FORCE_FRAME_POINTER): Remove macro.
	(internal_syscall5): Use a variable-length array to force the
	use of a frame pointer.
	(internal_syscall6): Likewise.
	(internal_syscall7): Likewise.
---
 sysdeps/unix/sysv/linux/mips/mips32/sysdep.h |   24 +++++++++++++++++-------
 1 file changed, 17 insertions(+), 7 deletions(-)

glibc-mips-o32-syscall-stack.diff
Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
===================================================================
--- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h	2017-04-11 21:27:25.000000000 +0100
+++ glibc/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h	2017-08-16 20:49:15.758029215 +0100
@@ -264,18 +264,20 @@
 
 /* We need to use a frame pointer for the functions in which we
    adjust $sp around the syscall, or debug information and unwind
-   information will be $sp relative and thus wrong during the syscall.  As
-   of GCC 4.7, this is sufficient.  */
-#define FORCE_FRAME_POINTER						\
-  void *volatile __fp_force __attribute__ ((unused)) = alloca (4)
+   information will be $sp relative and thus wrong during the syscall.
+   We use a variable-length array to persuade GCC to use $fp.  */
 
 #define internal_syscall5(v0_init, input, number, err,			\
 			  arg1, arg2, arg3, arg4, arg5)			\
 ({									\
 	long _sys_result;						\
 									\
-	FORCE_FRAME_POINTER;						\
+	size_t s = 0;							\
+	asm ("" : "+r" (s));						\
 	{								\
+	char vla[s << 3];						\
+	asm ("" : : "p" (vla));						\
+									\
 	register long __s0 asm ("$16") __attribute__ ((unused))		\
 	  = (number);							\
 	register long __v0 asm ("$2");					\
@@ -306,8 +308,12 @@
 ({									\
 	long _sys_result;						\
 									\
-	FORCE_FRAME_POINTER;						\
+	size_t s = 0;							\
+	asm ("" : "+r" (s));						\
 	{								\
+	char vla[s << 3];						\
+	asm ("" : : "p" (vla));						\
+									\
 	register long __s0 asm ("$16") __attribute__ ((unused))		\
 	  = (number);							\
 	register long __v0 asm ("$2");					\
@@ -339,8 +345,12 @@
 ({									\
 	long _sys_result;						\
 									\
-	FORCE_FRAME_POINTER;						\
+	size_t s = 0;							\
+	asm ("" : "+r" (s));						\
 	{								\
+	char vla[s << 3];						\
+	asm ("" : : "p" (vla));						\
+									\
 	register long __s0 asm ("$16") __attribute__ ((unused))		\
 	  = (number);							\
 	register long __v0 asm ("$2");					\

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-17 16:17                       ` Maciej W. Rozycki
@ 2017-08-17 17:25                         ` Adhemerval Zanella
  2017-08-17 17:32                           ` Joseph Myers
  2017-08-17 20:34                           ` Maciej W. Rozycki
  2017-08-17 18:18                         ` Aurelien Jarno
  1 sibling, 2 replies; 53+ messages in thread
From: Adhemerval Zanella @ 2017-08-17 17:25 UTC (permalink / raw)
  To: Maciej W. Rozycki, Joseph Myers; +Cc: Aurelien Jarno, libc-alpha



On 17/08/2017 13:17, Maciej W. Rozycki wrote:
> On Wed, 16 Aug 2017, Joseph Myers wrote:
> 
>>>  If the answer to any of these questions is "yes", then would factoring 
>>> out the syscall `asm' along with the associated VLA declaration to a 
>>> helper `always_inline' function help or would it not?
>>
>> I don't think that would help.  An asm can never make assumptions about 
>> which parts of the stack are used for what, only use its operands.
> 
>  There may be ABI restrictions however, which could provide guarantees 
> beyond those resulting from the lone `asm' operands.  And it would be 
> enough if we could prove that a certain arrangement has to be done in 
> order not to break the ABI.  I can't think of anything right now though 
> and if neither you nor anyone else can, then we'll have to live with what 
> we have right now.
> 
>>>  I mean it is a tiny optimisation, but some syscalls are frequently 
>>> called, so if we can avoid a waste of resources, then why not?
>>
>> Are any 5/6/7-argument syscalls frequently called?
> 
>  Good question, however I have no data available.
> 
>  Anyway, here's my counter-proposal implementing the approach previously 
> outlined.  I have passed it through regular MIPS o32 testing with these 
> changes in test outputs resulting:
> 
> @@ -2575,7 +2575,7 @@
>  PASS: nptl/tst-cond22
>  PASS: nptl/tst-cond23
>  PASS: nptl/tst-cond24
> -FAIL: nptl/tst-cond25
> +PASS: nptl/tst-cond25
>  PASS: nptl/tst-cond3
>  PASS: nptl/tst-cond4
>  PASS: nptl/tst-cond5
> @@ -2704,7 +2704,7 @@
>  PASS: nptl/tst-rwlock12
>  PASS: nptl/tst-rwlock13
>  PASS: nptl/tst-rwlock14
> -FAIL: nptl/tst-rwlock15
> +PASS: nptl/tst-rwlock15
>  PASS: nptl/tst-rwlock16
>  PASS: nptl/tst-rwlock17
>  PASS: nptl/tst-rwlock18
> 
>  The drawback is it adds a bit to code generated, e.g. `__libc_pwrite' 
> (from nptl/pwrite.o and nptl/pwrite.os) grows by 4 and 6 instructions 
> respectively for non-PIC and PIC code respectively, and the whole 
> libraries:
> 
>    text    data     bss     dec     hex filename
> 1483315   21129   11560 1516004  1721e4 libc.so
>  105482     960    8448  114890   1c0ca nptl/libpthread.so
> 
> vs:
> 
>    text    data     bss     dec     hex filename
> 1484295   21133   11560 1516988  1725bc libc.so
>  105974     960    8448  115382   1c2b6 nptl/libpthread.so
> 
> due to the insertion of the VLA size calculation (although GCC is smart 
> enough to reuse a value of 0 already available, e.g.:
> 
>   38:	7c03e83b 	rdhwr	v1,$29
>   3c:	8c638b70 	lw	v1,-29840(v1)
>   40:	14600018 	bnez	v1,a4 <__libc_pwrite+0xa4>
>   44:	000787c3 	sra	s0,a3,0x1f
>   48:	000318c0 	sll	v1,v1,0x3
>   4c:	03a08825 	move	s1,sp
>   50:	03a3e823 	subu	sp,sp,v1
> 
> and save an instruction) and the use of an extra register to preserve the 
> value of $sp across the block containing the VLA (as also seen with $s1 in 
> the disassembly above) even though it could use $fp that holds the same 
> value instead (e.g. continuing from the above:
> 
>   74:	0220e825 	move	sp,s1
>   78:	03c0e825 	move	sp,s8
> 
> ).  It would be good to know how this compares to Adhemerval's proposal.

My point is that we should aim for safety under compiler optimization
(so that the code does not break depending on the compiler's default
flags).  Given the current effort to *avoid* VLAs in glibc, I do not
think it is a good approach to use one as a bridge to force GCC to
generate the expected code.

I still think that trying to optimize for 5/6/7-argument syscalls is
over-engineering in this *specific* case.  As I put in my last message,
5/6/7-argument syscalls are used for

pread, pwrite, lseek, llseek, ppoll, posix_fadvise, posix_fallocate,
sync_file_range, fallocate, preadv, pwritev, preadv2, pwritev2, select,
pselect, mmap, readahead, epoll_pwait, splice, recvfrom, sendto, recvmmsg,
msgsnd, msgrcv, msgget, msgctl, semop, semget, semctl, semtimedop, shmat,
shmdt, shmget, and shmctl.

These are the ones generated from the C implementation (some are still
auto-generated).  The majority of them are blocking syscalls, so the
context switch plus the work required to complete the syscall itself
will account for nearly all of the time taken.  So trying to squeeze out
a few cycles does not really pay off compared to code maintainability
(this whole discussion of which C construct would be safe enough to
generate the correct stack spill, plus the current issue, should
indicate that we should aim for correctness first).
 


> 
>   Maciej
> 
> 	* sysdeps/unix/sysv/linux/mips/mips32/sysdep.h 
> 	(FORCE_FRAME_POINTER): Remove macro.
> 	(internal_syscall5): Use a variable-length array to force the
> 	use of a frame pointer.
> 	(internal_syscall6): Likewise.
> 	(internal_syscall7): Likewise.
> ---
>  sysdeps/unix/sysv/linux/mips/mips32/sysdep.h |   24 +++++++++++++++++-------
>  1 file changed, 17 insertions(+), 7 deletions(-)
> 
> glibc-mips-o32-syscall-stack.diff
> Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
> ===================================================================
> --- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h	2017-04-11 21:27:25.000000000 +0100
> +++ glibc/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h	2017-08-16 20:49:15.758029215 +0100
> @@ -264,18 +264,20 @@
>  
>  /* We need to use a frame pointer for the functions in which we
>     adjust $sp around the syscall, or debug information and unwind
> -   information will be $sp relative and thus wrong during the syscall.  As
> -   of GCC 4.7, this is sufficient.  */
> -#define FORCE_FRAME_POINTER						\
> -  void *volatile __fp_force __attribute__ ((unused)) = alloca (4)
> +   information will be $sp relative and thus wrong during the syscall.
> +   We use a variable-length array to persuade GCC to use $fp.  */
>  
>  #define internal_syscall5(v0_init, input, number, err,			\
>  			  arg1, arg2, arg3, arg4, arg5)			\
>  ({									\
>  	long _sys_result;						\
>  									\
> -	FORCE_FRAME_POINTER;						\
> +	size_t s = 0;							\
> +	asm ("" : "+r" (s));						\
>  	{								\
> +	char vla[s << 3];						\
> +	asm ("" : : "p" (vla));						\
> +									\
>  	register long __s0 asm ("$16") __attribute__ ((unused))		\
>  	  = (number);							\
>  	register long __v0 asm ("$2");					\
> @@ -306,8 +308,12 @@
>  ({									\
>  	long _sys_result;						\
>  									\
> -	FORCE_FRAME_POINTER;						\
> +	size_t s = 0;							\
> +	asm ("" : "+r" (s));						\
>  	{								\
> +	char vla[s << 3];						\
> +	asm ("" : : "p" (vla));						\
> +									\
>  	register long __s0 asm ("$16") __attribute__ ((unused))		\
>  	  = (number);							\
>  	register long __v0 asm ("$2");					\
> @@ -339,8 +345,12 @@
>  ({									\
>  	long _sys_result;						\
>  									\
> -	FORCE_FRAME_POINTER;						\
> +	size_t s = 0;							\
> +	asm ("" : "+r" (s));						\
>  	{								\
> +	char vla[s << 3];						\
> +	asm ("" : : "p" (vla));						\
> +									\
>  	register long __s0 asm ("$16") __attribute__ ((unused))		\
>  	  = (number);							\
>  	register long __v0 asm ("$2");					\
> 

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-17 17:25                         ` Adhemerval Zanella
@ 2017-08-17 17:32                           ` Joseph Myers
  2017-08-17 20:34                           ` Maciej W. Rozycki
  1 sibling, 0 replies; 53+ messages in thread
From: Joseph Myers @ 2017-08-17 17:32 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: Maciej W. Rozycki, Aurelien Jarno, libc-alpha

On Thu, 17 Aug 2017, Adhemerval Zanella wrote:

> My point is I think we should aim for compiler optimization safeness
> (to avoid code breakage over compiler defined default flags) and taking
> as base current approach to *avoid* VLA on GLIBC I do not think it is
> good approach to use it as a bridge to force GCC to generate the expected
> code.

I think the point that -Werror=alloca -Werror=vla would be desirable for 
building glibc (if you don't have any variable-size stack allocations, you 
don't need to worry about problems with unbounded stack allocations, which 
are always bad, even given reliable stack checking, because of the 
inability to report errors from them) is a good one about why to avoid 
using the VLA approach.
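
To illustrate the point about unbounded stack allocations, here is a
minimal sketch of the fixed-size alternative.  It is not glibc code: the
name copy_bounded and the bound SCRATCH_MAX are hypothetical and chosen
only for this example.  With a compile-time size the function can refuse
an oversized request and report it, which a VLA or alloca cannot do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bound, chosen only for this illustration.  */
enum { SCRATCH_MAX = 64 };

/* Copy LEN bytes from SRC to DST through a fixed-size stack buffer.
   Return the number of bytes copied, or -1 if LEN exceeds the buffer.
   A VLA (char tmp[len]) has no way to report such a failure: an
   oversized LEN would just overrun the stack.  */
static long
copy_bounded (char *dst, const char *src, size_t len)
{
  char tmp[SCRATCH_MAX];	/* Size is known at compile time.  */

  assert (dst != NULL && src != NULL);
  if (len > sizeof tmp)
    return -1;			/* The error is visible to the caller.  */
  memcpy (tmp, src, len);
  memcpy (dst, tmp, len);
  return (long) len;
}
```

Code written this way also builds cleanly with -Werror=alloca
-Werror=vla, since it contains no variable-size stack allocation.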

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-17 16:17                       ` Maciej W. Rozycki
  2017-08-17 17:25                         ` Adhemerval Zanella
@ 2017-08-17 18:18                         ` Aurelien Jarno
  1 sibling, 0 replies; 53+ messages in thread
From: Aurelien Jarno @ 2017-08-17 18:18 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Joseph Myers, Adhemerval Zanella, libc-alpha

On 2017-08-17 17:17, Maciej W. Rozycki wrote:
>  The drawback is it adds a bit to code generated, e.g. `__libc_pwrite' 
> (from nptl/pwrite.o and nptl/pwrite.os) grows by 4 and 6 instructions 
> respectively for non-PIC and PIC code respectively, and the whole 
> libraries:
> 
>    text    data     bss     dec     hex filename
> 1483315   21129   11560 1516004  1721e4 libc.so
>  105482     960    8448  114890   1c0ca nptl/libpthread.so
> 
> vs:
> 
>    text    data     bss     dec     hex filename
> 1484295   21133   11560 1516988  1725bc libc.so
>  105974     960    8448  115382   1c2b6 nptl/libpthread.so
> 
> due to the insertion of the VLA size calculation (although GCC is smart 
> enough to reuse a value of 0 already available, e.g.:
> 
>   38:	7c03e83b 	rdhwr	v1,$29
>   3c:	8c638b70 	lw	v1,-29840(v1)
>   40:	14600018 	bnez	v1,a4 <__libc_pwrite+0xa4>
>   44:	000787c3 	sra	s0,a3,0x1f
>   48:	000318c0 	sll	v1,v1,0x3
>   4c:	03a08825 	move	s1,sp
>   50:	03a3e823 	subu	sp,sp,v1
> 
> and save an instruction) and the use of an extra register to preserve the 
> value of $sp across the block containing the VLA (as also seen with $s1 in 
> the disassembly above) even though it could use $fp that holds the same 
> value instead (e.g. continuing from the above:
> 
>   74:	0220e825 	move	sp,s1
>   78:	03c0e825 	move	sp,s8
> 
> ).  It would be good to know how this compares to Adhemerval's proposal.

I have been trying to improve Adhemerval's patches a bit by returning
the error value in v1, in addition to the return code in v0. Here are
the corresponding numbers:

w/o patch:
   text    data     bss     dec     hex filename
1489767   21085   11560 1522412  173aec libc.so
 107908     956    8448  117312   1ca40 nptl/libpthread.so

with patch:
   text    data     bss     dec     hex filename
1488135   21089   11560 1520784  173490 libc.so
 107244     960    8448  116652   1c7ac nptl/libpthread.so


When looking at a given function like `__libc_pwrite' it gets reduced
by 13 instructions in both PIC and non-PIC cases. However we need to
add the 16 instructions of __libc_do_syscall.
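
The arithmetic behind these figures can be sketched as below.  The helper
names text_saved and net_insns_saved are hypothetical, and the sketch
assumes that each library carrying call sites also carries exactly one
copy of the stub.

```c
#include <assert.h>

/* Bytes of text saved: size before the patch minus size after it.  */
static long
text_saved (long before, long after)
{
  return before - after;
}

/* Net instructions saved when one stub of STUB_INSNS instructions
   replaces inline code, saving PER_SITE instructions at each of SITES
   call sites.  A negative result means the stub costs more than it
   saves.  */
static long
net_insns_saved (long sites, long per_site, long stub_insns)
{
  assert (sites >= 0);
  return sites * per_site - stub_insns;
}
```

With the numbers above, text_saved (1489767, 1488135) gives 1632 bytes
for libc.so and text_saved (107908, 107244) gives 664 bytes for
libpthread.so.  For the stub itself, net_insns_saved (1, 13, 16) is -3
and net_insns_saved (2, 13, 16) is 10, so under the stated assumption
the stub pays for itself from the second call site onwards.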

Aurelien

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-17 17:25                         ` Adhemerval Zanella
  2017-08-17 17:32                           ` Joseph Myers
@ 2017-08-17 20:34                           ` Maciej W. Rozycki
  2017-08-17 21:09                             ` Adhemerval Zanella
  1 sibling, 1 reply; 53+ messages in thread
From: Maciej W. Rozycki @ 2017-08-17 20:34 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: Joseph Myers, Aurelien Jarno, libc-alpha

On Thu, 17 Aug 2017, Adhemerval Zanella wrote:

> My point is I think we should aim for compiler optimization safeness
> (to avoid code breakage over compiler defined default flags) and taking
> as base current approach to *avoid* VLA on GLIBC I do not think it is
> good approach to use it as a bridge to force GCC to generate the expected
> code.

 You certainly have a point here overall, although I don't think a VLA 
whose size is always 0 really hurts.  And we've used the approach with 
`alloca' since forever with no adverse effects until we added a place 
where the caller invokes the syscall wrapper in a loop.  So I wouldn't 
necessarily call it an issue.  Mind that this is target-specific code, so 
we can rely on a target-specific execution model rather than limiting 
ourselves to what generic ISO C guarantees.

 Aurelien's figures indicating a clear size reduction certainly count as a 
pro though.

> I still thinking trying to optimize for 5/6/7 syscall argument is over
> engineering in this *specific* case.  As I put in my last message,
> 5/6/7 argument syscalls are used for 
> 
> pread, pwrite, lseek, llseek, ppoll, posix_fadvise, posix_fallocate, 
> sync_file_range, fallocate, preadv, pwritev, preadv2, pwritev2, select,
> pselect, mmap, readahead, epoll_pwait, splice, recvfrom, sendto, recvmmsg,
> msgsnd, msgrcv, msgget, msgctl, semop, semget, semctl, semtimedop, shmat,
> shmdt, shmget, and shmctl. 
> 
> Which are the one generated from C implementation (some are still auto
> generated).  The majority of them are blocking syscalls, so both context
> switch plus the required work for syscall completion itself will taking
> proportionally all the required time.  So trying to squeeze some cycles
> don't really pay off comparing to code maintainability (just all this
> discussion of which C construct would be safe enough to generate the 
> correct stack spill plus the current issue should indicate we should
> aim for correctness first).

 TBH, I find it questionable whether it's really the approach I proposed 
that requires more engineering (and long-term maintenance) effort rather 
than using a separate handwritten assembly-language call stub.  Especially 
if a non-standard calling convention is used.

 If everyone but me thinks there's a clear advantage in using such a 
handcoded stub though, then as I previously noted please adjust the 
affected MIPS16 stubs to avoid the extra indirection, i.e. you can call 
`__libc_do_syscall' directly from MIPS16 code as you'd do from regular 
MIPS or microMIPS code, as the lone reason for the existence of the MIPS16 
stubs is the inexistence of a MIPS16 SYSCALL instruction.

 Once you're done with that I can push your proposed change through MIPS16 
regression testing if that helped.  I can see if I can run microMIPS 
testing as well, although I'd have to double-check for an available board 
as I don't use one regularly.

  Maciej

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-17 20:34                           ` Maciej W. Rozycki
@ 2017-08-17 21:09                             ` Adhemerval Zanella
  2017-08-17 21:20                               ` Aurelien Jarno
                                                 ` (2 more replies)
  0 siblings, 3 replies; 53+ messages in thread
From: Adhemerval Zanella @ 2017-08-17 21:09 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Joseph Myers, Aurelien Jarno, libc-alpha



On 17/08/2017 17:34, Maciej W. Rozycki wrote:
> On Thu, 17 Aug 2017, Adhemerval Zanella wrote:
> 
>> My point is I think we should aim for compiler optimization safeness
>> (to avoid code breakage over compiler defined default flags) and taking
>> as base current approach to *avoid* VLA on GLIBC I do not think it is
>> good approach to use it as a bridge to force GCC to generate the expected
>> code.
> 
>  You certainly have a point here overall, although I don't think a VLA 
> whose size is always 0 really hurts.  And we've used the approach with 
> `alloca' since forever with no adverse effects until we added a place 
> where the caller invokes the syscall wrapper in a loop.  So I wouldn't 
> necessarily call it an issue.  Mind that this is target-specific code, so 
> we can rely on a target-specific execution model rather than limiting 
> ourselves to what generic ISO C guarantees.
> 
>  Aurelien's figures indicating a clear size reduction certainly count as a 
> pro though.

Joseph pointed out another advantage of avoiding VLAs (building with
-Werror=alloca -Werror=vla).  My main problem here is that we are betting
that the compiler won't mess with our assumptions and will generate the
desired code, without relying on anything it is actually supposed to
provide.  Generic ISO C gives us a better guarantee, and any deviation
then indicates a possible compiler issue rather than the other way
around (as in this case).  My other point is that we can optimize later
if required, and IMHO this is hardly the case here (at least for
latency).

If I understood correctly, Aurelien's suggestion of returning err in v1
does not strictly follow the ABI, so we will end up calling
__libc_do_syscall with a non-conformant calling convention (similar to
the pipe implementation, which requires an assembly-specific
implementation on a lot of architectures to get this right).  Again,
this is something I would really like to avoid.

> 
>> I still thinking trying to optimize for 5/6/7 syscall argument is over
>> engineering in this *specific* case.  As I put in my last message,
>> 5/6/7 argument syscalls are used for 
>>
>> pread, pwrite, lseek, llseek, ppoll, posix_fadvise, posix_fallocate, 
>> sync_file_range, fallocate, preadv, pwritev, preadv2, pwritev2, select,
>> pselect, mmap, readahead, epoll_pwait, splice, recvfrom, sendto, recvmmsg,
>> msgsnd, msgrcv, msgget, msgctl, semop, semget, semctl, semtimedop, shmat,
>> shmdt, shmget, and shmctl. 
>>
>> Which are the one generated from C implementation (some are still auto
>> generated).  The majority of them are blocking syscalls, so both context
>> switch plus the required work for syscall completion itself will taking
>> proportionally all the required time.  So trying to squeeze some cycles
>> don't really pay off comparing to code maintainability (just all this
>> discussion of which C construct would be safe enough to generate the 
>> correct stack spill plus the current issue should indicate we should
>> aim for correctness first).
> 
>  TBH, I find it questionable whether it's really the approach I proposed 
> that requires more engineering (and long-term maintenance) effort rather 
> than using a separate handwritten assembly-language call stub.  Especially 
> if a non-standard calling convention is used.

IMHO the VLA suggestion is more fragile in the long term.

> 
>  If everyone but me thinks there's a clear advantage in using such a 
> handcoded stub though, then as I previously noted please adjust the 
> affected MIPS16 stubs to avoid the extra indirection, i.e. you can call 
> `__libc_do_syscall' directly from MIPS16 code as you'd do from regular 
> MIPS or microMIPS code, as the lone reason for the existence of the MIPS16 
> stubs is the inexistence of a MIPS16 SYSCALL instruction.

Ok, I will try to at least check it on qemu.  If you have any pointers on
how to correctly build a mips16 glibc, they would be helpful.

> 
>  Once you're done with that I can push your proposed change through MIPS16 
> regression testing if that helped.  I can see if I can run microMIPS 
> testing as well, although I'd have to double-check for an available board 
> as I don't use one regularly.
> 
>   Maciej
> 

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-17 21:09                             ` Adhemerval Zanella
@ 2017-08-17 21:20                               ` Aurelien Jarno
  2017-08-17 22:05                                 ` Adhemerval Zanella
  2017-08-17 22:34                                 ` Maciej W. Rozycki
  2017-08-17 21:34                               ` Aurelien Jarno
  2017-08-17 21:47                               ` Maciej W. Rozycki
  2 siblings, 2 replies; 53+ messages in thread
From: Aurelien Jarno @ 2017-08-17 21:20 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: Maciej W. Rozycki, Joseph Myers, libc-alpha

On 2017-08-17 18:09, Adhemerval Zanella wrote:
> 
> 
> On 17/08/2017 17:34, Maciej W. Rozycki wrote:
> > On Thu, 17 Aug 2017, Adhemerval Zanella wrote:
> > 
> >> My point is I think we should aim for compiler optimization safeness
> >> (to avoid code breakage over compiler defined default flags) and taking
> >> as base current approach to *avoid* VLA on GLIBC I do not think it is
> >> good approach to use it as a bridge to force GCC to generate the expected
> >> code.
> > 
> >  You certainly have a point here overall, although I don't think a VLA 
> > whose size is always 0 really hurts.  And we've used the approach with 
> > `alloca' since forever with no adverse effects until we added a place 
> > where the caller invokes the syscall wrapper in a loop.  So I wouldn't 
> > necessarily call it an issue.  Mind that this is target-specific code, so 
> > we can rely on a target-specific execution model rather than limiting 
> > ourselves to what generic ISO C guarantees.
> > 
> >  Aurelien's figures indicating a clear size reduction certainly count as a 
> > pro though.
> 
> Joseph pointed out another advantage of avoid VLAs (building with 
> -Werror=alloca -Werror=vla).  My main problem here is we are betting that
> compiler won't mess with our assumptions and generate the desirable code
> without trying to adhere what it is suppose to provide.  Target generic
> ISO C give us a better guarantee and any deviation indicates a possible
> compiler issue, not otherwise (such this case).  My another point is we
> can optimize if required later if this is the case and imho this is hardly
> the case here (at least for latency).
> 
> If I understood correctly Aurelien's suggestion of returning err in v1
> is not ABI strictly so it will end up calling __libc_do_syscall with a
> non-conformant ABI convention (similar to pipe implementation where requires
> assembly specific implementation for a lot of architectures to get this
> right).  Again this is something I would really to avoid.
> 
> > 
> >> I still thinking trying to optimize for 5/6/7 syscall argument is over
> >> engineering in this *specific* case.  As I put in my last message,
> >> 5/6/7 argument syscalls are used for 
> >>
> >> pread, pwrite, lseek, llseek, ppoll, posix_fadvise, posix_fallocate, 
> >> sync_file_range, fallocate, preadv, pwritev, preadv2, pwritev2, select,
> >> pselect, mmap, readahead, epoll_pwait, splice, recvfrom, sendto, recvmmsg,
> >> msgsnd, msgrcv, msgget, msgctl, semop, semget, semctl, semtimedop, shmat,
> >> shmdt, shmget, and shmctl. 
> >>
> >> Which are the one generated from C implementation (some are still auto
> >> generated).  The majority of them are blocking syscalls, so both context
> >> switch plus the required work for syscall completion itself will taking
> >> proportionally all the required time.  So trying to squeeze some cycles
> >> don't really pay off comparing to code maintainability (just all this
> >> discussion of which C construct would be safe enough to generate the 
> >> correct stack spill plus the current issue should indicate we should
> >> aim for correctness first).
> > 
> >  TBH, I find it questionable whether it's really the approach I proposed 
> > that requires more engineering (and long-term maintenance) effort rather 
> > than using a separate handwritten assembly-language call stub.  Especially 
> > if a non-standard calling convention is used.
> 
> IMHO I find the VLA suggestion more fragile in long term.
> 
> > 
> >  If everyone but me thinks there's a clear advantage in using such a 
> > handcoded stub though, then as I previously noted please adjust the 
> > affected MIPS16 stubs to avoid the extra indirection, i.e. you can call 
> > `__libc_do_syscall' directly from MIPS16 code as you'd do from regular 
> > MIPS or microMIPS code, as the lone reason for the existence of the MIPS16 
> > stubs is the inexistence of a MIPS16 SYSCALL instruction.
> 
> Ok, I will try to at least check it on qemu. If you have any points on how
> correctly build a mips16 glibc it could be helpful. 

The patch below, based on Adhemerval's version, should do it.  The changes
I have made:
- return err through v1 instead of using a negative value
- fix build of mips16-syscallX.c
- route mips16 syscalls with 5 to 7 arguments through __libc_do_syscall
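
For readers comparing the two return conventions named in the first
item, here is a rough C sketch.  It is not part of the patch: the struct
sys_ret and both decode functions are hypothetical, and the -4095 bound
in the second one is an assumption borrowed from the usual Linux
convention on other architectures, not something taken from
Adhemerval's version.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pair modelling the two return registers.  */
struct sys_ret
{
  long v0;	/* Result, or a positive errno value on failure.  */
  long v1;	/* Error flag: nonzero means the call failed.  */
};

/* Convention A: the error flag travels separately from the value, so
   every value of v0 remains usable as a successful result.  */
static long
decode_flagged (struct sys_ret r, int *err)
{
  assert (err != NULL);
  *err = r.v1 != 0 ? (int) r.v0 : 0;
  return r.v1 != 0 ? -1 : r.v0;
}

/* Convention B: a single value, where a small negative number stands
   for -errno (assumed range: -4095 to -1).  */
static long
decode_negative (long ret, int *err)
{
  int failed = ret < 0 && ret >= -4095;

  assert (err != NULL);
  *err = failed ? (int) -ret : 0;
  return failed ? -1 : ret;
}
```

The design trade-off is that convention A needs a second return
register, while convention B fits a plain C return value at the cost of
reserving part of the result range for errors.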

I see no regression on mipsel o32. I have only lightly tested mips o32
(the testsuite is still running). I haven't been able to fully compile
mips16 due to the following error compiling dl-tunables.c:

/tmp/ccI2NMgJ.s: Assembler messages:
/tmp/ccI2NMgJ.s:1376: Error: branch to a symbol in another ISA mode

It doesn't seem to be related to my changes.

Aurelien

diff --git a/sysdeps/unix/sysv/linux/mips/mips32/Makefile b/sysdeps/unix/sysv/linux/mips/mips32/Makefile
index 33b461500c..cbdf032c3a 100644
--- a/sysdeps/unix/sysv/linux/mips/mips32/Makefile
+++ b/sysdeps/unix/sysv/linux/mips/mips32/Makefile
@@ -1,8 +1,26 @@
+ifeq ($(subdir),elf)
+sysdep-dl-routines += libc-do-syscall
+endif
+
 ifeq ($(subdir),conform)
 # For bugs 17786 and 21278.
 conformtest-xfail-conds += mips-o32-linux
 endif
 
+ifeq ($(subdir),io)
+sysdep_routines += libc-do-syscall
+endif
+
+ifeq ($(subdir),nptl)
+libpthread-sysdep_routines += libc-do-syscall
+libpthread-shared-only-routines += libc-do-syscall
+endif
+
+ifeq ($(subdir),rt)
+librt-sysdep_routines += libc-do-syscall
+librt-shared-only-routines += libc-do-syscall
+endif
+
 ifeq ($(subdir),stdlib)
 tests += bug-getcontext-mips-gp
 endif
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/libc-do-syscall.S b/sysdeps/unix/sysv/linux/mips/mips32/libc-do-syscall.S
new file mode 100644
index 0000000000..c02f507008
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/mips/mips32/libc-do-syscall.S
@@ -0,0 +1,52 @@
+/* Copyright (C) 2017 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sys/asm.h>
+#include <sysdep.h>
+#include <asm/unistd.h>
+#include <sgidefs.h>
+
+
+/* long int __libc_do_syscall (long int, ...)  */
+
+#define FRAMESZ 32
+
+	.text
+	.set    nomips16
+	.hidden __libc_do_syscall
+ENTRY(__libc_do_syscall)
+	move    v0, a0
+	move    a0, a1
+	move    a1, a2
+	move    a2, a3
+	lw      a3, 16(sp)
+	lw      t0, 20(sp)
+	lw      t1, 24(sp)
+	lw      t2, 28(sp)
+	.set 	noreorder
+	PTR_SUBU sp, FRAMESZ
+	cfi_adjust_cfa_offset (FRAMESZ)
+	sw      t0, 16(sp)
+	sw      t1, 20(sp)
+	sw      t2, 24(sp)
+	syscall
+	PTR_ADDU sp, FRAMESZ
+	cfi_adjust_cfa_offset (-FRAMESZ)
+	.set	reorder
+	move    v1, a3
+1:      ret
+END (__libc_do_syscall)
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/mips16/Makefile b/sysdeps/unix/sysv/linux/mips/mips32/mips16/Makefile
index fa9fcb7e6f..6869bf4f7c 100644
--- a/sysdeps/unix/sysv/linux/mips/mips32/mips16/Makefile
+++ b/sysdeps/unix/sysv/linux/mips/mips32/mips16/Makefile
@@ -1,13 +1,9 @@
 ifeq ($(subdir),misc)
 sysdep_routines += mips16-syscall0 mips16-syscall1 mips16-syscall2
-sysdep_routines += mips16-syscall3 mips16-syscall4 mips16-syscall5
-sysdep_routines += mips16-syscall6 mips16-syscall7
+sysdep_routines += mips16-syscall3 mips16-syscall4
 CFLAGS-mips16-syscall0.c += -fexceptions
 CFLAGS-mips16-syscall1.c += -fexceptions
 CFLAGS-mips16-syscall2.c += -fexceptions
 CFLAGS-mips16-syscall3.c += -fexceptions
 CFLAGS-mips16-syscall4.c += -fexceptions
-CFLAGS-mips16-syscall5.c += -fexceptions
-CFLAGS-mips16-syscall6.c += -fexceptions
-CFLAGS-mips16-syscall7.c += -fexceptions
 endif
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/mips16/Versions b/sysdeps/unix/sysv/linux/mips/mips32/mips16/Versions
index 73bcfb566c..bb21747f44 100644
--- a/sysdeps/unix/sysv/linux/mips/mips32/mips16/Versions
+++ b/sysdeps/unix/sysv/linux/mips/mips32/mips16/Versions
@@ -1,6 +1,6 @@
 libc {
   GLIBC_PRIVATE {
     __mips16_syscall0; __mips16_syscall1; __mips16_syscall2; __mips16_syscall3;
-    __mips16_syscall4; __mips16_syscall5; __mips16_syscall6; __mips16_syscall7;
+    __mips16_syscall4;
   }
 }
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall.h b/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall.h
index 880e9908e8..60f856d248 100644
--- a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall.h
+++ b/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall.h
@@ -21,17 +21,6 @@
 
 #define __nomips16 __attribute__ ((nomips16))
 
-union __mips16_syscall_return
-  {
-    long long val;
-    struct
-      {
-	long v0;
-	long v1;
-      }
-    reg;
-  };
-
 long long __nomips16 __mips16_syscall0 (long number);
 #define __mips16_syscall0(dummy, number)				\
 	__mips16_syscall0 ((long) (number))
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall0.c b/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall0.c
index 490245b34e..b9f78e875f 100644
--- a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall0.c
+++ b/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall0.c
@@ -24,7 +24,7 @@
 long long __nomips16
 __mips16_syscall0 (long number)
 {
-  union __mips16_syscall_return ret;
+  union __libc_do_syscall_return ret;
   ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 0);
   return ret.val;
 }
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall1.c b/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall1.c
index 3061e8accb..284ce712cc 100644
--- a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall1.c
+++ b/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall1.c
@@ -25,7 +25,7 @@ long long __nomips16
 __mips16_syscall1 (long a0,
 		   long number)
 {
-  union __mips16_syscall_return ret;
+  union __libc_do_syscall_return ret;
   ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 1,
 					a0);
   return ret.val;
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall2.c b/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall2.c
index 440a4ed285..4e76329239 100644
--- a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall2.c
+++ b/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall2.c
@@ -25,7 +25,7 @@ long long __nomips16
 __mips16_syscall2 (long a0, long a1,
 		   long number)
 {
-  union __mips16_syscall_return ret;
+  union __libc_do_syscall_return ret;
   ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 2,
 					a0, a1);
   return ret.val;
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall3.c b/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall3.c
index c3f83fc1f6..dbb31d2f20 100644
--- a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall3.c
+++ b/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall3.c
@@ -25,7 +25,7 @@ long long __nomips16
 __mips16_syscall3 (long a0, long a1, long a2,
 		   long number)
 {
-  union __mips16_syscall_return ret;
+  union __libc_do_syscall_return ret;
   ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 3,
 					a0, a1, a2);
   return ret.val;
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall4.c b/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall4.c
index 496297d296..a5dade3b3f 100644
--- a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall4.c
+++ b/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall4.c
@@ -25,7 +25,7 @@ long long __nomips16
 __mips16_syscall4 (long a0, long a1, long a2, long a3,
 		   long number)
 {
-  union __mips16_syscall_return ret;
+  union __libc_do_syscall_return ret;
   ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 4,
 					a0, a1, a2, a3);
   return ret.val;
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall5.c b/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall5.c
deleted file mode 100644
index ad265d88e2..0000000000
--- a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall5.c
+++ /dev/null
@@ -1,33 +0,0 @@
-/* MIPS16 syscall wrappers.
-   Copyright (C) 2013-2017 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <http://www.gnu.org/licenses/>.  */
-
-#include <sysdep.h>
-#include <mips16-syscall.h>
-
-#undef __mips16_syscall5
-
-long long __nomips16
-__mips16_syscall5 (long a0, long a1, long a2, long a3,
-		   long a4,
-		   long number)
-{
-  union __mips16_syscall_return ret;
-  ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 5,
-					a0, a1, a2, a3, a4);
-  return ret.val;
-}
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall6.c b/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall6.c
deleted file mode 100644
index bfbd395ed3..0000000000
--- a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall6.c
+++ /dev/null
@@ -1,33 +0,0 @@
-/* MIPS16 syscall wrappers.
-   Copyright (C) 2013-2017 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <http://www.gnu.org/licenses/>.  */
-
-#include <sysdep.h>
-#include <mips16-syscall.h>
-
-#undef __mips16_syscall6
-
-long long __nomips16
-__mips16_syscall6 (long a0, long a1, long a2, long a3,
-		   long a4, long a5,
-		   long number)
-{
-  union __mips16_syscall_return ret;
-  ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 6,
-					a0, a1, a2, a3, a4, a5);
-  return ret.val;
-}
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall7.c b/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall7.c
deleted file mode 100644
index e1267616dc..0000000000
--- a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall7.c
+++ /dev/null
@@ -1,33 +0,0 @@
-/* MIPS16 syscall wrappers.
-   Copyright (C) 2013-2017 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <http://www.gnu.org/licenses/>.  */
-
-#include <sysdep.h>
-#include <mips16-syscall.h>
-
-#undef __mips16_syscall7
-
-long long __nomips16
-__mips16_syscall7 (long a0, long a1, long a2, long a3,
-		   long a4, long a5, long a6,
-		   long number)
-{
-  union __mips16_syscall_return ret;
-  ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 7,
-					a0, a1, a2, a3, a4, a5, a6);
-  return ret.val;
-}
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h b/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
index e9e3ee7e82..8e55538a5c 100644
--- a/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
+++ b/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
@@ -49,9 +49,9 @@
 /* Define a macro which expands into the inline wrapper code for a system
    call.  */
 #undef INLINE_SYSCALL
-#define INLINE_SYSCALL(name, nr, args...)                               \
+#define INLINE_SYSCALL(name, nr, ...)                                   \
   ({ INTERNAL_SYSCALL_DECL (_sc_err);					\
-     long result_var = INTERNAL_SYSCALL (name, _sc_err, nr, args);	\
+     long result_var = INTERNAL_SYSCALL (name, _sc_err, nr, ## __VA_ARGS__); \
      if ( INTERNAL_SYSCALL_ERROR_P (result_var, _sc_err) )		\
        {								\
 	 __set_errno (INTERNAL_SYSCALL_ERRNO (result_var, _sc_err));	\
@@ -98,6 +98,19 @@
 #undef INTERNAL_SYSCALL
 #undef INTERNAL_SYSCALL_NCS
 
+long long __attribute__ ((nomips16)) __libc_do_syscall (long int, ...) attribute_hidden;
+
+union __libc_do_syscall_return
+  {
+    long long val;
+    struct
+      {
+	long v0;
+	long v1;
+      }
+    reg;
+  };
+
 #ifdef __mips16
 /* There's no MIPS16 syscall instruction, so we go through out-of-line
    standard MIPS wrappers.  These do use inline snippets below though,
@@ -107,13 +120,16 @@
 
 # include <mips16-syscall.h>
 
-# define INTERNAL_SYSCALL(name, err, nr, args...)			\
-	INTERNAL_SYSCALL_NCS (SYS_ify (name), err, nr, args)
+# define INTERNAL_SYSCALL(name, err, nr, ...)				\
+	INTERNAL_SYSCALL_NCS (SYS_ify (name), err, nr, ## __VA_ARGS__)
 
-# define INTERNAL_SYSCALL_NCS(number, err, nr, args...)			\
+# define INTERNAL_SYSCALL_NCS(number, err, nr, ...)			\
 ({									\
-	union __mips16_syscall_return _sc_ret;				\
-	_sc_ret.val = __mips16_syscall##nr (args, number);		\
+	union __libc_do_syscall_return _sc_ret;				\
+	if (nr <= 4)							\
+	  _sc_ret.val = __mips16_syscall##nr (__VA_ARGS__, number);	\
+	else								\
+	  _sc_ret.val = __libc_do_syscall (number, ## __VA_ARGS__);	\
 	err = _sc_ret.reg.v1;						\
 	_sc_ret.reg.v0;							\
 })
@@ -121,13 +137,13 @@
 # define INTERNAL_SYSCALL_MIPS16(number, err, nr, args...)		\
 	internal_syscall##nr ("lw\t%0, %2\n\t",				\
 			      "R" (number),				\
-			      0, err, args)
+			      number, err, args)
 
 #else /* !__mips16 */
 # define INTERNAL_SYSCALL(name, err, nr, args...)			\
 	internal_syscall##nr ("li\t%0, %2\t\t\t# " #name "\n\t",	\
 			      "IK" (SYS_ify (name)),			\
-			      0, err, args)
+			      SYS_ify(name), err, args)
 
 # define INTERNAL_SYSCALL_NCS(number, err, nr, args...)			\
 	internal_syscall##nr (MOVE32 "\t%0, %2\n\t",			\
@@ -136,6 +152,7 @@
 
 #endif /* !__mips16 */
 
+
 #define internal_syscall0(v0_init, input, number, err, dummy...)	\
 ({									\
 	long _sys_result;						\
@@ -262,110 +279,34 @@
 	_sys_result;							\
 })
 
-/* We need to use a frame pointer for the functions in which we
-   adjust $sp around the syscall, or debug information and unwind
-   information will be $sp relative and thus wrong during the syscall.  As
-   of GCC 4.7, this is sufficient.  */
-#define FORCE_FRAME_POINTER						\
-  void *volatile __fp_force __attribute__ ((unused)) = alloca (4)
-
 #define internal_syscall5(v0_init, input, number, err,			\
 			  arg1, arg2, arg3, arg4, arg5)			\
 ({									\
-	long _sys_result;						\
-									\
-	FORCE_FRAME_POINTER;						\
-	{								\
-	register long __s0 asm ("$16") __attribute__ ((unused))		\
-	  = (number);							\
-	register long __v0 asm ("$2");					\
-	register long __a0 asm ("$4") = (long) (arg1);			\
-	register long __a1 asm ("$5") = (long) (arg2);			\
-	register long __a2 asm ("$6") = (long) (arg3);			\
-	register long __a3 asm ("$7") = (long) (arg4);			\
-	__asm__ volatile (						\
-	".set\tnoreorder\n\t"						\
-	"subu\t$29, 32\n\t"						\
-	"sw\t%6, 16($29)\n\t"						\
-	v0_init								\
-	"syscall\n\t"							\
-	"addiu\t$29, 32\n\t"						\
-	".set\treorder"							\
-	: "=r" (__v0), "+r" (__a3)					\
-	: input, "r" (__a0), "r" (__a1), "r" (__a2),			\
-	  "r" ((long) (arg5))						\
-	: __SYSCALL_CLOBBERS);						\
-	err = __a3;							\
-	_sys_result = __v0;						\
-	}								\
-	_sys_result;							\
+        union __libc_do_syscall_return _sys_result;			\
+	_sys_result.val = __libc_do_syscall (number, arg1, arg2, arg3,	\
+					     arg4, arg5);		\
+	err = _sys_result.reg.v1;					\
+	_sys_result.reg.v0;						\
 })
 
 #define internal_syscall6(v0_init, input, number, err,			\
 			  arg1, arg2, arg3, arg4, arg5, arg6)		\
 ({									\
-	long _sys_result;						\
-									\
-	FORCE_FRAME_POINTER;						\
-	{								\
-	register long __s0 asm ("$16") __attribute__ ((unused))		\
-	  = (number);							\
-	register long __v0 asm ("$2");					\
-	register long __a0 asm ("$4") = (long) (arg1);			\
-	register long __a1 asm ("$5") = (long) (arg2);			\
-	register long __a2 asm ("$6") = (long) (arg3);			\
-	register long __a3 asm ("$7") = (long) (arg4);			\
-	__asm__ volatile (						\
-	".set\tnoreorder\n\t"						\
-	"subu\t$29, 32\n\t"						\
-	"sw\t%6, 16($29)\n\t"						\
-	"sw\t%7, 20($29)\n\t"						\
-	v0_init								\
-	"syscall\n\t"							\
-	"addiu\t$29, 32\n\t"						\
-	".set\treorder"							\
-	: "=r" (__v0), "+r" (__a3)					\
-	: input, "r" (__a0), "r" (__a1), "r" (__a2),			\
-	  "r" ((long) (arg5)), "r" ((long) (arg6))			\
-	: __SYSCALL_CLOBBERS);						\
-	err = __a3;							\
-	_sys_result = __v0;						\
-	}								\
-	_sys_result;							\
+        union __libc_do_syscall_return _sys_result;			\
+	_sys_result.val = __libc_do_syscall (number, arg1, arg2, arg3,	\
+					     arg4, arg5, arg6);		\
+	err = _sys_result.reg.v1;					\
+	_sys_result.reg.v0;						\
 })
 
 #define internal_syscall7(v0_init, input, number, err,			\
 			  arg1, arg2, arg3, arg4, arg5, arg6, arg7)	\
 ({									\
-	long _sys_result;						\
-									\
-	FORCE_FRAME_POINTER;						\
-	{								\
-	register long __s0 asm ("$16") __attribute__ ((unused))		\
-	  = (number);							\
-	register long __v0 asm ("$2");					\
-	register long __a0 asm ("$4") = (long) (arg1);			\
-	register long __a1 asm ("$5") = (long) (arg2);			\
-	register long __a2 asm ("$6") = (long) (arg3);			\
-	register long __a3 asm ("$7") = (long) (arg4);			\
-	__asm__ volatile (						\
-	".set\tnoreorder\n\t"						\
-	"subu\t$29, 32\n\t"						\
-	"sw\t%6, 16($29)\n\t"						\
-	"sw\t%7, 20($29)\n\t"						\
-	"sw\t%8, 24($29)\n\t"						\
-	v0_init								\
-	"syscall\n\t"							\
-	"addiu\t$29, 32\n\t"						\
-	".set\treorder"							\
-	: "=r" (__v0), "+r" (__a3)					\
-	: input, "r" (__a0), "r" (__a1), "r" (__a2),			\
-	  "r" ((long) (arg5)), "r" ((long) (arg6)), "r" ((long) (arg7))	\
-	: __SYSCALL_CLOBBERS);						\
-	err = __a3;							\
-	_sys_result = __v0;						\
-	}								\
-	_sys_result;							\
+        union __libc_do_syscall_return _sys_result;			\
+	_sys_result.val = __libc_do_syscall (number, arg1, arg2, arg3,	\
+					     arg4, arg5, arg6, arg7);	\
+	err = _sys_result.reg.v1;					\
+	_sys_result.reg.v0;						\
 })
 
 #define __SYSCALL_CLOBBERS "$1", "$3", "$8", "$9", "$10", "$11", "$12", "$13", \


-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net

* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-17 21:09                             ` Adhemerval Zanella
  2017-08-17 21:20                               ` Aurelien Jarno
@ 2017-08-17 21:34                               ` Aurelien Jarno
  2017-08-17 21:47                               ` Maciej W. Rozycki
  2 siblings, 0 replies; 53+ messages in thread
From: Aurelien Jarno @ 2017-08-17 21:34 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: Maciej W. Rozycki, Joseph Myers, libc-alpha

On 2017-08-17 18:09, Adhemerval Zanella wrote:
> 
> 
> On 17/08/2017 17:34, Maciej W. Rozycki wrote:
> > On Thu, 17 Aug 2017, Adhemerval Zanella wrote:
> > 
> >> My point is I think we should aim for compiler optimization safeness
> >> (to avoid code breakage over compiler defined default flags) and taking
> >> as base current approach to *avoid* VLA on GLIBC I do not think it is
> >> good approach to use it as a bridge to force GCC to generate the expected
> >> code.
> > 
> >  You certainly have a point here overall, although I don't think a VLA 
> > whose size is always 0 really hurts.  And we've used the approach with 
> > `alloca' since forever with no adverse effects until we added a place 
> > where the caller invokes the syscall wrapper in a loop.  So I wouldn't 
> > necessarily call it an issue.  Mind that this is target-specific code, so 
> > we can rely on a target-specific execution model rather than limiting 
> > ourselves to what generic ISO C guarantees.
> > 
> >  Aurelien's figures indicating a clear size reduction certainly count as a 
> > pro though.
> 
> Joseph pointed out another advantage of avoid VLAs (building with 
> -Werror=alloca -Werror=vla).  My main problem here is we are betting that
> compiler won't mess with our assumptions and generate the desirable code
> without trying to adhere what it is suppose to provide.  Target generic
> ISO C give us a better guarantee and any deviation indicates a possible
> compiler issue, not otherwise (such this case).  My another point is we
> can optimize if required later if this is the case and imho this is hardly
> the case here (at least for latency).
> 
> If I understood correctly Aurelien's suggestion of returning err in v1
> is not ABI strictly so it will end up calling __libc_do_syscall with a
> non-conformant ABI convention (similar to pipe implementation where requires
> assembly specific implementation for a lot of architectures to get this
> right).  Again this is something I would really to avoid.
> 

In the ABI, v1 is paired with v0 to return 64-bit values. In my patch,
__libc_do_syscall is declared as returning a long long. The value is
then split using a union, similar to what is already done for the
mips16 code.
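
A minimal host-side sketch of that split, for illustration only. It
uses int32_t in place of the 32-bit o32 `long' so that it behaves the
same on a 64-bit build machine, and the names ending in _sketch are
made up here; they are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for union __libc_do_syscall_return.  The two
   32-bit members overlay the 64-bit value in memory order, which is
   how the register pair is assumed to map onto a `long long'.  */
union syscall_return_sketch
{
  long long val;
  struct
  {
    int32_t v0;
    int32_t v1;
  } reg;
};

/* Callee side: pack two register-sized values into one 64-bit
   return value.  */
static long long
pack_regs_sketch (int32_t v0, int32_t v1)
{
  union syscall_return_sketch ret;
  ret.reg.v0 = v0;
  ret.reg.v1 = v1;
  return ret.val;
}

/* Caller side: store the v1 half in *err and return the v0 half.  */
static int32_t
unpack_v0_sketch (long long val, int32_t *err)
{
  union syscall_return_sketch ret;
  /* The overlay only works if both views have the same size.  */
  assert (sizeof ret.val == sizeof ret.reg);
  ret.val = val;
  *err = ret.reg.v1;
  return ret.reg.v0;
}
```

Because both sides go through the same union layout, the round trip
does not depend on the endianness of the machine running it.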

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net

* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-17 21:09                             ` Adhemerval Zanella
  2017-08-17 21:20                               ` Aurelien Jarno
  2017-08-17 21:34                               ` Aurelien Jarno
@ 2017-08-17 21:47                               ` Maciej W. Rozycki
  2 siblings, 0 replies; 53+ messages in thread
From: Maciej W. Rozycki @ 2017-08-17 21:47 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: Joseph Myers, Aurelien Jarno, libc-alpha

On Thu, 17 Aug 2017, Adhemerval Zanella wrote:

> If I understood correctly Aurelien's suggestion of returning err in v1
> is not ABI strictly so it will end up calling __libc_do_syscall with a
> non-conformant ABI convention (similar to pipe implementation where requires
> assembly specific implementation for a lot of architectures to get this
> right).  Again this is something I would really to avoid.

 Using $v1 is fine; in ABI terms it is just a part of a `long long' result, 
and you can access it in plain C in the caller (shifting and masking 
individual 32-bit halves if necessary).  I've done it myself in the past 
in some bare-metal library code.
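
 A rough sketch of the shifting and masking, assuming only that the 
result arrives as a 64-bit `long long'.  The helper names are invented 
for this example, and which half corresponds to $v0 and which to $v1 
depends on the endianness, so that mapping is deliberately left out.

```c
#include <stdint.h>

/* Hypothetical helper: the low 32 bits of a 64-bit result.  */
static uint32_t
low_half_sketch (long long val)
{
  return (uint32_t) ((unsigned long long) val & 0xffffffffULL);
}

/* Hypothetical helper: the high 32 bits of a 64-bit result.  The
   cast to unsigned keeps the right shift well defined for negative
   values.  */
static uint32_t
high_half_sketch (long long val)
{
  return (uint32_t) ((unsigned long long) val >> 32);
}
```

 Unlike the union, this form makes the choice of half explicit in the 
source, so the caller has to pick the right helper per endianness.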

> >  If everyone but me thinks there's a clear advantage in using such a 
> > handcoded stub though, then as I previously noted please adjust the 
> > affected MIPS16 stubs to avoid the extra indirection, i.e. you can call 
> > `__libc_do_syscall' directly from MIPS16 code as you'd do from regular 
> > MIPS or microMIPS code, as the lone reason for the existence of the MIPS16 
> > stubs is the inexistence of a MIPS16 SYSCALL instruction.
> 
> Ok, I will try to at least check it on qemu. If you have any points on how
> correctly build a mips16 glibc it could be helpful. 

 Just pass `-mips16' along with CFLAGS.  You may have to make sure your 
GCC configuration includes/supports a suitable MIPS16 multilib though 
(i.e. MIPS16 libgcc.a and CRT files of your chosen endianness; check with 
`-print-multi-lib' for entries with `@mips16'), to avoid interlinking 
scenarios that may not be supported.  I don't remember offhand what the 
defaults for the individual GCC configurations are, although I'm fairly 
sure at least one of `mips-mti-linux-gnu' and `mips-img-linux-gnu' 
configurations does have MIPS16 multilibs.  Let me know if you have 
troubles with that.

  Maciej

* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-17 21:20                               ` Aurelien Jarno
@ 2017-08-17 22:05                                 ` Adhemerval Zanella
  2017-08-17 22:34                                 ` Maciej W. Rozycki
  1 sibling, 0 replies; 53+ messages in thread
From: Adhemerval Zanella @ 2017-08-17 22:05 UTC (permalink / raw)
  To: Maciej W. Rozycki, Joseph Myers, libc-alpha



On 17/08/2017 18:20, Aurelien Jarno wrote:
> On 2017-08-17 18:09, Adhemerval Zanella wrote:
>>
>>
>> On 17/08/2017 17:34, Maciej W. Rozycki wrote:
>>> On Thu, 17 Aug 2017, Adhemerval Zanella wrote:
>>>
>>>> My point is I think we should aim for compiler optimization safeness
>>>> (to avoid code breakage over compiler defined default flags) and taking
>>>> as base current approach to *avoid* VLA on GLIBC I do not think it is
>>>> good approach to use it as a bridge to force GCC to generate the expected
>>>> code.
>>>
>>>  You certainly have a point here overall, although I don't think a VLA 
>>> whose size is always 0 really hurts.  And we've used the approach with 
>>> `alloca' since forever with no adverse effects until we added a place 
>>> where the caller invokes the syscall wrapper in a loop.  So I wouldn't 
>>> necessarily call it an issue.  Mind that this is target-specific code, so 
>>> we can rely on a target-specific execution model rather than limiting 
>>> ourselves to what generic ISO C guarantees.
>>>
>>>  Aurelien's figures indicating a clear size reduction certainly count as a 
>>> pro though.
>>
>> Joseph pointed out another advantage of avoid VLAs (building with 
>> -Werror=alloca -Werror=vla).  My main problem here is we are betting that
>> compiler won't mess with our assumptions and generate the desirable code
>> without trying to adhere what it is suppose to provide.  Target generic
>> ISO C give us a better guarantee and any deviation indicates a possible
>> compiler issue, not otherwise (such this case).  My another point is we
>> can optimize if required later if this is the case and imho this is hardly
>> the case here (at least for latency).
>>
>> If I understood correctly Aurelien's suggestion of returning err in v1
>> is not ABI strictly so it will end up calling __libc_do_syscall with a
>> non-conformant ABI convention (similar to pipe implementation where requires
>> assembly specific implementation for a lot of architectures to get this
>> right).  Again this is something I would really to avoid.
>>
>>>
>>>> I still thinking trying to optimize for 5/6/7 syscall argument is over
>>>> engineering in this *specific* case.  As I put in my last message,
>>>> 5/6/7 argument syscalls are used for 
>>>>
>>>> pread, pwrite, lseek, llseek, ppoll, posix_fadvice, posix_fallocate, 
>>>> sync_file_range, fallocate, preadv, pwritev, preadv2, pwritev2, select,
>>>> pselect, mmap, readahead, epoll_pwait, splice, recvfrom, sendto, recvmmsg,
>>>> msgsnd, msgrcv, msgget, msgctl, semop, semget, semctl, semtimedop, shmat,
>>>> shmdt, shmget, and shmctl. 
>>>>
>>>> Which are the one generated from C implementation (some are still auto
>>>> generated).  The majority of them are blocking syscalls, so both context
>>>> switch plus the required work for syscall completion itself will taking
>>>> proportionally all the required time.  So trying to squeeze some cycles
>>>> don't really pay off comparing to code maintainability (just all this
>>>> discussion of which C construct would be safe enough to generate the 
>>>> correct stack spill plus the current issue should indicate we should
>>>> aim for correctness first).
>>>
>>>  TBH, I find it questionable whether it's really the approach I proposed 
>>> that requires more engineering (and long-term maintenance) effort rather 
>>> than using a separate handwritten assembly-language call stub.  Especially 
>>> if a non-standard calling convention is used.
>>
>> IMHO I find the VLA suggestion more fragile in long term.
>>
>>>
>>>  If everyone but me thinks there's a clear advantage in using such a 
>>> handcoded stub though, then as I previously noted please adjust the 
>>> affected MIPS16 stubs to avoid the extra indirection, i.e. you can call 
>>> `__libc_do_syscall' directly from MIPS16 code as you'd do from regular 
>>> MIPS or microMIPS code, as the lone reason for the existence of the MIPS16 
>>> stubs is the inexistence of a MIPS16 SYSCALL instruction.
>>
>> Ok, I will try to at least check it on qemu. If you have any points on how
>> correctly build a mips16 glibc it could be helpful. 
> 
> The patch below, based on Adhemerval's version should do it. The changes
> I have done:
> - return err through v1 instead of using a negative value
> - fix build of mips16-syscallX.c
> - route mips16 syscalls with 5 to 7th arguments through __libc_do_syscall

Thanks for completing it; I was about to send a patched version with the
same modification (I figured out from reading the mips16 syscall code that
64-bit integer return values can be used here).  The patch LGTM; I would
just add some comments explaining why __libc_do_syscall is required (as for
other ports).  Comment below.

> 
> I see no regression on mipsel o32. I have only lightly tested mips o32
> (the testsuite is still running). I haven't been able to fully compile
> mips16 due to the following error compiling dl-tunables.c:
> 
> /tmp/ccI2NMgJ.s: Assembler messages:
> /tmp/ccI2NMgJ.s:1376: Error: branch to a symbol in another ISA mode
> 
> It doesn't seem to be related to my changes.

I haven't tested with --enable-tunables, but I could build mips16 with
default options.  The only issue, though, is the new tst-gmon.c test:

tst-gmon.c: In function ‘f1’:
tst-gmon.c:25:1: sorry, unimplemented: mips16 function profiling
 }
 ^

Which is unrelated to this patch.

> 
> Aurelien
> 
> diff --git a/sysdeps/unix/sysv/linux/mips/mips32/Makefile b/sysdeps/unix/sysv/linux/mips/mips32/Makefile
> index 33b461500c..cbdf032c3a 100644
> --- a/sysdeps/unix/sysv/linux/mips/mips32/Makefile
> +++ b/sysdeps/unix/sysv/linux/mips/mips32/Makefile
> @@ -1,8 +1,26 @@
> +ifeq ($(subdir),elf)
> +sysdep-dl-routines += libc-do-syscall
> +endif
> +
>  ifeq ($(subdir),conform)
>  # For bugs 17786 and 21278.
>  conformtest-xfail-conds += mips-o32-linux
>  endif
>  
> +ifeq ($(subdir),io)
> +sysdep_routines += libc-do-syscall
> +endif
> +
> +ifeq ($(subdir),nptl)
> +libpthread-sysdep_routines += libc-do-syscall
> +libpthread-shared-only-routines += libc-do-syscall
> +endif
> +
> +ifeq ($(subdir),rt)
> +librt-sysdep_routines += libc-do-syscall
> +librt-shared-only-routines += libc-do-syscall
> +endif
> +
>  ifeq ($(subdir),stdlib)
>  tests += bug-getcontext-mips-gp
>  endif
> diff --git a/sysdeps/unix/sysv/linux/mips/mips32/libc-do-syscall.S b/sysdeps/unix/sysv/linux/mips/mips32/libc-do-syscall.S
> new file mode 100644
> index 0000000000..c02f507008
> --- /dev/null
> +++ b/sysdeps/unix/sysv/linux/mips/mips32/libc-do-syscall.S
> @@ -0,0 +1,52 @@
> +/* Copyright (C) 2017 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library.  If not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#include <sys/asm.h>
> +#include <sysdep.h>
> +#include <asm/unistd.h>
> +#include <sgidefs.h>
> +
> +
> +/* long int __libc_do_syscall (long int, ...)  */

I added some comments in my version:

/* Out-of-line syscall stub used for 5-, 6-, and 7-argument syscalls, which
   require arguments on the stack.  It follows the MIPS ABI, as for the C
   prototype:

   long int __libc_do_syscall (long int, ...)

   with the syscall number in a0, the first argument in a1, the second in
   a2, the third in a3, and the 4th to 7th on the stack.  */

> +
> +#define FRAMESZ 32
> +
> +	.text
> +	.set    nomips16
> +	.hidden __libc_do_syscall
> +ENTRY(__libc_do_syscall)
> +	move    v0, a0
> +	move    a0, a1
> +	move    a1, a2
> +	move    a2, a3
> +	lw      a3, 16(sp)
> +	lw      t0, 20(sp)
> +	lw      t1, 24(sp)
> +	lw      t2, 28(sp)
> +	.set 	noreorder
> +	PTR_SUBU sp, FRAMESZ
> +	cfi_adjust_cfa_offset (FRAMESZ)
> +	sw      t0, 16(sp)
> +	sw      t1, 20(sp)
> +	sw      t2, 24(sp)
> +	syscall
> +	PTR_ADDU sp, FRAMESZ
> +	cfi_adjust_cfa_offset (-FRAMESZ)
> +	.set	reorder
> +	move    v1, a3
> +1:      ret
> +END (__libc_do_syscall)
> diff --git a/sysdeps/unix/sysv/linux/mips/mips32/mips16/Makefile b/sysdeps/unix/sysv/linux/mips/mips32/mips16/Makefile
> index fa9fcb7e6f..6869bf4f7c 100644
> --- a/sysdeps/unix/sysv/linux/mips/mips32/mips16/Makefile
> +++ b/sysdeps/unix/sysv/linux/mips/mips32/mips16/Makefile
> @@ -1,13 +1,9 @@
>  ifeq ($(subdir),misc)
>  sysdep_routines += mips16-syscall0 mips16-syscall1 mips16-syscall2
> -sysdep_routines += mips16-syscall3 mips16-syscall4 mips16-syscall5
> -sysdep_routines += mips16-syscall6 mips16-syscall7
> +sysdep_routines += mips16-syscall3 mips16-syscall4
>  CFLAGS-mips16-syscall0.c += -fexceptions
>  CFLAGS-mips16-syscall1.c += -fexceptions
>  CFLAGS-mips16-syscall2.c += -fexceptions
>  CFLAGS-mips16-syscall3.c += -fexceptions
>  CFLAGS-mips16-syscall4.c += -fexceptions
> -CFLAGS-mips16-syscall5.c += -fexceptions
> -CFLAGS-mips16-syscall6.c += -fexceptions
> -CFLAGS-mips16-syscall7.c += -fexceptions
>  endif
> diff --git a/sysdeps/unix/sysv/linux/mips/mips32/mips16/Versions b/sysdeps/unix/sysv/linux/mips/mips32/mips16/Versions
> index 73bcfb566c..bb21747f44 100644
> --- a/sysdeps/unix/sysv/linux/mips/mips32/mips16/Versions
> +++ b/sysdeps/unix/sysv/linux/mips/mips32/mips16/Versions
> @@ -1,6 +1,6 @@
>  libc {
>    GLIBC_PRIVATE {
>      __mips16_syscall0; __mips16_syscall1; __mips16_syscall2; __mips16_syscall3;
> -    __mips16_syscall4; __mips16_syscall5; __mips16_syscall6; __mips16_syscall7;
> +    __mips16_syscall4;
>    }
>  }
> diff --git a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall.h b/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall.h
> index 880e9908e8..60f856d248 100644
> --- a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall.h
> +++ b/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall.h
> @@ -21,17 +21,6 @@
>  
>  #define __nomips16 __attribute__ ((nomips16))
>  
> -union __mips16_syscall_return
> -  {
> -    long long val;
> -    struct
> -      {
> -	long v0;
> -	long v1;
> -      }
> -    reg;
> -  };
> -
>  long long __nomips16 __mips16_syscall0 (long number);
>  #define __mips16_syscall0(dummy, number)				\
>  	__mips16_syscall0 ((long) (number))
> diff --git a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall0.c b/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall0.c
> index 490245b34e..b9f78e875f 100644
> --- a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall0.c
> +++ b/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall0.c
> @@ -24,7 +24,7 @@
>  long long __nomips16
>  __mips16_syscall0 (long number)
>  {
> -  union __mips16_syscall_return ret;
> +  union __libc_do_syscall_return ret;
>    ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 0);
>    return ret.val;
>  }
> diff --git a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall1.c b/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall1.c
> index 3061e8accb..284ce712cc 100644
> --- a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall1.c
> +++ b/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall1.c
> @@ -25,7 +25,7 @@ long long __nomips16
>  __mips16_syscall1 (long a0,
>  		   long number)
>  {
> -  union __mips16_syscall_return ret;
> +  union __libc_do_syscall_return ret;
>    ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 1,
>  					a0);
>    return ret.val;
> diff --git a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall2.c b/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall2.c
> index 440a4ed285..4e76329239 100644
> --- a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall2.c
> +++ b/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall2.c
> @@ -25,7 +25,7 @@ long long __nomips16
>  __mips16_syscall2 (long a0, long a1,
>  		   long number)
>  {
> -  union __mips16_syscall_return ret;
> +  union __libc_do_syscall_return ret;
>    ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 2,
>  					a0, a1);
>    return ret.val;
> diff --git a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall3.c b/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall3.c
> index c3f83fc1f6..dbb31d2f20 100644
> --- a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall3.c
> +++ b/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall3.c
> @@ -25,7 +25,7 @@ long long __nomips16
>  __mips16_syscall3 (long a0, long a1, long a2,
>  		   long number)
>  {
> -  union __mips16_syscall_return ret;
> +  union __libc_do_syscall_return ret;
>    ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 3,
>  					a0, a1, a2);
>    return ret.val;
> diff --git a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall4.c b/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall4.c
> index 496297d296..a5dade3b3f 100644
> --- a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall4.c
> +++ b/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall4.c
> @@ -25,7 +25,7 @@ long long __nomips16
>  __mips16_syscall4 (long a0, long a1, long a2, long a3,
>  		   long number)
>  {
> -  union __mips16_syscall_return ret;
> +  union __libc_do_syscall_return ret;
>    ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 4,
>  					a0, a1, a2, a3);
>    return ret.val;
> diff --git a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall5.c b/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall5.c
> deleted file mode 100644
> index ad265d88e2..0000000000
> --- a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall5.c
> +++ /dev/null
> @@ -1,33 +0,0 @@
> -/* MIPS16 syscall wrappers.
> -   Copyright (C) 2013-2017 Free Software Foundation, Inc.
> -   This file is part of the GNU C Library.
> -
> -   The GNU C Library is free software; you can redistribute it and/or
> -   modify it under the terms of the GNU Lesser General Public
> -   License as published by the Free Software Foundation; either
> -   version 2.1 of the License, or (at your option) any later version.
> -
> -   The GNU C Library is distributed in the hope that it will be useful,
> -   but WITHOUT ANY WARRANTY; without even the implied warranty of
> -   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> -   Lesser General Public License for more details.
> -
> -   You should have received a copy of the GNU Lesser General Public
> -   License along with the GNU C Library; if not, see
> -   <http://www.gnu.org/licenses/>.  */
> -
> -#include <sysdep.h>
> -#include <mips16-syscall.h>
> -
> -#undef __mips16_syscall5
> -
> -long long __nomips16
> -__mips16_syscall5 (long a0, long a1, long a2, long a3,
> -		   long a4,
> -		   long number)
> -{
> -  union __mips16_syscall_return ret;
> -  ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 5,
> -					a0, a1, a2, a3, a4);
> -  return ret.val;
> -}
> diff --git a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall6.c b/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall6.c
> deleted file mode 100644
> index bfbd395ed3..0000000000
> --- a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall6.c
> +++ /dev/null
> @@ -1,33 +0,0 @@
> -/* MIPS16 syscall wrappers.
> -   Copyright (C) 2013-2017 Free Software Foundation, Inc.
> -   This file is part of the GNU C Library.
> -
> -   The GNU C Library is free software; you can redistribute it and/or
> -   modify it under the terms of the GNU Lesser General Public
> -   License as published by the Free Software Foundation; either
> -   version 2.1 of the License, or (at your option) any later version.
> -
> -   The GNU C Library is distributed in the hope that it will be useful,
> -   but WITHOUT ANY WARRANTY; without even the implied warranty of
> -   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> -   Lesser General Public License for more details.
> -
> -   You should have received a copy of the GNU Lesser General Public
> -   License along with the GNU C Library; if not, see
> -   <http://www.gnu.org/licenses/>.  */
> -
> -#include <sysdep.h>
> -#include <mips16-syscall.h>
> -
> -#undef __mips16_syscall6
> -
> -long long __nomips16
> -__mips16_syscall6 (long a0, long a1, long a2, long a3,
> -		   long a4, long a5,
> -		   long number)
> -{
> -  union __mips16_syscall_return ret;
> -  ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 6,
> -					a0, a1, a2, a3, a4, a5);
> -  return ret.val;
> -}
> diff --git a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall7.c b/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall7.c
> deleted file mode 100644
> index e1267616dc..0000000000
> --- a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall7.c
> +++ /dev/null
> @@ -1,33 +0,0 @@
> -/* MIPS16 syscall wrappers.
> -   Copyright (C) 2013-2017 Free Software Foundation, Inc.
> -   This file is part of the GNU C Library.
> -
> -   The GNU C Library is free software; you can redistribute it and/or
> -   modify it under the terms of the GNU Lesser General Public
> -   License as published by the Free Software Foundation; either
> -   version 2.1 of the License, or (at your option) any later version.
> -
> -   The GNU C Library is distributed in the hope that it will be useful,
> -   but WITHOUT ANY WARRANTY; without even the implied warranty of
> -   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> -   Lesser General Public License for more details.
> -
> -   You should have received a copy of the GNU Lesser General Public
> -   License along with the GNU C Library; if not, see
> -   <http://www.gnu.org/licenses/>.  */
> -
> -#include <sysdep.h>
> -#include <mips16-syscall.h>
> -
> -#undef __mips16_syscall7
> -
> -long long __nomips16
> -__mips16_syscall7 (long a0, long a1, long a2, long a3,
> -		   long a4, long a5, long a6,
> -		   long number)
> -{
> -  union __mips16_syscall_return ret;
> -  ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 7,
> -					a0, a1, a2, a3, a4, a5, a6);
> -  return ret.val;
> -}
> diff --git a/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h b/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
> index e9e3ee7e82..8e55538a5c 100644
> --- a/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
> +++ b/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
> @@ -49,9 +49,9 @@
>  /* Define a macro which expands into the inline wrapper code for a system
>     call.  */
>  #undef INLINE_SYSCALL
> -#define INLINE_SYSCALL(name, nr, args...)                               \
> +#define INLINE_SYSCALL(name, nr, ...)                                   \
>    ({ INTERNAL_SYSCALL_DECL (_sc_err);					\
> -     long result_var = INTERNAL_SYSCALL (name, _sc_err, nr, args);	\
> +     long result_var = INTERNAL_SYSCALL (name, _sc_err, nr, ## __VA_ARGS__); \
>       if ( INTERNAL_SYSCALL_ERROR_P (result_var, _sc_err) )		\
>         {								\
>  	 __set_errno (INTERNAL_SYSCALL_ERRNO (result_var, _sc_err));	\
> @@ -98,6 +98,19 @@
>  #undef INTERNAL_SYSCALL
>  #undef INTERNAL_SYSCALL_NCS
>  
> +long long __attribute__ ((nomips16)) __libc_do_syscall (long int, ...) attribute_hidden;

Same as before, I think it is worth adding a comment:

/* The MIPS kernel ABI requires that, for internal_syscall5/6/7, the 5th
   and following arguments be on the stack.  To avoid trying to force a
   stack allocation (which the compiler may optimize away), the macros use
   an auxiliary function to actually issue the syscall.  */
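
As a rough illustration of how the auxiliary function hands two values back, here is a minimal host-side sketch of the union trick. The names `do_syscall_return`, `mock_do_syscall` and `call_and_unpack` are hypothetical, fixed-width types replace `long`/`long long` so the sketch behaves the same on any host, and the mock only fabricates a result instead of issuing a syscall.

```c
#include <stdint.h>

/* Hypothetical stand-in for the patch's union __libc_do_syscall_return.  */
union do_syscall_return
{
  int64_t val;
  struct
  {
    int32_t v0;   /* syscall result */
    int32_t v1;   /* error flag, copied from a3 by the real stub */
  } reg;
};

/* Mock helper playing the role of the out-of-line stub.  A real stub
   would execute the syscall instruction; this one invents a result.  */
static int64_t
mock_do_syscall (int32_t number, int32_t arg)
{
  union do_syscall_return r;
  r.reg.v0 = number + arg;
  r.reg.v1 = 0;
  return r.val;
}

/* Caller-side unpacking, mirroring the shape of the new macros:
   one 64-bit return value is split back into result and error flag.  */
static int32_t
call_and_unpack (int32_t number, int32_t arg, int32_t *err)
{
  union do_syscall_return r;
  r.val = mock_do_syscall (number, arg);
  *err = r.reg.v1;
  return r.reg.v0;
}
```

The point of the design, as far as I understand it, is that a single function return carries both the result and the error flag, so the caller needs no inline asm of its own.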

> +
> +union __libc_do_syscall_return
> +  {
> +    long long val;
> +    struct
> +      {
> +	long v0;
> +	long v1;
> +      }
> +    reg;
> +  };
> +
>  #ifdef __mips16
>  /* There's no MIPS16 syscall instruction, so we go through out-of-line
>     standard MIPS wrappers.  These do use inline snippets below though,
> @@ -107,13 +120,16 @@
>  
>  # include <mips16-syscall.h>
>  
> -# define INTERNAL_SYSCALL(name, err, nr, args...)			\
> -	INTERNAL_SYSCALL_NCS (SYS_ify (name), err, nr, args)
> +# define INTERNAL_SYSCALL(name, err, nr, ...)				\
> +	INTERNAL_SYSCALL_NCS (SYS_ify (name), err, nr, ## __VA_ARGS__)
>  
> -# define INTERNAL_SYSCALL_NCS(number, err, nr, args...)			\
> +# define INTERNAL_SYSCALL_NCS(number, err, nr, ...)			\
>  ({									\
> -	union __mips16_syscall_return _sc_ret;				\
> -	_sc_ret.val = __mips16_syscall##nr (args, number);		\
> +	union __libc_do_syscall_return _sc_ret;				\
> +	if (nr <= 4)							\
> +	  _sc_ret.val = __mips16_syscall##nr (__VA_ARGS__, number);	\
> +	else								\
> +	  _sc_ret.val = __libc_do_syscall (number, ## __VA_ARGS__);	\
>  	err = _sc_ret.reg.v1;						\
>  	_sc_ret.reg.v0;							\
>  })
> @@ -121,13 +137,13 @@
>  # define INTERNAL_SYSCALL_MIPS16(number, err, nr, args...)		\
>  	internal_syscall##nr ("lw\t%0, %2\n\t",				\
>  			      "R" (number),				\
> -			      0, err, args)
> +			      number, err, args)
>  
>  #else /* !__mips16 */
>  # define INTERNAL_SYSCALL(name, err, nr, args...)			\
>  	internal_syscall##nr ("li\t%0, %2\t\t\t# " #name "\n\t",	\
>  			      "IK" (SYS_ify (name)),			\
> -			      0, err, args)
> +			      SYS_ify(name), err, args)
>  
>  # define INTERNAL_SYSCALL_NCS(number, err, nr, args...)			\
>  	internal_syscall##nr (MOVE32 "\t%0, %2\n\t",			\
> @@ -136,6 +152,7 @@
>  
>  #endif /* !__mips16 */
>  
> +
>  #define internal_syscall0(v0_init, input, number, err, dummy...)	\
>  ({									\
>  	long _sys_result;						\
> @@ -262,110 +279,34 @@
>  	_sys_result;							\
>  })
>  
> -/* We need to use a frame pointer for the functions in which we
> -   adjust $sp around the syscall, or debug information and unwind
> -   information will be $sp relative and thus wrong during the syscall.  As
> -   of GCC 4.7, this is sufficient.  */
> -#define FORCE_FRAME_POINTER						\
> -  void *volatile __fp_force __attribute__ ((unused)) = alloca (4)
> -
>  #define internal_syscall5(v0_init, input, number, err,			\
>  			  arg1, arg2, arg3, arg4, arg5)			\
>  ({									\
> -	long _sys_result;						\
> -									\
> -	FORCE_FRAME_POINTER;						\
> -	{								\
> -	register long __s0 asm ("$16") __attribute__ ((unused))		\
> -	  = (number);							\
> -	register long __v0 asm ("$2");					\
> -	register long __a0 asm ("$4") = (long) (arg1);			\
> -	register long __a1 asm ("$5") = (long) (arg2);			\
> -	register long __a2 asm ("$6") = (long) (arg3);			\
> -	register long __a3 asm ("$7") = (long) (arg4);			\
> -	__asm__ volatile (						\
> -	".set\tnoreorder\n\t"						\
> -	"subu\t$29, 32\n\t"						\
> -	"sw\t%6, 16($29)\n\t"						\
> -	v0_init								\
> -	"syscall\n\t"							\
> -	"addiu\t$29, 32\n\t"						\
> -	".set\treorder"							\
> -	: "=r" (__v0), "+r" (__a3)					\
> -	: input, "r" (__a0), "r" (__a1), "r" (__a2),			\
> -	  "r" ((long) (arg5))						\
> -	: __SYSCALL_CLOBBERS);						\
> -	err = __a3;							\
> -	_sys_result = __v0;						\
> -	}								\
> -	_sys_result;							\
> +        union __libc_do_syscall_return _sys_result;			\
> +	_sys_result.val = __libc_do_syscall (number, arg1, arg2, arg3,	\
> +					     arg4, arg5);		\
> +	err = _sys_result.reg.v1;					\
> +	_sys_result.reg.v0;						\
>  })
>  
>  #define internal_syscall6(v0_init, input, number, err,			\
>  			  arg1, arg2, arg3, arg4, arg5, arg6)		\
>  ({									\
> -	long _sys_result;						\
> -									\
> -	FORCE_FRAME_POINTER;						\
> -	{								\
> -	register long __s0 asm ("$16") __attribute__ ((unused))		\
> -	  = (number);							\
> -	register long __v0 asm ("$2");					\
> -	register long __a0 asm ("$4") = (long) (arg1);			\
> -	register long __a1 asm ("$5") = (long) (arg2);			\
> -	register long __a2 asm ("$6") = (long) (arg3);			\
> -	register long __a3 asm ("$7") = (long) (arg4);			\
> -	__asm__ volatile (						\
> -	".set\tnoreorder\n\t"						\
> -	"subu\t$29, 32\n\t"						\
> -	"sw\t%6, 16($29)\n\t"						\
> -	"sw\t%7, 20($29)\n\t"						\
> -	v0_init								\
> -	"syscall\n\t"							\
> -	"addiu\t$29, 32\n\t"						\
> -	".set\treorder"							\
> -	: "=r" (__v0), "+r" (__a3)					\
> -	: input, "r" (__a0), "r" (__a1), "r" (__a2),			\
> -	  "r" ((long) (arg5)), "r" ((long) (arg6))			\
> -	: __SYSCALL_CLOBBERS);						\
> -	err = __a3;							\
> -	_sys_result = __v0;						\
> -	}								\
> -	_sys_result;							\
> +        union __libc_do_syscall_return _sys_result;			\
> +	_sys_result.val = __libc_do_syscall (number, arg1, arg2, arg3,	\
> +					     arg4, arg5, arg6);		\
> +	err = _sys_result.reg.v1;					\
> +	_sys_result.reg.v0;						\
>  })
>  
>  #define internal_syscall7(v0_init, input, number, err,			\
>  			  arg1, arg2, arg3, arg4, arg5, arg6, arg7)	\
>  ({									\
> -	long _sys_result;						\
> -									\
> -	FORCE_FRAME_POINTER;						\
> -	{								\
> -	register long __s0 asm ("$16") __attribute__ ((unused))		\
> -	  = (number);							\
> -	register long __v0 asm ("$2");					\
> -	register long __a0 asm ("$4") = (long) (arg1);			\
> -	register long __a1 asm ("$5") = (long) (arg2);			\
> -	register long __a2 asm ("$6") = (long) (arg3);			\
> -	register long __a3 asm ("$7") = (long) (arg4);			\
> -	__asm__ volatile (						\
> -	".set\tnoreorder\n\t"						\
> -	"subu\t$29, 32\n\t"						\
> -	"sw\t%6, 16($29)\n\t"						\
> -	"sw\t%7, 20($29)\n\t"						\
> -	"sw\t%8, 24($29)\n\t"						\
> -	v0_init								\
> -	"syscall\n\t"							\
> -	"addiu\t$29, 32\n\t"						\
> -	".set\treorder"							\
> -	: "=r" (__v0), "+r" (__a3)					\
> -	: input, "r" (__a0), "r" (__a1), "r" (__a2),			\
> -	  "r" ((long) (arg5)), "r" ((long) (arg6)), "r" ((long) (arg7))	\
> -	: __SYSCALL_CLOBBERS);						\
> -	err = __a3;							\
> -	_sys_result = __v0;						\
> -	}								\
> -	_sys_result;							\
> +        union __libc_do_syscall_return _sys_result;			\
> +	_sys_result.val = __libc_do_syscall (number, arg1, arg2, arg3,	\
> +					     arg4, arg5, arg6, arg7);	\
> +	err = _sys_result.reg.v1;					\
> +	_sys_result.reg.v0;						\
>  })
>  
>  #define __SYSCALL_CLOBBERS "$1", "$3", "$8", "$9", "$10", "$11", "$12", "$13", \
> 
> 


* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-17 21:20                               ` Aurelien Jarno
  2017-08-17 22:05                                 ` Adhemerval Zanella
@ 2017-08-17 22:34                                 ` Maciej W. Rozycki
  2017-08-18  7:16                                   ` Aurelien Jarno
  1 sibling, 1 reply; 53+ messages in thread
From: Maciej W. Rozycki @ 2017-08-17 22:34 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: Adhemerval Zanella, Joseph Myers, libc-alpha

On Thu, 17 Aug 2017, Aurelien Jarno wrote:

> I see no regression on mipsel o32. I have only lightly tested mips o32
> (the testsuite is still running). I haven't been able to fully compile
> mips16 due to the following error compiling dl-tunables.c:
> 
> /tmp/ccI2NMgJ.s: Assembler messages:
> /tmp/ccI2NMgJ.s:1376: Error: branch to a symbol in another ISA mode
> 
> It doesn't seem to be related to my changes.

 If it's a regression, then it probably is.  What compiler version?  The 
fix for missing trailing label annotation went in r242424, for GCC 7.  If 
it's in handcoded assembly OTOH, then the offending code has to be fixed.

 Send me the generated assembly and I'll tell you what the case is.  If 
it's due to an old buggy compiler and the branch target is really never 
reached, then you can use `-Wa,-mignore-branch-isa' as a workaround.

> diff --git a/sysdeps/unix/sysv/linux/mips/mips32/libc-do-syscall.S b/sysdeps/unix/sysv/linux/mips/mips32/libc-do-syscall.S
> new file mode 100644
> index 0000000000..c02f507008
> --- /dev/null
> +++ b/sysdeps/unix/sysv/linux/mips/mips32/libc-do-syscall.S
> @@ -0,0 +1,52 @@
> +/* Copyright (C) 2017 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library.  If not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#include <sys/asm.h>
> +#include <sysdep.h>
> +#include <asm/unistd.h>
> +#include <sgidefs.h>
> +
> +
> +/* long int __libc_do_syscall (long int, ...)  */
> +
> +#define FRAMESZ 32
> +
> +	.text
> +	.set    nomips16
> +	.hidden __libc_do_syscall
> +ENTRY(__libc_do_syscall)
> +	move    v0, a0
> +	move    a0, a1
> +	move    a1, a2
> +	move    a2, a3

 Please rearrange this as we previously discussed, with your idea to have 
the syscall number as the 4th argument.  This will save cycles at no cost.

> +	lw      a3, 16(sp)
> +	lw      t0, 20(sp)
> +	lw      t1, 24(sp)
> +	lw      t2, 28(sp)
> +	.set 	noreorder

 Why `.set noreorder'?

> +	PTR_SUBU sp, FRAMESZ
> +	cfi_adjust_cfa_offset (FRAMESZ)
> +	sw      t0, 16(sp)
> +	sw      t1, 20(sp)
> +	sw      t2, 24(sp)

 With the stub written solely in assembly, I wonder if we actually 
need to mess with $sp in the first place.  I think we can reuse the stack 
argument save area and shuffle the incoming arguments in place.  In C 
language terms this would be equivalent to reassigning their values in the 
callee, which is allowed by the language and IIUC does not require copying 
the arguments out (so e.g. -O0 code would do just that), so the compiler 
cannot assume the argument save area remains unclobbered after a function 
return and use the returning values for anything.
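
The C-language point above can be sketched in a few lines. This is only an illustration with hypothetical function names, not code from the patch: the callee overwrites its own parameters, which is legal, and the caller's variables are unaffected, so the caller has no basis for relying on whatever is left in the argument area after the call.

```c
/* Hypothetical callee: it overwrites its own parameters, much as an
   assembly stub might shuffle incoming argument slots in place.  */
static long
callee_clobbers_params (long a, long b)
{
  a = b;        /* legal: parameters are the callee's own objects */
  b = 0;
  return a + b;
}

/* Hypothetical caller: x and y keep their values across the call,
   because only the callee's copies were modified.  */
static long
caller_keeps_values (void)
{
  long x = 5, y = 7;
  long r = callee_clobbers_params (x, y);
  return x * 100 + y * 10 + (r == 7);
}
```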

 Perhaps we could have separate `__libc_do_syscall5', `__libc_do_syscall6' 
and `__libc_do_syscall7' stubs even, really minimal, with the only code 
required being to load $v0 from the last argument, i.e.:

ENTRY(__libc_do_syscall5)
	lw	v0, 16(sp)
	syscall
	move	v1, a3
	jr	ra
END(__libc_do_syscall5)

(and then $sp offsets of 20 and 24 for the other two)?  I'd withdraw any 
concerns about code complication I might have had so far then. :)

> +	syscall
> +	PTR_ADDU sp, FRAMESZ
> +	cfi_adjust_cfa_offset (-FRAMESZ)
> +	.set	reorder
> +	move    v1, a3
> +1:      ret

 Are you sure it builds?  There's no MIPS RET instruction; you meant `jr 
ra' presumably.  Also what is the `1' label for?  It'll prevent MOVE from 
being reordered by GAS into the JR's delay slot (in microMIPS assembly 
it'll get relaxed to JRC in that case).

 NB please use a tab rather than spaces to indent this instruction like 
the rest.

> diff --git a/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h b/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
> index e9e3ee7e82..8e55538a5c 100644
> --- a/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
> +++ b/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
> @@ -49,9 +49,9 @@
>  /* Define a macro which expands into the inline wrapper code for a system
>     call.  */
>  #undef INLINE_SYSCALL
> -#define INLINE_SYSCALL(name, nr, args...)                               \
> +#define INLINE_SYSCALL(name, nr, ...)                                   \
>    ({ INTERNAL_SYSCALL_DECL (_sc_err);					\
> -     long result_var = INTERNAL_SYSCALL (name, _sc_err, nr, args);	\
> +     long result_var = INTERNAL_SYSCALL (name, _sc_err, nr, ## __VA_ARGS__); \

 What's this change for (and likewise throughout)?

> @@ -136,6 +152,7 @@
>  
>  #endif /* !__mips16 */
>  
> +

 Extraneous new line?

> @@ -262,110 +279,34 @@
>  	_sys_result;							\
>  })
>  
> -/* We need to use a frame pointer for the functions in which we
> -   adjust $sp around the syscall, or debug information and unwind
> -   information will be $sp relative and thus wrong during the syscall.  As
> -   of GCC 4.7, this is sufficient.  */
> -#define FORCE_FRAME_POINTER						\
> -  void *volatile __fp_force __attribute__ ((unused)) = alloca (4)
> -
>  #define internal_syscall5(v0_init, input, number, err,			\
>  			  arg1, arg2, arg3, arg4, arg5)			\
>  ({									\
> -	long _sys_result;						\
> -									\
> -	FORCE_FRAME_POINTER;						\
> -	{								\
> -	register long __s0 asm ("$16") __attribute__ ((unused))		\
> -	  = (number);							\
> -	register long __v0 asm ("$2");					\
> -	register long __a0 asm ("$4") = (long) (arg1);			\
> -	register long __a1 asm ("$5") = (long) (arg2);			\
> -	register long __a2 asm ("$6") = (long) (arg3);			\
> -	register long __a3 asm ("$7") = (long) (arg4);			\
> -	__asm__ volatile (						\
> -	".set\tnoreorder\n\t"						\
> -	"subu\t$29, 32\n\t"						\
> -	"sw\t%6, 16($29)\n\t"						\
> -	v0_init								\
> -	"syscall\n\t"							\
> -	"addiu\t$29, 32\n\t"						\
> -	".set\treorder"							\
> -	: "=r" (__v0), "+r" (__a3)					\
> -	: input, "r" (__a0), "r" (__a1), "r" (__a2),			\
> -	  "r" ((long) (arg5))						\
> -	: __SYSCALL_CLOBBERS);						\
> -	err = __a3;							\
> -	_sys_result = __v0;						\
> -	}								\
> -	_sys_result;							\
> +        union __libc_do_syscall_return _sys_result;			\

 Indentation.  Same with the repeated line throughout.

 Otherwise it looks reasonable to me.

  Maciej


* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-17 22:34                                 ` Maciej W. Rozycki
@ 2017-08-18  7:16                                   ` Aurelien Jarno
  2017-08-18  9:32                                     ` Maciej W. Rozycki
  0 siblings, 1 reply; 53+ messages in thread
From: Aurelien Jarno @ 2017-08-18  7:16 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Adhemerval Zanella, Joseph Myers, libc-alpha

On 2017-08-17 23:34, Maciej W. Rozycki wrote:
> On Thu, 17 Aug 2017, Aurelien Jarno wrote:
> 
> > I see no regression on mipsel o32. I have only lightly tested mips o32
> > (the testsuite is still running). I haven't been able to fully compile
> > mips16 due to the following error compiling dl-tunables.c:
> > 
> > /tmp/ccI2NMgJ.s: Assembler messages:
> > /tmp/ccI2NMgJ.s:1376: Error: branch to a symbol in another ISA mode
> > 
> > It doesn't seem to be related to my changes.
> 
>  If it's a regression, then it probably is.  What compiler version?  The 
> fix for missing trailing label annotation went in r242424, for GCC 7.  If 
> it's in handcoded assembly OTOH, then the offending code has to be fixed.

I am using GCC 6, so if the fix went into GCC 7, it is expected that the
issue is present.

 
> > diff --git a/sysdeps/unix/sysv/linux/mips/mips32/libc-do-syscall.S b/sysdeps/unix/sysv/linux/mips/mips32/libc-do-syscall.S
> > new file mode 100644
> > index 0000000000..c02f507008
> > --- /dev/null
> > +++ b/sysdeps/unix/sysv/linux/mips/mips32/libc-do-syscall.S
> > @@ -0,0 +1,52 @@
> > +/* Copyright (C) 2017 Free Software Foundation, Inc.
> > +   This file is part of the GNU C Library.
> > +
> > +   The GNU C Library is free software; you can redistribute it and/or
> > +   modify it under the terms of the GNU Lesser General Public
> > +   License as published by the Free Software Foundation; either
> > +   version 2.1 of the License, or (at your option) any later version.
> > +
> > +   The GNU C Library is distributed in the hope that it will be useful,
> > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > +   Lesser General Public License for more details.
> > +
> > +   You should have received a copy of the GNU Lesser General Public
> > +   License along with the GNU C Library.  If not, see
> > +   <http://www.gnu.org/licenses/>.  */
> > +
> > +#include <sys/asm.h>
> > +#include <sysdep.h>
> > +#include <asm/unistd.h>
> > +#include <sgidefs.h>
> > +
> > +
> > +/* long int __libc_do_syscall (long int, ...)  */
> > +
> > +#define FRAMESZ 32
> > +
> > +	.text
> > +	.set    nomips16
> > +	.hidden __libc_do_syscall
> > +ENTRY(__libc_do_syscall)
> > +	move    v0, a0
> > +	move    a0, a1
> > +	move    a1, a2
> > +	move    a2, a3
> 
>  Please rearrange this as we previously discussed, with your idea to have 
> the syscall number as the 4th argument.  This will save cycles at no cost.
> 
> > +	lw      a3, 16(sp)
> > +	lw      t0, 20(sp)
> > +	lw      t1, 24(sp)
> > +	lw      t2, 28(sp)
> > +	.set 	noreorder
> 
>  Why `.set noreorder'?

It comes from Adhemerval's patch, and I guess it originates from the
inline asm code, which uses noreorder/reorder around the syscall.

> > +	PTR_SUBU sp, FRAMESZ
> > +	cfi_adjust_cfa_offset (FRAMESZ)
> > +	sw      t0, 16(sp)
> > +	sw      t1, 20(sp)
> > +	sw      t2, 24(sp)
> 
>  With the stub written solely in assembly only I wonder if we actually 
> need to mess with $sp in the first place.  I think we can reuse the stack 
> argument save area and shuffle the incoming arguments in place.  In C 
> language terms this would be equivalent to reassigning their values in the 
> callee, which is allowed by the language and IIUC does not require copying 
> the arguments out (so e.g. -O0 code would do just that), so the compiler 
> cannot assume the argument save area remains unclobbered after a function 
> return and use the returning values for anything.
> 
>  Perhaps we could have separate `__libc_do_syscall5', `__libc_do_syscall6' 
> and `__libc_do_syscall7' stubs even, really minimal, with the only code 
> required being to load $v0 from the last argument, i.e.:
> 
> ENTRY(__libc_do_syscall5)
> 	lw	v0, 16(sp)
> 	syscall
> 	move	v1, a3
> 	jr	ra
> END(__libc_do_syscall5)
> 
> (and then $sp offsets of 20 and 24 for the other two)?  I'd withdraw any 
> concerns about code complication I might have had so far then. :)

That's an interesting idea. If we use a different stub depending on the
number of arguments, we can actually pass the syscall number last, which
is probably more readable. Could this also be used for mips16 in all cases?

> > +	syscall
> > +	PTR_ADDU sp, FRAMESZ
> > +	cfi_adjust_cfa_offset (-FRAMESZ)
> > +	.set	reorder
> > +	move    v1, a3
> > +1:      ret
> 
>  Are you sure it builds?  There's no MIPS RET instruction; you meant `jr 
> ra' presumably.  Also what is the `1' label for?  It'll prevent MOVE from 
> being reordered by GAS into the JR's delay slot (in microMIPS assembly 
> it'll get relaxed to JRC in that case).
> 
>  NB please use a tab rather than spaces to indent this instruction like 
> the rest.

I actually noticed the issue of the delay slot not being used, which is
why I tried an alternative way to end the function. Using ret compiles and
is used in various glibc places (e.g. mips64/syscall.S). I'll fix that.

> > diff --git a/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h b/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
> > index e9e3ee7e82..8e55538a5c 100644
> > --- a/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
> > +++ b/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
> > @@ -49,9 +49,9 @@
> >  /* Define a macro which expands into the inline wrapper code for a system
> >     call.  */
> >  #undef INLINE_SYSCALL
> > -#define INLINE_SYSCALL(name, nr, args...)                               \
> > +#define INLINE_SYSCALL(name, nr, ...)                                   \
> >    ({ INTERNAL_SYSCALL_DECL (_sc_err);					\
> > -     long result_var = INTERNAL_SYSCALL (name, _sc_err, nr, args);	\
> > +     long result_var = INTERNAL_SYSCALL (name, _sc_err, nr, ## __VA_ARGS__); \
> 
>  What's this change for (and likewise throughout)?

This change is needed to be able to call __libc_do_syscall in the mips16
code. Without it, syscalls without arguments end up with an empty value
after the comma. Note that the generic sysdep.h has already been changed
to use C99 variadic macros.
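
To make the trailing-comma problem concrete, here is a small sketch. `mock_syscall` and `MOCK_SYSCALL` are hypothetical names, and the function just sums its arguments rather than doing any real work. It relies on the GNU `## __VA_ARGS__` extension, which drops the preceding comma when no variadic arguments are given; with plain `__VA_ARGS__`, the zero-argument use would expand to `mock_syscall ((10), (0), )` and fail to compile.

```c
#include <stdarg.h>

/* Hypothetical variadic target: adds NR trailing long arguments to
   NUMBER.  It only stands in for a function taking (number, ...).  */
static long
mock_syscall (long number, int nr, ...)
{
  va_list ap;
  long sum = number;
  va_start (ap, nr);
  for (int i = 0; i < nr; i++)
    sum += va_arg (ap, long);
  va_end (ap);
  return sum;
}

/* GNU extension: "## __VA_ARGS__" removes the comma before it when the
   variadic part is empty, so the zero-argument case still expands to a
   valid call.  */
#define MOCK_SYSCALL(number, nr, ...) \
  mock_syscall ((number), (nr), ## __VA_ARGS__)
```

For example, `MOCK_SYSCALL (10, 0)` and `MOCK_SYSCALL (10, 2, 1L, 2L)` both expand to well-formed calls.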

Now if we use a different stub depending on the number of arguments, we
can probably get rid of that.

> > @@ -136,6 +152,7 @@
> >  
> >  #endif /* !__mips16 */
> >  
> > +
> 
>  Extraneous new line?

Ack.

> > @@ -262,110 +279,34 @@
> >  	_sys_result;							\
> >  })
> >  
> > -/* We need to use a frame pointer for the functions in which we
> > -   adjust $sp around the syscall, or debug information and unwind
> > -   information will be $sp relative and thus wrong during the syscall.  As
> > -   of GCC 4.7, this is sufficient.  */
> > -#define FORCE_FRAME_POINTER						\
> > -  void *volatile __fp_force __attribute__ ((unused)) = alloca (4)
> > -
> >  #define internal_syscall5(v0_init, input, number, err,			\
> >  			  arg1, arg2, arg3, arg4, arg5)			\
> >  ({									\
> > -	long _sys_result;						\
> > -									\
> > -	FORCE_FRAME_POINTER;						\
> > -	{								\
> > -	register long __s0 asm ("$16") __attribute__ ((unused))		\
> > -	  = (number);							\
> > -	register long __v0 asm ("$2");					\
> > -	register long __a0 asm ("$4") = (long) (arg1);			\
> > -	register long __a1 asm ("$5") = (long) (arg2);			\
> > -	register long __a2 asm ("$6") = (long) (arg3);			\
> > -	register long __a3 asm ("$7") = (long) (arg4);			\
> > -	__asm__ volatile (						\
> > -	".set\tnoreorder\n\t"						\
> > -	"subu\t$29, 32\n\t"						\
> > -	"sw\t%6, 16($29)\n\t"						\
> > -	v0_init								\
> > -	"syscall\n\t"							\
> > -	"addiu\t$29, 32\n\t"						\
> > -	".set\treorder"							\
> > -	: "=r" (__v0), "+r" (__a3)					\
> > -	: input, "r" (__a0), "r" (__a1), "r" (__a2),			\
> > -	  "r" ((long) (arg5))						\
> > -	: __SYSCALL_CLOBBERS);						\
> > -	err = __a3;							\
> > -	_sys_result = __v0;						\
> > -	}								\
> > -	_sys_result;							\
> > +        union __libc_do_syscall_return _sys_result;			\
> 
>  Indentation.  Same with the repeated line throughout.

Ack.

Note that in the meantime the testsuite finished on mips o32, and there
is no regression.

Aurelien

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net


* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-18  7:16                                   ` Aurelien Jarno
@ 2017-08-18  9:32                                     ` Maciej W. Rozycki
  2017-08-18 17:45                                       ` Aurelien Jarno
  0 siblings, 1 reply; 53+ messages in thread
From: Maciej W. Rozycki @ 2017-08-18  9:32 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: Adhemerval Zanella, Joseph Myers, libc-alpha

On Fri, 18 Aug 2017, Aurelien Jarno wrote:

> >  If it's a regression, then it probably is.  What compiler version?  The 
> > fix for missing trailing label annotation went in r242424, for GCC 7.  If 
> > it's in handcoded assembly OTOH, then the offending code has to be fixed.
> 
> I am using GCC 6, so if the fix went in GCC 7, that's normal the issue
> is present.

 OK then; you can use the workaround I suggested to verify MIPS16 
compilation then.

> > > +	lw      a3, 16(sp)
> > > +	lw      t0, 20(sp)
> > > +	lw      t1, 24(sp)
> > > +	lw      t2, 28(sp)
> > > +	.set 	noreorder
> > 
> >  Why `.set noreorder'?
> 
> It comes from Adhemerval's patch, and I guess it comes from the asm code
> which uses noreorder/reorder around the syscall.

 Indeed, though there doesn't seem to be a good reason for it to be there 
in the first place.  Overall our MIPS port uses `.set noreorder' in 
several places where there is no justification for that; however, I never 
got to cleaning this up.  It should only ever be used for manual 
delay-slot filling, in cases such as where there is a data anti-dependency 
between the two instructions involved.  That is clearly not the case here.

> >  Perhaps we could have separate `__libc_do_syscall5', `__libc_do_syscall6' 
> > and `__libc_do_syscall7' stubs even, really minimal, with the only code 
> > required being to load $v0 from the last argument, i.e.:
> > 
> > ENTRY(__libc_do_syscall5)
> > 	lw	v0, 16(sp)
> > 	syscall
> > 	move	v1, a3
> > 	jr	ra
> > END(__libc_do_syscall5)
> > 
> > (and then $sp offsets of 20 and 24 for the other two)?  I'd withdraw any 
> > concerns about code complication I might have had so far then. :)
> 
> That's an interesting idea. If we use a different stub depending on the
> number of arguments, we can actually pass the syscall number last, which
> is probably more readable. Could also be used for mips16 in all cases?

 MIPS16 wrappers do that already, which is also why there is an individual 
one for each syscall argument count.

> > > +	syscall
> > > +	PTR_ADDU sp, FRAMESZ
> > > +	cfi_adjust_cfa_offset (-FRAMESZ)
> > > +	.set	reorder
> > > +	move    v1, a3
> > > +1:      ret
> > 
> >  Are you sure it builds?  There's no MIPS RET instruction; you meant `jr 
> > ra' presumably.  Also what is the `1' label for?  It'll prevent MOVE from 
> > being reordered by GAS into the JR's delay slot (in microMIPS assembly 
> > it'll get relaxed to JRC in that case).
> > 
> >  NB please use a tab rather than spaces to indent this instruction like 
> > the rest.
> 
> I actually noticed the issue of the delay slot not being used, that's
> why I tried alternative way to end the function. Using ret compiles and
> is used in various glibc places (e.g. mips64/syscall.S). I'll fix that.

 Ah, that comes from sysdeps/unix/mips/sysdep.h where `ret' is defined as:

#define ret	j ra ; nop

i.e. with useless delay slot filling wasting an instruction, whether in 
`.set noreorder' code or not.  There are ret_NOERRNO and ret_ERRVAL 
alternatives as well, all the same.  This stuff doesn't make sense to me 
and while a clean-up belongs to a separate change I don't think we want 
to continue using these macros in new code.

> > > diff --git a/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h b/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
> > > index e9e3ee7e82..8e55538a5c 100644
> > > --- a/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
> > > +++ b/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
> > > @@ -49,9 +49,9 @@
> > >  /* Define a macro which expands into the inline wrapper code for a system
> > >     call.  */
> > >  #undef INLINE_SYSCALL
> > > -#define INLINE_SYSCALL(name, nr, args...)                               \
> > > +#define INLINE_SYSCALL(name, nr, ...)                                   \
> > >    ({ INTERNAL_SYSCALL_DECL (_sc_err);					\
> > > -     long result_var = INTERNAL_SYSCALL (name, _sc_err, nr, args);	\
> > > +     long result_var = INTERNAL_SYSCALL (name, _sc_err, nr, ## __VA_ARGS__); \
> > 
> >  What's this change for (and likewise throughout)?
> 
> This change is need to be able to call __libc_do_syscall in the mips16
> code. Without it, syscalls without arguments end up with an empty value
> after the comma. Note that the generic sysdep.h has already been changed
> to use C99 variadic macros.

 OK, this would qualify for a separate preparatory change then.
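
 As a hedged illustration of the empty-argument problem discussed above
(a standalone sketch, not glibc code; `sum_args' and `CALL_SUM' are
hypothetical names), the GNU `## __VA_ARGS__' form drops the preceding
comma when the variadic part is omitted, so a zero-argument call still
expands to valid C:

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical helper: adds up NR trailing long arguments.  */
static long
sum_args (int nr, ...)
{
  va_list ap;
  long total = 0;

  assert (nr >= 0);
  va_start (ap, nr);
  for (int i = 0; i < nr; i++)
    total += va_arg (ap, long);
  va_end (ap);
  return total;
}

/* With a plain `, __VA_ARGS__' a call such as CALL_SUM (0) would expand
   to `sum_args (0, )', which does not compile.  The GNU `##' extension
   (accepted by GCC and Clang) removes the comma when the variadic
   arguments are omitted entirely.  */
#define CALL_SUM(nr, ...) sum_args (nr, ## __VA_ARGS__)
```

 With this definition both `CALL_SUM (0)' and `CALL_SUM (2, 3L, 4L)'
compile; the latter adds its two arguments.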

> Now if we use a different stub depending on the number of arguments, we
> can probably get rid of that.

 Ack.

  Maciej

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-18  9:32                                     ` Maciej W. Rozycki
@ 2017-08-18 17:45                                       ` Aurelien Jarno
  2017-08-18 22:27                                         ` Maciej W. Rozycki
  0 siblings, 1 reply; 53+ messages in thread
From: Aurelien Jarno @ 2017-08-18 17:45 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Adhemerval Zanella, Joseph Myers, libc-alpha

On 2017-08-18 10:32, Maciej W. Rozycki wrote:
> On Fri, 18 Aug 2017, Aurelien Jarno wrote:
> 
> > >  If it's a regression, then it probably is.  What compiler version?  The 
> > > fix for missing trailing label annotation went in r242424, for GCC 7.  If 
> > > it's in handcoded assembly OTOH, then the offending code has to be fixed.
> > 
> > I am using GCC 6, so if the fix went into GCC 7, it's expected that the
> > issue is present.
> 
>  OK then; you can use the workaround I suggested to verify MIPS16 
> compilation then.

The workaround didn't work. That said, building with GCC 7 fixes the
issue.

> > >  Perhaps we could have separate `__libc_do_syscall5', `__libc_do_syscall6' 
> > > and `__libc_do_syscall7' stubs even, really minimal, with the only code 
> > > required being to load $v0 from the last argument, i.e.:
> > > 
> > > ENTRY(__libc_do_syscall5)
> > > 	lw	v0, 16(sp)
> > > 	syscall
> > > 	move	v1, a3
> > > 	jr	ra
> > > END(__libc_do_syscall5)
> > > 
> > > (and then $sp offsets of 20 and 24 for the other two)?  I'd withdraw any 
> > > concerns about code complication I might have had so far then. :)
> > 
> > That's an interesting idea. If we use a different stub depending on the
> > number of arguments, we can actually pass the syscall number last, which
> > is probably more readable. Could also be used for mips16 in all cases?
> 
>  MIPS16 wrappers do that already, which is also why there is an individual 
> one for each syscall argument count.

Please find below a new patch implementing that. Getting the MIPS16-related
defines that build the equivalent code through GCC to work was becoming
complicated, so I decided to also implement __libc_do_syscall0 to
__libc_do_syscall4 in libc-do-syscall.S. I looked at the code originally
generated by GCC: it is very similar to what I used, only sometimes a bit
longer (GCC sometimes saves the syscall number to the stack and reloads it
right afterwards).
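
To make the offsets used by the stubs easier to check, here is a small C
sketch (a hypothetical helper, not part of the patch) of where the
trailing syscall number ends up on entry to a stub. It assumes the o32
calling convention with word-sized arguments only: the first four words
travel in a0-a3, and later words go on the stack after the 16-byte area
reserved for the register arguments.

```c
#include <assert.h>

enum
{
  O32_REG_ARGS = 4,   /* Words passed in a0..a3.  */
  O32_WORD_SIZE = 4   /* Bytes per argument slot.  */
};

/* For a stub taking NARGS syscall arguments followed by the syscall
   number, return the $sp-relative byte offset at which the number is
   found on entry, or -1 if it arrives in a register (a0..a3).  */
static int
number_stack_offset (int nargs)
{
  assert (nargs >= 0 && nargs <= 7);
  if (nargs < O32_REG_ARGS)
    return -1;
  /* The number is argument word NARGS (counting from 0); slots 0..3
     are the home area for a0..a3, so slot N lives at N * 4 bytes.  */
  return nargs * O32_WORD_SIZE;
}
```

This yields 16, 20, 24 and 28 for four to seven arguments, which matches
the `lw v0, N(sp)' instructions in libc-do-syscall.S in the patch.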

I have compiled and tested it on mips O32, both little and big endian, and
found no regressions. As expected, it fixes nptl/tst-rwlock15. I have also
compiled it on mips16 O32 little endian, but I haven't tested it beyond
running ld.so and libc.so under QEMU.
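
Why the same stubs serve both endiannesses can be sketched in portable C
(an illustration only; the type and function names are hypothetical, and
fixed-width types stand in for the 32-bit `long' of o32). As I understand
the o32 convention, a `long long' return value comes back in the v0/v1
pair in memory order, so overlaying it with a two-word struct recovers v0
and v1 without any shifts:

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for union __libc_do_syscall_return, using fixed-width types
   so that the layout is the same on a 64-bit host.  */
union syscall_return
{
  int64_t val;
  struct
  {
    int32_t v0;   /* Syscall result.  */
    int32_t v1;   /* Error flag (copied from a3 by the stub).  */
  } reg;
};

/* Hypothetical model of a stub: packs the two registers the way the
   return value is laid out in memory.  */
static int64_t
fake_stub (int32_t result, int32_t error_flag)
{
  union syscall_return r;

  /* Both views must cover the same 8 bytes.  */
  assert (sizeof r == sizeof r.val);
  r.reg.v0 = result;
  r.reg.v1 = error_flag;
  return r.val;
}

/* Caller side, mirroring the internal_syscallN macros: store the
   error flag and hand back the result.  */
static int32_t
unpack (int64_t packed, int32_t *err)
{
  union syscall_return r;

  r.val = packed;
  *err = r.reg.v1;
  return r.reg.v0;
}
```

The round trip through `val' is independent of host byte order, because
both sides go through the same memory layout.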

Changelog:

2017-08-18  Adhemerval Zanella  <adhemerval.zanella@linaro.org>
	    Aurelien Jarno <aurelien@aurel32.net>

	[BZ #21956]
	* sysdeps/unix/sysv/linux/mips/mips32/Makefile [subdir = crypt]
	(libcrypt-sysdep_routines): Add libc-do-syscall.
	[subdir = elf] (sysdep-dl-routines): Likewise.
	[subdir = io] (sysdep_routines): Likewise.
	[subdir = nptl] (libpthread-sysdep_routines): Likewise.
	[subdir = nptl] (libpthread-shared-only-routines): Likewise.
	[subdir = nscd] (nscd-modules): Likewise.
	[subdir = nss] (libnss_db-sysdep_routines): Likewise.
	[subdir = nss] (libnss_db-shared-only-routines): Likewise.
	[subdir = resolv] (libanl-sysdep_routines): Likewise.
	[subdir = resolv] (libanl-shared-only-routines): Likewise.
	[subdir = rt] (librt-sysdep_routines): Likewise.
	[subdir = rt] (librt-shared-only-routines): Likewise.
	* sysdeps/unix/sysv/linux/mips/mips32/libc-do-syscall.S: New file.
	* sysdeps/unix/sysv/linux/mips/mips32/sysdep.h [!__mips16]
	(INTERNAL_SYSCALL): Make code unconditional.
	[!__mips16] (INTERNAL_SYSCALL_NCS): Likewise.
	[__mips16] (INTERNAL_SYSCALL): Remove.
	[__mips16] (INTERNAL_SYSCALL_NCS): Likewise.
	(__nomips16): Define.
	(__libc_do_syscall_return): Likewise.
	[__mips16] (__libc_do_syscall0): Declare.
	[__mips16] (internal_syscall0): Define.
	[__mips16] (__libc_do_syscall1): Declare.
	[__mips16] (internal_syscall1): Define.
	[__mips16] (__libc_do_syscall2): Declare.
	[__mips16] (internal_syscall2): Define.
	[__mips16] (__libc_do_syscall3): Declare.
	[__mips16] (internal_syscall3): Define.
	[__mips16] (__libc_do_syscall4): Declare.
	[__mips16] (internal_syscall4): Define.
	(internal_syscall0): Guard with !__mips16.
	(internal_syscall1): Guard with !__mips16.
	(internal_syscall2): Guard with !__mips16.
	(internal_syscall3): Guard with !__mips16.
	(internal_syscall4): Guard with !__mips16.
	(FORCE_FRAME_POINTER): Remove.
	(internal_syscall5): Rewrite to call __libc_do_syscall5.
	(internal_syscall6): Rewrite to call __libc_do_syscall6.
	(internal_syscall7): Rewrite to call __libc_do_syscall7.
	* sysdeps/unix/sysv/linux/mips/mips32/mips16/Makefile: Remove file.
	* sysdeps/unix/sysv/linux/mips/mips32/mips16/Versions: Likewise.
	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall.h:
	  Likewise.
	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall0.c:
	  Likewise.
	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall1.c:
	  Likewise.
	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall2.c:
	  Likewise.
	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall3.c:
	  Likewise.
	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall4.c:
	  Likewise.
	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall5.c:
	  Likewise.
	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall6.c:
	  Likewise.
	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall7.c:
	  Likewise.


diff --git a/sysdeps/unix/sysv/linux/mips/mips32/Makefile b/sysdeps/unix/sysv/linux/mips/mips32/Makefile
index 33b461500c..d0bb1fc200 100644
--- a/sysdeps/unix/sysv/linux/mips/mips32/Makefile
+++ b/sysdeps/unix/sysv/linux/mips/mips32/Makefile
@@ -1,8 +1,44 @@
+ifeq ($(subdir),crypt)
+libcrypt-sysdep_routines += libc-do-syscall
+endif
+
+ifeq ($(subdir),elf)
+sysdep-dl-routines += libc-do-syscall
+endif
+
 ifeq ($(subdir),conform)
 # For bugs 17786 and 21278.
 conformtest-xfail-conds += mips-o32-linux
 endif
 
+ifeq ($(subdir),io)
+sysdep_routines += libc-do-syscall
+endif
+
+ifeq ($(subdir),nptl)
+libpthread-sysdep_routines += libc-do-syscall
+libpthread-shared-only-routines += libc-do-syscall
+endif
+
+ifeq ($(subdir),nscd)
+nscd-modules += libc-do-syscall
+endif
+
+ifeq ($(subdir),nss)
+libnss_db-sysdep_routines += libc-do-syscall
+libnss_db-shared-only-routines += libc-do-syscall
+endif
+
+ifeq ($(subdir),resolv)
+libanl-sysdep_routines += libc-do-syscall
+libanl-shared-only-routines += libc-do-syscall
+endif
+
+ifeq ($(subdir),rt)
+librt-sysdep_routines += libc-do-syscall
+librt-shared-only-routines += libc-do-syscall
+endif
+
 ifeq ($(subdir),stdlib)
 tests += bug-getcontext-mips-gp
 endif
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/libc-do-syscall.S b/sysdeps/unix/sysv/linux/mips/mips32/libc-do-syscall.S
new file mode 100644
index 0000000000..e6777f6967
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/mips/mips32/libc-do-syscall.S
@@ -0,0 +1,105 @@
+/* Copyright (C) 2017 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include <sys/asm.h>
+
+	.text
+	.set    nomips16
+
+#ifdef __mips16
+
+/* long long __libc_do_syscall0 (long number)  */
+	.hidden __libc_do_syscall0
+ENTRY(__libc_do_syscall0)
+        move    v0, a0
+        syscall
+        move    v1, a3
+        jr      ra
+END(__libc_do_syscall0)
+
+
+/* long long __libc_do_syscall1 (long arg1, long number)  */
+	.hidden __libc_do_syscall1
+ENTRY(__libc_do_syscall1)
+        move    v0, a1
+        syscall
+        move    v1, a3
+        jr      ra
+END(__libc_do_syscall1)
+
+/* long long __libc_do_syscall2 (long arg1, long arg2, long number)  */
+	.hidden __libc_do_syscall2
+ENTRY(__libc_do_syscall2)
+        move    v0, a2
+        syscall
+        move    v1, a3
+        jr      ra
+END(__libc_do_syscall2)
+
+/* long long __libc_do_syscall3 (long arg1, long arg2, long arg3,
+				 long number)  */
+	.hidden __libc_do_syscall3
+ENTRY(__libc_do_syscall3)
+        move    v0, a3
+        syscall
+        move    v1, a3
+        jr      ra
+END(__libc_do_syscall3)
+
+/* long long __libc_do_syscall4 (long arg1, long arg2, long arg3, long arg4,
+				 long number)  */
+	.hidden __libc_do_syscall4
+ENTRY(__libc_do_syscall4)
+        lw      v0, 16(sp)
+        syscall
+        move    v1, a3
+        jr      ra
+END(__libc_do_syscall4)
+
+#endif /* __mips16 */
+
+/* long long __libc_do_syscall5 (long arg1, long arg2, long arg3, long arg4,
+				 long arg5, long number)  */
+	.hidden __libc_do_syscall5
+ENTRY(__libc_do_syscall5)
+        lw      v0, 20(sp)
+        syscall
+        move    v1, a3
+        jr      ra
+END(__libc_do_syscall5)
+
+/* long long __libc_do_syscall6 (long arg1, long arg2, long arg3, long arg4,
+				 long arg5, long arg6, long number)  */
+	.hidden __libc_do_syscall6
+ENTRY(__libc_do_syscall6)
+        lw      v0, 24(sp)
+        syscall
+        move    v1, a3
+        jr      ra
+END(__libc_do_syscall6)
+
+/* long long __libc_do_syscall7 (long arg1, long arg2, long arg3, long arg4,
+				 long arg5, long arg6, long arg7,
+				 long number)  */
+	.hidden __libc_do_syscall7
+ENTRY(__libc_do_syscall7)
+        lw      v0, 28(sp)
+        syscall
+        move    v1, a3
+        jr      ra
+END(__libc_do_syscall7)
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/mips16/Makefile b/sysdeps/unix/sysv/linux/mips/mips32/mips16/Makefile
deleted file mode 100644
index fa9fcb7e6f..0000000000
--- a/sysdeps/unix/sysv/linux/mips/mips32/mips16/Makefile
+++ /dev/null
@@ -1,13 +0,0 @@
-ifeq ($(subdir),misc)
-sysdep_routines += mips16-syscall0 mips16-syscall1 mips16-syscall2
-sysdep_routines += mips16-syscall3 mips16-syscall4 mips16-syscall5
-sysdep_routines += mips16-syscall6 mips16-syscall7
-CFLAGS-mips16-syscall0.c += -fexceptions
-CFLAGS-mips16-syscall1.c += -fexceptions
-CFLAGS-mips16-syscall2.c += -fexceptions
-CFLAGS-mips16-syscall3.c += -fexceptions
-CFLAGS-mips16-syscall4.c += -fexceptions
-CFLAGS-mips16-syscall5.c += -fexceptions
-CFLAGS-mips16-syscall6.c += -fexceptions
-CFLAGS-mips16-syscall7.c += -fexceptions
-endif
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/mips16/Versions b/sysdeps/unix/sysv/linux/mips/mips32/mips16/Versions
deleted file mode 100644
index 73bcfb566c..0000000000
--- a/sysdeps/unix/sysv/linux/mips/mips32/mips16/Versions
+++ /dev/null
@@ -1,6 +0,0 @@
-libc {
-  GLIBC_PRIVATE {
-    __mips16_syscall0; __mips16_syscall1; __mips16_syscall2; __mips16_syscall3;
-    __mips16_syscall4; __mips16_syscall5; __mips16_syscall6; __mips16_syscall7;
-  }
-}
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall.h b/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall.h
deleted file mode 100644
index 880e9908e8..0000000000
--- a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall.h
+++ /dev/null
@@ -1,89 +0,0 @@
-/* MIPS16 syscall wrappers.
-   Copyright (C) 2013-2017 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <http://www.gnu.org/licenses/>.  */
-
-#ifndef MIPS16_SYSCALL_H
-#define MIPS16_SYSCALL_H 1
-
-#define __nomips16 __attribute__ ((nomips16))
-
-union __mips16_syscall_return
-  {
-    long long val;
-    struct
-      {
-	long v0;
-	long v1;
-      }
-    reg;
-  };
-
-long long __nomips16 __mips16_syscall0 (long number);
-#define __mips16_syscall0(dummy, number)				\
-	__mips16_syscall0 ((long) (number))
-
-long long __nomips16 __mips16_syscall1 (long a0,
-					long number);
-#define __mips16_syscall1(a0, number)					\
-	__mips16_syscall1 ((long) (a0),					\
-			   (long) (number))
-
-long long __nomips16 __mips16_syscall2 (long a0, long a1,
-					long number);
-#define __mips16_syscall2(a0, a1, number)				\
-	__mips16_syscall2 ((long) (a0), (long) (a1),			\
-			   (long) (number))
-
-long long __nomips16 __mips16_syscall3 (long a0, long a1, long a2,
-					long number);
-#define __mips16_syscall3(a0, a1, a2, number)				\
-	__mips16_syscall3 ((long) (a0), (long) (a1), (long) (a2),	\
-			   (long) (number))
-
-long long __nomips16 __mips16_syscall4 (long a0, long a1, long a2, long a3,
-					long number);
-#define __mips16_syscall4(a0, a1, a2, a3, number)			\
-	__mips16_syscall4 ((long) (a0), (long) (a1), (long) (a2),	\
-			   (long) (a3),					\
-			   (long) (number))
-
-long long __nomips16 __mips16_syscall5 (long a0, long a1, long a2, long a3,
-					long a4,
-					long number);
-#define __mips16_syscall5(a0, a1, a2, a3, a4, number)			\
-	__mips16_syscall5 ((long) (a0), (long) (a1), (long) (a2),	\
-			   (long) (a3), (long) (a4),			\
-			   (long) (number))
-
-long long __nomips16 __mips16_syscall6 (long a0, long a1, long a2, long a3,
-					long a4, long a5,
-					long number);
-#define __mips16_syscall6(a0, a1, a2, a3, a4, a5, number)		\
-	__mips16_syscall6 ((long) (a0), (long) (a1), (long) (a2),	\
-			   (long) (a3), (long) (a4), (long) (a5),	\
-			   (long) (number))
-
-long long __nomips16 __mips16_syscall7 (long a0, long a1, long a2, long a3,
-					long a4, long a5, long a6,
-					long number);
-#define __mips16_syscall7(a0, a1, a2, a3, a4, a5, a6, number)		\
-	__mips16_syscall7 ((long) (a0), (long) (a1), (long) (a2),	\
-			   (long) (a3), (long) (a4), (long) (a5),	\
-			   (long) (a6),					\
-			   (long) (number))
-
-#endif
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall0.c b/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall0.c
deleted file mode 100644
index 490245b34e..0000000000
--- a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall0.c
+++ /dev/null
@@ -1,30 +0,0 @@
-/* MIPS16 syscall wrappers.
-   Copyright (C) 2013-2017 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <http://www.gnu.org/licenses/>.  */
-
-#include <sysdep.h>
-#include <mips16-syscall.h>
-
-#undef __mips16_syscall0
-
-long long __nomips16
-__mips16_syscall0 (long number)
-{
-  union __mips16_syscall_return ret;
-  ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 0);
-  return ret.val;
-}
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall1.c b/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall1.c
deleted file mode 100644
index 3061e8accb..0000000000
--- a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall1.c
+++ /dev/null
@@ -1,32 +0,0 @@
-/* MIPS16 syscall wrappers.
-   Copyright (C) 2013-2017 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <http://www.gnu.org/licenses/>.  */
-
-#include <sysdep.h>
-#include <mips16-syscall.h>
-
-#undef __mips16_syscall1
-
-long long __nomips16
-__mips16_syscall1 (long a0,
-		   long number)
-{
-  union __mips16_syscall_return ret;
-  ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 1,
-					a0);
-  return ret.val;
-}
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall2.c b/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall2.c
deleted file mode 100644
index 440a4ed285..0000000000
--- a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall2.c
+++ /dev/null
@@ -1,32 +0,0 @@
-/* MIPS16 syscall wrappers.
-   Copyright (C) 2013-2017 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <http://www.gnu.org/licenses/>.  */
-
-#include <sysdep.h>
-#include <mips16-syscall.h>
-
-#undef __mips16_syscall2
-
-long long __nomips16
-__mips16_syscall2 (long a0, long a1,
-		   long number)
-{
-  union __mips16_syscall_return ret;
-  ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 2,
-					a0, a1);
-  return ret.val;
-}
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall3.c b/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall3.c
deleted file mode 100644
index c3f83fc1f6..0000000000
--- a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall3.c
+++ /dev/null
@@ -1,32 +0,0 @@
-/* MIPS16 syscall wrappers.
-   Copyright (C) 2013-2017 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <http://www.gnu.org/licenses/>.  */
-
-#include <sysdep.h>
-#include <mips16-syscall.h>
-
-#undef __mips16_syscall3
-
-long long __nomips16
-__mips16_syscall3 (long a0, long a1, long a2,
-		   long number)
-{
-  union __mips16_syscall_return ret;
-  ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 3,
-					a0, a1, a2);
-  return ret.val;
-}
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall4.c b/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall4.c
deleted file mode 100644
index 496297d296..0000000000
--- a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall4.c
+++ /dev/null
@@ -1,32 +0,0 @@
-/* MIPS16 syscall wrappers.
-   Copyright (C) 2013-2017 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <http://www.gnu.org/licenses/>.  */
-
-#include <sysdep.h>
-#include <mips16-syscall.h>
-
-#undef __mips16_syscall4
-
-long long __nomips16
-__mips16_syscall4 (long a0, long a1, long a2, long a3,
-		   long number)
-{
-  union __mips16_syscall_return ret;
-  ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 4,
-					a0, a1, a2, a3);
-  return ret.val;
-}
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall5.c b/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall5.c
deleted file mode 100644
index ad265d88e2..0000000000
--- a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall5.c
+++ /dev/null
@@ -1,33 +0,0 @@
-/* MIPS16 syscall wrappers.
-   Copyright (C) 2013-2017 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <http://www.gnu.org/licenses/>.  */
-
-#include <sysdep.h>
-#include <mips16-syscall.h>
-
-#undef __mips16_syscall5
-
-long long __nomips16
-__mips16_syscall5 (long a0, long a1, long a2, long a3,
-		   long a4,
-		   long number)
-{
-  union __mips16_syscall_return ret;
-  ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 5,
-					a0, a1, a2, a3, a4);
-  return ret.val;
-}
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall6.c b/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall6.c
deleted file mode 100644
index bfbd395ed3..0000000000
--- a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall6.c
+++ /dev/null
@@ -1,33 +0,0 @@
-/* MIPS16 syscall wrappers.
-   Copyright (C) 2013-2017 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <http://www.gnu.org/licenses/>.  */
-
-#include <sysdep.h>
-#include <mips16-syscall.h>
-
-#undef __mips16_syscall6
-
-long long __nomips16
-__mips16_syscall6 (long a0, long a1, long a2, long a3,
-		   long a4, long a5,
-		   long number)
-{
-  union __mips16_syscall_return ret;
-  ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 6,
-					a0, a1, a2, a3, a4, a5);
-  return ret.val;
-}
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall7.c b/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall7.c
deleted file mode 100644
index e1267616dc..0000000000
--- a/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall7.c
+++ /dev/null
@@ -1,33 +0,0 @@
-/* MIPS16 syscall wrappers.
-   Copyright (C) 2013-2017 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <http://www.gnu.org/licenses/>.  */
-
-#include <sysdep.h>
-#include <mips16-syscall.h>
-
-#undef __mips16_syscall7
-
-long long __nomips16
-__mips16_syscall7 (long a0, long a1, long a2, long a3,
-		   long a4, long a5, long a6,
-		   long number)
-{
-  union __mips16_syscall_return ret;
-  ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 7,
-					a0, a1, a2, a3, a4, a5, a6);
-  return ret.val;
-}
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h b/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
index e9e3ee7e82..31d70c0189 100644
--- a/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
+++ b/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
@@ -98,45 +98,100 @@
 #undef INTERNAL_SYSCALL
 #undef INTERNAL_SYSCALL_NCS
 
+#define INTERNAL_SYSCALL(name, err, nr, args...)			\
+	internal_syscall##nr ("li\t%0, %2\t\t\t# " #name "\n\t",	\
+			      "IK" (SYS_ify (name)),			\
+			      SYS_ify(name), err, args)
+
+#define INTERNAL_SYSCALL_NCS(number, err, nr, args...)			\
+	internal_syscall##nr (MOVE32 "\t%0, %2\n\t",			\
+			      "r" (__s0),				\
+			      number, err, args)
+
+#define __nomips16 __attribute__ ((nomips16))
+
+union __libc_do_syscall_return
+  {
+    long long val;
+    struct
+      {
+	long v0;
+	long v1;
+      }
+    reg;
+  };
+
 #ifdef __mips16
-/* There's no MIPS16 syscall instruction, so we go through out-of-line
-   standard MIPS wrappers.  These do use inline snippets below though,
-   through INTERNAL_SYSCALL_MIPS16.  Spilling the syscall number to
-   memory gives the best code in that case, avoiding the need to save
-   and restore a static register.  */
+/* There's no MIPS16 syscall instruction, so we always need to go through
+   out-of-line standard MIPS wrappers.  */
+
+long long __nomips16 __libc_do_syscall0 (long number);
 
-# include <mips16-syscall.h>
+# define internal_syscall0(v0_init, input, number, err, dummy)		\
+({									\
+	union __libc_do_syscall_return _sys_result;			\
+	_sys_result.val = __libc_do_syscall0 (number);			\
+	err = _sys_result.reg.v1;					\
+	_sys_result.reg.v0;						\
+})
 
-# define INTERNAL_SYSCALL(name, err, nr, args...)			\
-	INTERNAL_SYSCALL_NCS (SYS_ify (name), err, nr, args)
+long long __nomips16 __libc_do_syscall1 (long arg1, long number);
 
-# define INTERNAL_SYSCALL_NCS(number, err, nr, args...)			\
+# define internal_syscall1(v0_init, input, number, err, arg1)		\
 ({									\
-	union __mips16_syscall_return _sc_ret;				\
-	_sc_ret.val = __mips16_syscall##nr (args, number);		\
-	err = _sc_ret.reg.v1;						\
-	_sc_ret.reg.v0;							\
+	union __libc_do_syscall_return _sys_result;			\
+	_sys_result.val = __libc_do_syscall1 ((long) (arg1),		\
+					      number);			\
+	err = _sys_result.reg.v1;					\
+	_sys_result.reg.v0;						\
 })
 
-# define INTERNAL_SYSCALL_MIPS16(number, err, nr, args...)		\
-	internal_syscall##nr ("lw\t%0, %2\n\t",				\
-			      "R" (number),				\
-			      0, err, args)
+long long __nomips16 __libc_do_syscall2 (long arg1, long arg2, long number);
 
-#else /* !__mips16 */
-# define INTERNAL_SYSCALL(name, err, nr, args...)			\
-	internal_syscall##nr ("li\t%0, %2\t\t\t# " #name "\n\t",	\
-			      "IK" (SYS_ify (name)),			\
-			      0, err, args)
+# define internal_syscall2(v0_init, input, number, err, arg1, arg2)	\
+({									\
+	union __libc_do_syscall_return _sys_result;			\
+	_sys_result.val = __libc_do_syscall2 ((long) (arg1),		\
+					      (long) (arg2),		\
+					      number);			\
+	err = _sys_result.reg.v1;					\
+	_sys_result.reg.v0;						\
+})
 
-# define INTERNAL_SYSCALL_NCS(number, err, nr, args...)			\
-	internal_syscall##nr (MOVE32 "\t%0, %2\n\t",			\
-			      "r" (__s0),				\
-			      number, err, args)
+long long __nomips16 __libc_do_syscall3 (long arg1, long arg2, long arg3,
+					 long number);
 
-#endif /* !__mips16 */
+# define internal_syscall3(v0_init, input, number, err,			\
+			   arg1, arg2, arg3)				\
+({									\
+	union __libc_do_syscall_return _sys_result;			\
+	_sys_result.val = __libc_do_syscall3 ((long) (arg1),		\
+					      (long) (arg2),		\
+					      (long) (arg3),		\
+					      number);			\
+	err = _sys_result.reg.v1;					\
+	_sys_result.reg.v0;						\
+})
+
+long long __nomips16 __libc_do_syscall4 (long arg1, long arg2, long arg3,
+					 long arg4, long number);
 
-#define internal_syscall0(v0_init, input, number, err, dummy...)	\
+# define internal_syscall4(v0_init, input, number, err,			\
+			   arg1, arg2, arg3, arg4)			\
+({									\
+	union __libc_do_syscall_return _sys_result;			\
+	_sys_result.val = __libc_do_syscall4 ((long) (arg1),		\
+					      (long) (arg2),		\
+					      (long) (arg3),		\
+					      (long) (arg4),		\
+					      number);			\
+	err = _sys_result.reg.v1;					\
+	_sys_result.reg.v0;						\
+})
+
+#else /* !__mips16 */
+
+# define internal_syscall0(v0_init, input, number, err, dummy...)	\
 ({									\
 	long _sys_result;						\
 									\
@@ -159,7 +214,7 @@
 	_sys_result;							\
 })
 
-#define internal_syscall1(v0_init, input, number, err, arg1)		\
+# define internal_syscall1(v0_init, input, number, err, arg1)		\
 ({									\
 	long _sys_result;						\
 									\
@@ -183,7 +238,7 @@
 	_sys_result;							\
 })
 
-#define internal_syscall2(v0_init, input, number, err, arg1, arg2)	\
+# define internal_syscall2(v0_init, input, number, err, arg1, arg2)	\
 ({									\
 	long _sys_result;						\
 									\
@@ -208,8 +263,8 @@
 	_sys_result;							\
 })
 
-#define internal_syscall3(v0_init, input, number, err,			\
-			  arg1, arg2, arg3)				\
+# define internal_syscall3(v0_init, input, number, err,			\
+			   arg1, arg2, arg3)				\
 ({									\
 	long _sys_result;						\
 									\
@@ -235,8 +290,8 @@
 	_sys_result;							\
 })
 
-#define internal_syscall4(v0_init, input, number, err,			\
-			  arg1, arg2, arg3, arg4)			\
+# define internal_syscall4(v0_init, input, number, err,			\
+			   arg1, arg2, arg3, arg4)			\
 ({									\
 	long _sys_result;						\
 									\
@@ -262,110 +317,65 @@
 	_sys_result;							\
 })
 
-/* We need to use a frame pointer for the functions in which we
-   adjust $sp around the syscall, or debug information and unwind
-   information will be $sp relative and thus wrong during the syscall.  As
-   of GCC 4.7, this is sufficient.  */
-#define FORCE_FRAME_POINTER						\
-  void *volatile __fp_force __attribute__ ((unused)) = alloca (4)
+#endif /* !__mips16 */
+
+/* Out-of-line standard MIPS wrappers used for 5, 6, and 7 argument syscalls,
+   which require arguments on the stack.  */
+
+long long __nomips16 __libc_do_syscall5 (long arg1, long arg2, long arg3,
+					 long arg4, long arg5, long number);
 
 #define internal_syscall5(v0_init, input, number, err,			\
 			  arg1, arg2, arg3, arg4, arg5)			\
 ({									\
-	long _sys_result;						\
-									\
-	FORCE_FRAME_POINTER;						\
-	{								\
-	register long __s0 asm ("$16") __attribute__ ((unused))		\
-	  = (number);							\
-	register long __v0 asm ("$2");					\
-	register long __a0 asm ("$4") = (long) (arg1);			\
-	register long __a1 asm ("$5") = (long) (arg2);			\
-	register long __a2 asm ("$6") = (long) (arg3);			\
-	register long __a3 asm ("$7") = (long) (arg4);			\
-	__asm__ volatile (						\
-	".set\tnoreorder\n\t"						\
-	"subu\t$29, 32\n\t"						\
-	"sw\t%6, 16($29)\n\t"						\
-	v0_init								\
-	"syscall\n\t"							\
-	"addiu\t$29, 32\n\t"						\
-	".set\treorder"							\
-	: "=r" (__v0), "+r" (__a3)					\
-	: input, "r" (__a0), "r" (__a1), "r" (__a2),			\
-	  "r" ((long) (arg5))						\
-	: __SYSCALL_CLOBBERS);						\
-	err = __a3;							\
-	_sys_result = __v0;						\
-	}								\
-	_sys_result;							\
+	union __libc_do_syscall_return _sys_result;			\
+	_sys_result.val = __libc_do_syscall5 ((long) (arg1),		\
+					      (long) (arg2),		\
+					      (long) (arg3),		\
+					      (long) (arg4),		\
+					      (long) (arg5),		\
+					      number);			\
+	err = _sys_result.reg.v1;					\
+	_sys_result.reg.v0;						\
 })
 
+long long __nomips16 __libc_do_syscall6 (long arg1, long arg2, long arg3,
+					 long arg4, long arg5, long arg6,
+					 long number);
+
 #define internal_syscall6(v0_init, input, number, err,			\
 			  arg1, arg2, arg3, arg4, arg5, arg6)		\
 ({									\
-	long _sys_result;						\
-									\
-	FORCE_FRAME_POINTER;						\
-	{								\
-	register long __s0 asm ("$16") __attribute__ ((unused))		\
-	  = (number);							\
-	register long __v0 asm ("$2");					\
-	register long __a0 asm ("$4") = (long) (arg1);			\
-	register long __a1 asm ("$5") = (long) (arg2);			\
-	register long __a2 asm ("$6") = (long) (arg3);			\
-	register long __a3 asm ("$7") = (long) (arg4);			\
-	__asm__ volatile (						\
-	".set\tnoreorder\n\t"						\
-	"subu\t$29, 32\n\t"						\
-	"sw\t%6, 16($29)\n\t"						\
-	"sw\t%7, 20($29)\n\t"						\
-	v0_init								\
-	"syscall\n\t"							\
-	"addiu\t$29, 32\n\t"						\
-	".set\treorder"							\
-	: "=r" (__v0), "+r" (__a3)					\
-	: input, "r" (__a0), "r" (__a1), "r" (__a2),			\
-	  "r" ((long) (arg5)), "r" ((long) (arg6))			\
-	: __SYSCALL_CLOBBERS);						\
-	err = __a3;							\
-	_sys_result = __v0;						\
-	}								\
-	_sys_result;							\
+	union __libc_do_syscall_return _sys_result;			\
+	_sys_result.val = __libc_do_syscall6 ((long) (arg1),		\
+					      (long) (arg2),		\
+					      (long) (arg3),		\
+					      (long) (arg4),		\
+					      (long) (arg5),		\
+					      (long) (arg6),		\
+					      number);			\
+	err = _sys_result.reg.v1;					\
+	_sys_result.reg.v0;						\
 })
 
+long long __nomips16 __libc_do_syscall7 (long arg1, long arg2, long arg3,
+					 long arg4, long arg5, long arg6,
+					 long arg7, long number);
+
 #define internal_syscall7(v0_init, input, number, err,			\
 			  arg1, arg2, arg3, arg4, arg5, arg6, arg7)	\
 ({									\
-	long _sys_result;						\
-									\
-	FORCE_FRAME_POINTER;						\
-	{								\
-	register long __s0 asm ("$16") __attribute__ ((unused))		\
-	  = (number);							\
-	register long __v0 asm ("$2");					\
-	register long __a0 asm ("$4") = (long) (arg1);			\
-	register long __a1 asm ("$5") = (long) (arg2);			\
-	register long __a2 asm ("$6") = (long) (arg3);			\
-	register long __a3 asm ("$7") = (long) (arg4);			\
-	__asm__ volatile (						\
-	".set\tnoreorder\n\t"						\
-	"subu\t$29, 32\n\t"						\
-	"sw\t%6, 16($29)\n\t"						\
-	"sw\t%7, 20($29)\n\t"						\
-	"sw\t%8, 24($29)\n\t"						\
-	v0_init								\
-	"syscall\n\t"							\
-	"addiu\t$29, 32\n\t"						\
-	".set\treorder"							\
-	: "=r" (__v0), "+r" (__a3)					\
-	: input, "r" (__a0), "r" (__a1), "r" (__a2),			\
-	  "r" ((long) (arg5)), "r" ((long) (arg6)), "r" ((long) (arg7))	\
-	: __SYSCALL_CLOBBERS);						\
-	err = __a3;							\
-	_sys_result = __v0;						\
-	}								\
-	_sys_result;							\
+	union __libc_do_syscall_return _sys_result;			\
+	_sys_result.val = __libc_do_syscall7 ((long) (arg1),		\
+					      (long) (arg2),		\
+					      (long) (arg3),		\
+					      (long) (arg4),		\
+					      (long) (arg5),		\
+					      (long) (arg6),		\
+					      (long) (arg7),		\
+					      number);			\
+	err = _sys_result.reg.v1;					\
+	_sys_result.reg.v0;						\
 })
 
 #define __SYSCALL_CLOBBERS "$1", "$3", "$8", "$9", "$10", "$11", "$12", "$13", \

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net


* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-18 17:45                                       ` Aurelien Jarno
@ 2017-08-18 22:27                                         ` Maciej W. Rozycki
  2017-08-19 12:45                                           ` Aurelien Jarno
  0 siblings, 1 reply; 53+ messages in thread
From: Maciej W. Rozycki @ 2017-08-18 22:27 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: Adhemerval Zanella, Joseph Myers, libc-alpha

On Fri, 18 Aug 2017, Aurelien Jarno wrote:

> > > I am using GCC 6, so if the fix went in GCC 7, that's normal the issue
> > > is present.
> > 
> >  OK then; you can use the workaround I suggested to verify MIPS16 
> > compilation then.
> 
> The workaround didn't work.

 Hmm, that means there's something wrong with binutils which needs fixing.  
Can you please send me the failing .s file and the command line used to 
assemble it (from `gcc -v')?

> > > That's an interesting idea. If we use a different stub depending on the
> > > number of arguments, we can actually pass the syscall number last, which
> > > is probably more readable. Could also be used for mips16 in all cases?
> > 
> >  MIPS16 wrappers do that already, which is also why there is an individual 
> > one for each syscall argument count.
> 
> Please find below a new patch implementing that. It started to be
> complicated to get the MIPS16 related defines used to build the 
> equivalent code through GCC to work, so I decided to also implement
> __libc_do_syscall0 to __libc_do_syscall4 in libc-do-syscall.S. I looked
> at the original code generated by GCC, it's very similar to what I used,
> sometimes just a bit longer (sometimes GCC saves the syscall number to
> the stack to reload it just after).

 The MIPS16 wrappers were split into individual files so that only ones 
that are actually used by `ld.so' are pulled.  I think it would be good if 
we preserved that.  I'll see if I can experiment with keeping the original 
MIPS16 0-3 wrappers.

  Maciej


* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-18 22:27                                         ` Maciej W. Rozycki
@ 2017-08-19 12:45                                           ` Aurelien Jarno
  2017-08-21 10:49                                             ` Maciej W. Rozycki
  0 siblings, 1 reply; 53+ messages in thread
From: Aurelien Jarno @ 2017-08-19 12:45 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Adhemerval Zanella, Joseph Myers, libc-alpha

[-- Attachment #1: Type: text/plain, Size: 1963 bytes --]

On 2017-08-18 23:27, Maciej W. Rozycki wrote:
> On Fri, 18 Aug 2017, Aurelien Jarno wrote:
> 
> > > > I am using GCC 6, so if the fix went in GCC 7, that's normal the issue
> > > > is present.
> > > 
> > >  OK then; you can use the workaround I suggested to verify MIPS16 
> > > compilation then.
> > 
> > The workaround didn't work.
> 
>  Hmm, that means there's something wrong with binutils which needs fixing.  
> Can you please send me the failing .s file and the command line used to 
> assemble it (from `gcc -v')?

Please find that attached.

> > > > That's an interesting idea. If we use a different stub depending on the
> > > > number of arguments, we can actually pass the syscall number last, which
> > > > is probably more readable. Could also be used for mips16 in all cases?
> > > 
> > >  MIPS16 wrappers do that already, which is also why there is an individual 
> > > one for each syscall argument count.
> > 
> > Please find below a new patch implementing that. It started to be
> > complicated to get the MIPS16 related defines used to build the 
> > equivalent code through GCC to work, so I decided to also implement
> > __libc_do_syscall0 to __libc_do_syscall4 in libc-do-syscall.S. I looked
> > at the original code generated by GCC, it's very similar to what I used,
> > sometimes just a bit longer (sometimes GCC saves the syscall number to
> > the stack to reload it just after).
> 
>  The MIPS16 wrappers were split into individual files so that only ones 
> that are actually used by `ld.so' are pulled.  I think it would be good if 
> we preserved that.  I'll see if I can experiment with keeping the original 
> MIPS16 0-3 wrappers.

From what I have seen, ld.so already uses syscalls with 1 to 4 arguments.
It does not use any syscall without arguments though, so the overhead is
only 4 instructions.

Aurelien

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net

[-- Attachment #2: dl-tunables.cmd.gz --]
[-- Type: application/gzip, Size: 2187 bytes --]

[-- Attachment #3: dl-tunables.s.gz --]
[-- Type: application/gzip, Size: 27046 bytes --]


* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-19 12:45                                           ` Aurelien Jarno
@ 2017-08-21 10:49                                             ` Maciej W. Rozycki
  2017-08-21 14:30                                               ` Adhemerval Zanella
                                                                 ` (2 more replies)
  0 siblings, 3 replies; 53+ messages in thread
From: Maciej W. Rozycki @ 2017-08-21 10:49 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: Adhemerval Zanella, Joseph Myers, libc-alpha

On Sat, 19 Aug 2017, Aurelien Jarno wrote:

> > > The workaround didn't work.
> > 
> >  Hmm, that means there's something wrong with binutils which needs fixing.  
> > Can you please send me the failing .s file and the command line used to 
> > assemble it (from `gcc -v')?
> 
> Please find that attached.

 Thanks.  There's indeed a bug in GAS: a MIPS16 path of execution has been
missed in the handling of this option.  I have a preliminary fix, however
I have yet to prepare test suite cases (originally the option was
mistakenly only covered by regular MIPS and microMIPS testing, which is
clearly why the MIPS16 case has been missed).  I expect this fix to be
included in the upcoming 2.29.1 release, and also backported to 2.28
(although no new 2.28 release is scheduled).

> >  The MIPS16 wrappers were split into individual files so that only ones 
> > that are actually used by `ld.so' are pulled.  I think it would be good if 
> > we preserved that.  I'll see if I can experiment with keeping the original 
> > MIPS16 0-3 wrappers.
> 
> From what I have seen, ld.so already uses syscalls with 1 to 4 arguments.
> It does not use any syscall without arguments though, so the overhead is
> only 4 instructions.

 Why make things worse where they don't have to be and there's no benefit
elsewhere that would balance the regression?

 Here's what I had in mind.  The key was updating the macros appropriately 
in sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall.h.  Beyond 
that I made a number of clean-ups:

1. I renamed `__libc_do_syscall?' to `__mips_syscall?', for consistency 
   with `__mips16_syscall?', and consequently `__libc_do_syscall_return' 
   to `__mips_syscall_return'.

2. I exported `__mips_syscall?' wrappers from `libc.so' rather than making
   them hidden.  This is also consistent with `__mips16_syscall?' wrappers
   and reduces code duplication of doubtful benefit -- it could be that 
   some calls, if internal, could be subject to the JALR->BAL
   optimisation, however only those that are in range and only in regular 
   MIPS code, for a minimal execution time saving on some processors only.
   Exporting these entries makes the maintenance effort much easier
   though, as we don't have to track and record their use in the
   individual subdirectories in the Makefile.

3. I renamed `_sys_result' to `_sc_ret' where it is declared as `union 
   __mips_syscall_return', again for clarity and consistency with MIPS16
   INTERNAL_SYSCALL_NCS.

4. I wrapped `number' in parentheses in `internal_syscall5', 
   `internal_syscall6' and `internal_syscall7', and cast it to `long'.

5. I have adjusted some comments.

Please let me know if you have any questions or concerns about this 
update.

 This passes o32 regular MIPS regression testing, however my GCC 8 binary 
seems to miscompile support_test_main.c in the MIPS16 mode, causing all 
the relevant test cases to crash with SIGILL (in the `support_set_test_dir 
(test_dir)' invocation it's `test_dir' that is jumped to rather than 
`support_set_test_dir'; I'll file a bug separately).  I'll find a way to
run MIPS16 testing eventually; meanwhile I would appreciate it if you
could do it for me.

  Maciej

2017-08-21  Adhemerval Zanella  <adhemerval.zanella@linaro.org>
	    Aurelien Jarno <aurelien@aurel32.net>
	    Maciej W. Rozycki  <macro@imgtec.com>

	[BZ #21956]
	* sysdeps/unix/sysv/linux/mips/mips32/mips16/Makefile
	[subdir = misc] (sysdep_routines): Remove `mips16-syscall5',
	`mips16-syscall6' and `mips16-syscall7'.
	(CFLAGS-mips16-syscall5.c, CFLAGS-mips16-syscall6.c)
	(CFLAGS-mips16-syscall7.c): Remove.
	* sysdeps/unix/sysv/linux/mips/mips32/mips16/Versions (libc):
	Remove `__mips16_syscall5', `__mips16_syscall6' and 
	`__mips16_syscall7'.
	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall0.c
	(__mips16_syscall0): Rename `__mips16_syscall_return' to 
	`__mips_syscall_return'.
	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall1.c
	(__mips16_syscall1): Likewise.
	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall2.c
	(__mips16_syscall2): Likewise.
	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall3.c
	(__mips16_syscall3): Likewise.
	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall4.c
	(__mips16_syscall4): Likewise.
	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall5.c:
	Remove.
	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall6.c:
	Remove.
	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall7.c:
	Remove.
	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall.h
	(__mips16_syscall5): Expand to `__mips_syscall5' rather than 
	`__mips16_syscall5'.  Remove prototype.
	(__mips16_syscall6): Expand to `__mips_syscall6' rather than
	`__mips16_syscall6'.  Remove prototype.
	(__mips16_syscall7): Expand to `__mips_syscall7' rather than
	`__mips16_syscall7'.  Remove prototype.
	(__nomips16, __mips16_syscall_return): Move to...
	* sysdeps/unix/sysv/linux/mips/mips32/sysdep.h 
	(__nomips16, __mips_syscall_return): ... here.
	[__mips16] (INTERNAL_SYSCALL_NCS): Rename 
	`__mips16_syscall_return' to `__mips_syscall_return'.
	[__mips16] (INTERNAL_SYSCALL_MIPS16): Pass `number' to
	`internal_syscall##nr'.
	[!__mips16] (INTERNAL_SYSCALL): Pass `SYS_ify (name)' to
	`internal_syscall##nr'.
	(FORCE_FRAME_POINTER): Remove.
	(__mips_syscall5): New prototype.
	(internal_syscall5): Rewrite to call `__mips_syscall5'.
	(__mips_syscall6): New prototype.
	(internal_syscall6): Rewrite to call `__mips_syscall6'.
	(__mips_syscall7): New prototype.
	(internal_syscall7): Rewrite to call `__mips_syscall7'.
	* sysdeps/unix/sysv/linux/mips/mips32/mips-syscall5.S: New file.
	* sysdeps/unix/sysv/linux/mips/mips32/mips-syscall6.S: New file.
	* sysdeps/unix/sysv/linux/mips/mips32/mips-syscall7.S: New file.
	* sysdeps/unix/sysv/linux/mips/mips32/Makefile [subdir = misc]
	(sysdep_routines): Add `mips-syscall5', `mips-syscall6' and
	`mips-syscall7'.
	* sysdeps/unix/sysv/linux/mips/mips32/Versions (libc): Add
	`__mips_syscall5', `__mips_syscall6' and `__mips_syscall7'.

---
 sysdeps/unix/sysv/linux/mips/mips32/Makefile                 |    4 
 sysdeps/unix/sysv/linux/mips/mips32/Versions                 |    3 
 sysdeps/unix/sysv/linux/mips/mips32/mips-syscall5.S          |   33 ++
 sysdeps/unix/sysv/linux/mips/mips32/mips-syscall6.S          |   33 ++
 sysdeps/unix/sysv/linux/mips/mips32/mips-syscall7.S          |   33 ++
 sysdeps/unix/sysv/linux/mips/mips32/mips16/Makefile          |    6 
 sysdeps/unix/sysv/linux/mips/mips32/mips16/Versions          |    2 
 sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall.h  |   44 ---
 sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall0.c |    3 
 sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall1.c |    3 
 sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall2.c |    3 
 sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall3.c |    3 
 sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall4.c |    3 
 sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall5.c |   33 --
 sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall6.c |   33 --
 sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall7.c |   33 --
 sysdeps/unix/sysv/linux/mips/mips32/sysdep.h                 |  154 ++++-------
 17 files changed, 186 insertions(+), 240 deletions(-)

glibc-aurelien-mips-o32-syscall.diff
Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/Makefile
===================================================================
--- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/Makefile	2017-08-20 01:30:23.485088512 +0100
+++ glibc/sysdeps/unix/sysv/linux/mips/mips32/Makefile	2017-08-20 01:34:46.716443161 +0100
@@ -3,6 +3,10 @@ ifeq ($(subdir),conform)
 conformtest-xfail-conds += mips-o32-linux
 endif
 
+ifeq ($(subdir),misc)
+sysdep_routines += mips-syscall5 mips-syscall6 mips-syscall7
+endif
+
 ifeq ($(subdir),stdlib)
 tests += bug-getcontext-mips-gp
 endif
Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/Versions
===================================================================
--- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/Versions	2017-08-20 01:30:23.617089669 +0100
+++ glibc/sysdeps/unix/sysv/linux/mips/mips32/Versions	2017-08-20 01:34:46.740608292 +0100
@@ -3,4 +3,7 @@ libc {
     getrlimit64;
     setrlimit64;
   }
+  GLIBC_PRIVATE {
+    __mips_syscall5; __mips_syscall6; __mips_syscall7;
+  }
 }
Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips-syscall5.S
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips-syscall5.S	2017-08-20 03:02:13.583854495 +0100
@@ -0,0 +1,33 @@
+/* Copyright (C) 2017 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include <sys/asm.h>
+
+	.text
+	.set	nomips16
+
+/* long long __mips_syscall5 (long arg1, long arg2, long arg3, long arg4,
+			      long arg5,
+			      long number)  */
+
+ENTRY(__mips_syscall5)
+	lw	v0, 20(sp)
+	syscall
+	move	v1, a3
+	jr	ra
+END(__mips_syscall5)
Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips-syscall6.S
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips-syscall6.S	2017-08-20 03:02:17.294755273 +0100
@@ -0,0 +1,33 @@
+/* Copyright (C) 2017 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include <sys/asm.h>
+
+	.text
+	.set	nomips16
+
+/* long long __mips_syscall6 (long arg1, long arg2, long arg3, long arg4,
+			      long arg5, long arg6,
+			      long number)  */
+
+ENTRY(__mips_syscall6)
+	lw	v0, 24(sp)
+	syscall
+	move	v1, a3
+	jr	ra
+END(__mips_syscall6)
Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips-syscall7.S
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips-syscall7.S	2017-08-20 03:02:20.617120331 +0100
@@ -0,0 +1,33 @@
+/* Copyright (C) 2017 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include <sys/asm.h>
+
+	.text
+	.set	nomips16
+
+/* long long __mips_syscall7 (long arg1, long arg2, long arg3, long arg4,
+			      long arg5, long arg6, long arg7,
+			      long number)  */
+
+ENTRY(__mips_syscall7)
+	lw	v0, 28(sp)
+	syscall
+	move	v1, a3
+	jr	ra
+END(__mips_syscall7)
Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/Makefile
===================================================================
--- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/Makefile	2017-08-20 01:30:23.490163141 +0100
+++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/Makefile	2017-08-20 01:34:46.819433661 +0100
@@ -1,13 +1,9 @@
 ifeq ($(subdir),misc)
 sysdep_routines += mips16-syscall0 mips16-syscall1 mips16-syscall2
-sysdep_routines += mips16-syscall3 mips16-syscall4 mips16-syscall5
-sysdep_routines += mips16-syscall6 mips16-syscall7
+sysdep_routines += mips16-syscall3 mips16-syscall4
 CFLAGS-mips16-syscall0.c += -fexceptions
 CFLAGS-mips16-syscall1.c += -fexceptions
 CFLAGS-mips16-syscall2.c += -fexceptions
 CFLAGS-mips16-syscall3.c += -fexceptions
 CFLAGS-mips16-syscall4.c += -fexceptions
-CFLAGS-mips16-syscall5.c += -fexceptions
-CFLAGS-mips16-syscall6.c += -fexceptions
-CFLAGS-mips16-syscall7.c += -fexceptions
 endif
Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/Versions
===================================================================
--- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/Versions	2017-08-20 01:30:23.494195070 +0100
+++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/Versions	2017-08-20 01:34:46.830600751 +0100
@@ -1,6 +1,6 @@
 libc {
   GLIBC_PRIVATE {
     __mips16_syscall0; __mips16_syscall1; __mips16_syscall2; __mips16_syscall3;
-    __mips16_syscall4; __mips16_syscall5; __mips16_syscall6; __mips16_syscall7;
+    __mips16_syscall4;
   }
 }
Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall.h
===================================================================
--- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall.h	2017-08-20 01:30:23.505269052 +0100
+++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall.h	2017-08-20 01:34:46.845845282 +0100
@@ -19,19 +19,6 @@
 #ifndef MIPS16_SYSCALL_H
 #define MIPS16_SYSCALL_H 1
 
-#define __nomips16 __attribute__ ((nomips16))
-
-union __mips16_syscall_return
-  {
-    long long val;
-    struct
-      {
-	long v0;
-	long v1;
-      }
-    reg;
-  };
-
 long long __nomips16 __mips16_syscall0 (long number);
 #define __mips16_syscall0(dummy, number)				\
 	__mips16_syscall0 ((long) (number))
@@ -61,29 +48,22 @@ long long __nomips16 __mips16_syscall4 (
 			   (long) (a3),					\
 			   (long) (number))
 
-long long __nomips16 __mips16_syscall5 (long a0, long a1, long a2, long a3,
-					long a4,
-					long number);
+/* The remaining ones use regular MIPS wrappers.  */
+
 #define __mips16_syscall5(a0, a1, a2, a3, a4, number)			\
-	__mips16_syscall5 ((long) (a0), (long) (a1), (long) (a2),	\
-			   (long) (a3), (long) (a4),			\
-			   (long) (number))
+	__mips_syscall5 ((long) (a0), (long) (a1), (long) (a2),		\
+			 (long) (a3), (long) (a4),			\
+			 (long) (number))
 
-long long __nomips16 __mips16_syscall6 (long a0, long a1, long a2, long a3,
-					long a4, long a5,
-					long number);
 #define __mips16_syscall6(a0, a1, a2, a3, a4, a5, number)		\
-	__mips16_syscall6 ((long) (a0), (long) (a1), (long) (a2),	\
-			   (long) (a3), (long) (a4), (long) (a5),	\
-			   (long) (number))
+	__mips_syscall6 ((long) (a0), (long) (a1), (long) (a2),		\
+			 (long) (a3), (long) (a4), (long) (a5),		\
+			 (long) (number))
 
-long long __nomips16 __mips16_syscall7 (long a0, long a1, long a2, long a3,
-					long a4, long a5, long a6,
-					long number);
 #define __mips16_syscall7(a0, a1, a2, a3, a4, a5, a6, number)		\
-	__mips16_syscall7 ((long) (a0), (long) (a1), (long) (a2),	\
-			   (long) (a3), (long) (a4), (long) (a5),	\
-			   (long) (a6),					\
-			   (long) (number))
+	__mips_syscall7 ((long) (a0), (long) (a1), (long) (a2),		\
+			 (long) (a3), (long) (a4), (long) (a5),		\
+			 (long) (a6),					\
+			 (long) (number))
 
 #endif
Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall0.c
===================================================================
--- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall0.c	2017-08-20 01:30:23.564450375 +0100
+++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall0.c	2017-08-20 01:34:46.849881768 +0100
@@ -17,14 +17,13 @@
    <http://www.gnu.org/licenses/>.  */
 
 #include <sysdep.h>
-#include <mips16-syscall.h>
 
 #undef __mips16_syscall0
 
 long long __nomips16
 __mips16_syscall0 (long number)
 {
-  union __mips16_syscall_return ret;
+  union __mips_syscall_return ret;
   ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 0);
   return ret.val;
 }
Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall1.c
===================================================================
--- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall1.c	2017-08-20 01:30:23.568506526 +0100
+++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall1.c	2017-08-20 01:34:46.860061453 +0100
@@ -17,7 +17,6 @@
    <http://www.gnu.org/licenses/>.  */
 
 #include <sysdep.h>
-#include <mips16-syscall.h>
 
 #undef __mips16_syscall1
 
@@ -25,7 +24,7 @@ long long __nomips16
 __mips16_syscall1 (long a0,
 		   long number)
 {
-  union __mips16_syscall_return ret;
+  union __mips_syscall_return ret;
   ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 1,
 					a0);
   return ret.val;
Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall2.c
===================================================================
--- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall2.c	2017-08-20 01:30:23.578642154 +0100
+++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall2.c	2017-08-20 01:34:46.865155941 +0100
@@ -17,7 +17,6 @@
    <http://www.gnu.org/licenses/>.  */
 
 #include <sysdep.h>
-#include <mips16-syscall.h>
 
 #undef __mips16_syscall2
 
@@ -25,7 +24,7 @@ long long __nomips16
 __mips16_syscall2 (long a0, long a1,
 		   long number)
 {
-  union __mips16_syscall_return ret;
+  union __mips_syscall_return ret;
   ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 2,
 					a0, a1);
   return ret.val;
Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall3.c
===================================================================
--- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall3.c	2017-08-20 01:30:23.582725033 +0100
+++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall3.c	2017-08-20 01:34:46.873273352 +0100
@@ -17,7 +17,6 @@
    <http://www.gnu.org/licenses/>.  */
 
 #include <sysdep.h>
-#include <mips16-syscall.h>
 
 #undef __mips16_syscall3
 
@@ -25,7 +24,7 @@ long long __nomips16
 __mips16_syscall3 (long a0, long a1, long a2,
 		   long number)
 {
-  union __mips16_syscall_return ret;
+  union __mips_syscall_return ret;
   ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 3,
 					a0, a1, a2);
   return ret.val;
Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall4.c
===================================================================
--- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall4.c	2017-08-20 01:30:23.591846747 +0100
+++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall4.c	2017-08-20 01:34:46.894612901 +0100
@@ -17,7 +17,6 @@
    <http://www.gnu.org/licenses/>.  */
 
 #include <sysdep.h>
-#include <mips16-syscall.h>
 
 #undef __mips16_syscall4
 
@@ -25,7 +24,7 @@ long long __nomips16
 __mips16_syscall4 (long a0, long a1, long a2, long a3,
 		   long number)
 {
-  union __mips16_syscall_return ret;
+  union __mips_syscall_return ret;
   ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 4,
 					a0, a1, a2, a3);
   return ret.val;
Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall5.c
===================================================================
--- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall5.c	2017-08-20 01:30:23.522504662 +0100
+++ /dev/null	1970-01-01 00:00:00.000000000 +0000
@@ -1,33 +0,0 @@
-/* MIPS16 syscall wrappers.
-   Copyright (C) 2013-2017 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <http://www.gnu.org/licenses/>.  */
-
-#include <sysdep.h>
-#include <mips16-syscall.h>
-
-#undef __mips16_syscall5
-
-long long __nomips16
-__mips16_syscall5 (long a0, long a1, long a2, long a3,
-		   long a4,
-		   long number)
-{
-  union __mips16_syscall_return ret;
-  ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 5,
-					a0, a1, a2, a3, a4);
-  return ret.val;
-}
Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall6.c
===================================================================
--- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall6.c	2017-08-20 01:30:23.536931914 +0100
+++ /dev/null	1970-01-01 00:00:00.000000000 +0000
@@ -1,33 +0,0 @@
-/* MIPS16 syscall wrappers.
-   Copyright (C) 2013-2017 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <http://www.gnu.org/licenses/>.  */
-
-#include <sysdep.h>
-#include <mips16-syscall.h>
-
-#undef __mips16_syscall6
-
-long long __nomips16
-__mips16_syscall6 (long a0, long a1, long a2, long a3,
-		   long a4, long a5,
-		   long number)
-{
-  union __mips16_syscall_return ret;
-  ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 6,
-					a0, a1, a2, a3, a4, a5);
-  return ret.val;
-}
Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall7.c
===================================================================
--- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall7.c	2017-08-20 01:30:23.546046859 +0100
+++ /dev/null	1970-01-01 00:00:00.000000000 +0000
@@ -1,33 +0,0 @@
-/* MIPS16 syscall wrappers.
-   Copyright (C) 2013-2017 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <http://www.gnu.org/licenses/>.  */
-
-#include <sysdep.h>
-#include <mips16-syscall.h>
-
-#undef __mips16_syscall7
-
-long long __nomips16
-__mips16_syscall7 (long a0, long a1, long a2, long a3,
-		   long a4, long a5, long a6,
-		   long number)
-{
-  union __mips16_syscall_return ret;
-  ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 7,
-					a0, a1, a2, a3, a4, a5, a6);
-  return ret.val;
-}
Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
===================================================================
--- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h	2017-08-20 01:30:23.602967356 +0100
+++ glibc/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h	2017-08-20 14:15:03.369838111 +0100
@@ -98,6 +98,19 @@
 #undef INTERNAL_SYSCALL
 #undef INTERNAL_SYSCALL_NCS
 
+#define __nomips16 __attribute__ ((nomips16))
+
+union __mips_syscall_return
+  {
+    long long val;
+    struct
+      {
+	long v0;
+	long v1;
+      }
+    reg;
+  };
+
 #ifdef __mips16
 /* There's no MIPS16 syscall instruction, so we go through out-of-line
    standard MIPS wrappers.  These do use inline snippets below though,
@@ -112,7 +125,7 @@
 
 # define INTERNAL_SYSCALL_NCS(number, err, nr, args...)			\
 ({									\
-	union __mips16_syscall_return _sc_ret;				\
+	union __mips_syscall_return _sc_ret;				\
 	_sc_ret.val = __mips16_syscall##nr (args, number);		\
 	err = _sc_ret.reg.v1;						\
 	_sc_ret.reg.v0;							\
@@ -121,13 +134,13 @@
 # define INTERNAL_SYSCALL_MIPS16(number, err, nr, args...)		\
 	internal_syscall##nr ("lw\t%0, %2\n\t",				\
 			      "R" (number),				\
-			      0, err, args)
+			      number, err, args)
 
 #else /* !__mips16 */
 # define INTERNAL_SYSCALL(name, err, nr, args...)			\
 	internal_syscall##nr ("li\t%0, %2\t\t\t# " #name "\n\t",	\
 			      "IK" (SYS_ify (name)),			\
-			      0, err, args)
+			      SYS_ify (name), err, args)
 
 # define INTERNAL_SYSCALL_NCS(number, err, nr, args...)			\
 	internal_syscall##nr (MOVE32 "\t%0, %2\n\t",			\
@@ -262,110 +275,65 @@
 	_sys_result;							\
 })
 
-/* We need to use a frame pointer for the functions in which we
-   adjust $sp around the syscall, or debug information and unwind
-   information will be $sp relative and thus wrong during the syscall.  As
-   of GCC 4.7, this is sufficient.  */
-#define FORCE_FRAME_POINTER						\
-  void *volatile __fp_force __attribute__ ((unused)) = alloca (4)
+/* Out-of-line standard MIPS wrappers used for 5, 6, and 7 argument
+   syscalls, which require stack arguments.  */
+
+long long __nomips16 __mips_syscall5 (long arg1, long arg2, long arg3,
+				      long arg4, long arg5,
+				      long number);
 
 #define internal_syscall5(v0_init, input, number, err,			\
 			  arg1, arg2, arg3, arg4, arg5)			\
 ({									\
-	long _sys_result;						\
-									\
-	FORCE_FRAME_POINTER;						\
-	{								\
-	register long __s0 asm ("$16") __attribute__ ((unused))		\
-	  = (number);							\
-	register long __v0 asm ("$2");					\
-	register long __a0 asm ("$4") = (long) (arg1);			\
-	register long __a1 asm ("$5") = (long) (arg2);			\
-	register long __a2 asm ("$6") = (long) (arg3);			\
-	register long __a3 asm ("$7") = (long) (arg4);			\
-	__asm__ volatile (						\
-	".set\tnoreorder\n\t"						\
-	"subu\t$29, 32\n\t"						\
-	"sw\t%6, 16($29)\n\t"						\
-	v0_init								\
-	"syscall\n\t"							\
-	"addiu\t$29, 32\n\t"						\
-	".set\treorder"							\
-	: "=r" (__v0), "+r" (__a3)					\
-	: input, "r" (__a0), "r" (__a1), "r" (__a2),			\
-	  "r" ((long) (arg5))						\
-	: __SYSCALL_CLOBBERS);						\
-	err = __a3;							\
-	_sys_result = __v0;						\
-	}								\
-	_sys_result;							\
+	union __mips_syscall_return _sc_ret;				\
+	_sc_ret.val = __mips_syscall5 ((long) (arg1),			\
+				       (long) (arg2),			\
+				       (long) (arg3),			\
+				       (long) (arg4),			\
+				       (long) (arg5),			\
+				       (long) (number));		\
+	err = _sc_ret.reg.v1;						\
+	_sc_ret.reg.v0;							\
 })
 
+long long __nomips16 __mips_syscall6 (long arg1, long arg2, long arg3,
+				      long arg4, long arg5, long arg6,
+				      long number);
+
 #define internal_syscall6(v0_init, input, number, err,			\
 			  arg1, arg2, arg3, arg4, arg5, arg6)		\
 ({									\
-	long _sys_result;						\
-									\
-	FORCE_FRAME_POINTER;						\
-	{								\
-	register long __s0 asm ("$16") __attribute__ ((unused))		\
-	  = (number);							\
-	register long __v0 asm ("$2");					\
-	register long __a0 asm ("$4") = (long) (arg1);			\
-	register long __a1 asm ("$5") = (long) (arg2);			\
-	register long __a2 asm ("$6") = (long) (arg3);			\
-	register long __a3 asm ("$7") = (long) (arg4);			\
-	__asm__ volatile (						\
-	".set\tnoreorder\n\t"						\
-	"subu\t$29, 32\n\t"						\
-	"sw\t%6, 16($29)\n\t"						\
-	"sw\t%7, 20($29)\n\t"						\
-	v0_init								\
-	"syscall\n\t"							\
-	"addiu\t$29, 32\n\t"						\
-	".set\treorder"							\
-	: "=r" (__v0), "+r" (__a3)					\
-	: input, "r" (__a0), "r" (__a1), "r" (__a2),			\
-	  "r" ((long) (arg5)), "r" ((long) (arg6))			\
-	: __SYSCALL_CLOBBERS);						\
-	err = __a3;							\
-	_sys_result = __v0;						\
-	}								\
-	_sys_result;							\
+	union __mips_syscall_return _sc_ret;				\
+	_sc_ret.val = __mips_syscall6 ((long) (arg1),			\
+				       (long) (arg2),			\
+				       (long) (arg3),			\
+				       (long) (arg4),			\
+				       (long) (arg5),			\
+				       (long) (arg6),			\
+				       (long) (number));		\
+	err = _sc_ret.reg.v1;						\
+	_sc_ret.reg.v0;							\
 })
 
+long long __nomips16 __mips_syscall7 (long arg1, long arg2, long arg3,
+				      long arg4, long arg5, long arg6,
+				      long arg7,
+				      long number);
+
 #define internal_syscall7(v0_init, input, number, err,			\
 			  arg1, arg2, arg3, arg4, arg5, arg6, arg7)	\
 ({									\
-	long _sys_result;						\
-									\
-	FORCE_FRAME_POINTER;						\
-	{								\
-	register long __s0 asm ("$16") __attribute__ ((unused))		\
-	  = (number);							\
-	register long __v0 asm ("$2");					\
-	register long __a0 asm ("$4") = (long) (arg1);			\
-	register long __a1 asm ("$5") = (long) (arg2);			\
-	register long __a2 asm ("$6") = (long) (arg3);			\
-	register long __a3 asm ("$7") = (long) (arg4);			\
-	__asm__ volatile (						\
-	".set\tnoreorder\n\t"						\
-	"subu\t$29, 32\n\t"						\
-	"sw\t%6, 16($29)\n\t"						\
-	"sw\t%7, 20($29)\n\t"						\
-	"sw\t%8, 24($29)\n\t"						\
-	v0_init								\
-	"syscall\n\t"							\
-	"addiu\t$29, 32\n\t"						\
-	".set\treorder"							\
-	: "=r" (__v0), "+r" (__a3)					\
-	: input, "r" (__a0), "r" (__a1), "r" (__a2),			\
-	  "r" ((long) (arg5)), "r" ((long) (arg6)), "r" ((long) (arg7))	\
-	: __SYSCALL_CLOBBERS);						\
-	err = __a3;							\
-	_sys_result = __v0;						\
-	}								\
-	_sys_result;							\
+	union __mips_syscall_return _sc_ret;				\
+	_sc_ret.val = __mips_syscall7 ((long) (arg1),			\
+				       (long) (arg2),			\
+				       (long) (arg3),			\
+				       (long) (arg4),			\
+				       (long) (arg5),			\
+				       (long) (arg6),			\
+				       (long) (arg7),			\
+				       (long) (number));		\
+	err = _sc_ret.reg.v1;						\
+	_sc_ret.reg.v0;							\
 })
 
 #define __SYSCALL_CLOBBERS "$1", "$3", "$8", "$9", "$10", "$11", "$12", "$13", \


* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-21 10:49                                             ` Maciej W. Rozycki
@ 2017-08-21 14:30                                               ` Adhemerval Zanella
  2017-08-24 13:27                                                 ` [PATCH v5] [BZ #21956] MIPS/o32: Fix internal_syscall5/6/7 Maciej W. Rozycki
  2017-08-22  8:25                                               ` [PATCH] mips/o32: fix internal_syscall5/6/7 Aurelien Jarno
  2017-08-30 15:35                                               ` Maciej W. Rozycki
  2 siblings, 1 reply; 53+ messages in thread
From: Adhemerval Zanella @ 2017-08-21 14:30 UTC (permalink / raw)
  To: Maciej W. Rozycki, Aurelien Jarno; +Cc: Joseph Myers, libc-alpha



On 21/08/2017 07:48, Maciej W. Rozycki wrote:
> On Sat, 19 Aug 2017, Aurelien Jarno wrote:
> 
>>>> The workaround didn't work.
>>>
>>>  Hmm, that means there's something wrong with binutils which needs fixing.  
>>> Can you please send me the failing .s file and the command line used to 
>>> assemble it (from `gcc -v')?
>>
>> Please find that attached.
> 
>  Thanks.  There's indeed a bug in GAS, a MIPS16 path of execution has been 
> missed in the handling of this option.  I have a preliminary fix, however 
> I yet have to prepare test suite cases (originally the option was 
> mistakenly only covered by regular MIPS and microMIPS testing, which is 
> clearly why the MIPS16 case has been missed).  I expect this fix to be 
> included in the upcoming 2.29.1 release, and also backported to 2.28 
> (although no new 2.28 release is scheduled).
> 
>>>  The MIPS16 wrappers were split into individual files so that only ones 
>>> that are actually used by `ld.so' are pulled.  I think it would be good if 
>>> we preserved that.  I'll see if I can experiment with keeping the original 
>>> MIPS16 0-3 wrappers.
>>
>> For what I have seen, ld.so already uses syscalls with 1 to 4 arguments.
>> It doesn't use any syscall without argument though. So it's only 4
>> instructions overhead.
> 
>  Why make things worse where they don't have to be and there's no benefit
> elsewhere that would balance the regression?
> 
>  Here's what I had in mind.  The key was updating the macros appropriately 
> in sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall.h.  Beyond 
> that I made a number of clean-ups:
> 
> 1. I renamed `__libc_do_syscall?' to `__mips_syscall?', for consistency 
>    with `__mips16_syscall?', and consequently `__libc_do_syscall_return' 
>    to `__mips_syscall_return'.

Ok.

> 
> 2. I exported `__mips_syscall?' wrappers from `libc.so' rather than making
>    them hidden.  This is also consistent with `__mips16_syscall?' wrappers
>    and reduces code duplication of doubtful benefit -- it could be that 
>    some calls, if internal, could be subject to the JALR->BAL
>    optimisation, however only those that are in range and only in regular 
>    MIPS code, for a minimal execution time saving on some processors only.
>    Exporting these entries makes the maintenance effort much easier 
>    though, as we don't have to track and record their use in the
>    individual subdirectories in Makefile.

In this case we can still have internal hidden calls within libc, at the
cost of some code duplication, by using a hidden alias with:

libc_hidden_proto (__mips_syscall{5,6,7}, nomips16)

paired with

libc_hidden_def (__mips_syscall{5,6,7});

in the implementation.
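
A rough, self-contained sketch of what such a hidden alias amounts to,
written with plain GCC attributes rather than the glibc macros.  The names
`demo_wrapper' and `demo_wrapper_internal' are made up for illustration,
the body is a stand-in rather than a real syscall wrapper, and the actual
libc_hidden_proto/libc_hidden_def expansion in glibc differs in detail:

```c
/* Exported entry point: stays visible in the dynamic symbol table, so
   other objects can still call it.  */
long long demo_wrapper (long arg, long number);

/* Hidden alias: a second name for the same code that is not exported,
   so calls made through it from inside the same object bind locally.  */
extern __typeof (demo_wrapper) demo_wrapper_internal
  __attribute__ ((visibility ("hidden")));

/* Stand-in body (hypothetical); a real wrapper would issue the syscall
   here.  */
long long
demo_wrapper (long arg, long number)
{
  return (long long) arg + number;
}

/* Attach the hidden name to the definition above.  */
extern __typeof (demo_wrapper) demo_wrapper_internal
  __attribute__ ((alias ("demo_wrapper"), visibility ("hidden")));
```

Code inside libc would then call `demo_wrapper_internal', while other
libraries keep using the exported `demo_wrapper'.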

> 
> 3. I renamed `_sys_result' to `_sc_ret' where it is declared as `union 
>    __mips_syscall_return', again for clarity and consistency with MIPS16
>    INTERNAL_SYSCALL_NCS.

Ok.
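
For readers unfamiliar with the idiom, here is a small host-side sketch of
how such a union carries two 32-bit register values through one 64-bit
return value.  It assumes fixed-width types in place of the o32 `long' and
`long long' so that it behaves the same on a 64-bit build machine;
`demo_syscall_return', `demo_pack_wrapper' and `demo_call' are hypothetical
names, and the wrapper fakes the kernel's result instead of executing
`syscall':

```c
#include <stdint.h>

/* Same shape as union __mips_syscall_return, but with fixed-width
   members (assumption: on o32 `long' is 32 bits, `long long' 64).  */
union demo_syscall_return
{
  int64_t val;
  struct
  {
    int32_t v0;   /* Syscall result, or the error code on failure.  */
    int32_t v1;   /* Copy of the error flag the kernel leaves in $a3.  */
  } reg;
};

/* Stand-in for the out-of-line wrapper: fills both halves and returns
   them as a single 64-bit value.  */
int64_t
demo_pack_wrapper (int32_t result, int32_t error_flag)
{
  union demo_syscall_return ret;
  ret.reg.v0 = result;
  ret.reg.v1 = error_flag;
  return ret.val;
}

/* Caller side, mirroring the body of the internal_syscall5 macro:
   store the error flag in *err and return the result.  */
int32_t
demo_call (int32_t result, int32_t error_flag, int32_t *err)
{
  union demo_syscall_return sc_ret;
  sc_ret.val = demo_pack_wrapper (result, error_flag);
  *err = sc_ret.reg.v1;
  return sc_ret.reg.v0;
}
```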

> 
> 4. I wrapped `number' in parentheses in `internal_syscall5', 
>    `internal_syscall6' and `internal_syscall7', and cast it to `long'.

Ok.

> 
> 5. I have adjusted some comments.

I think it is still worth adding some more explanation (see below).

> 
> Please let me know if you have any questions or concerns about this 
> update.
> 
>  This passes o32 regular MIPS regression testing, however my GCC 8 binary 
> seems to miscompile support_test_main.c in the MIPS16 mode, causing all 
> the relevant test cases to crash with SIGILL (in the `support_set_test_dir 
> (test_dir)' invocation it's `test_dir' that is jumped to rather than 
> `support_set_test_dir'; I'll file a bug separately).  I'll find a way to 
> run MIPS16 testing eventually, however meanwhile I will appreciate if you 
> do it for me.
> 
>   Maciej
> 
> 2017-08-21  Adhemerval Zanella  <adhemerval.zanella@linaro.org>
> 	    Aurelien Jarno <aurelien@aurel32.net>
> 	    Maciej W. Rozycki  <macro@imgtec.com>
> 
> 	[BZ #21956]
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips16/Makefile
> 	[subdir = misc] (sysdep_routines): Remove `mips16-syscall5',
> 	`mips16-syscall6' and `mips16-syscall7'.
> 	(CFLAGS-mips16-syscall5.c, CFLAGS-mips16-syscall6.c)
> 	(CFLAGS-mips16-syscall7.c): Remove.
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips16/Versions (libc):
> 	Remove `__mips16_syscall5', `__mips16_syscall6' and 
> 	`__mips16_syscall7'.
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall0.c
> 	(__mips16_syscall0): Rename `__mips16_syscall_return' to 
> 	`__mips_syscall_return'.
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall1.c
> 	(__mips16_syscall1): Likewise.
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall2.c
> 	(__mips16_syscall2): Likewise.
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall3.c
> 	(__mips16_syscall3): Likewise.
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall4.c
> 	(__mips16_syscall4): Likewise.
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall5.c:
> 	Remove.
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall6.c:
> 	Remove.
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall7.c:
> 	Remove.
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall.h
> 	(__mips16_syscall5): Expand to `__mips_syscall5' rather than 
> 	`__mips16_syscall5'.  Remove prototype.
> 	(__mips16_syscall6): Expand to `__mips_syscall6' rather than
> 	`__mips16_syscall6'.  Remove prototype.
> 	(__mips16_syscall7): Expand to `__mips_syscall7' rather than
> 	`__mips16_syscall7'.  Remove prototype.
> 	(__nomips16, __mips16_syscall_return): Move to...
> 	* sysdeps/unix/sysv/linux/mips/mips32/sysdep.h 
> 	(__nomips16, __mips_syscall_return): ... here.
> 	[__mips16] (INTERNAL_SYSCALL_NCS): Rename 
> 	`__mips16_syscall_return' to `__mips_syscall_return'.
> 	[__mips16] (INTERNAL_SYSCALL_MIPS16): Pass `number' to
> 	`internal_syscall##nr'.
> 	[!__mips16] (INTERNAL_SYSCALL): Pass `SYS_ify (name)' to
> 	`internal_syscall##nr'.
> 	(FORCE_FRAME_POINTER): Remove.
> 	(__mips_syscall5): New prototype.
> 	(internal_syscall5): Rewrite to call `__mips_syscall5'.
> 	(__mips_syscall6): New prototype.
> 	(internal_syscall6): Rewrite to call `__mips_syscall6'.
> 	(__mips_syscall7): New prototype.
> 	(internal_syscall7): Rewrite to call `__mips_syscall7'.
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips-syscall5.S: New file.
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips-syscall6.S: New file.
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips-syscall7.S: New file.
> 	* sysdeps/unix/sysv/linux/mips/mips32/Makefile [subdir = misc]
> 	(sysdep_routines): Add `mips-syscall5', `mips-syscall6' and
> 	`mips-syscall7'.
> 	* sysdeps/unix/sysv/linux/mips/mips32/Versions (libc): Add
> 	`__mips_syscall5', `__mips_syscall6' and `__mips_syscall7'.
> 
> ---
>  sysdeps/unix/sysv/linux/mips/mips32/Makefile                 |    4 
>  sysdeps/unix/sysv/linux/mips/mips32/Versions                 |    3 
>  sysdeps/unix/sysv/linux/mips/mips32/mips-syscall5.S          |   33 ++
>  sysdeps/unix/sysv/linux/mips/mips32/mips-syscall6.S          |   33 ++
>  sysdeps/unix/sysv/linux/mips/mips32/mips-syscall7.S          |   33 ++
>  sysdeps/unix/sysv/linux/mips/mips32/mips16/Makefile          |    6 
>  sysdeps/unix/sysv/linux/mips/mips32/mips16/Versions          |    2 
>  sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall.h  |   44 ---
>  sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall0.c |    3 
>  sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall1.c |    3 
>  sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall2.c |    3 
>  sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall3.c |    3 
>  sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall4.c |    3 
>  sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall5.c |   33 --
>  sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall6.c |   33 --
>  sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall7.c |   33 --
>  sysdeps/unix/sysv/linux/mips/mips32/sysdep.h                 |  154 ++++-------
>  17 files changed, 186 insertions(+), 240 deletions(-)
> 
> glibc-aurelien-mips-o32-syscall.diff
> Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/Makefile
> ===================================================================
> --- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/Makefile	2017-08-20 01:30:23.485088512 +0100
> +++ glibc/sysdeps/unix/sysv/linux/mips/mips32/Makefile	2017-08-20 01:34:46.716443161 +0100
> @@ -3,6 +3,10 @@ ifeq ($(subdir),conform)
>  conformtest-xfail-conds += mips-o32-linux
>  endif
>  
> +ifeq ($(subdir),misc)
> +sysdep_routines += mips-syscall5 mips-syscall6 mips-syscall7
> +endif
> +
>  ifeq ($(subdir),stdlib)
>  tests += bug-getcontext-mips-gp
>  endif
> Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/Versions
> ===================================================================
> --- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/Versions	2017-08-20 01:30:23.617089669 +0100
> +++ glibc/sysdeps/unix/sysv/linux/mips/mips32/Versions	2017-08-20 01:34:46.740608292 +0100
> @@ -3,4 +3,7 @@ libc {
>      getrlimit64;
>      setrlimit64;
>    }
> +  GLIBC_PRIVATE {
> +    __mips_syscall5; __mips_syscall6; __mips_syscall7;
> +  }
>  }
> Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips-syscall5.S
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips-syscall5.S	2017-08-20 03:02:13.583854495 +0100
> @@ -0,0 +1,33 @@

Please add a one-line comment describing this file.

> +/* Copyright (C) 2017 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library.  If not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#include <sysdep.h>
> +#include <sys/asm.h>
> +
> +	.text
> +	.set	nomips16
> +
> +/* long long __mips_syscall5 (long arg1, long arg2, long arg3, long arg4,
> +			      long arg5,
> +			      long number)  */
> +
> +ENTRY(__mips_syscall5)
> +	lw	v0, 20(sp)
> +	syscall
> +	move	v1, a3
> +	jr	ra
> +END(__mips_syscall5)
> Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips-syscall6.S
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips-syscall6.S	2017-08-20 03:02:17.294755273 +0100
> @@ -0,0 +1,33 @@

Same as before.

> +/* Copyright (C) 2017 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library.  If not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#include <sysdep.h>
> +#include <sys/asm.h>
> +
> +	.text
> +	.set	nomips16
> +
> +/* long long __mips_syscall6 (long arg1, long arg2, long arg3, long arg4,
> +			      long arg5, long arg6,
> +			      long number)  */
> +
> +ENTRY(__mips_syscall6)
> +	lw	v0, 24(sp)
> +	syscall
> +	move	v1, a3
> +	jr	ra
> +END(__mips_syscall6)
> Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips-syscall7.S
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips-syscall7.S	2017-08-20 03:02:20.617120331 +0100
> @@ -0,0 +1,33 @@

Same as before.
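
As a possible starting point for that comment: the `lw' offsets in the three
wrappers (20, 24 and 28) follow from the o32 calling convention, where the
first four word-sized arguments travel in $a0-$a3 but still have a 16-byte
home area reserved on the stack, and later arguments follow at 4-byte
steps.  A minimal sketch of the arithmetic, with hypothetical helper names
and assuming all arguments are 32-bit integers:

```c
/* Byte offset from $sp, on entry to the callee, of the stack slot that
   belongs to argument N (1-based).  For N <= 4 this is the home slot of
   a register argument; for N >= 5 the value really lives there.  */
int
demo_o32_arg_slot_offset (int n)
{
  return 4 * (n - 1);
}

/* __mips_syscallK takes K syscall arguments followed by the syscall
   number, so the number is argument K + 1 of the wrapper.  */
int
demo_number_offset (int nargs)
{
  return demo_o32_arg_slot_offset (nargs + 1);
}
```

Because the fifth and later syscall arguments already sit at 16($sp) and
above, which is where the kernel looks for them, the wrapper only has to
load the number into $v0.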

> +/* Copyright (C) 2017 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library.  If not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#include <sysdep.h>
> +#include <sys/asm.h>
> +
> +	.text
> +	.set	nomips16
> +
> +/* long long __mips_syscall7 (long arg1, long arg2, long arg3, long arg4,
> +			      long arg5, long arg6, long arg7,
> +			      long number)  */
> +
> +ENTRY(__mips_syscall7)
> +	lw	v0, 28(sp)
> +	syscall
> +	move	v1, a3
> +	jr	ra
> +END(__mips_syscall7)
> Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/Makefile
> ===================================================================
> --- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/Makefile	2017-08-20 01:30:23.490163141 +0100
> +++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/Makefile	2017-08-20 01:34:46.819433661 +0100
> @@ -1,13 +1,9 @@
>  ifeq ($(subdir),misc)
>  sysdep_routines += mips16-syscall0 mips16-syscall1 mips16-syscall2
> -sysdep_routines += mips16-syscall3 mips16-syscall4 mips16-syscall5
> -sysdep_routines += mips16-syscall6 mips16-syscall7
> +sysdep_routines += mips16-syscall3 mips16-syscall4
>  CFLAGS-mips16-syscall0.c += -fexceptions
>  CFLAGS-mips16-syscall1.c += -fexceptions
>  CFLAGS-mips16-syscall2.c += -fexceptions
>  CFLAGS-mips16-syscall3.c += -fexceptions
>  CFLAGS-mips16-syscall4.c += -fexceptions
> -CFLAGS-mips16-syscall5.c += -fexceptions
> -CFLAGS-mips16-syscall6.c += -fexceptions
> -CFLAGS-mips16-syscall7.c += -fexceptions
>  endif
> Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/Versions
> ===================================================================
> --- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/Versions	2017-08-20 01:30:23.494195070 +0100
> +++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/Versions	2017-08-20 01:34:46.830600751 +0100
> @@ -1,6 +1,6 @@
>  libc {
>    GLIBC_PRIVATE {
>      __mips16_syscall0; __mips16_syscall1; __mips16_syscall2; __mips16_syscall3;
> -    __mips16_syscall4; __mips16_syscall5; __mips16_syscall6; __mips16_syscall7;
> +    __mips16_syscall4;
>    }
>  }
> Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall.h
> ===================================================================
> --- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall.h	2017-08-20 01:30:23.505269052 +0100
> +++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall.h	2017-08-20 01:34:46.845845282 +0100
> @@ -19,19 +19,6 @@
>  #ifndef MIPS16_SYSCALL_H
>  #define MIPS16_SYSCALL_H 1
>  
> -#define __nomips16 __attribute__ ((nomips16))
> -
> -union __mips16_syscall_return
> -  {
> -    long long val;
> -    struct
> -      {
> -	long v0;
> -	long v1;
> -      }
> -    reg;
> -  };
> -
>  long long __nomips16 __mips16_syscall0 (long number);
>  #define __mips16_syscall0(dummy, number)				\
>  	__mips16_syscall0 ((long) (number))
> @@ -61,29 +48,22 @@ long long __nomips16 __mips16_syscall4 (
>  			   (long) (a3),					\
>  			   (long) (number))
>  
> -long long __nomips16 __mips16_syscall5 (long a0, long a1, long a2, long a3,
> -					long a4,
> -					long number);
> +/* The remaining ones use regular MIPS wrappers.  */
> +
>  #define __mips16_syscall5(a0, a1, a2, a3, a4, number)			\
> -	__mips16_syscall5 ((long) (a0), (long) (a1), (long) (a2),	\
> -			   (long) (a3), (long) (a4),			\
> -			   (long) (number))
> +	__mips_syscall5 ((long) (a0), (long) (a1), (long) (a2),		\
> +			 (long) (a3), (long) (a4),			\
> +			 (long) (number))
>  
> -long long __nomips16 __mips16_syscall6 (long a0, long a1, long a2, long a3,
> -					long a4, long a5,
> -					long number);
>  #define __mips16_syscall6(a0, a1, a2, a3, a4, a5, number)		\
> -	__mips16_syscall6 ((long) (a0), (long) (a1), (long) (a2),	\
> -			   (long) (a3), (long) (a4), (long) (a5),	\
> -			   (long) (number))
> +	__mips_syscall6 ((long) (a0), (long) (a1), (long) (a2),		\
> +			 (long) (a3), (long) (a4), (long) (a5),		\
> +			 (long) (number))
>  
> -long long __nomips16 __mips16_syscall7 (long a0, long a1, long a2, long a3,
> -					long a4, long a5, long a6,
> -					long number);
>  #define __mips16_syscall7(a0, a1, a2, a3, a4, a5, a6, number)		\
> -	__mips16_syscall7 ((long) (a0), (long) (a1), (long) (a2),	\
> -			   (long) (a3), (long) (a4), (long) (a5),	\
> -			   (long) (a6),					\
> -			   (long) (number))
> +	__mips_syscall7 ((long) (a0), (long) (a1), (long) (a2),		\
> +			 (long) (a3), (long) (a4), (long) (a5),		\
> +			 (long) (a6),					\
> +			 (long) (number))
>  
>  #endif
> Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall0.c
> ===================================================================
> --- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall0.c	2017-08-20 01:30:23.564450375 +0100
> +++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall0.c	2017-08-20 01:34:46.849881768 +0100
> @@ -17,14 +17,13 @@
>     <http://www.gnu.org/licenses/>.  */
>  
>  #include <sysdep.h>
> -#include <mips16-syscall.h>
>  
>  #undef __mips16_syscall0
>  
>  long long __nomips16
>  __mips16_syscall0 (long number)
>  {
> -  union __mips16_syscall_return ret;
> +  union __mips_syscall_return ret;
>    ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 0);
>    return ret.val;
>  }
> Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall1.c
> ===================================================================
> --- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall1.c	2017-08-20 01:30:23.568506526 +0100
> +++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall1.c	2017-08-20 01:34:46.860061453 +0100
> @@ -17,7 +17,6 @@
>     <http://www.gnu.org/licenses/>.  */
>  
>  #include <sysdep.h>
> -#include <mips16-syscall.h>
>  
>  #undef __mips16_syscall1
>  
> @@ -25,7 +24,7 @@ long long __nomips16
>  __mips16_syscall1 (long a0,
>  		   long number)
>  {
> -  union __mips16_syscall_return ret;
> +  union __mips_syscall_return ret;
>    ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 1,
>  					a0);
>    return ret.val;
> Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall2.c
> ===================================================================
> --- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall2.c	2017-08-20 01:30:23.578642154 +0100
> +++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall2.c	2017-08-20 01:34:46.865155941 +0100
> @@ -17,7 +17,6 @@
>     <http://www.gnu.org/licenses/>.  */
>  
>  #include <sysdep.h>
> -#include <mips16-syscall.h>
>  
>  #undef __mips16_syscall2
>  
> @@ -25,7 +24,7 @@ long long __nomips16
>  __mips16_syscall2 (long a0, long a1,
>  		   long number)
>  {
> -  union __mips16_syscall_return ret;
> +  union __mips_syscall_return ret;
>    ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 2,
>  					a0, a1);
>    return ret.val;
> Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall3.c
> ===================================================================
> --- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall3.c	2017-08-20 01:30:23.582725033 +0100
> +++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall3.c	2017-08-20 01:34:46.873273352 +0100
> @@ -17,7 +17,6 @@
>     <http://www.gnu.org/licenses/>.  */
>  
>  #include <sysdep.h>
> -#include <mips16-syscall.h>
>  
>  #undef __mips16_syscall3
>  
> @@ -25,7 +24,7 @@ long long __nomips16
>  __mips16_syscall3 (long a0, long a1, long a2,
>  		   long number)
>  {
> -  union __mips16_syscall_return ret;
> +  union __mips_syscall_return ret;
>    ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 3,
>  					a0, a1, a2);
>    return ret.val;
> Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall4.c
> ===================================================================
> --- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall4.c	2017-08-20 01:30:23.591846747 +0100
> +++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall4.c	2017-08-20 01:34:46.894612901 +0100
> @@ -17,7 +17,6 @@
>     <http://www.gnu.org/licenses/>.  */
>  
>  #include <sysdep.h>
> -#include <mips16-syscall.h>
>  
>  #undef __mips16_syscall4
>  
> @@ -25,7 +24,7 @@ long long __nomips16
>  __mips16_syscall4 (long a0, long a1, long a2, long a3,
>  		   long number)
>  {
> -  union __mips16_syscall_return ret;
> +  union __mips_syscall_return ret;
>    ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 4,
>  					a0, a1, a2, a3);
>    return ret.val;
> Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall5.c
> ===================================================================
> --- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall5.c	2017-08-20 01:30:23.522504662 +0100
> +++ /dev/null	1970-01-01 00:00:00.000000000 +0000
> @@ -1,33 +0,0 @@
> -/* MIPS16 syscall wrappers.
> -   Copyright (C) 2013-2017 Free Software Foundation, Inc.
> -   This file is part of the GNU C Library.
> -
> -   The GNU C Library is free software; you can redistribute it and/or
> -   modify it under the terms of the GNU Lesser General Public
> -   License as published by the Free Software Foundation; either
> -   version 2.1 of the License, or (at your option) any later version.
> -
> -   The GNU C Library is distributed in the hope that it will be useful,
> -   but WITHOUT ANY WARRANTY; without even the implied warranty of
> -   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> -   Lesser General Public License for more details.
> -
> -   You should have received a copy of the GNU Lesser General Public
> -   License along with the GNU C Library; if not, see
> -   <http://www.gnu.org/licenses/>.  */
> -
> -#include <sysdep.h>
> -#include <mips16-syscall.h>
> -
> -#undef __mips16_syscall5
> -
> -long long __nomips16
> -__mips16_syscall5 (long a0, long a1, long a2, long a3,
> -		   long a4,
> -		   long number)
> -{
> -  union __mips16_syscall_return ret;
> -  ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 5,
> -					a0, a1, a2, a3, a4);
> -  return ret.val;
> -}
> Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall6.c
> ===================================================================
> --- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall6.c	2017-08-20 01:30:23.536931914 +0100
> +++ /dev/null	1970-01-01 00:00:00.000000000 +0000
> @@ -1,33 +0,0 @@
> -/* MIPS16 syscall wrappers.
> -   Copyright (C) 2013-2017 Free Software Foundation, Inc.
> -   This file is part of the GNU C Library.
> -
> -   The GNU C Library is free software; you can redistribute it and/or
> -   modify it under the terms of the GNU Lesser General Public
> -   License as published by the Free Software Foundation; either
> -   version 2.1 of the License, or (at your option) any later version.
> -
> -   The GNU C Library is distributed in the hope that it will be useful,
> -   but WITHOUT ANY WARRANTY; without even the implied warranty of
> -   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> -   Lesser General Public License for more details.
> -
> -   You should have received a copy of the GNU Lesser General Public
> -   License along with the GNU C Library; if not, see
> -   <http://www.gnu.org/licenses/>.  */
> -
> -#include <sysdep.h>
> -#include <mips16-syscall.h>
> -
> -#undef __mips16_syscall6
> -
> -long long __nomips16
> -__mips16_syscall6 (long a0, long a1, long a2, long a3,
> -		   long a4, long a5,
> -		   long number)
> -{
> -  union __mips16_syscall_return ret;
> -  ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 6,
> -					a0, a1, a2, a3, a4, a5);
> -  return ret.val;
> -}
> Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall7.c
> ===================================================================
> --- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall7.c	2017-08-20 01:30:23.546046859 +0100
> +++ /dev/null	1970-01-01 00:00:00.000000000 +0000
> @@ -1,33 +0,0 @@
> -/* MIPS16 syscall wrappers.
> -   Copyright (C) 2013-2017 Free Software Foundation, Inc.
> -   This file is part of the GNU C Library.
> -
> -   The GNU C Library is free software; you can redistribute it and/or
> -   modify it under the terms of the GNU Lesser General Public
> -   License as published by the Free Software Foundation; either
> -   version 2.1 of the License, or (at your option) any later version.
> -
> -   The GNU C Library is distributed in the hope that it will be useful,
> -   but WITHOUT ANY WARRANTY; without even the implied warranty of
> -   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> -   Lesser General Public License for more details.
> -
> -   You should have received a copy of the GNU Lesser General Public
> -   License along with the GNU C Library; if not, see
> -   <http://www.gnu.org/licenses/>.  */
> -
> -#include <sysdep.h>
> -#include <mips16-syscall.h>
> -
> -#undef __mips16_syscall7
> -
> -long long __nomips16
> -__mips16_syscall7 (long a0, long a1, long a2, long a3,
> -		   long a4, long a5, long a6,
> -		   long number)
> -{
> -  union __mips16_syscall_return ret;
> -  ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 7,
> -					a0, a1, a2, a3, a4, a5, a6);
> -  return ret.val;
> -}
> Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
> ===================================================================
> --- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h	2017-08-20 01:30:23.602967356 +0100
> +++ glibc/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h	2017-08-20 14:15:03.369838111 +0100
> @@ -98,6 +98,19 @@
>  #undef INTERNAL_SYSCALL
>  #undef INTERNAL_SYSCALL_NCS
>  
> +#define __nomips16 __attribute__ ((nomips16))
> +
> +union __mips_syscall_return
> +  {
> +    long long val;
> +    struct
> +      {
> +	long v0;
> +	long v1;
> +      }
> +    reg;
> +  };
> +
>  #ifdef __mips16
>  /* There's no MIPS16 syscall instruction, so we go through out-of-line
>     standard MIPS wrappers.  These do use inline snippets below though,
> @@ -112,7 +125,7 @@
>  
>  # define INTERNAL_SYSCALL_NCS(number, err, nr, args...)			\
>  ({									\
> -	union __mips16_syscall_return _sc_ret;				\
> +	union __mips_syscall_return _sc_ret;				\
>  	_sc_ret.val = __mips16_syscall##nr (args, number);		\
>  	err = _sc_ret.reg.v1;						\
>  	_sc_ret.reg.v0;							\
> @@ -121,13 +134,13 @@
>  # define INTERNAL_SYSCALL_MIPS16(number, err, nr, args...)		\
>  	internal_syscall##nr ("lw\t%0, %2\n\t",				\
>  			      "R" (number),				\
> -			      0, err, args)
> +			      number, err, args)
>  
>  #else /* !__mips16 */
>  # define INTERNAL_SYSCALL(name, err, nr, args...)			\
>  	internal_syscall##nr ("li\t%0, %2\t\t\t# " #name "\n\t",	\
>  			      "IK" (SYS_ify (name)),			\
> -			      0, err, args)
> +			      SYS_ify (name), err, args)
>  
>  # define INTERNAL_SYSCALL_NCS(number, err, nr, args...)			\
>  	internal_syscall##nr (MOVE32 "\t%0, %2\n\t",			\
> @@ -262,110 +275,65 @@
>  	_sys_result;							\
>  })
>  
> -/* We need to use a frame pointer for the functions in which we
> -   adjust $sp around the syscall, or debug information and unwind
> -   information will be $sp relative and thus wrong during the syscall.  As
> -   of GCC 4.7, this is sufficient.  */
> -#define FORCE_FRAME_POINTER						\
> -  void *volatile __fp_force __attribute__ ((unused)) = alloca (4)
> +/* Out-of-line standard MIPS wrappers used for 5, 6, and 7 argument
> +   syscalls, which require stack arguments.  */

I think it is worth adding a comment explaining why we are using
out-of-line wrappers for syscalls with 5, 6, and 7 arguments.

> +
> +long long __nomips16 __mips_syscall5 (long arg1, long arg2, long arg3,
> +				      long arg4, long arg5,
> +				      long number);
>  
>  #define internal_syscall5(v0_init, input, number, err,			\
>  			  arg1, arg2, arg3, arg4, arg5)			\
>  ({									\
> -	long _sys_result;						\
> -									\
> -	FORCE_FRAME_POINTER;						\
> -	{								\
> -	register long __s0 asm ("$16") __attribute__ ((unused))		\
> -	  = (number);							\
> -	register long __v0 asm ("$2");					\
> -	register long __a0 asm ("$4") = (long) (arg1);			\
> -	register long __a1 asm ("$5") = (long) (arg2);			\
> -	register long __a2 asm ("$6") = (long) (arg3);			\
> -	register long __a3 asm ("$7") = (long) (arg4);			\
> -	__asm__ volatile (						\
> -	".set\tnoreorder\n\t"						\
> -	"subu\t$29, 32\n\t"						\
> -	"sw\t%6, 16($29)\n\t"						\
> -	v0_init								\
> -	"syscall\n\t"							\
> -	"addiu\t$29, 32\n\t"						\
> -	".set\treorder"							\
> -	: "=r" (__v0), "+r" (__a3)					\
> -	: input, "r" (__a0), "r" (__a1), "r" (__a2),			\
> -	  "r" ((long) (arg5))						\
> -	: __SYSCALL_CLOBBERS);						\
> -	err = __a3;							\
> -	_sys_result = __v0;						\
> -	}								\
> -	_sys_result;							\
> +	union __mips_syscall_return _sc_ret;				\
> +	_sc_ret.val = __mips_syscall5 ((long) (arg1),			\
> +				       (long) (arg2),			\
> +				       (long) (arg3),			\
> +				       (long) (arg4),			\
> +				       (long) (arg5),			\
> +				       (long) (number));		\
> +	err = _sc_ret.reg.v1;						\
> +	_sc_ret.reg.v0;							\
>  })
>  
> +long long __nomips16 __mips_syscall6 (long arg1, long arg2, long arg3,
> +				      long arg4, long arg5, long arg6,
> +				      long number);
> +
>  #define internal_syscall6(v0_init, input, number, err,			\
>  			  arg1, arg2, arg3, arg4, arg5, arg6)		\
>  ({									\
> -	long _sys_result;						\
> -									\
> -	FORCE_FRAME_POINTER;						\
> -	{								\
> -	register long __s0 asm ("$16") __attribute__ ((unused))		\
> -	  = (number);							\
> -	register long __v0 asm ("$2");					\
> -	register long __a0 asm ("$4") = (long) (arg1);			\
> -	register long __a1 asm ("$5") = (long) (arg2);			\
> -	register long __a2 asm ("$6") = (long) (arg3);			\
> -	register long __a3 asm ("$7") = (long) (arg4);			\
> -	__asm__ volatile (						\
> -	".set\tnoreorder\n\t"						\
> -	"subu\t$29, 32\n\t"						\
> -	"sw\t%6, 16($29)\n\t"						\
> -	"sw\t%7, 20($29)\n\t"						\
> -	v0_init								\
> -	"syscall\n\t"							\
> -	"addiu\t$29, 32\n\t"						\
> -	".set\treorder"							\
> -	: "=r" (__v0), "+r" (__a3)					\
> -	: input, "r" (__a0), "r" (__a1), "r" (__a2),			\
> -	  "r" ((long) (arg5)), "r" ((long) (arg6))			\
> -	: __SYSCALL_CLOBBERS);						\
> -	err = __a3;							\
> -	_sys_result = __v0;						\
> -	}								\
> -	_sys_result;							\
> +	union __mips_syscall_return _sc_ret;				\
> +	_sc_ret.val = __mips_syscall6 ((long) (arg1),			\
> +				       (long) (arg2),			\
> +				       (long) (arg3),			\
> +				       (long) (arg4),			\
> +				       (long) (arg5),			\
> +				       (long) (arg6),			\
> +				       (long) (number));		\
> +	err = _sc_ret.reg.v1;						\
> +	_sc_ret.reg.v0;							\
>  })
>  
> +long long __nomips16 __mips_syscall7 (long arg1, long arg2, long arg3,
> +				      long arg4, long arg5, long arg6,
> +				      long arg7,
> +				      long number);
> +
>  #define internal_syscall7(v0_init, input, number, err,			\
>  			  arg1, arg2, arg3, arg4, arg5, arg6, arg7)	\
>  ({									\
> -	long _sys_result;						\
> -									\
> -	FORCE_FRAME_POINTER;						\
> -	{								\
> -	register long __s0 asm ("$16") __attribute__ ((unused))		\
> -	  = (number);							\
> -	register long __v0 asm ("$2");					\
> -	register long __a0 asm ("$4") = (long) (arg1);			\
> -	register long __a1 asm ("$5") = (long) (arg2);			\
> -	register long __a2 asm ("$6") = (long) (arg3);			\
> -	register long __a3 asm ("$7") = (long) (arg4);			\
> -	__asm__ volatile (						\
> -	".set\tnoreorder\n\t"						\
> -	"subu\t$29, 32\n\t"						\
> -	"sw\t%6, 16($29)\n\t"						\
> -	"sw\t%7, 20($29)\n\t"						\
> -	"sw\t%8, 24($29)\n\t"						\
> -	v0_init								\
> -	"syscall\n\t"							\
> -	"addiu\t$29, 32\n\t"						\
> -	".set\treorder"							\
> -	: "=r" (__v0), "+r" (__a3)					\
> -	: input, "r" (__a0), "r" (__a1), "r" (__a2),			\
> -	  "r" ((long) (arg5)), "r" ((long) (arg6)), "r" ((long) (arg7))	\
> -	: __SYSCALL_CLOBBERS);						\
> -	err = __a3;							\
> -	_sys_result = __v0;						\
> -	}								\
> -	_sys_result;							\
> +	union __mips_syscall_return _sc_ret;				\
> +	_sc_ret.val = __mips_syscall7 ((long) (arg1),			\
> +				       (long) (arg2),			\
> +				       (long) (arg3),			\
> +				       (long) (arg4),			\
> +				       (long) (arg5),			\
> +				       (long) (arg6),			\
> +				       (long) (arg7),			\
> +				       (long) (number));		\
> +	err = _sc_ret.reg.v1;						\
> +	_sc_ret.reg.v0;							\
>  })
>  
>  #define __SYSCALL_CLOBBERS "$1", "$3", "$8", "$9", "$10", "$11", "$12", "$13", \
> 


* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-21 10:49                                             ` Maciej W. Rozycki
  2017-08-21 14:30                                               ` Adhemerval Zanella
@ 2017-08-22  8:25                                               ` Aurelien Jarno
  2017-08-22 10:07                                                 ` Maciej W. Rozycki
  2017-08-30 15:35                                               ` Maciej W. Rozycki
  2 siblings, 1 reply; 53+ messages in thread
From: Aurelien Jarno @ 2017-08-22  8:25 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Adhemerval Zanella, Joseph Myers, libc-alpha

On 2017-08-21 11:48, Maciej W. Rozycki wrote:
> On Sat, 19 Aug 2017, Aurelien Jarno wrote:
> > >  The MIPS16 wrappers were split into individual files so that only ones 
> > > that are actually used by `ld.so' are pulled.  I think it would be good if 
> > > we preserved that.  I'll see if I can experiment with keeping the original 
> > > MIPS16 0-3 wrappers.
> > 
> > > From what I have seen, ld.so already uses syscalls with 1 to 4 arguments.
> > It doesn't use any syscall without argument though. So it's only 4
> > instructions overhead.
> 
>  Why make things worse where they don't have to be and there's no benefit
> elsewhere that would balance the regression?
> 
>  Here's what I had in mind.  The key was updating the macros appropriately 
> in sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall.h.  Beyond 
> that I made a number of clean-ups:
> 
> 1. I renamed `__libc_do_syscall?' to `__mips_syscall?', for consistency 
>    with `__mips16_syscall?', and consequently `__libc_do_syscall_return' 
>    to `__mips_syscall_return'.
> 
> 2. I exported `__mips_syscall?' wrappers from `libc.so' rather than making
>    them hidden.  This is also consistent with `__mips16_syscall?' wrappers
>    and reduces code duplication of doubtful benefit -- it could be that 
>    some calls, if internal, could be subject to the JALR->BAL
>    optimisation, however only those that are in range and only in regular 
>    MIPS code, for a minimal execution time saving on some processors only.
>    Exporting these entries makes the maintenance effort much easier 
>    though, as we don't have to track and record their use in the
>    individual subdirectories in Makefile.
> 
> 3. I renamed `_sys_result' to `_sc_ret' where it is declared as `union 
>    __mips_syscall_return', again for clarity and consistency with MIPS16
>    INTERNAL_SYSCALL_NCS.
> 
> 4. I wrapped `number' in parentheses in `internal_syscall5', 
>    `internal_syscall6' and `internal_syscall7', and cast it to `long'.
> 
> 5. I have adjusted some comments.
> 
> Please let me know if you have any questions or concerns about this 
> update.
> 
>  This passes o32 regular MIPS regression testing, however my GCC 8 binary 
> seems to miscompile support_test_main.c in the MIPS16 mode, causing all 
> the relevant test cases to crash with SIGILL (in the `support_set_test_dir 
> (test_dir)' invocation it's `test_dir' that is jumped to rather than 
> `support_set_test_dir'; I'll file a bug separately).  I'll find a way to 
> run MIPS16 testing eventually, however meanwhile I will appreciate if you 
> do it for me.

Thanks, this looks fine to me. I have no comments beyond the ones
already made by Adhemerval. I have tested it on MIPS O32 BE and LE, and I
don't see any regressions.

I can do a test build for MIPS16 and run basic testing under QEMU, but I
don't have the hardware to do more.

Aurelien

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net


* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-22  8:25                                               ` [PATCH] mips/o32: fix internal_syscall5/6/7 Aurelien Jarno
@ 2017-08-22 10:07                                                 ` Maciej W. Rozycki
  0 siblings, 0 replies; 53+ messages in thread
From: Maciej W. Rozycki @ 2017-08-22 10:07 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: Adhemerval Zanella, Joseph Myers, libc-alpha

On Tue, 22 Aug 2017, Aurelien Jarno wrote:

> >  This passes o32 regular MIPS regression testing, however my GCC 8 binary 
> > seems to miscompile support_test_main.c in the MIPS16 mode, causing all 
> > the relevant test cases to crash with SIGILL (in the `support_set_test_dir 
> > (test_dir)' invocation it's `test_dir' that is jumped to rather than 
> > `support_set_test_dir'; I'll file a bug separately).  I'll find a way to 
> > run MIPS16 testing eventually, however meanwhile I will appreciate if you 
> > do it for me.
> 
> Thanks, this looks fine to me. I have no comments beyond the ones
> already made by Adhemerval. I have tested it on MIPS O32 BE and LE, and I
> don't see any regressions.

 Great, thanks!  I still have to make the adjustments Adhemerval suggested.

> I can do a test build for MIPS16 and run basic testing under QEMU, but I
> don't have the hardware to do more.

 Thanks.  I have now identified the cause of my code generation problem 
(it was actually a register encoding bug in LD relaxation I have 
implemented in preparation to handle PR ld/21375, and which I thought I 
had removed from the binutils build used for this verification), so I will 
be able to run MIPS16 testing myself once I have the updated patch ready.

  Maciej


* [PATCH v5] [BZ #21956] MIPS/o32: Fix internal_syscall5/6/7
  2017-08-21 14:30                                               ` Adhemerval Zanella
@ 2017-08-24 13:27                                                 ` Maciej W. Rozycki
  2017-08-24 20:08                                                   ` Adhemerval Zanella
  2017-08-26  8:00                                                   ` [PATCH v5] [BZ #21956] MIPS/o32: Fix internal_syscall5/6/7 Aurelien Jarno
  0 siblings, 2 replies; 53+ messages in thread
From: Maciej W. Rozycki @ 2017-08-24 13:27 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: Aurelien Jarno, Joseph Myers, libc-alpha

From: Adhemerval Zanella <adhemerval.zanella@linaro.org>

Fix a commit cc25c8b4c119 ("New pthread rwlock that is more scalable.") 
regression and prevent uncontrolled stack space usage from happening 
when a 5-, 6- or 7-argument syscall wrapper is placed in a loop.

The cause of the problem is the use of `alloca' in regular MIPS/Linux 
wrappers to force the use of the frame pointer register in any function 
using one or more of these wrappers.  Using the frame pointer register 
is required so as not to break frame unwinding as the stack pointer
is lowered within the inline asm used by these wrappers to make room for 
the stack arguments, which 5-, 6- and 7-argument syscalls use with the 
o32 ABI.

The regular MIPS/Linux wrappers are macros however, expanded inline, and 
stack allocations made with `alloca' are not discarded until the return 
of the function they are made in.  Consequently if called in a loop, 
then virtual memory is wasted, and if the loop goes through enough 
iterations, then ultimately available memory can get exhausted causing 
the program to crash.
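
The effect described above can be reproduced outside glibc.  What follows
is only a minimal sketch, assuming a host compiler that provides
`alloca'; `FAKE_WRAPPER' and `stack_pinned_by_loop' are hypothetical
names made up for illustration and are not part of the patch.  The macro
mimics the old FORCE_FRAME_POINTER idiom, and the function expands it in
a loop and reports how far apart the first and the last allocation ended
up.

```c
#include <alloca.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the old wrapper macros: like
   FORCE_FRAME_POINTER, every expansion performs a fresh alloca (4)
   in the enclosing function.  */
#define FAKE_WRAPPER(addr)                              \
  do                                                    \
    {                                                   \
      void *volatile fp_force = alloca (4);             \
      (addr) = (uintptr_t) fp_force;                    \
    }                                                   \
  while (0)

/* Expand the wrapper in a loop and report the distance between the
   first and the last allocation, i.e. roughly how much stack the loop
   has pinned until this function returns.  */
static size_t
stack_pinned_by_loop (int iterations)
{
  uintptr_t first = 0, last = 0;

  assert (iterations > 0);
  for (int i = 0; i < iterations; i++)
    {
      uintptr_t addr;

      FAKE_WRAPPER (addr);
      if (i == 0)
        first = addr;
      last = addr;
    }
  return first > last ? first - last : last - first;
}
```

The reported distance grows with the iteration count, because none of
the allocations is released before `stack_pinned_by_loop' returns, even
though `fp_force' itself goes out of scope on every iteration.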

Address the issue by replacing the inline code with standalone assembly 
functions, which rely on the compiler arranging syscall arguments 
according to the o32 function calling convention, which MIPS/Linux 
syscalls also use, except for the syscall number passed and the error 
flag returned.  This way there is no need to fiddle with the stack 
pointer anymore and all that has to be handled in the new standalone 
functions is the special handling of the syscall number and the error 
flag.
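
As a rough illustration of the return convention, here is a host-side
sketch that can be built on any machine.  It is not the code from the
patch: `fake_syscall_return', `fake_mips_syscall5',
`fake_internal_syscall5' and `o32_arg_offset' are hypothetical names,
fixed-width types stand in for the 32-bit `long' and 64-bit `long long'
of o32, and the "syscall" only fabricates a result.  The patch itself
uses a statement-expression macro where this sketch uses a function.

```c
#include <assert.h>
#include <stdint.h>

/* Same shape as union __mips_syscall_return: the 64-bit return value
   carries the result (v0) and the error flag (v1) side by side.  */
union fake_syscall_return
{
  int64_t val;
  struct
  {
    int32_t v0;
    int32_t v1;
  } reg;
};

/* Offset from $sp at function entry of the 1-based word-sized argument
   INDEX under o32.  Arguments 1-4 travel in $a0-$a3 but still have
   slots reserved at offsets 0-12; later ones are stored on the stack.  */
static int
o32_arg_offset (int index)
{
  return 4 * (index - 1);
}

/* Stand-in for the assembly wrapper: no kernel trap, it just returns
   the sum of the arguments as the result and flags an error for a
   negative syscall number.  */
static int64_t
fake_mips_syscall5 (int32_t arg1, int32_t arg2, int32_t arg3,
                    int32_t arg4, int32_t arg5, int32_t number)
{
  union fake_syscall_return ret;

  ret.reg.v0 = arg1 + arg2 + arg3 + arg4 + arg5;
  ret.reg.v1 = number < 0;
  return ret.val;
}

/* Caller side, mirroring what internal_syscall5 does with the union.  */
static int32_t
fake_internal_syscall5 (int32_t number, int32_t *err,
                        int32_t arg1, int32_t arg2, int32_t arg3,
                        int32_t arg4, int32_t arg5)
{
  union fake_syscall_return sc_ret;

  sc_ret.val = fake_mips_syscall5 (arg1, arg2, arg3, arg4, arg5, number);
  *err = sc_ret.reg.v1;
  return sc_ret.reg.v0;
}
```

With the syscall number passed as the last argument it becomes the
sixth, seventh or eighth word argument, so `o32_arg_offset' gives 20, 24
and 28 respectively, which agrees with the `lw v0, 20(sp)' and
`lw v0, 24(sp)' loads in the 5- and 6-argument wrappers.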

Redirect 5-, 6- or 7-argument MIPS16/Linux syscall wrappers to these new 
functions as well, so as to avoid an unnecessary double call the 
existing wrappers would cause with the new arrangement.

2017-08-24  Adhemerval Zanella  <adhemerval.zanella@linaro.org>
	    Aurelien Jarno  <aurelien@aurel32.net>
	    Maciej W. Rozycki  <macro@imgtec.com>

	[BZ #21956]
	* sysdeps/unix/sysv/linux/mips/mips32/mips16/Makefile
	[subdir = misc] (sysdep_routines): Remove `mips16-syscall5',
	`mips16-syscall6' and `mips16-syscall7'.
	(CFLAGS-mips16-syscall5.c, CFLAGS-mips16-syscall6.c)
	(CFLAGS-mips16-syscall7.c): Remove.
	* sysdeps/unix/sysv/linux/mips/mips32/mips16/Versions (libc):
	Remove `__mips16_syscall5', `__mips16_syscall6' and 
	`__mips16_syscall7'.
	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall0.c
	(__mips16_syscall0): Rename `__mips16_syscall_return' to 
	`__mips_syscall_return'.
	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall1.c
	(__mips16_syscall1): Likewise.
	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall2.c
	(__mips16_syscall2): Likewise.
	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall3.c
	(__mips16_syscall3): Likewise.
	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall4.c
	(__mips16_syscall4): Likewise.
	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall5.c:
	Remove.
	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall6.c:
	Remove.
	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall7.c:
	Remove.
	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall.h
	(__mips16_syscall5): Expand to `__mips_syscall5' rather than 
	`__mips16_syscall5'.  Remove prototype.
	(__mips16_syscall6): Expand to `__mips_syscall6' rather than
	`__mips16_syscall6'.  Remove prototype.
	(__mips16_syscall7): Expand to `__mips_syscall7' rather than
	`__mips16_syscall7'.  Remove prototype.
	(__nomips16, __mips16_syscall_return): Move to...
	* sysdeps/unix/sysv/linux/mips/mips32/sysdep.h 
	(__nomips16, __mips_syscall_return): ... here.
	[__mips16] (INTERNAL_SYSCALL_NCS): Rename 
	`__mips16_syscall_return' to `__mips_syscall_return'.
	[__mips16] (INTERNAL_SYSCALL_MIPS16): Pass `number' to
	`internal_syscall##nr'.
	[!__mips16] (INTERNAL_SYSCALL): Pass `SYS_ify (name)' to
	`internal_syscall##nr'.
	(FORCE_FRAME_POINTER): Remove.
	(__mips_syscall5): New prototype.
	(internal_syscall5): Rewrite to call `__mips_syscall5'.
	(__mips_syscall6): New prototype.
	(internal_syscall6): Rewrite to call `__mips_syscall6'.
	(__mips_syscall7): New prototype.
	(internal_syscall7): Rewrite to call `__mips_syscall7'.
	* sysdeps/unix/sysv/linux/mips/mips32/mips-syscall5.S: New file.
	* sysdeps/unix/sysv/linux/mips/mips32/mips-syscall6.S: New file.
	* sysdeps/unix/sysv/linux/mips/mips32/mips-syscall7.S: New file.
	* sysdeps/unix/sysv/linux/mips/mips32/Makefile [subdir = misc]
	(sysdep_routines): Add `mips-syscall5', `mips-syscall6' and
	`mips-syscall7'.
	* sysdeps/unix/sysv/linux/mips/mips32/Versions (libc): Add
	`__mips_syscall5', `__mips_syscall6' and `__mips_syscall7'.

---
On Mon, 21 Aug 2017, Adhemerval Zanella wrote:

> > 2. I exported `__mips_syscall?' wrappers from `libc.so' rather than making
> >    them hidden.  This is also consistent with `__mips16_syscall?' wrappers
> >    and reduces code duplication of doubtful benefit -- it could be that 
> >    some calls, if internal, could be subject to the JALR->BAL
> >    optimisation, however only those that are in range and only in regular 
> >    MIPS code, for a minimal execution time saving on some processors only.
> >    Exporting these entries makes the maintenance effort much easier 
> >    though, as we don't have to track and record their use in the
> >    individual subdirectories in Makefile.
> 
> In this case we can still have internal hidden calls for libc with the cost
> of code duplication by using hidden alias with:
> 
> libc_hidden_proto (__mips_syscall{5,6,7}, nomips16)
> 
> And with the pairing
> 
> libc_hidden_def (__mips_syscall{5,6,7});
> 
> On implementation.

 Thanks for the hint.  Actually that does not cause code duplication.

 I actually considered making hidden aliases available for use within 
libc.so as a possible future improvement, but didn't realise it would be 
as simple to arrange with the symbols defined in standalone assembly 
code rather than C.  This results in 61 BAL instructions replacing JALR 
ones in my regular MIPS libc.so build.

> > Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips-syscall5.S
> > ===================================================================
> > --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> > +++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips-syscall5.S	2017-08-20 03:02:13.583854495 +0100
> > @@ -0,0 +1,33 @@
> 
> One line comment to describe this file.

 I pinched the terse comment used across the existing MIPS16 wrappers 
(with s/MIPS16/MIPS/ applied); I hope this is good enough as otherwise the 
code should be self-explanatory.

> > @@ -262,110 +275,65 @@
> >  	_sys_result;							\
> >  })
> >  
> > -/* We need to use a frame pointer for the functions in which we
> > -   adjust $sp around the syscall, or debug information and unwind
> > -   information will be $sp relative and thus wrong during the syscall.  As
> > -   of GCC 4.7, this is sufficient.  */
> > -#define FORCE_FRAME_POINTER						\
> > -  void *volatile __fp_force __attribute__ ((unused)) = alloca (4)
> > +/* Out-of-line standard MIPS wrappers used for 5, 6, and 7 argument
> > +   syscalls, which require stack arguments.  */
> 
> I think it is worth adding a comment explaining why we are using
> out-of-line wrappers for syscalls with 5, 6, and 7 arguments.

 Good point, let me know if what I came up with is comprehensive enough.

 This update has passed regular MIPS and MIPS16 o32 regression testing, 
with no regressions.  OK to apply?

 NB while looking at it I've noticed we do not pass any `-O' optimisation 
flag to the GCC driver while building .S files.  That in turn means no GAS 
branch optimisation is enabled and consequently branch delay slots are not 
scheduled in `reorder' code by the assembler (the MIPS/GCC `asm' spec has 
this `%{noasmopt:-O0; O0|fno-delayed-branch:-O1; O*:-O2; :-O1}'), causing 
delay slots wasted with a NOP where a preceding instruction could be moved 
instead and save some code space.  It can be easily observed by comparing 
code in the compiler-generated MIPS16 wrappers vs the new standalone 
assembly regular MIPS (and microMIPS) wrappers.

 Was this a deliberate choice made sometime to have greater control over 
code produced or just an accidental oversight?

  Maciej

---
 sysdeps/unix/sysv/linux/mips/mips32/Makefile                 |    4 
 sysdeps/unix/sysv/linux/mips/mips32/Versions                 |    3 
 sysdeps/unix/sysv/linux/mips/mips32/mips-syscall5.S          |   35 ++
 sysdeps/unix/sysv/linux/mips/mips32/mips-syscall6.S          |   35 ++
 sysdeps/unix/sysv/linux/mips/mips32/mips-syscall7.S          |   35 ++
 sysdeps/unix/sysv/linux/mips/mips32/mips16/Makefile          |    6 
 sysdeps/unix/sysv/linux/mips/mips32/mips16/Versions          |    2 
 sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall.h  |   44 --
 sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall0.c |    3 
 sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall1.c |    3 
 sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall2.c |    3 
 sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall3.c |    3 
 sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall4.c |    3 
 sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall5.c |   33 --
 sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall6.c |   33 --
 sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall7.c |   33 --
 sysdeps/unix/sysv/linux/mips/mips32/sysdep.h                 |  163 ++++-------
 17 files changed, 201 insertions(+), 240 deletions(-)

glibc-aurelien-mips-o32-syscall.diff
Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/Makefile
===================================================================
--- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/Makefile	2017-08-20 21:30:35.093821957 +0100
+++ glibc/sysdeps/unix/sysv/linux/mips/mips32/Makefile	2017-08-22 20:33:16.504589387 +0100
@@ -3,6 +3,10 @@ ifeq ($(subdir),conform)
 conformtest-xfail-conds += mips-o32-linux
 endif
 
+ifeq ($(subdir),misc)
+sysdep_routines += mips-syscall5 mips-syscall6 mips-syscall7
+endif
+
 ifeq ($(subdir),stdlib)
 tests += bug-getcontext-mips-gp
 endif
Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/Versions
===================================================================
--- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/Versions	2017-08-20 21:30:35.142707136 +0100
+++ glibc/sysdeps/unix/sysv/linux/mips/mips32/Versions	2017-08-22 20:33:16.571894966 +0100
@@ -3,4 +3,7 @@ libc {
     getrlimit64;
     setrlimit64;
   }
+  GLIBC_PRIVATE {
+    __mips_syscall5; __mips_syscall6; __mips_syscall7;
+  }
 }
Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips-syscall5.S
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips-syscall5.S	2017-08-22 20:47:54.745857965 +0100
@@ -0,0 +1,35 @@
+/* MIPS syscall wrappers.
+   Copyright (C) 2017 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include <sys/asm.h>
+
+	.text
+	.set	nomips16
+
+/* long long __mips_syscall5 (long arg1, long arg2, long arg3, long arg4,
+			      long arg5,
+			      long number)  */
+
+ENTRY(__mips_syscall5)
+	lw	v0, 20(sp)
+	syscall
+	move	v1, a3
+	jr	ra
+END(__mips_syscall5)
+libc_hidden_def (__mips_syscall5)
Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips-syscall6.S
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips-syscall6.S	2017-08-22 20:47:47.596264940 +0100
@@ -0,0 +1,35 @@
+/* MIPS syscall wrappers.
+   Copyright (C) 2017 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include <sys/asm.h>
+
+	.text
+	.set	nomips16
+
+/* long long __mips_syscall6 (long arg1, long arg2, long arg3, long arg4,
+			      long arg5, long arg6,
+			      long number)  */
+
+ENTRY(__mips_syscall6)
+	lw	v0, 24(sp)
+	syscall
+	move	v1, a3
+	jr	ra
+END(__mips_syscall6)
+libc_hidden_def (__mips_syscall6)
Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips-syscall7.S
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips-syscall7.S	2017-08-22 20:47:25.781928113 +0100
@@ -0,0 +1,35 @@
+/* MIPS syscall wrappers.
+   Copyright (C) 2017 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include <sys/asm.h>
+
+	.text
+	.set	nomips16
+
+/* long long __mips_syscall7 (long arg1, long arg2, long arg3, long arg4,
+			      long arg5, long arg6, long arg7,
+			      long number)  */
+
+ENTRY(__mips_syscall7)
+	lw	v0, 28(sp)
+	syscall
+	move	v1, a3
+	jr	ra
+END(__mips_syscall7)
+libc_hidden_def (__mips_syscall7)
Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/Makefile
===================================================================
--- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/Makefile	2017-08-20 21:30:35.448096086 +0100
+++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/Makefile	2017-08-22 20:33:16.687468562 +0100
@@ -1,13 +1,9 @@
 ifeq ($(subdir),misc)
 sysdep_routines += mips16-syscall0 mips16-syscall1 mips16-syscall2
-sysdep_routines += mips16-syscall3 mips16-syscall4 mips16-syscall5
-sysdep_routines += mips16-syscall6 mips16-syscall7
+sysdep_routines += mips16-syscall3 mips16-syscall4
 CFLAGS-mips16-syscall0.c += -fexceptions
 CFLAGS-mips16-syscall1.c += -fexceptions
 CFLAGS-mips16-syscall2.c += -fexceptions
 CFLAGS-mips16-syscall3.c += -fexceptions
 CFLAGS-mips16-syscall4.c += -fexceptions
-CFLAGS-mips16-syscall5.c += -fexceptions
-CFLAGS-mips16-syscall6.c += -fexceptions
-CFLAGS-mips16-syscall7.c += -fexceptions
 endif
Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/Versions
===================================================================
--- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/Versions	2017-08-20 21:30:35.487709423 +0100
+++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/Versions	2017-08-22 20:33:16.719930829 +0100
@@ -1,6 +1,6 @@
 libc {
   GLIBC_PRIVATE {
     __mips16_syscall0; __mips16_syscall1; __mips16_syscall2; __mips16_syscall3;
-    __mips16_syscall4; __mips16_syscall5; __mips16_syscall6; __mips16_syscall7;
+    __mips16_syscall4;
   }
 }
Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall.h
===================================================================
--- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall.h	2017-08-20 21:30:35.540283348 +0100
+++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall.h	2017-08-22 20:33:16.732035716 +0100
@@ -19,19 +19,6 @@
 #ifndef MIPS16_SYSCALL_H
 #define MIPS16_SYSCALL_H 1
 
-#define __nomips16 __attribute__ ((nomips16))
-
-union __mips16_syscall_return
-  {
-    long long val;
-    struct
-      {
-	long v0;
-	long v1;
-      }
-    reg;
-  };
-
 long long __nomips16 __mips16_syscall0 (long number);
 #define __mips16_syscall0(dummy, number)				\
 	__mips16_syscall0 ((long) (number))
@@ -61,29 +48,22 @@ long long __nomips16 __mips16_syscall4 (
 			   (long) (a3),					\
 			   (long) (number))
 
-long long __nomips16 __mips16_syscall5 (long a0, long a1, long a2, long a3,
-					long a4,
-					long number);
+/* The remaining ones use regular MIPS wrappers.  */
+
 #define __mips16_syscall5(a0, a1, a2, a3, a4, number)			\
-	__mips16_syscall5 ((long) (a0), (long) (a1), (long) (a2),	\
-			   (long) (a3), (long) (a4),			\
-			   (long) (number))
+	__mips_syscall5 ((long) (a0), (long) (a1), (long) (a2),		\
+			 (long) (a3), (long) (a4),			\
+			 (long) (number))
 
-long long __nomips16 __mips16_syscall6 (long a0, long a1, long a2, long a3,
-					long a4, long a5,
-					long number);
 #define __mips16_syscall6(a0, a1, a2, a3, a4, a5, number)		\
-	__mips16_syscall6 ((long) (a0), (long) (a1), (long) (a2),	\
-			   (long) (a3), (long) (a4), (long) (a5),	\
-			   (long) (number))
+	__mips_syscall6 ((long) (a0), (long) (a1), (long) (a2),		\
+			 (long) (a3), (long) (a4), (long) (a5),		\
+			 (long) (number))
 
-long long __nomips16 __mips16_syscall7 (long a0, long a1, long a2, long a3,
-					long a4, long a5, long a6,
-					long number);
 #define __mips16_syscall7(a0, a1, a2, a3, a4, a5, a6, number)		\
-	__mips16_syscall7 ((long) (a0), (long) (a1), (long) (a2),	\
-			   (long) (a3), (long) (a4), (long) (a5),	\
-			   (long) (a6),					\
-			   (long) (number))
+	__mips_syscall7 ((long) (a0), (long) (a1), (long) (a2),		\
+			 (long) (a3), (long) (a4), (long) (a5),		\
+			 (long) (a6),					\
+			 (long) (number))
 
 #endif
Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall0.c
===================================================================
--- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall0.c	2017-08-20 21:30:35.557630985 +0100
+++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall0.c	2017-08-22 20:33:16.741195496 +0100
@@ -17,14 +17,13 @@
    <http://www.gnu.org/licenses/>.  */
 
 #include <sysdep.h>
-#include <mips16-syscall.h>
 
 #undef __mips16_syscall0
 
 long long __nomips16
 __mips16_syscall0 (long number)
 {
-  union __mips16_syscall_return ret;
+  union __mips_syscall_return ret;
   ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 0);
   return ret.val;
 }
Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall1.c
===================================================================
--- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall1.c	2017-08-20 21:30:35.658293922 +0100
+++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall1.c	2017-08-22 20:33:16.758455153 +0100
@@ -17,7 +17,6 @@
    <http://www.gnu.org/licenses/>.  */
 
 #include <sysdep.h>
-#include <mips16-syscall.h>
 
 #undef __mips16_syscall1
 
@@ -25,7 +24,7 @@ long long __nomips16
 __mips16_syscall1 (long a0,
 		   long number)
 {
-  union __mips16_syscall_return ret;
+  union __mips_syscall_return ret;
   ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 1,
 					a0);
   return ret.val;
Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall2.c
===================================================================
--- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall2.c	2017-08-20 21:30:35.769000365 +0100
+++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall2.c	2017-08-22 20:33:16.768703866 +0100
@@ -17,7 +17,6 @@
    <http://www.gnu.org/licenses/>.  */
 
 #include <sysdep.h>
-#include <mips16-syscall.h>
 
 #undef __mips16_syscall2
 
@@ -25,7 +24,7 @@ long long __nomips16
 __mips16_syscall2 (long a0, long a1,
 		   long number)
 {
-  union __mips16_syscall_return ret;
+  union __mips_syscall_return ret;
   ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 2,
 					a0, a1);
   return ret.val;
Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall3.c
===================================================================
--- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall3.c	2017-08-20 21:30:35.796726756 +0100
+++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall3.c	2017-08-22 20:33:16.779819073 +0100
@@ -17,7 +17,6 @@
    <http://www.gnu.org/licenses/>.  */
 
 #include <sysdep.h>
-#include <mips16-syscall.h>
 
 #undef __mips16_syscall3
 
@@ -25,7 +24,7 @@ long long __nomips16
 __mips16_syscall3 (long a0, long a1, long a2,
 		   long number)
 {
-  union __mips16_syscall_return ret;
+  union __mips_syscall_return ret;
   ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 3,
 					a0, a1, a2);
   return ret.val;
Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall4.c
===================================================================
--- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall4.c	2017-08-20 21:30:35.819263979 +0100
+++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall4.c	2017-08-22 20:33:16.794914009 +0100
@@ -17,7 +17,6 @@
    <http://www.gnu.org/licenses/>.  */
 
 #include <sysdep.h>
-#include <mips16-syscall.h>
 
 #undef __mips16_syscall4
 
@@ -25,7 +24,7 @@ long long __nomips16
 __mips16_syscall4 (long a0, long a1, long a2, long a3,
 		   long number)
 {
-  union __mips16_syscall_return ret;
+  union __mips_syscall_return ret;
   ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 4,
 					a0, a1, a2, a3);
   return ret.val;
Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall5.c
===================================================================
--- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall5.c	2017-08-20 21:30:35.822296212 +0100
+++ /dev/null	1970-01-01 00:00:00.000000000 +0000
@@ -1,33 +0,0 @@
-/* MIPS16 syscall wrappers.
-   Copyright (C) 2013-2017 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <http://www.gnu.org/licenses/>.  */
-
-#include <sysdep.h>
-#include <mips16-syscall.h>
-
-#undef __mips16_syscall5
-
-long long __nomips16
-__mips16_syscall5 (long a0, long a1, long a2, long a3,
-		   long a4,
-		   long number)
-{
-  union __mips16_syscall_return ret;
-  ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 5,
-					a0, a1, a2, a3, a4);
-  return ret.val;
-}
Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall6.c
===================================================================
--- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall6.c	2017-08-20 21:30:35.838512882 +0100
+++ /dev/null	1970-01-01 00:00:00.000000000 +0000
@@ -1,33 +0,0 @@
-/* MIPS16 syscall wrappers.
-   Copyright (C) 2013-2017 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <http://www.gnu.org/licenses/>.  */
-
-#include <sysdep.h>
-#include <mips16-syscall.h>
-
-#undef __mips16_syscall6
-
-long long __nomips16
-__mips16_syscall6 (long a0, long a1, long a2, long a3,
-		   long a4, long a5,
-		   long number)
-{
-  union __mips16_syscall_return ret;
-  ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 6,
-					a0, a1, a2, a3, a4, a5);
-  return ret.val;
-}
Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall7.c
===================================================================
--- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall7.c	2017-08-20 21:30:35.846816836 +0100
+++ /dev/null	1970-01-01 00:00:00.000000000 +0000
@@ -1,33 +0,0 @@
-/* MIPS16 syscall wrappers.
-   Copyright (C) 2013-2017 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <http://www.gnu.org/licenses/>.  */
-
-#include <sysdep.h>
-#include <mips16-syscall.h>
-
-#undef __mips16_syscall7
-
-long long __nomips16
-__mips16_syscall7 (long a0, long a1, long a2, long a3,
-		   long a4, long a5, long a6,
-		   long number)
-{
-  union __mips16_syscall_return ret;
-  ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 7,
-					a0, a1, a2, a3, a4, a5, a6);
-  return ret.val;
-}
Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
===================================================================
--- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h	2017-08-20 21:30:35.944796826 +0100
+++ glibc/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h	2017-08-22 23:13:25.010701782 +0100
@@ -98,6 +98,19 @@
 #undef INTERNAL_SYSCALL
 #undef INTERNAL_SYSCALL_NCS
 
+#define __nomips16 __attribute__ ((nomips16))
+
+union __mips_syscall_return
+  {
+    long long val;
+    struct
+      {
+	long v0;
+	long v1;
+      }
+    reg;
+  };
+
 #ifdef __mips16
 /* There's no MIPS16 syscall instruction, so we go through out-of-line
    standard MIPS wrappers.  These do use inline snippets below though,
@@ -112,7 +125,7 @@
 
 # define INTERNAL_SYSCALL_NCS(number, err, nr, args...)			\
 ({									\
-	union __mips16_syscall_return _sc_ret;				\
+	union __mips_syscall_return _sc_ret;				\
 	_sc_ret.val = __mips16_syscall##nr (args, number);		\
 	err = _sc_ret.reg.v1;						\
 	_sc_ret.reg.v0;							\
@@ -121,13 +134,13 @@
 # define INTERNAL_SYSCALL_MIPS16(number, err, nr, args...)		\
 	internal_syscall##nr ("lw\t%0, %2\n\t",				\
 			      "R" (number),				\
-			      0, err, args)
+			      number, err, args)
 
 #else /* !__mips16 */
 # define INTERNAL_SYSCALL(name, err, nr, args...)			\
 	internal_syscall##nr ("li\t%0, %2\t\t\t# " #name "\n\t",	\
 			      "IK" (SYS_ify (name)),			\
-			      0, err, args)
+			      SYS_ify (name), err, args)
 
 # define INTERNAL_SYSCALL_NCS(number, err, nr, args...)			\
 	internal_syscall##nr (MOVE32 "\t%0, %2\n\t",			\
@@ -262,110 +275,74 @@
 	_sys_result;							\
 })
 
-/* We need to use a frame pointer for the functions in which we
-   adjust $sp around the syscall, or debug information and unwind
-   information will be $sp relative and thus wrong during the syscall.  As
-   of GCC 4.7, this is sufficient.  */
-#define FORCE_FRAME_POINTER						\
-  void *volatile __fp_force __attribute__ ((unused)) = alloca (4)
+/* Standalone MIPS wrappers used for 5, 6, and 7 argument syscalls,
+   which require stack arguments.  We rely on the compiler arranging
+   wrapper's arguments according to the MIPS o32 function calling
+   convention, which is reused by syscalls, except for the syscall
+   number passed and the error flag returned (taken care of in the
+   wrapper called).  This relieves us from relying on non-guaranteed
+   compiler specifics required for the stack arguments to be pushed,
+   which would be the case if these syscalls were inlined.  */
+
+long long __nomips16 __mips_syscall5 (long arg1, long arg2, long arg3,
+				      long arg4, long arg5,
+				      long number);
+libc_hidden_proto (__mips_syscall5, nomips16)
 
 #define internal_syscall5(v0_init, input, number, err,			\
 			  arg1, arg2, arg3, arg4, arg5)			\
 ({									\
-	long _sys_result;						\
-									\
-	FORCE_FRAME_POINTER;						\
-	{								\
-	register long __s0 asm ("$16") __attribute__ ((unused))		\
-	  = (number);							\
-	register long __v0 asm ("$2");					\
-	register long __a0 asm ("$4") = (long) (arg1);			\
-	register long __a1 asm ("$5") = (long) (arg2);			\
-	register long __a2 asm ("$6") = (long) (arg3);			\
-	register long __a3 asm ("$7") = (long) (arg4);			\
-	__asm__ volatile (						\
-	".set\tnoreorder\n\t"						\
-	"subu\t$29, 32\n\t"						\
-	"sw\t%6, 16($29)\n\t"						\
-	v0_init								\
-	"syscall\n\t"							\
-	"addiu\t$29, 32\n\t"						\
-	".set\treorder"							\
-	: "=r" (__v0), "+r" (__a3)					\
-	: input, "r" (__a0), "r" (__a1), "r" (__a2),			\
-	  "r" ((long) (arg5))						\
-	: __SYSCALL_CLOBBERS);						\
-	err = __a3;							\
-	_sys_result = __v0;						\
-	}								\
-	_sys_result;							\
+	union __mips_syscall_return _sc_ret;				\
+	_sc_ret.val = __mips_syscall5 ((long) (arg1),			\
+				       (long) (arg2),			\
+				       (long) (arg3),			\
+				       (long) (arg4),			\
+				       (long) (arg5),			\
+				       (long) (number));		\
+	err = _sc_ret.reg.v1;						\
+	_sc_ret.reg.v0;							\
 })
 
+long long __nomips16 __mips_syscall6 (long arg1, long arg2, long arg3,
+				      long arg4, long arg5, long arg6,
+				      long number);
+libc_hidden_proto (__mips_syscall6, nomips16)
+
 #define internal_syscall6(v0_init, input, number, err,			\
 			  arg1, arg2, arg3, arg4, arg5, arg6)		\
 ({									\
-	long _sys_result;						\
-									\
-	FORCE_FRAME_POINTER;						\
-	{								\
-	register long __s0 asm ("$16") __attribute__ ((unused))		\
-	  = (number);							\
-	register long __v0 asm ("$2");					\
-	register long __a0 asm ("$4") = (long) (arg1);			\
-	register long __a1 asm ("$5") = (long) (arg2);			\
-	register long __a2 asm ("$6") = (long) (arg3);			\
-	register long __a3 asm ("$7") = (long) (arg4);			\
-	__asm__ volatile (						\
-	".set\tnoreorder\n\t"						\
-	"subu\t$29, 32\n\t"						\
-	"sw\t%6, 16($29)\n\t"						\
-	"sw\t%7, 20($29)\n\t"						\
-	v0_init								\
-	"syscall\n\t"							\
-	"addiu\t$29, 32\n\t"						\
-	".set\treorder"							\
-	: "=r" (__v0), "+r" (__a3)					\
-	: input, "r" (__a0), "r" (__a1), "r" (__a2),			\
-	  "r" ((long) (arg5)), "r" ((long) (arg6))			\
-	: __SYSCALL_CLOBBERS);						\
-	err = __a3;							\
-	_sys_result = __v0;						\
-	}								\
-	_sys_result;							\
+	union __mips_syscall_return _sc_ret;				\
+	_sc_ret.val = __mips_syscall6 ((long) (arg1),			\
+				       (long) (arg2),			\
+				       (long) (arg3),			\
+				       (long) (arg4),			\
+				       (long) (arg5),			\
+				       (long) (arg6),			\
+				       (long) (number));		\
+	err = _sc_ret.reg.v1;						\
+	_sc_ret.reg.v0;							\
 })
 
+long long __nomips16 __mips_syscall7 (long arg1, long arg2, long arg3,
+				      long arg4, long arg5, long arg6,
+				      long arg7,
+				      long number);
+libc_hidden_proto (__mips_syscall7, nomips16)
+
 #define internal_syscall7(v0_init, input, number, err,			\
 			  arg1, arg2, arg3, arg4, arg5, arg6, arg7)	\
 ({									\
-	long _sys_result;						\
-									\
-	FORCE_FRAME_POINTER;						\
-	{								\
-	register long __s0 asm ("$16") __attribute__ ((unused))		\
-	  = (number);							\
-	register long __v0 asm ("$2");					\
-	register long __a0 asm ("$4") = (long) (arg1);			\
-	register long __a1 asm ("$5") = (long) (arg2);			\
-	register long __a2 asm ("$6") = (long) (arg3);			\
-	register long __a3 asm ("$7") = (long) (arg4);			\
-	__asm__ volatile (						\
-	".set\tnoreorder\n\t"						\
-	"subu\t$29, 32\n\t"						\
-	"sw\t%6, 16($29)\n\t"						\
-	"sw\t%7, 20($29)\n\t"						\
-	"sw\t%8, 24($29)\n\t"						\
-	v0_init								\
-	"syscall\n\t"							\
-	"addiu\t$29, 32\n\t"						\
-	".set\treorder"							\
-	: "=r" (__v0), "+r" (__a3)					\
-	: input, "r" (__a0), "r" (__a1), "r" (__a2),			\
-	  "r" ((long) (arg5)), "r" ((long) (arg6)), "r" ((long) (arg7))	\
-	: __SYSCALL_CLOBBERS);						\
-	err = __a3;							\
-	_sys_result = __v0;						\
-	}								\
-	_sys_result;							\
+	union __mips_syscall_return _sc_ret;				\
+	_sc_ret.val = __mips_syscall7 ((long) (arg1),			\
+				       (long) (arg2),			\
+				       (long) (arg3),			\
+				       (long) (arg4),			\
+				       (long) (arg5),			\
+				       (long) (arg6),			\
+				       (long) (arg7),			\
+				       (long) (number));		\
+	err = _sc_ret.reg.v1;						\
+	_sc_ret.reg.v0;							\
 })
 
 #define __SYSCALL_CLOBBERS "$1", "$3", "$8", "$9", "$10", "$11", "$12", "$13", \


* Re: [PATCH v5] [BZ #21956] MIPS/o32: Fix internal_syscall5/6/7
  2017-08-24 13:27                                                 ` [PATCH v5] [BZ #21956] MIPS/o32: Fix internal_syscall5/6/7 Maciej W. Rozycki
@ 2017-08-24 20:08                                                   ` Adhemerval Zanella
  2017-08-29 18:00                                                     ` Maciej W. Rozycki
  2017-09-08 11:15                                                     ` MIPS: Standalone/inline assembly issues (was: MIPS/o32: Fix internal_syscall5/6/7) Maciej W. Rozycki
  2017-08-26  8:00                                                   ` [PATCH v5] [BZ #21956] MIPS/o32: Fix internal_syscall5/6/7 Aurelien Jarno
  1 sibling, 2 replies; 53+ messages in thread
From: Adhemerval Zanella @ 2017-08-24 20:08 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Aurelien Jarno, Joseph Myers, libc-alpha



On 24/08/2017 10:26, Maciej W. Rozycki wrote:
> From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
> 
> Fix a commit cc25c8b4c119 ("New pthread rwlock that is more scalable.") 
> regression and prevent uncontrolled stack space usage from happening 
> when a 5-, 6- or 7-argument syscall wrapper is placed in a loop.
> 
> The cause of the problem is the use of `alloca' in regular MIPS/Linux 
> wrappers to force the use of the frame pointer register in any function 
> using one or more of these wrappers.  Using the frame pointer register 
> is required so as not to break frame unwinding as the stack pointer 
> is lowered within the inline asm used by these wrappers to make room for 
> the stack arguments, which 5-, 6- and 7-argument syscalls use with the 
> o32 ABI.
> 
> The regular MIPS/Linux wrappers are macros however, expanded inline, and 
> stack allocations made with `alloca' are not discarded until the return 
> of the function they are made in.  Consequently if called in a loop, 
> then virtual memory is wasted, and if the loop goes through enough 
> iterations, then ultimately available memory can get exhausted causing 
> the program to crash.
> 
> Address the issue by replacing the inline code with standalone assembly 
> functions, which rely on the compiler arranging syscall arguments 
> according to the o32 function calling convention, which MIPS/Linux 
> syscalls also use, except for the syscall number passed and the error 
> flag returned.  This way there is no need to fiddle with the stack 
> pointer anymore and all that has to be handled in the new standalone 
> functions is the special handling of the syscall number and the error 
> flag.
> 
> Redirect 5-, 6- or 7-argument MIPS16/Linux syscall wrappers to these new 
> functions as well, so as to avoid an unnecessary double call the 
> existing wrappers would cause with the new arrangement.
> 
> 2017-08-24  Adhemerval Zanella  <adhemerval.zanella@linaro.org>
> 	    Aurelien Jarno  <aurelien@aurel32.net>
> 	    Maciej W. Rozycki  <macro@imgtec.com>
> 
> 	[BZ #21956]
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips16/Makefile
> 	[subdir = misc] (sysdep_routines): Remove `mips16-syscall5',
> 	`mips16-syscall6' and `mips16-syscall7'.
> 	(CFLAGS-mips16-syscall5.c, CFLAGS-mips16-syscall6.c)
> 	(CFLAGS-mips16-syscall7.c): Remove.
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips16/Versions (libc):
> 	Remove `__mips16_syscall5', `__mips16_syscall6' and 
> 	`__mips16_syscall7'.
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall0.c
> 	(__mips16_syscall0): Rename `__mips16_syscall_return' to 
> 	`__mips_syscall_return'.
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall1.c
> 	(__mips16_syscall1): Likewise.
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall2.c
> 	(__mips16_syscall2): Likewise.
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall3.c
> 	(__mips16_syscall3): Likewise.
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall4.c
> 	(__mips16_syscall4): Likewise.
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall5.c:
> 	Remove.
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall6.c:
> 	Remove.
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall7.c:
> 	Remove.
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall.h
> 	(__mips16_syscall5): Expand to `__mips_syscall5' rather than 
> 	`__mips16_syscall5'.  Remove prototype.
> 	(__mips16_syscall6): Expand to `__mips_syscall6' rather than
> 	`__mips16_syscall6'.  Remove prototype.
> 	(__mips16_syscall7): Expand to `__mips_syscall7' rather than
> 	`__mips16_syscall7'.  Remove prototype.
> 	(__nomips16, __mips16_syscall_return): Move to...
> 	* sysdeps/unix/sysv/linux/mips/mips32/sysdep.h 
> 	(__nomips16, __mips_syscall_return): ... here.
> 	[__mips16] (INTERNAL_SYSCALL_NCS): Rename 
> 	`__mips16_syscall_return' to `__mips_syscall_return'.
> 	[__mips16] (INTERNAL_SYSCALL_MIPS16): Pass `number' to
> 	`internal_syscall##nr'.
> 	[!__mips16] (INTERNAL_SYSCALL): Pass `SYS_ify (name)' to
> 	`internal_syscall##nr'.
> 	(FORCE_FRAME_POINTER): Remove.
> 	(__mips_syscall5): New prototype.
> 	(internal_syscall5): Rewrite to call `__mips_syscall5'.
> 	(__mips_syscall6): New prototype.
> 	(internal_syscall6): Rewrite to call `__mips_syscall6'.
> 	(__mips_syscall7): New prototype.
> 	(internal_syscall7): Rewrite to call `__mips_syscall7'.
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips-syscall5.S: New file.
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips-syscall6.S: New file.
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips-syscall7.S: New file.
> 	* sysdeps/unix/sysv/linux/mips/mips32/Makefile [subdir = misc]
> 	(sysdep_routines): Add libc-do-syscall.
> 	* sysdeps/unix/sysv/linux/mips/mips32/Versions (libc): Add
> 	`__mips_syscall5', `__mips_syscall6' and `__mips_syscall7'.

Patch LGTM, thanks for following this up.
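
The stack growth described in the commit message can be reproduced outside
glibc.  The following is only a minimal sketch, assuming a hosted C
environment that provides <alloca.h>; FAKE_WRAPPER and alloca_span are
hypothetical names that merely imitate the removed FORCE_FRAME_POINTER
idiom and are not part of the patch.

```c
#include <alloca.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the removed FORCE_FRAME_POINTER idiom: each
   expansion allocates 4 bytes that stay live until the enclosing
   function returns, not until the end of the loop iteration.  */
#define FAKE_WRAPPER(slot) \
  do { void *volatile fp_force = alloca (4); (slot) = fp_force; } while (0)

enum { MAX_CALLS = 64 };

/* Expand the wrapper `calls' times in a loop and return the distance in
   bytes between the lowest and the highest block handed out.  */
static size_t
alloca_span (int calls)
{
  void *blocks[MAX_CALLS];
  uintptr_t lo = UINTPTR_MAX, hi = 0;

  assert (calls > 0 && calls <= MAX_CALLS);
  for (int i = 0; i < calls; i++)
    {
      FAKE_WRAPPER (blocks[i]);
      *(volatile char *) blocks[i] = (char) i;	/* Keep the block in use.  */
    }
  for (int i = 0; i < calls; i++)
    {
      uintptr_t p = (uintptr_t) blocks[i];
      if (p < lo)
	lo = p;
      if (p > hi)
	hi = p;
    }
  return (size_t) (hi - lo);
}
```

Every iteration gets a fresh block, so the span grows with the iteration
count; with an unbounded loop this is the exhaustion the patch removes.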

> 
> ---
> On Mon, 21 Aug 2017, Adhemerval Zanella wrote:
> 
>>> 2. I exported `__mips_syscall?' wrappers from `libc.so' rather than making
>>>    them hidden.  This is also consistent with `__mips16_syscall?' wrappers
>>>    and reduces code duplication of doubtful benefit -- it could be that 
>>>    some calls, if internal, could be subject to the JALR->BAL
>>>    optimisation, however only those that are in range and only in regular 
>>>    MIPS code, for a minimal execution time saving on some processors only.
>>>    Exporting these entries makes the maintenance effort much easier 
>>>    though, as we don't have to track and record their use in the
>>>    individual subdirectories in Makefile.
>>
>> In this case we can still have internal hidden calls for libc with the cost
> >> of code duplication by using a hidden alias with:
>>
>> libc_hidden_proto (__mips_syscall{5,6,7}, nomips16)
>>
>> And with the pairing
>>
>> libc_hidden_def (__mips_syscall{5,6,7});
>>
>> On implementation.
> 
>  Thanks for the hint.  Actually that does not cause code duplication.
> 
>  I actually considered making hidden aliases available for use within 
> libc.so as a possible future improvement, but didn't realise it would be 
> as simple to arrange with the symbols defined in standalone assembly 
> code rather than C.  This results in 61 BAL instructions replacing JALR 
> ones in my regular MIPS libc.so build.

Nice improvement.
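
For readers unfamiliar with the hidden alias idea, here is a rough sketch
of it in plain C.  It assumes GCC or Clang targeting ELF and does not
reproduce what libc_hidden_proto and libc_hidden_def actually expand to;
demo_wrapper, demo_wrapper_internal and call_from_inside are hypothetical
names.

```c
/* Hypothetical exported wrapper, standing in for __mips_syscall5.  */
long
demo_wrapper (long a, long b)
{
  return a + b;
}

/* Hidden alias for the same code: it shares the address of demo_wrapper,
   but the symbol is not visible outside the shared object, so calls made
   through it can be bound locally at link time.  */
extern __typeof (demo_wrapper) demo_wrapper_internal
  __attribute__ ((alias ("demo_wrapper"), visibility ("hidden")));

/* An internal caller uses the hidden name rather than the exported one.  */
long
call_from_inside (long a, long b)
{
  return demo_wrapper_internal (a, b);
}
```

Only one copy of the function body exists, which is consistent with the
observation above that the approach causes no code duplication.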

> 
>>> Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips-syscall5.S
>>> ===================================================================
>>> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
>>> +++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips-syscall5.S	2017-08-20 03:02:13.583854495 +0100
>>> @@ -0,0 +1,33 @@
>>
>> One line comment to describe this file.
> 
>  I pinched the terse comment used across the existing MIPS16 wrappers 
> (with s/MIPS16/MIPS/ applied); I hope this is good enough as otherwise the 
> code should be self-explanatory.

I think it is good enough.
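
As a reading aid for the `lw v0, N(sp)' lines in the new .S files, the
offsets follow from the o32 calling convention: the first four arguments
travel in $4-$7 with 16 bytes of home space reserved for them, and later
arguments occupy consecutive 4-byte stack slots.  A small sketch of that
arithmetic, with hypothetical helper names:

```c
#include <assert.h>

enum { O32_REG_ARGS = 4, O32_SLOT_SIZE = 4 };

/* Byte offset from $sp, at function entry, of the 1-based argument
   `index' under the o32 convention.  Returns -1 for arguments that are
   passed in registers.  */
static int
o32_stack_arg_offset (int index)
{
  assert (index >= 1);
  if (index <= O32_REG_ARGS)
    return -1;
  return (index - 1) * O32_SLOT_SIZE;
}

/* The wrappers take the syscall number as the argument following the
   `nargs' syscall arguments, so that is the slot loaded into v0.  */
static int
syscall_number_offset (int nargs)
{
  return o32_stack_arg_offset (nargs + 1);
}
```

This gives 20, 24 and 28 for the 5-, 6- and 7-argument wrappers, matching
the three assembly files.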

> 
>>> @@ -262,110 +275,65 @@
>>>  	_sys_result;							\
>>>  })
>>>  
>>> -/* We need to use a frame pointer for the functions in which we
>>> -   adjust $sp around the syscall, or debug information and unwind
>>> -   information will be $sp relative and thus wrong during the syscall.  As
>>> -   of GCC 4.7, this is sufficient.  */
>>> -#define FORCE_FRAME_POINTER						\
>>> -  void *volatile __fp_force __attribute__ ((unused)) = alloca (4)
>>> +/* Out-of-line standard MIPS wrappers used for 5, 6, and 7 argument
>>> +   syscalls, which require stack arguments.  */
>>
> >> I think it is worth adding a comment on why we are using out-of-line wrappers
>> for syscalls with 5, 6, and 7 arguments.
> 
>  Good point, let me know if what I came up with is comprehensive enough.
> 
>  This update has passed regular MIPS and MIPS16 o32 regression testing, 
> with no regressions.  OK to apply?

Ok from my side.
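
To illustrate how the out-of-line wrappers hand back both the result and
the error flag in a single `long long', here is a portable sketch.  It
uses fixed-width types in place of the 32-bit `long' of o32, and
demo_syscall_return, demo_wrapper and demo_call are hypothetical names
modelled on `union __mips_syscall_return' and the rewritten
internal_syscall5/6/7 macros.

```c
#include <assert.h>
#include <stdint.h>

/* Same shape as union __mips_syscall_return, with explicit widths.  */
union demo_syscall_return
{
  int64_t val;
  struct
  {
    int32_t v0;		/* Syscall result.  */
    int32_t v1;		/* Error flag, copied from a3 by the wrapper.  */
  }
  reg;
};

static_assert (sizeof (union demo_syscall_return) == sizeof (int64_t),
	       "both halves must fit the 64-bit return value");

/* Stand-in for the assembly wrapper: pack two values into one return.  */
static int64_t
demo_wrapper (int32_t result, int32_t error_flag)
{
  union demo_syscall_return ret;
  ret.reg.v0 = result;
  ret.reg.v1 = error_flag;
  return ret.val;
}

/* Stand-in for the caller side of the rewritten macros: unpack again.  */
static int32_t
demo_call (int32_t result, int32_t error_flag, int32_t *err)
{
  union demo_syscall_return ret;
  ret.val = demo_wrapper (result, error_flag);
  *err = ret.reg.v1;
  return ret.reg.v0;
}
```

Because packing and unpacking go through the same union, the sketch does
not depend on byte order.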

> 
>  NB while looking at it I've noticed we do not pass any `-O' optimisation 
> flag to the GCC driver while building .S files.  That in turn means no GAS 
> branch optimisation is enabled and consequently branch delay slots are not 
> scheduled in `reorder' code by the assembler (the MIPS/GCC `asm' spec has 
> this `%{noasmopt:-O0; O0|fno-delayed-branch:-O1; O*:-O2; :-O1}'), causing 
> delay slots wasted with a NOP where a preceding instruction could be moved 
> instead and save some code space.  It can be easily observed by comparing 
> code in the compiler-generated MIPS16 wrappers vs the new standalone 
> assembly regular MIPS (and microMIPS) wrappers.
> 
>  Was this a deliberate choice made sometime to have greater control over 
> code produced or just an accidental oversight?

I am not sure whether it was deliberate, but my guess is that it is not
really an issue for most architectures, since AFAIK passing an '-O'
optimization flag when building .S files usually does not turn on any
extra assembler flags (SUBTARGET_ASM_SPEC seems to be defined for only a
handful of architectures in GCC).

For MIPS I think we can set ASFLAGS to -O1, since it should enable the
required optimization, unless there is a specific gas option to enable
it (which I couldn't find). Another option would be to filter CFLAGS
and extract the optimization level to use for ASFLAGS, but I think that
for this specific issue it would add unnecessary complexity.

Another thing I noticed is that a further possible size optimization would
be to use ".set push", ".set noreorder", and ".set pop" instead of the
current code, as below.

Using GCC 6.2.1 and binutils 2.27, I measured the following when building
for mips16 (using a stripped libc.so for both):

- base:

section			size
.text			966480
.eh_frame_hdr		8764
.eh_frame		36628
.pdr			97568

- patched:

section			size
.text			966160
.eh_frame_hdr		8732
.eh_frame		36492
.pdr			97440

Not that much (320 bytes of .text), but still a gain. Using -O1 (as below)
shows a further slight gain (another 64 bytes of .text):

section			size
.text			966096
.eh_frame_hdr		8732
.eh_frame		36492
.pdr			97440

---

diff --git a/sysdeps/mips/Makefile b/sysdeps/mips/Makefile
index 7c1d779..1130015 100644
--- a/sysdeps/mips/Makefile
+++ b/sysdeps/mips/Makefile
@@ -83,3 +83,9 @@ $(objpfx)tst-mode-switch-2: $(shared-thread-library)
 endif
 endif
 endif
+
+# Enable branch delay slot optimization
+ASFLAGS-.o += -O1
+ASFLAGS-.os += -O1
+ASFLAGS-.op += -O1
+ASFLAGS-.oS += -O1

---

diff --git a/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h b/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
index dadfa18..96867de 100644
--- a/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
+++ b/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
@@ -159,10 +159,11 @@ union __mips_syscall_return
        register long __v0 asm ("$2");                                  \
        register long __a3 asm ("$7");                                  \
        __asm__ volatile (                                              \
-       ".set\tnoreorder\n\t"                                           \
+       ".set push\n\t"                                                 \
+       ".set noreorder\n\t"                                            \
        v0_init                                                         \
        "syscall\n\t"                                                   \
-       ".set reorder"                                                  \
+       ".set pop"                                                      \
        : "=r" (__v0), "=r" (__a3)                                      \
        : input                                                         \
        : __SYSCALL_CLOBBERS);                                          \
@@ -183,10 +184,11 @@ union __mips_syscall_return
        register long __a0 asm ("$4") = (long) (arg1);                  \
        register long __a3 asm ("$7");                                  \
        __asm__ volatile (                                              \
-       ".set\tnoreorder\n\t"                                           \
+       ".set push\n\t"                                                 \
+       ".set noreorder\n\t"                                            \
        v0_init                                                         \
        "syscall\n\t"                                                   \
-       ".set reorder"                                                  \
+       ".set pop"                                                      \
        : "=r" (__v0), "=r" (__a3)                                      \
        : input, "r" (__a0)                                             \
        : __SYSCALL_CLOBBERS);                                          \
@@ -208,10 +210,11 @@ union __mips_syscall_return
        register long __a1 asm ("$5") = (long) (arg2);                  \
        register long __a3 asm ("$7");                                  \
        __asm__ volatile (                                              \
-       ".set\tnoreorder\n\t"                                           \
+       ".set push\n\t"                                                 \
+       ".set noreorder\n\t"                                            \
        v0_init                                                         \
        "syscall\n\t"                                                   \
-       ".set\treorder"                                                 \
+       ".set pop"                                                      \
        : "=r" (__v0), "=r" (__a3)                                      \
        : input, "r" (__a0), "r" (__a1)                                 \
        : __SYSCALL_CLOBBERS);                                          \
@@ -235,10 +238,11 @@ union __mips_syscall_return
        register long __a2 asm ("$6") = (long) (arg3);                  \
        register long __a3 asm ("$7");                                  \
        __asm__ volatile (                                              \
-       ".set\tnoreorder\n\t"                                           \
+       ".set push\n\t"                                                 \
+       ".set noreorder\n\t"                                            \
        v0_init                                                         \
        "syscall\n\t"                                                   \
-       ".set\treorder"                                                 \
+       ".set pop"                                                      \
        : "=r" (__v0), "=r" (__a3)                                      \
        : input, "r" (__a0), "r" (__a1), "r" (__a2)                     \
        : __SYSCALL_CLOBBERS);                                          \
@@ -262,10 +266,11 @@ union __mips_syscall_return
        register long __a2 asm ("$6") = (long) (arg3);                  \
        register long __a3 asm ("$7") = (long) (arg4);                  \
        __asm__ volatile (                                              \
-       ".set\tnoreorder\n\t"                                           \
+       ".set push\n\t"                                                 \
+       ".set noreorder\n\t"                                            \
        v0_init                                                         \
        "syscall\n\t"                                                   \
-       ".set\treorder"                                                 \
+       ".set pop"                                                      \
        : "=r" (__v0), "+r" (__a3)                                      \
        : input, "r" (__a0), "r" (__a1), "r" (__a2)                     \
        : __SYSCALL_CLOBBERS);                                          \
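
To spell out the difference between the two forms: a trailing ".set reorder"
turns reordering on regardless of what the surrounding code had selected,
while ".set pop" restores whatever ".set push" saved.  Below is a toy C model
of that behaviour only.  It is a sketch, not how gas is implemented, and all
names in it (as_state, set_push, set_pop, old_snippet, new_snippet) are made
up for illustration.  It assumes the code around the asm snippet may already
be in noreorder mode.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the assembler's `.set' option stack; only the reorder
   flag is tracked.  All names here are hypothetical.  */
#define MAX_DEPTH 8

struct as_state
{
  bool reorder;			/* Current `.set reorder' setting.  */
  bool saved[MAX_DEPTH];	/* Settings saved by `.set push'.  */
  int depth;			/* Number of saved settings.  */
};

/* `.set push': save the current setting.  */
static void
set_push (struct as_state *s)
{
  assert (s->depth < MAX_DEPTH);
  s->saved[s->depth++] = s->reorder;
}

/* `.set pop': restore the most recently saved setting.  */
static void
set_pop (struct as_state *s)
{
  assert (s->depth > 0);
  s->reorder = s->saved[--s->depth];
}

/* Old snippet: `.set noreorder' ... `.set reorder'.  */
static void
old_snippet (struct as_state *s)
{
  s->reorder = false;
  s->reorder = true;		/* Unconditionally back to reorder.  */
}

/* New snippet: `.set push', `.set noreorder' ... `.set pop'.  */
static void
new_snippet (struct as_state *s)
{
  set_push (s);
  s->reorder = false;
  set_pop (s);			/* Back to whatever was in effect.  */
}
```

Starting from noreorder, old_snippet leaves the model in reorder mode while
new_snippet leaves it in noreorder mode with an empty stack.
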


> 
>   Maciej
> 
> ---
>  sysdeps/unix/sysv/linux/mips/mips32/Makefile                 |    4 
>  sysdeps/unix/sysv/linux/mips/mips32/Versions                 |    3 
>  sysdeps/unix/sysv/linux/mips/mips32/mips-syscall5.S          |   35 ++
>  sysdeps/unix/sysv/linux/mips/mips32/mips-syscall6.S          |   35 ++
>  sysdeps/unix/sysv/linux/mips/mips32/mips-syscall7.S          |   35 ++
>  sysdeps/unix/sysv/linux/mips/mips32/mips16/Makefile          |    6 
>  sysdeps/unix/sysv/linux/mips/mips32/mips16/Versions          |    2 
>  sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall.h  |   44 --
>  sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall0.c |    3 
>  sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall1.c |    3 
>  sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall2.c |    3 
>  sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall3.c |    3 
>  sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall4.c |    3 
>  sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall5.c |   33 --
>  sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall6.c |   33 --
>  sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall7.c |   33 --
>  sysdeps/unix/sysv/linux/mips/mips32/sysdep.h                 |  163 ++++-------
>  17 files changed, 201 insertions(+), 240 deletions(-)
> 
> glibc-aurelien-mips-o32-syscall.diff
> Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/Makefile
> ===================================================================
> --- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/Makefile	2017-08-20 21:30:35.093821957 +0100
> +++ glibc/sysdeps/unix/sysv/linux/mips/mips32/Makefile	2017-08-22 20:33:16.504589387 +0100
> @@ -3,6 +3,10 @@ ifeq ($(subdir),conform)
>  conformtest-xfail-conds += mips-o32-linux
>  endif
>  
> +ifeq ($(subdir),misc)
> +sysdep_routines += mips-syscall5 mips-syscall6 mips-syscall7
> +endif
> +
>  ifeq ($(subdir),stdlib)
>  tests += bug-getcontext-mips-gp
>  endif
> Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/Versions
> ===================================================================
> --- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/Versions	2017-08-20 21:30:35.142707136 +0100
> +++ glibc/sysdeps/unix/sysv/linux/mips/mips32/Versions	2017-08-22 20:33:16.571894966 +0100
> @@ -3,4 +3,7 @@ libc {
>      getrlimit64;
>      setrlimit64;
>    }
> +  GLIBC_PRIVATE {
> +    __mips_syscall5; __mips_syscall6; __mips_syscall7;
> +  }
>  }
> Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips-syscall5.S
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips-syscall5.S	2017-08-22 20:47:54.745857965 +0100
> @@ -0,0 +1,35 @@
> +/* MIPS syscall wrappers.
> +   Copyright (C) 2017 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library.  If not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#include <sysdep.h>
> +#include <sys/asm.h>
> +
> +	.text
> +	.set	nomips16
> +
> +/* long long __mips_syscall5 (long arg1, long arg2, long arg3, long arg4,
> +			      long arg5,
> +			      long number)  */
> +
> +ENTRY(__mips_syscall5)
> +	lw	v0, 20(sp)
> +	syscall
> +	move	v1, a3
> +	jr	ra
> +END(__mips_syscall5)
> +libc_hidden_def (__mips_syscall5)
> Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips-syscall6.S
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips-syscall6.S	2017-08-22 20:47:47.596264940 +0100
> @@ -0,0 +1,35 @@
> +/* MIPS syscall wrappers.
> +   Copyright (C) 2017 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library.  If not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#include <sysdep.h>
> +#include <sys/asm.h>
> +
> +	.text
> +	.set	nomips16
> +
> +/* long long __mips_syscall6 (long arg1, long arg2, long arg3, long arg4,
> +			      long arg5, long arg6,
> +			      long number)  */
> +
> +ENTRY(__mips_syscall6)
> +	lw	v0, 24(sp)
> +	syscall
> +	move	v1, a3
> +	jr	ra
> +END(__mips_syscall6)
> +libc_hidden_def (__mips_syscall6)
> Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips-syscall7.S
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips-syscall7.S	2017-08-22 20:47:25.781928113 +0100
> @@ -0,0 +1,35 @@
> +/* MIPS syscall wrappers.
> +   Copyright (C) 2017 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library.  If not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#include <sysdep.h>
> +#include <sys/asm.h>
> +
> +	.text
> +	.set	nomips16
> +
> +/* long long __mips_syscall7 (long arg1, long arg2, long arg3, long arg4,
> +			      long arg5, long arg6, long arg7,
> +			      long number)  */
> +
> +ENTRY(__mips_syscall7)
> +	lw	v0, 28(sp)
> +	syscall
> +	move	v1, a3
> +	jr	ra
> +END(__mips_syscall7)
> +libc_hidden_def (__mips_syscall7)
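
For anyone checking the 20/24/28 offsets in the three wrappers above: under
o32 the first four argument words travel in a0-a3 but still have a 16-byte
home area reserved at the bottom of the caller's frame, so on entry the word
argument at 0-based position N sits at 4*N(sp).  The syscall number is passed
last, after the NR real arguments, so its position is NR.  A minimal C sketch
of that arithmetic, with hypothetical helper names that are not part of the
patch, and assuming every argument is one 32-bit word as in these prototypes:

```c
#include <assert.h>

/* Hypothetical helper: byte offset from sp, on entry to an o32
   function, of the word-sized argument at 0-based position INDEX.
   Slots 0-3 are the home area reserved for a0-a3, so the first
   argument actually stored by the caller (position 4) is at 16(sp).  */
static int
o32_arg_offset (int index)
{
  return 4 * index;
}

/* Offset of the syscall number for a wrapper taking NR syscall
   arguments followed by the number.  */
static int
syscall_number_offset (int nr)
{
  assert (nr >= 4);	/* Below that the number is in a register.  */
  return o32_arg_offset (nr);
}
```

syscall_number_offset (5), (6) and (7) give 20, 24 and 28, matching the
"lw v0, N(sp)" in __mips_syscall5, __mips_syscall6 and __mips_syscall7.
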
> Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/Makefile
> ===================================================================
> --- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/Makefile	2017-08-20 21:30:35.448096086 +0100
> +++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/Makefile	2017-08-22 20:33:16.687468562 +0100
> @@ -1,13 +1,9 @@
>  ifeq ($(subdir),misc)
>  sysdep_routines += mips16-syscall0 mips16-syscall1 mips16-syscall2
> -sysdep_routines += mips16-syscall3 mips16-syscall4 mips16-syscall5
> -sysdep_routines += mips16-syscall6 mips16-syscall7
> +sysdep_routines += mips16-syscall3 mips16-syscall4
>  CFLAGS-mips16-syscall0.c += -fexceptions
>  CFLAGS-mips16-syscall1.c += -fexceptions
>  CFLAGS-mips16-syscall2.c += -fexceptions
>  CFLAGS-mips16-syscall3.c += -fexceptions
>  CFLAGS-mips16-syscall4.c += -fexceptions
> -CFLAGS-mips16-syscall5.c += -fexceptions
> -CFLAGS-mips16-syscall6.c += -fexceptions
> -CFLAGS-mips16-syscall7.c += -fexceptions
>  endif
> Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/Versions
> ===================================================================
> --- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/Versions	2017-08-20 21:30:35.487709423 +0100
> +++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/Versions	2017-08-22 20:33:16.719930829 +0100
> @@ -1,6 +1,6 @@
>  libc {
>    GLIBC_PRIVATE {
>      __mips16_syscall0; __mips16_syscall1; __mips16_syscall2; __mips16_syscall3;
> -    __mips16_syscall4; __mips16_syscall5; __mips16_syscall6; __mips16_syscall7;
> +    __mips16_syscall4;
>    }
>  }
> Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall.h
> ===================================================================
> --- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall.h	2017-08-20 21:30:35.540283348 +0100
> +++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall.h	2017-08-22 20:33:16.732035716 +0100
> @@ -19,19 +19,6 @@
>  #ifndef MIPS16_SYSCALL_H
>  #define MIPS16_SYSCALL_H 1
>  
> -#define __nomips16 __attribute__ ((nomips16))
> -
> -union __mips16_syscall_return
> -  {
> -    long long val;
> -    struct
> -      {
> -	long v0;
> -	long v1;
> -      }
> -    reg;
> -  };
> -
>  long long __nomips16 __mips16_syscall0 (long number);
>  #define __mips16_syscall0(dummy, number)				\
>  	__mips16_syscall0 ((long) (number))
> @@ -61,29 +48,22 @@ long long __nomips16 __mips16_syscall4 (
>  			   (long) (a3),					\
>  			   (long) (number))
>  
> -long long __nomips16 __mips16_syscall5 (long a0, long a1, long a2, long a3,
> -					long a4,
> -					long number);
> +/* The remaining ones use regular MIPS wrappers.  */
> +
>  #define __mips16_syscall5(a0, a1, a2, a3, a4, number)			\
> -	__mips16_syscall5 ((long) (a0), (long) (a1), (long) (a2),	\
> -			   (long) (a3), (long) (a4),			\
> -			   (long) (number))
> +	__mips_syscall5 ((long) (a0), (long) (a1), (long) (a2),		\
> +			 (long) (a3), (long) (a4),			\
> +			 (long) (number))
>  
> -long long __nomips16 __mips16_syscall6 (long a0, long a1, long a2, long a3,
> -					long a4, long a5,
> -					long number);
>  #define __mips16_syscall6(a0, a1, a2, a3, a4, a5, number)		\
> -	__mips16_syscall6 ((long) (a0), (long) (a1), (long) (a2),	\
> -			   (long) (a3), (long) (a4), (long) (a5),	\
> -			   (long) (number))
> +	__mips_syscall6 ((long) (a0), (long) (a1), (long) (a2),		\
> +			 (long) (a3), (long) (a4), (long) (a5),		\
> +			 (long) (number))
>  
> -long long __nomips16 __mips16_syscall7 (long a0, long a1, long a2, long a3,
> -					long a4, long a5, long a6,
> -					long number);
>  #define __mips16_syscall7(a0, a1, a2, a3, a4, a5, a6, number)		\
> -	__mips16_syscall7 ((long) (a0), (long) (a1), (long) (a2),	\
> -			   (long) (a3), (long) (a4), (long) (a5),	\
> -			   (long) (a6),					\
> -			   (long) (number))
> +	__mips_syscall7 ((long) (a0), (long) (a1), (long) (a2),		\
> +			 (long) (a3), (long) (a4), (long) (a5),		\
> +			 (long) (a6),					\
> +			 (long) (number))
>  
>  #endif
> Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall0.c
> ===================================================================
> --- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall0.c	2017-08-20 21:30:35.557630985 +0100
> +++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall0.c	2017-08-22 20:33:16.741195496 +0100
> @@ -17,14 +17,13 @@
>     <http://www.gnu.org/licenses/>.  */
>  
>  #include <sysdep.h>
> -#include <mips16-syscall.h>
>  
>  #undef __mips16_syscall0
>  
>  long long __nomips16
>  __mips16_syscall0 (long number)
>  {
> -  union __mips16_syscall_return ret;
> +  union __mips_syscall_return ret;
>    ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 0);
>    return ret.val;
>  }
> Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall1.c
> ===================================================================
> --- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall1.c	2017-08-20 21:30:35.658293922 +0100
> +++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall1.c	2017-08-22 20:33:16.758455153 +0100
> @@ -17,7 +17,6 @@
>     <http://www.gnu.org/licenses/>.  */
>  
>  #include <sysdep.h>
> -#include <mips16-syscall.h>
>  
>  #undef __mips16_syscall1
>  
> @@ -25,7 +24,7 @@ long long __nomips16
>  __mips16_syscall1 (long a0,
>  		   long number)
>  {
> -  union __mips16_syscall_return ret;
> +  union __mips_syscall_return ret;
>    ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 1,
>  					a0);
>    return ret.val;
> Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall2.c
> ===================================================================
> --- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall2.c	2017-08-20 21:30:35.769000365 +0100
> +++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall2.c	2017-08-22 20:33:16.768703866 +0100
> @@ -17,7 +17,6 @@
>     <http://www.gnu.org/licenses/>.  */
>  
>  #include <sysdep.h>
> -#include <mips16-syscall.h>
>  
>  #undef __mips16_syscall2
>  
> @@ -25,7 +24,7 @@ long long __nomips16
>  __mips16_syscall2 (long a0, long a1,
>  		   long number)
>  {
> -  union __mips16_syscall_return ret;
> +  union __mips_syscall_return ret;
>    ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 2,
>  					a0, a1);
>    return ret.val;
> Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall3.c
> ===================================================================
> --- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall3.c	2017-08-20 21:30:35.796726756 +0100
> +++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall3.c	2017-08-22 20:33:16.779819073 +0100
> @@ -17,7 +17,6 @@
>     <http://www.gnu.org/licenses/>.  */
>  
>  #include <sysdep.h>
> -#include <mips16-syscall.h>
>  
>  #undef __mips16_syscall3
>  
> @@ -25,7 +24,7 @@ long long __nomips16
>  __mips16_syscall3 (long a0, long a1, long a2,
>  		   long number)
>  {
> -  union __mips16_syscall_return ret;
> +  union __mips_syscall_return ret;
>    ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 3,
>  					a0, a1, a2);
>    return ret.val;
> Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall4.c
> ===================================================================
> --- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall4.c	2017-08-20 21:30:35.819263979 +0100
> +++ glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall4.c	2017-08-22 20:33:16.794914009 +0100
> @@ -17,7 +17,6 @@
>     <http://www.gnu.org/licenses/>.  */
>  
>  #include <sysdep.h>
> -#include <mips16-syscall.h>
>  
>  #undef __mips16_syscall4
>  
> @@ -25,7 +24,7 @@ long long __nomips16
>  __mips16_syscall4 (long a0, long a1, long a2, long a3,
>  		   long number)
>  {
> -  union __mips16_syscall_return ret;
> +  union __mips_syscall_return ret;
>    ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 4,
>  					a0, a1, a2, a3);
>    return ret.val;
> Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall5.c
> ===================================================================
> --- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall5.c	2017-08-20 21:30:35.822296212 +0100
> +++ /dev/null	1970-01-01 00:00:00.000000000 +0000
> @@ -1,33 +0,0 @@
> -/* MIPS16 syscall wrappers.
> -   Copyright (C) 2013-2017 Free Software Foundation, Inc.
> -   This file is part of the GNU C Library.
> -
> -   The GNU C Library is free software; you can redistribute it and/or
> -   modify it under the terms of the GNU Lesser General Public
> -   License as published by the Free Software Foundation; either
> -   version 2.1 of the License, or (at your option) any later version.
> -
> -   The GNU C Library is distributed in the hope that it will be useful,
> -   but WITHOUT ANY WARRANTY; without even the implied warranty of
> -   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> -   Lesser General Public License for more details.
> -
> -   You should have received a copy of the GNU Lesser General Public
> -   License along with the GNU C Library; if not, see
> -   <http://www.gnu.org/licenses/>.  */
> -
> -#include <sysdep.h>
> -#include <mips16-syscall.h>
> -
> -#undef __mips16_syscall5
> -
> -long long __nomips16
> -__mips16_syscall5 (long a0, long a1, long a2, long a3,
> -		   long a4,
> -		   long number)
> -{
> -  union __mips16_syscall_return ret;
> -  ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 5,
> -					a0, a1, a2, a3, a4);
> -  return ret.val;
> -}
> Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall6.c
> ===================================================================
> --- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall6.c	2017-08-20 21:30:35.838512882 +0100
> +++ /dev/null	1970-01-01 00:00:00.000000000 +0000
> @@ -1,33 +0,0 @@
> -/* MIPS16 syscall wrappers.
> -   Copyright (C) 2013-2017 Free Software Foundation, Inc.
> -   This file is part of the GNU C Library.
> -
> -   The GNU C Library is free software; you can redistribute it and/or
> -   modify it under the terms of the GNU Lesser General Public
> -   License as published by the Free Software Foundation; either
> -   version 2.1 of the License, or (at your option) any later version.
> -
> -   The GNU C Library is distributed in the hope that it will be useful,
> -   but WITHOUT ANY WARRANTY; without even the implied warranty of
> -   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> -   Lesser General Public License for more details.
> -
> -   You should have received a copy of the GNU Lesser General Public
> -   License along with the GNU C Library; if not, see
> -   <http://www.gnu.org/licenses/>.  */
> -
> -#include <sysdep.h>
> -#include <mips16-syscall.h>
> -
> -#undef __mips16_syscall6
> -
> -long long __nomips16
> -__mips16_syscall6 (long a0, long a1, long a2, long a3,
> -		   long a4, long a5,
> -		   long number)
> -{
> -  union __mips16_syscall_return ret;
> -  ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 6,
> -					a0, a1, a2, a3, a4, a5);
> -  return ret.val;
> -}
> Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall7.c
> ===================================================================
> --- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall7.c	2017-08-20 21:30:35.846816836 +0100
> +++ /dev/null	1970-01-01 00:00:00.000000000 +0000
> @@ -1,33 +0,0 @@
> -/* MIPS16 syscall wrappers.
> -   Copyright (C) 2013-2017 Free Software Foundation, Inc.
> -   This file is part of the GNU C Library.
> -
> -   The GNU C Library is free software; you can redistribute it and/or
> -   modify it under the terms of the GNU Lesser General Public
> -   License as published by the Free Software Foundation; either
> -   version 2.1 of the License, or (at your option) any later version.
> -
> -   The GNU C Library is distributed in the hope that it will be useful,
> -   but WITHOUT ANY WARRANTY; without even the implied warranty of
> -   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> -   Lesser General Public License for more details.
> -
> -   You should have received a copy of the GNU Lesser General Public
> -   License along with the GNU C Library; if not, see
> -   <http://www.gnu.org/licenses/>.  */
> -
> -#include <sysdep.h>
> -#include <mips16-syscall.h>
> -
> -#undef __mips16_syscall7
> -
> -long long __nomips16
> -__mips16_syscall7 (long a0, long a1, long a2, long a3,
> -		   long a4, long a5, long a6,
> -		   long number)
> -{
> -  union __mips16_syscall_return ret;
> -  ret.reg.v0 = INTERNAL_SYSCALL_MIPS16 (number, ret.reg.v1, 7,
> -					a0, a1, a2, a3, a4, a5, a6);
> -  return ret.val;
> -}
> Index: glibc/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
> ===================================================================
> --- glibc.orig/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h	2017-08-20 21:30:35.944796826 +0100
> +++ glibc/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h	2017-08-22 23:13:25.010701782 +0100
> @@ -98,6 +98,19 @@
>  #undef INTERNAL_SYSCALL
>  #undef INTERNAL_SYSCALL_NCS
>  
> +#define __nomips16 __attribute__ ((nomips16))
> +
> +union __mips_syscall_return
> +  {
> +    long long val;
> +    struct
> +      {
> +	long v0;
> +	long v1;
> +      }
> +    reg;
> +  };
> +
>  #ifdef __mips16
>  /* There's no MIPS16 syscall instruction, so we go through out-of-line
>     standard MIPS wrappers.  These do use inline snippets below though,
> @@ -112,7 +125,7 @@
>  
>  # define INTERNAL_SYSCALL_NCS(number, err, nr, args...)			\
>  ({									\
> -	union __mips16_syscall_return _sc_ret;				\
> +	union __mips_syscall_return _sc_ret;				\
>  	_sc_ret.val = __mips16_syscall##nr (args, number);		\
>  	err = _sc_ret.reg.v1;						\
>  	_sc_ret.reg.v0;							\
> @@ -121,13 +134,13 @@
>  # define INTERNAL_SYSCALL_MIPS16(number, err, nr, args...)		\
>  	internal_syscall##nr ("lw\t%0, %2\n\t",				\
>  			      "R" (number),				\
> -			      0, err, args)
> +			      number, err, args)
>  
>  #else /* !__mips16 */
>  # define INTERNAL_SYSCALL(name, err, nr, args...)			\
>  	internal_syscall##nr ("li\t%0, %2\t\t\t# " #name "\n\t",	\
>  			      "IK" (SYS_ify (name)),			\
> -			      0, err, args)
> +			      SYS_ify (name), err, args)
>  
>  # define INTERNAL_SYSCALL_NCS(number, err, nr, args...)			\
>  	internal_syscall##nr (MOVE32 "\t%0, %2\n\t",			\
> @@ -262,110 +275,74 @@
>  	_sys_result;							\
>  })
>  
> -/* We need to use a frame pointer for the functions in which we
> -   adjust $sp around the syscall, or debug information and unwind
> -   information will be $sp relative and thus wrong during the syscall.  As
> -   of GCC 4.7, this is sufficient.  */
> -#define FORCE_FRAME_POINTER						\
> -  void *volatile __fp_force __attribute__ ((unused)) = alloca (4)
> +/* Standalone MIPS wrappers used for 5, 6, and 7 argument syscalls,
> +   which require stack arguments.  We rely on the compiler arranging
> +   wrapper's arguments according to the MIPS o32 function calling
> +   convention, which is reused by syscalls, except for the syscall
> +   number passed and the error flag returned (taken care of in the
> +   wrapper called).  This relieves us from relying on non-guaranteed
> +   compiler specifics required for the stack arguments to be pushed,
> +   which would be the case if these syscalls were inlined.  */
> +
> +long long __nomips16 __mips_syscall5 (long arg1, long arg2, long arg3,
> +				      long arg4, long arg5,
> +				      long number);
> +libc_hidden_proto (__mips_syscall5, nomips16)
>  
>  #define internal_syscall5(v0_init, input, number, err,			\
>  			  arg1, arg2, arg3, arg4, arg5)			\
>  ({									\
> -	long _sys_result;						\
> -									\
> -	FORCE_FRAME_POINTER;						\
> -	{								\
> -	register long __s0 asm ("$16") __attribute__ ((unused))		\
> -	  = (number);							\
> -	register long __v0 asm ("$2");					\
> -	register long __a0 asm ("$4") = (long) (arg1);			\
> -	register long __a1 asm ("$5") = (long) (arg2);			\
> -	register long __a2 asm ("$6") = (long) (arg3);			\
> -	register long __a3 asm ("$7") = (long) (arg4);			\
> -	__asm__ volatile (						\
> -	".set\tnoreorder\n\t"						\
> -	"subu\t$29, 32\n\t"						\
> -	"sw\t%6, 16($29)\n\t"						\
> -	v0_init								\
> -	"syscall\n\t"							\
> -	"addiu\t$29, 32\n\t"						\
> -	".set\treorder"							\
> -	: "=r" (__v0), "+r" (__a3)					\
> -	: input, "r" (__a0), "r" (__a1), "r" (__a2),			\
> -	  "r" ((long) (arg5))						\
> -	: __SYSCALL_CLOBBERS);						\
> -	err = __a3;							\
> -	_sys_result = __v0;						\
> -	}								\
> -	_sys_result;							\
> +	union __mips_syscall_return _sc_ret;				\
> +	_sc_ret.val = __mips_syscall5 ((long) (arg1),			\
> +				       (long) (arg2),			\
> +				       (long) (arg3),			\
> +				       (long) (arg4),			\
> +				       (long) (arg5),			\
> +				       (long) (number));		\
> +	err = _sc_ret.reg.v1;						\
> +	_sc_ret.reg.v0;							\
>  })
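
The v0/v1 packing used by the macro above can be tried on any host with a
portable model.  This is an illustration only: int32_t/int64_t stand in for
the o32 `long'/`long long' so the sizes hold everywhere, and sc_return,
mock_wrapper and mock_internal_syscall are hypothetical stand-ins for
union __mips_syscall_return, __mips_syscall5 and internal_syscall5.  On real
o32 the scheme relies on a 64-bit return value coming back in the v0/v1
register pair, with v0 holding the word that lands at the lower address.

```c
#include <assert.h>
#include <stdint.h>

/* Portable stand-in for union __mips_syscall_return.  */
union sc_return
{
  int64_t val;
  struct
  {
    int32_t v0;		/* Syscall result.  */
    int32_t v1;		/* Error flag, copied from a3 by the wrapper.  */
  } reg;
};

/* Hypothetical mock of an out-of-line wrapper: packs a result and an
   error flag the way the assembly leaves them in v0 and v1.  */
static int64_t
mock_wrapper (int32_t result, int32_t error_flag)
{
  union sc_return ret;
  /* The two words must exactly cover the 64-bit value.  */
  assert (sizeof (union sc_return) == sizeof (int64_t));
  ret.reg.v0 = result;
  ret.reg.v1 = error_flag;
  return ret.val;
}

/* Caller side, mirroring the body of the new internal_syscall5:
   unpack the error flag into *ERR and return the result.  */
static int32_t
mock_internal_syscall (int32_t result, int32_t error_flag, int32_t *err)
{
  union sc_return sc_ret;
  sc_ret.val = mock_wrapper (result, error_flag);
  *err = sc_ret.reg.v1;
  return sc_ret.reg.v0;
}
```

Because both sides go through the same union, the round trip does not depend
on host endianness.
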
>  
> +long long __nomips16 __mips_syscall6 (long arg1, long arg2, long arg3,
> +				      long arg4, long arg5, long arg6,
> +				      long number);
> +libc_hidden_proto (__mips_syscall6, nomips16)
> +
>  #define internal_syscall6(v0_init, input, number, err,			\
>  			  arg1, arg2, arg3, arg4, arg5, arg6)		\
>  ({									\
> -	long _sys_result;						\
> -									\
> -	FORCE_FRAME_POINTER;						\
> -	{								\
> -	register long __s0 asm ("$16") __attribute__ ((unused))		\
> -	  = (number);							\
> -	register long __v0 asm ("$2");					\
> -	register long __a0 asm ("$4") = (long) (arg1);			\
> -	register long __a1 asm ("$5") = (long) (arg2);			\
> -	register long __a2 asm ("$6") = (long) (arg3);			\
> -	register long __a3 asm ("$7") = (long) (arg4);			\
> -	__asm__ volatile (						\
> -	".set\tnoreorder\n\t"						\
> -	"subu\t$29, 32\n\t"						\
> -	"sw\t%6, 16($29)\n\t"						\
> -	"sw\t%7, 20($29)\n\t"						\
> -	v0_init								\
> -	"syscall\n\t"							\
> -	"addiu\t$29, 32\n\t"						\
> -	".set\treorder"							\
> -	: "=r" (__v0), "+r" (__a3)					\
> -	: input, "r" (__a0), "r" (__a1), "r" (__a2),			\
> -	  "r" ((long) (arg5)), "r" ((long) (arg6))			\
> -	: __SYSCALL_CLOBBERS);						\
> -	err = __a3;							\
> -	_sys_result = __v0;						\
> -	}								\
> -	_sys_result;							\
> +	union __mips_syscall_return _sc_ret;				\
> +	_sc_ret.val = __mips_syscall6 ((long) (arg1),			\
> +				       (long) (arg2),			\
> +				       (long) (arg3),			\
> +				       (long) (arg4),			\
> +				       (long) (arg5),			\
> +				       (long) (arg6),			\
> +				       (long) (number));		\
> +	err = _sc_ret.reg.v1;						\
> +	_sc_ret.reg.v0;							\
>  })
>  
> +long long __nomips16 __mips_syscall7 (long arg1, long arg2, long arg3,
> +				      long arg4, long arg5, long arg6,
> +				      long arg7,
> +				      long number);
> +libc_hidden_proto (__mips_syscall7, nomips16)
> +
>  #define internal_syscall7(v0_init, input, number, err,			\
>  			  arg1, arg2, arg3, arg4, arg5, arg6, arg7)	\
>  ({									\
> -	long _sys_result;						\
> -									\
> -	FORCE_FRAME_POINTER;						\
> -	{								\
> -	register long __s0 asm ("$16") __attribute__ ((unused))		\
> -	  = (number);							\
> -	register long __v0 asm ("$2");					\
> -	register long __a0 asm ("$4") = (long) (arg1);			\
> -	register long __a1 asm ("$5") = (long) (arg2);			\
> -	register long __a2 asm ("$6") = (long) (arg3);			\
> -	register long __a3 asm ("$7") = (long) (arg4);			\
> -	__asm__ volatile (						\
> -	".set\tnoreorder\n\t"						\
> -	"subu\t$29, 32\n\t"						\
> -	"sw\t%6, 16($29)\n\t"						\
> -	"sw\t%7, 20($29)\n\t"						\
> -	"sw\t%8, 24($29)\n\t"						\
> -	v0_init								\
> -	"syscall\n\t"							\
> -	"addiu\t$29, 32\n\t"						\
> -	".set\treorder"							\
> -	: "=r" (__v0), "+r" (__a3)					\
> -	: input, "r" (__a0), "r" (__a1), "r" (__a2),			\
> -	  "r" ((long) (arg5)), "r" ((long) (arg6)), "r" ((long) (arg7))	\
> -	: __SYSCALL_CLOBBERS);						\
> -	err = __a3;							\
> -	_sys_result = __v0;						\
> -	}								\
> -	_sys_result;							\
> +	union __mips_syscall_return _sc_ret;				\
> +	_sc_ret.val = __mips_syscall7 ((long) (arg1),			\
> +				       (long) (arg2),			\
> +				       (long) (arg3),			\
> +				       (long) (arg4),			\
> +				       (long) (arg5),			\
> +				       (long) (arg6),			\
> +				       (long) (arg7),			\
> +				       (long) (number));		\
> +	err = _sc_ret.reg.v1;						\
> +	_sc_ret.reg.v0;							\
>  })
>  
>  #define __SYSCALL_CLOBBERS "$1", "$3", "$8", "$9", "$10", "$11", "$12", "$13", \
> 
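
The rewritten macros above all follow the same pattern: the out-of-line helper returns one 64-bit value, and the macro splits it into the result (`reg.v0`) and the error flag (`reg.v1`) through `union __mips_syscall_return`. A minimal, portable sketch of that splitting step follows. Every `sketch_` name is hypothetical, the fake helper only stands in for `__mips_syscall5`, and fixed-width types replace `long` (32 bits wide on o32) so the sketch behaves the same on any host.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the union used by the macros: one 64-bit
   value overlaid with two 32-bit halves.  */
union sketch_syscall_return
{
  int64_t val;
  struct
  {
    int32_t v0;   /* Syscall result.  */
    int32_t v1;   /* Error flag.  */
  } reg;
};

static_assert (sizeof (union sketch_syscall_return) == sizeof (int64_t),
               "both views must cover the same 8 bytes");

/* Stand-in for an out-of-line helper such as __mips_syscall5: it packs
   a result and an error flag into a single 64-bit return value.  */
static int64_t
sketch_fake_syscall (int32_t result, int32_t error_flag)
{
  union sketch_syscall_return ret;

  ret.reg.v0 = result;
  ret.reg.v1 = error_flag;
  return ret.val;
}

/* What the macro body does: call the helper, store the error flag
   through ERR and yield the result.  */
static int32_t
sketch_call_and_split (int32_t result, int32_t error_flag, int32_t *err)
{
  union sketch_syscall_return sc_ret;

  sc_ret.val = sketch_fake_syscall (result, error_flag);
  *err = sc_ret.reg.v1;
  return sc_ret.reg.v0;
}
```

Because the value is written and read through the same union, the sketch does not depend on host endianness.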


* Re: [PATCH v5] [BZ #21956] MIPS/o32: Fix internal_syscall5/6/7
  2017-08-24 13:27                                                 ` [PATCH v5] [BZ #21956] MIPS/o32: Fix internal_syscall5/6/7 Maciej W. Rozycki
  2017-08-24 20:08                                                   ` Adhemerval Zanella
@ 2017-08-26  8:00                                                   ` Aurelien Jarno
  1 sibling, 0 replies; 53+ messages in thread
From: Aurelien Jarno @ 2017-08-26  8:00 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Adhemerval Zanella, Joseph Myers, libc-alpha

On 2017-08-24 14:26, Maciej W. Rozycki wrote:
> From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
> 
> Fix a commit cc25c8b4c119 ("New pthread rwlock that is more scalable.") 
> regression and prevent uncontrolled stack space usage from happening 
> when a 5-, 6- or 7-argument syscall wrapper is placed in a loop.
> 
> The cause of the problem is the use of `alloca' in regular MIPS/Linux 
> wrappers to force the use of the frame pointer register in any function 
> using one or more of these wrappers.  Using the frame pointer register 
> is required so as not to break frame unwinding as the stack pointer 
> is lowered within the inline asm used by these wrappers to make room for 
> the stack arguments, which 5-, 6- and 7-argument syscalls use with the 
> o32 ABI.
> 
> The regular MIPS/Linux wrappers are macros however, expanded inline, and 
> stack allocations made with `alloca' are not discarded until the return 
> of the function they are made in.  Consequently if called in a loop, 
> then virtual memory is wasted, and if the loop goes through enough 
> iterations, then ultimately available memory can get exhausted causing 
> the program to crash.
> 
> Address the issue by replacing the inline code with standalone assembly 
> functions, which rely on the compiler arranging syscall arguments 
> according to the o32 function calling convention, which MIPS/Linux 
> syscalls also use, except for the syscall number passed and the error 
> flag returned.  This way there is no need to fiddle with the stack 
> pointer anymore and all that has to be handled in the new standalone 
> functions is the special handling of the syscall number and the error 
> flag.
> 
> Redirect 5-, 6- or 7-argument MIPS16/Linux syscall wrappers to these new 
> functions as well, so as to avoid an unnecessary double call the 
> existing wrappers would cause with the new arrangement.
> 
> 2017-08-24  Adhemerval Zanella  <adhemerval.zanella@linaro.org>
> 	    Aurelien Jarno  <aurelien@aurel32.net>
> 	    Maciej W. Rozycki  <macro@imgtec.com>
> 
> 	[BZ #21956]
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips16/Makefile
> 	[subdir = misc] (sysdep_routines): Remove `mips16-syscall5',
> 	`mips16-syscall6' and `mips16-syscall7'.
> 	(CFLAGS-mips16-syscall5.c, CFLAGS-mips16-syscall6.c)
> 	(CFLAGS-mips16-syscall7.c): Remove.
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips16/Versions (libc):
> 	Remove `__mips16_syscall5', `__mips16_syscall6' and 
> 	`__mips16_syscall7'.
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall0.c
> 	(__mips16_syscall0): Rename `__mips16_syscall_return' to 
> 	`__mips_syscall_return'.
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall1.c
> 	(__mips16_syscall1): Likewise.
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall2.c
> 	(__mips16_syscall2): Likewise.
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall3.c
> 	(__mips16_syscall3): Likewise.
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall4.c
> 	(__mips16_syscall4): Likewise.
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall5.c:
> 	Remove.
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall6.c:
> 	Remove.
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall7.c:
> 	Remove.
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall.h
> 	(__mips16_syscall5): Expand to `__mips_syscall5' rather than 
> 	`__mips16_syscall5'.  Remove prototype.
> 	(__mips16_syscall6): Expand to `__mips_syscall6' rather than
> 	`__mips16_syscall6'.  Remove prototype.
> 	(__mips16_syscall7): Expand to `__mips_syscall7' rather than
> 	`__mips16_syscall7'.  Remove prototype.
> 	(__nomips16, __mips16_syscall_return): Move to...
> 	* sysdeps/unix/sysv/linux/mips/mips32/sysdep.h 
> 	(__nomips16, __mips_syscall_return): ... here.
> 	[__mips16] (INTERNAL_SYSCALL_NCS): Rename 
> 	`__mips16_syscall_return' to `__mips_syscall_return'.
> 	[__mips16] (INTERNAL_SYSCALL_MIPS16): Pass `number' to
> 	`internal_syscall##nr'.
> 	[!__mips16] (INTERNAL_SYSCALL): Pass `SYS_ify (name)' to
> 	`internal_syscall##nr'.
> 	(FORCE_FRAME_POINTER): Remove.
> 	(__mips_syscall5): New prototype.
> 	(internal_syscall5): Rewrite to call `__mips_syscall5'.
> 	(__mips_syscall6): New prototype.
> 	(internal_syscall6): Rewrite to call `__mips_syscall6'.
> 	(__mips_syscall7): New prototype.
> 	(internal_syscall7): Rewrite to call `__mips_syscall7'.
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips-syscall5.S: New file.
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips-syscall6.S: New file.
> 	* sysdeps/unix/sysv/linux/mips/mips32/mips-syscall7.S: New file.
> 	* sysdeps/unix/sysv/linux/mips/mips32/Makefile [subdir = misc]
> 	(sysdep_routines): Add libc-do-syscall.
> 	* sysdeps/unix/sysv/linux/mips/mips32/Versions (libc): Add
> 	`__mips_syscall5', `__mips_syscall6' and `__mips_syscall7'.
> 
> ---
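
The stack growth described in the commit message can be reproduced on any host with a small sketch. It assumes a GCC-compatible compiler and `<alloca.h>`. The macro and the `sketch_` functions are hypothetical stand-ins, not the glibc definitions. The second half only illustrates that an allocation made in a separate function is released on return; the patch itself goes further and removes the need for `alloca' altogether.

```c
#include <alloca.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a wrapper macro that calls alloca at the
   point of expansion.  */
#define SKETCH_INLINE_WRAPPER() ((volatile char *) alloca (16))

/* Distance between the first and the last block handed out when the
   macro is expanded inside a loop.  */
static size_t
sketch_growth_inline (int iterations)
{
  uintptr_t first = 0, last = 0;

  assert (iterations > 0);
  for (int i = 0; i < iterations; i++)
    {
      volatile char *p = SKETCH_INLINE_WRAPPER ();
      p[0] = 0;                 /* Touch the block so it is really used.  */
      if (i == 0)
        first = (uintptr_t) p;
      last = (uintptr_t) p;
    }
  return first > last ? first - last : last - first;
}

/* The same allocation made in a separate, non-inlined function: the
   block is released when the function returns.  */
static uintptr_t __attribute__ ((noinline))
sketch_out_of_line_wrapper (void)
{
  volatile char *p = (volatile char *) alloca (16);

  p[0] = 0;
  return (uintptr_t) p;         /* Only the address, never dereferenced.  */
}

static size_t
sketch_growth_out_of_line (int iterations)
{
  uintptr_t first = 0, last = 0;

  assert (iterations > 0);
  for (int i = 0; i < iterations; i++)
    {
      uintptr_t p = sketch_out_of_line_wrapper ();
      if (i == 0)
        first = p;
      last = p;
    }
  return first > last ? first - last : last - first;
}
```

With the macro expanded inline, every iteration claims a fresh block, so the distance grows with the iteration count. With the out-of-line helper, each call starts from the same stack pointer and the same address is reused.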

Thanks for this new version, it looks good to me.

Aurelien

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net


* Re: [PATCH v5] [BZ #21956] MIPS/o32: Fix internal_syscall5/6/7
  2017-08-24 20:08                                                   ` Adhemerval Zanella
@ 2017-08-29 18:00                                                     ` Maciej W. Rozycki
  2017-08-30 20:33                                                       ` Aurelien Jarno
  2017-09-08 11:15                                                     ` MIPS: Standalone/inline assembly issues (was: MIPS/o32: Fix internal_syscall5/6/7) Maciej W. Rozycki
  1 sibling, 1 reply; 53+ messages in thread
From: Maciej W. Rozycki @ 2017-08-29 18:00 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: Aurelien Jarno, Joseph Myers, libc-alpha

On Thu, 24 Aug 2017, Adhemerval Zanella wrote:

> >  This update has passed regular MIPS and MIPS16 o32 regression testing, 
> > with no regressions.  OK to apply?
> 
> Ok from my side.

 I have applied it to master now; thanks to everyone involved.  As a fix 
for a functional regression (as demonstrated by `tst-rwlock15') do we want 
to have it in 2.25 and 2.26 as well?

 I'll continue the discussion about assembly optimisation separately.

  Maciej


* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-21 10:49                                             ` Maciej W. Rozycki
  2017-08-21 14:30                                               ` Adhemerval Zanella
  2017-08-22  8:25                                               ` [PATCH] mips/o32: fix internal_syscall5/6/7 Aurelien Jarno
@ 2017-08-30 15:35                                               ` Maciej W. Rozycki
  2017-08-30 20:33                                                 ` Aurelien Jarno
  2 siblings, 1 reply; 53+ messages in thread
From: Maciej W. Rozycki @ 2017-08-30 15:35 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: Adhemerval Zanella, Joseph Myers, libc-alpha

On Mon, 21 Aug 2017, Maciej W. Rozycki wrote:

>  Thanks.  There's indeed a bug in GAS, a MIPS16 path of execution has been 
> missed in the handling of this option.  I have a preliminary fix, however 
> I yet have to prepare test suite cases (originally the option was 
> mistakenly only covered by regular MIPS and microMIPS testing, which is 
> clearly why the MIPS16 case has been missed).  I expect this fix to be 
> included in the upcoming 2.29.1 release, and also backported to 2.28 
> (although no new 2.28 release is scheduled).

 The fix has passed QA and has now been applied as commit 278fcf38584d 
("MIPS/GAS: Also respect `-mignore-branch-isa' with MIPS16 code") to 
master, and backported to 2.29 and 2.28.  A binutils 2.29.1 release is 
scheduled for mid-September.

  Maciej


* Re: [PATCH] mips/o32: fix internal_syscall5/6/7
  2017-08-30 15:35                                               ` Maciej W. Rozycki
@ 2017-08-30 20:33                                                 ` Aurelien Jarno
  0 siblings, 0 replies; 53+ messages in thread
From: Aurelien Jarno @ 2017-08-30 20:33 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Adhemerval Zanella, Joseph Myers, libc-alpha

On 2017-08-30 16:35, Maciej W. Rozycki wrote:
> On Mon, 21 Aug 2017, Maciej W. Rozycki wrote:
> 
> >  Thanks.  There's indeed a bug in GAS, a MIPS16 path of execution has been 
> > missed in the handling of this option.  I have a preliminary fix, however 
> > I yet have to prepare test suite cases (originally the option was 
> > mistakenly only covered by regular MIPS and microMIPS testing, which is 
> > clearly why the MIPS16 case has been missed).  I expect this fix to be 
> > included in the upcoming 2.29.1 release, and also backported to 2.28 
> > (although no new 2.28 release is scheduled).
> 
>  The fix has passed QA and has now been applied as commit 278fcf38584d 
> ("MIPS/GAS: Also respect `-mignore-branch-isa' with MIPS16 code") to 
> master, and backported to 2.29 and 2.28.  A binutils 2.29.1 release is 
> scheduled for mid-September.

Thanks!

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net


* Re: [PATCH v5] [BZ #21956] MIPS/o32: Fix internal_syscall5/6/7
  2017-08-29 18:00                                                     ` Maciej W. Rozycki
@ 2017-08-30 20:33                                                       ` Aurelien Jarno
  0 siblings, 0 replies; 53+ messages in thread
From: Aurelien Jarno @ 2017-08-30 20:33 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Adhemerval Zanella, Joseph Myers, libc-alpha

On 2017-08-29 19:00, Maciej W. Rozycki wrote:
> On Thu, 24 Aug 2017, Adhemerval Zanella wrote:
> 
> > >  This update has passed regular MIPS and MIPS16 o32 regression testing, 
> > > with no regressions.  OK to apply?
> > 
> > Ok from my side.
> 
>  I have applied it to master now; thanks to everyone involved.  As a fix 
> for a functional regression (as demonstrated by `tst-rwlock15') do we want 
> to have it in 2.25 and 2.26 as well?

That might be a good idea. At least Debian is now using this patch, and
we regularly pull from the upstream stable branch.

I can take care of that in the next few days.

Aurelien

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net


* MIPS: Standalone/inline assembly issues (was: MIPS/o32: Fix internal_syscall5/6/7)
  2017-08-24 20:08                                                   ` Adhemerval Zanella
  2017-08-29 18:00                                                     ` Maciej W. Rozycki
@ 2017-09-08 11:15                                                     ` Maciej W. Rozycki
  1 sibling, 0 replies; 53+ messages in thread
From: Maciej W. Rozycki @ 2017-09-08 11:15 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: Aurelien Jarno, Joseph Myers, libc-alpha

Hi Adhemerval,

 Following up on the issues with MIPS assembly, both standalone sources 
and (as you have observed) inline pieces.

On Thu, 24 Aug 2017, Adhemerval Zanella wrote:

> >  NB while looking at it I've noticed we do not pass any `-O' optimisation 
> > flag to the GCC driver while building .S files.  That in turn means no GAS 
> > branch optimisation is enabled and consequently branch delay slots are not 
> > scheduled in `reorder' code by the assembler (the MIPS/GCC `asm' spec has 
> > this `%{noasmopt:-O0; O0|fno-delayed-branch:-O1; O*:-O2; :-O1}'), causing 
> > delay slots wasted with a NOP where a preceding instruction could be moved 
> > instead and save some code space.  It can be easily observed by comparing 
> > code in the compiler-generated MIPS16 wrappers vs the new standalone 
> > assembly regular MIPS (and microMIPS) wrappers.
> > 
> >  Was this a deliberate choice made sometime to have greater control over 
> > code produced or just an accidental oversight?
> 
> I am not sure if it was deliberate, but my guess is that it is not really
> an issue for most architectures, since AFAIK passing any '-O' optimization
> flag when building .S files usually does not turn on any extra flags
> (SUBTARGET_ASM_SPEC seems to be defined only for a handful of
> architectures in GCC).
> 
> For MIPS I think we can set ASFLAGS to -O1 since it should enable the
> required optimization, unless there is a specific GAS option to enable
> it (which I couldn't find).  Another option would be to filter CFLAGS
> and extract the optimization level to use for ASFLAGS, but I think for
> this specific issue it would add unnecessary complexity.

 I wonder if just passing base CFLAGS unchanged would do?

 Anyone please correct me if I am wrong, but I believe that when building 
an assembly source the GCC driver is supposed to take all the usual 
options it accepts for C, and just silently discard those which have no 
effect at the assembly stage.  This is in principle so that you can do 
say:

$ gcc $CFLAGS -S foo.c
$ gcc $CFLAGS -c foo.s

and get exactly the same result (sans having the foo.s byproduct) as with:

$ gcc $CFLAGS -c foo.c

We already have to pass flags to GAS which affect multilib selection, such 
as `-EL' or `-mabi=', so why bother filtering?

> Another thing I noticed is that a further possible optimization for size
> would be to use ".set push", ".set noreorder", and ".set pop" instead of
> the current code, as below.
> 
[...]
> 
> diff --git a/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h b/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
> index dadfa18..96867de 100644
> --- a/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
> +++ b/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
> @@ -159,10 +159,11 @@ union __mips_syscall_return
>         register long __v0 asm ("$2");                                  \
>         register long __a3 asm ("$7");                                  \
>         __asm__ volatile (                                              \
> -       ".set\tnoreorder\n\t"                                           \
> +       ".set push\n\t"                                                 \
> +       ".set noreorder\n\t"                                            \
>         v0_init                                                         \
>         "syscall\n\t"                                                   \
> -       ".set reorder"                                                  \
> +       ".set pop"                                                      \
>         : "=r" (__v0), "=r" (__a3)                                      \
>         : input                                                         \
>         : __SYSCALL_CLOBBERS);                                          \

 [Etc...] If this makes any difference (and your size change observations 
do not merely come from updated $ASFLAGS), then this is very weird and I 
think it needs investigating.

 The thing is, for backwards compatibility GCC sets the `reorder' mode at 
the place where an inline asm is expanded (unless it has been set already), 
as this was GCC's mode of assembly generation before support for explicit 
relocations was added back in the early to mid 2000s.  If GCC needs to 
switch back to the `noreorder' mode at the conclusion of the asm, it has 
to do the switch itself.

 So there should actually be no difference in machine code generated 
between current code and your patched version of sysdep.h.  If there is, 
then something hairy is going on.  I'll try experimenting with your patch 
and will check what's happening here.

  Maciej


Thread overview: 53+ messages
2017-08-15 11:53 [PATCH] mips/o32: fix internal_syscall5/6/7 Aurelien Jarno
2017-08-15 12:03 ` Andreas Schwab
2017-08-15 13:06   ` Adhemerval Zanella
2017-08-15 16:18     ` Aurelien Jarno
2017-08-15 16:26     ` Joseph Myers
2017-08-15 19:34       ` Aurelien Jarno
2017-08-15 19:54         ` Joseph Myers
2017-08-15 20:09           ` Aurelien Jarno
2017-08-15 20:21             ` Joseph Myers
2017-08-15 20:41               ` Aurelien Jarno
2017-08-16 13:26               ` Maciej W. Rozycki
2017-08-16 13:44                 ` Joseph Myers
2017-08-16 14:13                   ` Adhemerval Zanella
2017-08-16 14:47                     ` Maciej W. Rozycki
2017-08-16 14:54                       ` Adhemerval Zanella
2017-08-16 16:12                         ` Aurelien Jarno
2017-08-16 21:08                         ` Aurelien Jarno
2017-08-16 22:11                           ` Maciej W. Rozycki
2017-08-16 15:18                     ` Aurelien Jarno
2017-08-16 21:15                     ` Aurelien Jarno
2017-08-17 13:33                       ` Adhemerval Zanella
2017-08-16 14:32                   ` Maciej W. Rozycki
2017-08-16 14:47                     ` Joseph Myers
2017-08-17 16:17                       ` Maciej W. Rozycki
2017-08-17 17:25                         ` Adhemerval Zanella
2017-08-17 17:32                           ` Joseph Myers
2017-08-17 20:34                           ` Maciej W. Rozycki
2017-08-17 21:09                             ` Adhemerval Zanella
2017-08-17 21:20                               ` Aurelien Jarno
2017-08-17 22:05                                 ` Adhemerval Zanella
2017-08-17 22:34                                 ` Maciej W. Rozycki
2017-08-18  7:16                                   ` Aurelien Jarno
2017-08-18  9:32                                     ` Maciej W. Rozycki
2017-08-18 17:45                                       ` Aurelien Jarno
2017-08-18 22:27                                         ` Maciej W. Rozycki
2017-08-19 12:45                                           ` Aurelien Jarno
2017-08-21 10:49                                             ` Maciej W. Rozycki
2017-08-21 14:30                                               ` Adhemerval Zanella
2017-08-24 13:27                                                 ` [PATCH v5] [BZ #21956] MIPS/o32: Fix internal_syscall5/6/7 Maciej W. Rozycki
2017-08-24 20:08                                                   ` Adhemerval Zanella
2017-08-29 18:00                                                     ` Maciej W. Rozycki
2017-08-30 20:33                                                       ` Aurelien Jarno
2017-09-08 11:15                                                     ` MIPS: Standalone/inline assembly issues (was: MIPS/o32: Fix internal_syscall5/6/7) Maciej W. Rozycki
2017-08-26  8:00                                                   ` [PATCH v5] [BZ #21956] MIPS/o32: Fix internal_syscall5/6/7 Aurelien Jarno
2017-08-22  8:25                                               ` [PATCH] mips/o32: fix internal_syscall5/6/7 Aurelien Jarno
2017-08-22 10:07                                                 ` Maciej W. Rozycki
2017-08-30 15:35                                               ` Maciej W. Rozycki
2017-08-30 20:33                                                 ` Aurelien Jarno
2017-08-17 21:34                               ` Aurelien Jarno
2017-08-17 21:47                               ` Maciej W. Rozycki
2017-08-17 18:18                         ` Aurelien Jarno
2017-08-15 16:16   ` Aurelien Jarno
2017-08-15 12:17 ` Florian Weimer
