public inbox for libffi-discuss@sourceware.org
 help / color / mirror / Atom feed
* ABI incompatibility between libffi and LLVM-generated callees
@ 2013-01-30  7:30 Mark H Weaver
  2013-01-30 11:25 ` Andrew Haley
  0 siblings, 1 reply; 13+ messages in thread
From: Mark H Weaver @ 2013-01-30  7:30 UTC (permalink / raw)
  To: libffi-discuss

Hello all,

I'm a developer of Guile 2.0, which uses libffi, and we've received
multiple bug reports of test failures on OS X related to libffi.  We
recently discovered the root cause of these failures.

Functions compiled using LLVM for the x86_64 architecture assume that
callers will sign-extend integer arguments less than 64-bits that are
passed in registers.  Indeed, both GCC and LLVM generate code that does
this sign-extension in callers, so it seems that their assumption is
reasonable.

The only problem is that libffi does *not* sign-extend integer arguments
passed in registers on x86_64.  Instead it zero-extends them, even if
they are signed.  This works when calling GCC-compiled code, because GCC
performs the sign-extension in the callee as well as the caller, but
LLVM-compiled code is more strict.

Having read section 3.2.3 ("Parameter Passing") of the SysV x86_64 ABI,
which is admittedly somewhat vague on this issue, it is far from clear
to me that the LLVM behavior is a bug.  It seems to me that callers
should sign-extend to be on the safe side.

I believe this is a case of GCC following Postel's Law of being
permissive in what one accepts, whereas LLVM is more strict in its
requirements.

However you might choose to interpret the ABI requirements, the fact is
that LLVM-compiled libraries are widely deployed in Mac OS X, and
libffi is currently unable to properly call such libraries properly.

For more details on this, see:

  http://debbugs.gnu.org/cgi/bugreport.cgi?bug=13342#26
  http://debbugs.gnu.org/cgi/bugreport.cgi?bug=13342#32

Comment #26 shows the difference in assembly code generated by GCC and
LLVM for functions that accept int8_t arguments.

Also see the recent report of libffi test failures on OS X:

  http://sourceware.org/ml/libffi-discuss/2012/msg00162.html

I suggest that libffi should be changed to sign-extend.
What do you think?

    Regards,
      Mark

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ABI incompatibility between libffi and LLVM-generated callees
  2013-01-30  7:30 ABI incompatibility between libffi and LLVM-generated callees Mark H Weaver
@ 2013-01-30 11:25 ` Andrew Haley
  2013-01-30 14:23   ` Andrew Haley
  0 siblings, 1 reply; 13+ messages in thread
From: Andrew Haley @ 2013-01-30 11:25 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: libffi-discuss

On 01/30/2013 07:30 AM, Mark H Weaver wrote:
> Hello all,
> 
> I'm a developer of Guile 2.0, which uses libffi, and we've received
> multiple bug reports of test failures on OS X related to libffi.  We
> recently discovered the root cause of these failures.
> 
> Functions compiled using LLVM for the x86_64 architecture assume that
> callers will sign-extend integer arguments less than 64-bits that are
> passed in registers.  Indeed, both GCC and LLVM generate code that does
> this sign-extension in callers, so it seems that their assumption is
> reasonable.
> 
> The only problem is that libffi does *not* sign-extend integer arguments
> passed in registers on x86_64.  Instead it zero-extends them, even if
> they are signed.  This works when calling GCC-compiled code, because GCC
> performs the sign-extension in the callee as well as the caller, but
> LLVM-compiled code is more strict.
> 
> Having read section 3.2.3 ("Parameter Passing") of the SysV x86_64 ABI,
> which is admittedly somewhat vague on this issue, it is far from clear
> to me that the LLVM behavior is a bug.

Isn't it?  LLVM is supposed to be adding an int8_t and an int64_t, but
it adds two int64_ts.

I suspect that it is indeed an LLVM bug, but I've asked the psABI
authors to comment.

> It seems to me that callers
> should sign-extend to be on the safe side.
> 
> I believe this is a case of GCC following Postel's Law of being
> permissive in what one accepts, whereas LLVM is more strict in its
> requirements.
> 
> However you might choose to interpret the ABI requirements, the fact is
> that LLVM-compiled libraries are widely deployed in Mac OS X, and
> libffi is currently unable to properly call such libraries properly.
> 
> For more details on this, see:
> 
>   http://debbugs.gnu.org/cgi/bugreport.cgi?bug=13342#26
>   http://debbugs.gnu.org/cgi/bugreport.cgi?bug=13342#32
> 
> Comment #26 shows the difference in assembly code generated by GCC and
> LLVM for functions that accept int8_t arguments.
> 
> Also see the recent report of libffi test failures on OS X:
> 
>   http://sourceware.org/ml/libffi-discuss/2012/msg00162.html
> 
> I suggest that libffi should be changed to sign-extend.
> What do you think?

I don't suppose it would hurt.

Andrew.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ABI incompatibility between libffi and LLVM-generated callees
  2013-01-30 11:25 ` Andrew Haley
@ 2013-01-30 14:23   ` Andrew Haley
  2013-01-31  0:05     ` Mark H Weaver
  0 siblings, 1 reply; 13+ messages in thread
From: Andrew Haley @ 2013-01-30 14:23 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: libffi-discuss

On 01/30/2013 11:25 AM, Andrew Haley wrote:
> On 01/30/2013 07:30 AM, Mark H Weaver wrote:
>> Hello all,
>>
>> I'm a developer of Guile 2.0, which uses libffi, and we've received
>> multiple bug reports of test failures on OS X related to libffi.  We
>> recently discovered the root cause of these failures.
>>
>> Functions compiled using LLVM for the x86_64 architecture assume that
>> callers will sign-extend integer arguments less than 64-bits that are
>> passed in registers.  Indeed, both GCC and LLVM generate code that does
>> this sign-extension in callers, so it seems that their assumption is
>> reasonable.
>>
>> The only problem is that libffi does *not* sign-extend integer arguments
>> passed in registers on x86_64.  Instead it zero-extends them, even if
>> they are signed.  This works when calling GCC-compiled code, because GCC
>> performs the sign-extension in the callee as well as the caller, but
>> LLVM-compiled code is more strict.
>>
>> Having read section 3.2.3 ("Parameter Passing") of the SysV x86_64 ABI,
>> which is admittedly somewhat vague on this issue, it is far from clear
>> to me that the LLVM behavior is a bug.
> 
> Isn't it?  LLVM is supposed to be adding an int8_t and an int64_t, but
> it adds two int64_ts.
> 
> I suspect that it is indeed an LLVM bug, but I've asked the psABI
> authors to comment.

It is an LLVM bug.  See http://gcc.gnu.org/ml/gcc/2013-01/msg00448.html

Andrew.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ABI incompatibility between libffi and LLVM-generated callees
  2013-01-30 14:23   ` Andrew Haley
@ 2013-01-31  0:05     ` Mark H Weaver
  2013-01-31  5:52       ` Anthony Green
  2013-01-31  9:12       ` Andrew Haley
  0 siblings, 2 replies; 13+ messages in thread
From: Mark H Weaver @ 2013-01-31  0:05 UTC (permalink / raw)
  To: Andrew Haley; +Cc: libffi-discuss

Andrew Haley <aph@redhat.com> writes:
> It is an LLVM bug.  See http://gcc.gnu.org/ml/gcc/2013-01/msg00448.html

Even if this is true, and even if the LLVM developers act quickly to fix
this bug, the fact remains that LLVM-compiled code will be deployed on a
massive scale for many years to come, and it is the users of libffi who
will be hurt by this.  It will be hard to convince our users that it's
anyone else's fault when even the psABI doesn't spell it out clearly,
and when virtually everything else interfaces properly with LLVM code.

Ultimately, this just makes Guile (and other users of libffi) look bad.

Guile requires a foreign-function interface that works properly on Macs,
even if that means working around LLVM bugs for a several years until
the improperly-compiled code is sufficiently rare.  I'm sure that many
other users of libffi would feel similarly.

So, are you willing to work around this issue for the sake of your
users?

   Regards,
     Mark

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ABI incompatibility between libffi and LLVM-generated callees
  2013-01-31  0:05     ` Mark H Weaver
@ 2013-01-31  5:52       ` Anthony Green
  2013-02-03  3:29         ` Mark H Weaver
  2013-01-31  9:12       ` Andrew Haley
  1 sibling, 1 reply; 13+ messages in thread
From: Anthony Green @ 2013-01-31  5:52 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: Andrew Haley, libffi-discuss

On Wed, Jan 30, 2013 at 7:04 PM, Mark H Weaver <mhw@netris.org> wrote:
> So, are you willing to work around this issue for the sake of your
> users?

I'm ok with a work-around.  I can see where the change has to happen
(around line 485 of ffi64.c), but I'm travelling right now and won't
have a chance to try out a work-around for a few days.
I'd be happy to merge a patch from somebody else in the meantime.

Thanks for looking into this, Mark and Andrew.  I have a happy history
with Guile, so I'm glad to help if I can.

AG

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ABI incompatibility between libffi and LLVM-generated callees
  2013-01-31  0:05     ` Mark H Weaver
  2013-01-31  5:52       ` Anthony Green
@ 2013-01-31  9:12       ` Andrew Haley
  2013-01-31  9:40         ` Mark H Weaver
  1 sibling, 1 reply; 13+ messages in thread
From: Andrew Haley @ 2013-01-31  9:12 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: libffi-discuss

On 01/31/2013 12:04 AM, Mark H Weaver wrote:
> Andrew Haley <aph@redhat.com> writes:
>> It is an LLVM bug.  See http://gcc.gnu.org/ml/gcc/2013-01/msg00448.html
> 
> Even if this is true, and even if the LLVM developers act quickly to fix
> this bug, the fact remains that LLVM-compiled code will be deployed on a
> massive scale for many years to come, and it is the users of libffi who
> will be hurt by this.  It will be hard to convince our users that it's
> anyone else's fault when even the psABI doesn't spell it out clearly,
> and when virtually everything else interfaces properly with LLVM code.
> 
> Ultimately, this just makes Guile (and other users of libffi) look bad.
> 
> Guile requires a foreign-function interface that works properly on Macs,
> even if that means working around LLVM bugs for a several years until
> the improperly-compiled code is sufficiently rare.  I'm sure that many
> other users of libffi would feel similarly.

Sure, but why don't you just submit a patch?  You have the test
systems, and you've carefully analysed the problem.  It's much
harder if someone has to write then throw it over the wall for
someone else to try.

Andrew.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ABI incompatibility between libffi and LLVM-generated callees
  2013-01-31  9:12       ` Andrew Haley
@ 2013-01-31  9:40         ` Mark H Weaver
  2013-01-31  9:44           ` Andrew Haley
  0 siblings, 1 reply; 13+ messages in thread
From: Mark H Weaver @ 2013-01-31  9:40 UTC (permalink / raw)
  To: Andrew Haley; +Cc: libffi-discuss

Hi Andrew,

Andrew Haley <aph@redhat.com> writes:
> On 01/31/2013 12:04 AM, Mark H Weaver wrote:
>> Guile requires a foreign-function interface that works properly on Macs,
>> even if that means working around LLVM bugs for a several years until
>> the improperly-compiled code is sufficiently rare.  I'm sure that many
>> other users of libffi would feel similarly.
>
> Sure, but why don't you just submit a patch?  You have the test
> systems, and you've carefully analysed the problem.  It's much
> harder if someone has to write then throw it over the wall for
> someone else to try.

I'm willing to work on a patch if no one else wants to do it, though I'm
not very familiar with libffi internals, or even how to use its API.
I've only used it via Guile.  My analysis of the problem did not go much
beyond noticing that libffi zero-extends register arguments whereas
LLVM-compiled code expects them to be sign-extended.

Also, for the record, I use GNU/Linux and GCC exclusively, and I don't
have any Mac systems to test on.  However, I did recently compile LLVM
on Debian for purposes of reproducing this bug, so I suppose that makes
it easier for me to test.

I apologize if I sounded as if I were demanding that someone else do
this work.  I'm pleased to hear that a patch would be accepted (assuming
of course that it's deemed to be of sufficient quality), and that's all
that I can reasonably ask for.

    Thanks,
      Mark

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ABI incompatibility between libffi and LLVM-generated callees
  2013-01-31  9:40         ` Mark H Weaver
@ 2013-01-31  9:44           ` Andrew Haley
  0 siblings, 0 replies; 13+ messages in thread
From: Andrew Haley @ 2013-01-31  9:44 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: libffi-discuss

On 01/31/2013 09:40 AM, Mark H Weaver wrote:
> Also, for the record, I use GNU/Linux and GCC exclusively, and I don't
> have any Mac systems to test on.  However, I did recently compile LLVM
> on Debian for purposes of reproducing this bug, so I suppose that makes
> it easier for me to test.

Ah, OK.

> I apologize if I sounded as if I were demanding that someone else do
> this work.

Well, it did, rather!

> I'm pleased to hear that a patch would be accepted (assuming of
> course that it's deemed to be of sufficient quality), and that's all
> that I can reasonably ask for.

This is a pretty helpful community, and it doesn't sound like a hugely
difficult thing to get done.  It was worth finding out where the fault
lay, though: IMO knowing is always better than not knowing.

Andrew.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ABI incompatibility between libffi and LLVM-generated callees
  2013-01-31  5:52       ` Anthony Green
@ 2013-02-03  3:29         ` Mark H Weaver
  2013-02-03  9:10           ` Andrew Haley
  0 siblings, 1 reply; 13+ messages in thread
From: Mark H Weaver @ 2013-02-03  3:29 UTC (permalink / raw)
  To: Anthony Green; +Cc: Andrew Haley, libffi-discuss

[-- Attachment #1: Type: text/plain, Size: 503 bytes --]

Anthony Green <green@moxielogic.com> writes:
> I'm ok with a work-around.  I can see where the change has to happen
> (around line 485 of ffi64.c), but I'm travelling right now and won't
> have a chance to try out a work-around for a few days.
> I'd be happy to merge a patch from somebody else in the meantime.

Okay, I've attached a patch to work around the LLVM bug.  It has been
tested on Mac OS, and is confirmed to fix the problem.  Comments and
suggestions welcome.

   Many thanks,
      Mark



[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: [PATCH] Sign-extend integer arguments passed via general purpose registers --]
[-- Type: text/x-diff, Size: 1909 bytes --]

From 967c7931ae6d8facd0ef33799015a403b7059128 Mon Sep 17 00:00:00 2001
From: Mark H Weaver <mhw@netris.org>
Date: Sat, 2 Feb 2013 06:42:43 -0500
Subject: [PATCH] Sign-extend integer arguments passed via general purpose
 registers.

---
 ChangeLog       |    5 +++++
 src/x86/ffi64.c |   21 +++++++++++++++++++--
 2 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 815156c..b9da7b4 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,8 @@
+2013-02-02  Mark H Weaver <mhw@netris.org>
+
+	* src/x86/ffi64.c (ffi_call): Sign-extend integer arguments passed
+	via general purpose registers.
+
 2013-01-21  Nathan Rossi <nathan.rossi@xilinx.com>
 
 	* README: Add MicroBlaze details.
diff --git a/src/x86/ffi64.c b/src/x86/ffi64.c
index b8a823d..2014af2 100644
--- a/src/x86/ffi64.c
+++ b/src/x86/ffi64.c
@@ -484,8 +484,25 @@ ffi_call (ffi_cif *cif, void (*fn)(void), void *rvalue, void **avalue)
 		{
 		case X86_64_INTEGER_CLASS:
 		case X86_64_INTEGERSI_CLASS:
-		  reg_args->gpr[gprcount] = 0;
-		  memcpy (&reg_args->gpr[gprcount], a, size < 8 ? size : 8);
+		  /* Sign-extend integer arguments passed in general
+		     purpose registers, to cope with the fact that
+		     LLVM incorrectly assumes that this will be done
+		     (the x86-64 PS ABI does not specify this). */
+		  switch (arg_types[i]->type)
+		    {
+		    case FFI_TYPE_SINT8:
+		      *(SINT64 *)&reg_args->gpr[gprcount] = (SINT64) *((SINT8 *) a);
+		      break;
+		    case FFI_TYPE_SINT16:
+		      *(SINT64 *)&reg_args->gpr[gprcount] = (SINT64) *((SINT16 *) a);
+		      break;
+		    case FFI_TYPE_SINT32:
+		      *(SINT64 *)&reg_args->gpr[gprcount] = (SINT64) *((SINT32 *) a);
+		      break;
+		    default:
+		      reg_args->gpr[gprcount] = 0;
+		      memcpy (&reg_args->gpr[gprcount], a, size < 8 ? size : 8);
+		    }
 		  gprcount++;
 		  break;
 		case X86_64_SSE_CLASS:
-- 
1.7.10.4


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ABI incompatibility between libffi and LLVM-generated callees
  2013-02-03  3:29         ` Mark H Weaver
@ 2013-02-03  9:10           ` Andrew Haley
  2013-02-03 17:01             ` Mark H Weaver
  0 siblings, 1 reply; 13+ messages in thread
From: Andrew Haley @ 2013-02-03  9:10 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: Anthony Green, libffi-discuss

On 02/03/2013 03:28 AM, Mark H Weaver wrote:
> I've attached a patch to work around the LLVM bug.  It has been
> tested on Mac OS, and is confirmed to fix the problem.  Comments and
> suggestions welcome.

That looks right.  However, I wonder if the same bug may bite with
unsigned types.  I suppose it's unlikely, but it is theoretically
possible.

Andrew.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ABI incompatibility between libffi and LLVM-generated callees
  2013-02-03  9:10           ` Andrew Haley
@ 2013-02-03 17:01             ` Mark H Weaver
  2013-02-03 17:59               ` Andrew Haley
  0 siblings, 1 reply; 13+ messages in thread
From: Mark H Weaver @ 2013-02-03 17:01 UTC (permalink / raw)
  To: Andrew Haley; +Cc: Anthony Green, libffi-discuss

Andrew Haley <aph@redhat.com> writes:

> On 02/03/2013 03:28 AM, Mark H Weaver wrote:
>> I've attached a patch to work around the LLVM bug.  It has been
>> tested on Mac OS, and is confirmed to fix the problem.  Comments and
>> suggestions welcome.
>
> That looks right.  However, I wonder if the same bug may bite with
> unsigned types.  I suppose it's unlikely, but it is theoretically
> possible.

LLVM assumes that unsigned integers will be zero-extended by the caller,
which is what this patch does.  For example:

--8<---------------cut here---------------start------------->8---
#include <stdint.h>

int32_t external_callee (int8_t a, uint8_t b);

int32_t callee (int8_t a, uint8_t b)
{
  return a + b;
}

int32_t caller (void)
{
  return external_callee (-1, -1);
}
--8<---------------cut here---------------end--------------->8---

Clang 3.2 compiles this as follows (heavily excerpted):

--8<---------------cut here---------------start------------->8---
callee:                                 # @callee
	addl	%esi, %edi
	movl	%edi, %eax
	ret

caller:                                 # @caller
	movl	$-1, %edi
	movl	$255, %esi
	jmp	external_callee         # TAILCALL
--8<---------------cut here---------------end--------------->8---

When I wrote simply that LLVM assumes that integer arguments will be
sign-extended, that was based on the notion that unsigned integers have
an implicit zero sign bit, so sign-extending them is the same as
zero-extending them.  Sorry if that was unclear.

    Regards,
      Mark

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ABI incompatibility between libffi and LLVM-generated callees
  2013-02-03 17:01             ` Mark H Weaver
@ 2013-02-03 17:59               ` Andrew Haley
  2013-02-03 18:54                 ` Mark H Weaver
  0 siblings, 1 reply; 13+ messages in thread
From: Andrew Haley @ 2013-02-03 17:59 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: Anthony Green, libffi-discuss

On 02/03/2013 05:01 PM, Mark H Weaver wrote:
> LLVM assumes that unsigned integers will be zero-extended by the caller,
> which is what this patch does.

Does it?  I didn't see that.  I just saw a copy of the word,
regardless of whether the source of the data is a full word.
Maybe I'm mistaken.

Andrew.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ABI incompatibility between libffi and LLVM-generated callees
  2013-02-03 17:59               ` Andrew Haley
@ 2013-02-03 18:54                 ` Mark H Weaver
  0 siblings, 0 replies; 13+ messages in thread
From: Mark H Weaver @ 2013-02-03 18:54 UTC (permalink / raw)
  To: Andrew Haley; +Cc: Anthony Green, libffi-discuss

Andrew Haley <aph@redhat.com> writes:

> On 02/03/2013 05:01 PM, Mark H Weaver wrote:
>> LLVM assumes that unsigned integers will be zero-extended by the caller,
>> which is what this patch does.
>
> Does it?  I didn't see that.  I just saw a copy of the word,
> regardless of whether the source of the data is a full word.
> Maybe I'm mistaken.

Consider the case where the caller passes -1 for the int8_t and 255 for
the uint8_t.  Since the LLVM callee simply accesses both arguments as if
they were both int64_t, it is important to sign-extend the -1 and
zero-extend the 255.  If the caller sign-extended both arguments, then
the callee would return -2, which is incorrect.  The sum should be 254.

      Mark

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2013-02-03 18:54 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-01-30  7:30 ABI incompatibility between libffi and LLVM-generated callees Mark H Weaver
2013-01-30 11:25 ` Andrew Haley
2013-01-30 14:23   ` Andrew Haley
2013-01-31  0:05     ` Mark H Weaver
2013-01-31  5:52       ` Anthony Green
2013-02-03  3:29         ` Mark H Weaver
2013-02-03  9:10           ` Andrew Haley
2013-02-03 17:01             ` Mark H Weaver
2013-02-03 17:59               ` Andrew Haley
2013-02-03 18:54                 ` Mark H Weaver
2013-01-31  9:12       ` Andrew Haley
2013-01-31  9:40         ` Mark H Weaver
2013-01-31  9:44           ` Andrew Haley

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).