Option to make unsigned->signed conversion always well-defined?

public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed

* Option to make unsigned->signed conversion always well-defined?
@ 2011-10-05 22:56 Ulf Magnusson
  2011-10-06  8:25 ` Ulf Magnusson
                   ` (3 more replies)
  0 siblings, 4 replies; 18+ messages in thread
From: Ulf Magnusson @ 2011-10-05 22:56 UTC (permalink / raw)
  To: gcc

Hi,

I've been experimenting with different methods for emulating the
signed overflow of an 8-bit CPU. The method I've found that seems to
generate the most efficient code on both ARM and x86 is

bool overflow(unsigned int a, unsigned int b) {
    const unsigned int sum = (int8_t)a + (int8_t)b;
    return (int8_t)sum != sum;
}

(The real function would probably be 'inline', of course. Regs are
stored in overlong variables, hence 'unsigned int'.)

Looking at the spec, it unfortunately seems the behavior of this
function is undefined, as it relies on signed int addition wrapping,
and that (int8_t)sum truncates bits. Is there some way to make this
guaranteed safe with GCC without resorting to inline asm? Locally
enabling -fwrap takes care of the addition, but that still leaves the
conversion.

/Ulf

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Option to make unsigned->signed conversion always well-defined?
  2011-10-05 22:56 Option to make unsigned->signed conversion always well-defined? Ulf Magnusson
@ 2011-10-06  8:25 ` Ulf Magnusson
  2011-10-06  9:04 ` Pedro Pedruzzi
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 18+ messages in thread
From: Ulf Magnusson @ 2011-10-06  8:25 UTC (permalink / raw)
  To: gcc

On Wed, Oct 5, 2011 at 10:11 PM, Ulf Magnusson <ulfalizer@gmail.com> wrote:
> Hi,
>
> I've been experimenting with different methods for emulating the
> signed overflow of an 8-bit CPU. The method I've found that seems to
> generate the most efficient code on both ARM and x86 is
>
> bool overflow(unsigned int a, unsigned int b) {
>    const unsigned int sum = (int8_t)a + (int8_t)b;
>    return (int8_t)sum != sum;
> }
>
> (The real function would probably be 'inline', of course. Regs are
> stored in overlong variables, hence 'unsigned int'.)
>
> Looking at the spec, it unfortunately seems the behavior of this
> function is undefined, as it relies on signed int addition wrapping,
> and that (int8_t)sum truncates bits. Is there some way to make this
> guaranteed safe with GCC without resorting to inline asm? Locally
> enabling -fwrap takes care of the addition, but that still leaves the
> conversion.
>
> /Ulf
>

Is *((int8_t*)&sum) safe (assuming little endian)? Unfortunately that
seems to generate worse code. On X86 it generates the following (GCC
4.5.2):

00000050 <_Z9overflow4jj>:
  50:	83 ec 10             	sub    $0x10,%esp
  53:	0f be 54 24 18       	movsbl 0x18(%esp),%edx
  58:	0f be 44 24 14       	movsbl 0x14(%esp),%eax
  5d:	8d 04 02             	lea    (%edx,%eax,1),%eax
  60:	0f be d0             	movsbl %al,%edx
  63:	39 d0                	cmp    %edx,%eax
  65:	0f 95 c0             	setne  %al
  68:	83 c4 10             	add    $0x10,%esp
  6b:	c3                   	ret

With the straight (int8_t) cast you get

  50:	0f be 54 24 08       	movsbl 0x8(%esp),%edx
  55:	0f be 44 24 04       	movsbl 0x4(%esp),%eax
  5a:	8d 04 02             	lea    (%edx,%eax,1),%eax
  5d:	0f be d0             	movsbl %al,%edx
  60:	39 c2                	cmp    %eax,%edx
  62:	0f 95 c0             	setne  %al
  65:	c3                   	ret

What's with the extra add/sub of ESP?

/Ulf

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Option to make unsigned->signed conversion always well-defined?
  2011-10-05 22:56 Option to make unsigned->signed conversion always well-defined? Ulf Magnusson
  2011-10-06  8:25 ` Ulf Magnusson
@ 2011-10-06  9:04 ` Pedro Pedruzzi
  2011-10-06 10:52   ` Ulf Magnusson
       [not found] ` <4E8DBF3C.1020700@redhat.com>
  2011-10-06 22:46 ` Florian Weimer
  3 siblings, 1 reply; 18+ messages in thread
From: Pedro Pedruzzi @ 2011-10-06  9:04 UTC (permalink / raw)
  To: gcc

Em 05-10-2011 17:11, Ulf Magnusson escreveu:
> Hi,
> 
> I've been experimenting with different methods for emulating the
> signed overflow of an 8-bit CPU.

You would like to check whether a 8-bit signed addition will overflow or
not, given the two operands. Is that correct?

As you used the word `emulating', I am assuming that your function will
not run by the mentioned CPU.

Does this 8-bit CPU use two's complement representation?

> The method I've found that seems to
> generate the most efficient code on both ARM and x86 is
> 
> bool overflow(unsigned int a, unsigned int b) {
>     const unsigned int sum = (int8_t)a + (int8_t)b;
>     return (int8_t)sum != sum;
> }
> 
> (The real function would probably be 'inline', of course. Regs are
> stored in overlong variables, hence 'unsigned int'.)
> 
> Looking at the spec, it unfortunately seems the behavior of this
> function is undefined, as it relies on signed int addition wrapping,
> and that (int8_t)sum truncates bits. Is there some way to make this
> guaranteed safe with GCC without resorting to inline asm? Locally
> enabling -fwrap takes care of the addition, but that still leaves the
> conversion.

I believe the cast from unsigned int to int8_t is implementation-defined
for values that can't be represented in int8_t (e.g. 0xff). A kind of
`undefined behavior' as well.

I tried:

bool overflow(unsigned int a, unsigned int b) {
    const unsigned int sum = a + b;
    return ((a & 0x80) == (b & 0x80)) && ((a & 0x80) != (sum & 0x80));
}

But it is not as efficient as yours.

-- 
Pedro Pedruzzi

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Option to make unsigned->signed conversion always well-defined?
  2011-10-06  9:04 ` Pedro Pedruzzi
@ 2011-10-06 10:52   ` Ulf Magnusson
  2011-10-06 14:42     ` Ulf Magnusson
  0 siblings, 1 reply; 18+ messages in thread
From: Ulf Magnusson @ 2011-10-06 10:52 UTC (permalink / raw)
  To: Pedro Pedruzzi; +Cc: gcc

On Thu, Oct 6, 2011 at 12:55 AM, Pedro Pedruzzi
<pedro.pedruzzi@gmail.com> wrote:
> Em 05-10-2011 17:11, Ulf Magnusson escreveu:
>> Hi,
>>
>> I've been experimenting with different methods for emulating the
>> signed overflow of an 8-bit CPU.
>
> You would like to check whether a 8-bit signed addition will overflow or
> not, given the two operands. Is that correct?
>
> As you used the word `emulating', I am assuming that your function will
> not run by the mentioned CPU.
>

No, it'll most likely only run on systems with a wider bitness.

> Does this 8-bit CPU use two's complement representation?

Yes, and the criterion for signed overflow is "both numbers have the
same sign, but the sign of the sum is different". Should have made
that more clear.

>
>> The method I've found that seems to
>> generate the most efficient code on both ARM and x86 is
>>
>> bool overflow(unsigned int a, unsigned int b) {
>>     const unsigned int sum = (int8_t)a + (int8_t)b;
>>     return (int8_t)sum != sum;
>> }
>>
>> (The real function would probably be 'inline', of course. Regs are
>> stored in overlong variables, hence 'unsigned int'.)
>>
>> Looking at the spec, it unfortunately seems the behavior of this
>> function is undefined, as it relies on signed int addition wrapping,
>> and that (int8_t)sum truncates bits. Is there some way to make this
>> guaranteed safe with GCC without resorting to inline asm? Locally
>> enabling -fwrap takes care of the addition, but that still leaves the
>> conversion.
>
> I believe the cast from unsigned int to int8_t is implementation-defined
> for values that can't be represented in int8_t (e.g. 0xff). A kind of
> `undefined behavior' as well.
>
> I tried:
>
> bool overflow(unsigned int a, unsigned int b) {
>    const unsigned int sum = a + b;
>    return ((a & 0x80) == (b & 0x80)) && ((a & 0x80) != (sum & 0x80));
> }
>
> But it is not as efficient as yours.
>
> --
> Pedro Pedruzzi
>

Yeah, I tried similar bit-trickery along the lines of

bool overflow(unsigned int a, unsigned int b) {
    const uint8_t ab = (uint8_t)a;
    const uint8_t bb = (uint8_t)b;
    const uint8_t sum = ab + bb;
    return (ab ^ bb) & ~(ab ^ sum) & 0x80;
}

, but it doesn't seem to generate very efficient code.

/Ulf

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Option to make unsigned->signed conversion always well-defined?
  2011-10-06 10:52   ` Ulf Magnusson
@ 2011-10-06 14:42     ` Ulf Magnusson
  2011-10-06 14:45       ` Miles Bader
  0 siblings, 1 reply; 18+ messages in thread
From: Ulf Magnusson @ 2011-10-06 14:42 UTC (permalink / raw)
  To: Pedro Pedruzzi; +Cc: gcc

On Thu, Oct 6, 2011 at 10:25 AM, Ulf Magnusson <ulfalizer@gmail.com> wrote:
> On Thu, Oct 6, 2011 at 12:55 AM, Pedro Pedruzzi
> <pedro.pedruzzi@gmail.com> wrote:
>> Em 05-10-2011 17:11, Ulf Magnusson escreveu:
>>> Hi,
>>>
>>> I've been experimenting with different methods for emulating the
>>> signed overflow of an 8-bit CPU.
>>
>> You would like to check whether a 8-bit signed addition will overflow or
>> not, given the two operands. Is that correct?
>>
>> As you used the word `emulating', I am assuming that your function will
>> not run by the mentioned CPU.
>>
>
> No, it'll most likely only run on systems with a wider bitness.
>
>> Does this 8-bit CPU use two's complement representation?
>
> Yes, and the criterion for signed overflow is "both numbers have the
> same sign, but the sign of the sum is different". Should have made
> that more clear.
>
>>
>>> The method I've found that seems to
>>> generate the most efficient code on both ARM and x86 is
>>>
>>> bool overflow(unsigned int a, unsigned int b) {
>>>     const unsigned int sum = (int8_t)a + (int8_t)b;
>>>     return (int8_t)sum != sum;
>>> }
>>>
>>> (The real function would probably be 'inline', of course. Regs are
>>> stored in overlong variables, hence 'unsigned int'.)
>>>
>>> Looking at the spec, it unfortunately seems the behavior of this
>>> function is undefined, as it relies on signed int addition wrapping,
>>> and that (int8_t)sum truncates bits. Is there some way to make this
>>> guaranteed safe with GCC without resorting to inline asm? Locally
>>> enabling -fwrap takes care of the addition, but that still leaves the
>>> conversion.
>>
>> I believe the cast from unsigned int to int8_t is implementation-defined
>> for values that can't be represented in int8_t (e.g. 0xff). A kind of
>> `undefined behavior' as well.
>>
>> I tried:
>>
>> bool overflow(unsigned int a, unsigned int b) {
>>    const unsigned int sum = a + b;
>>    return ((a & 0x80) == (b & 0x80)) && ((a & 0x80) != (sum & 0x80));
>> }
>>
>> But it is not as efficient as yours.
>>
>> --
>> Pedro Pedruzzi
>>
>
> Yeah, I tried similar bit-trickery along the lines of
>
> bool overflow(unsigned int a, unsigned int b) {
>    const uint8_t ab = (uint8_t)a;
>    const uint8_t bb = (uint8_t)b;
>    const uint8_t sum = ab + bb;
>    return (ab ^ bb) & ~(ab ^ sum) & 0x80;
> }
>
> , but it doesn't seem to generate very efficient code.
>
> /Ulf
>

Might as well do

bool overflowbit(unsigned int a, unsigned int b) {
    const unsigned int sum = a + b;
    return (a ^ b) & ~(a ^ sum) & 0x80;
}

But still not very good output compared to other approaches as expected.

/Ulf

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Option to make unsigned->signed conversion always well-defined?
  2011-10-06 14:42     ` Ulf Magnusson
@ 2011-10-06 14:45       ` Miles Bader
  2011-10-06 14:48         ` Ulf Magnusson
  0 siblings, 1 reply; 18+ messages in thread
From: Miles Bader @ 2011-10-06 14:45 UTC (permalink / raw)
  To: Ulf Magnusson; +Cc: Pedro Pedruzzi, gcc

Ulf Magnusson <ulfalizer@gmail.com> writes:
> Might as well do
>
> bool overflowbit(unsigned int a, unsigned int b) {
>     const unsigned int sum = a + b;
>     return (a ^ b) & ~(a ^ sum) & 0x80;
> }
>
> But still not very good output compared to other approaches as expected.

How about:

   bool overflowbit2(unsigned int a, unsigned int b)
   {
       const unsigned int sum = a + b;
       return ~(a ^ b) & sum & 0x80;
   }

?

I thinnnnk it has the same results as your function...
[I just made a table of all 8 possibilities, and checked!]

-miles

-- 
Circus, n. A place where horses, ponies and elephants are permitted to see
men, women and children acting the fool.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Option to make unsigned->signed conversion always well-defined?
  2011-10-06 14:45       ` Miles Bader
@ 2011-10-06 14:48         ` Ulf Magnusson
  2011-10-06 21:31           ` Pedro Pedruzzi
  0 siblings, 1 reply; 18+ messages in thread
From: Ulf Magnusson @ 2011-10-06 14:48 UTC (permalink / raw)
  To: Miles Bader; +Cc: Pedro Pedruzzi, gcc

On Thu, Oct 6, 2011 at 11:04 AM, Miles Bader <miles@gnu.org> wrote:
> Ulf Magnusson <ulfalizer@gmail.com> writes:
>> Might as well do
>>
>> bool overflowbit(unsigned int a, unsigned int b) {
>>     const unsigned int sum = a + b;
>>     return (a ^ b) & ~(a ^ sum) & 0x80;
>> }
>>
>> But still not very good output compared to other approaches as expected.
>
> How about:
>
>   bool overflowbit2(unsigned int a, unsigned int b)
>   {
>       const unsigned int sum = a + b;
>       return ~(a ^ b) & sum & 0x80;
>   }
>
> ?
>
> I thinnnnk it has the same results as your function...
> [I just made a table of all 8 possibilities, and checked!]
>
> -miles
>
> --
> Circus, n. A place where horses, ponies and elephants are permitted to see
> men, women and children acting the fool.
>

Ops, should have been

return ~(a ^ b) & (a ^ sum) & 0x80

~(a ^ b) gives 1 in the sign bit position if the signs are the same,
and (a ^ sum) gives 1 if it's different in the sum.

A clearer way of writing it (that also generates suboptimal code) is

bool overflow(unsigned int a, unsigned int b) {
    const unsigned asign   = a       & 0x80;
    const unsigned bsign   = b       & 0x80;
    const unsigned sumsign = (a + b) & 0x80;
    return (asign == bsign) && (asign != sumsign);
}

Seems bit-fiddling isn't the way to go.

Maybe I should take this to gnu-help as it isn't really development-related.

/Ulf

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Option to make unsigned->signed conversion always well-defined?
  2011-10-06 14:48         ` Ulf Magnusson
@ 2011-10-06 21:31           ` Pedro Pedruzzi
  2011-10-07 11:52             ` Miles Bader
  0 siblings, 1 reply; 18+ messages in thread
From: Pedro Pedruzzi @ 2011-10-06 21:31 UTC (permalink / raw)
  To: Ulf Magnusson; +Cc: Miles Bader, gcc

On Thu, Oct 6, 2011 at 11:04 AM, Miles Bader <miles@gnu.org> wrote:
> How about:
>
>   bool overflowbit2(unsigned int a, unsigned int b)
>   {
>       const unsigned int sum = a + b;
>       return ~(a ^ b) & sum & 0x80;
>   }
>
> ?
>
> I thinnnnk it has the same results as your function...
> [I just made a table of all 8 possibilities, and checked!]

Miles, it is not the same. Take for example (0xff, 0xff). In 8-bit 2's
complement, this is (-1, -1) and does not overflow. Your function says
it does.


Em 06-10-2011 12:23, Jeremy Hall escreveu:
> bool overflow(int16_t a, int16_t b)
> {
>    const int16_t sum = a + b;
>    return sum > INT8_MAX || sum < INT8_MIN;
> }

Jeremy, here you are ignoring the problem of converting from the
unsigned int (in the range 0 to 0xff) to the signed integer that it
represents in 8-bit two's complement. Example: 0xff -> -1.

In practice, casting the unsigned int to int8_t works in most cases, but
it is compiler-defined. We are trying to find a always well-defined
approach that is efficient as well.


> Ops, should have been
> 
> return ~(a ^ b) & (a ^ sum) & 0x80
> 
> ~(a ^ b) gives 1 in the sign bit position if the signs are the same,
> and (a ^ sum) gives 1 if it's different in the sum.

This is good. Do you think this is suboptimal? How are you evaluating
efficiency? In x86 this generates pretty small code.

<overflow2>:
  400524:   8d 04 3e                lea    (%rsi,%rdi,1),%eax
  400527:   31 f8                   xor    %edi,%eax
  400529:   31 f7                   xor    %esi,%edi
  40052b:   f7 d7                   not    %edi
  40052d:   21 f8                   and    %edi,%eax
  40052f:   25 80 00 00 00          and    $0x80,%eax
  400534:   c3                      retq

-- 
Pedro Pedruzzi

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Option to make unsigned->signed conversion always well-defined?
  2011-10-06 21:31           ` Pedro Pedruzzi
@ 2011-10-07 11:52             ` Miles Bader
  2011-10-07 17:20               ` Pedro Pedruzzi
  0 siblings, 1 reply; 18+ messages in thread
From: Miles Bader @ 2011-10-07 11:52 UTC (permalink / raw)
  To: Pedro Pedruzzi; +Cc: Ulf Magnusson, gcc

Pedro Pedruzzi <pedro.pedruzzi@gmail.com> writes:
> On Thu, Oct 6, 2011 at 11:04 AM, Miles Bader <miles@gnu.org> wrote:
>> How about:
>>
>>   bool overflowbit2(unsigned int a, unsigned int b)
>>   {
>>       const unsigned int sum = a + b;
>>       return ~(a ^ b) & sum & 0x80;
>>   }
>
> Miles, it is not the same. Take for example (0xff, 0xff). In 8-bit
> 2's complement, this is (-1, -1) and does not overflow. Your
> function says it does.

Negative overflow isn't considered overflow...?  wacky...

-miles

-- 
=====
(^o^;
(()))
*This is the cute octopus virus, please copy it into your sig so it can spread.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Option to make unsigned->signed conversion always well-defined?
  2011-10-07 11:52             ` Miles Bader
@ 2011-10-07 17:20               ` Pedro Pedruzzi
  2011-10-07 17:35                 ` Miles Bader
  0 siblings, 1 reply; 18+ messages in thread
From: Pedro Pedruzzi @ 2011-10-07 17:20 UTC (permalink / raw)
  To: Miles Bader; +Cc: Ulf Magnusson, gcc

Em 07-10-2011 02:35, Miles Bader escreveu:
> Pedro Pedruzzi <pedro.pedruzzi@gmail.com> writes:
>> On Thu, Oct 6, 2011 at 11:04 AM, Miles Bader <miles@gnu.org> wrote:
>>> How about:
>>>
>>>   bool overflowbit2(unsigned int a, unsigned int b)
>>>   {
>>>       const unsigned int sum = a + b;
>>>       return ~(a ^ b) & sum & 0x80;
>>>   }
>>
>> Miles, it is not the same. Take for example (0xff, 0xff). In 8-bit
>> 2's complement, this is (-1, -1) and does not overflow. Your
>> function says it does.
> 
> Negative overflow isn't considered overflow...?  wacky...

It is. For example -100 + -100 = -200 (less than INT8_MIN; does not
fit). But -1 + -1 = -2, is ok.

-- 
Pedro

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Option to make unsigned->signed conversion always well-defined?
  2011-10-07 17:20               ` Pedro Pedruzzi
@ 2011-10-07 17:35                 ` Miles Bader
  0 siblings, 0 replies; 18+ messages in thread
From: Miles Bader @ 2011-10-07 17:35 UTC (permalink / raw)
  To: Pedro Pedruzzi; +Cc: Ulf Magnusson, gcc

2011/10/7 Pedro Pedruzzi <pedro.pedruzzi@gmail.com>:
> It is. For example -100 + -100 = -200 (less than INT8_MIN; does not
> fit). But -1 + -1 = -2, is ok.

Ah, now I see...

-miles

-- 
Cat is power.  Cat is peace.

^ permalink raw reply	[flat|nested] 18+ messages in thread

[parent not found: <4E8DBF3C.1020700@redhat.com>]

* Re: Option to make unsigned->signed conversion always well-defined?
       [not found] ` <4E8DBF3C.1020700@redhat.com>
@ 2011-10-06 22:08   ` Ulf Magnusson
  0 siblings, 0 replies; 18+ messages in thread
From: Ulf Magnusson @ 2011-10-06 22:08 UTC (permalink / raw)
  To: Andrew Haley; +Cc: gcc-help, gcc

(I'll cross-post this to gcc and keep it on gcc-help after that.)

On Thu, Oct 6, 2011 at 4:46 PM, Andrew Haley <aph@redhat.com> wrote:
>
> inline int8_t as_signed_8 (unsigned int a) {
>  a &= 0xff;
>  return a & 0x80 ? (int)a - 0x100 : a;
> }
>
> int overflow(unsigned int a, unsigned int b) {
>  int sum = as_signed_8(a) + as_signed_8(b);
>  return as_signed_8(sum) != sum;
> }
>
> Andrew.
>

That's a really neat trick, and seems to generate identical code. Thanks!

I'd be interesting to know if this version produces equally efficient
code with MSVC.

To summarize what we have so far, here's four different methods along
with the code generated for X86 and ARM (GCC 4.5.2):

#include <inttypes.h>

inline int8_t as_signed_8(unsigned int a) {
    a &= 0xff;
    return a & 0x80 ? (int)a - 0x100 : a;
}

bool overflow_range(unsigned int a, unsigned int b) {
    const int sum = as_signed_8(a) + as_signed_8(b);
    return sum < -128 || sum > 127;
}

bool overflow_bit(unsigned int a, unsigned int b) {
    const unsigned int sum = a + b;
    return ~(a ^ b) & (a ^ sum) & 0x80;
}

bool overflow_unsafe(unsigned int a, unsigned int b) {
    const unsigned int sum = (int8_t)a + (int8_t)b;
    return (int8_t)sum != sum;
}

bool overflow_safe(unsigned int a, unsigned int b) {
    const int sum = as_signed_8(a) + as_signed_8(b);
    return as_signed_8(sum) != sum;
}



Output for X86 with -O3 -fomit-frame-pointer:

00000000 <_Z14overflow_rangejj>:
   0:	0f be 54 24 04       	movsbl 0x4(%esp),%edx
   5:	0f be 44 24 08       	movsbl 0x8(%esp),%eax
   a:	8d 84 02 80 00 00 00 	lea    0x80(%edx,%eax,1),%eax
  11:	3d ff 00 00 00       	cmp    $0xff,%eax
  16:	0f 97 c0             	seta   %al
  19:	c3                   	ret
  1a:	8d b6 00 00 00 00    	lea    0x0(%esi),%esi

00000020 <_Z12overflow_bitjj>:
  20:	8b 54 24 08          	mov    0x8(%esp),%edx
  24:	8b 4c 24 04          	mov    0x4(%esp),%ecx
  28:	89 d0                	mov    %edx,%eax
  2a:	31 c8                	xor    %ecx,%eax
  2c:	01 ca                	add    %ecx,%edx
  2e:	31 ca                	xor    %ecx,%edx
  30:	f7 d0                	not    %eax
  32:	21 d0                	and    %edx,%eax
  34:	a8 80                	test   $0x80,%al
  36:	0f 95 c0             	setne  %al
  39:	c3                   	ret
  3a:	8d b6 00 00 00 00    	lea    0x0(%esi),%esi

00000040 <_Z15overflow_unsafejj>:
  40:	0f be 54 24 08       	movsbl 0x8(%esp),%edx
  45:	0f be 44 24 04       	movsbl 0x4(%esp),%eax
  4a:	8d 04 02             	lea    (%edx,%eax,1),%eax
  4d:	0f be d0             	movsbl %al,%edx
  50:	39 c2                	cmp    %eax,%edx
  52:	0f 95 c0             	setne  %al
  55:	c3                   	ret
  56:	8d 76 00             	lea    0x0(%esi),%esi
  59:	8d bc 27 00 00 00 00 	lea    0x0(%edi,%eiz,1),%edi

00000060 <_Z13overflow_safejj>:
  60:	0f be 54 24 08       	movsbl 0x8(%esp),%edx
  65:	0f be 44 24 04       	movsbl 0x4(%esp),%eax
  6a:	8d 04 02             	lea    (%edx,%eax,1),%eax
  6d:	0f be d0             	movsbl %al,%edx
  70:	39 c2                	cmp    %eax,%edx
  72:	0f 95 c0             	setne  %al
  75:	c3                   	ret



Output for ARM with -O3 -fomit-frame-pointer -mthumb -march=armv7:

00000000 <_Z14overflow_rangejj>:
   0:	b249      	sxtb	r1, r1
   2:	b240      	sxtb	r0, r0
   4:	1808      	adds	r0, r1, r0
   6:	3080      	adds	r0, #128	; 0x80
   8:	28ff      	cmp	r0, #255	; 0xff
   a:	bf94      	ite	ls
   c:	2000      	movls	r0, #0
   e:	2001      	movhi	r0, #1
  10:	4770      	bx	lr
  12:	bf00      	nop
  14:	f3af 8000 	nop.w
  18:	f3af 8000 	nop.w
  1c:	f3af 8000 	nop.w

00000020 <_Z12overflow_bitjj>:
  20:	180b      	adds	r3, r1, r0
  22:	4041      	eors	r1, r0
  24:	ea83 0200 	eor.w	r2, r3, r0
  28:	ea22 0001 	bic.w	r0, r2, r1
  2c:	f3c0 10c0 	ubfx	r0, r0, #7, #1
  30:	4770      	bx	lr
  32:	bf00      	nop
  34:	f3af 8000 	nop.w
  38:	f3af 8000 	nop.w
  3c:	f3af 8000 	nop.w

00000040 <_Z15overflow_unsafejj>:
  40:	b242      	sxtb	r2, r0
  42:	b249      	sxtb	r1, r1
  44:	1888      	adds	r0, r1, r2
  46:	b243      	sxtb	r3, r0
  48:	1a18      	subs	r0, r3, r0
  4a:	bf18      	it	ne
  4c:	2001      	movne	r0, #1
  4e:	4770      	bx	lr

00000050 <_Z13overflow_safejj>:
  50:	b242      	sxtb	r2, r0
  52:	b249      	sxtb	r1, r1
  54:	1888      	adds	r0, r1, r2
  56:	b243      	sxtb	r3, r0
  58:	1a18      	subs	r0, r3, r0
  5a:	bf18      	it	ne
  5c:	2001      	movne	r0, #1
  5e:	4770      	bx	lr


Not sure which version would be fastest on ARM (no device to benchmark
on handy).

By the way, what's a nice way to benchmark snippets like this with
optimization on? If you call each function in a loop from a different
compilation unit the call overhead tends to dominate. If you instead
put it in the same compilation unit and inline, the compiler might do
things you do not expect that renders the benchmark useless.

/Ulf

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Option to make unsigned->signed conversion always well-defined?
  2011-10-05 22:56 Option to make unsigned->signed conversion always well-defined? Ulf Magnusson
                   ` (2 preceding siblings ...)
       [not found] ` <4E8DBF3C.1020700@redhat.com>
@ 2011-10-06 22:46 ` Florian Weimer
  2011-10-07 19:36   ` Ulf Magnusson
  3 siblings, 1 reply; 18+ messages in thread
From: Florian Weimer @ 2011-10-06 22:46 UTC (permalink / raw)
  To: Ulf Magnusson; +Cc: gcc

* Ulf Magnusson:

> I've been experimenting with different methods for emulating the
> signed overflow of an 8-bit CPU. The method I've found that seems to
> generate the most efficient code on both ARM and x86 is
>
> bool overflow(unsigned int a, unsigned int b) {
>     const unsigned int sum = (int8_t)a + (int8_t)b;
>     return (int8_t)sum != sum;
> }

There's a GCC extension which is relevant here:

| For conversion to a type of width N, the value is reduced modulo 2^N
| to be within range of the type; no signal is raised.

<http://gcc.gnu.org/onlinedocs/gcc/Integers-implementation.html#Integers-implementation>

Using that, you can replace the final "& 0x80" with a signed
comparison to zero, which should be give you the best possible code
(for the generic RISC).  You only need to hunt down a copy of Hacker's
Delight or find the right bit twiddling by other means. 8-)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Option to make unsigned->signed conversion always well-defined?
  2011-10-06 22:46 ` Florian Weimer
@ 2011-10-07 19:36   ` Ulf Magnusson
  2011-10-07 22:24     ` Florian Weimer
  0 siblings, 1 reply; 18+ messages in thread
From: Ulf Magnusson @ 2011-10-07 19:36 UTC (permalink / raw)
  To: Florian Weimer; +Cc: gcc

On Thu, Oct 6, 2011 at 11:31 PM, Florian Weimer <fw@deneb.enyo.de> wrote:
> * Ulf Magnusson:
>
>> I've been experimenting with different methods for emulating the
>> signed overflow of an 8-bit CPU. The method I've found that seems to
>> generate the most efficient code on both ARM and x86 is
>>
>> bool overflow(unsigned int a, unsigned int b) {
>>     const unsigned int sum = (int8_t)a + (int8_t)b;
>>     return (int8_t)sum != sum;
>> }
>
> There's a GCC extension which is relevant here:
>
> | For conversion to a type of width N, the value is reduced modulo 2^N
> | to be within range of the type; no signal is raised.
>
> <http://gcc.gnu.org/onlinedocs/gcc/Integers-implementation.html#Integers-implementation>
>
> Using that, you can replace the final "& 0x80" with a signed
> comparison to zero, which should be give you the best possible code
> (for the generic RISC).  You only need to hunt down a copy of Hacker's
> Delight or find the right bit twiddling by other means. 8-)
>

Are you thinking of something like this?

bool overflow_bit2(unsigned int a, unsigned int b) {
    const unsigned int ashift = a << 24;
    const unsigned int bshift = b << 24;
    const unsigned int sum = a + b;
    return (int)(~(a ^ b) & (a ^ sum)) < 0;
}

That version generates

  80:	180b      	adds	r3, r1, r0
  82:	4041      	eors	r1, r0
  84:	ea83 0200 	eor.w	r2, r3, r0
  88:	ea22 0001 	bic.w	r0, r2, r1
  8c:	0fc0      	lsrs	r0, r0, #31
  8e:	4770      	bx	lr

Whereas the unshifted version generates

  40:	180b      	adds	r3, r1, r0
  42:	4041      	eors	r1, r0
  44:	ea83 0200 	eor.w	r2, r3, r0
  48:	ea22 0001 	bic.w	r0, r2, r1
  4c:	f3c0 10c0 	ubfx	r0, r0, #7, #1
  50:	4770      	bx	lr

So maybe a bit better. (I'm no ARM pro, but the compiler does seem to
take advantage of the fact that it's testing the real sign bit at
least.)

Btw, & 0x80000000 generates the same code.

/Ulf

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Option to make unsigned->signed conversion always well-defined?
  2011-10-07 19:36   ` Ulf Magnusson
@ 2011-10-07 22:24     ` Florian Weimer
  2011-10-07 22:48       ` Ulf Magnusson
  0 siblings, 1 reply; 18+ messages in thread
From: Florian Weimer @ 2011-10-07 22:24 UTC (permalink / raw)
  To: Ulf Magnusson; +Cc: gcc

* Ulf Magnusson:

> Are you thinking of something like this?
>
> bool overflow_bit2(unsigned int a, unsigned int b) {
>     const unsigned int ashift = a << 24;
>     const unsigned int bshift = b << 24;
>     const unsigned int sum = a + b;
>     return (int)(~(a ^ b) & (a ^ sum)) < 0;
> }

Yes, but rather like :

  bool overflow_bit2(unsigned char a, unsigned char b) {
    const unsigned char sum = a + b;
    return ((signed char)(~(a ^ b) & (a ^ sum))) < 0;
  }

It still results in abysmal code, given that this should result in two
or three instructions on most architectures.

Are machine code insertions an option?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Option to make unsigned->signed conversion always well-defined?
  2011-10-07 22:24     ` Florian Weimer
@ 2011-10-07 22:48       ` Ulf Magnusson
  2011-10-07 23:36         ` Florian Weimer
  0 siblings, 1 reply; 18+ messages in thread
From: Ulf Magnusson @ 2011-10-07 22:48 UTC (permalink / raw)
  To: Florian Weimer; +Cc: gcc

On Fri, Oct 7, 2011 at 7:35 PM, Florian Weimer <fw@deneb.enyo.de> wrote:
> * Ulf Magnusson:
>
>> Are you thinking of something like this?
>>
>> bool overflow_bit2(unsigned int a, unsigned int b) {
>>     const unsigned int ashift = a << 24;
>>     const unsigned int bshift = b << 24;
>>     const unsigned int sum = a + b;
>>     return (int)(~(a ^ b) & (a ^ sum)) < 0;
>> }
>
> Yes, but rather like :
>
>  bool overflow_bit2(unsigned char a, unsigned char b) {
>    const unsigned char sum = a + b;
>    return ((signed char)(~(a ^ b) & (a ^ sum))) < 0;
>  }
>
> It still results in abysmal code, given that this should result in two
> or three instructions on most architectures.
>
> Are machine code insertions an option?
>

Tried that version, but it seems to generate worse (or bigger anyway -
haven't benchmarked it) code:

  90:	eb01 0c00 	add.w	ip, r1, r0
  94:	b2c2      	uxtb	r2, r0
  96:	ea82 030c 	eor.w	r3, r2, ip
  9a:	ea82 0101 	eor.w	r1, r2, r1
  9e:	ea23 0001 	bic.w	r0, r3, r1
  a2:	f3c0 10c0 	ubfx	r0, r0, #7, #1
  a6:	4770      	bx	lr
  a8:	f3af 8000 	nop.w
  ac:	f3af 8000 	nop.w

Good machine code would be fun to see, though I might need to brush up
on my ARM.

/Ulf

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Option to make unsigned->signed conversion always well-defined?
  2011-10-07 22:48       ` Ulf Magnusson
@ 2011-10-07 23:36         ` Florian Weimer
  0 siblings, 0 replies; 18+ messages in thread
From: Florian Weimer @ 2011-10-07 23:36 UTC (permalink / raw)
  To: Ulf Magnusson; +Cc: gcc

* Ulf Magnusson:

> Good machine code would be fun to see, though I might need to brush up
> on my ARM.

It turns out that ARM doesn't seem to have 8-bit overflow detection,
so something like this has to be used (with the arguments in R1 and
R2, result in the lower bit of R3):


  MOV    R3, R1, LSL #24
  ADDS   R3, R2, R3, LSL #24
  ADDVS  R3, R3, #1

Not sure if this is correct, I don't really know ARM assembly.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Option to make unsigned->signed conversion always well-defined?
@ 2011-10-06 18:24 Jeremy Hall
  0 siblings, 0 replies; 18+ messages in thread
From: Jeremy Hall @ 2011-10-06 18:24 UTC (permalink / raw)
  To: gcc

Hi.

Instead of all the clever bit twiddling I have used code similar to

   sum > UINT8_MAX

which just generates

      cmp  ax,255
      seta  al

which seems to be far more efficient (even the signed version gets optimized
down to the above single check).
Please could someone tell me if I have missed something here????

Signed check:

bool overflow(int16_t a, int16_t b)
{
   const int16_t sum = a + b;
   return sum > INT8_MAX || sum < INT8_MIN;
}

Unsigned check

bool overflow(uint16_t a, uint16_t b)
{
   const uint16_t sum = a + b;
   return sum > UINT8_MAX;
}

Jeremy

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2011-10-07 19:36 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-10-05 22:56 Option to make unsigned->signed conversion always well-defined? Ulf Magnusson
2011-10-06  8:25 ` Ulf Magnusson
2011-10-06  9:04 ` Pedro Pedruzzi
2011-10-06 10:52   ` Ulf Magnusson
2011-10-06 14:42     ` Ulf Magnusson
2011-10-06 14:45       ` Miles Bader
2011-10-06 14:48         ` Ulf Magnusson
2011-10-06 21:31           ` Pedro Pedruzzi
2011-10-07 11:52             ` Miles Bader
2011-10-07 17:20               ` Pedro Pedruzzi
2011-10-07 17:35                 ` Miles Bader
     [not found] ` <4E8DBF3C.1020700@redhat.com>
2011-10-06 22:08   ` Ulf Magnusson
2011-10-06 22:46 ` Florian Weimer
2011-10-07 19:36   ` Ulf Magnusson
2011-10-07 22:24     ` Florian Weimer
2011-10-07 22:48       ` Ulf Magnusson
2011-10-07 23:36         ` Florian Weimer
2011-10-06 18:24 Jeremy Hall

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).