public inbox for gcc@gcc.gnu.org
* Fwd: Re: [LLVMdev] Handling of pointer difference in llvm-gcc and clang
@ 2011-08-11 15:15 Florian Merz
  2011-08-11 15:48 ` Richard Guenther
  0 siblings, 1 reply; 13+ messages in thread
From: Florian Merz @ 2011-08-11 15:15 UTC (permalink / raw)
  To: gcc

Dear gcc developers,

this is about an issue that popped up in a verification project [1] based on 
LLVM, but it seems to be already present in the gimple code, before llvm-gcc 
transforms the gimple code to LLVM-IR.

In short:
Calculating the difference of two pointers seems to be treated by gcc as a 
signed integer subtraction. While the result should be of type ptrdiff_t and 
therefore signed, we believe the subtraction itself should not be signed.

Signed subtraction might overflow if a large positive number is subtracted from 
a large negative number. So subtracting for example from the pointer value 
0x80...0 (a large negative signed integer) the pointer value 0x7F...F (a large 
positive signed integer) should in theory be perfectly fine, but treating this 
as a signed subtraction causes an overflow and therefore undefined behaviour.
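
A minimal sketch of the overflow described above, assuming 32-bit two's
complement integers and the GCC/Clang builtin __builtin_sub_overflow. The
function names are hypothetical and only model the pointer values as plain
integers:

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if a - b cannot be represented as a signed 32-bit value,
   i.e. the case that is undefined behaviour for a signed subtraction.
   __builtin_sub_overflow is a GCC (5 and later) and Clang builtin. */
bool signed_sub_overflows(int32_t a, int32_t b)
{
    int32_t result;
    return __builtin_sub_overflow(a, b, &result);
}

/* The same bit patterns subtracted as unsigned values: the result wraps
   modulo 2^32 and is always well defined. */
uint32_t unsigned_sub(uint32_t a, uint32_t b)
{
    return (uint32_t)(a - b);
}
```

With these helpers, signed_sub_overflows(INT32_MIN, INT32_MAX) reports an
overflow, while unsigned_sub(0x80000000u, 0x7FFFFFFFu) yields 1.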

Can someone explain why this is treated as a signed subtraction?

Thanks a lot and regards,
 Florian

P.S: It seems like clang does not treat this subtraction as signed.

[1] http://baldur.iti.kit.edu/llbmc/

----------  Forwarded Message  ----------

Subject: Re: [LLVMdev] Handling of pointer difference in llvm-gcc and clang
Date: Wednesday, 10 August 2011, 19:12:43
From: Jack Howarth <howarth@bromo.med.uc.edu>
To: Duncan Sands <baldrick@free.fr>
Cc: llvmdev@cs.uiuc.edu

On Wed, Aug 10, 2011 at 06:13:16PM +0200, Duncan Sands wrote:
> Hi Stephan,
> 
> > We are developing a bounded model checker for C/C++ programs
> > (http://baldur.iti.kit.edu/llbmc/) that operates on LLVM's intermediate
> > representation.  While checking a C++ program that uses STL containers
> > we noticed that llvm-gcc and clang handle pointer differences in
> > disagreeing ways.
> >
> > Consider the following C function:
> > int f(int *p, int *q)
> > {
> >       return q - p;
> > }
> >
> > Here's the LLVM code generated by llvm-gcc (2.9):
> > define i32 @f(i32* %p, i32* %q) nounwind readnone {
> > entry:
> >     %0 = ptrtoint i32* %q to i32
> >     %1 = ptrtoint i32* %p to i32
> >     %2 = sub nsw i32 %0, %1
> >     %3 = ashr exact i32 %2, 2
> >     ret i32 %3
> > }
> >
> > And here is what clang (2.9) produces:
> > define i32 @f(i32* %p, i32* %q) nounwind readnone {
> >     %1 = ptrtoint i32* %q to i32
> >     %2 = ptrtoint i32* %p to i32
> >     %3 = sub i32 %1, %2
> >     %4 = ashr exact i32 %3, 2
> >     ret i32 %4
> > }
> >
> > Thus, llvm-gcc added the nsw flag to the sub, whereas clang didn't.
> >
> > We think that clang is right and llvm-gcc is wrong:  it could be the
> > case that p and q point into the same array, that q is 0x80000000, and
> > that p is 0x7FFFFFFE.  Then the sub results in a signed overflow, i.e.,
> > sub with nsw is a trap value.
> >
> > Is this a bug in llvm-gcc?
> 
> in llvm-gcc (and dragonegg) this is coming directly from GCC's gimple:
> 
> f (int * p, int * q)
> {
>    long int D.2718;
>    long int D.2717;
>    long int p.1;
>    long int q.0;
>    int D.2714;
> 
> <bb 2>:
>    q.0_2 = (long int) q_1(D);
>    p.1_4 = (long int) p_3(D);
>    D.2717_5 = q.0_2 - p.1_4;
>    D.2718_6 = D.2717_5 /[ex] 4;
>    D.2714_7 = (int) D.2718_6;
>    return D.2714_7;
> 
> }
> 
> Signed overflow in the difference of two long int (ptrdiff_t) values results in
> undefined behaviour according to the GCC type system, which is where the nsw
> flag comes from.
> 
> The C front-end generates this gimple in the pointer_diff routine.  The above
> is basically a direct transcription of what pointer_diff does.
> 
> In short, I don't know if this is right or wrong; but if it is wrong it seems
> to be a bug in GCC's C frontend.

Shouldn't we cc this over to the gcc mailing list for clarification then?
             Jack

> 
> Ciao, Duncan.
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev@cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

-------------------------------------------------------------


* Re: Re: [LLVMdev] Handling of pointer difference in llvm-gcc and clang
  2011-08-11 15:15 Fwd: Re: [LLVMdev] Handling of pointer difference in llvm-gcc and clang Florian Merz
@ 2011-08-11 15:48 ` Richard Guenther
  2011-08-11 16:05   ` Florian Merz
  0 siblings, 1 reply; 13+ messages in thread
From: Richard Guenther @ 2011-08-11 15:48 UTC (permalink / raw)
  To: florian.merz; +Cc: gcc

On Thu, Aug 11, 2011 at 5:15 PM, Florian Merz <florian.merz@kit.edu> wrote:
> Dear gcc developers,
>
> this is about an issue that popped up in a verification project [1] based on
> LLVM, but it seems to be already present in the gimple code, before llvm-gcc
> transforms the gimple code to LLVM-IR.
>
> In short:
> Calculating the difference of two pointers seems to be treated by gcc as a
> signed integer subtraction. While the result should be of type ptrdiff_t and
> therefore signed, we believe the subtraction itself should not be signed.
>
> Signed subtraction might overflow if a large positive number is subtracted from
> a large negative number. So subtracting for example from the pointer value
> 0x80...0 (a large negative signed integer) the pointer value 0x7F...F (a large
> positive signed integer) should in theory be perfectly fine, but treating this
> as a signed subtraction causes an overflow and therefore undefined behaviour.
>
> Can someone explain why this is treated as a signed subtraction?

GCC restricts objects to the size of half of the address-space thus
a valid pointer subtraction in C cannot overflow.

Richard.


* Re: [LLVMdev] Handling of pointer difference in llvm-gcc and clang
  2011-08-11 15:48 ` Richard Guenther
@ 2011-08-11 16:05   ` Florian Merz
  2011-08-11 17:11     ` Richard Guenther
  2011-08-11 17:13     ` Joe Buck
  0 siblings, 2 replies; 13+ messages in thread
From: Florian Merz @ 2011-08-11 16:05 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc

Thanks for your reply, Richard, but I'm not satisfied with your answer yet. :-)
If I'm right, then the problem I'm referring to doesn't require large objects.

See below for more.

On Thursday, 11 August 2011, 17:48:26, Richard Guenther wrote:
> On Thu, Aug 11, 2011 at 5:15 PM, Florian Merz <florian.merz@kit.edu> wrote:
> > Dear gcc developers,
> > 
> > this is about an issue that popped up in a verification project [1] based
> > on LLVM, but it seems to be already present in the gimple code, before
> > llvm-gcc transforms the gimple code to LLVM-IR.
> > 
> > In short:
> > Calculating the difference of two pointers seems to be treated by gcc as
> > a signed integer subtraction. While the result should be of type
> > ptrdiff_t and therefore signed, we believe the subtraction itself should
> > not be signed.
> > 
> > Signed subtraction might overflow if a large positive number is
> > subtracted from a large negative number. So subtracting for example from
> > the pointer value 0x80...0 (a large negative signed integer) the pointer
> > value 0x7F...F (a large positive signed integer) should in theory be
> > perfectly fine, but treating this as a signed subtraction causes an
> > overflow and therefore undefined behaviour.
> > 
> > Can someone explain why this is treated as a signed subtraction?
> 
> GCC restricts objects to the size of half of the address-space thus
> a valid pointer subtraction in C cannot overflow.

Consider an array containing 8 bytes starting at 0x7FFFFFFC. This array would 
go up to one less than 0x80000004.

If I remember the standard correctly, pointer subtraction is valid if both 
pointers point to elements of the same array or to one past the last element 
of the array. According to this 0x80000000 - 0x7FFFFFFF should be a valid 
pointer subtraction with the result 0x00000001.

But if the subtraction is treated as signed, this would be a signed integer 
overflow, as we subtract INT_MAX from INT_MIN, which surely must overflow, and 
the result therefore would be undefined.
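
The array example can be checked with a small sketch. It assumes 32-bit
addresses modelled as uint32_t values (no real memory is touched), and all
names are hypothetical. It subtracts with wrapping unsigned arithmetic and
reinterprets the result as signed, which GCC defines as reduction modulo
2^32:

```c
#include <stdint.h>

#define ARRAY_BASE 0x7FFFFFFCu  /* first byte of the array in the example */
#define ARRAY_SIZE 8u           /* eight bytes; one past the end is 0x80000004 */

/* Address of byte i; i == ARRAY_SIZE gives the one-past-the-end address. */
uint32_t element_address(uint32_t i)
{
    return ARRAY_BASE + i;
}

/* q - p computed with wraparound, then reinterpreted as a signed value.
   The unsigned-to-signed conversion is implementation-defined in C;
   GCC documents it as reduction modulo 2^32. */
int32_t wrapped_difference(uint32_t q, uint32_t p)
{
    return (int32_t)(uint32_t)(q - p);
}

/* Returns 1 if, for every pair of addresses in the array (including the
   one-past-the-end address), the wrapped difference equals the difference
   of the indices; returns 0 otherwise. */
int all_pairs_match(void)
{
    for (uint32_t i = 0; i <= ARRAY_SIZE; i++)
        for (uint32_t j = 0; j <= ARRAY_SIZE; j++)
            if (wrapped_difference(element_address(j), element_address(i))
                != (int32_t)j - (int32_t)i)
                return 0;
    return 1;
}
```

Every pair matches here even though the array straddles 0x80000000, because
no valid difference within the array exceeds its size of 8 bytes.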


* Re: [LLVMdev] Handling of pointer difference in llvm-gcc and clang
  2011-08-11 16:05   ` Florian Merz
@ 2011-08-11 17:11     ` Richard Guenther
  2011-08-11 18:58       ` Joseph S. Myers
  2011-08-11 17:13     ` Joe Buck
  1 sibling, 1 reply; 13+ messages in thread
From: Richard Guenther @ 2011-08-11 17:11 UTC (permalink / raw)
  To: florian.merz; +Cc: gcc, Joseph S. Myers

On Thu, Aug 11, 2011 at 6:05 PM, Florian Merz <florian.merz@kit.edu> wrote:
> Thanks for your reply, Richard, but I'm not satisfied with your answer yet. :-)
> If I'm right, then the problem I'm referring to doesn't require large objects.
>
> See below for more.
>
> On Thursday, 11 August 2011, 17:48:26, Richard Guenther wrote:
>> On Thu, Aug 11, 2011 at 5:15 PM, Florian Merz <florian.merz@kit.edu> wrote:
>> > Dear gcc developers,
>> >
>> > this is about an issue that popped up in a verification project [1] based
>> > on LLVM, but it seems to be already present in the gimple code, before
>> > llvm-gcc transforms the gimple code to LLVM-IR.
>> >
>> > In short:
>> > Calculating the difference of two pointers seems to be treated by gcc as
>> > a signed integer subtraction. While the result should be of type
>> > ptrdiff_t and therefore signed, we believe the subtraction itself should
>> > not be signed.
>> >
>> > Signed subtraction might overflow if a large positive number is
>> > subtracted from a large negative number. So subtracting for example from
>> > the pointer value 0x80...0 (a large negative signed integer) the pointer
>> > value 0x7F...F (a large positive signed integer) should in theory be
>> > perfectly fine, but treating this as a signed subtraction causes an
>> > overflow and therefore undefined behaviour.
>> >
>> > Can someone explain why this is treated as a signed subtraction?
>>
>> GCC restricts objects to the size of half of the address-space thus
>> a valid pointer subtraction in C cannot overflow.
>
> Consider an array containing 8 bytes starting at 0x7FFFFFFC. This array would
> go up to one less than 0x80000004.
>
> If I remember the standard correctly, pointer subtraction is valid if both
> pointers point to elements of the same array or to one past the last element
> of the array. According to this 0x80000000 - 0x7FFFFFFF should be a valid
> pointer subtraction with the result 0x00000001.
>
> But if the subtraction is treated as signed, this would be a signed integer
> overflow, as we subtract INT_MAX from INT_MIN, which surely must overflow, and
> the result therefore would be undefined.

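int x,y;
int main ()
{
  /* The constants below assume a target with 32-bit pointers.  */
  char *a, *b;
  __INTPTR_TYPE__ w;
  if (x)
    a = (char *) 0x7ffffffe;
  else
    a = (char *) 0x7fffffff;
  if (y)
    b = (char *) 0x80000001;
  else
    b = (char *) 0x80000000;
  w = b - a;
  return w;
}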

indeed traps with -ftrapv for me which suggests you are right.

Joseph?

Richard.


* Re: [LLVMdev] Handling of pointer difference in llvm-gcc and clang
  2011-08-11 16:05   ` Florian Merz
  2011-08-11 17:11     ` Richard Guenther
@ 2011-08-11 17:13     ` Joe Buck
  2011-08-11 17:15       ` Richard Guenther
  1 sibling, 1 reply; 13+ messages in thread
From: Joe Buck @ 2011-08-11 17:13 UTC (permalink / raw)
  To: Florian Merz; +Cc: Richard Guenther, gcc

On Thu, Aug 11, 2011 at 09:05:19AM -0700, Florian Merz wrote:
> If I remember the standard correctly, pointer subtraction is valid if both 
> pointers point to elements of the same array or to one past the last element 
> of the array. According to this 0x80000000 - 0x7FFFFFFF should be a valid 
> pointer subtraction with the result 0x00000001.
> 
> But if the subtraction is treated as signed, this would be a signed integer 
> overflow, as we subtract INT_MAX from INT_MIN, which surely must overflow, and 
> the result therefore would be undefined.

It is true that the C and C++ languages make signed integer overflow
undefined, but that's for actual integer types as declared by the user.
For pointers, though the subtraction has to be signed (because, for two
pointers, either can come later in the address space), this signed
subtraction has to be defined to work in a two's complement fashion (so
the wraparound in your example case works reliably).



* Re: [LLVMdev] Handling of pointer difference in llvm-gcc and clang
  2011-08-11 17:13     ` Joe Buck
@ 2011-08-11 17:15       ` Richard Guenther
  2011-08-11 17:21         ` Florian Merz
  2011-08-11 20:14         ` Gabriel Dos Reis
  0 siblings, 2 replies; 13+ messages in thread
From: Richard Guenther @ 2011-08-11 17:15 UTC (permalink / raw)
  To: Joe Buck; +Cc: Florian Merz, gcc

On Thu, Aug 11, 2011 at 7:13 PM, Joe Buck <Joe.Buck@synopsys.com> wrote:
> On Thu, Aug 11, 2011 at 09:05:19AM -0700, Florian Merz wrote:
>> If I remember the standard correctly, pointer subtraction is valid if both
>> pointers point to elements of the same array or to one past the last element
>> of the array. According to this 0x80000000 - 0x7FFFFFFF should be a valid
>> pointer subtraction with the result 0x00000001.
>>
>> But if the subtraction is treated as signed, this would be a signed integer
>> overflow, as we subtract INT_MAX from INT_MIN, which surely must overflow, and
>> the result therefore would be undefined.
>
> It is true that the C and C++ languages make signed integer overflow
> undefined, but that's for actual integer types as declared by the user.
> For pointers, though the subtraction has to be signed (because, for two
> pointers, either can come later in the address space), this signed
> subtraction has to be defined to work in a two's complement fashion (so
> the wraparound in your example case works reliably).

Of course GCC can't (yet) do both at the same time.  Thus we have to
use unsigned arithmetic when we want two's complement arithmetic.

Richard.

>
>


* Re: [LLVMdev] Handling of pointer difference in llvm-gcc and clang
  2011-08-11 17:15       ` Richard Guenther
@ 2011-08-11 17:21         ` Florian Merz
  2011-08-11 20:14         ` Gabriel Dos Reis
  1 sibling, 0 replies; 13+ messages in thread
From: Florian Merz @ 2011-08-11 17:21 UTC (permalink / raw)
  To: gcc

On Thursday, 11 August 2011, 19:15:41, Richard Guenther wrote:
> On Thu, Aug 11, 2011 at 7:13 PM, Joe Buck <Joe.Buck@synopsys.com> wrote:
> > On Thu, Aug 11, 2011 at 09:05:19AM -0700, Florian Merz wrote:
> >> If I remember the standard correctly, pointer subtraction is valid if
> >> both pointers point to elements of the same array or to one past the
> >> last element of the array. According to this 0x80000000 - 0x7FFFFFFF
> >> should be a valid pointer subtraction with the result 0x00000001.
> >> 
> >> But if the subtraction is treated as signed, this would be a signed
> >> integer overflow, as we subtract INT_MAX from INT_MIN, which surely
> >> must overflow, and the result therefore would be undefined.
> > 
> > It is true that the C and C++ languages make signed integer overflow
> > undefined, but that's for actual integer types as declared by the user.
> > For pointers, though the subtraction has to be signed (because, for two
> > pointers, either can come later in the address space), this signed
> > subtraction has to be defined to work in a two's complement fashion (so
> > the wraparound in your example case works reliably).
> 
> Of course GCC can't (yet) do both at the same time.  Thus we have to
> use unsigned arithmetic when we want two's complement arithmetic.

I agree with that. Unsigned subtraction isn't entirely correct either, since 
the result might be negative, but with unsigned subtraction at least we get 
two's complement arithmetic without trapping.


* Re: [LLVMdev] Handling of pointer difference in llvm-gcc and clang
  2011-08-11 17:11     ` Richard Guenther
@ 2011-08-11 18:58       ` Joseph S. Myers
  2011-08-11 20:16         ` Gabriel Dos Reis
  0 siblings, 1 reply; 13+ messages in thread
From: Joseph S. Myers @ 2011-08-11 18:58 UTC (permalink / raw)
  To: Richard Guenther; +Cc: florian.merz, gcc

On Thu, 11 Aug 2011, Richard Guenther wrote:

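> int x,y;
> int main ()
> {
>   /* The constants below assume a target with 32-bit pointers.  */
>   char *a, *b;
>   __INTPTR_TYPE__ w;
>   if (x)
>     a = (char *) 0x7ffffffe;
>   else
>     a = (char *) 0x7fffffff;
>   if (y)
>     b = (char *) 0x80000001;
>   else
>     b = (char *) 0x80000000;
>   w = b - a;
>   return w;
> }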
> 
> indeed traps with -ftrapv for me which suggests you are right.
> 
> Joseph?

Subtracting pointers via conversion to integers is wrong in a similar way 
to the pre-POINTER_PLUS_EXPR representation of pointer addition 
(converting the integer operand to a pointer type).  Unlike that 
representation it isn't actually nonsensical, but logically the operation 
of subtracting two pointers yielding an integer should be represented 
without needing to convert either pointer to an integer type.  In the 
absence of such a representation, converting to an unsigned type is 
indeed safer.  -ftrapv and -fwrapv should have no effect on pointer 
subtraction.

-- 
Joseph S. Myers
joseph@codesourcery.com


* Re: [LLVMdev] Handling of pointer difference in llvm-gcc and clang
  2011-08-11 17:15       ` Richard Guenther
  2011-08-11 17:21         ` Florian Merz
@ 2011-08-11 20:14         ` Gabriel Dos Reis
  1 sibling, 0 replies; 13+ messages in thread
From: Gabriel Dos Reis @ 2011-08-11 20:14 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Joe Buck, Florian Merz, gcc

On Thu, Aug 11, 2011 at 12:15 PM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Thu, Aug 11, 2011 at 7:13 PM, Joe Buck <Joe.Buck@synopsys.com> wrote:
>> On Thu, Aug 11, 2011 at 09:05:19AM -0700, Florian Merz wrote:
>>> If I remember the standard correctly, pointer subtraction is valid if both
>>> pointers point to elements of the same array or to one past the last element
>>> of the array. According to this 0x80000000 - 0x7FFFFFFF should be a valid
>>> pointer subtraction with the result 0x00000001.
>>>
>>> But if the subtraction is treated as signed, this would be a signed integer
>>> overflow, as we subtract INT_MAX from INT_MIN, which surely must overflow, and
>>> the result therefore would be undefined.
>>
>> It is true that the C and C++ languages make signed integer overflow
>> undefined, but that's for actual integer types as declared by the user.
>> For pointers, though the subtraction has to be signed (because, for two
>> pointers, either can come later in the address space), this signed
>> subtraction has to be defined to work in a two's complement fashion (so
>> the wraparound in your example case works reliably).
>
> Of course GCC can't (yet) do both at the same time.

yes, but GCC should mark its internal artifacts so that it surely distinguishes
user-provided abstractions (which may be subjected to harsh treatments)
from its own blessed babies.

>  Thus we have to
> use unsigned arithmetic when we want two's complement arithmetic.


* Re: [LLVMdev] Handling of pointer difference in llvm-gcc and clang
  2011-08-11 18:58       ` Joseph S. Myers
@ 2011-08-11 20:16         ` Gabriel Dos Reis
  2011-08-11 20:39           ` Joe Buck
  0 siblings, 1 reply; 13+ messages in thread
From: Gabriel Dos Reis @ 2011-08-11 20:16 UTC (permalink / raw)
  To: Joseph S. Myers; +Cc: Richard Guenther, florian.merz, gcc

On Thu, Aug 11, 2011 at 1:58 PM, Joseph S. Myers
<joseph@codesourcery.com> wrote:
>  -ftrapv and -fwrapv should have no effect on pointer subtraction.

Yes!

-- Gaby


* RE: [LLVMdev] Handling of pointer difference in llvm-gcc and clang
  2011-08-11 20:16         ` Gabriel Dos Reis
@ 2011-08-11 20:39           ` Joe Buck
  2011-08-12  7:32             ` Richard Guenther
  0 siblings, 1 reply; 13+ messages in thread
From: Joe Buck @ 2011-08-11 20:39 UTC (permalink / raw)
  To: Gabriel Dos Reis, Joseph S. Myers; +Cc: Richard Guenther, florian.merz, gcc

On Thu, Aug 11, 2011 at 1:58 PM, Joseph S. Myers
<joseph@codesourcery.com> wrote:
>  -ftrapv and -fwrapv should have no effect on pointer subtraction.

Gaby writes:

> Yes!

Wouldn't it suffice to convert the pointers to unsigned, do an unsigned subtraction, and then convert the result to signed? This would then guarantee that gcc uses two's complement semantics, independent of -ftrapv.
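
The proposal can be sketched in user-level C as follows. This is an illustration only, not GCC's pointer_diff routine: the helper name is hypothetical, and the sketch assumes that uintptr_t and ptrdiff_t have the same width and that converting an out-of-range unsigned value to a signed type wraps modulo 2^N, which is how GCC defines that conversion.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption of this sketch: both types have the same width, so the
   unsigned byte difference can be reinterpreted as a ptrdiff_t. */
_Static_assert(sizeof(uintptr_t) == sizeof(ptrdiff_t),
               "sketch assumes uintptr_t and ptrdiff_t have equal width");

/* Hypothetical helper: element difference q - p for int pointers,
   computed without any signed subtraction. */
ptrdiff_t int_pointer_difference(const int *q, const int *p)
{
    /* Unsigned subtraction wraps modulo 2^N; it cannot overflow or trap. */
    uintptr_t byte_diff = (uintptr_t)q - (uintptr_t)p;

    /* Reinterpret as signed (implementation-defined in C, modulo 2^N in
       GCC), then scale from bytes to elements. */
    return (ptrdiff_t)byte_diff / (ptrdiff_t)sizeof(int);
}
```

For an array int a[10], int_pointer_difference(&a[7], &a[2]) gives 5 and int_pointer_difference(&a[2], &a[7]) gives -5, so negative differences survive the round trip through the unsigned type.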


* Re: [LLVMdev] Handling of pointer difference in llvm-gcc and clang
  2011-08-11 20:39           ` Joe Buck
@ 2011-08-12  7:32             ` Richard Guenther
  2011-08-12  7:59               ` Florian Merz
  0 siblings, 1 reply; 13+ messages in thread
From: Richard Guenther @ 2011-08-12  7:32 UTC (permalink / raw)
  To: Joe Buck; +Cc: Gabriel Dos Reis, Joseph S. Myers, florian.merz, gcc

On Thu, Aug 11, 2011 at 10:36 PM, Joe Buck <Joe.Buck@synopsys.com> wrote:
> On Thu, Aug 11, 2011 at 1:58 PM, Joseph S. Myers
> <joseph@codesourcery.com> wrote:
>>  -ftrapv and -fwrapv should have no effect on pointer subtraction.
>
> Gaby writes:
>
>> Yes!
>
> Wouldn't it suffice to convert the pointers to unsigned, do an unsigned subtraction, and then convert the result to signed? This would then guarantee that gcc uses two's complement semantics, independent of -ftrapv.

Of course, I think that is what is being proposed.

Richard.


* Re: [LLVMdev] Handling of pointer difference in llvm-gcc and clang
  2011-08-12  7:32             ` Richard Guenther
@ 2011-08-12  7:59               ` Florian Merz
  0 siblings, 0 replies; 13+ messages in thread
From: Florian Merz @ 2011-08-12  7:59 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Joe Buck, Gabriel Dos Reis, Joseph S. Myers, gcc

So it seems like we agreed that this is a problem that should be fixed.

Shall I create a bug report for it?

On Friday, 12 August 2011, 09:32:11, Richard Guenther wrote:
> On Thu, Aug 11, 2011 at 10:36 PM, Joe Buck <Joe.Buck@synopsys.com> wrote:
> > On Thu, Aug 11, 2011 at 1:58 PM, Joseph S. Myers
> > 
> > <joseph@codesourcery.com> wrote:
> >>  -ftrapv and -fwrapv should have no effect on pointer subtraction.
> > 
> > Gaby writes:
> >> Yes!
> > 
> > Wouldn't it suffice to convert the pointers to unsigned, do an unsigned
> > subtraction, and then convert the result to signed? This would then
> > guarantee that gcc uses two's complement semantics, independent of
> > -ftrapv.
> 
> Of course, I think that is what is being proposed.
> 
> Richard.
