From: Florian Merz
Reply-To: florian.merz@kit.edu
To: Richard Guenther
Cc: gcc@gcc.gnu.org
Subject: Re: [LLVMdev] Handling of pointer difference in llvm-gcc and clang
Date: Thu, 11 Aug 2011 16:05:00 -0000
Message-Id: <201108111805.19582.florian.merz@kit.edu>
References: <201108111715.14240.florian.merz@kit.edu>

Thanks for your reply, Richard, but I'm not yet satisfied with your answer. :-)
If I'm right, then the problem I'm referring to doesn't require large objects.
See below for more.

On Thursday, 11 August 2011, 17:48:26, Richard Guenther wrote:
> On Thu, Aug 11, 2011 at 5:15 PM, Florian Merz wrote:
> > Dear gcc developers,
> >
> > this is about an issue that popped up in a verification project [1] based
> > on LLVM, but it seems to be already present in the gimple code, before
> > llvm-gcc transforms the gimple code to LLVM-IR.
> >
> > In short:
> > Calculating the difference of two pointers seems to be treated by gcc as
> > a signed integer subtraction. While the result should be of type
> > ptrdiff_t and therefore signed, we believe the subtraction itself should
> > not be signed.
> >
> > Signed subtraction might overflow if a large positive number is
> > subtracted from a large negative number. So subtracting for example from
> > the pointer value 0x80...0 (a large negative signed integer) the pointer
> > value 0x7F...F (a large positive signed integer) should in theory be
> > perfectly fine, but treating this as a signed subtraction causes an
> > overflow and therefore undefined behaviour.
> >
> > Can someone explain why this is treated as a signed subtraction?
>
> GCC restricts objects to the size of half of the address-space thus
> a valid pointer subtraction in C cannot overflow.

Consider an 8-byte array starting at 0x7FFFFFFC. Its last byte is at
0x80000003, one less than 0x80000004.

If I remember the standard correctly, pointer subtraction is valid if both
pointers point to elements of the same array or to one past the last element
of the array.

According to this, 0x80000000 - 0x7FFFFFFF should be a valid pointer
subtraction with the result 0x00000001.

But if the subtraction is treated as signed, this is a signed integer
overflow, since we subtract INT_MAX from INT_MIN, and the result would
therefore be undefined.

> Richard.
>
> > Thanks a lot and regards,
> >  Florian
> >
> > P.S.: It seems like clang does not treat this subtraction as signed.
> >
> > [1] http://baldur.iti.kit.edu/llbmc/
> >
> > ---------- Forwarded message ----------
> >
> > Subject: Re: [LLVMdev] Handling of pointer difference in llvm-gcc and
> > clang
> > Date: Wednesday, 10 August 2011, 19:12:43
> > From: Jack Howarth
> > To: Duncan Sands
> > Cc: llvmdev@cs.uiuc.edu
> >
> > On Wed, Aug 10, 2011 at 06:13:16PM +0200, Duncan Sands wrote:
> >> Hi Stephan,
> >>
> >> > We are developing a bounded model checker for C/C++ programs
> >> > (http://baldur.iti.kit.edu/llbmc/) that operates on LLVM's
> >> > intermediate representation. While checking a C++ program that uses
> >> > STL containers we noticed that llvm-gcc and clang handle pointer
> >> > differences in disagreeing ways.
> >> >
> >> > Consider the following C function:
> >> > int f(int *p, int *q)
> >> > {
> >> >   return q - p;
> >> > }
> >> >
> >> > Here's the LLVM code generated by llvm-gcc (2.9):
> >> > define i32 @f(i32* %p, i32* %q) nounwind readnone {
> >> > entry:
> >> >   %0 = ptrtoint i32* %q to i32
> >> >   %1 = ptrtoint i32* %p to i32
> >> >   %2 = sub nsw i32 %0, %1
> >> >   %3 = ashr exact i32 %2, 2
> >> >   ret i32 %3
> >> > }
> >> >
> >> > And here is what clang (2.9) produces:
> >> > define i32 @f(i32* %p, i32* %q) nounwind readnone {
> >> >   %1 = ptrtoint i32* %q to i32
> >> >   %2 = ptrtoint i32* %p to i32
> >> >   %3 = sub i32 %1, %2
> >> >   %4 = ashr exact i32 %3, 2
> >> >   ret i32 %4
> >> > }
> >> >
> >> > Thus, llvm-gcc added the nsw flag to the sub, whereas clang didn't.
> >> >
> >> > We think that clang is right and llvm-gcc is wrong: it could be the
> >> > case that p and q point into the same array, that q is 0x80000000, and
> >> > that p is 0x7FFFFFFE. Then the sub results in a signed overflow,
> >> > i.e., sub with nsw is a trap value.
> >> >
> >> > Is this a bug in llvm-gcc?
> >>
> >> in llvm-gcc (and dragonegg) this is coming directly from GCC's gimple:
> >>
> >> f (int * p, int * q)
> >> {
> >>    long int D.2718;
> >>    long int D.2717;
> >>    long int p.1;
> >>    long int q.0;
> >>    int D.2714;
> >>
> >> :
> >>    q.0_2 = (long int) q_1(D);
> >>    p.1_4 = (long int) p_3(D);
> >>    D.2717_5 = q.0_2 - p.1_4;
> >>    D.2718_6 = D.2717_5 /[ex] 4;
> >>    D.2714_7 = (int) D.2718_6;
> >>    return D.2714_7;
> >>
> >> }
> >>
> >> Signed overflow in the difference of two long int (ptrdiff_t) values
> >> results in undefined behaviour according to the GCC type system, which
> >> is where the nsw flag comes from.
> >>
> >> The C front-end generates this gimple in the pointer_diff routine. The
> >> above is basically a direct transcription of what pointer_diff does.
> >>
> >> In short, I don't know if this is right or wrong; but if it is wrong it
> >> seems to be a bug in GCC's C frontend.
> >
> > Shouldn't we cc this over to the gcc mailing list for clarification then?
> >            Jack
> >
> >> Ciao, Duncan.
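
For reference, here is a minimal sketch of the arithmetic discussed in this
thread, assuming 32-bit pointers modelled as plain integers. The helper names
signed_sub_overflows and wrapping_sub are made up for illustration; they are
not part of GCC, llvm-gcc or clang, and the code only mirrors the two ways of
reading the subtraction (signed with no wrap, versus modulo 2^32).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true if the mathematical result of a - b
 * does not fit into int32_t. The check itself never overflows. */
static bool signed_sub_overflows(int32_t a, int32_t b)
{
    return (b > 0 && a < INT32_MIN + b) ||
           (b < 0 && a > INT32_MAX + b);
}

/* Hypothetical helper: subtraction modulo 2^32, which is well defined
 * for unsigned operands in C. */
static uint32_t wrapping_sub(uint32_t a, uint32_t b)
{
    return (uint32_t)(a - b);
}
```

With q = 0x80000000 and p = 0x7FFFFFFF, the signed reading subtracts INT32_MAX
from INT32_MIN, so signed_sub_overflows reports an overflow, while
wrapping_sub gives the expected distance of 1.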