From: Richard Guenther
To: Tom de Vries
Cc: Jeff Law, Zdenek Dvorak, gcc-patches@gcc.gnu.org
Date: Fri, 17 Jun 2011 10:56:00 -0000
Subject: Re: [PATCH PR45098] Disallow NULL pointer in pointer arithmetic
In-Reply-To: <4DFB2F3A.3040706@codesourcery.com>
References: <4DF9A526.9060906@codesourcery.com> <4DFA7D1C.9040105@redhat.com> <4DFB2F3A.3040706@codesourcery.com>

On Fri, Jun 17, 2011 at 12:40 PM, Tom de Vries wrote:
> On 06/17/2011 12:01 AM, Jeff Law wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> On 06/16/11 00:39, Tom de Vries wrote:
>>> Hi,
>>>
>>> Consider the following example.
>>>
>>> extern unsigned int foo (int*) __attribute__((pure));
>>> unsigned int
>>> tr (int array[], int n)
>>> {
>>>   unsigned int i;
>>>   unsigned int sum = 0;
>>>   for (i = 0; i < n; i++)
>>>     sum += foo (&array[i]);
>>>   return sum;
>>> }
>>>
>>> For 32-bit pointers, the analysis in infer_loop_bounds_from_pointer_arith
>>> currently concludes that the range of valid &array[i] is &array[0x0] to
>>> &array[0x3fffffff], meaning 0x40000000 distinct values.
>>> This implies that i < n is executed at most 0x40000001 times, and i < n
>>> cannot be eliminated by a 32-bit iterator with step 4, since that one has
>>> only 0x40000000 distinct values.
>>>
>>> The patch reasons that NULL cannot be used or produced by pointer
>>> arithmetic, and that we can exclude the possibility of the NULL pointer in the
>>> range. So the range of valid &array[i] is &array[0] to &array[0x3ffffffe],
>>> meaning 0x3fffffff distinct values.
>>> This implies that i < n is executed at most 0x40000000 times and i < n can be
>>> eliminated.
>>>
>>> The patch implements this new limitation by changing the (low, high, step)
>>> triplet in infer_loop_bounds_from_pointer_arith from (0x0, 0xffffffff, 0x4)
>>> to (0x4, 0xffffffff, 0x4).
>>>
>>> I'm not too happy about the test for C-like language: ptrdiff_type_node !=
>>> NULL_TREE, but I'm not sure how else to test for this.
>>>
>>> Bootstrapped and reg-tested on x86_64.
>>>
>>> I will send the adapted test cases in a separate email.
>
>> Interesting.  I'd never thought about the generation/use angle to prove
>> a pointer was non-null.  ISTM we could use that same logic to infer that
>> more pointers are non-null in extract_range_from_binary_expr.
>>
>> Interested in tackling that improvement, obviously as an independent patch?
>>
>
> I'm not familiar with vrp code, but.. something like this?
>
> Index: tree-vrp.c
> ===================================================================
> --- tree-vrp.c  (revision 173703)
> +++ tree-vrp.c  (working copy)
> @@ -2273,7 +2273,12 @@ extract_range_from_binary_expr (value_ra
>        {
>          /* For pointer types, we are really only interested in asserting
>             whether the expression evaluates to non-NULL.  */
> -         if (range_is_nonnull (&vr0) || range_is_nonnull (&vr1))
> +         if (flag_delete_null_pointer_checks && nowrap_type_p (expr_type))

the latter would always return true

Btw, I guess you'll "miscompile" a load of code that is strictly
undefined.  So I'm not sure we want to do this against our users ...

Oh, and of course it's even wrong.  I think it needs
&& !range_includes_zero (&vr1) (which we probably don't have).
The offset may be 0 and NULL + 0 is still NULL.

Richard.

> +           {
> +             set_value_range_to_nonnull (vr, expr_type);
> +             set_value_range_to_nonnull (&vr0, expr_type);
> +           }
> +         else if (range_is_nonnull (&vr0) || range_is_nonnull (&vr1))
>            set_value_range_to_nonnull (vr, expr_type);
>          else if (range_is_null (&vr0) && range_is_null (&vr1))
>            set_value_range_to_null (vr, expr_type);
>
> Thanks,
> - Tom
>