From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 64371 invoked by alias); 20 Jun 2017 08:18:26 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 64355 invoked by uid 89); 20 Jun 2017 08:18:24 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.9 required=5.0 tests=BAYES_00,SPF_PASS,T_RP_MATCHES_RCVD autolearn=ham version=3.3.2 spammy=excessive X-HELO: mx1.suse.de Received: from mx2.suse.de (HELO mx1.suse.de) (195.135.220.15) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 20 Jun 2017 08:18:23 +0000 Received: from relay1.suse.de (charybdis-ext.suse.de [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id D97F6ABC8; Tue, 20 Jun 2017 08:18:20 +0000 (UTC) Date: Tue, 20 Jun 2017 08:18:00 -0000 From: Richard Biener To: Jakub Jelinek cc: =?ISO-8859-15?Q?Martin_Li=A8ka?= , gcc-patches@gcc.gnu.org Subject: Re: [RFC PATCH] -fsanitize=pointer-overflow support (PR sanitizer/80998) In-Reply-To: <20170620081348.GE2123@tucnak> Message-ID: References: <20170619182515.GA2123@tucnak> <20170620081348.GE2123@tucnak> User-Agent: Alpine 2.20 (LSU 67 2015-01-07) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII X-SW-Source: 2017-06/txt/msg01402.txt.bz2 On Tue, 20 Jun 2017, Jakub Jelinek wrote: > On Tue, Jun 20, 2017 at 09:41:43AM +0200, Richard Biener wrote: > > On Mon, 19 Jun 2017, Jakub Jelinek wrote: > > > > > Hi! > > > > > > The following patch adds -fsanitize=pointer-overflow support, > > > which adds instrumentation (included in -fsanitize=undefined) that checks > > > that pointer arithmetics doesn't wrap. If the offset on ptr p+ off when treating > > > it as signed value is non-negative, we check whether the result is bigger > > > (uintptr_t comparison) than ptr, if it is negative in ssizetype, we check > > > whether the result is smaller than ptr, otherwise we check at runtime > > > whether (ssizetype) off < 0 and do the check based on that. > > > The patch checks both POINTER_PLUS_EXPR, as well as e.g. ADDR_EXPR of > > > handled components, and even handled components themselves (exception > > > is for constant offset when the base is an automatic non-VLA decl or > > > decl that binds to current function where we can at compile time for > > > sure guarantee it will fit). > > > > Does this "properly" interact with any array-bound sanitizing we do? > > Say, for > > > > &a->b[i].c.d > > > > ? > > It doesn't interact with it right now at all, and I think it shouldn't > for the case you wrote, &a->b[i].c.d even if i is within bounds of the > declared array the pointer still could point to something that wraps around. > struct S { struct T { struct U { int d; } c; } b[0x1000000]; } *a; > could still in a buggy program point to something that is actually much > shorter like a = (struct S *) malloc (0x10000 * sizeof (int)); > or similar and there could be a wrap around. > > What I have considered, but haven't implemented yet, is checking if there > is UBSAN_BOUNDS sanitization and it is actually array or structure with > array etc. (i.e. there is no pointer dereference) - for > &q.b[i].c.d > if the bounds check is present and we are sure about the object size > (i.e. automatic variable or locally defined file scope var). > > > > Martin has said he'll write the sanopt part of optimization > > > (if UBSAN_PTR for some pointer is dominated by UBSAN_PTR for the same > > > pointer and the offset is constant in both cases and equal or absolute value > > > bigger and same sign in the dominating UBSAN_PTR, we can avoid the dominated > > > check). > > > > > > For the cases where there is a dereference (i.e. not ADDR_EXPR of the > > > handled component or POINTER_PLUS_EXPR), I wonder if we couldn't ignore > > > say constant offsets in range <-4096, 4096> or something similar, hoping > > > people don't have anything mapped at the page 0 and -pagesize in hosted > > > env. Thoughts on that? > > > > Not sure what the problem is here? > > It would be an attempt to avoid sanitizing int foo (int *p) { return p[10] + p[-5]; } > (when the offset is constant and small and we dereference it). > If there is no page mapped at NULL or at the highest page in the virtual > address space, then the above will crash in case p + 10 or p - 5 wraps > around. Ah, so merely an optimization to avoid excessive instrumentation then, yes, this might work (maybe make 4096 a --param configurable to be able to disable it). > > > I've bootstrapped/regtested the patch on x86_64-linux and i686-linux > > > and additionally bootstrapped/regtested with bootstrap-ubsan on both too. > > > The latter revealed a couple of issues I'd like to discuss: > > > > > > 1) libcpp/symtab.c contains a couple of spots reduced into: > > > #define DELETED ((char *) -1) > > > void bar (char *); > > > void > > > foo (char *p) > > > { > > > if (p && p != DELETED) > > > bar (p); > > > } > > > where we fold it early into if ((p p+ -1) <= (char *) -3) > > > and as the instrumentation is done during ubsan pass, if p is NULL, > > > we diagnose this as invalid pointer overflow from NULL to 0xffff*f. > > > Shall we change the folder so that during GENERIC folding it > > > actually does the addition and comparison in pointer_sized_int > > > instead (my preference), or shall I move the UBSAN_PTR instrumentation > > > earlier into the FEs (but then I still risk stuff is folded earlier)? > > > > Aww, so we turn the pointer test into a range test ;) That it uses > > a pointer type rather than an unsigned integer type is a bug, probably > > caused by pointers being TYPE_UNSIGNED. > > > > Not sure if the folding itself is worthwhile to keep though, thus an > > option would be to not generate range tests from pointers? > > I'll have a look. Maybe only do it during reassoc and not earlier. It certainly looks somewhat premature in fold-const.c. > > > 3) not really related to this patch, but something I also saw during the > > > bootstrap-ubsan on i686-linux: > > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147426384 - 2147475412 cannot be represented in type 'int' > > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147426384 - 2147478324 cannot be represented in type 'int' > > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147450216 - 2147451580 cannot be represented in type 'int' > > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147450216 - 2147465664 cannot be represented in type 'int' > > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147469348 - 2147451544 cannot be represented in type 'int' > > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147482364 - 2147475376 cannot be represented in type 'int' > > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147483624 - 2147475376 cannot be represented in type 'int' > > > ../../gcc/bitmap.c:141:12: runtime error: signed integer overflow: -2147483628 - 2147451544 cannot be represented in type 'int' > > > ../../gcc/memory-block.cc:59:4: runtime error: signed integer overflow: -2147426384 - 2147475376 cannot be represented in type 'int' > > > ../../gcc/memory-block.cc:59:4: runtime error: signed integer overflow: -2147450216 - 2147451544 cannot be represented in type 'int' > > > The problem here is that we lower pointer subtraction, e.g. > > > long foo (char *p, char *q) { return q - p; } > > > as return (ptrdiff_t) ((ssizetype) q - (ssizetype) p); > > > and even for a valid testcase where we have an array across > > > the middle of the virtual address space, say the first one above > > > is (char *) 0x8000dfb0 - (char *) 0x7fffdfd4 subtraction, even if > > > there is 128KB array starting at 0x7fffd000, it will yield > > > UB (not in the source, but in whatever the compiler lowered it into). > > > So, shall we instead do the subtraction in sizetype and only then > > > cast? For sizeof (*ptr) > 1 I think we have some outstanding PR, > > > and it is more difficult to find out in what types to compute it. > > > Or do we want to introduce POINTER_DIFF_EXPR? > > > > Just use uintptr_t for the difference computation (well, an unsigned > > integer type of desired precision -- mind address-spaces), then cast > > the result to signed. > > Ok (of course, will handle this separately from the rest). Yes. Note I didn't look at the actual patch (yet). Richard.