Date: Mon, 26 Jan 2015 19:54:00 -0000
From: Martin Uecker
To: gcc Mailing List
Cc: Jeff Law, Joseph Myers, Jakub Jelinek, Marek Polacek, Florian Weimer, "Balaji V. Iyer"
Subject: array bounds, sanitizer, safe programming, and cilk array notation
Message-ID: <20150126115359.295659da@lemur>

Hi all,

I am writing numerical code, so I am trying to make the use of arrays
in C (with gcc) suck a bit less. In general, the long term goal would
be to have either a compile-time warning or the possibility to get a
run-time error if one writes beyond the end of an array as specified
by its type.

So one example (see below) I looked at is where I pass an array of too
small size to a function, to see how this can be diagnosed. In some
cases, we can get a run-time error with the address sanitizer, but this
is fairly limited (e.g. it does not work when the array is embedded in
a struct) and I also got mixed results when the function is inlined.

For pointers to arrays with static size one can get an "incompatible
pointer" warning - which is nice.

With clang, I also get a warning for pointers which are declared as
array parameters and use the 'static' keyword to specify a minimum
size. This is a diagnostic we are currently missing.

The next step would be to have diagnostics also for the VLA case if the
size is known at compilation time (as in the example below) and a
run-time error when it is not (maybe with the undefined behaviour
sanitizer?).

If we have the latter, I think this might also help with safer
programming in C, because one would get either a compile-time or
run-time error when passing a buffer which is too small to a function.
For example, snprintf could have a prototype like this:

    int snprintf(size_t size; char str[static size], size_t size,
                 const char *format, ...);

That VLAs essentially provide the bounded pointer type C is missing has
been pointed out before, e.g. there was a proposal by John Nagle,
although he proposed rather intrusive language changes (e.g. adding
references to C) which are not necessary in my opinion:

https://gcc.gnu.org/ml/gcc/2012-08/msg00360.html

Next, what is missing is a way to diagnose problems inside the called
functions. -Warray-bounds=2 (with my recently accepted patch) helps
with this, but - so far - only for static arrays:

    void foo(int (*x)[4])
    {
        (*x)[4] = 5; // warning
    }

It would be nice to also have these warnings and run-time errors (with
the undefined behaviour sanitizer) for VLAs.

Finally, I think we should have corresponding warnings also for
pointers which are declared as array parameters:

    void foo2(int x[4])
    {
        x[4] = 5;
    }

The latter does not currently produce a warning, because x is converted
to a pointer and the length is ignored.

If it is not possible to have a warning here for compatibility reasons,
one possibility is to have an extension similar to 'static' which makes
'x' a proper array in the callee, e.g. something like:

    void foo2(int x[array 4]) // x is now of type int[4] and not int*
    {
        x[4] = 5; // error
    }

The semantics would be that the array is still passed as a pointer but
the type of x would be int[4]. Because it immediately decays into a
pointer when used, no code generation changes would be required (except
maybe when looking at the type with sizeof and _Generic).

Another reason I like this is because Cilk array notation currently
requires the length to be specified for 'x' because it is a pointer and
not an array.
If x were an array, something like this would work:

    void foo2(int x[array 4])
    {
        x[:] = 1;
    }

In fact, the documentation for Cilk has such examples (without the
array keyword), and I guess this works on the Intel compiler but not on
gcc.

I am willing to spend some (limited) time on all of this, but I thought
I would ask for comments first. I appreciate any feedback, suggestions,
and help!

Cheers,
Martin


// file 1

    extern void bar(int x[static 5])
    {
        for (int i = 0; i < 5; i++)
            x[i] = 1;
    }

    extern void bar2(int (*x)[5])
    {
        for (int i = 0; i < 5; i++)
            (*x)[i] = 1;
    }

// file 2

    #include <stdio.h>

    extern void bar(int x[static 5]);
    extern void bar2(int (*x)[5]);

    int main()
    {
        int x[4] = { 0 };

        bar(x);      // warning only with clang (found by asan)
        bar2(&x);    // warning (found by asan)

        int c = 4;
        int y[c];

        for (int i = 0; i < c; i++)
            y[i] = 0;

        bar(y);      // not diagnosed (found by asan)
        bar2(&y);    // not diagnosed (found by asan)

        struct foo { int z[4]; int bar; } zz = { { 0 }, 0 };

        bar(zz.z);   // warning only with clang
        bar2(&zz.z); // warning

        printf("%d %d %d %d\n", x[0], x[1], x[2], x[3]);
        printf("%d %d %d %d\n", y[0], y[1], y[2], y[3]);
        printf("%d %d %d %d\n", zz.z[0], zz.z[1], zz.z[2], zz.z[3]);
    }