public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Jason Merrill <jason@redhat.com>
To: Martin Sebor <msebor@gmail.com>,
	Gcc Patch List <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] 69517 - [5/6 regression] SEGV on a VLA with excess initializer elements
Date: Mon, 04 Apr 2016 23:35:00 -0000	[thread overview]
Message-ID: <5702FA46.9020807@redhat.com> (raw)
In-Reply-To: <56FEFE08.8010207@gmail.com>

On 04/01/2016 07:02 PM, Martin Sebor wrote:
>> Fair enough.  I don't think we can impose an arbitrary 64K limit,
>> however, as that is a lot smaller than the 8MB default stack size, and
>> programs can use setrlimit to increase the stack farther.  For GCC 6 let
>> not impose any limit beyond non-negative/overflowing, and as you say we
>> can do something better in GCC 7.
>
> Would you be open to imposing a somewhat more permissive limit,
> say on the order of one or a few megabytes (but less than the
> default 8 MB on Linux)?
>
> I ask because I expect the majority of programmer errors with
> VLAs to be due to out-of-range bounds that couldn't be
> accommodated even if stack space was extended well beyond
> the Linux default (say hundreds of MB), or that would result
> in complete stack space exhaustion.  I.e., not caused by
> deliberately trying to create very large VLAs but rather by
> accidentally creating VLAs with unconstrained bounds (due to
> a failure to validate input, uninitialized unsigned variables,
> etc.)
>
> I expect fewer cases to be due to negative or zero bounds or
> excessive initializers.
>
> I also expect programmers to want to find out about such bugs
> sooner (i.e., in unit testing with plentiful stack space) rather
> than after software has been deployed (and under low stack space
> conditions not exercised during unit testing).
>
> To that end, I think a lower limit is going to be more helpful
> than a permissive one (or none at all).

> But if even a few MB seems too strict, I would find having even
> an exceedingly liberal limit (say 1GB) much preferable to none
> at all as it makes it possible to exercise boundary conditions
> such as the size overflow problem you noted below.

That sounds reasonable, as long as users with unusual needs can adjust 
it with a flag, but even so I'm nervous about doing this in stage 4.  It 
certainly isn't a regression.

> During additional testing it dawned on me that there is no good
> way to validate (or even to initialize) the initializer list of
> a multi-dimensional VLA that isn't unambiguously braced.
>
> For example, the following VLA with N = 2 and N = 3:
>
>      int A [M][N] = { 1, 2, 3, 4 };
>
> Unpatched, GCC initializes it to { { 1, 2, 3 }, { 0, 0, 0 } }.
> With my first patch, GCC throws.  Neither is correct, but doing
> the right thing would involve emitting parameterizing the
> initialization code for the value of each bound.  While that
> might be doable it feels like a bigger change than I would be
> comfortable attempting at this stage.  To avoid the problem I've
> made it an error to specify a partially braced VLA initializer.

Sounds good.

> If you think it's worthwhile, I can see about implementing the
> runtime reshaping in stage 1.

No, thanks.  I think requiring explicit bracing is fine.

>> It seems to me that we want use the existing check for excess
>> initializers in build_vec_init, in the if (length_check) section, though
>> as you mention in 70019 that needs to be improved to handle STRING_CST.
>
> I don't think modifying build_vec_init() alone would be sufficient.
> For example, the function isn't called for a VLA with a constant
> bound like this one:
>
>   int A [2][N] = { 1, 2, 3, 4 };

That seems like a bug, due to array_of_runtime_bound_p returning false 
for that array.

>> Also, I think we should check for invalid bounds in
>> compute_array_index_type, next to the UBsan code.  Checking bounds only
>> from cp_finish_decl means that we don't check uses of VLA types other
>> than variable declarations.
>
> You mean VLA typedefs?  That's true, though I have consciously
> avoided dealing with those.  They're outlawed in N3639 and so
> I've been focusing just on variables.  But since GCC accepts
> VLA typedefs too I was thinking I would bring them up at some
> point in the future to decide what to do about them.

And cast to pointer to VLAs.  But for non-variable cases we don't care 
about available stack, so we wouldn't want your allocation limit to apply.

> As for where to add the bounds checking code, I also at first
> thought of checking the bounds parallel to the UBSan code in
> compute_array_index_type() and even tried that approach. The
> problem with it is that it only considers one array dimension
> at a time, without the knowledge of the others.  As a result,
> as noted in sanitizer/70051, it doesn't correctly detect
> overflows in the bounds of multidimensional VLAs.

It doesn't, but I don't see why it couldn't.  It should be fine to check 
each dimension for overflow separately; if an inner dimension doesn't 
overflow, we can go on and consider the outer dimension.

Incidentally, I was wondering if it would make sense to use the 
overflowing calculation for both TYPE_SIZE and the sanity check when 
we're doing both.

>>> +          /* Avoid instrumenting constexpr functions.  Those must
>>> +         be checked statically, and the (non-constexpr) dynamic
>>> +         instrumentation would cause them to be rejected.  */

>> Hmm, this sounds wrong; constexpr functions can also be called with
>> non-constant arguments, and the instrumentation should be folded away
>> when evaluating a call with constant arguments.

> You're right that constexpr functions should be checked as
> well.  Unfortunately, at present, due to c++/70507 the check
> (or rather the call to __builtin_mul_overflow) isn't folded
> away and we end up with error: call to internal function.

Ah, sure.  It should be pretty simple to teach the constexpr code how to 
handle that built-in.

Jason

  reply	other threads:[~2016-04-04 23:35 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-03-07  1:38 Martin Sebor
2016-03-14 21:26 ` Martin Sebor
2016-03-21 22:10   ` PING #2 " Martin Sebor
2016-03-23 18:34 ` Jason Merrill
2016-03-23 22:50   ` Martin Sebor
2016-03-24 16:24     ` Jason Merrill
2016-04-01 23:02       ` Martin Sebor
2016-04-04 23:35         ` Jason Merrill [this message]
2016-04-08  0:05           ` Martin Sebor
2016-04-10 23:14             ` Martin Sebor
2016-04-12 15:43               ` Jason Merrill
2016-04-12 17:37                 ` Martin Sebor
2016-04-12 18:17               ` Jason Merrill
2016-04-13 18:37                 ` Martin Sebor
2016-04-13 19:26                   ` Jason Merrill
2016-04-14  2:35                   ` H.J. Lu
2016-04-14 15:32                     ` Martin Sebor
2016-04-14 10:40                   ` Andreas Schwab
2016-04-14 15:26                     ` Martin Sebor
2016-04-15  7:11                       ` Christophe Lyon
2016-04-15 12:31                       ` Jakub Jelinek
2016-04-15 14:33                         ` Martin Sebor

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5702FA46.9020807@redhat.com \
    --to=jason@redhat.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=msebor@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).