Re: [PATCH] 69517 - [5/6 regression] SEGV on a VLA with excess initializer elements

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: Martin Sebor <msebor@gmail.com>
To: Jason Merrill <jason@redhat.com>,
	Gcc Patch List <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] 69517 - [5/6 regression] SEGV on a VLA with excess initializer elements
Date: Fri, 08 Apr 2016 00:05:00 -0000	[thread overview]
Message-ID: <5706F5D5.9030204@gmail.com> (raw)
In-Reply-To: <5702FA46.9020807@redhat.com>

I've spent a ton of time trying to implement the suggested
changes (far too much in large part because of my own mistakes)
but I don't think they will work.  I'll try to clean up what
I have and post it for review.  I wanted to respond to this
how in case you have some suggestions or concerns with the
direction I'm taking in the meantime.

>> But if even a few MB seems too strict, I would find having even
>> an exceedingly liberal limit (say 1GB) much preferable to none
>> at all as it makes it possible to exercise boundary conditions
>> such as the size overflow problem you noted below.
>
> That sounds reasonable, as long as users with unusual needs can adjust
> it with a flag, but even so I'm nervous about doing this in stage 4.  It
> certainly isn't a regression.

I'm not comfortable adding a new option at this stage.  I'm also
not sure that an option to impose a static limit is the best
solution.  It seems that if we go to the trouble of making the limit
customizable it should be possible to change it without recompiling
everything (e.g., on ELF, we could check for a weak function and
call it to get the most up-to-date limit).

Let me restore the 4.9.3 behavior by setting the VLA size limit to
SIZE_MAX / 2 (that fixes the other regression that I just raised
in c++/70588 for the record).

>> I don't think modifying build_vec_init() alone would be sufficient.
>> For example, the function isn't called for a VLA with a constant
>> bound like this one:
>>
>>   int A [2][N] = { 1, 2, 3, 4 };
>
> That seems like a bug, due to array_of_runtime_bound_p returning false
> for that array.

It seems that a complete fix would involve (among other things)
replacing calls to array_of_runtime_bound_p with
variably_modified_type_p or similar since the N3639 arrays are
just a subset of those accepted by G++.  Unfortunately, that has
other repercussions (e.g., c++70555).

I replaced the call to array_of_runtime_bound_p in build_vec_init
with one to variably_modified_type_p to get around the above.
That  works, but it's only good for checking for excess
initializers in build_vec_init.  It's too late to check for
overflow in the VLA bounds because by that time the code to
allocate the stack has already been emitted.

>>> Also, I think we should check for invalid bounds in
>>> compute_array_index_type, next to the UBsan code.  Checking bounds only
>>> from cp_finish_decl means that we don't check uses of VLA types other
>>> than variable declarations.

I don't see how to make this work.  compute_array_index_type
doesn't have access to the CONSTRUCTOR for the initializer of
the VLA the initializer hasn't been parsed yet).  Without it
it's not possible to detect VLA size overflow in cases such
as in:

     T a [][N] = { { ... }, { ... } };

where the number of top-level elements determines whether or
not the size of the whole VLA would overflow or exceed the
maximum.

Given this, I believe the check does need to be implemented
somewhere in cp_finish_decl or one of the functions it calls
(such as check_initializer) and emitted before build_vec_init
is called or the initializer code it creates is emitted.

>>
>> You mean VLA typedefs?  That's true, though I have consciously
>> avoided dealing with those.  They're outlawed in N3639 and so
>> I've been focusing just on variables.  But since GCC accepts
>> VLA typedefs too I was thinking I would bring them up at some
>> point in the future to decide what to do about them.
>
> And cast to pointer to VLAs.  But for non-variable cases we don't care
> about available stack, so we wouldn't want your allocation limit to apply.

I don't want to implement it now, but I think the same limit
should apply in all cases, otherwise code remains susceptible
to unsigned integer wrapping.  For example:

   extern size_t N;
   typedef int A [N];
   int *a = (int*)malloc (sizeof (A));   // possible wraparound
   a [N - 1] = 0;                        // out-of-bounds write

It seems that the typedef will need to be accepted (in case it's
unused) but the runtime sizeof would need to do the checking and
potentially throw.  I haven't thought through the ramifications
yet.

>
>> As for where to add the bounds checking code, I also at first
>> thought of checking the bounds parallel to the UBSan code in
>> compute_array_index_type() and even tried that approach. The
>> problem with it is that it only considers one array dimension
>> at a time, without the knowledge of the others.  As a result,
>> as noted in sanitizer/70051, it doesn't correctly detect
>> overflows in the bounds of multidimensional VLAs.
>
> It doesn't, but I don't see why it couldn't.  It should be fine to check
> each dimension for overflow separately; if an inner dimension doesn't
> overflow, we can go on and consider the outer dimension.

As I explained above, I don't see how to make this work.

>
> Incidentally, I was wondering if it would make sense to use the
> overflowing calculation for both TYPE_SIZE and the sanity check when
> we're doing both.

I'm not sure what you mean here.  Can you elaborate?

>
>>>> +          /* Avoid instrumenting constexpr functions.  Those must
>>>> +         be checked statically, and the (non-constexpr) dynamic
>>>> +         instrumentation would cause them to be rejected.  */
>
>>> Hmm, this sounds wrong; constexpr functions can also be called with
>>> non-constant arguments, and the instrumentation should be folded away
>>> when evaluating a call with constant arguments.
>
>> You're right that constexpr functions should be checked as
>> well.  Unfortunately, at present, due to c++/70507 the check
>> (or rather the call to __builtin_mul_overflow) isn't folded
>> away and we end up with error: call to internal function.
>
> Ah, sure.  It should be pretty simple to teach the constexpr code how to
> handle that built-in.

I'd be glad to do this work but I don't believe I can get it done
in time for 6.0.

Martin

next prev parent reply	other threads:[~2016-04-08  0:05 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-03-07  1:38 Martin Sebor
2016-03-14 21:26 ` Martin Sebor
2016-03-21 22:10   ` PING #2 " Martin Sebor
2016-03-23 18:34 ` Jason Merrill
2016-03-23 22:50   ` Martin Sebor
2016-03-24 16:24     ` Jason Merrill
2016-04-01 23:02       ` Martin Sebor
2016-04-04 23:35         ` Jason Merrill
2016-04-08  0:05           ` Martin Sebor [this message]
2016-04-10 23:14             ` Martin Sebor
2016-04-12 15:43               ` Jason Merrill
2016-04-12 17:37                 ` Martin Sebor
2016-04-12 18:17               ` Jason Merrill
2016-04-13 18:37                 ` Martin Sebor
2016-04-13 19:26                   ` Jason Merrill
2016-04-14  2:35                   ` H.J. Lu
2016-04-14 15:32                     ` Martin Sebor
2016-04-14 10:40                   ` Andreas Schwab
2016-04-14 15:26                     ` Martin Sebor
2016-04-15  7:11                       ` Christophe Lyon
2016-04-15 12:31                       ` Jakub Jelinek
2016-04-15 14:33                         ` Martin Sebor

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5706F5D5.9030204@gmail.com \
    --to=msebor@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jason@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).