public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Martin Sebor <msebor@gmail.com>
To: Jeff Law <law@redhat.com>, gcc-patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] add -Wmismatched-new-delete to middle end (PR 90629)
Date: Tue, 17 Nov 2020 11:02:13 -0700	[thread overview]
Message-ID: <585abfb6-d9a8-339d-2cca-7e95a96c82f3@gmail.com> (raw)
In-Reply-To: <2fff96f2-5e6d-03b5-3a18-3a89afa7fa14@redhat.com>

On 11/16/20 5:54 PM, Jeff Law wrote:
> 
> On 11/3/20 4:56 PM, Martin Sebor via Gcc-patches wrote:
>> Attached is a simple middle end implementation of detection of
>> mismatched pairs of calls to C++ new and delete, along with
>> a substantially enhanced implementation of -Wfree-nonheap-object.
>> The latter option has been in place since 2011 but detected only
>> the most trivial bugs.
>>
>> Unlike the Clang -Wmismatched-new-delete which diagnoses
>> declarations of "overloaded operator new() and operator delete()
>> functions that do not have a corresponding free store function
>> defined within the same scope", this patch detects mismatches
>> between calls to allocation and deallocation functions, such as
>> calling free() on the result of new, of delete on the result of
>> array new.  The functionality provided by Clang can be added on
>> top of what this feature does and since they are so close I think
>> it's fine to have both under the same option (a new level could
>> be introduced to distinguish the two).
>>
>> The -Wfree-nonheap-object enhancement lets the warning detect all
>> calls to free, realloc, or C++ delete, with pointers that can be
>> proven not to point to the first byte of an allocated object.
>>
>> The patch relies on the well-tested compute_objsize() function
>> for the determination of pointer provenance and makes use of
>> the changes in the following patch submitted for review just
>> yesterday:
>> https://gcc.gnu.org/pipermail/gcc-patches/2020-November/557807.html
>>
>> As usual, I tested on x86_64-linux with Glibc & Binutils/GDB with
>> no new false positives.
>>
>> Martin
>>
>> PS A few words on the implementation choices:
>>
>> The new code is in builtins.c only because -Wfree-nonheap-object
>> is there.  I still plan to move all of the invalid access checking
>> code into its own module or pass at some point but I didn't want
>> to make this improvement contingent on that restructuring.
>> Even though it's all in builtins.c, the code is called from calls.c.
>> This is so that simple mismatches can be diagnosed even when free
>> isn't handed in builtins.c (i.e., without optimization).
>> The warning makes no attempt to analyze the CFG or handle
>> conditional mismatches.  That will have to wait until the code
>> is moved to a GIMPLE pass.
>>
>> gcc-90629.diff
>>
>> PR c++/90629 - Support for -Wmismatched-new-delete
>>
>> gcc/ChangeLog:
>>
>> 	PR c++/90629
>> 	* builtins.c (access_ref::access_ref): Initialize new member.
>> 	(compute_objsize): Use access_ref::deref.  Handle simple pointer
>> 	assignment.
>> 	(expand_builtin): Remove handling of the free built-in.
>> 	(find_assignment_location): New function.
>> 	(gimple_call_alloc_p): Same.
>> 	(gimple_call_dealloc_argno): Same.
>> 	(gimple_call_dealloc_p): Same.
>> 	(matching_alloc_calls_p): Same.
>> 	(warn_dealloc_offset): Same.
>> 	* builtins.h (struct access_ref): Declare new member.
>> 	(maybe_emit_free_warning): Make extern.  Make use of access_ref.
>> 	Handle -Wmismatched-new-delete.
>> 	* calls.c (initialize_argument_information): Call
>> 	maybe_emit_free_warning.
>> 	* doc/invoke.texi (-Wfree-nonheap-object): Expand documentation.
>> 	(-Wmismatched-new-delete): Document new option.
>>
>> gcc/c-family/ChangeLog:
>>
>> 	PR c++/90629
>> 	* c.opt (-Wmismatched-new-delete): New option.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 	PR c++/90629
>> 	* g++.dg/warn/delete-array-1.C: Add expected warning.
>> 	* g++.old-deja/g++.other/delete2.C: Add expected warning.
>> 	* g++.dg/warn/Wfree-nonheap-object-2.C: New test.
>> 	* g++.dg/warn/Wfree-nonheap-object.C: New test.
>> 	* g++.dg/warn/Wmismatched-new-delete.C: New test.
>> 	* gcc.dg/Wfree-nonheap-object-2.c: New test.
>> 	* gcc.dg/Wfree-nonheap-object-3.c: New test.
>> 	* gcc.dg/Wfree-nonheap-object.c: New test.
> 
> So do we need to reconcile with David M's patch that adds the 
> "deallocated_by" attribute.  In that thread I raised the question about 
> using the same attribute to track both pointers as well as things like 
> file descriptors.  It looks like this patch ignores everything except 
> builtins/language intrinsics as allocation functions.  ISTM that would 
> need to be fixed for David's patch to be useful here, right?

Our emails have crossed mid air.  I replied to your comments on
David's patch here:
https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559309.html
The work I point to there extends this warning to user-defined
functions.

I also gave some preliminary comments on David's initial RFC.
It has some more background:
https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555667.html

> As we discussed privately, this can trigger false positives, 
> particularly for unoptimized code  as we've seen in Fedora testing.  But 
> I'm not terribly worried about these.
> 
> I haven't seen anything failing with a valid error because of this, but 
> that's because the packages where it's triggering aren't compiling with 
> -Werror.  I did some quick grepping of the logs and I think I found 
> instances of each of the new warnings.
> 
> 
> If you're interested torsion.usersys.redhat.com:/opt/notnfs/law/WARNINGS 
> contains all the lines with "warning:" from all the Fedora test builds. 
> Warning (pun intended), it's big...  10G, so don't try to download it 
> :-)  But it is faster than find | xargs zgrep across all the build logs :-)

There are quite a few (411 instances of -Wmismatched-new-delete to
be exact) but without more context (at least the informational notes)
they're hard to analyze.  I looked at just the first one and it points
to this bug:

./gengetopt/builds/80/log.gz:gengetopt.cc:581:12: warning: 'free' called 
on pointer returned from a mismatched allocation function 
[-Wmismatched-new-delete]

gengetopt_create_option (gengetopt_option *&n, const char * long_opt, 
char short_opt,
                       const char * desc,
                       int type, int flagstat, int required,
                       const char * default_value,
                       const char * group_value,
                       const char * mode_value,
                       const char * type_str,
                       const AcceptedValues *acceptedvalues,
                       int multiple,
                       int argoptional)
{
   if ((long_opt == NULL) ||
       (long_opt[0] == 0) ||
       (desc == NULL))
     return FOUND_BUG;

   n = new gengetopt_option;         <<< allocate by new
   if (n == NULL)
     return NOT_ENOUGH_MEMORY;

   // here we will set required anyway
   n->required_set = true;

   n->long_opt = strdup (long_opt);
   if (n->long_opt == NULL)
     {
       free (n);                     <<< deallocate by free
       return NOT_ENOUGH_MEMORY;
     }

Based on what others have said about what some static analyzers
report I expect this to be a common bug (the "operator delete"
message accounts for 336 instances out of the 411).

>> diff --git a/gcc/builtins.c b/gcc/builtins.c
>> index 5d60eab6ba2..1b8a5b82dac 100644
>> --- a/gcc/builtins.c
>> +++ b/gcc/builtins.c
>> @@ -12589,30 +12594,336 @@ maybe_emit_sprintf_chk_warning (tree exp, enum built_in_function fcode)
>>   		access_write_only);
>>   }
>>   
>> -/* Emit warning if a free is called with address of a variable.  */
>> +/* Return the location of the assignment STMT if it has one, or another
>> +   assignment on the chain that has one.  Used to improve the location
>> +   of informational notes.  */
>>   
>> -static void
>> +static location_t
>> +find_assignment_location (tree var)
>> +{
>> +  gimple *stmt = SSA_NAME_DEF_STMT (var);
>> +
>> +  for (gimple *asgn = stmt; ; )
>> +    {
>> +      if (gimple_has_location (asgn))
>> +	return gimple_location (asgn);
>> +
>> +      if (!is_gimple_assign (asgn))
>> +	break;
>> +
>> +      tree rhs = gimple_assign_rhs1 (asgn);
>> +      if (TREE_CODE (rhs) != SSA_NAME)
>> +	break;
>> +
>> +      asgn = SSA_NAME_DEF_STMT (rhs);
>> +    }
> 
> What is this code for ^^^
> 
> 
> Under what conditions does the assignment not have a location? Perhaps 
> if it's a default definition?

I've seen all sorts of assignments with no location, including
ASSERT_EXPRs, BIT_AND_EXPRs, as well as MIN/MAX_EXPs, clobber
statements and others.  I didn't try to understand the pattern
behind them.

> All the nonsense walking up the use-def chain seems unnecessary and 
> probably misleading when we actually issue the diagnostic. Why are you 
> going through the trouble to do that?

The "nonsense" is me doing my best to provide at least some context
for warnings that might otherwise not have any and be harder to track
down to the root cause.  A message like:

   warning: ‘free’ called on pointer ‘<unknown>’ with nonzero offset

with nothing else for a function hundreds of lines long makes it
harder to track down the cause of the problem than with a note
pointing to at least a location of the closest assignment to
the unknown pointer.

That said, in this patch the function is only used for constants
and AFAICS, doesn't get exercised by any tests so it can probably
be removed with no adverse effect.  I do expect it to need to come
back in some form because of the missing location problem.  (It
comes from a bigger patch in progress where it's used more
extensively.)

>> +  gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
>> +  gsi_next_nondebug (&gsi);
>> +  if (gimple *next_stmt = gsi_stmt (gsi))
>> +    {
>> +      if (gimple_has_location (next_stmt))
>> +	return gimple_location (next_stmt);
>> +    }
>> +  else
>> +    {
>> +      gsi = gsi_for_stmt (stmt);
>> +      gsi_prev_nondebug (&gsi);
>> +      if (gimple *prev_stmt = gsi_stmt (gsi))
>> +	if (gimple_has_location (prev_stmt))
>> +	  return gimple_location (prev_stmt);
>> +    }
>> +
>> +  /* As a last resort fallback, return the location of the closing
>> +     curly brace.  */
>> +  return cfun->function_end_locus;
>> +}
> 
> And if you clean up the code above, then all this becomes pointless, no?

I'm not sure what you mean but as I said above, the function
doesn't appear to be essential to this patch and can be removed.

Martin

  reply	other threads:[~2020-11-17 18:02 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-03 23:56 Martin Sebor
2020-11-17  0:54 ` Jeff Law
2020-11-17 18:02   ` Martin Sebor [this message]
2020-12-02  0:09     ` Jeff Law
2020-12-06  9:26 ` Martin Liška
2023-06-07 14:54 ` Remove 'gcc/testsuite/g++.dg/warn/Wfree-nonheap-object.s' (was: [PATCH] add -Wmismatched-new-delete to middle end (PR 90629)) Thomas Schwinge
2023-06-07 17:10   ` Mike Stump

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=585abfb6-d9a8-339d-2cca-7e95a96c82f3@gmail.com \
    --to=msebor@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=law@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).