public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Qing Zhao <qing.zhao@oracle.com>
To: Richard Sandiford <richard.sandiford@arm.com>
Cc: Richard Biener <rguenther@suse.de>,
	Kees Cook <keescook@chromium.org>,
	Gcc-patches Qing Zhao via <gcc-patches@gcc.gnu.org>
Subject: Re: [patch for gcc12 stage1][version 2] add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc
Date: Mon, 26 Apr 2021 15:02:51 -0500	[thread overview]
Message-ID: <EC372183-BC3C-4559-885D-4378DA9512A5@oracle.com> (raw)
In-Reply-To: <mptmttlhra4.fsf@arm.com>



> On Apr 26, 2021, at 12:47 PM, Richard Sandiford <richard.sandiford@arm.com> wrote:
> 
> Qing Zhao <qing.zhao@oracle.com> writes:
>>>> @@ -1831,6 +2000,17 @@ gimplify_decl_expr (tree *stmt_p, gimple_seq *seq_p)
>>>> 	       as they may contain a label address.  */
>>>> 	    walk_tree (&init, force_labels_r, NULL, NULL);
>>>> 	}
>>>> +      /* When there is no explicit initializer, if the user requested,
>>>> +	 We should insert an artifical initializer for this automatic
>>>> +	 variable for non vla variables.  */
>>> 
>>> I think we should explain why we can skip VLAs here.
>> 
>> VLA is handled in another place already, it should be initialized with calls to memset/memcpy.
> 
> Yeah, what I meant here was that the comment should explain the
> difference between the handling of VLAs and non-VLAs.  It's fairly
> obvious when reading the patch, but it won't be as obvious once the
> patch is applied.

Okay, I see, will update the comments.
> 
>>>> +   children are to be processed.  TOP_OFFSET is the offset  of the processed
>>>> +   subtree which has to be subtracted from offsets of individual accesses to
>>>> +   get corresponding offsets for AGG.  GSI is a statement iterator used to place
>>>> +   the new statements.  */
>>>> +static void
>>>> +generate_subtree_deferred_init (struct access *access, tree agg,
>>>> +				enum auto_init_type init_type,
>>>> +				HOST_WIDE_INT top_offset,
>>>> +				gimple_stmt_iterator *gsi,
>>>> +				location_t loc)
>>>> +{
>>>> +  do
>>>> +    {
>>>> +      if (access->grp_to_be_replaced)
>>>> +	{
>>>> +	  tree repl = get_access_replacement (access);
>>>> +	  tree init_type_node
>>>> +	    = build_int_cst (integer_type_node, (int) init_type);
>>>> +	  gimple *call = gimple_build_call_internal (IFN_DEFERRED_INIT, 2,
>>>> +						     repl, init_type_node);
>>>> +	  gimple_call_set_lhs (call, repl);
>>> 
>>> AFAICT “access” is specifically for the lhs of the original call.
>>> So there seems to be an implicit assumption here that the lhs of the
>>> original call is the same as the first argument of the original call.
>>> Is that guaranteed/required?
>> 
>> For call to DEFFERED_INIT, yes, this is guaranteed.
> 
> OK, in that case…
> 
>>> If so, I think it's something that
>>> tree-cfg.c should check.  It might also be worth having an assertion
>>> in sra_modify_deferred_init.
>> I can definitely add an assertion to make sure this.
> 
> …I think we need the tree-cfg.c check too.  Having the check there
> ensures that the invariant is maintained throughout gimple.

Okay, will add check in tree-cfg.c too.

> 
>>>> +	  gimple_set_location (call, loc);
>>>> +
>>>> +	  sra_stats.subtree_deferred_init++;
>>>> +	}
>>>> +      else if (access->grp_to_be_debug_replaced)
>>>> +	{
>>>> +	  /* FIXME, this part might have some issue.  */
>>>> +	  tree drhs = build_debug_ref_for_model (loc, agg,
>>>> +						 access->offset - top_offset,
>>>> +						 access);
>>>> +	  gdebug *ds = gimple_build_debug_bind (get_access_replacement (access),
>>>> +						drhs, gsi_stmt (*gsi));
>>>> +	  gsi_insert_before (gsi, ds, GSI_SAME_STMT);
>>> 
>>> Would be good to fix the FIXME :-)
>> 
>> This is the part I am not very sure, so I added the FIXME in order to get more review and suggestion
>> to make sure it. -:)
>>> 
>>> I guess the thing we need to decide here is whether -ftrivial-auto-var-init
>>> should affect debug-only constructs too.
>> 
>> Where can I get more details on Debug-only constructs ?
> 
> What I meant by “debug-only construct” is a piece of source-level data
> (in this case a field of an aggregate) that has been optimised out of
> the executable code but still exists in debug stmts.

Okay, I see now.

Then, we should handle this case too, I think.

>  AIUI that's what
> the code above is handling.
> 
>>>> @@ -4863,6 +4863,29 @@ find_func_aliases_for_builtin_call (struct function *fn, gcall *t)
>>>>  return false;
>>>> }
>>>> 
>>>> +static void
>>>> +find_func_aliases_for_deferred_init (gcall *t)
>>>> +{
>>>> +  tree lhsop = gimple_call_lhs (t);
>>>> +  enum auto_init_type init_type
>>>> +    = (enum auto_init_type) TREE_INT_CST_LOW (gimple_call_arg (t, 1));
>>>> +  auto_vec<ce_s, 2> lhsc;
>>>> +  auto_vec<ce_s, 4> rhsc;
>>>> +  struct constraint_expr temp;
>>>> +
>>>> +  get_constraint_for (lhsop, &lhsc);
>>>> +  if (init_type == AUTO_INIT_ZERO && flag_delete_null_pointer_checks)
>>>> +    temp.var = nothing_id;
>>>> +  else
>>>> +    temp.var = nonlocal_id;
>>>> +  temp.type = ADDRESSOF;
>>>> +  temp.offset = 0;
>>>> +  rhsc.safe_push (temp);
>>>> +
>>>> +  process_all_all_constraints (lhsc, rhsc);
>>>> +  return;
>>>> +}
>>>> +
>>> 
>>> What's the reasoning behind doing it like this?  AFAICT the result
>>> of the call doesn't validly alias anything, regardless of the init type.
>>> Even if there happens to be a valid decl at the address given by the
>>> chosen fill pattern, there's no expectation that accesses to that
>>> decl will be coherent wrt accesses via the result of the call.
>> 
>> So, you mean the above change will not have any impact at all?
> 
> Stepping back a bit first: why did you add the code?  What case
> it is trying to handle?  Perhaps I misunderstood the motivation.
> 
> In the above reply I meant more that the code seems unnecessarily
> pessimistic.  It looks like it's based on:
> 
>  /* x = integer is all glommed to a single variable, which doesn't
>     point to anything by itself.  That is, of course, unless it is an
>     integer constant being treated as a pointer, in which case, we
>     will return that this is really the addressof anything.  This
>     happens below, since it will fall into the default case. The only
>     case we know something about an integer treated like a pointer is
>     when it is the NULL pointer, and then we just say it points to
>     NULL.
> 
>     Do not do that if -fno-delete-null-pointer-checks though, because
>     in that case *NULL does not fail, so it _should_ alias *anything.
>     It is not worth adding a new option or renaming the existing one,
>     since this case is relatively obscure.  */
>  if ((TREE_CODE (t) == INTEGER_CST
>       && integer_zerop (t))
>      /* The only valid CONSTRUCTORs in gimple with pointer typed
> 	 elements are zero-initializer.  But in IPA mode we also
> 	 process global initializers, so verify at least.  */
>      || (TREE_CODE (t) == CONSTRUCTOR
> 	  && CONSTRUCTOR_NELTS (t) == 0))
>    {
>      if (flag_delete_null_pointer_checks)
> 	temp.var = nothing_id;
>      else
> 	temp.var = nonlocal_id;
>      temp.type = ADDRESSOF;
>      temp.offset = 0;
>      results->safe_push (temp);
>      return;
>    }
> 
> But if the user writes:
> 
>   intptr_t x = 0;
>   *(int *)x = 42;
> 
> and compiles with -fno-delete-null-pointer-checks, then we have to
> assume that the code is valid and is accessing a global decl that the
> user knows is at address 0.  So in that case we need the &nonlocal_id
> constraint.
> 
> But even with the new option, I don't think:
> 
>   intptr_t x;
>   *(int *)x = 42;
> 
> is well-defined in the same way.  Maybe there's an argument that using
> -fno-delete-null-pointer-checks is such a niche case that we might as
> well treat it as though it were well-defined.  That'd feel to me like
> a half-measure though.  It comes back to Richard Smith's argument
> that we shouldn't try to create a new dialect of C/C++.  We're
> explicitly not trying to make:
> 
>   intptr_t x;
> 
> and
> 
>   intptr_t x = 0;
> 
> equivalent in all respects.  And if we were trying to make them
> equivalent, we'd need to do much more than this.
> 
> The same applies to the pattern case.  If “x” is initialised to a pattern
> that happens to point to a real decl, we don't have to preserve the
> order of accesses to the decl wrt accesses to “*x” (especially since
> we're hoping that “*x” will trap).
> 
> I think for aliasing purposes, the .DEFERRED_INIT return value is still
> analogous to an undefined SSA name, even though we will later generate
> code to initialise it.

Okay, I see now.

Will delete this part from the code.

> 
>>> In fact it might be simpler to have something like:
>>> 
>>> if (POINTER_TYPE_P (type) && TYPE_PRECISION (type) < 64)
>>>   return build_int_cstu (type, 0xAA);
>>> 
>>> if (INTEGRAL_TYPE_P (type) || POINTER_TYPE_P (type))
>>>   ...the wi::from_buffer thing above...;
>> 
>> Okay.
>>> 
>>> I'm not sure if this makes sense for NULLPTR_TYPE.
>> 
>> Is there a good testing case in gcc test suite about NULLPTR_TYPE that I can take a look?
> 
> I'm guessing you'll need to construct a new one to exercise this code path.

Will do this.

> 
>>>> +    case RECORD_TYPE:
>>>> +      {
>>>> +	tree field;
>>>> +	tree field_value;
>>>> +	vec<constructor_elt, va_gc> *v = NULL;
>>>> +	for (field= TYPE_FIELDS (type); field; field = DECL_CHAIN (field))
>>>> +	  {
>>>> +	    if (TREE_CODE (field) != FIELD_DECL)
>>>> +	      continue;
>>>> +	    /* if the field is a variable length array, it should be the last
>>>> +	       field of the record, and no need to initialize.  */
>>> 
>>> Why doesn't it need to be initialized though?
>> 
>> My understanding is, the compiler will not allocate memory for the latest field of the record if its a VLA, it’s the user’s responsibility to allocate 
>> Memory for it.  Therefore, compiler doesn’t  need to initialize it.
> 
> Ah, OK.  This sounds like the behaviour for flexible array members rather
> than VLA members.

Oh, Yes. My bad to use a wrong concept to confuse you.


>  E.g. GCC allows:
> 
>  void bar (int *);
>  int
>  foo (int a)
>  {
>    // VLA in struct
>    struct { int x[a]; } foo;
>    bar (foo.x);
>  }
> 
> In this case the foo.x really does have “a” elements that would
> need to be initialised.
> 
> I think the case you're talking about is:
> 
>  void bar (int *);
>  int
>  foo (int a)
>  {
>    // struct ending in a flexible array
>    struct { int prefix; int x[]; } foo;
>    bar (foo.x);
>  }
> 
> where foo.x has zero elements.

Yes. 

Then, that part of the code currently is okay?

> 
>>>> @@ -11950,6 +12088,72 @@ lower_bound_in_type (tree outer, tree inner)
>>>>    }
>>>> }
>>>> 
>>>> +/* Returns true when the given TYPE has padding inside it.
>>>> +   return false otherwise.  */
>>>> +bool
>>>> +type_has_padding (tree type)
>>> 
>>> Would it be possible to reuse __builtin_clear_padding here?
>> 
>> Not sure, where can I get more details on __builtin_clear_padding? I can study a little bit more on this to make sure this.
> 
> It's documented in doc/extend.texi.

Okay, will check on that.

Thanks a lot.

Qing
> 
> Thanks,
> Richard


  reply	other threads:[~2021-04-26 20:02 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-24 21:21 Qing Zhao
2021-04-07 14:44 ` Qing Zhao
2021-04-07 22:19 ` Kees Cook
2021-04-08 15:22   ` Qing Zhao
2021-04-23 19:05 ` Richard Sandiford
2021-04-23 19:33   ` Kees Cook
2021-04-26 15:09   ` Qing Zhao
2021-04-26 17:47     ` Richard Sandiford
2021-04-26 20:02       ` Qing Zhao [this message]
2021-04-27  6:30       ` Richard Biener
2021-04-27 15:10         ` Qing Zhao
2021-05-05 17:29     ` Qing Zhao
2021-05-05 14:41   ` Qing Zhao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=EC372183-BC3C-4559-885D-4378DA9512A5@oracle.com \
    --to=qing.zhao@oracle.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=keescook@chromium.org \
    --cc=rguenther@suse.de \
    --cc=richard.sandiford@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).