From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from aserp2120.oracle.com (aserp2120.oracle.com [141.146.126.78]) by sourceware.org (Postfix) with ESMTPS id E6959385DC3D for ; Mon, 26 Apr 2021 15:09:23 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org E6959385DC3D Received: from pps.filterd (aserp2120.oracle.com [127.0.0.1]) by aserp2120.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 13QF62Lw045688; Mon, 26 Apr 2021 15:09:20 GMT Received: from userp3020.oracle.com (userp3020.oracle.com [156.151.31.79]) by aserp2120.oracle.com with ESMTP id 385ahbjcgq-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 26 Apr 2021 15:09:20 +0000 Received: from pps.filterd (userp3020.oracle.com [127.0.0.1]) by userp3020.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 13QF6LpY038663; Mon, 26 Apr 2021 15:09:18 GMT Received: from aserv0122.oracle.com (aserv0122.oracle.com [141.146.126.236]) by userp3020.oracle.com with ESMTP id 384w3rpdyp-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 26 Apr 2021 15:09:17 +0000 Received: from abhmp0010.oracle.com (abhmp0010.oracle.com [141.146.116.16]) by aserv0122.oracle.com (8.14.4/8.14.4) with ESMTP id 13QF9GUX028099; Mon, 26 Apr 2021 15:09:16 GMT Received: from dhcp-10-154-182-15.vpn.oracle.com (/10.154.182.15) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Mon, 26 Apr 2021 08:09:16 -0700 Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 13.4 \(3608.120.23.2.4\)) Subject: Re: [patch for gcc12 stage1][version 2] add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc From: Qing Zhao In-Reply-To: Date: Mon, 26 Apr 2021 10:09:15 -0500 Cc: Kees Cook , Gcc-patches Qing Zhao via Content-Transfer-Encoding: quoted-printable Message-Id: <0DDDEAB4-AA58-4F98-9526-8F63AEAC3D3D@oracle.com> References: <0CE28536-176B-44E1-BDC6-3942739B829C@oracle.com> To: Richard Sandiford , Richard Biener X-Mailer: Apple Mail (2.3608.120.23.2.4) X-Proofpoint-Virus-Version: vendor=nai engine=6200 definitions=9966 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 adultscore=0 malwarescore=0 mlxscore=0 mlxlogscore=999 bulkscore=0 phishscore=0 spamscore=0 suspectscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2104060000 definitions=main-2104260119 X-Proofpoint-GUID: V9RrVtpE1flHcJwalFwWd0RkZePJOLsh X-Proofpoint-ORIG-GUID: V9RrVtpE1flHcJwalFwWd0RkZePJOLsh X-Proofpoint-Virus-Version: vendor=nai engine=6200 definitions=9966 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 mlxscore=0 bulkscore=0 mlxlogscore=999 priorityscore=1501 clxscore=1011 adultscore=0 suspectscore=0 spamscore=0 phishscore=0 malwarescore=0 lowpriorityscore=0 impostorscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2104060000 definitions=main-2104260119 X-Spam-Status: No, score=-10.2 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_MSPIKE_BL, RCVD_IN_MSPIKE_L4, SPF_HELO_PASS, SPF_PASS, TXREP, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 26 Apr 2021 15:09:27 -0000 Hi, Richard, Thanks a lot for your review. > On Apr 23, 2021, at 2:05 PM, Richard Sandiford = wrote: >=20 > Finally getting to this now that the GCC 11 rush is over. Sorry for > the slow response. >=20 > I've tried to review most of the code below, but skipped the testsuite > parts in the interests of time. I'll probably have more comments in > future rounds, just wanted to get the ball rolling. >=20 > This is realy Richi's area more than mine though, so please take this > with a grain of salt. >=20 > Qing Zhao writes: >> 2. initialize all paddings to zero when -ftrivial-auto-var-init is = present. >> In expr.c (store_constructor): >>=20 >> Clear the whole structure when >> -ftrivial-auto-var-init and the structure has paddings. >>=20 >> In gimplify.c (gimplify_init_constructor): >>=20 >> Clear the whole structure when >> -ftrivial-auto-var-init and the structure has paddings. >=20 > Just to check: are we sure we want to use zero as the padding fill > value even for -ftrivial-auto-var-init=3Dpattern? Or should it be > 0xAA instead, to match the integer fill pattern? >=20 > I can see the arguments both ways, just thought it was worth asking. For this question, I think Kees had provided the background information = on it. Yes, this is basically following Clang=E2=80=99s current implementation = in order to match C spec. >=20 >> [=E2=80=A6] >> @@ -1589,6 +1592,24 @@ handle_retain_attribute (tree *pnode, tree = name, tree ARG_UNUSED (args), >> return NULL_TREE; >> } >>=20 >> +/* Handle a "uninitialized" attribute; arguments as in >=20 > This occurs in existing code too, but s/a/an/. Okay, will fix it. >=20 >> + struct attribute_spec.handler. */ >> + >> +static tree >> +handle_uninitialized_attribute (tree *node, tree name, tree = ARG_UNUSED (args), >> + int ARG_UNUSED (flags), bool = *no_add_attrs) >> +{ >> + if (VAR_P (*node)) >> + DECL_UNINITIALIZED (*node) =3D 1; >> + else >> + { >> + warning (OPT_Wattributes, "%qE attribute ignored", name); >> + *no_add_attrs =3D true; >> + } >> + >> + return NULL_TREE; >> +} >> + >> /* Handle a "externally_visible" attribute; arguments as in >> struct attribute_spec.handler. */ >=20 >> [=E2=80=A6] >> @@ -11689,6 +11689,34 @@ Perform basic block vectorization on trees. = This flag is enabled by default at >> @option{-O3} and by @option{-ftree-vectorize}, = @option{-fprofile-use}, >> and @option{-fauto-profile}. >>=20 >> +@item -ftrivial-auto-var-init=3D@var{choice} >> +@opindex ftrivial-auto-var-init >> +Initialize automatic variables with either a pattern or with zeroes = to increase >> +program security by preventing uninitialized memory disclosure and = use. >> + >> +The three values of @var{choice} are: >> + >> +@itemize @bullet >> +@item >> +@samp{uninitialized} doesn't initialize any automatic variables. >> +This is C and C++'s default. >> + >> +@item >> +@samp{pattern} Initialize automatic variables with values which will = likely >> +transform logic bugs into crashes down the line, are easily = recognized in a >> +crash dump and without being values that programmers can rely on for = useful >> +program semantics. >> +The values used for pattern initialization might be changed in the = future. >> + >> +@item >> +@samp{zero} Initialize automatic variables with zeroes. >> +@end itemize >> + >> +The default is @samp{uninitialized}. >> + >> +You can control this behavior for a specific variable by using the = variable >> +attribute @code{uninitialized} (@pxref{Variable Attributes}). >> + >=20 > I think it's important to say here that GCC still considers the > variables to be uninitialised and still considers reading them to > be undefined behaviour. The option is simply trying to improve the > security and predictability of the program in the presence of these > uninitialised variables. >=20 > I think it would also be worth saying that options like = -Wuninitialized > still try to warn about uninitialised variables, although using > -ftrivial-auto-var-init may change which warnings are generated. >=20 > (The above comments are just a summary, not suitable for direct > inclusion. :-)) Agreed. Will update the documentation per your suggestions. >=20 >> @item -fvect-cost-model=3D@var{model} >> @opindex fvect-cost-model >> Alter the cost model used for vectorization. The @var{model} = argument >> [=E2=80=A6] >> diff --git a/gcc/gimplify.c b/gcc/gimplify.c >> index 6da6698..fafd2e9 100644 >> --- a/gcc/gimplify.c >> +++ b/gcc/gimplify.c >> @@ -1716,6 +1716,116 @@ gimplify_vla_decl (tree decl, gimple_seq = *seq_p) >>=20 >> gimplify_and_add (t, seq_p); >>=20 >> + /* Add a call to memset or calls to memcpy to initialize this vla >> + when the user requested. */ >> + if (!DECL_ARTIFICIAL (decl) >> + && VAR_P (decl) >> + && !DECL_EXTERNAL (decl) >> + && !TREE_STATIC (decl) >> + && !DECL_UNINITIALIZED (decl)) >> + switch (flag_trivial_auto_var_init) >> + { >> + case AUTO_INIT_UNINITIALIZED: >> + break; >> + case AUTO_INIT_ZERO: >> + { >> + /* Generate a call to memset to initialize this vla. */ >> + gcall *gs; >> + t =3D builtin_decl_implicit (BUILT_IN_MEMSET); >> + gs =3D gimple_build_call (t, 3, addr, integer_zero_node, >> + DECL_SIZE_UNIT (decl)); >> + gimple_call_set_memset_for_uninit (gs, true); >> + gimplify_seq_add_stmt (seq_p, gs); >> + } >> + break; >> + case AUTO_INIT_PATTERN: >> + { >> + /* Generate the following sequence to initialize this vla: >> + if (DECL_SIZE_UNIT (decl) > 0) goto label_true; >> + else goto label_cont; >> + label_true: >> + { >> + element_type =3D TREE_TYPE (TREE_TYPE (decl)); >> + size_of_element =3D DECL_SIZE_UNIT (element_type); >> + init_node =3D build_pattern_cst (element_type); >> + cur =3D addr; >> + offset =3D DECL_SIZE_UNIT (decl) - size_of_element; >> + end =3D addr + offset; >> + >> + label_loop: >> + { >> + memcpy (cur, &init_node, size_of_element); >> + cur +=3D size_of_element; >> + if (cur <=3D end) goto label_loop; >> + else goto label_cont; >> + } >> + } >> + label_cont: >> + */ >> + >> + tree size_of_element, element_type; >> + tree cur, end, offset; >> + tree init_node, addrof_init_node; >> + tree t; >> + >> + tree label_true =3D create_artificial_label = (UNKNOWN_LOCATION); >> + tree label_cont =3D create_artificial_label = (UNKNOWN_LOCATION); >> + tree label_loop =3D create_artificial_label = (UNKNOWN_LOCATION); >> + >> + element_type =3D TREE_TYPE (TREE_TYPE (decl)); >> + >> + gcond *cond_stmt =3D gimple_build_cond (GT_EXPR, = DECL_SIZE_UNIT (decl), >> + build_zero_cst = (sizetype), >> + label_true, >> + label_cont); >> + gimplify_seq_add_stmt (seq_p, cond_stmt); >> + gimplify_seq_add_stmt (seq_p, gimple_build_label = (label_true)); >> + >> + size_of_element =3D create_tmp_var (sizetype, >> + ".size_of_element"); >> + >> + init_node =3D create_tmp_var (element_type, ".init_node"); >> + mark_addressable (init_node); >> + addrof_init_node =3D build_fold_addr_expr_loc = (UNKNOWN_LOCATION, >> + init_node); >> + >> + gimplify_assign (size_of_element, TYPE_SIZE_UNIT = (element_type), >> + seq_p); >> + gimplify_assign (init_node, build_pattern_cst (element_type), = seq_p); >=20 > Rather than building a separate init_node here, would it work to = assign > to element 0 instead? Then the loop could start at element 1. I think that should be fine, can save a little time.=20 >=20 > What about nested VLAs, like: >=20 > void g(void *); > void f(int a) > { > int x[a][a]; > g(x); > } >=20 Will check on this.=20 > ? >=20 >> + >> + cur =3D create_tmp_var (ptr_type, ".cur_addr"); >> + end =3D create_tmp_var (ptr_type, ".end_addr"); >> + offset =3D create_tmp_var (sizetype, ".offset"); >> + >> + gimplify_assign (cur, addr, seq_p); >> + gimplify_seq_add_stmt (seq_p, >> + gimple_build_assign (offset, = MINUS_EXPR, >> + DECL_SIZE_UNIT = (decl), >> + size_of_element)); >> + gimplify_seq_add_stmt (seq_p, >> + gimple_build_assign (end, = POINTER_PLUS_EXPR, >> + addr, offset)); >> + >> + gimplify_seq_add_stmt (seq_p, gimple_build_label = (label_loop)); >> + >> + t =3D builtin_decl_implicit (BUILT_IN_MEMCPY); >> + gimplify_seq_add_stmt (seq_p, >> + gimple_build_call (t, 3, cur, >> + addrof_init_node, >> + size_of_element)); >> + gimplify_seq_add_stmt (seq_p, >> + gimple_build_assign (cur, = POINTER_PLUS_EXPR, >> + cur, = size_of_element)); >> + cond_stmt =3D gimple_build_cond (LE_EXPR, cur, end, = label_loop, >> + label_cont); >> + gimplify_seq_add_stmt (seq_p, cond_stmt); >> + gimplify_seq_add_stmt (seq_p, gimple_build_label = (label_cont)); >> + } >> + break; >> + default: >> + gcc_assert (0); >> + } >> + >> /* Record the dynamic allocation associated with DECL if requested. = */ >> if (flag_callgraph_info & CALLGRAPH_INFO_DYNAMIC_ALLOC) >> record_dynamic_alloc (decl); >> @@ -1738,6 +1848,65 @@ force_labels_r (tree *tp, int *walk_subtrees, = void *data ATTRIBUTE_UNUSED) >> return NULL_TREE; >> } >>=20 >> + >> +/* Build a call to internal const function DEFERRED_INIT: >> + 1st argument: DECL; >> + 2nd argument: INIT_TYPE; >> + >> + as DEFERRED_INIT (DECL, INIT_TYPE) >> + >> + DEFERRED_INIT is defined as: >> + DEF_INTERNAL_FN(DEFERRED_INIT, ECF_CONST | ECF_LEAF | = ECF_NOTHROW, NULL). */ >=20 > I don't think it's necessary to repeat the definition here. Okay, will delete it. >=20 >> + >> +static gimple * >> +build_deferred_init (tree decl, >> + enum auto_init_type init_type) >> +{ >> + tree init_type_node >> + =3D build_int_cst (integer_type_node, (int) init_type); >> + return gimple_build_call_internal (IFN_DEFERRED_INIT, 2, >> + decl, init_type_node); >=20 > Nit: indentation is off in the return statement. Okay. >=20 >> +} >> + >> [=E2=80=A6] >> @@ -1831,6 +2000,17 @@ gimplify_decl_expr (tree *stmt_p, gimple_seq = *seq_p) >> as they may contain a label address. */ >> walk_tree (&init, force_labels_r, NULL, NULL); >> } >> + /* When there is no explicit initializer, if the user = requested, >> + We should insert an artifical initializer for this automatic >> + variable for non vla variables. */ >=20 > I think we should explain why we can skip VLAs here. VLA is handled in another place already, it should be initialized with = calls to memset/memcpy. >=20 >> + else if (flag_trivial_auto_var_init > AUTO_INIT_UNINITIALIZED >> + && !DECL_UNINITIALIZED (decl) >> + && !TREE_STATIC (decl) >> + && !is_vla) >> + gimple_add_init_for_auto_var (decl, >> + flag_trivial_auto_var_init, >> + flag_auto_init_approach, >> + seq_p); >> } >>=20 >> return GS_ALL_DONE; >> [=E2=80=A6] >> diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c >> index 7e3aae5..da02be2 100644 >> --- a/gcc/tree-cfg.c >> +++ b/gcc/tree-cfg.c >> @@ -3471,6 +3471,9 @@ verify_gimple_call (gcall *stmt) >> } >> } >>=20 >> + if (gimple_call_internal_p (stmt, IFN_DEFERRED_INIT)) >> + return false; >> + >=20 > Why is this needed? I think a comment would be helpful. Will add the comments. >=20 >> /* ??? The C frontend passes unpromoted arguments in case it >> didn't see a function declaration before the call. So for now >> leave the call arguments mostly unverified. Once we gimplify >> diff --git a/gcc/tree-core.h b/gcc/tree-core.h >> index d2e6c89..fb42c41 100644 >> --- a/gcc/tree-core.h >> +++ b/gcc/tree-core.h >> @@ -1815,6 +1815,7 @@ struct GTY(()) tree_decl_with_vis { >> unsigned in_text_section : 1; >> unsigned in_constant_pool : 1; >> unsigned dllimport_flag : 1; >> + unsigned uninitialized : 1; >> /* Don't belong to VAR_DECL exclusively. */ >> unsigned weak_flag : 1; >>=20 >> @@ -1836,7 +1837,7 @@ struct GTY(()) tree_decl_with_vis { >> unsigned final : 1; >> /* Belong to FUNCTION_DECL exclusively. */ >> unsigned regdecl_flag : 1; >> - /* 14 unused bits. */ >> + /* 13 unused bits. */ >=20 > It doesn't look like DECL_UNINITIALIZED is used in = compile-time-sensitive > code, so it might be better to keep "uninitialized" as a normal = attribute > and use lookup_attribute where necessary. Okay.=20 >=20 >> /* 32 more unused on 64 bit HW. */ >> }; >>=20 >> diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c >> index d177f1b..fe439f7 100644 >> --- a/gcc/tree-sra.c >> +++ b/gcc/tree-sra.c >> @@ -384,6 +384,13 @@ static struct >>=20 >> /* Numbber of components created when splitting aggregate = parameters. */ >> int param_reductions_created; >> + >> + /* Number of deferred_init calls that are modified. */ >> + int deferred_init; >> + >> + /* Number of deferred_init calls that are created by >> + generate_subtree_deferred_init. */ >> + int subtree_deferred_init; >> } sra_stats; >>=20 >> static void >> @@ -4070,6 +4077,105 @@ get_repl_default_def_ssa_name (struct access = *racc, tree reg_type) >> return get_or_create_ssa_default_def (cfun, = racc->replacement_decl); >> } >>=20 >> + >> +/* Generate statements to call .DEFERRED_INIT to initialize scalar = replacements >> + of accesses within a subtree ACCESS, all its children, siblings = and their >=20 > IMO reads more easily with =E2=80=9C;=E2=80=9D rather than =E2=80=9C,=E2= =80=9D before =E2=80=9Call=E2=80=9D Okay. >=20 >> + children are to be processed. TOP_OFFSET is the offset of the = processed >> + subtree which has to be subtracted from offsets of individual = accesses to >> + get corresponding offsets for AGG. GSI is a statement iterator = used to place >> + the new statements. */ >> +static void >> +generate_subtree_deferred_init (struct access *access, tree agg, >> + enum auto_init_type init_type, >> + HOST_WIDE_INT top_offset, >> + gimple_stmt_iterator *gsi, >> + location_t loc) >> +{ >> + do >> + { >> + if (access->grp_to_be_replaced) >> + { >> + tree repl =3D get_access_replacement (access); >> + tree init_type_node >> + =3D build_int_cst (integer_type_node, (int) init_type); >> + gimple *call =3D gimple_build_call_internal = (IFN_DEFERRED_INIT, 2, >> + repl, = init_type_node); >> + gimple_call_set_lhs (call, repl); >=20 > AFAICT =E2=80=9Caccess=E2=80=9D is specifically for the lhs of the = original call. > So there seems to be an implicit assumption here that the lhs of the > original call is the same as the first argument of the original call. > Is that guaranteed/required? For call to DEFFERED_INIT, yes, this is guaranteed. > If so, I think it's something that > tree-cfg.c should check. It might also be worth having an assertion > in sra_modify_deferred_init. I can definitely add an assertion to make sure this. >=20 > If the argument and lhs can be different then I think we need to > handle the access patterns for them both. >=20 > Or is the idea that any instance of: >=20 > LHS =3D .DEFERRED_INIT (RHS, INIT_TYPE) >=20 > can be replaced with: >=20 > LHS =3D .DEFERRED_INIT (LHS, INIT_TYPE) >=20 > without affecting semantics? If so, I think that would be worth > a comment. >=20 >> + gsi_insert_before (gsi, call, GSI_SAME_STMT); >> + update_stmt (call); >=20 > It seems odd that we need to call update_stmt for a new stmt, but I = see > other code in the file also does this. Will double check on this.=20 >=20 >> + gimple_set_location (call, loc); >> + >> + sra_stats.subtree_deferred_init++; >> + } >> + else if (access->grp_to_be_debug_replaced) >> + { >> + /* FIXME, this part might have some issue. */ >> + tree drhs =3D build_debug_ref_for_model (loc, agg, >> + access->offset - = top_offset, >> + access); >> + gdebug *ds =3D gimple_build_debug_bind (get_access_replacement = (access), >> + drhs, gsi_stmt (*gsi)); >> + gsi_insert_before (gsi, ds, GSI_SAME_STMT); >=20 > Would be good to fix the FIXME :-) This is the part I am not very sure, so I added the FIXME in order to = get more review and suggestion to make sure it. -:) >=20 > I guess the thing we need to decide here is whether = -ftrivial-auto-var-init > should affect debug-only constructs too. Where can I get more details on Debug-only constructs ? > If it doesn't, exmaining removed > components in a debugger might show uninitialised values in cases = where > the user was expecting initialised ones. There would be no security > concern, but it might be surprising. >=20 > I think in principle the DRHS can contain a call to DEFERRED_INIT. > Doing that would probably require further handling elsewhere though. What=E2=80=99s DRHS? Where can I get more info on it? >=20 >> + } >> + if (access->first_child) >> + generate_subtree_deferred_init (access->first_child, agg, = init_type, >> + top_offset, gsi, loc); >> + >> + access =3D access ->next_sibling; >> + } >> + while (access); >> +} >> + >> +/* For a call to .DEFERRED_INIT: >> + var =3D .DEFERRED_INIT (var, init_type); >> + examine the LHS variable VAR and replace it with a scalar = replacement if >> + there is one, also replace the RHS call to a call to = .DEFERRED_INIT of >> + the corresponding scalar relacement variable. Examine the = subtree and >> + do the scalar replacements in the subtree too. STMT is the call, = GSI is >> + the statment iterator to place newly created statements. */ >=20 > typo: statement Okay. >=20 >> + >> +static enum assignment_mod_result >> +sra_modify_deferred_init (gimple *stmt, gimple_stmt_iterator *gsi) >> +{ >> + tree lhs =3D gimple_call_lhs (stmt); >> + enum auto_init_type init_type >> + =3D (enum auto_init_type) TREE_INT_CST_LOW (gimple_call_arg = (stmt, 1)); >> + struct access *access =3D get_access_for_expr (lhs); >> + if (!access) >> + return SRA_AM_NONE; >> + location_t loc =3D gimple_location (stmt); >> + >> + if (access->grp_to_be_replaced) >> + { >> + tree repl =3D get_access_replacement (access); >> + gimple_call_set_lhs (stmt, repl); >> + gimple_call_set_arg (stmt, 0, repl); >> + sra_stats.deferred_init++; >> + } >> + else if (access->grp_to_be_debug_replaced) >> + { >> + /* FIXME, this part might have some issues. */ >> + tree drepl =3D get_access_replacement (access); >> + gdebug *ds =3D gimple_build_debug_bind (drepl, NULL_TREE, >> + gsi_stmt (*gsi)); >> + gsi_insert_before (gsi, ds, GSI_SAME_STMT); >> + } >=20 > Same comments as above. OKay, will try to understand Debug-only construct details.=20 >=20 >> + >> + if (access->first_child) >> + generate_subtree_deferred_init (access->first_child, lhs, >> + init_type, access->offset, >> + gsi, loc); >> + if (access->grp_covered) >> + { >> + unlink_stmt_vdef (stmt); >> + gsi_remove (gsi, true); >> + release_defs (stmt); >> + return SRA_AM_REMOVED; >> + } >> + >> + return SRA_AM_MODIFIED; >> +} >> + >> /* Examine both sides of the assignment statement pointed to by STMT, = replace >> them with a scalare replacement if there is one and generate = copying of >> replacements if scalarized aggregates have been used in the = assignment. GSI >> [=E2=80=A6] >> @@ -4863,6 +4863,29 @@ find_func_aliases_for_builtin_call (struct = function *fn, gcall *t) >> return false; >> } >>=20 >> +static void >> +find_func_aliases_for_deferred_init (gcall *t) >> +{ >> + tree lhsop =3D gimple_call_lhs (t); >> + enum auto_init_type init_type >> + =3D (enum auto_init_type) TREE_INT_CST_LOW (gimple_call_arg (t, = 1)); >> + auto_vec lhsc; >> + auto_vec rhsc; >> + struct constraint_expr temp; >> + >> + get_constraint_for (lhsop, &lhsc); >> + if (init_type =3D=3D AUTO_INIT_ZERO && = flag_delete_null_pointer_checks) >> + temp.var =3D nothing_id; >> + else >> + temp.var =3D nonlocal_id; >> + temp.type =3D ADDRESSOF; >> + temp.offset =3D 0; >> + rhsc.safe_push (temp); >> + >> + process_all_all_constraints (lhsc, rhsc); >> + return; >> +} >> + >=20 > What's the reasoning behind doing it like this? AFAICT the result > of the call doesn't validly alias anything, regardless of the init = type. > Even if there happens to be a valid decl at the address given by the > chosen fill pattern, there's no expectation that accesses to that > decl will be coherent wrt accesses via the result of the call. So, you mean the above change will not have any impact at all? >=20 >> /* Create constraints for the call T. */ >>=20 >> static void >> [=E2=80=A6] >> diff --git a/gcc/tree-ssa-uninit.c b/gcc/tree-ssa-uninit.c >> index 0800f59..0c61f60 100644 >> --- a/gcc/tree-ssa-uninit.c >> +++ b/gcc/tree-ssa-uninit.c >> @@ -135,6 +135,20 @@ warn_uninit (enum opt_code wc, tree t, tree = expr, tree var, >> if (is_gimple_assign (context) >> && gimple_assign_rhs_code (context) =3D=3D COMPLEX_EXPR) >> return; >> + >> + /* Ignore REALPART_EXPR or IMAGPART_EXPR if its operand is >> + a call to .DEFERRED_INIT. */ >> + if (is_gimple_assign (context) >> + && (gimple_assign_rhs_code (context) =3D=3D REALPART_EXPR >> + || gimple_assign_rhs_code (context) =3D=3D IMAGPART_EXPR)) >> + { >> + tree v =3D gimple_assign_rhs1 (context); >> + if (TREE_CODE (TREE_OPERAND (v, 0)) =3D=3D SSA_NAME >> + && gimple_call_internal_p (SSA_NAME_DEF_STMT (TREE_OPERAND (v, = 0)), >> + IFN_DEFERRED_INIT)) >> + return; >> + } >> + >=20 > Which case is this handling? In this context I would have expected > the REALPART_EXPRs and IMAGPART_EXPRs to act like normal rvalue > accessors, i.e. they are returning one half of their argument and > discarding the other half. Why is this different from, say, accessing > one field of a structure? I should add comments when I added this part of the code. I remembered that this was to fix a missed warning for complex = expression in the regression test. But I forgot the details, I will check on this, and add more detailed = comments in the code. >=20 >> if (!has_undefined_value_p (t)) >> return; >>=20 >> @@ -209,6 +223,19 @@ check_defs (ao_ref *ref, tree vdef, void *data_) >> { >> check_defs_data *data =3D (check_defs_data *)data_; >> gimple *def_stmt =3D SSA_NAME_DEF_STMT (vdef); >> + >> + /* Ignore the vdef iff the definition statement is a call >> + to .DEFERRED_INIT function. */ >> + if (gimple_call_internal_p (def_stmt, IFN_DEFERRED_INIT)) >> + return false; >> + >> + /* Ignore the vdef iff the definition statement is a call >> + to builtin_memset function that is added for uninitialized >> + auto variable initialization. */ >> + if (gimple_call_builtin_p (def_stmt, BUILT_IN_MEMSET) >> + && gimple_call_memset_for_uninit_p (def_stmt)) >> + return false; >> + >=20 > s/iff/if/, since the =E2=80=9Conly if=E2=80=9D part doesn't apply. Okay. >=20 >> /* If this is a clobber then if it is not a kill walk past it. */ >> if (gimple_clobber_p (def_stmt)) >> { >> @@ -611,6 +638,9 @@ warn_uninitialized_vars (bool wmaybe_uninit) >> ssa_op_iter op_iter; >> tree use; >>=20 >> + if (gimple_call_internal_p (stmt, IFN_DEFERRED_INIT)) >> + continue; >> + >=20 > I guess the reasoning here is that the call is an artificial use and = so > won't give a meaningful error message. If the result of the call is > used somewhere else then we warn there instead. Yes. >=20 > I think that deserves a comment though. Will add comments here. >=20 >> if (is_gimple_debug (stmt)) >> continue; >>=20 >> diff --git a/gcc/tree-ssa.c b/gcc/tree-ssa.c >> index cf54c89..93d6124 100644 >> --- a/gcc/tree-ssa.c >> +++ b/gcc/tree-ssa.c >> @@ -1325,6 +1325,23 @@ ssa_undefined_value_p (tree t, bool partial) >> if (gimple_nop_p (def_stmt)) >> return true; >>=20 >> + /* The value is undefined iff the definition statement is a call >> + to .DEFERRED_INIT function. */ >> + if (gimple_call_internal_p (def_stmt, IFN_DEFERRED_INIT)) >> + return true; >> + >> + if (partial && is_gimple_assign (def_stmt) >> + && (gimple_assign_rhs_code (def_stmt) =3D=3D REALPART_EXPR >> + || gimple_assign_rhs_code (def_stmt) =3D=3D IMAGPART_EXPR)) >> + { >> + tree real_imag_part =3D TREE_OPERAND (gimple_assign_rhs1 = (def_stmt), 0); >> + if (TREE_CODE (real_imag_part) =3D=3D SSA_NAME >> + && gimple_call_internal_p (SSA_NAME_DEF_STMT (real_imag_part), >> + IFN_DEFERRED_INIT)) >> + return true; >> + } >> + >> + >=20 > Same s/iff/if/ comment here. Also same question as above about the > REALPART_EXPR/IMAGPART_EXPR thing. Okay, will check on the details and add more comments. >=20 > Nit: too many blank links at the end. Will fix this. >=20 >> /* Check if the complex was not only partially defined. */ >> if (partial && is_gimple_assign (def_stmt) >> && gimple_assign_rhs_code (def_stmt) =3D=3D COMPLEX_EXPR) >> diff --git a/gcc/tree.c b/gcc/tree.c >> index 7c44c22..2c9acad 100644 >> --- a/gcc/tree.c >> +++ b/gcc/tree.c >> @@ -2531,6 +2531,144 @@ build_zero_cst (tree type) >> } >> } >>=20 >> +/* Build pattern constant of type TYPE. This is used for = initializing >> + auto variables. */ >> + >> +tree >> +build_pattern_cst (tree type) >=20 > I think this should have a more descriptive name, since the pattern > is very specific to the use case. Agreed, will change the name more specifically.=20 >=20 >> +{ >> + /* The following value is a guaranteed unmappable pointer value = and has a >> + repeated byte-pattern which makes it easier to synthesize. We = use it for >> + pointers as well as integers so that aggregates are likely to = be >> + initialized with this repeated value. */ >> + uint64_t largevalue =3D 0xAAAAAAAAAAAAAAAAull; >> + /* For 32-bit platforms it's a bit trickier because, across = systems, only the >> + zero page can reasonably be expected to be unmapped, and even = then we need >> + a very low address. We use a smaller value, and that value = sadly doesn't >> + have a repeated byte-pattern. We don't use it for integers. = */ >> + uint32_t smallvalue =3D 0x000000AA; >> + >> + switch (TREE_CODE (type)) >> + { >> + case INTEGER_TYPE: >> + case ENUMERAL_TYPE: >> + case BOOLEAN_TYPE: >> + /* This will initialize a boolean type variable to 0 instead = of 1. >> + We think that initializint a boolean variable to 0 other than 1 >=20 > initializing Okay. >=20 >> + is better even for pattern initialization. */ >> + return build_int_cstu (type, largevalue); >=20 > I've no objection to that choice for booleans, but: booleans in some > languages (like Ada) can have multibit precision. If we want booleans > to be zero then it would probably be better to treat them as a = separate > case and just use build_zero_cst (type) for them. Good point, yes, it might be better to handle boolean separately. >=20 > Also, the above won't work correctly for 128-bit integers: it will > zero-initialize the upper half. It would probably be better to use > wi::from_buffer to construct the integer instead. Okay. >=20 > I'm also not sure what effect this will have on enumerations, or when > the fill value is outside the TYPE_MIN_VALUE=E2=80=A6TYPE_MAX_VALUE = range. > Hopefully Richi will chime in :-) The pattern initialization part is much more tricky, I am hoping to get = more help and Discussion here to make sure each case. >=20 >> + case POINTER_TYPE: >> + case OFFSET_TYPE: >> + case REFERENCE_TYPE: >> + case NULLPTR_TYPE: >> + { >> + poly_uint64 intvalue; >> + >> + if (POINTER_SIZE =3D=3D 64) >> + intvalue =3D largevalue; >> + else if (POINTER_SIZE =3D=3D 32) >> + intvalue =3D smallvalue; >> + else >> + gcc_assert (0); >=20 > GCC supports 16-bit targets, so we can't assert here. We might as > well just go for 0xAA there too. Okay. >=20 > In fact it might be simpler to have something like: >=20 > if (POINTER_TYPE_P (type) && TYPE_PRECISION (type) < 64) > return build_int_cstu (type, 0xAA); >=20 > if (INTEGRAL_TYPE_P (type) || POINTER_TYPE_P (type)) > ...the wi::from_buffer thing above...; Okay. >=20 > I'm not sure if this makes sense for NULLPTR_TYPE. Is there a good testing case in gcc test suite about NULLPTR_TYPE that I = can take a look? >=20 >> + return build_int_cstu (type, intvalue); >> + } >> + case REAL_TYPE: >> + { >> + REAL_VALUE_TYPE rnan; >> + >> + /* create an quiet NAN for REAL TYPE. */ >> + if (real_nan (&rnan, "", 1, TYPE_MODE (type))) >> + return build_real (type, rnan); >> + return NULL_TREE; >=20 > It doesn't look like the callers can cope with a null return value. > Maybe get_max_float would be a good fallback instead. Okay. >=20 >> + } >> + >> + case FIXED_POINT_TYPE: >> + { >> + /* FIXME. What should we put into a fixed point? */ >> + FIXED_VALUE_TYPE fixed; >> + fixed_from_string (&fixed, "0xFFFFFFFFFFFFFFFF", >> + SCALAR_TYPE_MODE (type)); >> + return build_fixed (type, fixed); >> + } >> + case VECTOR_TYPE: >> + { >> + tree scalar =3D build_pattern_cst (TREE_TYPE (type)); >> + return build_vector_from_val (type, scalar); >> + } >> + case COMPLEX_TYPE: >> + { >> + tree element =3D build_pattern_cst (TREE_TYPE (type)); >> + return build_complex (type, element, element); >> + } >> + case RECORD_TYPE: >> + { >> + tree field; >> + tree field_value; >> + vec *v =3D NULL; >> + for (field=3D TYPE_FIELDS (type); field; field =3D DECL_CHAIN = (field)) >> + { >> + if (TREE_CODE (field) !=3D FIELD_DECL) >> + continue; >> + /* if the field is a variable length array, it should be the = last >=20 > s/if/If/ Okay. >=20 >> + field of the record, and no need to initialize. */ >=20 > Why doesn't it need to be initialized though? My understanding is, the compiler will not allocate memory for the = latest field of the record if its a VLA, it=E2=80=99s the user=E2=80=99s = responsibility to allocate=20 Memory for it. Therefore, compiler doesn=E2=80=99t need to initialize = it. >=20 >> + if (TREE_CODE (TREE_TYPE (field)) =3D=3D ARRAY_TYPE >> + && TYPE_SIZE (TREE_TYPE (field)) =3D=3D NULL_TREE >> + && ((TYPE_DOMAIN (TREE_TYPE (field)) !=3D NULL_TREE >> + && TYPE_MAX_VALUE (TYPE_DOMAIN (TREE_TYPE (field))) >> + =3D=3D NULL_TREE) >> + || TYPE_DOMAIN (TREE_TYPE (field)) =3D=3D NULL_TREE)) >> + continue; >> + field_value =3D build_pattern_cst (TREE_TYPE (field)); >> + CONSTRUCTOR_APPEND_ELT (v, field, field_value); >> + } >> + return build_constructor (type, v); >> + } >> + case UNION_TYPE: >> + case QUAL_UNION_TYPE: >> + { >> + tree field, max_field =3D NULL; >> + unsigned max_size =3D 0; >> + tree field_value; >> + vec *v =3D NULL; >> + /* find the field with the largest size. */ >=20 > s/find/Find/ Okay. >=20 >> + for (field =3D TYPE_FIELDS (type); field; field =3D DECL_CHAIN = (field)) >> + { >> + if (TREE_CODE (field) !=3D FIELD_DECL) >> + continue; >> + if (tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (field))) >=3D = max_size) >> + { >> + max_size =3D tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE = (field))); >> + max_field =3D field; >> + } >=20 > This isn't safe. We shouldn't use tree_to_uhwi without checking = whether > tree_fits_uhwi_p. We then need a fallback for !tree_fits_uhwi_p. Okay, will make sure this. >=20 >> + } >> + field_value =3D build_pattern_cst (TREE_TYPE (max_field)); >> + CONSTRUCTOR_APPEND_ELT (v, max_field, field_value); >> + return build_constructor (type, v); >> + } >> + case ARRAY_TYPE: >> + { >> + vec *elts =3D NULL; >> + tree element =3D build_pattern_cst (TREE_TYPE (type)); >> + tree nelts =3D array_type_nelts (type); >> + if (nelts && tree_fits_uhwi_p (nelts)) >> + { >> + unsigned HOST_WIDE_INT n =3D tree_to_uhwi (nelts) + 1; >> + for (unsigned int i =3D 0; i < n; i++) >> + CONSTRUCTOR_APPEND_ELT (elts, NULL_TREE, element); >> + return build_constructor (type, elts); >> + } >> + /* variable length array should not be here. */ >=20 > s/variable/Variable/ Okay. >=20 >> + gcc_assert (0); >=20 > Should be gcc_unreachable () instead. Okay. >=20 >> + } >> + default: >> + if (!AGGREGATE_TYPE_P (type)) >> + return fold_convert (type, build_pattern_cst = (unsigned_type_node)); >> + else >> + gcc_assert (0); >=20 > Same here. Okay. >=20 >> @@ -11950,6 +12088,72 @@ lower_bound_in_type (tree outer, tree inner) >> } >> } >>=20 >> +/* Returns true when the given TYPE has padding inside it. >> + return false otherwise. */ >> +bool >> +type_has_padding (tree type) >=20 > Would it be possible to reuse __builtin_clear_padding here? Not sure, where can I get more details on __builtin_clear_padding? I can = study a little bit more on this to make sure this. >=20 >> +{ >> + switch (TREE_CODE (type)) >> + { >> + case RECORD_TYPE: >> + { >> + unsigned HOST_WIDE_INT record_size >> + =3D tree_to_uhwi (TYPE_SIZE_UNIT (type)); >=20 > As above, it's not safe to just to use tree_to_uhwi here without = checking. > I've skipped the rest of the function because of the builtin question = above. Okay. Thanks a lot again for your review. Qing >=20 > Thanks, > Richard