From: Qing Zhao
To: Richard Biener
Cc: Richard Sandiford, Richard Biener via Gcc-patches
Subject: Re: The performance data for two different implementation of new security feature -ftrivial-auto-var-init
Date: Tue, 5 Jan 2021 13:10:00 -0600
In-Reply-To: <5A0F7219-DAFA-4EAA-B845-0E236A108738@ORACLE.COM>
List-Id: Gcc-patches mailing list

I am attaching my current (incomplete) patch to gcc for your reference.

From a71eb73bee5857440c4ff67c4c82be115e0675cb Mon Sep 17 00:00:00 2001
From: qing zhao
Date: Sat, 12 Dec 2020 00:02:28 +0100
Subject: [PATCH] First version of -ftrivial-auto-var-init

---
 gcc/common.opt            | 35 ++++++++++++++++++
 gcc/flag-types.h          | 14 ++++++++
 gcc/gimple-pretty-print.c |  2 +-
 gcc/gimplify.c            | 90 +++++++++++++++++++++++++++++++++++++++++++++++
 gcc/internal-fn.c         | 20 +++++++++++
 gcc/internal-fn.def       |  5 +++
 gcc/tree-cfg.c            |  3 ++
 gcc/tree-ssa-uninit.c     |  3 ++
 gcc/tree-ssa.c            |  5 +++
 9 files changed, 176 insertions(+), 1 deletion(-)

diff --git a/gcc/common.opt b/gcc/common.opt
index 6645539f5e5..c4c4fc28ef7 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -3053,6 +3053,41 @@ ftree-scev-cprop
 Common Report Var(flag_tree_scev_cprop) Init(1) Optimization
 Enable copy propagation of scalar-evolution information.
 
+ftrivial-auto-var-init=
+Common Joined RejectNegative Enum(auto_init_type) Var(flag_trivial_auto_var_init) Init(AUTO_INIT_UNINITIALIZED)
+-ftrivial-auto-var-init=[uninitialized|pattern|zero]	Add initializations to automatic variables.
+
+Enum
+Name(auto_init_type) Type(enum auto_init_type) UnknownError(unrecognized automatic variable initialization type %qs)
+
+EnumValue
+Enum(auto_init_type) String(uninitialized) Value(AUTO_INIT_UNINITIALIZED)
+
+EnumValue
+Enum(auto_init_type) String(pattern) Value(AUTO_INIT_PATTERN)
+
+EnumValue
+Enum(auto_init_type) String(zero) Value(AUTO_INIT_ZERO)
+
+fauto-var-init-approach=
+Common Joined RejectNegative Enum(auto_init_approach) Var(flag_auto_init_approach) Init(AUTO_INIT_A))
+-fauto-var-init-approach=[A|B|C|D]	Choose the approach to initialize automatic variables.
+
+Enum
+Name(auto_init_approach) Type(enum auto_init_approach) UnknownError(unrecognized automatic variable initialization approach %qs)
+
+EnumValue
+Enum(auto_init_approach) String(A) Value(AUTO_INIT_A)
+
+EnumValue
+Enum(auto_init_approach) String(B) Value(AUTO_INIT_B)
+
+EnumValue
+Enum(auto_init_approach) String(C) Value(AUTO_INIT_C)
+
+EnumValue
+Enum(auto_init_approach) String(D) Value(AUTO_INIT_D)
+
 ; -fverbose-asm causes extra commentary information to be produced in
 ; the generated assembly code (to make it more readable).  This option
 ; is generally only of use to those who actually need to read the
diff --git a/gcc/flag-types.h b/gcc/flag-types.h
index 9342bd87be3..bfd0692b82c 100644
--- a/gcc/flag-types.h
+++ b/gcc/flag-types.h
@@ -242,6 +242,20 @@ enum vect_cost_model {
   VECT_COST_MODEL_DEFAULT = 1
 };
 
+/* Automatic variable initialization type.  */
+enum auto_init_type {
+  AUTO_INIT_UNINITIALIZED = 0,
+  AUTO_INIT_PATTERN = 1,
+  AUTO_INIT_ZERO = 2
+};
+
+enum auto_init_approach {
+  AUTO_INIT_A = 0,
+  AUTO_INIT_B = 1,
+  AUTO_INIT_C = 2,
+  AUTO_INIT_D = 3
+};
+
 /* Different instrumentation modes.  */
 enum sanitize_code {
   /* AddressSanitizer.  */
diff --git a/gcc/gimple-pretty-print.c b/gcc/gimple-pretty-print.c
index 075d6e5208a..1044d54e8d3 100644
--- a/gcc/gimple-pretty-print.c
+++ b/gcc/gimple-pretty-print.c
@@ -81,7 +81,7 @@ newline_and_indent (pretty_printer *buffer, int spc)
 DEBUG_FUNCTION void
 debug_gimple_stmt (gimple *gs)
 {
-  print_gimple_stmt (stderr, gs, 0, TDF_VOPS|TDF_MEMSYMS);
+  print_gimple_stmt (stderr, gs, 0, TDF_VOPS|TDF_MEMSYMS|TDF_LINENO|TDF_ALIAS);
 }
 
 
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 54cb66bd1dd..1eb0747ea2f 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -1674,6 +1674,16 @@ gimplify_return_expr (tree stmt, gimple_seq *pre_p)
   return GS_ALL_DONE;
 }
 
+/* Return the value that is used to initialize the vla DECL based
+   on INIT_TYPE.  */
+tree memset_init_node (enum auto_init_type init_type)
+{
+  if (init_type == AUTO_INIT_ZERO)
+    return integer_zero_node;
+  else
+    gcc_assert (0);
+}
+
 /* Gimplify a variable-length array DECL.  */
 
 static void
@@ -1712,6 +1722,19 @@ gimplify_vla_decl (tree decl, gimple_seq *seq_p)
 
   gimplify_and_add (t, seq_p);
 
+  /* Add a call to memset to initialize this vla when the user requested.  */
+  if (flag_trivial_auto_var_init > AUTO_INIT_UNINITIALIZED
+      && !DECL_ARTIFICIAL (decl)
+      && VAR_P (decl)
+      && !DECL_EXTERNAL (decl)
+      && !TREE_STATIC (decl))
+    {
+      t = builtin_decl_implicit (BUILT_IN_MEMSET);
+      tree init_node = memset_init_node (flag_trivial_auto_var_init);
+      t = build_call_expr (t, 3, addr, init_node, DECL_SIZE_UNIT (decl));
+      gimplify_and_add (t, seq_p);
+    }
+
   /* Record the dynamic allocation associated with DECL if requested.  */
   if (flag_callgraph_info & CALLGRAPH_INFO_DYNAMIC_ALLOC)
     record_dynamic_alloc (decl);
@@ -1734,6 +1757,63 @@ force_labels_r (tree *tp, int *walk_subtrees, void *data ATTRIBUTE_UNUSED)
   return NULL_TREE;
 }
 
+
+/* Build a call to internal const function DEFERRED_INIT,
+   1st argument: DECL;
+   2nd argument: INIT_TYPE;
+
+   as DEFERRED_INIT (DECL, INIT_TYPE)
+
+   DEFERRED_INIT is defined as:
+   DEF_INTERNAL_FN (DEFERRED_INIT, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL).  */
+
+static gimple *
+build_deferred_init (tree decl,
+                     enum auto_init_type init_type)
+{
+  tree init_type_node =
+    build_int_cst (integer_type_node, (int) init_type);
+  return gimple_build_call_internal (IFN_DEFERRED_INIT, 2, decl, init_type_node);
+}
+
+
+/* Generate initialization to automatic variable DECL based on INIT_TYPE.  */
+static void
+gimple_add_init_for_auto_var (tree decl,
+                              enum auto_init_type init_type,
+                              enum auto_init_approach init_approach,
+                              gimple_seq *seq_p)
+{
+  gcc_assert (VAR_P (decl) && !DECL_EXTERNAL (decl) && !TREE_STATIC (decl));
+  switch (init_type)
+    {
+    case AUTO_INIT_UNINITIALIZED:
+    case AUTO_INIT_PATTERN:
+      gcc_assert (0);
+      break;
+    case AUTO_INIT_ZERO:
+      if (init_approach == AUTO_INIT_A)
+        {
+          tree init = build_zero_cst (TREE_TYPE (decl));
+          init = build2 (INIT_EXPR, void_type_node, decl, init);
+          gimplify_and_add (init, seq_p);
+          ggc_free (init);
+        }
+      else if (init_approach == AUTO_INIT_D)
+        {
+          gimple *call = build_deferred_init (decl, AUTO_INIT_ZERO);
+          gimple_call_set_lhs (call, decl);
+          gimplify_seq_add_stmt (seq_p, call);
+        }
+      else
+        gcc_assert (0);
+      break;
+    default:
+      gcc_unreachable ();
+    }
+}
+
+
 /* Gimplify a DECL_EXPR node *STMT_P by making any necessary allocation
    and initialization explicit.  */
 
@@ -1821,6 +1901,16 @@ gimplify_decl_expr (tree *stmt_p, gimple_seq *seq_p)
                as they may contain a label address.  */
             walk_tree (&init, force_labels_r, NULL, NULL);
         }
+      /* When there is no explicit initializer, if the user requested,
+         We should insert an artifical initializer for this automatic
+         variable for non vla variables.  */
+      else if (flag_trivial_auto_var_init > AUTO_INIT_UNINITIALIZED
+               && !TREE_STATIC (decl)
+               && !is_vla)
+        gimple_add_init_for_auto_var (decl,
+                                      flag_trivial_auto_var_init,
+                                      flag_auto_init_approach,
+                                      seq_p);
     }
 
   return GS_ALL_DONE;
diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index 41223ff7d82..6eef6ddb259 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -2971,6 +2971,26 @@ expand_UNIQUE (internal_fn, gcall *stmt)
     emit_insn (pattern);
 }
 
+/* Expand the IFN_DEFERRED_INIT function according to its second argument.  */
+static void
+expand_DEFERRED_INIT (internal_fn, gcall *stmt)
+{
+  tree var = gimple_call_lhs (stmt);
+  enum auto_init_type init_type
+    = (enum auto_init_type) TREE_INT_CST_LOW (gimple_call_arg (stmt, 1));
+
+  switch (init_type)
+    {
+    default:
+      gcc_unreachable ();
+    case AUTO_INIT_PATTERN:
+      gcc_assert (0);
+    case AUTO_INIT_ZERO:
+      tree init = build_zero_cst (TREE_TYPE (var));
+      expand_assignment (var, init, false);
+    }
+}
+
 /* The size of an OpenACC compute dimension.  */
 
 static void
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 91a7bfea3ee..fd077d8b55c 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -347,6 +347,11 @@ DEF_INTERNAL_FN (VEC_CONVERT, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (UNIQUE, ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (PHI, 0, NULL)
 
+/* A function to represent an artifical initialization to an uninitialized
+   automatic variable.  The first argument is the variable itself, the
+   second argument is the initialization type.  */
+DEF_INTERNAL_FN (DEFERRED_INIT, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
+
 /* DIM_SIZE and DIM_POS return the size of a particular compute
    dimension and the executing thread's position within that
    dimension.  DIM_POS is pure (and not const) so that it isn't
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index f59a0c05200..3717c6d26a5 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -3433,6 +3433,9 @@ verify_gimple_call (gcall *stmt)
         }
     }
 
+  if (gimple_call_internal_p (stmt, IFN_DEFERRED_INIT))
+    return false;
+
   /* ??? The C frontend passes unpromoted arguments in case it
      didn't see a function declaration before the call.  So for now
      leave the call arguments mostly unverified.  Once we gimplify
diff --git a/gcc/tree-ssa-uninit.c b/gcc/tree-ssa-uninit.c
index 516a7bd2c99..6c0946b0bc5 100644
--- a/gcc/tree-ssa-uninit.c
+++ b/gcc/tree-ssa-uninit.c
@@ -611,6 +611,9 @@ warn_uninitialized_vars (bool wmaybe_uninit)
           ssa_op_iter op_iter;
           tree use;
 
+          if (gimple_call_internal_p (stmt, IFN_DEFERRED_INIT))
+            continue;
+
           if (is_gimple_debug (stmt))
             continue;
 
diff --git a/gcc/tree-ssa.c b/gcc/tree-ssa.c
index a575979aa13..319e4150dc4 100644
--- a/gcc/tree-ssa.c
+++ b/gcc/tree-ssa.c
@@ -1325,6 +1325,11 @@ ssa_undefined_value_p (tree t, bool partial)
   if (gimple_nop_p (def_stmt))
     return true;
 
+  /* The value is undefined iff the definition statement is a call
+     to .DEFERRED_INIT function.  */
+  if (gimple_call_internal_p (def_stmt, IFN_DEFERRED_INIT))
+    return true;
+
   /* Check if the complex was not only partially defined.  */
   if (partial && is_gimple_assign (def_stmt)
       && gimple_assign_rhs_code (def_stmt) == COMPLEX_EXPR)
-- 
2.11.0

> On Jan 5, 2021, at 1:05 PM, Qing Zhao via Gcc-patches wrote:
>
> Hi,
>
> This is an update for our previous discussion.
>
> 1. I implemented the following two different implementations in the latest upstream gcc:
>
> A. Adding real initialization during gimplification, not maintaining the uninitialized warnings.
>
> D. Adding calls to .DEFERRED_INIT during gimplification, expanding the .DEFERRED_INIT during expand to
>    real initialization. Adjusting uninitialized pass with the new refs with “.DEFERRED_INIT”.
>
> Note, in this initial implementation,
> ** I ONLY implement -ftrivial-auto-var-init=zero; the implementation of -ftrivial-auto-var-init=pattern
>    is not done yet. Therefore, the performance data is only about -ftrivial-auto-var-init=zero.
>
> ** I added a temporary option -fauto-var-init-approach=A|B|C|D to choose implementation A or D for
>    runtime performance study.
> ** I didn’t finish the uninitialized warnings maintenance work for D. (That might take more time than I expected).
>
> 2. I collected runtime data for CPU2017 on an x86 machine with this new gcc for the following 3 cases:
>
> no: default. (-g -O2 -march=native)
> A:  default + -ftrivial-auto-var-init=zero -fauto-var-init-approach=A
> D:  default + -ftrivial-auto-var-init=zero -fauto-var-init-approach=D
>
> And then computed the slowdown data for both A and D as follows:
>
> benchmarks         A / no    D / no
>
> 500.perlbench_r     1.25%     1.25%
> 502.gcc_r           0.68%     1.80%
> 505.mcf_r           0.68%     0.14%
> 520.omnetpp_r       4.83%     4.68%
> 523.xalancbmk_r     0.18%     1.96%
> 525.x264_r          1.55%     2.07%
> 531.deepsjeng_r    11.57%    11.85%
> 541.leela_r         0.64%     0.80%
> 557.xz_r           -0.41%    -0.41%
>
> 507.cactuBSSN_r     0.44%     0.44%
> 508.namd_r          0.34%     0.34%
> 510.parest_r        0.17%     0.25%
> 511.povray_r       56.57%    57.27%
> 519.lbm_r           0.00%     0.00%
> 521.wrf_r          -0.28%    -0.37%
> 526.blender_r      16.96%    17.71%
> 527.cam4_r          0.70%     0.53%
> 538.imagick_r       2.40%     2.40%
> 544.nab_r           0.00%    -0.65%
>
> avg                 5.17%     5.37%
>
> From the above data, we can see that in general, the runtime performance slowdowns for
> implementations A and D are similar for individual benchmarks.
>
> There are several benchmarks that have significant slowdown with the newly added initialization for both
> A and D, for example, 511.povray_r, 526.blender_r, and 531.deepsjeng_r. I will try to study a little bit
> more what kind of new initializations introduced such slowdown.
>
> From the current study so far, I think that approach D should be good enough for our final implementation.
> So, I will try to finish approach D with the following remaining work:
>
> ** complete the implementation of -ftrivial-auto-var-init=pattern;
> ** complete the implementation of uninitialized warnings maintenance work for D.
>
>
> Let me know if you have any comments and suggestions on my current and future work.
>
> Thanks a lot for your help.
>
> Qing
>
>> On Dec 9, 2020, at 10:18 AM, Qing Zhao via Gcc-patches wrote:
>>
>> The following are the approaches I will implement and compare:
>>
>> Our final goal is to keep the uninitialized warning and minimize the run-time performance cost.
>>
>> A. Adding real initialization during gimplification, not maintaining the uninitialized warnings.
>> B. Adding real initialization during gimplification, marking them with “artificial_init”.
>>    Adjusting uninitialized pass, maintaining the annotation, making sure the real init is not
>>    deleted from the fake init.
>> C. Marking the DECL for an uninitialized auto variable as “no_explicit_init” during gimplification,
>>    maintaining this “no_explicit_init” bit till after pass_late_warn_uninitialized, or till pass_expand,
>>    then adding real initialization for all DECLs that are marked with “no_explicit_init”.
>> D. Adding .DEFERRED_INIT during gimplification, expanding the .DEFERRED_INIT during expand to
>>    real initialization. Adjusting uninitialized pass with the new refs with “.DEFERRED_INIT”.
>>
>>
>> In the above, approach A will be the one that has the minimum run-time cost and will be the base for the performance
>> comparison.
>>
>> I will implement approach D next; this one is expected to have the most run-time overhead among the above list, but
>> its implementation should be the cleanest among B, C, and D. Let’s see how much more performance overhead this approach
>> will have. If the data is good, maybe we can avoid the effort to implement B and C.
>>
>> If the performance of D is not good, I will implement B or C at that time.
>>
>> Let me know if you have any comments or suggestions.
>>
>> Thanks.
>>
>> Qing
>
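
For readers following the thread, the effect of -ftrivial-auto-var-init=zero can be pictured at the source level. The sketch below is a hand-written C analogy, not compiler output: the type `struct point` and the functions `sum_fixed_size` and `sum_vla` are hypothetical names introduced only for illustration, and the explicit stores stand in for what the patch would add on its own (an INIT_EXPR or .DEFERRED_INIT call for fixed-size variables, a memset call for VLAs).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative type only; it does not appear in the patch.  */
struct point { int x; int y; };

/* Source-level picture of a fixed-size automatic variable that the user
   declared without an initializer.  The explicit "= {0}" stands in for
   the zero store the compiler would insert by itself.  */
static int
sum_fixed_size (void)
{
  struct point p = {0};
  return p.x + p.y;
}

/* Source-level picture of the VLA path.  The memset stands in for the
   call the patch adds in gimplify_vla_decl.  N must be positive.  */
static int
sum_vla (size_t n)
{
  assert (n > 0);
  int buf[n];
  memset (buf, 0, n * sizeof buf[0]);

  int sum = 0;
  for (size_t i = 0; i < n; i++)
    sum += buf[i];
  return sum;
}
```

Both functions return 0 for any valid input. Without the explicit stores they would read indeterminate values, which is the situation the option is meant to close off.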