From: Qing Zhao
Message-Id: <89D58812-0F3E-47AE-95A5-0A07B66EED8C@ORACLE.COM>
Subject: Re: How to traverse all the local variables that declared in the current routine?
Date: Mon, 7 Dec 2020 10:20:17 -0600
To: Richard Biener, Richard Sandiford
Cc: Richard Biener via Gcc-patches
References: <217BE64F-A623-4453-B45B-D38B66B71B72@ORACLE.COM> <33955130-9D2D-43D5-818D-1DCC13FC1988@ORACLE.COM>
List-Id: Gcc-patches mailing list

> On Dec 7, 2020, at 1:12 AM, Richard Biener wrote:
>
> On Fri, Dec 4, 2020 at 5:19 PM Qing Zhao wrote:
>>
>> On Dec 4, 2020, at 2:50 AM, Richard Biener wrote:
>>
>> On Thu, Dec 3, 2020 at 6:33 PM
Richard Sandiford wrote:
>>
>> Richard Biener via Gcc-patches writes:
>>
>> On Tue, Nov 24, 2020 at 4:47 PM Qing Zhao wrote:
>>
>> Another issue is, in order to check whether an auto-variable has an initializer, I plan to add a new bit in "decl_common" as:
>>
>>   /* In a VAR_DECL, this is DECL_IS_INITIALIZED.  */
>>   unsigned decl_is_initialized :1;
>>
>>   /* In a VAR_DECL, set when the decl is initialized at the declaration.  */
>>   #define DECL_IS_INITIALIZED(NODE) \
>>     (DECL_COMMON_CHECK (NODE)->decl_common.decl_is_initialized)
>>
>> We set this bit when setting DECL_INITIAL for the variable in the FE, then keep it
>> even though DECL_INITIAL might be NULLed.
>>
>>
>> For locals it would be more reliable to set this flag during gimplification.
>>
>> Do you have any comments or suggestions?
>>
>>
>> As said above - do you want to cover registers as well as locals?  I'd do
>> the actual zeroing during RTL expansion instead, since otherwise you
>> have to figure out yourself whether a local is actually used (see expand_stack_vars).
>>
>> Note that optimization will already have made use of the "uninitialized" state
>> of locals, so depending on what the actual goal is here, "late" may be too late.
>>
>>
>> Haven't thought about this much, so it might be a daft idea, but would a
>> compromise be to use a const internal function:
>>
>>   X1 = .DEFERRED_INIT (X0, INIT)
>>
>> where the X0 argument is an uninitialised value and the INIT argument
>> describes the initialisation pattern?  So for a decl we'd have:
>>
>>   X = .DEFERRED_INIT (X, INIT)
>>
>> and for an SSA name we'd have:
>>
>>   X_2 = .DEFERRED_INIT (X_1(D), INIT)
>>
>> with all other uses of X_1(D) being replaced by X_2.  The idea is that:
>>
>> * Having the X0 argument would keep the uninitialised use of the
>>   variable around for the later warning passes.
>>
>> * Using a const function should still allow the UB to be deleted as dead
>>   if X1 isn't needed.
>>
>> * Having a function in the way should stop passes from taking advantage
>>   of direct uninitialised uses for optimisation.
>>
>> This means we won't be able to optimise based on the actual init
>> value at the gimple level, but that seems like a fair trade-off.
>> AIUI this is really a security feature or anti-UB hardening feature
>> (in the sense that users are more likely to see predictable behaviour
>> "in the field" even if the program has UB).
>>
>>
>> The question is whether it's in line with people's expectations that
>> explicitly zero-initialized code behaves differently from
>> implicitly zero-initialized code with respect to optimization
>> and secondary side-effects (late diagnostics, latent bugs, etc.).
>>
>> Introducing a new concept like .DEFERRED_INIT is much more
>> heavy-weight than an explicit zero initializer.
>>
>>
>> What exactly do you mean by "heavy-weight"?  More difficult to implement, more run-time overhead, or both?  Or something else?
>>
>> The major benefit of the ".DEFERRED_INIT" approach is that it lets us keep the current -Wuninitialized analysis untouched and also pass
>> the "uninitialized" info from the source-code level down to "pass_expand".
>
> Well, "untouched" is a bit oversimplified.  You do need to handle
> .DEFERRED_INIT as not being an initialization, which will definitely
> get interesting.

Yes, during the uninitialized variable analysis pass we should specially handle defs coming from ".DEFERRED_INIT", treating them as uninitialized.

>> If we want to keep the current -Wuninitialized analysis untouched, this is a quite reasonable approach.
>>
>> However, if it's not required to keep the current -Wuninitialized analysis untouched, adding the zero-initializer directly during gimplification would
>> be much easier and simpler, and would also have smaller run-time overhead.
>>
>>
>> As for optimization, I fear you'll get a load of redundant zero-inits
>> actually emitted if you just rely on RTL DSE/DCE to remove them.
>>
>>
>> Runtime overhead for -fauto-init=zero is one important consideration for the whole feature; we should minimize the runtime overhead of zero
>> initialization since it will be used in production builds.
>> We can do some run-time performance evaluation when we have an implementation ready.
>
> Note there will be other passes "confused" by .DEFERRED_INIT.  Note
> that there are going to be other considerations - namely where to
> emit the .DEFERRED_INIT.  When emitting it during gimplification
> you can emit it at the start of the block for block-scope variables.
> When emitting after gimplification you have to emit at function
> start, which will probably make stack-slot sharing inefficient
> because the deferred init will cause overlapping lifetimes.  With
> emitting at the block boundary, the .DEFERRED_INIT will act as a
> code-motion barrier (and it itself likely cannot be moved), so for
> example invariant motion will no longer happen.  Likewise,
> optimizations like SRA will be confused by .DEFERRED_INIT, which
> again will lead to bigger stack usage (and less optimization).

Yes, it looks like the inserted ".DEFERRED_INIT" calls will negatively impact tree optimizations.

>
> But sure, you can try to implement a few variants, but definitely
> .DEFERRED_INIT will be the most work.

How about implementing the following two approaches and comparing their run-time cost:

A. Insert the real initialization during the gimplification phase.
B.
Insert the .DEFERRED_INIT during the gimplification phase, and then expand this call to the real initialization during the expand phase.

Approach A will have less run-time overhead, but will mess up the current uninitialized variable analysis in GCC.
Approach B will have more run-time overhead, but will keep the current uninitialized variable analysis in GCC.

And then decide which approach we will go with?

What's your opinion on this?

>
>> Btw, I don't think there's any reason to cling onto clang's semantics
>> for a particular switch.  We'll never be able to emulate 1:1 behavior,
>> and our -Wuninit behavior is probably vastly different already.
>>
>>
>> From my study so far, yes, the current -Wuninit behavior of Clang and GCC is not exactly the same.
>>
>> For example, for the following small test case:
>>
>>   void blah(int);
>>
>>   int foo_2 (int n, int l, int m, int r)
>>   {
>>     int v;
>>
>>     if ( (n > 10) && (m != 100) && (r < 20) )
>>       v = r;
>>
>>     if (l > 100)
>>       if ( (n <= 8) && (m < 102) && (r < 19) )
>>         blah(v); /* { dg-warning "uninitialized" "real warning" } */
>>
>>     return 0;
>>   }
>>
>> GCC is able to report a maybe-uninitialized warning, but Clang cannot.
>> It looks like GCC's uninitialized analysis relies on more analysis and optimization information than Clang's.
>>
>> Really curious how clang implements its uninitialized analysis?

Actually, I studied a little bit how clang implements its uninitialized analysis last Friday,
and noticed that Clang has a data flow analysis phase based on Clang's AST:

http://clang-developers.42468.n3.nabble.com/A-survey-of-dataflow-analyses-in-Clang-td4069644.html

Clang's uninitialized analysis is based on this data flow analysis.
Therefore, adding initialization AFTER clang's uninitialized analysis phase is straightforward.

However, for GCC, we don't have data flow analysis in the FE.
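For concreteness, the two approaches might look roughly like this in GIMPLE (a sketch only; the exact form of the INIT argument and of the expanded initialization are assumptions, not an agreed design):

```
/* Approach A: gimplification emits the real initialization directly,
   so later passes (including -Wuninitialized) see 'v' as initialized.  */
v = {};                          /* or v = 0 for a scalar */

/* Approach B: gimplification emits the internal function call ...  */
v = .DEFERRED_INIT (v, INIT);    /* INIT describes the pattern, e.g. zero */

/* ... and pass_expand later lowers it to the real zeroing, after the
   uninitialized-use warning passes have run and treated this def as
   "not an initialization".  */
```

This makes the trade-off in the thread visible: A is a plain assignment that every pass already understands (cheap, but hides the uninitialized use from -Wuninitialized), while B preserves the uninitialized-use information at the cost of a call that acts as an optimization barrier.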
The uninitialized variable analysis is done in the tree optimization phase.
Therefore, it's much more difficult to implement this feature in GCC than in Clang.

Qing

>>
>> Qing
>>
>>
>> Richard.
>>
>> Thanks,
>> Richard