From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from userp2130.oracle.com (userp2130.oracle.com [156.151.31.86]) by sourceware.org (Postfix) with ESMTPS id 92DC3393C86E for ; Mon, 7 Dec 2020 17:36:15 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 92DC3393C86E Received: from pps.filterd (userp2130.oracle.com [127.0.0.1]) by userp2130.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 0B7HTqgW108956; Mon, 7 Dec 2020 17:36:12 GMT Received: from aserp3030.oracle.com (aserp3030.oracle.com [141.146.126.71]) by userp2130.oracle.com with ESMTP id 3581mqpjy2-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Mon, 07 Dec 2020 17:36:12 +0000 Received: from pps.filterd (aserp3030.oracle.com [127.0.0.1]) by aserp3030.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 0B7HUkcD046394; Mon, 7 Dec 2020 17:36:11 GMT Received: from aserv0122.oracle.com (aserv0122.oracle.com [141.146.126.236]) by aserp3030.oracle.com with ESMTP id 358ksmg0ts-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 07 Dec 2020 17:36:11 +0000 Received: from abhmp0011.oracle.com (abhmp0011.oracle.com [141.146.116.17]) by aserv0122.oracle.com (8.14.4/8.14.4) with ESMTP id 0B7HaAhX002915; Mon, 7 Dec 2020 17:36:10 GMT Received: from dhcp-10-154-100-75.vpn.oracle.com (/10.154.100.75) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Mon, 07 Dec 2020 09:36:10 -0800 Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 13.4 \(3608.120.23.2.4\)) Subject: Re: How to traverse all the local variables that declared in the current routine? From: Qing Zhao In-Reply-To: Date: Mon, 7 Dec 2020 11:36:09 -0600 Cc: Qing Zhao via Gcc-patches , Richard Biener Content-Transfer-Encoding: quoted-printable Message-Id: <23B74D40-BB4F-46AF-94A5-E490585CCCC0@ORACLE.COM> References: <217BE64F-A623-4453-B45B-D38B66B71B72@ORACLE.COM> <33955130-9D2D-43D5-818D-1DCC13FC1988@ORACLE.COM> <89D58812-0F3E-47AE-95A5-0A07B66EED8C@ORACLE.COM> To: Richard Sandiford X-Mailer: Apple Mail (2.3608.120.23.2.4) X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9828 signatures=668682 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 mlxlogscore=999 suspectscore=3 bulkscore=0 malwarescore=0 phishscore=0 mlxscore=0 spamscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2009150000 definitions=main-2012070113 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9828 signatures=668682 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=3 mlxlogscore=999 clxscore=1015 malwarescore=0 priorityscore=1501 adultscore=0 lowpriorityscore=0 phishscore=0 spamscore=0 impostorscore=0 mlxscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2009150000 definitions=main-2012070111 X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_PASS, TXREP, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 07 Dec 2020 17:36:17 -0000 > On Dec 7, 2020, at 11:10 AM, Richard Sandiford = wrote: >>>>=20 >>>> Another issue is, in order to check whether an auto-variable has = initializer, I plan to add a new bit in =E2=80=9Cdecl_common=E2=80=9D = as: >>>> /* In a VAR_DECL, this is DECL_IS_INITIALIZED. */ >>>> unsigned decl_is_initialized :1; >>>>=20 >>>> /* IN VAR_DECL, set when the decl is initialized at the = declaration. */ >>>> #define DECL_IS_INITIALIZED(NODE) \ >>>> (DECL_COMMON_CHECK (NODE)->decl_common.decl_is_initialized) >>>>=20 >>>> set this bit when setting DECL_INITIAL for the variables in FE. = then keep it >>>> even though DECL_INITIAL might be NULLed. >>>>=20 >>>>=20 >>>> For locals it would be more reliable to set this flag during = gimplification. >>>>=20 >>>> Do you have any comment and suggestions? >>>>=20 >>>>=20 >>>> As said above - do you want to cover registers as well as locals? = I'd do >>>> the actual zeroing during RTL expansion instead since otherwise you >>>> have to figure youself whether a local is actually used (see = expand_stack_vars) >>>>=20 >>>> Note that optimization will already made have use of = "uninitialized" state >>>> of locals so depending on what the actual goal is here "late" may = be too late. >>>>=20 >>>>=20 >>>> Haven't thought about this much, so it might be a daft idea, but = would a >>>> compromise be to use a const internal function: >>>>=20 >>>> X1 =3D .DEFERRED_INIT (X0, INIT) >>>>=20 >>>> where the X0 argument is an uninitialised value and the INIT = argument >>>> describes the initialisation pattern? So for a decl we'd have: >>>>=20 >>>> X =3D .DEFERRED_INIT (X, INIT) >>>>=20 >>>> and for an SSA name we'd have: >>>>=20 >>>> X_2 =3D .DEFERRED_INIT (X_1(D), INIT) >>>>=20 >>>> with all other uses of X_1(D) being replaced by X_2. The idea is = that: >>>>=20 >>>> * Having the X0 argument would keep the uninitialised use of the >>>> variable around for the later warning passes. >>>>=20 >>>> * Using a const function should still allow the UB to be deleted as = dead >>>> if X1 isn't needed. >>>>=20 >>>> * Having a function in the way should stop passes from taking = advantage >>>> of direct uninitialised uses for optimisation. >>>>=20 >>>> This means we won't be able to optimise based on the actual init >>>> value at the gimple level, but that seems like a fair trade-off. >>>> AIUI this is really a security feature or anti-UB hardening feature >>>> (in the sense that users are more likely to see predictable = behaviour >>>> =E2=80=9Cin the field=E2=80=9D even if the program has UB). >>>>=20 >>>>=20 >>>> The question is whether it's in line of peoples expectation that >>>> explicitely zero-initialized code behaves differently from >>>> implicitely zero-initialized code with respect to optimization >>>> and secondary side-effects (late diagnostics, latent bugs, etc.). >>>>=20 >>>> Introducing a new concept like .DEFERRED_INIT is much more >>>> heavy-weight than an explicit zero initializer. >>>>=20 >>>>=20 >>>> What exactly you mean by =E2=80=9Cheavy-weight=E2=80=9D? More = difficult to implement or much more run-time overhead or both? Or = something else? >>>>=20 >>>> The major benefit of the approach of =E2=80=9C.DEFERRED_INIT=E2=80=9D= is to enable us keep the current -Wuninitialized analysis untouched = and also pass >>>> the =E2=80=9Cuninitialized=E2=80=9D info from source code level to = =E2=80=9Cpass_expand=E2=80=9D. >>>=20 >>> Well, "untouched" is a bit oversimplified. You do need to handle >>> .DEFERRED_INIT as not >>> being an initialization which will definitely get interesting. >>=20 >> Yes, during uninitialized variable analysis pass, we should specially = handle the defs with =E2=80=9C.DEFERRED_INIT=E2=80=9D, to treat them as = uninitializations. >=20 > Are you sure we need to do that? The point of having the first = argument > to .DEFERRED_INIT was that that argument would still provide an > uninitialised use of the variable. And the values are passed and > returned by value, so the lack of initialisation is explicit in > the gcall itself, without knowing what the target function does. >=20 > The idea is that we can essentially treat .DEFERRED_INIT as a normal > (const) function call. I'd be surprised if many passes needed to > handle it specially. >=20 Just checked with a small testing case (to emulate the .DEFERRED_INIT = approach): qinzhao@gcc10:~/Bugs/auto-init$ cat t.c extern int DEFFERED_INIT (int, int) __attribute__ ((const)); int foo (int n, int r) { int v; v =3D DEFFERED_INIT (v, 0); if (n < 10)=20 v =3D r; return v; } qinzhao@gcc10:~/Bugs/auto-init$ sh t /home/qinzhao/Install/latest_write/bin/gcc -O -Wuninitialized = -fdump-tree-all -S t.c t.c: In function =E2=80=98foo=E2=80=99: t.c:7:7: warning: =E2=80=98v=E2=80=99 is used uninitialized = [-Wuninitialized] 7 | v =3D DEFFERED_INIT (v, 0); | ^~~~~~~~~~~~~~~~~~~~ We can see that the current uninitialized variable analysis treats the = new added artificial initialization as the first use of the uninialized = variable. Therefore report the warning there. However, we should report warning at =E2=80=9Creturn v=E2=80=9D.=20 So, I think that we still need to specifically handle the new added = artificial initialization during uninitialized analysis phase. Do I still miss anything? Qing > Thanks, > Richard