From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ed1-x529.google.com (mail-ed1-x529.google.com [IPv6:2a00:1450:4864:20::529]) by sourceware.org (Postfix) with ESMTPS id A72A73857C69 for ; Wed, 9 Dec 2020 08:23:45 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org A72A73857C69 Received: by mail-ed1-x529.google.com with SMTP id r5so573760eda.12 for ; Wed, 09 Dec 2020 00:23:45 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=tzy/3Vzjhsg2oOTontlGkj9xw2eBmDG4fEkr2DaEozc=; b=WCBQdld9w9M2LlJ+pgYYaAcsJ/Y2dKd6oFKqTITa5OtSUzFsYnjtbVxhXOC3BkB8c2 W2cB0PxyvlhLCQB0eRonCGwPdmzW6Oi2HzkrV9/5A8LxItoHwb3bHROsjWfx/74pJD+6 hu8txRMxAVfqgdWvO7+jr2IBZb3OIJL33MTNfediJceGD3TFlO+ph4izAyJavr60JRMC 4OsFWt2OgLYhfjstrxhqb7kbsXHXXS+qID8ShsFYv/DHmsRZbbYJDI8IbOxvIALPvgdg ezvuVjifYJGEiZtmS/Wln6HhLz8m5FMNfcMaNwuUpv8yIYWMtHFBUwt+yDx4fnFftxZC VM+w== X-Gm-Message-State: AOAM5311Z1Q9sE6DK1Gxs9zfALCS4W9haHiOdvjytDjeB3wsYjBYKuat V9FrIlTemQC3RMHekPxxEl9aPlVMF+g6ZnWkvyg= X-Google-Smtp-Source: ABdhPJwvIyDKSCj7A9wMUhOIkzyb5UgBpgSRVTMtNbIEod1xzLIYj614w+PO2ll/DxwNTnvAvOEPbqIs/LxrT2AAhiE= X-Received: by 2002:a50:f61b:: with SMTP id c27mr931113edn.61.1607502224499; Wed, 09 Dec 2020 00:23:44 -0800 (PST) MIME-Version: 1.0 References: <217BE64F-A623-4453-B45B-D38B66B71B72@ORACLE.COM> <33955130-9D2D-43D5-818D-1DCC13FC1988@ORACLE.COM> <89D58812-0F3E-47AE-95A5-0A07B66EED8C@ORACLE.COM> <9585CBB2-0082-4B9A-AC75-250F54F0797C@ORACLE.COM> In-Reply-To: <9585CBB2-0082-4B9A-AC75-250F54F0797C@ORACLE.COM> From: Richard Biener Date: Wed, 9 Dec 2020 09:23:33 +0100 Message-ID: Subject: Re: How to traverse all the local variables that declared in the current routine? To: Qing Zhao Cc: Richard Sandiford , Richard Biener via Gcc-patches Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-3.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Dec 2020 08:23:48 -0000 On Tue, Dec 8, 2020 at 8:54 PM Qing Zhao wrote: > > > > On Dec 8, 2020, at 1:40 AM, Richard Biener w= rote: > > On Mon, Dec 7, 2020 at 5:20 PM Qing Zhao wrote: > > > > > On Dec 7, 2020, at 1:12 AM, Richard Biener w= rote: > > On Fri, Dec 4, 2020 at 5:19 PM Qing Zhao wrote: > > > > > On Dec 4, 2020, at 2:50 AM, Richard Biener w= rote: > > On Thu, Dec 3, 2020 at 6:33 PM Richard Sandiford > wrote: > > > Richard Biener via Gcc-patches writes: > > On Tue, Nov 24, 2020 at 4:47 PM Qing Zhao wrote: > > Another issue is, in order to check whether an auto-variable has initiali= zer, I plan to add a new bit in =E2=80=9Cdecl_common=E2=80=9D as: > /* In a VAR_DECL, this is DECL_IS_INITIALIZED. */ > unsigned decl_is_initialized :1; > > /* IN VAR_DECL, set when the decl is initialized at the declaration. */ > #define DECL_IS_INITIALIZED(NODE) \ > (DECL_COMMON_CHECK (NODE)->decl_common.decl_is_initialized) > > set this bit when setting DECL_INITIAL for the variables in FE. then keep= it > even though DECL_INITIAL might be NULLed. > > > For locals it would be more reliable to set this flag during gimplificati= on. > > Do you have any comment and suggestions? > > > As said above - do you want to cover registers as well as locals? I'd do > the actual zeroing during RTL expansion instead since otherwise you > have to figure youself whether a local is actually used (see expand_stack= _vars) > > Note that optimization will already made have use of "uninitialized" stat= e > of locals so depending on what the actual goal is here "late" may be too = late. > > > Haven't thought about this much, so it might be a daft idea, but would a > compromise be to use a const internal function: > > X1 =3D .DEFERRED_INIT (X0, INIT) > > where the X0 argument is an uninitialised value and the INIT argument > describes the initialisation pattern? So for a decl we'd have: > > X =3D .DEFERRED_INIT (X, INIT) > > and for an SSA name we'd have: > > X_2 =3D .DEFERRED_INIT (X_1(D), INIT) > > with all other uses of X_1(D) being replaced by X_2. The idea is that: > > * Having the X0 argument would keep the uninitialised use of the > variable around for the later warning passes. > > * Using a const function should still allow the UB to be deleted as dead > if X1 isn't needed. > > * Having a function in the way should stop passes from taking advantage > of direct uninitialised uses for optimisation. > > This means we won't be able to optimise based on the actual init > value at the gimple level, but that seems like a fair trade-off. > AIUI this is really a security feature or anti-UB hardening feature > (in the sense that users are more likely to see predictable behaviour > =E2=80=9Cin the field=E2=80=9D even if the program has UB). > > > The question is whether it's in line of peoples expectation that > explicitely zero-initialized code behaves differently from > implicitely zero-initialized code with respect to optimization > and secondary side-effects (late diagnostics, latent bugs, etc.). > > Introducing a new concept like .DEFERRED_INIT is much more > heavy-weight than an explicit zero initializer. > > > What exactly you mean by =E2=80=9Cheavy-weight=E2=80=9D? More difficult t= o implement or much more run-time overhead or both? Or something else? > > The major benefit of the approach of =E2=80=9C.DEFERRED_INIT=E2=80=9D is= to enable us keep the current -Wuninitialized analysis untouched and also = pass > the =E2=80=9Cuninitialized=E2=80=9D info from source code level to =E2=80= =9Cpass_expand=E2=80=9D. > > > Well, "untouched" is a bit oversimplified. You do need to handle > .DEFERRED_INIT as not > being an initialization which will definitely get interesting. > > > Yes, during uninitialized variable analysis pass, we should specially han= dle the defs with =E2=80=9C.DEFERRED_INIT=E2=80=9D, to treat them as uninit= ializations. > > If we want to keep the current -Wuninitialized analysis untouched, this i= s a quite reasonable approach. > > However, if it=E2=80=99s not required to keep the current -Wuninitialized= analysis untouched, adding zero-initializer directly during gimplification= should > be much easier and simpler, and also smaller run-time overhead. > > > As for optimization I fear you'll get a load of redundant zero-init > actually emitted if you can just rely on RTL DSE/DCE to remove it. > > > Runtime overhead for -fauto-init=3Dzero is one important consideration fo= r the whole feature, we should minimize the runtime overhead for zero > Initialization since it will be used in production build. > We can do some run-time performance evaluation when we have an implementa= tion ready. > > > Note there will be other passes "confused" by .DEFERRED_INIT. Note > that there's going to be other > considerations - namely where to emit the .DEFERRED_INIT - when > emitting it during gimplification > you can emit it at the start of the block of block-scope variables. > When emitting after gimplification > you have to emit at function start which will probably make stack slot > sharing inefficient because > the deferred init will cause overlapping lifetimes. With emitting at > block boundary the .DEFERRED_INIT > will act as code-motion barrier (and it itself likely cannot be moved) > so for example invariant motion > will no longer happen. Likewise optimizations like SRA will be > confused by .DEFERRED_INIT which > again will lead to bigger stack usage (and less optimization). > > > Yes, looks like that the inserted =E2=80=9C.DEFERRED_INIT=E2=80=9D funct= ion calls will negatively impact tree optimizations. > > > But sure, you can try implement a few variants but definitely > .DEFERRED_INIT will be the most > work. > > > How about implement the following two approaches and compare the run-time= cost: > > A. Insert the real initialization during gimplification phase. > B. Insert the .DEFERRED_INIT during gimplification phase, and then expan= d this call to real initialization during expand phase. > > The Approach A will have less run-time overhead, but will mess up the cur= rent uninitialized variable analysis in GCC. > The Approach B will have more run-time overhead, but will keep the curren= t uninitialized variable analysis in GCC. > > And then decide which approach we will go with? > > What=E2=80=99s your opinion on this? > > > Well, in the end you have to try. Note for the purpose of stack slot > sharing you do want the > instrumentation to happen during gimplification. > > Another possibility is to materialize .DEFERRED_INIT earlier than > expand, for example shortly > after IPA optimizations to avoid pessimizing loop transforms and allow > SRA. At the point you > materialize the inits you could run the late uninit warning pass > (which would then be earlier > than regular but would still see the .DEFERRED_INIT). > > > If we put the =E2=80=9Cmaterializing .DEFERRED_INIT=E2=80=9D phase earlie= r as you suggested above, > the late uninitialized warning pass has to be moved earlier in order to u= tilize the =E2=80=9C.DEFERRED_INIT=E2=80=9D. > Then we might miss some opportunities for the late uninitialized warning.= I think that this is not we really > want. > > > While users may be happy to pay some performance stack usage is > probably more critical > > > So, which pass is for computing the stack usage? There is no pass doing that, stack slot assignment and sharing (when lifetimes do not overlap) is done by RTL expansion. > (just thinking of the kernel) so not regressing there should be as > important as preserving > uninit warnings (which I for practical purposes see not important at > all - people can do > "debug" builds without -fzero-init). > > > Looks like that the major issue with the =E2=80=9C.DERERRED_INIT=E2=80=9D= approach is: the new inserted calls to internal const function > might inhibit some important tree optimizations. > > So, I am thinking again the following another approach I raised in the ve= ry beginning: > > During gimplification phase, mark the DECL for an auto variable without i= nitialization as =E2=80=9Cno_explicit_init=E2=80=9D, then maintain this > =E2=80=9Cno_explicit_init=E2=80=9D bit till after pass_late_warn_uninitia= lized, or till pass_expand, add zero-iniitiazation for all DECLs that are > marked with =E2=80=9Cno_explicit_init=E2=80=9D. > > This approach will not have the issue to interrupt tree optimizations, ho= wever, I guess that =E2=80=9Cmaintaining this =E2=80=9Cno_explicit_init=E2= =80=9D bit > might be very difficult? > > Do you have any comments on this approach? As said earlier you'll still get optimistic propagation bypassing the still missing implicit zero init. Maybe that's OK - you don't get "garbage" but you'll g= et some other defined value. As said, you have to implement a few options and compare. Richard. > thanks. > > Qing > > > > Richard. > > > Btw, I don't think theres any reason to cling onto clangs semantics > for a particular switch. We'll never be able to emulate 1:1 behavior > and our -Wuninit behavior is probably wastly different already. > > > From my study so far, yes, the currently behavior of -Wunit for Clang and= GCC is not exactly the same. > > For example, for the following small testing case: > void blah(int); > > int foo_2 (int n, int l, int m, int r) > { > int v; > > if ( (n > 10) && (m !=3D 100) && (r < 20) ) > v =3D r; > > if (l > 100) > if ( (n <=3D 8) && (m < 102) && (r < 19) ) > blah(v); /* { dg-warning "uninitialized" "real warning" } */ > > return 0; > } > > GCC is able to report maybe uninitialized warning, but Clang cannot. > Looks like that GCC=E2=80=99s uninitialized analysis relies on more analy= sis and optimization information than CLANG. > > Really curious on how clang implement its uninitialized analysis? > > > > Actually, I studied a little bit on how clang implement its uninitialized= analysis last Friday. > And noticed that CLANG has a data flow analysis phase based on CLANG's AS= T. > http://clang-developers.42468.n3.nabble.com/A-survey-of-dataflow-analyses= -in-Clang-td4069644.html > > And clang=E2=80=99s uninitialized analysis is based on this data flow ana= lysis. > > Therefore, adding initialization AFTER clang=E2=80=99s uninitialization a= nalysis phase is straightforward. > > However, for GCC, we don=E2=80=99t have data flow analysis in FE. The uni= nitialized variable analysis is put in TREE optimization phase, > Therefore, it=E2=80=99s much more difficult to implement this feature in = GCC than that in CLANG. > > Qing > > > > Qing > > > > > Richard. > > Thanks, > Richard > >