From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qt1-x82e.google.com (mail-qt1-x82e.google.com [IPv6:2607:f8b0:4864:20::82e]) by sourceware.org (Postfix) with ESMTPS id EE7153857820 for ; Mon, 30 Nov 2020 17:18:51 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org EE7153857820 Received: by mail-qt1-x82e.google.com with SMTP id l7so8699878qtp.8 for ; Mon, 30 Nov 2020 09:18:51 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=P6MVR7+CYFE35/L/ukDKJ6AmQIf1osXTj/Fu+18JCFU=; b=al6Eb8CvAZqGJkM6kO3SWrcl+0PRhhdsHSapglbNwAe58LM3TdRlCbJ14JjRkEmDHg vmP4NWybgsmDDCVysnJa/5nrKzkmFeceJ8TTcOEOdtX4sc+S1aXXrlAPRV/EZWgJlDOb kV/byDyPhjQe9jpWvuJLBOHaqsXgeGH00oo2Hu69XQlIszAzmmBoLISuqZSS04EnRKau vbis4TzmSClcgL/Xx2GF/LsZk3NOtdjICH2WNzTn1EESK01Jd8tI4gtLAr/x4ZwbDchV 9hEe1uohIPN4Wf95P6p7nwmbs34BR6156vmnMqANFhigV1HNxkEJPspP0iJFe3ofAnHy x5zw== X-Gm-Message-State: AOAM532MY5xRfJI3mKY3mAEPSGRi/qW/4XhE5xYdFIz9lAd1RXxAVUk6 nMlympQ8Apl+3copuwC71PV4IpvXqDE= X-Google-Smtp-Source: ABdhPJykhoxgZZ/GLeGnSJXsLH+vy4MO8aiXT9EEyUQAOj1ymB4z6XVuE/PHWf9Db3yb4BEN3XZtpg== X-Received: by 2002:ac8:688b:: with SMTP id m11mr19832949qtq.206.1606756731251; Mon, 30 Nov 2020 09:18:51 -0800 (PST) Received: from [192.168.0.41] (174-16-97-231.hlrn.qwest.net. [174.16.97.231]) by smtp.gmail.com with ESMTPSA id z23sm16284671qtq.66.2020.11.30.09.18.49 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Mon, 30 Nov 2020 09:18:50 -0800 (PST) Subject: Re: How to traverse all the local variables that declared in the current routine? To: Qing Zhao Cc: Richard Biener , Richard Sandiford , gcc Patches References: <217BE64F-A623-4453-B45B-D38B66B71B72@ORACLE.COM> <15EA64C7-D75F-4CE1-92C8-6940186A512A@ORACLE.COM> <29CCC07E-6203-46BA-8D35-8FF7F55563A2@ORACLE.COM> From: Martin Sebor Message-ID: <08c7d0ff-468b-994f-1299-6e76eb137b43@gmail.com> Date: Mon, 30 Nov 2020 10:18:49 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.2.2 MIME-Version: 1.0 In-Reply-To: <29CCC07E-6203-46BA-8D35-8FF7F55563A2@ORACLE.COM> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-2.2 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, FREEMAIL_REPLY, NICE_REPLY_A, RCVD_IN_BARRACUDACENTRAL, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=no autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 30 Nov 2020 17:18:54 -0000 On 11/30/20 9:23 AM, Qing Zhao wrote: > Hi, Martin, > > Thanks a lot for your suggestion. > >> On Nov 25, 2020, at 6:08 PM, Martin Sebor > > wrote: >> >> On 11/24/20 9:54 AM, Qing Zhao via Gcc-patches wrote: >>>> On Nov 24, 2020, at 9:55 AM, Richard Biener >>>> > wrote: >>>> >>>> On Tue, Nov 24, 2020 at 4:47 PM Qing Zhao >>> > wrote: >>>>> >>>>> >>>>> >>>>>> On Nov 24, 2020, at 1:32 AM, Richard Biener >>>>>> > >>>>>> wrote: >>>>>> >>>>>> On Tue, Nov 24, 2020 at 12:05 AM Qing Zhao via Gcc-patches >>>>>> > wrote: >>>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> Does gcc provide an iterator to traverse all the local variables >>>>>>> that are declared in the current routine? >>>>>>> >>>>>>> If not, what’s the best way to traverse the local variables? >>>>>> >>>>>> Depends on what for.  There's the source level view you get by walking >>>>>> BLOCK_VARS of the >>>>>> scope tree, theres cfun->local_variables (FOR_EACH_LOCAL_DECL) and >>>>>> there's SSA names >>>>>> (FOR_EACH_SSA_NAME). >>>>> >>>>> I am planing to add a new phase immediately after >>>>> “pass_late_warn_uninitialized” to initialize all auto-variables >>>>> that are >>>>> not explicitly initialized in the declaration, the basic idea is >>>>> following: >>>>> >>>>> ** The proposal: >>>>> >>>>> A. add a new GCC option: (same name and meaning as CLANG) >>>>> -ftrivial-auto-var-init=[pattern|zero], similar pattern init as CLANG; >>>>> >>>>> B. add a new attribute for variable: >>>>> __attribute((uninitialized) >>>>> the marked variable is uninitialized intentionaly for performance >>>>> purpose. >>>>> >>>>> C. The implementation needs to keep the current static warning on >>>>> uninitialized >>>>> variables untouched in order to avoid "forking the language". >>>>> >>>>> >>>>> ** The implementation: >>>>> >>>>> There are two major requirements for the implementation: >>>>> >>>>> 1. all auto-variables that do not have an explicit initializer >>>>> should be initialized to >>>>> zero by this option.  (Same behavior as CLANG) >>>>> >>>>> 2. keep the current static warning on uninitialized variables >>>>> untouched. >>>>> >>>>> In order to satisfy 1, we should check whether an auto-variable has >>>>> initializer >>>>> or not; >>>>> In order to satisfy 2, we should add this new transformation after >>>>> "pass_late_warn_uninitialized". >>>>> >>>>> So, we should be able to check whether an auto-variable has >>>>> initializer or not after “pass_late_warn_uninitialized”, >>>>> If Not, then insert an initialization for it. >>>>> >>>>> For this purpose, I guess that “FOR_EACH_LOCAL_DECL” might be better? >>>> >>>> Yes, but do you want to catch variables promoted to register as well >>>> or just variables >>>> on the stack? >>> I think both as long as they are source-level auto-variables. Then >>> which one is better? >>>> >>>>> Another issue is, in order to check whether an auto-variable has >>>>> initializer, I plan to add a new bit in “decl_common” as: >>>>>  /* In a VAR_DECL, this is DECL_IS_INITIALIZED.  */ >>>>>  unsigned decl_is_initialized :1; >>>>> >>>>> /* IN VAR_DECL, set when the decl is initialized at the >>>>> declaration.  */ >>>>> #define DECL_IS_INITIALIZED(NODE) \ >>>>>  (DECL_COMMON_CHECK (NODE)->decl_common.decl_is_initialized) >>>>> >>>>> set this bit when setting DECL_INITIAL for the variables in FE. >>>>> then keep it >>>>> even though DECL_INITIAL might be NULLed. >>>> >>>> For locals it would be more reliable to set this flag during >>>> gimplification. >>> You mean I can set the flag “DECL_IS_INITIALIZED (decl)”  inside the >>> routine “gimpley_decl_expr” (gimplify.c) as following: >>>   if (VAR_P (decl) && !DECL_EXTERNAL (decl)) >>>     { >>>       tree init = DECL_INITIAL (decl); >>> ... >>>       if (init && init != error_mark_node) >>>         { >>>           if (!TREE_STATIC (decl)) >>>     { >>>       DECL_IS_INITIALIZED(decl) = 1; >>>     } >>> Is this enough for all Frontends? Are there other places that I need >>> to maintain this bit? >>>> >>>>> Do you have any comment and suggestions? >>>> >>>> As said above - do you want to cover registers as well as locals? >>> All the locals from the source-code point of view should be covered. >>>   (From my study so far,  looks like that Clang adds that phase in FE). >>> If GCC adds this phase in FE, then the following design requirement >>> C. The implementation needs to keep the current static warning on >>> uninitialized >>> variables untouched in order to avoid "forking the language”. >>> cannot be satisfied.  Since gcc’s uninitialized variables analysis is >>> applied quite late. >>> So, we have to add this new phase after “pass_late_warn_uninitialized”. >>>>  I'd do >>>> the actual zeroing during RTL expansion instead since otherwise you >>>> have to figure youself whether a local is actually used (see >>>> expand_stack_vars) >>> Adding  this new transformation during RTL expansion is okay.  I will >>> check on this in more details to see how to add it to RTL expansion >>> phase. >>>> >>>> Note that optimization will already made have use of "uninitialized" >>>> state >>>> of locals so depending on what the actual goal is here "late" may be >>>> too late. >>> This is a really good point… >>> In order to avoid optimization  to use the “uninitialized” state of >>> locals, we should add the zeroing phase as early as possible (adding >>> it in FE might be best >>> for this issue). However, if we have to met the following requirement: >>> C. The implementation needs to keep the current static warning on >>> uninitialized >>> variables untouched in order to avoid "forking the language”. >>> We have to move the new phase after all the uninitialized analysis is >>> done in order to avoid “forking the language”. >>> So, this is a problem that is not easy to resolve. >>> Do you have suggestion on this? >> >> Not having thought about it very long or hard I'd be tempted to do >> it the other way around.  For each use of an uninitialized variable >> found, first either issue or queue up a -Wuninitialized for it and >> then initialize it.  Then (if queued) at some later point, issue >> the queued up -Wuninitialized.  The last part would be done in >> tree-ssa-uninit.c where the remaining uses of uninitialized >> variables would trigger warnings and induce their initialization >> (if there were any left). > > > The major issue with this approach is: > > There are two passes for uninitialized variable analysis: > pass_early_warn_uninitialized > pass_late_warn_uninitialized > > The early pass is placed at the very beginning of the tree optimizer. > But the late pass is placed at the very late stage of the tree optimizer. > If we add the initializations at the early pass, the result of the late > pass will be changed by the new added initializations. This does not meet > the requirement. > > Do I miss anything here? I'm not sure. As I said, I'd consider issuing (or queuing up for issuing later) -Wuninitialized at the same time as initializing the uninitialized variables. With that approach I'd expect to diagnose all the same instances of uninitialized uses as the two passes do today (actually, I'd expect to diagnose more of them, including those Richard referred to above whose uninitialized state may have been made use of for optimization decisions(*)). Also with this approach the two existing warning passes would cease to serve their current purpose of hunting down uninitialized variables because by the time they ran all their uses would have been initialized (and warnings issued). One question in my mind is what to do with -Wmaybe-uninitialized. Should those also be initialized, even though they're not necessarily used? Or are you only hoping to tackle -Wuninitialized? Martin [*] With the initialization approach I'd expect concerns about the cost of losing those optimization opportunities. Although those could be addressed by making the initialization optional (i.e., opt-in).