From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTP id CA6CB384A88A for ; Fri, 10 Sep 2021 16:38:51 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org CA6CB384A88A Received: from mail-wm1-f69.google.com (mail-wm1-f69.google.com [209.85.128.69]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-498-wIXmNcXlPMGBJR-UEWIGlA-1; Fri, 10 Sep 2021 12:38:48 -0400 X-MC-Unique: wIXmNcXlPMGBJR-UEWIGlA-1 Received: by mail-wm1-f69.google.com with SMTP id n30-20020a05600c3b9e00b002fbbaada5d7so1150463wms.7 for ; Fri, 10 Sep 2021 09:38:48 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=puzg0DdsnAvUtCkNCdeF9O4QQPdO08FsGtDmE32cx1Q=; b=Kea09aC2DVXaWN5kPm8AsrlPoEby512ejZTJK6FSuTnVjOOKHEzstwNOJyQenKmA/Q agbC6JhP+9IzPwT7//zSH+UH0ZFoFTZM65lWSTCicnNDiL7+1wyGVpQxH43n1lCG29I8 SBsGlGdoVoMWKkrNmaHcmdVmd6EcL2EnNr2Zzs+GLunCCr5rQI2E7aJ1pd94ebdjIU5o P3ic6K8HNI5l97RBft9a91g2sIC9Fwth4UURWassU64UbK1ah7SPHxOVlgIfX910496x 6ZUGh0RWW2q7Z2CILlvl4Oa4FruKPav5egMtksYpJP3UPokuQSU1fMrZYx7Wg1yZo4mm FmXQ== X-Gm-Message-State: AOAM531b6kUlAa1fnS5TMm90/EVluFwXbh92xkKJXhPhcqWhDAP76Odt N1W3TwWXE/mNRJ+6J7s/Z1/Q2qlny8tqVrx1gjW5+f/ydoPjRnlH4IV6iluCWUFKHC++g58b0Wy dRRKz6Jc= X-Received: by 2002:adf:ef92:: with SMTP id d18mr10114510wro.264.1631291927219; Fri, 10 Sep 2021 09:38:47 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwwaJCEIz0uYm+njIwS1R/TZZYtf/Hrtmm8+UiNN9Mv4O786gMkFT/Gep0uMkuFNb+ztqu9hw== X-Received: by 2002:adf:ef92:: with SMTP id d18mr10114474wro.264.1631291926946; Fri, 10 Sep 2021 09:38:46 -0700 (PDT) Received: from abulafia.quesejoda.com ([139.47.33.227]) by smtp.gmail.com with ESMTPSA id s7sm5148383wra.75.2021.09.10.09.38.45 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Fri, 10 Sep 2021 09:38:46 -0700 (PDT) Subject: Re: More aggressive threading causing loop-interchange-9.c regression To: Jeff Law , Richard Biener Cc: Michael Matz , GCC Mailing List , Andrew MacLeod References: <09e48b82-bc51-405e-7680-89a5f08e4e8f@redhat.com> <6d5695e4-e4eb-14a5-46a6-f425d1514008@redhat.com> <07fdd2bb-e6b7-fe66-f6a0-df6ec0704ae4@redhat.com> <8c49db8d-3119-0dc2-2bbb-4062c8d5d53b@redhat.com> <991f9fe6-0597-4883-c114-04e01098b37a@redhat.com> <072930e0-0422-db13-c868-d97e6ede07d8@gmail.com> From: Aldy Hernandez Message-ID: <06565188-e0cf-d4f1-3a82-aeb18914f59e@redhat.com> Date: Fri, 10 Sep 2021 18:38:45 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0 MIME-Version: 1.0 In-Reply-To: <072930e0-0422-db13-c868-d97e6ede07d8@gmail.com> X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-5.8 required=5.0 tests=BAYES_00, BODY_8BITS, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, NICE_REPLY_A, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 10 Sep 2021 16:38:53 -0000 On 9/10/21 6:21 PM, Jeff Law wrote: > > > On 9/10/2021 10:05 AM, Aldy Hernandez wrote: >> >> >> On 9/10/21 5:43 PM, Jeff Law wrote: >>> >>> >>> On 9/9/2021 3:21 AM, Aldy Hernandez wrote: >>>> >>>>> >>>>>    /* If this path does not thread through the loop latch, then we are >>>>>       using the FSM threader to find old style jump threads. This >>>>>       is good, except the FSM threader does not re-use an existing >>>>>       threading path to reduce code duplication. >>>>> >>>>>       So for that case, drastically reduce the number of statements >>>>>       we are allowed to copy.  */ >>>> >>>> *blink* >>>> >>>> Woah.  The backward threader has been using FSM threads >>>> indiscriminately as far as I can remember.  I wonder what would >>>> break if we "fixed it". >>> ?!?  I'm not sure what you're suggesting here.  If you s/FSM >>> threader/backwards threader/ in the comment does it make more sense? >>> The term FSM really should largely have been dropped as the backwards >>> threader was improved to handle more cases. >> >> back_threader_registry::register_path() uses EDGE_FSM_THREAD as the >> thread type to register threads.  I was wondering if it should have >> been some combination of EDGE_START_JUMP_THREAD / >> EDGE_*_COPY_SRC_BLOCK, etc.  I (purposely) know nothing about the >> underlying threading types ;-). But if the backwards threader has been >> improved, then perhaps we should just remove the confusing FSM >> references. > No we shouldn't change it to any of the other types. EDGE_FSM_THREAD > means a thread found by the backwards threader and it's the key used to > determine which of the two CFG updating mechanisms should be used, the > generic copier in the case of EDGE_FSM_THREAD. > > > Changing the name, yes, absolutely.  I probably should have done that > when the original FSM threader was tweaked to handle generic threading. I'll put that on my list. > > As you've probably heard me mention before, all the EDGE_FSM_THREAD > stuff in the registry really could be pulled out.   The registry's > purpose is to deal with the two stage nature of jump threading in DOM > (find threads, then optimize them later).  I don't think any of the > backwards threading bits need that two stage handling. Yeah yeah. But you said that a year ago, and all I heard was *mumble mumble/complicated things*. That was before I had much exposure to the code. Now I feel a lot more comfortable ;-). I'll also put this on my list, but it may not get done this release, cause I'm running against the clock with the VRP/threader replacement, which is what's keeping us back from replacing VRP with an evrp instance right now :). > >> My current thinking is that replacing the forward VRP threader with a >> hybrid one is a gentler approach to the longer term goal of replacing >> the forward threader altogether.  However, all the work I've been >> doing could go either way-- we could try the forward/VRP replacement >> or a hybrid approach.  It will all use the path solver underneath. > And that's probably a reasonable intermediate step on the way towards > removing the VRP threading. > >> >> My main problem with replacing the forward/VRP with a backward client >> is that the cost models are so different that it was difficult to >> compare how we fared.  I constantly ran into threads the solver could >> handle just fine, but profitable_path_p was holding it back. > Yea.  Sorry about that tangle of insanity > >> >> FWIW, we get virtually everything the forward threader gets, minus a >> very few things.  At least when I plug in the solver to the >> DOM/forwarder threader, it can solve everything it can (minus noise >> and floats). > So once you plug those bits in, we don't have to carry around the > avail/copies tables for the threader anymore, right?  That's a nice > cleanup in and of itself. Correct. For the VRP/hybrid approach I'm working on, there are no copies/avails. The solver can do everything they did. After all, it's an easier target, since VRP threading only happens on ints and without the IL changing constantly. > >> >> If you prefer a backward threader instance to replace the VRP/forward >> threader, I'm game.  It's just harder to compare. Either way (backward >> threader or a hybrid forward+solver) uses the same underlying solver >> which is solid. > I think we can go hybrid, then look at the next step, which could well > be bringing some consistency into the costing models. >>> >>>> >>>> c) DOM changes the IL as it goes.  Though we could conceivably >>>> divorce do the threading after DOM is done. >>> The only reason threading runs in parallel with DOM is so that it can >>> use the context sensitive equivalences.  With the infrastructure >>> you're building, there's a reasonable chance we can move to a model >>> where we run DOM (and in the long term a simpler DOM) and threading >>> as distinct, independent passes. >> >> Andrew mumbled something about replacing all of DOM eventually :-). >> Well, except that value-numbering business I bet. > Essentially a realization of Bodik's work in GCC.   The nugget in there > is it's a path sensitive optimizer.  That's kindof what I've envisioned > DOM turning into. > > 1. We separate out jump threading from DOM. > 2. We replace the bulk of DOM with FRE > 3. The remnants of DOM turn into a path sensitive optimizer (and for > god's sake we don't want to call it DOM anymore :-) Well, my tree has improvements to the solver for full path sensitive ranges (using ranger to resolve definitions outside of the path). And it also does path sensitive relations, though Andrew is overhauling them next week. So yeah, given a path, we could tell you all sorts of interesting things about it :). Aldy