Subject: Re: [PATCH] Replace VRP threader with a hybrid forward threader.
To: Jeff Law
Cc: Andrew MacLeod, GCC patches
References: <20210924154653.1108992-1-aldyh@redhat.com> <5cd175c7-5dbb-e160-9dd2-85dcfe438059@gmail.com>
From: Aldy Hernandez
Message-ID: <0dc3521b-8d91-6e55-5d7d-7ee513c17c0f@redhat.com>
Date: Mon, 27 Sep 2021 17:27:43 +0200
In-Reply-To: <5cd175c7-5dbb-e160-9dd2-85dcfe438059@gmail.com>

On 9/27/21 5:01 PM, Jeff Law wrote:
>
>
> On 9/24/2021 9:46 AM, Aldy Hernandez wrote:
>> This patch implements the new hybrid forward threader and replaces the
>> embedded VRP threader with it.
> But most importantly, it pulls it out of the VRP pass as we no longer
> need the VRP data or ASSERT_EXPRs.

Yes, I have a follow-up patch removing the old mini-pass.

>
>>
>> With all the pieces that have gone in, the implementation of the hybrid
>> threader is straightforward: convert the current state into
>> SSA imports that the solver will understand, and let the path solver
>> precompute ranges and relations for the path. After this setup is done,
>> we can use the range_query API to solve gimple statements in the
>> threader.
>> The forward threader is now engine agnostic so there are no changes to
>> the threader per se.
> So the big question is do we think it's going to be this clean when we
> try to divorce the threading from DOM?

Interestingly, yes. With all the refactoring I've done, it turns out
that divorcing evrp from the DOM threader is a matter of having
dom_jt_simplifier inherit from hybrid_jt_simplifier instead of the base
class. Then we have simplify() look at the const_copies/avails first,
and otherwise let the hybrid simplifier do its thing. Yes, I was amazed
too.

As usual there are caveats:

First, notice that we'd still depend on const_copies/avails, because
we'd need them for floats anyhow. But this has the added benefit of
catching a few things in the presence of the IL changing from under us.

Second, it turns out that DOM has other uses of evrp that need to be
addressed, particularly its use of evrp to do its simple copy prop.

Be that as it may, none of these are show stoppers. I have a proof of
concept that converts everything with a few lines of code.

The big issue now is performance. Plugging in the full ranger makes it
uncomfortably slower than just using evrp. Andrew has some ideas for a
super fast ranger that doesn't do full look-ups, so we have finally
found a good use case for something we had on the back burner.

Now, numbers...

Converting the DOM threader to a hybrid client improves DOM threading
counts by 4%, but it's all at the expense of other passes. The total
threading count was essentially unchanged (well, it got worse by 0.05%).
It doesn't look like there's any gain.
We're shuffling things around at this point.

>
>>
>> I have put the hybrid bits in tree-ssa-threadedge.*, instead of VRP,
>> because they will also be used in the evrp removal of the DOM/threader,
>> which is my next task.
> Sweet.
>
>>
>> Most of the patch is actually test changes. I have gone through every
>> single one and verified that we're correct. Most were trivial dump
>> file name changes, but others required going through the IL and
>> certifying that the different IL was expected.
>>
>> For example, in pr59597.c, we have one less thread because the
>> ASSERT_EXPR was getting in the way, and making it seem like things were
>> not crossing loops. The hybrid threader sees the correct representation
>> of the IL, and avoids threading this one case.
>>
>> The final numbers are a 12.16% improvement in jump threads immediately
>> after VRP, and a 0.82% improvement in overall jump threads. The
>> performance drop is 0.6% (plus the 1.43% hit from moving the embedded
>> threader into its own pass). As I've said, I'd prefer to keep the
>> threader in its own pass, but if this is an issue, we can address this
>> with a shared ranger when VRP is replaced with an evrp instance
>> (upcoming).
> Presumably we're also seeing a cannibalization of threads from later
> passes. And just to be clear, this is good.
>
> And the big question, is the pass running after VRP2 doing anything
> particularly useful? Do we want to try and kill it now, or later?

Interesting question. Perhaps if we convert DOM threading to a hybrid
model, it will render the post-VRP threader completely useless. Huh...
That could kill two birds with one stone: we get rid of a threading
pass, and we don't need to worry as much about the super-fast ranger.

Huh... good idea. I will experiment.

Thanks.
Aldy
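
As an illustration of the dom_jt_simplifier arrangement described in
the reply above, here is a minimal C++ sketch. It is not GCC code: the
`*_sketch` class names, the string-keyed maps, and the
`std::optional<int>` return type are all assumptions made up for this
example. It only shows the shape of the idea, where the DOM simplifier
derives from the hybrid simplifier, consults its own
const_copies/avails-style table first, and otherwise defers to the
hybrid lookup.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for the base simplifier interface (not GCC's class).
struct jt_simplifier_sketch {
  virtual ~jt_simplifier_sketch() = default;
  // Return the constant that NAME is known to have, or nothing if unknown.
  virtual std::optional<int> simplify(const std::string &name) = 0;
};

// Hypothetical stand-in for the ranger-backed hybrid simplifier.
struct hybrid_jt_simplifier_sketch : jt_simplifier_sketch {
  // Pretend results of the path solver (names with a single known value).
  std::map<std::string, int> path_ranges;

  std::optional<int> simplify(const std::string &name) override {
    auto it = path_ranges.find(name);
    if (it == path_ranges.end())
      return std::nullopt;
    return it->second;
  }
};

// The DOM simplifier inherits from the hybrid one instead of the base class.
struct dom_jt_simplifier_sketch : hybrid_jt_simplifier_sketch {
  // Pretend const_copies/avails table (an assumption for illustration).
  std::map<std::string, int> const_copies;

  std::optional<int> simplify(const std::string &name) override {
    // Look at the local table first...
    auto it = const_copies.find(name);
    if (it != const_copies.end())
      return it->second;
    // ...otherwise let the hybrid simplifier do its thing.
    return hybrid_jt_simplifier_sketch::simplify(name);
  }
};

// Small self-check of the lookup order; call it from main() to try the sketch.
inline void check_sketch() {
  dom_jt_simplifier_sketch s;
  s.path_ranges["x_1"] = 7;
  s.const_copies["y_2"] = 3;
  assert(s.simplify("y_2") == 3);            // found in the local table
  assert(s.simplify("x_1") == 7);            // falls back to the hybrid lookup
  assert(!s.simplify("z_3").has_value());    // unknown to both
}
```

The point of the derivation is that the fallback is a plain call into
the parent class, so the DOM-specific part stays a few lines long.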