From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTP id 648BE384A015 for ; Wed, 28 Jul 2021 14:51:41 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 648BE384A015 Received: from mail-wm1-f69.google.com (mail-wm1-f69.google.com [209.85.128.69]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-336-_-qXp3Y9PaeMPZUrt4bAVA-1; Wed, 28 Jul 2021 10:51:37 -0400 X-MC-Unique: _-qXp3Y9PaeMPZUrt4bAVA-1 Received: by mail-wm1-f69.google.com with SMTP id n7-20020a05600c3b87b029024e59a633baso1024430wms.5 for ; Wed, 28 Jul 2021 07:51:36 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=ftcG7InF4/fInEQp56YCV4Z0SRuU9EfxytIB5r/XI70=; b=SP4x+6mtkLQHotzjbK+4ufolpFLZc5t3V2AsFhIF7PV1I4WEQgtw5XjMX4IFNATjLX odlW21odYNcE0fI0muTjij4s0QaTmoqWyHibGXwB9PlxK0/yWp1TiRgMmnjpaUSonXBv DPZzT2I7/3LFyVNCqz7DT5+knejZlVErrxkcBX3+cWusDPq78IJewRrpSBxMLf/UsKwm znVEeXyk/Nzccpax+WXCgc5Iwu1DSwidBla3y7TwhgdYt5PPwSz5Dg0Nr4NDXghUcMN1 5n7w7zDiVMSWQWH30pHncQ0Hy/wx9/lvhIVUoR08YrwVBDY5ZkN/1KveGqZSSDR7KrHt 0BgQ== X-Gm-Message-State: AOAM5332aI4A8bI0LVfVgZ815bSyB8E4eHKog0pFBNthtG5DEmiWf1QB 0zQT4TBy70v9gyooYJvHEGGRsCIGGhKaT8OYDv1mKH4Fk/W4Fibfq0b88UTLbr68g1unY7NOj3+ pvLEt2lO59Ys8VN3YnQ== X-Received: by 2002:a1c:f414:: with SMTP id z20mr10039840wma.94.1627483895791; Wed, 28 Jul 2021 07:51:35 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzQbkMUyiUvWzpAXWZAjedgcuPKYrkOqE5dSw2xQ1b4a6fvYAcmLO3xA08BXPjqGKNYajoPow== X-Received: by 2002:a1c:f414:: with SMTP id z20mr10039816wma.94.1627483895490; Wed, 28 Jul 2021 07:51:35 -0700 (PDT) Received: from abulafia.quesejoda.com ([95.169.238.86]) by smtp.gmail.com with ESMTPSA id d14sm26276wrs.49.2021.07.28.07.51.34 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 28 Jul 2021 07:51:35 -0700 (PDT) Subject: Re: [PATCH 2/2] Backwards jump threader rewrite with ranger. To: Jeff Law , GCC patches Cc: Andrew MacLeod , Richard Biener , Martin Sebor References: <20210628162147.631850-1-aldyh@redhat.com> <20210628162147.631850-3-aldyh@redhat.com> From: Aldy Hernandez Message-ID: Date: Wed, 28 Jul 2021 16:51:33 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0 MIME-Version: 1.0 In-Reply-To: X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-10.5 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, NICE_REPLY_A, RCVD_IN_BARRACUDACENTRAL, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 28 Jul 2021 14:51:43 -0000 On 7/28/21 4:32 PM, Jeff Law wrote: > > > On 7/15/2021 8:57 AM, Aldy Hernandez wrote: >> As mentioned in my previous email, these are some minor changes to the >> previous revision. All I'm changing here is the call into the solver >> to use range_of_expr and range_of_stmt. Everything else remains the >> same. >> >> Tested on x86-64 Linux. >> >> On Mon, Jul 5, 2021 at 5:39 PM Aldy Hernandez wrote: >>> PING. >>> >>> Aldy >>> >>> 0003-Backwards-jump-threader-rewrite-with-ranger.patch >>> >>> From 1774338ddd1f4718884e766aae2fc48b97110c5d Mon Sep 17 00:00:00 2001 >>> From: Aldy Hernandez >>> Date: Tue, 15 Jun 2021 12:32:51 +0200 >>> Subject: [PATCH 3/5] Backwards jump threader rewrite with ranger. >>> >>> This is a rewrite of the backwards threader with a ranger based solver. >>> >>> The code is divided into two parts: the path solver in >>> gimple-range-path.*, and the path discovery bits in >>> tree-ssa-threadbackward.c. >>> >>> The legacy code is still available with --param=threader-mode=legacy, >>> but will be removed shortly after. >>> >>> gcc/ChangeLog: >>> >>> * Makefile.in (tree-ssa-loop-im.o-warn): New. >>> * flag-types.h (enum threader_mode): New. >>> * params.opt: Add entry for --param=threader-mode. >>> * tree-ssa-threadbackward.c (THREADER_ITERATIVE_MODE): New. >>> (class back_threader): New. >>> (back_threader::back_threader): New. >>> (back_threader::~back_threader): New. >>> (back_threader::maybe_register_path): New. >>> (back_threader::find_taken_edge): New. >>> (back_threader::find_taken_edge_switch): New. >>> (back_threader::find_taken_edge_cond): New. >>> (back_threader::resolve_def): New. >>> (back_threader::resolve_phi): New. >>> (back_threader::find_paths_to_names): New. >>> (back_threader::find_paths): New. >>> (dump_path): New. >>> (debug): New. >>> (thread_jumps::find_jump_threads_backwards): Call ranger threader. >>> (thread_jumps::find_jump_threads_backwards_with_ranger): New. >>> (pass_thread_jumps::execute): Abstract out code... >>> (try_thread_blocks): ...here. >>> * tree-ssa-threadedge.c (jump_threader::thread_outgoing_edges): >>> Abstract out threading candidate code to... >>> (single_succ_to_potentially_threadable_block): ...here. >>> * tree-ssa-threadedge.h (single_succ_to_potentially_threadable_block): >>> New. >>> * tree-ssa-threadupdate.c (register_jump_thread): Return boolean. >>> * tree-ssa-threadupdate.h (class jump_thread_path_registry): >>> Return bool from register_jump_thread. >>> >>> libgomp/ChangeLog: >>> >>> * testsuite/libgomp.graphite/force-parallel-4.c: Adjust for >>> threader. >>> * testsuite/libgomp.graphite/force-parallel-8.c: Same. >>> >>> gcc/testsuite/ChangeLog: >>> >>> * g++.dg/debug/dwarf2/deallocator.C: Adjust for threader. >>> * gcc.c-torture/compile/pr83510.c: Same. >>> * gcc.dg/loop-unswitch-2.c: Same. >>> * gcc.dg/old-style-asm-1.c: Same. >>> * gcc.dg/pr68317.c: Same. >>> * gcc.dg/pr97567-2.c: Same. >>> * gcc.dg/predict-9.c: Same. >>> * gcc.dg/shrink-wrap-loop.c: Same. >>> * gcc.dg/sibcall-1.c: Same. >>> * gcc.dg/tree-ssa/builtin-sprintf-3.c: Same. >>> * gcc.dg/tree-ssa/pr21001.c: Same. >>> * gcc.dg/tree-ssa/pr21294.c: Same. >>> * gcc.dg/tree-ssa/pr21417.c: Same. >>> * gcc.dg/tree-ssa/pr21458-2.c: Same. >>> * gcc.dg/tree-ssa/pr21563.c: Same. >>> * gcc.dg/tree-ssa/pr49039.c: Same. >>> * gcc.dg/tree-ssa/pr61839_1.c: Same. >>> * gcc.dg/tree-ssa/pr61839_3.c: Same. >>> * gcc.dg/tree-ssa/pr77445-2.c: Same. >>> * gcc.dg/tree-ssa/split-path-4.c: Same. >>> * gcc.dg/tree-ssa/ssa-dom-thread-11.c: Same. >>> * gcc.dg/tree-ssa/ssa-dom-thread-12.c: Same. >>> * gcc.dg/tree-ssa/ssa-dom-thread-14.c: Same. >>> * gcc.dg/tree-ssa/ssa-dom-thread-18.c: Same. >>> * gcc.dg/tree-ssa/ssa-dom-thread-6.c: Same. >>> * gcc.dg/tree-ssa/ssa-dom-thread-7.c: Same. >>> * gcc.dg/tree-ssa/ssa-fre-48.c: Same. >>> * gcc.dg/tree-ssa/ssa-thread-11.c: Same. >>> * gcc.dg/tree-ssa/ssa-thread-12.c: Same. >>> * gcc.dg/tree-ssa/ssa-thread-14.c: Same. >>> * gcc.dg/tree-ssa/vrp02.c: Same. >>> * gcc.dg/tree-ssa/vrp03.c: Same. >>> * gcc.dg/tree-ssa/vrp05.c: Same. >>> * gcc.dg/tree-ssa/vrp06.c: Same. >>> * gcc.dg/tree-ssa/vrp07.c: Same. >>> * gcc.dg/tree-ssa/vrp09.c: Same. >>> * gcc.dg/tree-ssa/vrp19.c: Same. >>> * gcc.dg/tree-ssa/vrp20.c: Same. >>> * gcc.dg/tree-ssa/vrp33.c: Same. >>> * gcc.dg/uninit-pred-9_b.c: Same. >>> * gcc.dg/vect/bb-slp-16.c: Same. >>> * gcc.target/i386/avx2-vect-aggressive.c: Same. >>> * gcc.dg/tree-ssa/ranger-threader-1.c: New test. >>> * gcc.dg/tree-ssa/ranger-threader-2.c: New test. >>> * gcc.dg/tree-ssa/ranger-threader-3.c: New test. >>> * gcc.dg/tree-ssa/ranger-threader-4.c: New test. >>> * gcc.dg/tree-ssa/ranger-threader-5.c: New test. >>> --- >>> gcc/Makefile.in | 5 + >>> gcc/flag-types.h | 7 + >>> gcc/params.opt | 17 + >>> .../g++.dg/debug/dwarf2/deallocator.C | 3 +- >>> gcc/testsuite/gcc.c-torture/compile/pr83510.c | 33 ++ >>> gcc/testsuite/gcc.dg/loop-unswitch-2.c | 2 +- >>> gcc/testsuite/gcc.dg/old-style-asm-1.c | 5 +- >>> gcc/testsuite/gcc.dg/pr68317.c | 4 +- >>> gcc/testsuite/gcc.dg/pr97567-2.c | 2 +- >>> gcc/testsuite/gcc.dg/predict-9.c | 4 +- >>> gcc/testsuite/gcc.dg/shrink-wrap-loop.c | 53 ++ >>> gcc/testsuite/gcc.dg/sibcall-1.c | 10 + >>> .../gcc.dg/tree-ssa/builtin-sprintf-3.c | 25 +- >>> gcc/testsuite/gcc.dg/tree-ssa/pr21001.c | 1 + >>> gcc/testsuite/gcc.dg/tree-ssa/pr21294.c | 1 + >>> gcc/testsuite/gcc.dg/tree-ssa/pr21417.c | 2 +- >>> gcc/testsuite/gcc.dg/tree-ssa/pr21458-2.c | 2 +- >>> gcc/testsuite/gcc.dg/tree-ssa/pr21563.c | 2 +- >>> gcc/testsuite/gcc.dg/tree-ssa/pr49039.c | 2 +- >>> gcc/testsuite/gcc.dg/tree-ssa/pr61839_1.c | 2 +- >>> gcc/testsuite/gcc.dg/tree-ssa/pr61839_3.c | 2 +- >>> gcc/testsuite/gcc.dg/tree-ssa/pr77445-2.c | 2 +- >>> .../gcc.dg/tree-ssa/ranger-threader-1.c | 20 + >>> .../gcc.dg/tree-ssa/ranger-threader-2.c | 39 ++ >>> .../gcc.dg/tree-ssa/ranger-threader-3.c | 41 ++ >>> .../gcc.dg/tree-ssa/ranger-threader-4.c | 83 +++ >>> .../gcc.dg/tree-ssa/ranger-threader-5.c | 80 +++ >>> gcc/testsuite/gcc.dg/tree-ssa/split-path-4.c | 4 +- >>> .../gcc.dg/tree-ssa/ssa-dom-thread-11.c | 2 +- >>> .../gcc.dg/tree-ssa/ssa-dom-thread-12.c | 2 +- >>> .../gcc.dg/tree-ssa/ssa-dom-thread-14.c | 1 + >>> .../gcc.dg/tree-ssa/ssa-dom-thread-18.c | 5 +- >>> .../gcc.dg/tree-ssa/ssa-dom-thread-6.c | 4 +- >>> .../gcc.dg/tree-ssa/ssa-dom-thread-7.c | 1 + >>> gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-48.c | 2 +- >>> gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-11.c | 1 + >>> gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c | 2 +- >>> gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-14.c | 1 + >>> gcc/testsuite/gcc.dg/tree-ssa/vrp02.c | 2 +- >>> gcc/testsuite/gcc.dg/tree-ssa/vrp03.c | 2 +- >>> gcc/testsuite/gcc.dg/tree-ssa/vrp05.c | 2 +- >>> gcc/testsuite/gcc.dg/tree-ssa/vrp06.c | 2 +- >>> gcc/testsuite/gcc.dg/tree-ssa/vrp07.c | 2 +- >>> gcc/testsuite/gcc.dg/tree-ssa/vrp09.c | 2 +- >>> gcc/testsuite/gcc.dg/tree-ssa/vrp19.c | 2 +- >>> gcc/testsuite/gcc.dg/tree-ssa/vrp20.c | 2 +- >>> gcc/testsuite/gcc.dg/tree-ssa/vrp33.c | 2 +- >>> gcc/testsuite/gcc.dg/uninit-pred-9_b.c | 1 + >>> gcc/testsuite/gcc.dg/vect/bb-slp-16.c | 7 + >>> .../gcc.target/i386/avx2-vect-aggressive.c | 2 +- >>> gcc/tree-ssa-threadbackward.c | 475 +++++++++++++++++- >>> gcc/tree-ssa-threadedge.c | 20 +- >>> gcc/tree-ssa-threadedge.h | 3 +- >>> gcc/tree-ssa-threadupdate.c | 12 +- >>> gcc/tree-ssa-threadupdate.h | 2 +- >>> .../libgomp.graphite/force-parallel-4.c | 1 + >>> .../libgomp.graphite/force-parallel-8.c | 2 + >>> 57 files changed, 963 insertions(+), 54 deletions(-) >>> create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-1.c >>> create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-2.c >>> create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-3.c >>> create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-4.c >>> create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ranger-threader-5.c >>> >>> diff --git a/gcc/Makefile.in b/gcc/Makefile.in >>> index 863f1256811..0e205a41ac3 100644 >>> --- a/gcc/Makefile.in >>> +++ b/gcc/Makefile.in >>> @@ -223,6 +223,11 @@ gimple-match.o-warn = -Wno-unused >>> generic-match.o-warn = -Wno-unused >>> dfp.o-warn = -Wno-strict-aliasing >>> >>> +# maybe_emit_free_warning() is picking up the inlined location for the >>> +# warning, not the source of the original va_heap::release() function >>> +# which has a pragma disabling this warning. >>> +tree-ssa-loop-im.o-warn = -Wno-free-nonheap-object > I think some of Martin's work may help here, but I'm not sure if it's > all gone in yet.  It might be worth syncing with him on the state of the > improvements to how inlining and warnings interact.  If his work does > fix the problem here, this hunk can be removed as a distinct follow-up. Yes. He definitely has some patches in this space that were likely to fix this. I will re-test without this hunk, and remove it if it's fixed. I will also add a few PRs as suggested by Martin to keep track of the XFAILs I introduced. > >>> diff --git a/gcc/params.opt b/gcc/params.opt >>> index 92b003e38cb..f1f47b44215 100644 >>> --- a/gcc/params.opt >>> +++ b/gcc/params.opt >>> @@ -1010,6 +1010,23 @@ Maximum depth of DFS walk used by modref escape analysis. >>> Common Joined UInteger Var(param_modref_max_escape_points) Init(256) Param Optimization >>> Maximum number of escape points tracked by modref per SSA-name. >>> >>> +-param=threader-iterative= >>> +Common Joined UInteger Var(param_threader_iterative) Init(0) Param Optimization >>> +Run backwards threader in iterative mode. > Presumably this is going away?  I thought the iterative mode was just > for debugging/evaluation purposes. I can remove it in the next few months, as I'll probably need it to analyze what to do with the DOM and VRP threaders. But if it really bothers you, I can rip it out now. They're simple enough to keep local. All in all, the non documented iterative mode, and the legacy code will be long gone before stage 1 ends :). > > I only glossed over the testsuite changes. > > God I love how much you've refactored here.  Lots of small, easy to > understand functions. It was the only way to understand what I was doing :). > >>> + >>> +// If any of the incoming edges for a PHI resolves the current path, >>> +// register the path(s), and return TRUE. >>> + >>> +bool >>> +back_threader::resolve_phi (gphi *phi, bitmap interesting) > [ ... ] > It might be useful to indicate somewhere what "resolving the current > path" means.  I kind of figured it out as I was walking through the > patch, but my first thought had it reversed. Will do. > > > diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c >>> index 6ce32644aa5..ea5c37a2c65 100644 >>> --- a/gcc/tree-ssa-threadedge.c >>> +++ b/gcc/tree-ssa-threadedge.c >>> @@ -1335,6 +1335,18 @@ jump_threader::thread_across_edge (edge e) >>> m_avail_exprs_stack->pop_to_marker (); >>> } >>> >>> +/* Return TRUE if BB has a single successor to a block with multiple >>> + incoming and outgoing edges. */ >>> + >>> +bool >>> +single_succ_to_potentially_threadable_block (basic_block bb) >>> +{ >>> + int flags = (EDGE_IGNORE | EDGE_COMPLEX | EDGE_ABNORMAL); >>> + return (single_succ_p (bb) >>> + && (single_succ_edge (bb)->flags & flags) == 0 >>> + && potentially_threadable_block (single_succ (bb))); >>> +} > Note on many occasions I've pondered killing these checks.  The forward > jump threader in particular uses them to narrow its search space.  But > I've regularly found that they're suppressing useful paths.  The worry, > of course, is the compile-time cost. I'll have to put this in my TODO list, because as you mention, even small changes tend to blow up the search space and show a measurable performance degradation. > > > I don't see anything in here that is worrisome.  I'd like to see the > iterating bits disappear, but I'd also understand if you want to keep > them for debugging/evaluation purposes in the immediate term. > > OK for the trunk. Sweeeet! I'll retest, remove the warning suppression if applicable, and follow-up with PRs. Let me know if you're OK removing the legacy mode in the next week or so. I see no need for it. Aldy