From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTPS id 790423858C2D for ; Tue, 16 Aug 2022 14:30:06 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 790423858C2D Received: from mail-il1-f200.google.com (mail-il1-f200.google.com [209.85.166.200]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-568-ioVcHNYcO4WlDkcezNpOZw-1; Tue, 16 Aug 2022 10:30:05 -0400 X-MC-Unique: ioVcHNYcO4WlDkcezNpOZw-1 Received: by mail-il1-f200.google.com with SMTP id d6-20020a056e020be600b002dcc7977592so7113403ilu.17 for ; Tue, 16 Aug 2022 07:30:05 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:in-reply-to:from:references:cc:to :content-language:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc; bh=0/Lwtn72mc6LU2le6SijcXTO+PuVBEjVb1urUxjdsu0=; b=BjaMhxIv+n2wMWgPTqaXmmYTlmyYzKc5ki4r6sEN+nfO2vs9Mt+nGS6sNMn2aV9ARD r7VLf6O9ikbF2ZnDDGSht6pAb7uN/BZsv61fiXGk149npoH0g/EE3AXXODRA+Y3uLXAU xAjqBIRFZyQX2kK+seBs2cVtnMfvXdDdSZAoC7/+wTZ0/3QaHOFeK4DRbxx4OuFPhshw VWjS0pyD+87GypaZCINcDmbaV8CVNNzUGVoztiYm0bEeWpAasrw++FFBa+C6IS5Jq1XL qix8ZluhEnlZPDCufggC2F5rkB+C3MCdhaw5BwYCTIlXxplx0QdG/O23Ls2htZcyrF/8 C2eg== X-Gm-Message-State: ACgBeo24qfNa6bQHt6NFXDbKKQJetdV8btz7p4xQC7wZhajDTF+XD083 wNPB0y07XGj9bU8kOeWZMYYKpMjzjQ92kozcsYfxkKENt6FdBPGNfqn9MQRd91LYVrArg6UiGh7 djthFQPHI74E9c24Qsw== X-Received: by 2002:a05:6e02:1aa2:b0:2e5:c066:7e4f with SMTP id l2-20020a056e021aa200b002e5c0667e4fmr4442586ilv.20.1660660204416; Tue, 16 Aug 2022 07:30:04 -0700 (PDT) X-Google-Smtp-Source: AA6agR4jji8RFSDBpiJEE+fCgtasxk41tysozT8Pvtw72JJgXdntzKGjz+y1rCTrhbV9WFLyb/OZvg== X-Received: by 2002:a05:6e02:1aa2:b0:2e5:c066:7e4f with SMTP id l2-20020a056e021aa200b002e5c0667e4fmr4442573ilv.20.1660660204132; Tue, 16 Aug 2022 07:30:04 -0700 (PDT) Received: from ?IPV6:2607:fea8:a263:f600::8850? ([2607:fea8:a263:f600::8850]) by smtp.gmail.com with ESMTPSA id n30-20020a02a19e000000b00331c58086d8sm3688416jah.147.2022.08.16.07.30.03 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 16 Aug 2022 07:30:03 -0700 (PDT) Message-ID: <4c846dc9-0044-35d2-dd9a-87ec93dd8d75@redhat.com> Date: Tue, 16 Aug 2022 10:30:02 -0400 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.11.0 Subject: Re: [PATCH] Support threading of just the exit edge To: Richard Biener , Aldy Hernandez Cc: Jeff Law , gcc-patches References: <20220812120145.91C6513AAE@imap2.suse-dmz.suse.de> <3b6265b1-a0ab-f0b7-2d1f-aaf6277eec14@redhat.com> From: Andrew MacLeod In-Reply-To: X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Language: en-US Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-9.9 required=5.0 tests=BAYES_00, BODY_8BITS, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, NICE_REPLY_A, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 16 Aug 2022 14:30:11 -0000 On 8/16/22 05:18, Richard Biener wrote: > On Mon, 15 Aug 2022, Aldy Hernandez wrote: > >> On Mon, Aug 15, 2022 at 9:24 PM Andrew MacLeod wrote: >>> heh. or just >>> >>> >>> + int_range<2> r; >>> + if (!fold_range (r, const_cast (cond_stmt)) >>> + || !r.singleton_p (&val)) >>> >>> >>> if you do not provide a range_query to any of the fold_using_range code, >>> it defaults to: >>> >>> fur_source::fur_source (range_query *q) >>> { >>> if (q) >>> m_query = q; >>> else if (cfun) >>> m_query = get_range_query (cfun); >>> else >>> m_query = get_global_range_query (); >>> m_gori = NULL; >>> } >>> >> Sweet. Even better! > So when I do the following incremental change ontop of the posted > patch then I see that the path query is able to simplify more > "single BB paths" than the global range folding. > > diff --git a/gcc/tree-ssa-threadbackward.cc > b/gcc/tree-ssa-threadbackward.cc > index 669098e4ec3..777e778037f 100644 > --- a/gcc/tree-ssa-threadbackward.cc > +++ b/gcc/tree-ssa-threadbackward.cc > @@ -314,6 +314,12 @@ back_threader::find_taken_edge_cond (const > vec &path, > { > int_range_max r; > > + int_range<2> rf; > + if (path.length () == 1) > + { > + fold_range (rf, cond); > + } > + > m_solver->compute_ranges (path, m_imports); > m_solver->range_of_stmt (r, cond); > > @@ -325,6 +331,8 @@ back_threader::find_taken_edge_cond (const > vec &path, > > if (r == true_range || r == false_range) > { > + if (path.length () == 1) > + gcc_assert (r == rf); > edge e_true, e_false; > basic_block bb = gimple_bb (cond); > extract_true_false_edges_from_block (bb, &e_true, &e_false); > > Even doing the following (not sure what's the difference and in > particular expense over the path range query) results in missed > simplifications (checking my set of cc1files). > > diff --git a/gcc/tree-ssa-threadbackward.cc > b/gcc/tree-ssa-threadbackward.cc > index 669098e4ec3..1d43a179d08 100644 > --- a/gcc/tree-ssa-threadbackward.cc > +++ b/gcc/tree-ssa-threadbackward.cc > @@ -99,6 +99,7 @@ private: > > back_threader_registry m_registry; > back_threader_profitability m_profit; > + gimple_ranger *m_ranger; > path_range_query *m_solver; > > // Current path being analyzed. > @@ -146,12 +147,14 @@ back_threader::back_threader (function *fun, > unsigned flags, bool first) > // The path solver needs EDGE_DFS_BACK in resolving mode. > if (flags & BT_RESOLVE) > mark_dfs_back_edges (); > - m_solver = new path_range_query (flags & BT_RESOLVE); > + m_ranger = new gimple_ranger; > + m_solver = new path_range_query (flags & BT_RESOLVE, m_ranger); > } > > back_threader::~back_threader () > { > delete m_solver; > + delete m_ranger; > > loop_optimizer_finalize (); > } > @@ -314,6 +317,12 @@ back_threader::find_taken_edge_cond (const > vec &path, > { > int_range_max r; > > + int_range<2> rf; > + if (path.length () == 1) > + { > + fold_range (rf, cond, m_ranger); > + } > + > m_solver->compute_ranges (path, m_imports); > m_solver->range_of_stmt (r, cond); > > @@ -325,6 +334,8 @@ back_threader::find_taken_edge_cond (const > vec &path, > > if (r == true_range || r == false_range) > { > + if (path.length () == 1) > + gcc_assert (r == rf); > edge e_true, e_false; > basic_block bb = gimple_bb (cond); > extract_true_false_edges_from_block (bb, &e_true, &e_false); > > one example is > > [local count: 14414059]: > _128 = node_177(D)->typed.type; > pretmp_413 = MEM[(const union tree_node *)_128].base.code; > _431 = pretmp_413 + 65519; > if (_128 == 0B) > goto ; [18.09%] > else > goto ; [81.91%] > > where m_imports for the path is just _128 and the range computed is > false while the ranger query returns VARYING. But > path_range_query::range_defined_in_block does > > if (bb && POINTER_TYPE_P (TREE_TYPE (name))) > m_ranger->m_cache.m_exit.maybe_adjust_range (r, name, bb); This is the coarse grained "side effect applies somewhere in the block" mechanism.  There is no understanding of where in the block it happens. > > which adjusts the range to ~[0, 0], probably because of the > dereference in the following stmt. > > Why does fold_range not do this when folding the exit test? Is there > a way to make it do so? It looks like the only routine using this > in gimple-range.cc is range_on_edge and there it's used for e->src > after calling range_on_exit for e->src (why's it not done in > range_on_exit?). A testcase for this is Fold_range doesnt do this because it is simply another statement.  It makes no attempt to understand the context in which you are folding something. you could be folding that stmt from a different location (ie recomputing)   If your context is that you are looking for the range after the last statement has been executed, then one needs to check to see if there are any side effects. ranger uses it for range_on_edge (), because  it knows all the statements in the block have been executed, and its safe to apply anything seen in the block.  It does it right after range_on_exit() is called internally. Once upon a time, it was integrated with range-on-exit, but it turned out there were certain times that it was causing problems. There have been some cleanups since then, it probably safe now to return that call to range_on_exit.. but doesnt buy us a whole lot by itself.. except of course I have now OKd using range_on-entry/exit generally :-) the cache also uses it when walking blocks to pick up inferred values during an on-entry cache fill. > int foo (int **p, int i) > { > int *q = *p; > int res = *q + i; > if (q) > return res; > return -1; > } > > which we "thread" with the path and with the above ICEs because > fold_range doesn't get that if (q) is always true. Without the Its a known limitation that, unless you are doing a walk, on-demand requests will "miss" some inferred ranges, because they are only maintained at the basic block level.  (ie, we will know that q is non-null in BB2,  but we don't know where, so we can make no assumptions at the exit condition about q in this case. the path_query is invoked in on-demand mode because it wont walk the entire IL,. so the first time you ask for the range of an ssa-name, it will quickly zip over the immediate use list and "fill" the on-exit structure for any blocks which a non-null reference is seen.  This allows the threader to pick up non-null from blocks outside the path that havent been examined. VRP does a walk, and while during the walk, adjusts ranges on the fly for the current block via the gimple_ranger::register_inferred_ranges () routine.  which is really just a wrapper around ranger_cache::apply_inferred_ranges ()  (in gimple-range-cache.cc) This is called after every statement and is where we take care of bookkeeping for adjusting values, and adding them to the blocks list. if the path query is walking those statement, it could also "adjust" the range of q on the fly... but it has to have walked those statements.  from that routine, the relevant bits use the gimple-infer class to find any inferred ranges from the statement, and would look something like:   gimple_infer_range infer(s);   for (unsigned x = 0; x < infer.num (); x++)     {       tree name = infer.name (x);       if (!interesting_p (name))          continue;       get_current_path_range (r, name);       if (r.intersect (infer.range (x)))         set_current_path_range (name, r);     } That would adjust the value of q to be non-zero after   "int res = *q + i;" but you need to have walked the statement before you get to the condition.   as long as they are all in your list of interesting statement to look at, then you'd be golden. I dont know if, when, or what direction things are examined. Andrew