From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ed1-x52c.google.com (mail-ed1-x52c.google.com [IPv6:2a00:1450:4864:20::52c]) by sourceware.org (Postfix) with ESMTPS id 34F123858400 for ; Wed, 5 Jan 2022 12:34:57 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 34F123858400 Received: by mail-ed1-x52c.google.com with SMTP id f5so161436401edq.6 for ; Wed, 05 Jan 2022 04:34:57 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=WSY6DvyPpPsIK8K+KbfRYmZN+Y8AjWG+WK+jYuypH3w=; b=MX+sa8B1fUXxtg5WNUMcavgES7nbx0R4lev/13uDAX3raM3ccyqWSkLc8nBZxWgF9J 0SUztIoNy1wbdaa6bTqojvuv6T3ncjsffIL/qg0L156jrW6XD3VoAVdRC73gC2nR49qo INF6IoVEiU4paUQ31wl7YgELYGqiw2wh+h+W/bIpeb88Jo5GtUwy2db5f/fi2OKN897+ wbIeyiMng1XhhKDG92yCfW+dLvReLo5xxdERe6yDSUcKn1hBytN1QT4S54mhMpKsDbhr al2qoaISnwKFLLnnnaBkPgndbzX6b1biwIL0XKr+PU1BmPEXCkKGfPKWvNJfjJNUpeNu XFlQ== X-Gm-Message-State: AOAM5317XlnBXl5ZmcERthl74A++LQM6CrWp5p/Ezcl8vS8tMtBNax7/ dztZrwPbpdbGCPzOQz1oNP5ZJC6mmzPtUMlLjbk= X-Google-Smtp-Source: ABdhPJy1fXfJOTqpVYCNAsaXrlLt0JytUzrXQzevllfjKoAAgVrIIKUzK1maKEdiTW071pyz6TvLwnxALRluRvnr6LU= X-Received: by 2002:aa7:d64e:: with SMTP id v14mr53628652edr.194.1641386095930; Wed, 05 Jan 2022 04:34:55 -0800 (PST) MIME-Version: 1.0 References: <0db1d9e8-f097-e766-a9fa-1a98c47b8115@suse.cz> <3a07ef98-d05f-dc07-2e36-a2b4ffd52936@suse.cz> <7bcc368c-3f26-4503-aec1-a3d6378e33ec@suse.cz> <561a3ffd-8973-d771-418f-76c484085cc5@suse.cz> <20265d97-6350-c234-695d-bc18e2e617b4@suse.cz> In-Reply-To: From: Richard Biener Date: Wed, 5 Jan 2022 13:34:44 +0100 Message-ID: Subject: Re: [PATCH] Loop unswitching: support gswitch statements. To: =?UTF-8?Q?Martin_Li=C5=A1ka?= Cc: GCC Patches , Andrew MacLeod Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-2.2 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 05 Jan 2022 12:34:59 -0000 On Thu, Dec 9, 2021 at 2:02 PM Martin Li=C5=A1ka wrote: > > On 11/30/21 12:17, Richard Biener wrote: > > I'd like to see the gswitch support - that's what was posted before sta= ge3 > > close, this patch on its own doesn't seem worth pushing for. That said= , > > I have some comments below (and the already raised ones about how > > things might need to change with gswitch support). Is it so difficult = to > > develop gswitch support as a separate change ontop of this? > > Hello. > > Took me some time, but I have a working version of the patch that makes b= oth > refactoring of the costing model and adds support for gswitch. For quite = some > time, I maintained 2 patches, but as commonly happens, I was forced doing= quite > some rework. If we really want a separation for bisection purpose, I sugg= est simple > disabling of gswitch support? It was really meant to ease review. I'm now looking at the combined patch, comments will follow. +static void +clean_up_after_unswitching (const auto_edge_flag &ignored_edge_flag) +{ + basic_block bb; + ... + /* Clean up the ignored_edge_flag from edges. */ + FOR_EACH_BB_FN (bb, cfun) + { + edge e; you can probably clean up outgoing edge flags in the loop above? (I'm randomly jumping) + /* Build compound expression for all cases leading + to this edge. */ + tree expr =3D NULL_TREE; + for (unsigned i =3D 1; i < gimple_switch_num_labels (stmt); += +i) + { + tree lab =3D gimple_switch_label (stmt, i); + basic_block dest =3D label_to_block (cfun, CASE_LABEL (la= b)); + edge e2 =3D find_edge (gimple_bb (stmt), dest); + if (e =3D=3D e2) just as a style note I prefer if (e !=3D e2) continue; so the following cod= e needs less indentation + tree cmp1 =3D build2 (GE_EXPR, boolean_type_node,= idx, + CASE_LOW (lab)); is there a reason you are not using fold_build2? Do we want to somehow acc= ount for the built expression size or maybe have a separate limit for the number= of cases we want to combine this way? + unswitch_predicate *predicate + =3D new unswitch_predicate (expr, idx, edge_index); + ranger->gori ().outgoing_edge_range_p (predicate->true_range, e, + idx, *get_global_range_query ()); + /* Huge switches are not supported by Ranger. */ + if (predicate->true_range.undefined_p ()) I hope ranger will set the range to varying_p () in that case, not undefined? But even then, is that a reason for punting? I guess we fail to prune cases in that case but the cost modeling should then account for those and thus we are at least consistent? + { + delete predicate; + return; In this context, when we do + if (operand_equal_p (predicate->lhs, last_predicate->lhs, 0)) + { + irange &other =3D (true_edge ? predicate->merged_true_range + : predicate->merged_false_range); + last_predicate->merged_true_range.intersect (other); + last_predicate->merged_false_range.intersect (other); + return; ranger may be conservative when intersecting (and hopefully not end up with undefined_p - heh). I also am confused about intersecting both merged_true_range and merged_false_range with 'other'? I would have expected to merge true edge info with true edge info and thus only "swap" things somehow? OTOH "path" suggests we're dealing with more than one edge and associated ranges? Maybe expanding the comment on the predicate_vector would be useful. AFAIR we there store the sequence of unswitchings done with pairs of the predicate unswitched on and a bool indicating whether we're dealing with the copy that had the predicate true or the one that had it false. + unswitch_predicate *predicate =3D NULL; + basic_block bb =3D NULL; + if (num > param_max_unswitch_level) + { + if (dump_enabled_p ()) + dump_printf_loc (MSG_MISSED_OPTIMIZATION, loc, + "Not unswitching anymore, hit max level\n"); + goto exit; + } I'll notice that given we have the overall size budget limiting the number of unswitchings itself is probably unnecessary (as noted above we might need to account for the size of the unswitch condition). + for (unsigned i =3D 0; i !=3D loop->num_nodes; i++) + { + vec &predicates =3D get_predicates_for_bb (bbs= [i]); + if (!predicates.is_empty ()) + { same comment about indenting I wonder whether evaluate_control_stmt_using_entry_checks could set ignored_edge_flag itself instead of communicating via a hash_set? It's not exactly clear what we use pred->handled for, do we re-discover and re-try predicates when unswitching another level otherwise? But we _do_ want to unswitch the same switch stmt again, no? And since the BB predicate is keyed on the switch stmt that wouldn't work? I wonder whether on recursive invocations of tree_unswitch_single_loop we'd want to skip over unreachable BBs in the loop when looking for candidates (when we compute BB predicates we skip over them but that's done before all cases). Ah, the BB predicates are a vector, I wonder why we not simply remove the predicate from the BB predicate vector when we decide to unswitch on it and thus append it to a predicate path? Can you please amend the toplevel file comment as to the new flow of things? For the switch unswitching testcases I wonder if we can add dump scanning to verify we do not end up with duplicate cases across the loop versions? Overall it looks reasonable but I hope for some simplification still. I al= so think it's something for stage1 now :/ After addressing comments above can you put the patch on a branch so I can play & fixup things a bit when I run into them? Often it's easier to just do a change instead of writing up a review comment and expecting that to be point-on ;) Thanks, Richard. > > Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > SPEC 2006 and 2017 survive running -O3 -march=3Dnative. There are 2090 op= timization notes > for SPEC 2006 (compared to 1945 before the patch). > > Thoughts? > Thanks, > Martin