From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ed1-x529.google.com (mail-ed1-x529.google.com [IPv6:2a00:1450:4864:20::529]) by sourceware.org (Postfix) with ESMTPS id D6D463858D3C for ; Fri, 19 Nov 2021 10:00:46 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org D6D463858D3C Received: by mail-ed1-x529.google.com with SMTP id y12so40376913eda.12 for ; Fri, 19 Nov 2021 02:00:46 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=Ehm7w5A17DF8uRjtB1Lx4cLyocPBXwlfdNe9updIcCc=; b=d8TXdfqwyzr2kpzgFp1tm4UZ9eso9+jFzRgIN7oM9lN1+bUx8Pzwndv+gFwDzQ4o+k qFUhCIfXlheAyPbgmPwwbgVOiREAn4dlbLtiML2BUYcWLjXrMb2FCNJYefp5Y+BjPlFM DYqbyvK24pLHTqRGmEZQ1VynU6qAtGdyBInACsfN53H9mK61aaaRnMzcj8CnPb81/SpG 36vd5ZShtjkPrvYKC+zeocLtxgzVnl7WmZKlHtrWwVm6/99TjMx+374XuLeC/Wo8uoym RqhN1kW2mrl73n/YtLj8rND8OjhehWACpt5gDjmgeeSZfk6Aqt4tHs2xonprvBHfcSS6 RZkw== X-Gm-Message-State: AOAM5332BL8XPWnI3U/EaN6dKsffNAusDK16RBCes+x2KrZQGjefp9su 8/Y/UjKQ67rU1u03wj5ozq2sCc8QYmRlKBdDv3M= X-Google-Smtp-Source: ABdhPJz0IYEOrkAZqU4r8WN4BwADWhrxIsiKHop1KfyxlObwncYxV5v7MgbGW0dhM6U4UTlx51u7Oz8luDe7+oakLLY= X-Received: by 2002:a05:6402:4394:: with SMTP id o20mr22659946edc.342.1637316045906; Fri, 19 Nov 2021 02:00:45 -0800 (PST) MIME-Version: 1.0 References: <7791e850-f74d-8c1c-f67c-e02f3f6e007d@redhat.com> <5c6c91d4-ed8b-8d98-2cd9-bafc84e6f2a4@suse.cz> <8da24825-19ec-56a6-a68c-5c37c7acc3e1@redhat.com> <59763e1a-8432-5f23-c399-a9b4dd6c6dff@suse.cz> In-Reply-To: From: Richard Biener Date: Fri, 19 Nov 2021 11:00:35 +0100 Message-ID: Subject: Re: [PATCH] Loop unswitching: support gswitch statements. To: =?UTF-8?Q?Martin_Li=C5=A1ka?= Cc: Andrew MacLeod , GCC Patches Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-2.4 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 19 Nov 2021 10:00:48 -0000 On Tue, Nov 16, 2021 at 3:40 PM Martin Li=C5=A1ka wrote: > > On 11/11/21 08:15, Richard Biener wrote: > > So I'd try to do no functional change first, improving the costing and > > setting up the transform to simply pick up the stmts to "fold" as disco= vered > > during analysis (as I hinted you possibly can use gimple_uid to mark > > the stmts that simplify, IIRC gimple_uid is preserved during copying. > > gimple_uid would also scale better than gimple_plf in case we do > > the analysis for all candidates at once). > > Thinking about the analysis. Am I correct that we want to properly calcul= ate > loop size for true and false edge of a potential gcond before the actuall= y unswitching? Yes. > We can do that by finding a first gcond candidate, evaluate (symbolic + i= range approache) > all other gcond in the loop body and use BB_REACHABLE discovery. Similarl= y to what we do now > at lines 378-446. Then tree_num_loop_insns can be adjusted for only these= reachable blocks. > Having that, we can calculate # of insns that will live in true/false loo= ps. So whatever we do here we should record as "this control stmt folds to {true,false}" (or {true,unknown}, or in future, "this control stmt will lead to edge {e,unknown}"), recording the simplification on the true/false loop version in a way we can apply it after the transform= . > Then we can call tree_unswitch_loop and make the gcond folding as we do i= n the versioned loops. > > Is it a step in good direction? Having that we can then extend it to gswi= tch statements. One issue I see is that BB_REACHABLE is there only once but you could use auto_bb_flag reachable_true, reachable_false to distinguish the true/false loop version copies. So yes, I think that sounds reasonable. At the point we want to evaluate different (first) unswitching opportunities against each other storing this only as BB flag is likely to hit limits. When we want to evaluate multiple levels of unswitching before doing any transforms even more so (if there are 3 opportunities there'd be many cases to be considered when going to level 3 ;)). I _think_ that a sp= arse lattice of stmt UID -> edge might do the trick if we change tree_num_loop_i= nsns do to a DFS walk from the loop header, ignoring not taken edges by consulting the lattice. Or, for speed reason, pre-compute tree_num_loop_insns for each BB so we just have to sum a different set of BBs rather than walking all stmts again. That said, the second step would definitely be to choose the "best" opportu= nity on the current level. Richard. > Cheers, > Martin