From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ed1-x536.google.com (mail-ed1-x536.google.com [IPv6:2a00:1450:4864:20::536]) by sourceware.org (Postfix) with ESMTPS id C9A9D385840B; Mon, 21 Feb 2022 07:13:44 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org C9A9D385840B Received: by mail-ed1-x536.google.com with SMTP id u18so27334159edt.6; Sun, 20 Feb 2022 23:13:44 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=7U9mgr1ykJ3Wm9xGlTk98rGBCXnDG/pvrc436Hr43aA=; b=CCdshjThibgPQCXyxSEG2AtvLZONbbJuajSOGmIM0Nt5K8UCMXOyestWBSu0PDUF+D 0fKOf7DwN3bXt3iVGfkG7lnOmRjuVwsTFKDGQW3lpos+e4/GKxRYjP3nsGuFCpH7MyEZ f1jBMSMT9ahDzgglFgH812NTPqceKlknlCSpju9bEI+/JV4dUezstSexw+qvRc7cJdl2 Yl7xKCWT1TAcsWMrGBxISZv5K2CGxPJ1dT84GRXQmA6QVBAbbYWhhNblY3kqJwbX4X28 U/hAe3iy/uuNd5TlLUC61vJkcwwHVDWjMfcnIKo9i44kh5BwUUkWp9EhIOhrdVU5rBvm gp5g== X-Gm-Message-State: AOAM530pP7iDlRqT5wGXQndPT1JPmNzkn56THV9UGNFfQxyZBmvmSX19 C40vjcCXojNhc5GSsRaH80BOA4jSHcrRPjAbkFM= X-Google-Smtp-Source: ABdhPJww/8VfCcjvZA9Eiu/dt+QCmbsYfVPfaWdQ61K35cLA9xEPqWPtpirEhgKZRvyRJMF7YwMMWjwBFZLC6Kvq+OE= X-Received: by 2002:a05:6402:2748:b0:408:c807:8db7 with SMTP id z8-20020a056402274800b00408c8078db7mr19784350edd.4.1645427623412; Sun, 20 Feb 2022 23:13:43 -0800 (PST) MIME-Version: 1.0 References: <74d2af86-c1c4-0842-b319-09e76b3bf633@gmail.com> <4bf89538-8c3c-59a7-378b-d02982e6a5ea@gmail.com> <03226f85-0334-1840-94f9-6c618175f974@gmail.com> In-Reply-To: From: Richard Biener Date: Mon, 21 Feb 2022 08:13:32 +0100 Message-ID: Subject: Re: [PR103302] skip multi-word pre-move clobber during lra To: Alexandre Oliva Cc: Jeff Law , Vladimir Makarov , zsojka@seznam.cz, GCC Patches , wilson@gcc.gnu.org Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-8.2 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 21 Feb 2022 07:13:47 -0000 On Sat, Feb 19, 2022 at 12:28 AM Alexandre Oliva via Gcc-patches wrote: > > On Dec 15, 2021, Jeff Law wrote: > > >> * expr.c (emit_move_complex_parts): Skip clobbers during lra. > > OK for the next cycle. > > Thanks, but having looked into PR 104121, I withdraw this patch and also > the already-installed patch for PR 103302. As I found out, LRA does > worse without the clobbers for multi-word moves, not only because the > clobbers shorten live ranges and enable earlier and better allocations, > but also because find_reload_regno_insns implicitly, possibly > unknowingly, relied on the clobbers to avoid the risk of an infinite > loop. > > As noted in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104121#c11 with > the clobber, a multi-word reload, and the insn the reload applies to, we > get 4 insns, so find_reload_regno_insns avoids the loop. Without the > clobber, a multi-word reload for either input or output makes for n==3, > so we enter the loop and don't ever exit it: we'll find first_insn > (input) or second_insn (output), but then we'll loop forever because we > won't iterate again on {prev,next}_insn, respectively, and the other > iterator won't find the other word reload. We advance the other till > the end, but that's not enough for us to terminate the loop. > > With the proposed patch reversal, we no longer hit the problem with the > v850 testcase in 104121, but I'm concerned we might still get an > infinite loop on ports whose input or output reloads might emit a pair > of insns without a clobber. > > I see two ways to robustify it. One is to find a complete reload > sequence: > > diff --git a/gcc/lra-assigns.cc b/gcc/lra-assigns.cc > index c1d40ea2a14bd..ff1688917cbce 100644 > --- a/gcc/lra-assigns.cc > +++ b/gcc/lra-assigns.cc > @@ -1716,9 +1716,18 @@ find_reload_regno_insns (int regno, rtx_insn * &start, rtx_insn * &finish) > start_insn = lra_insn_recog_data[uid]->insn; > n++; > } > - /* For reload pseudo we should have at most 3 insns referring for > - it: input/output reload insns and the original insn. */ > - if (n > 3) > + /* For reload pseudo we should have at most 3 (sequences of) insns > + referring for it: input/output reload insn sequences and the > + original insn. Each reload insn sequence may amount to multiple > + insns, but we expect to find each of them contiguous, one before > + start_insn, one after it. We know start_insn is between the > + sequences because it's the lowest-numbered insn, thus the first > + we'll have found above. The reload insns, emitted later, will > + have been assigned higher insn uids. If this assumption doesn't > + hold, and there happen to be intervening reload insns for other > + pseudos, we may end up returning false after searching to the end > + in the wrong direction. */ > + if (n > 1 + 2 * CEIL (lra_reg_info[regno].biggest_mode, UNITS_PER_WORD)) > return false; > if (n > 1) > { > @@ -1726,26 +1735,52 @@ find_reload_regno_insns (int regno, rtx_insn * &start, rtx_insn * &finish) > next_insn = NEXT_INSN (start_insn); > n != 1 && (prev_insn != NULL || next_insn != NULL); ) > { > - if (prev_insn != NULL && first_insn == NULL) > + if (prev_insn != NULL) > { > if (! bitmap_bit_p (&lra_reg_info[regno].insn_bitmap, > INSN_UID (prev_insn))) > prev_insn = PREV_INSN (prev_insn); > else > { > - first_insn = prev_insn; > - n--; > + /* A reload sequence may have multiple insns, but > + they must be contiguous. */ > + do > + { > + first_insn = prev_insn; > + n--; > + prev_insn = PREV_INSN (prev_insn); > + } > + while (n > 1 && prev_insn > + && bitmap_bit_p (&lra_reg_info[regno].insn_bitmap, > + INSN_UID (prev_insn))); > + /* After finding first_insn, we don't want to search > + backward any more, so set prev_insn to NULL so as > + to not loop indefinitely. */ > + prev_insn = NULL; > } > } > - if (next_insn != NULL && second_insn == NULL) > + else if (next_insn != NULL) > { > if (! bitmap_bit_p (&lra_reg_info[regno].insn_bitmap, > INSN_UID (next_insn))) > next_insn = NEXT_INSN (next_insn); > else > { > - second_insn = next_insn; > - n--; > + /* A reload sequence may have multiple insns, but > + they must be contiguous. */ > + do > + { > + second_insn = next_insn; > + n--; > + next_insn = NEXT_INSN (next_insn); > + } > + while (n > 1 && next_insn > + && bitmap_bit_p (&lra_reg_info[regno].insn_bitmap, > + INSN_UID (next_insn))); > + /* After finding second_insn, we don't want to > + search forward any more, so set next_insn to NULL > + so as to not loop indefinitely. */ > + next_insn = NULL; > } > } > } > > > the other is to just prevent the infinite loop, that will then return > false because n > 1 after the loop ends: > > diff --git a/gcc/lra-assigns.cc b/gcc/lra-assigns.cc > index c1d40ea2a14bd..efd5f764a37a5 100644 > --- a/gcc/lra-assigns.cc > +++ b/gcc/lra-assigns.cc > @@ -1726,7 +1726,7 @@ find_reload_regno_insns (int regno, rtx_insn * &start, rtx_insn * &finish) > next_insn = NEXT_INSN (start_insn); > n != 1 && (prev_insn != NULL || next_insn != NULL); ) > { > - if (prev_insn != NULL && first_insn == NULL) > + if (prev_insn != NULL) > { > if (! bitmap_bit_p (&lra_reg_info[regno].insn_bitmap, > INSN_UID (prev_insn))) > @@ -1735,9 +1735,10 @@ find_reload_regno_insns (int regno, rtx_insn * &start, rtx_insn * &finish) > { > first_insn = prev_insn; > n--; > + prev_insn = NULL; > } > } > - if (next_insn != NULL && second_insn == NULL) > + if (next_insn != NULL) > { > if (! bitmap_bit_p (&lra_reg_info[regno].insn_bitmap, > INSN_UID (next_insn))) > @@ -1746,6 +1747,7 @@ find_reload_regno_insns (int regno, rtx_insn * &start, rtx_insn * &finish) > { > second_insn = next_insn; > n--; > + next_insn = NULL; > } > } > } > > > When it comes to the v850 testcase, one of them just moves the infinite > loop to lra(), as we never get past while(fails_p); the other hits > lra-assigns.cc:lra_assign (bool&)'s > > if (flag_checking > && (lra_assignment_iter_after_spill > > LRA_MAX_ASSIGNMENT_ITERATION_NUMBER)) > internal_error > ("maximum number of LRA assignment passes is achieved (%d)", > LRA_MAX_ASSIGNMENT_ITERATION_NUMBER); > > which would loop indefinitely too without flag_checking. > > Neither solves the v850 problem, only restoring the clobber does, > because then, with shorter live ranges, allocation succeeds for the > reloads, and we don't even try to split -> spill their pseudos. > > Would any of these patchlets make sense to pursue regardless? > > Ok to revert commit r12-5852-g50e8b0c9bca6cdc57804f860ec5311b641753fbb OK. Please re-open the bug as appropriate. > I'm going to get back to the drawing board as to pr103302, since the > problem there will likely resurface, possibly also on v850. > > > diff --git a/gcc/expr.cc b/gcc/expr.cc > index 63a4aa838dec0..b6ed54983fabf 100644 > --- a/gcc/expr.cc > +++ b/gcc/expr.cc > @@ -3929,7 +3929,7 @@ emit_move_multi_word (machine_mode mode, rtx x, rtx y) > hard regs shouldn't appear here except as return values. > We never want to emit such a clobber after reload. */ > if (x != y > - && ! (lra_in_progress || reload_in_progress || reload_completed) > + && ! (reload_in_progress || reload_completed) > && need_clobber != 0) > emit_clobber (x); > > > > -- > Alexandre Oliva, happy hacker https://FSFLA.org/blogs/lxo/ > Free Software Activist GNU Toolchain Engineer > Disinformation flourishes because many people care deeply about injustice > but very few check the facts. Ask me about