From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <oliva@adacore.com>
Received: from rock.gnat.com (rock.gnat.com
 [IPv6:2620:20:4000:0:a9e:1ff:fe9b:1d1])
 by sourceware.org (Postfix) with ESMTPS id D3AF83858402;
 Fri, 18 Feb 2022 23:27:37 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org D3AF83858402
Received: from localhost (localhost.localdomain [127.0.0.1])
 by filtered-rock.gnat.com (Postfix) with ESMTP id F19F3116113;
 Fri, 18 Feb 2022 18:27:35 -0500 (EST)
X-Virus-Scanned: Debian amavisd-new at gnat.com
X-Amavis-Alert: BAD HEADER SECTION, Duplicate header field: "Cc"
Received: from rock.gnat.com ([127.0.0.1])
 by localhost (rock.gnat.com [127.0.0.1]) (amavisd-new, port 10024)
 with LMTP id UEOM9kCiu14s; Fri, 18 Feb 2022 18:27:35 -0500 (EST)
Received: from free.home (tron.gnat.com
 [IPv6:2620:20:4000:0:46a8:42ff:fe0e:e294])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by rock.gnat.com (Postfix) with ESMTPS id 21B54116107;
 Fri, 18 Feb 2022 18:27:34 -0500 (EST)
Received: from livre (livre.home [172.31.160.2])
 by free.home (8.15.2/8.15.2) with ESMTPS id 21INRL29581636
 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT);
 Fri, 18 Feb 2022 20:27:25 -0300
From: Alexandre Oliva <oliva@adacore.com>
To: Jeff Law <jeffreyalaw@gmail.com>, Vladimir Makarov <vmakarov@redhat.com>
Cc: gcc-patches@gcc.gnu.org, zsojka@seznam.cz, wilson@gcc.gnu.org
Cc: Sebastian Huber <sebastian.huber@embedded-brains.de>
Subject: Re: [PR103302] skip multi-word pre-move clobber during lra
Organization: Free thinker, does not speak for AdaCore
References: <orfsr38xn8.fsf@lxoliva.fsfla.org>
 <74d2af86-c1c4-0842-b319-09e76b3bf633@gmail.com>
 <orpmq6773m.fsf@lxoliva.fsfla.org>
 <4bf89538-8c3c-59a7-378b-d02982e6a5ea@gmail.com>
 <or8rwm1dmu.fsf@lxoliva.fsfla.org>
 <03226f85-0334-1840-94f9-6c618175f974@gmail.com>
Errors-To: aoliva@lxoliva.fsfla.org
Date: Fri, 18 Feb 2022 20:27:21 -0300
In-Reply-To: <03226f85-0334-1840-94f9-6c618175f974@gmail.com> (Jeff Law's
 message of "Wed, 15 Dec 2021 09:00:01 -0700")
Message-ID: <orbkz37o2u.fsf@lxoliva.fsfla.org>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.2 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-Scanned-By: MIMEDefang 2.84
X-Spam-Status: No, score=-12.2 required=5.0 tests=BAYES_00, GIT_PATCH_0,
 KAM_DMARC_STATUS, KAM_SHORT, SPF_HELO_NONE, SPF_PASS, TXREP,
 T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.4
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Fri, 18 Feb 2022 23:27:40 -0000

On Dec 15, 2021, Jeff Law <jeffreyalaw@gmail.com> wrote:

>> * expr.c (emit_move_complex_parts): Skip clobbers during lra.
> OK for the next cycle.

Thanks, but having looked into PR 104121, I withdraw this patch and also
the already-installed patch for PR 103302.  As I found out, LRA does
worse without the clobbers for multi-word moves, not only because the
clobbers shorten live ranges and enable earlier and better allocations,
but also because find_reload_regno_insns implicitly, possibly
unknowingly, relied on the clobbers to avoid the risk of an infinite
loop.

As noted in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104121#c11 with
the clobber, a multi-word reload, and the insn the reload applies to, we
get 4 insns, so find_reload_regno_insns avoids the loop.  Without the
clobber, a multi-word reload for either input or output makes for n==3,
so we enter the loop and don't ever exit it: we'll find first_insn
(input) or second_insn (output), but then we'll loop forever because we
won't iterate again on {prev,next}_insn, respectively, and the other
iterator won't find the other word reload.  We advance the other till
the end, but that's not enough for us to terminate the loop.

With the proposed patch reversal, we no longer hit the problem with the
v850 testcase in 104121, but I'm concerned we might still get an
infinite loop on ports whose input or output reloads might emit a pair
of insns without a clobber.

I see two ways to robustify it.  One is to find a complete reload
sequence:

diff --git a/gcc/lra-assigns.cc b/gcc/lra-assigns.cc
index c1d40ea2a14bd..ff1688917cbce 100644
--- a/gcc/lra-assigns.cc
+++ b/gcc/lra-assigns.cc
@@ -1716,9 +1716,18 @@ find_reload_regno_insns (int regno, rtx_insn * &start, rtx_insn * &finish)
 	start_insn = lra_insn_recog_data[uid]->insn;
       n++;
     }
-  /* For reload pseudo we should have at most 3 insns referring for
-     it: input/output reload insns and the original insn.  */
-  if (n > 3)
+  /* For reload pseudo we should have at most 3 (sequences of) insns
+     referring for it: input/output reload insn sequences and the
+     original insn.  Each reload insn sequence may amount to multiple
+     insns, but we expect to find each of them contiguous, one before
+     start_insn, one after it.  We know start_insn is between the
+     sequences because it's the lowest-numbered insn, thus the first
+     we'll have found above.  The reload insns, emitted later, will
+     have been assigned higher insn uids.  If this assumption doesn't
+     hold, and there happen to be intervening reload insns for other
+     pseudos, we may end up returning false after searching to the end
+     in the wrong direction.  */
+  if (n > 1 + 2 * CEIL (lra_reg_info[regno].biggest_mode, UNITS_PER_WORD))
     return false;
   if (n > 1)
     {
@@ -1726,26 +1735,52 @@ find_reload_regno_insns (int regno, rtx_insn * &start, rtx_insn * &finish)
 	     next_insn = NEXT_INSN (start_insn);
 	   n != 1 && (prev_insn != NULL || next_insn != NULL); )
 	{
-	  if (prev_insn != NULL && first_insn == NULL)
+	  if (prev_insn != NULL)
 	    {
 	      if (! bitmap_bit_p (&lra_reg_info[regno].insn_bitmap,
 				  INSN_UID (prev_insn)))
 		prev_insn = PREV_INSN (prev_insn);
 	      else
 		{
-		  first_insn = prev_insn;
-		  n--;
+		  /* A reload sequence may have multiple insns, but
+		     they must be contiguous.  */
+		  do
+		    {
+		      first_insn = prev_insn;
+		      n--;
+		      prev_insn = PREV_INSN (prev_insn);
+		    }
+		  while (n > 1 && prev_insn
+			 && bitmap_bit_p (&lra_reg_info[regno].insn_bitmap,
+					  INSN_UID (prev_insn)));
+		  /* After finding first_insn, we don't want to search
+		     backward any more, so set prev_insn to NULL so as
+		     to not loop indefinitely.  */
+		  prev_insn = NULL;
 		}
 	    }
-	  if (next_insn != NULL && second_insn == NULL)
+	  else if (next_insn != NULL)
 	    {
 	      if (! bitmap_bit_p (&lra_reg_info[regno].insn_bitmap,
 				INSN_UID (next_insn)))
 		next_insn = NEXT_INSN (next_insn);
 	      else
 		{
-		  second_insn = next_insn;
-		  n--;
+		  /* A reload sequence may have multiple insns, but
+		     they must be contiguous.  */
+		  do
+		    {
+		      second_insn = next_insn;
+		      n--;
+		      next_insn = NEXT_INSN (next_insn);
+		    }
+		  while (n > 1 && next_insn
+			 && bitmap_bit_p (&lra_reg_info[regno].insn_bitmap,
+					  INSN_UID (next_insn)));
+		  /* After finding second_insn, we don't want to
+		     search forward any more, so set next_insn to NULL
+		     so as to not loop indefinitely.  */
+		  next_insn = NULL;
 		}
 	    }
 	}


the other is to just prevent the infinite loop, that will then return
false because n > 1 after the loop ends:

diff --git a/gcc/lra-assigns.cc b/gcc/lra-assigns.cc
index c1d40ea2a14bd..efd5f764a37a5 100644
--- a/gcc/lra-assigns.cc
+++ b/gcc/lra-assigns.cc
@@ -1726,7 +1726,7 @@ find_reload_regno_insns (int regno, rtx_insn * &start, rtx_insn * &finish)
 	     next_insn = NEXT_INSN (start_insn);
 	   n != 1 && (prev_insn != NULL || next_insn != NULL); )
 	{
-	  if (prev_insn != NULL && first_insn == NULL)
+	  if (prev_insn != NULL)
 	    {
 	      if (! bitmap_bit_p (&lra_reg_info[regno].insn_bitmap,
 				  INSN_UID (prev_insn)))
@@ -1735,9 +1735,10 @@ find_reload_regno_insns (int regno, rtx_insn * &start, rtx_insn * &finish)
 		{
 		  first_insn = prev_insn;
 		  n--;
+		  prev_insn = NULL;
 		}
 	    }
-	  if (next_insn != NULL && second_insn == NULL)
+	  if (next_insn != NULL)
 	    {
 	      if (! bitmap_bit_p (&lra_reg_info[regno].insn_bitmap,
 				INSN_UID (next_insn)))
@@ -1746,6 +1747,7 @@ find_reload_regno_insns (int regno, rtx_insn * &start, rtx_insn * &finish)
 		{
 		  second_insn = next_insn;
 		  n--;
+		  next_insn = NULL;
 		}
 	    }
 	}


When it comes to the v850 testcase, one of them just moves the infinite
loop to lra(), as we never get past while(fails_p); the other hits
lra-assigns.cc:lra_assign (bool&)'s

  if (flag_checking
      && (lra_assignment_iter_after_spill
	  > LRA_MAX_ASSIGNMENT_ITERATION_NUMBER))
    internal_error
      ("maximum number of LRA assignment passes is achieved (%d)",
       LRA_MAX_ASSIGNMENT_ITERATION_NUMBER);

which would loop indefinitely too without flag_checking.

Neither solves the v850 problem, only restoring the clobber does,
because then, with shorter live ranges, allocation succeeds for the
reloads, and we don't even try to split -> spill their pseudos.

Would any of these patchlets make sense to pursue regardless?

Ok to revert commit r12-5852-g50e8b0c9bca6cdc57804f860ec5311b641753fbb

I'm going to get back to the drawing board as to pr103302, since the
problem there will likely resurface, possibly also on v850.


diff --git a/gcc/expr.cc b/gcc/expr.cc
index 63a4aa838dec0..b6ed54983fabf 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -3929,7 +3929,7 @@ emit_move_multi_word (machine_mode mode, rtx x, rtx y)
      hard regs shouldn't appear here except as return values.
      We never want to emit such a clobber after reload.  */
   if (x != y
-      && ! (lra_in_progress || reload_in_progress || reload_completed)
+      && ! (reload_in_progress || reload_completed)
       && need_clobber != 0)
     emit_clobber (x);
 

-- 
Alexandre Oliva, happy hacker                https://FSFLA.org/blogs/lxo/
   Free Software Activist                       GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about <https://stallmansupport.org>