public inbox for gcc-patches@gcc.gnu.org
* RFC: allowing fwprop to propagate subregs
From: Richard Sandiford @ 2011-09-14 15:40 UTC
  To: gcc-patches; +Cc: bonzini

At the moment, fwprop will propagate constants and registers
even if no further rtl simplifications are possible:

  if (REG_P (new_rtx) || CONSTANT_P (new_rtx))
    flags |= PR_CAN_APPEAR;

What do you think about extending this to subregs?  The reason for
asking is that on NEON, vector loads like vld4 are represented as a load
of a single monolithic register followed by subreg extractions of each
vector:

  (set (reg:OI FULL) (...))
  (set (reg:V2SI V0) (subreg:V2SI (reg:OI FULL) 0))
  (set (reg:V2SI V1) (subreg:V2SI (reg:OI FULL) 8))
  (set (reg:V2SI V2) (subreg:V2SI (reg:OI FULL) 16))
  (set (reg:V2SI V3) (subreg:V2SI (reg:OI FULL) 24))

Nothing ever propagates these subregs, so the separate moves
survive until IRA.  This has three problems:

  - We generally want V0...V3 to be allocated to the same registers as
    the corresponding parts of FULL, so that the four subreg moves
    become nops.  This often happens in simple examples.  But when
    register pressure is relatively high, the moves can cause IRA to
    spill in cases where it would not spill if the subregs were used
    directly in place of each Vi.

  - Perhaps related, register pressure becomes harder to estimate.

  - These moves can interfere with pre-reload scheduling.
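
To make the intended change to the check concrete, here is a minimal
standalone sketch.  It is only a toy model: the toy_* names, the enum
and the flag value are hypothetical stand-ins, not GCC's real rtx
codes or macros, and it models nothing beyond the single condition
that decides whether PR_CAN_APPEAR is set.

```c
#include <stdbool.h>

/* Toy stand-ins for rtx codes.  These are NOT GCC's definitions.  */
enum toy_code { TOY_REG, TOY_CONST_INT, TOY_SUBREG, TOY_PLUS, TOY_MEM };

/* Hypothetical flag value, chosen only for this sketch.  */
#define TOY_PR_CAN_APPEAR 1

/* Toy analogue of CONSTANT_P: only one constant code is modelled.  */
bool
toy_constant_p (enum toy_code code)
{
  return code == TOY_CONST_INT;
}

/* Current behaviour: only registers and constants may be propagated
   when no further simplification is possible.  */
int
toy_flags_current (enum toy_code code)
{
  int flags = 0;
  if (code == TOY_REG || toy_constant_p (code))
    flags |= TOY_PR_CAN_APPEAR;
  return flags;
}

/* Proposed behaviour: subregs are allowed to appear as well.  */
int
toy_flags_proposed (enum toy_code code)
{
  int flags = 0;
  if (code == TOY_REG || toy_constant_p (code) || code == TOY_SUBREG)
    flags |= TOY_PR_CAN_APPEAR;
  return flags;
}
```

In this model the only code whose result differs between the two
functions is TOY_SUBREG; everything else is treated as before.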

In combination with the MODES_TIEABLE_P patch that I posted here:

    http://gcc.gnu.org/ml/gcc-patches/2011-09/msg00626.html

this patch significantly improves the code generated for several libav
loops.  Unfortunately, I don't have a setup that can do meaningful
performance measurements, but a diff of the before and after
output for libav showed many cases where the patch removed moves.

What do you think?  Alternatives include propagating in lower-subreg,
or maybe only in the second fwprop pass.

Richard


gcc/
	* fwprop.c (propagate_rtx): Also set PR_CAN_APPEAR for subregs.

Index: gcc/fwprop.c
===================================================================
--- gcc/fwprop.c	2011-08-26 09:58:28.829540497 +0100
+++ gcc/fwprop.c	2011-08-26 10:14:03.767707504 +0100
@@ -664,7 +664,7 @@ propagate_rtx (rtx x, enum machine_mode 
     return NULL_RTX;
 
   flags = 0;
-  if (REG_P (new_rtx) || CONSTANT_P (new_rtx))
+  if (REG_P (new_rtx) || CONSTANT_P (new_rtx) || GET_CODE (new_rtx) == SUBREG)
     flags |= PR_CAN_APPEAR;
   if (!for_each_rtx (&new_rtx, varying_mem_p, NULL))
     flags |= PR_HANDLE_MEM;

Thread overview: 14+ messages
2011-09-14 15:40 RFC: allowing fwprop to propagate subregs Richard Sandiford
2011-09-14 15:45 ` H.J. Lu
2011-09-14 16:09   ` Richard Sandiford
2011-09-14 18:28     ` Paolo Bonzini
2012-01-11 16:55 ` Ulrich Weigand
2012-01-11 19:12   ` Paolo Bonzini
2012-01-12  9:57   ` Richard Sandiford
2012-01-12 11:56     ` Richard Kenner
2012-01-16 14:21       ` Ulrich Weigand
2012-01-16 14:32         ` Richard Kenner
2012-01-17 19:25           ` Ulrich Weigand
2012-01-17 19:56             ` Richard Kenner
2012-03-07 17:40               ` [PATCH] Do not handle SUBREG in apply_distributive_law (Re: RFC: allowing fwprop to propagate subregs) Ulrich Weigand
2012-03-12 10:23                 ` Richard Guenther