public inbox for gcc-patches@gcc.gnu.org
From: Alan Modra <amodra@gmail.com>
To: Segher Boessenkool <segher@kernel.crashing.org>
Cc: gcc-patches@sourceware.org
Subject: Re: [RS6000] rs6000_rtx_costs reduce cost for SETs
Date: Fri, 18 Sep 2020 13:08:42 +0930
Message-ID: <20200918033842.GT5452@bubble.grove.modra.org>
In-Reply-To: <20200917175125.GJ28786@gate.crashing.org>

On Thu, Sep 17, 2020 at 12:51:25PM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Tue, Sep 15, 2020 at 10:49:45AM +0930, Alan Modra wrote:
> > Also use rs6000_cost only for speed.
> 
> More directly: use something completely different for !speed, namely,
> code size.

Yes, that might be better.

> > -      if (CONST_INT_P (XEXP (x, 1))
> > -	  && satisfies_constraint_I (XEXP (x, 1)))
> > +      if (!speed)
> > +	/* A little more than one insn so that nothing is tempted to
> > +	   turn a shift left into a multiply.  */
> > +	*total = COSTS_N_INSNS (1) + 1;
> 
> Please don't.  We have a lot of code elsewhere to handle this directly,
> already.  Also, this is just wrong for size.  Five shifts is *not*
> better than four muls.  If that is the only way to get good results,
> > then unfortunately we probably have to; but do not do this without any
> proof.

Huh.  If a cost of 5 is "just wrong for size" then would you prefer a
cost of 12, for example (the power9 mulsi or muldi rs6000_cost)?
Noticing that !speed rs6000_rtx_costs returned such values is the
entire basis for the !speed changes.  I don't have any proof that this
is correct.
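
To make the arithmetic behind that exchange concrete, here is a
minimal sketch, not code from the patch.  COSTS_N_INSNS is defined as
in GCC's rtl.h; mult_size_cost, shift_cost and sequence_cost are
hypothetical helpers, and a shift costing exactly one insn is an
assumption made for illustration.

```c
#include <assert.h>

/* Same definition as GCC's rtl.h: costs are in quarter-insn units.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Cost proposed in the patch for a multiply when !speed: one insn
   plus a small bias.  */
int
mult_size_cost (void)
{
  return COSTS_N_INSNS (1) + 1;
}

/* Assumption for illustration: a shift costs exactly one insn.  */
int
shift_cost (void)
{
  return COSTS_N_INSNS (1);
}

/* Hypothetical helper: total cost of COUNT identical operations.  */
int
sequence_cost (int count, int per_op)
{
  assert (count >= 0 && per_op >= 0);
  return count * per_op;
}
```

With these numbers, sequence_cost (5, shift_cost ()) and
sequence_cost (4, mult_size_cost ()) both come to 20, so five shifts
are priced the same as four multiplies.  At the speed cost of 12,
which is COSTS_N_INSNS (3), four multiplies would be priced at 48.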

> >      case FMA:
> > -      if (mode == SFmode)
> > +      if (!speed)
> > +	*total = COSTS_N_INSNS (1) + 1;
> 
> Not here, either.
> 
> >      case DIV:
> >      case MOD:
> >        if (FLOAT_MODE_P (mode))
> >  	{
> > -	  *total = mode == DFmode ? rs6000_cost->ddiv
> > -				  : rs6000_cost->sdiv;
> > +	  if (!speed)
> > +	    *total = COSTS_N_INSNS (1) + 2;
> 
> And why + 2 even?
> 
> > -	  if (GET_MODE (XEXP (x, 1)) == DImode)
> > +	  if (!speed)
> > +	    *total = COSTS_N_INSNS (1) + 2;
> > +	  else if (GET_MODE (XEXP (x, 1)) == DImode)
> >  	    *total = rs6000_cost->divdi;
> >  	  else
> >  	    *total = rs6000_cost->divsi;
> 
> (more)

OK, I can remove all the !speed changes.  To be honest, I didn't look
anywhere near as much at code size changes as I worried about
performance.  And about not regressing any fiddly testcase we have.

> > @@ -21368,6 +21378,7 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code,
> >        return false;
> >  
> >      case AND:
> > +      *total = COSTS_N_INSNS (1);
> >        right = XEXP (x, 1);
> >        if (CONST_INT_P (right))
> >  	{
> > @@ -21380,15 +21391,15 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code,
> >  	       || left_code == LSHIFTRT)
> >  	      && rs6000_is_valid_shift_mask (right, left, mode))
> >  	    {
> > -	      *total = rtx_cost (XEXP (left, 0), mode, left_code, 0, speed);
> > -	      if (!CONST_INT_P (XEXP (left, 1)))
> > -		*total += rtx_cost (XEXP (left, 1), SImode, left_code, 1, speed);
> > -	      *total += COSTS_N_INSNS (1);
> > +	      rtx reg_op = XEXP (left, 0);
> > +	      if (!REG_P (reg_op))
> > +		*total += rtx_cost (reg_op, mode, left_code, 0, speed);
> > +	      reg_op = XEXP (left, 1);
> > +	      if (!REG_P (reg_op) && !CONST_INT_P (reg_op))
> > +		*total += rtx_cost (reg_op, mode, left_code, 1, speed);
> >  	      return true;
> >  	    }
> >  	}
> > -
> > -      *total = COSTS_N_INSNS (1);
> >        return false;
> 
> This doesn't improve anything?  It just makes it different from all
> surrounding code?

So it moves the common COSTS_N_INSNS (1) count and doesn't recurse for
regs, like it doesn't for const_int.

> > @@ -21519,7 +21530,9 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code,
> >        if (outer_code == TRUNCATE
> >  	  && GET_CODE (XEXP (x, 0)) == MULT)
> >  	{
> > -	  if (mode == DImode)
> > +	  if (!speed)
> > +	    *total = COSTS_N_INSNS (1) + 1;
> 
> (more)
> 
> > +    case SET:
> > +      /* The default cost of a SET is the number of general purpose
> > +	 regs being set multiplied by COSTS_N_INSNS (1).  That only
> > +	 works where the incremental cost of the operation and
> > +	 operands is zero, when the operation performed can be done in
> > +	 one instruction.  For other cases where we add COSTS_N_INSNS
> > +	 for some operation (see point 5 above), COSTS_N_INSNS (1)
> > +	 should be subtracted from the total cost.  */
> 
> What does "incremental cost" mean there?  If what increases?
> 
> > +      {
> > +	rtx_code src_code = GET_CODE (SET_SRC (x));
> > +	if (src_code == CONST_INT
> > +	    || src_code == CONST_DOUBLE
> > +	    || src_code == CONST_WIDE_INT)
> > +	  return false;
> > +	int set_cost = (rtx_cost (SET_SRC (x), mode, SET, 1, speed)
> > +			+ rtx_cost (SET_DEST (x), mode, SET, 0, speed));
> 
> This should use set_src_cost, if anything.  But that will recurse then,
> currently?  Ugh.
> 
> Using rtx_cost for SET_DEST is problematic, too.
> 
> What (if anything!) calls this for a SET?  Oh, set_rtx_cost still does
> that, hrm.
> 
> rtx_cost costs RTL *expressions*.  Not instructions.  Except where it is
> (ab)used for that, sigh.
> 
> Many targets have something for it already, but all quite different from
> this.

Right, you are starting to understand just how difficult it is to do
anything at all to rs6000_rtx_costs.

> > +	if (set_cost >= COSTS_N_INSNS (1))
> > +	  *total += set_cost - COSTS_N_INSNS (1);
> 
> I don't understand this part at all, for example.  Why not just
>   *total += set_cost - COSTS_N_INSNS (1);
> ?  If set_cost is lower than one insn's cost, don't we have a problem
> already?

The set_cost I calculate here from src and dest can easily be zero.
(set (reg) (reg)) and (set (reg) (const_int 0)) for example have a
dest cost of zero and a src cost of zero.  That can't change without
breaking places where rtx_costs is called to compare pieces of RTL.
Here though we happen to be looking at a SET, so have an entire
instruction.  The value returned should be comparable to our
instruction costs.  That's tricky to do, and this change is just a
hack.  Without the hack I saw some testcases regress.
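
As a rough model of why the guard is there (a sketch only, not the
patch itself): adjusted_set_cost is a hypothetical function, and it
assumes that the total passed in already holds the default SET cost
of COSTS_N_INSNS (1) for a single register, with operand_cost standing
for the summed src and dest costs.

```c
#include <assert.h>

/* Same definition as GCC's rtl.h.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Hypothetical model of the SET case.  TOTAL is the default SET cost
   on entry; OPERAND_COST is the src cost plus the dest cost.  One
   insn is subtracted only when the operands already account for at
   least one insn, so a zero operand cost leaves TOTAL unchanged.  */
int
adjusted_set_cost (int total, int operand_cost)
{
  assert (total >= 0 && operand_cost >= 0);
  if (operand_cost >= COSTS_N_INSNS (1))
    total += operand_cost - COSTS_N_INSNS (1);
  return total;
}
```

In this model an operand cost of zero, as for (set (reg) (reg)), and
an operand cost of COSTS_N_INSNS (1), as assumed here for a plus of
two regs, both give a total of one insn.  Without the guard the
zero-cost case would come out as COSTS_N_INSNS (1) + 0 -
COSTS_N_INSNS (1), that is zero.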

I don't like this hack any more than you do reviewing it!

> 
> Generic things.  Please split this patch up when sending it again, it
> does too many different things, and many of those are not obvious.
> 
> All such changes that aren't completely obvious (like the previous ones
> were) should have some measurement.  We are in stage1, and we will
> notice (non-trivial) degradations, but if we can expect degradations
> (like for this patch), it needs benchmarking.

Pat did benchmark these changes.  I was somewhat surprised to see
a small improvement in spec results.

> Since you add !speed all over the place, maybe we should just have a
> separate function that does !speed?  It looks like quite a few things
> will simplify.

Revised patch as follows.

	* config/rs6000/rs6000.c (rs6000_rtx_costs): Reduce cost for SETs
	when insn operation cost handled on recursive call.  Tidy
	break/return.  Tidy AND costing.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 6af8a9a31cb..26c2f443502 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -21397,7 +21397,7 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code,
 	*total = rs6000_cost->fp;
       else
 	*total = rs6000_cost->dmul;
-      break;
+      return false;
 
     case DIV:
     case MOD:
@@ -21457,6 +21457,7 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code,
       return false;
 
     case AND:
+      *total = COSTS_N_INSNS (1);
       right = XEXP (x, 1);
       if (CONST_INT_P (right))
 	{
@@ -21469,15 +21470,15 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code,
 	       || left_code == LSHIFTRT)
 	      && rs6000_is_valid_shift_mask (right, left, mode))
 	    {
-	      *total = rtx_cost (XEXP (left, 0), mode, left_code, 0, speed);
-	      if (!CONST_INT_P (XEXP (left, 1)))
-		*total += rtx_cost (XEXP (left, 1), SImode, left_code, 1, speed);
-	      *total += COSTS_N_INSNS (1);
+	      rtx reg_op = XEXP (left, 0);
+	      if (!REG_P (reg_op))
+		*total += rtx_cost (reg_op, mode, left_code, 0, speed);
+	      reg_op = XEXP (left, 1);
+	      if (!REG_P (reg_op) && !CONST_INT_P (reg_op))
+		*total += rtx_cost (reg_op, mode, left_code, 1, speed);
 	      return true;
 	    }
 	}
-
-      *total = COSTS_N_INSNS (1);
       return false;
 
     case IOR:
@@ -21575,7 +21576,7 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code,
 	  *total = rs6000_cost->fp;
 	  return false;
 	}
-      break;
+      return false;
 
     case NE:
     case EQ:
@@ -21613,13 +21614,40 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code,
 	  *total = 0;
 	  return true;
 	}
-      break;
+      return false;
+
+    case SET:
+      /* The default cost of a SET is the number of general purpose
+	 regs being set multiplied by COSTS_N_INSNS (1).  Here we
+	 happen to be looking at a SET, so have an instruction rather
+	 than just a piece of RTL and want to return a cost comparable
+	 to rs6000 instruction costing.  That's a little complicated
+	 because in some cases the cost of SET operands is non-zero,
+	 see point 5 above and cost of PLUS for example, and in
+	 others it is zero, for example for (set (reg) (reg)).
+	 But (set (reg) (reg)) actually costs the same as
+	 (set (reg) (plus (reg) (reg))).  Hack around this by
+	 subtracting COSTS_N_INSNS (1) from the operand cost in cases
+	 where we add COSTS_N_INSNS (1) for some operation.  Don't do
+	 so for constants that might cost more than zero because they
+	 don't fit in one instruction.  FIXME: rtx_costs should not be
+	 looking at entire instructions.  */
+      {
+	rtx_code src_code = GET_CODE (SET_SRC (x));
+	if (src_code == CONST_INT
+	    || src_code == CONST_DOUBLE
+	    || src_code == CONST_WIDE_INT)
+	  return false;
+	int set_cost = (rtx_cost (SET_SRC (x), mode, SET, 1, speed)
+			+ rtx_cost (SET_DEST (x), mode, SET, 0, speed));
+	if (set_cost >= COSTS_N_INSNS (1))
+	  *total += set_cost - COSTS_N_INSNS (1);
+	return true;
+      }
 
     default:
-      break;
+      return false;
     }
-
-  return false;
 }
 
 /* Debug form of r6000_rtx_costs that is selected if -mdebug=cost.  */


-- 
Alan Modra
Australia Development Lab, IBM
