From: Alan Modra <amodra@gmail.com>
To: Segher Boessenkool <segher@kernel.crashing.org>
Cc: gcc-patches@sourceware.org
Subject: Re: [RS6000] rs6000_rtx_costs reduce cost for SETs
Date: Fri, 18 Sep 2020 13:08:42 +0930
Message-ID: <20200918033842.GT5452@bubble.grove.modra.org>
In-Reply-To: <20200917175125.GJ28786@gate.crashing.org>
On Thu, Sep 17, 2020 at 12:51:25PM -0500, Segher Boessenkool wrote:
> Hi!
>
> On Tue, Sep 15, 2020 at 10:49:45AM +0930, Alan Modra wrote:
> > Also use rs6000_cost only for speed.
>
> More directly: use something completely different for !speed, namely,
> code size.
Yes, that might be better.
> > - if (CONST_INT_P (XEXP (x, 1))
> > - && satisfies_constraint_I (XEXP (x, 1)))
> > + if (!speed)
> > + /* A little more than one insn so that nothing is tempted to
> > + turn a shift left into a multiply. */
> > + *total = COSTS_N_INSNS (1) + 1;
>
> Please don't. We have a lot of code elsewhere to handle this directly,
> already. Also, this is just wrong for size. Five shifts is *not*
> better than four muls. If that is the only way to get good results,
> then unfortunately we probably have to; but do not do this without any
> proof.
Huh. If a cost of 5 is "just wrong for size" then you prefer a cost
of 12 for example (power9 mulsi or muldi rs6000_cost)? Noticing that
result for !speed rs6000_rtx_costs is the entire basis for the !speed
changes. I don't have any proof that this is correct.
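To make the 5-versus-12 comparison concrete, here is a minimal sketch of the arithmetic. It assumes the usual scaling COSTS_N_INSNS (N) == (N) * 4 from GCC's rtl.h. The helper names are hypothetical and exist only for this illustration; the value 12 is the power9 multiply cost quoted above, written here as COSTS_N_INSNS (3).

```c
/* Sketch only: same scaling as COSTS_N_INSNS in GCC's rtl.h.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Hypothetical helper: the !speed cost the first version of the
   patch assigned to a multiply, i.e. one insn plus a small bias.  */
static int
proposed_size_mult_cost (void)
{
  return COSTS_N_INSNS (1) + 1;
}

/* Hypothetical helper: the speed-tuned multiply cost mentioned in
   the message (12), which is three insns' worth under this scaling.  */
static int
power9_mult_cost (void)
{
  return COSTS_N_INSNS (3);
}
```

Under these assumptions the proposed size cost is 5 units and the speed-tuned cost is 12 units, which is the gap the paragraph above is asking about.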
> > case FMA:
> > - if (mode == SFmode)
> > + if (!speed)
> > + *total = COSTS_N_INSNS (1) + 1;
>
> Not here, either.
>
> > case DIV:
> > case MOD:
> > if (FLOAT_MODE_P (mode))
> > {
> > - *total = mode == DFmode ? rs6000_cost->ddiv
> > - : rs6000_cost->sdiv;
> > + if (!speed)
> > + *total = COSTS_N_INSNS (1) + 2;
>
> And why + 2 even?
>
> > - if (GET_MODE (XEXP (x, 1)) == DImode)
> > + if (!speed)
> > + *total = COSTS_N_INSNS (1) + 2;
> > + else if (GET_MODE (XEXP (x, 1)) == DImode)
> > *total = rs6000_cost->divdi;
> > else
> > *total = rs6000_cost->divsi;
>
> (more)
OK, I can remove all the !speed changes. To be honest, I didn't look
anywhere near as much at code size changes as I worried about
performance. And about not regressing any fiddly testcase we have.
> > @@ -21368,6 +21378,7 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code,
> > return false;
> >
> > case AND:
> > + *total = COSTS_N_INSNS (1);
> > right = XEXP (x, 1);
> > if (CONST_INT_P (right))
> > {
> > @@ -21380,15 +21391,15 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code,
> > || left_code == LSHIFTRT)
> > && rs6000_is_valid_shift_mask (right, left, mode))
> > {
> > - *total = rtx_cost (XEXP (left, 0), mode, left_code, 0, speed);
> > - if (!CONST_INT_P (XEXP (left, 1)))
> > - *total += rtx_cost (XEXP (left, 1), SImode, left_code, 1, speed);
> > - *total += COSTS_N_INSNS (1);
> > + rtx reg_op = XEXP (left, 0);
> > + if (!REG_P (reg_op))
> > + *total += rtx_cost (reg_op, mode, left_code, 0, speed);
> > + reg_op = XEXP (left, 1);
> > + if (!REG_P (reg_op) && !CONST_INT_P (reg_op))
> > + *total += rtx_cost (reg_op, mode, left_code, 1, speed);
> > return true;
> > }
> > }
> > -
> > - *total = COSTS_N_INSNS (1);
> > return false;
>
> This doesn't improve anything? It just makes it different from all
> surrounding code?
So it moves the common COSTS_N_INSNS (1) count and doesn't recurse for
regs, like it doesn't for const_int.
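The accumulation pattern can be sketched with a toy model. This is not GCC code: toy_code, toy_operand and toy_and_shift_mask_cost are made-up names, and the cost field stands in for whatever a recursive rtx_cost call would return. COSTS_N_INSNS is assumed to scale by 4 as in rtl.h.

```c
/* Toy model, not GCC code: just enough structure to show how the
   revised AND case accumulates cost.  */
#define COSTS_N_INSNS(N) ((N) * 4)

enum toy_code { TOY_REG, TOY_CONST_INT, TOY_OTHER };

struct toy_operand
{
  enum toy_code code;
  int cost;	/* Stand-in for the result of a recursive rtx_cost call.  */
};

/* Cost of (and (shift OP0 OP1) mask) when the mask is a valid shift
   mask: start from one insn, then add operand costs only for operands
   that are not plain registers (and, for the shift count, not
   constants either).  */
static int
toy_and_shift_mask_cost (struct toy_operand op0, struct toy_operand op1)
{
  int total = COSTS_N_INSNS (1);
  if (op0.code != TOY_REG)
    total += op0.cost;
  if (op1.code != TOY_REG && op1.code != TOY_CONST_INT)
    total += op1.cost;
  return total;
}
```

With a register shifted by a constant the model returns exactly one insn; only a non-register operand adds anything on top.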
> > @@ -21519,7 +21530,9 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code,
> > if (outer_code == TRUNCATE
> > && GET_CODE (XEXP (x, 0)) == MULT)
> > {
> > - if (mode == DImode)
> > + if (!speed)
> > + *total = COSTS_N_INSNS (1) + 1;
>
> (more)
>
> > + case SET:
> > + /* The default cost of a SET is the number of general purpose
> > + regs being set multiplied by COSTS_N_INSNS (1). That only
> > + works where the incremental cost of the operation and
> > + operands is zero, when the operation performed can be done in
> > + one instruction. For other cases where we add COSTS_N_INSNS
> > + for some operation (see point 5 above), COSTS_N_INSNS (1)
> > + should be subtracted from the total cost. */
>
> What does "incremental cost" mean there? If what increases?
>
> > + {
> > + rtx_code src_code = GET_CODE (SET_SRC (x));
> > + if (src_code == CONST_INT
> > + || src_code == CONST_DOUBLE
> > + || src_code == CONST_WIDE_INT)
> > + return false;
> > + int set_cost = (rtx_cost (SET_SRC (x), mode, SET, 1, speed)
> > + + rtx_cost (SET_DEST (x), mode, SET, 0, speed));
>
> This should use set_src_cost, if anything. But that will recurse then,
> currently? Ugh.
>
> Using rtx_cost for SET_DEST is problematic, too.
>
> What (if anything!) calls this for a SET? Oh, set_rtx_cost still does
> that, hrm.
>
> rtx_cost costs RTL *expressions*. Not instructions. Except where it is
> (ab)used for that, sigh.
>
> Many targets have something for it already, but all quite different from
> this.
Right, you are starting to understand just how difficult it is to do
anything at all to rs6000_rtx_costs.
> > + if (set_cost >= COSTS_N_INSNS (1))
> > + *total += set_cost - COSTS_N_INSNS (1);
>
> I don't understand this part at all, for example. Why not just
> *total += set_cost - COSTS_N_INSNS (1);
> ? If set_cost is lower than one insn's cost, don't we have a problem
> already?
The set_cost I calculate here from src and dest can easily be zero.
(set (reg) (reg)) and (set (reg) (const_int 0)) for example have a
dest cost of zero and a src cost of zero. That can't change without
breaking places where rtx_costs is called to compare pieces of RTL.
Here though we happen to be looking at a SET, so have an entire
instruction. The value returned should be comparable to our
instruction costs. That's tricky to do, and this change is just a
hack. Without the hack I saw some testcases regress.
I don't like this hack any more than you do reviewing it!
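A minimal sketch of what the guard does, under two assumptions: COSTS_N_INSNS (N) is (N) * 4 as in rtl.h, and *total already holds the default SET cost on entry. The function toy_set_total is a hypothetical stand-in for the two lines of the patch, not part of rs6000.c.

```c
/* Sketch only: same scaling as COSTS_N_INSNS in GCC's rtl.h.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Hypothetical helper.  DEFAULT_TOTAL is the cost already in place
   for the SET; SET_COST is the summed cost of SET_SRC and SET_DEST.
   One insn's worth is subtracted only when the operands contributed
   at least one insn, so a zero operand cost leaves the default
   untouched instead of driving the total below one insn.  */
static int
toy_set_total (int default_total, int set_cost)
{
  if (set_cost >= COSTS_N_INSNS (1))
    return default_total + set_cost - COSTS_N_INSNS (1);
  return default_total;
}
```

With a default of COSTS_N_INSNS (1), an operand cost of zero, as for (set (reg) (reg)), and an operand cost of one insn, as for a single-insn source operation, both come out at one insn. Without the guard the first case would drop to zero.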
>
> Generic things. Please split this patch up when sending it again, it
> does too many different things, and many of those are not obvious.
>
> All such changes that aren't completely obvious (like the previous ones
> were) should have some measurement. We are in stage1, and we will
> notice (non-trivial) degradations, but if we can expect degradations
> (like for this patch), it needs benchmarking.
Pat did benchmark these changes. I was somewhat surprised to see
a small improvement in spec results.
> Since you add !speed all over the place, maybe we should just have a
> separate function that does !speed? It looks like quite a few things
> will simplify.
Revised patch as follows.
* config/rs6000/rs6000.c (rs6000_rtx_costs): Reduce cost for SETs
when insn operation cost handled on recursive call. Tidy
break/return. Tidy AND costing.
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 6af8a9a31cb..26c2f443502 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -21397,7 +21397,7 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code,
*total = rs6000_cost->fp;
else
*total = rs6000_cost->dmul;
- break;
+ return false;
case DIV:
case MOD:
@@ -21457,6 +21457,7 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code,
return false;
case AND:
+ *total = COSTS_N_INSNS (1);
right = XEXP (x, 1);
if (CONST_INT_P (right))
{
@@ -21469,15 +21470,15 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code,
|| left_code == LSHIFTRT)
&& rs6000_is_valid_shift_mask (right, left, mode))
{
- *total = rtx_cost (XEXP (left, 0), mode, left_code, 0, speed);
- if (!CONST_INT_P (XEXP (left, 1)))
- *total += rtx_cost (XEXP (left, 1), SImode, left_code, 1, speed);
- *total += COSTS_N_INSNS (1);
+ rtx reg_op = XEXP (left, 0);
+ if (!REG_P (reg_op))
+ *total += rtx_cost (reg_op, mode, left_code, 0, speed);
+ reg_op = XEXP (left, 1);
+ if (!REG_P (reg_op) && !CONST_INT_P (reg_op))
+ *total += rtx_cost (reg_op, mode, left_code, 1, speed);
return true;
}
}
-
- *total = COSTS_N_INSNS (1);
return false;
case IOR:
@@ -21575,7 +21576,7 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code,
*total = rs6000_cost->fp;
return false;
}
- break;
+ return false;
case NE:
case EQ:
@@ -21613,13 +21614,40 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code,
*total = 0;
return true;
}
- break;
+ return false;
+
+ case SET:
+ /* The default cost of a SET is the number of general purpose
+ regs being set multiplied by COSTS_N_INSNS (1). Here we
+ happen to be looking at a SET, so have an instruction rather
+ than just a piece of RTL and want to return a cost comparable
+ to rs6000 instruction costing. That's a little complicated
+ because in some cases the cost of SET operands is non-zero,
+ see point 5 above and cost of PLUS for example, and in
+ others it is zero, for example for (set (reg) (reg)).
+ But (set (reg) (reg)) actually costs the same as
+ (set (reg) (plus (reg) (reg))). Hack around this by
+ subtracting COSTS_N_INSNS (1) from the operand cost in cases
+ where we add COSTS_N_INSNS (1) for some operation. Don't do
+ so for constants that might cost more than zero because they
+ don't fit in one instruction. FIXME: rtx_costs should not be
+ looking at entire instructions. */
+ {
+ rtx_code src_code = GET_CODE (SET_SRC (x));
+ if (src_code == CONST_INT
+ || src_code == CONST_DOUBLE
+ || src_code == CONST_WIDE_INT)
+ return false;
+ int set_cost = (rtx_cost (SET_SRC (x), mode, SET, 1, speed)
+ + rtx_cost (SET_DEST (x), mode, SET, 0, speed));
+ if (set_cost >= COSTS_N_INSNS (1))
+ *total += set_cost - COSTS_N_INSNS (1);
+ return true;
+ }
default:
- break;
+ return false;
}
-
- return false;
}
/* Debug form of r6000_rtx_costs that is selected if -mdebug=cost. */
--
Alan Modra
Australia Development Lab, IBM