public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH][cse][3/4] Don't overwrite original rtx when folding source of set
@ 2016-01-22  9:53 Kyrill Tkachov
  2016-01-22 11:21 ` Richard Biener
  0 siblings, 1 reply; 2+ messages in thread
From: Kyrill Tkachov @ 2016-01-22  9:53 UTC (permalink / raw)
  To: GCC Patches; +Cc: Ramana Radhakrishnan, Richard Earnshaw, Jim Wilson

[-- Attachment #1: Type: text/plain, Size: 1577 bytes --]

Hi all,

This patch is a consequence of the thread I started at https://gcc.gnu.org/ml/gcc/2016-01/msg00100.html
The problem is that the fold_rtx call in cse_insn may overwrite its argument if the insn argument is non-NULL.
This leads to CSE not considering the original form of the RTX when doing its cost analysis later on.
This led to it picking a normal SImode multiply expression over the original multiply-sign-extend expression
which in my case is cheaper (as reflected in the fixed rtx costs from patch 2)

The simple fix is to pass NULL to fold_rtx so that it will return the candidate folded expression
into src_folded but still retain the original src for analysis.

With this change the gcc.target/arm/wmul-[12].c and the costs fix in patch [2/4]
the tests now generate their expected
sign-extend+multiply (+accumulate) sequences.

Apart from that this patch has no impact codegen on SPEC2006 for arm.
For aarch64 the impact is minimal and inconsequential. I've seen sequences that
select between 1 -1 being turned from a CSINC (of zero) into a CSNEG. Both are valid
and of equal value.

On x86_64 the impact was also minimal. Most benchmarks were not changed at all.
Some showed a negligible reduction in codesize and a slight register-allocation perturbations.
But nothing significant.
Hence, I claim that this patch is low impact.

Bootstrapped and tested on arm, aarch64, x86_64.
Ok for trunk?

Thanks,
Kyrill

2016-01-22  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     * cse.c (cse_insn): Pass NULL to fold_rtx when initially
     folding the source of a SET.

[-- Attachment #2: cse-fold.patch --]
[-- Type: text/x-patch, Size: 523 bytes --]

diff --git a/gcc/cse.c b/gcc/cse.c
index 58b8fc0313dcbfb2036054564746a7832ae52140..2665d9a2733cad7286b41a88753acfcf79be83f1 100644
--- a/gcc/cse.c
+++ b/gcc/cse.c
@@ -4636,7 +4636,7 @@ cse_insn (rtx_insn *insn)
 
       /* Simplify and foldable subexpressions in SRC.  Then get the fully-
 	 simplified result, which may not necessarily be valid.  */
-      src_folded = fold_rtx (src, insn);
+      src_folded = fold_rtx (src, NULL);
 
 #if 0
       /* ??? This caused bad code to be generated for the m68k port with -O2.

^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: [PATCH][cse][3/4] Don't overwrite original rtx when folding source of set
  2016-01-22  9:53 [PATCH][cse][3/4] Don't overwrite original rtx when folding source of set Kyrill Tkachov
@ 2016-01-22 11:21 ` Richard Biener
  0 siblings, 0 replies; 2+ messages in thread
From: Richard Biener @ 2016-01-22 11:21 UTC (permalink / raw)
  To: Kyrill Tkachov
  Cc: GCC Patches, Ramana Radhakrishnan, Richard Earnshaw, Jim Wilson

On Fri, Jan 22, 2016 at 10:52 AM, Kyrill Tkachov
<kyrylo.tkachov@foss.arm.com> wrote:
> Hi all,
>
> This patch is a consequence of the thread I started at
> https://gcc.gnu.org/ml/gcc/2016-01/msg00100.html
> The problem is that the fold_rtx call in cse_insn may overwrite its argument
> if the insn argument is non-NULL.
> This leads to CSE not considering the original form of the RTX when doing
> its cost analysis later on.
> This led to it picking a normal SImode multiply expression over the original
> multiply-sign-extend expression
> which in my case is cheaper (as reflected in the fixed rtx costs from patch
> 2)
>
> The simple fix is to pass NULL to fold_rtx so that it will return the
> candidate folded expression
> into src_folded but still retain the original src for analysis.
>
> With this change the gcc.target/arm/wmul-[12].c and the costs fix in patch
> [2/4]
> the tests now generate their expected
> sign-extend+multiply (+accumulate) sequences.
>
> Apart from that this patch has no impact codegen on SPEC2006 for arm.
> For aarch64 the impact is minimal and inconsequential. I've seen sequences
> that
> select between 1 -1 being turned from a CSINC (of zero) into a CSNEG. Both
> are valid
> and of equal value.
>
> On x86_64 the impact was also minimal. Most benchmarks were not changed at
> all.
> Some showed a negligible reduction in codesize and a slight
> register-allocation perturbations.
> But nothing significant.
> Hence, I claim that this patch is low impact.
>
> Bootstrapped and tested on arm, aarch64, x86_64.
> Ok for trunk?

Ok if the rest of the series is approved (otherwise ok for stage1).

Thanks,
Richard.

> Thanks,
> Kyrill
>
> 2016-01-22  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>
>     * cse.c (cse_insn): Pass NULL to fold_rtx when initially
>     folding the source of a SET.

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2016-01-22 11:21 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-01-22  9:53 [PATCH][cse][3/4] Don't overwrite original rtx when folding source of set Kyrill Tkachov
2016-01-22 11:21 ` Richard Biener

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).