From: Alan Lawrence <alan.lawrence@arm.com>
Cc: Segher Boessenkool <segher@kernel.crashing.org>,
Michael Meissner <meissner@linux.vnet.ibm.com>,
"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
David Edelsohn <dje.gcc@gmail.com>
Subject: Re: [PATCH 10/11][RS6000] Migrate reduction optabs to reduc_..._scal
Date: Wed, 12 Nov 2014 18:53:00 -0000 [thread overview]
Message-ID: <5463ABD0.8020004@arm.com> (raw)
In-Reply-To: <54635096.1040508@arm.com>
Have run check-gcc on gcc110.fsffrance.org (powerpc64-unknown-linux-gnu) using
this snippet on top of original patch; no regressions.
Alan Lawrence wrote:
> So I'm no expert on RS6000 here, but following on from Segher's observation
> about the change in pattern...so the difference in 'expand' is exactly that, a
> vsx_reduc_splus_v2df followed by a vec_extract to DF, becomes a
> vsx_reduc_splus_v2df_scalar - as I expected the combiner to produce by combining
> the two previous insns.
>
> However, inspecting the logs from -fdump-rtl-combine-all, *without* my patch,
> when the combiner tries to put those two together, I see:
>
> Trying 30 -> 31:
> Failed to match this instruction:
> (set (reg:DF 179 [ stmp_s_5.7D.2196 ])
> (vec_select:DF (plus:V2DF (vec_select:V2DF (reg:V2DF 173 [ vect_s_5.6D.2195 ])
> (parallel [
> (const_int 1 [0x1])
> (const_int 0 [0])
> ]))
> (reg:V2DF 173 [ vect_s_5.6D.2195 ]))
> (parallel [
> (const_int 1 [0x1])
> ])))
>
> That is, it looks like combine_simplify_rtx has transformed the (vec_concat
> (vec_select ... 1) (vec_select ... 0)) from the vsx_reduc_plus_v2df insn, into a
> single vec_select, which does not match the vsx_reduc_plus_v2df_scalar insn.
>
> So despite the comment (in vsx.md):
>
> ;; Combiner patterns with the vector reduction patterns that knows we can get
> ;; to the top element of the V2DF array without doing an extract.
>
> It looks like the code generation prior to my patch, considered better, was
> because the combiner didn't actually use the pattern?
>
> In that case whilst you may want to dig into register allocation,
> cannot_change_mode_class, etc., for other reasons, I think the best fix for
> migrating to reduc_plus_scal... is simply to avoid using the "Combiner" patterns
> and just emit two insns, the old pattern followed by a vec_extract. The attached
> snippet does this (I won't call it a patch yet, and it applies on top of the
> previous patch - I went the route of calling the two gen functions rather than
> copying their RTL sequences, but could do the latter if that were
> preferable???), and restores code generation to the original form on your
> example above; it bootstraps OK but I'm still running check-gcc on the Compile
> Farm...
>
> However, again on your example above, I note that if I *remove* the
> reduc_plus_scal_v2df pattern altogether, I get:
>
> .sum:
> li 10,512 # 52 *movdi_internal64/4 [length = 4]
> ld 9,.LC2@toc(2) # 20 *movdi_internal64/2 [length = 4]
> xxlxor 0,0,0 # 17 *vsx_movv2df/12 [length = 4]
> mtctr 10 # 48 *movdi_internal64/11 [length = 4]
> .align 4
> .L2:
> lxvd2x 12,0,9 # 23 *vsx_movv2df/2 [length = 4]
> addi 9,9,16 # 25 *adddi3_internal1/2 [length = 4]
> xvadddp 0,0,12 # 24 *vsx_addv2df3/1 [length = 4]
> bdnz .L2 # 47 *ctrdi_internal1/1 [length = 4]
> xxsldwi 12,0,0,2 # 30 vsx_xxsldwi_v2df [length = 4]
> xvadddp 1,0,12 # 31 *vsx_addv2df3/1 [length = 4]
> nop # 37 *vsx_extract_v2df_internal2/1 [length = 4]
> blr # 55 return [length = 4]
>
> this is presumably using gcc's scalar reduction code, but (to my untrained eye
> on powerpc!) it looks even better than the first form above (the same in the
> loop, and in the reduction, an xxpermdi is replaced by a nop !)...
>
> --Alan
>
>
> Segher Boessenkool wrote:
>> On Mon, Nov 10, 2014 at 05:36:24PM -0500, Michael Meissner wrote:
>>> However, the double pattern is completely broken. This cannot go in.
>> [snip]
>>
>>> It is unacceptable to have to do the inner loop doing a load, vector add, and
>>> store in the loop.
>> Before the patch, the final reduction used *vsx_reduc_splus_v2df; after
>> the patch, it is *vsx_reduc_plus_v2df_scalar. The former does a vector
>> add, the latter a float add. And it uses the same pseudoregister for the
>> accumulator throughout. IRA decides a register is more expensive than
>> memory for this, I suppose because it wants both V2DF and DF? It doesn't
>> seem to like the subreg very much.
>>
>> The new code does look nicer otherwise :-)
>>
>>
>> Segher
next prev parent reply other threads:[~2014-11-12 18:49 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-10-24 11:57 [PATCH v2 0-6/11] Fix PR/61114, make direct vector reductions endianness-neutral Alan Lawrence
2014-10-24 11:58 ` [PATCH 7/11][ARM] Migrate to new reduc_plus_scal_optab Alan Lawrence
2014-11-03 17:32 ` Ramana Radhakrishnan
2014-10-24 12:01 ` [PATCH v2 0-6/11] Fix PR/61114, make direct vector reductions endianness-neutral Richard Biener
2014-10-24 12:05 ` [PATCH 8/11][ARM] Migrate to new reduc_[us](min|max)_scal_optab Alan Lawrence
2014-11-04 11:08 ` Ramana Radhakrishnan
2014-10-24 12:06 ` [PATCH 9/11][i386] Migrate reduction optabs to reduc_..._scal Alan Lawrence
2014-10-24 12:07 ` [PATCH 10/11][RS6000] " Alan Lawrence
2014-10-24 12:14 ` Alan Lawrence
2014-10-25 0:08 ` David Edelsohn
2014-11-03 17:51 ` Bill Schmidt
2014-11-06 16:44 ` Alan Lawrence
2014-11-06 18:57 ` Bill Schmidt
2014-11-07 10:09 ` Alan Lawrence
2014-11-10 22:39 ` Michael Meissner
2014-11-11 7:10 ` Segher Boessenkool
2014-11-12 1:54 ` Michael Meissner
2014-11-12 9:26 ` Segher Boessenkool
2014-11-12 19:20 ` Michael Meissner
2014-11-12 12:32 ` Alan Lawrence
2014-11-12 18:53 ` Alan Lawrence [this message]
2014-12-11 15:59 ` Ping: " Alan Lawrence
2014-12-11 18:37 ` Alan Lawrence
2014-10-24 12:12 ` [Protopatch 11/11][IA64] Migrate to reduc_(plus|min|max)_scal_v2df optab Alan Lawrence
2014-10-24 12:50 ` Alan Lawrence
2014-10-24 15:19 ` [PATCH v2 0-6/11] Fix PR/61114, make direct vector reductions endianness-neutral Matthew Fortune
2014-10-27 11:48 ` Richard Biener
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5463ABD0.8020004@arm.com \
--to=alan.lawrence@arm.com \
--cc=dje.gcc@gmail.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=meissner@linux.vnet.ibm.com \
--cc=segher@kernel.crashing.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).