From: Ira Rosen <ira.rosen@linaro.org>
To: Richard Guenther <richard.guenther@gmail.com>
Cc: gcc-patches@gcc.gnu.org, Patch Tracking <patches@linaro.org>
Subject: Re: [patch] Improve detection of widening multiplication in the vectorizer
Date: Thu, 02 Jun 2011 11:08:00 -0000 [thread overview]
Message-ID: <BANLkTikAvbDF=Kg+=ae-Cixkyz4XHfnQCQ@mail.gmail.com> (raw)
In-Reply-To: <BANLkTi=KCbm4rOWFDk0qQFBvX-Ph+cjJ8Q@mail.gmail.com>
On 2 June 2011 12:59, Richard Guenther <richard.guenther@gmail.com> wrote:
> On Thu, Jun 2, 2011 at 10:46 AM, Ira Rosen <ira.rosen@linaro.org> wrote:
>> On 1 June 2011 15:14, Richard Guenther <richard.guenther@gmail.com> wrote:
>>> On Wed, Jun 1, 2011 at 1:37 PM, Ira Rosen <ira.rosen@linaro.org> wrote:
>>>> On 1 June 2011 12:42, Richard Guenther <richard.guenther@gmail.com> wrote:
>>>>
>>>>> Did you think about moving pass_optimize_widening_mul before
>>>>> loop optimizations? Does that pass catch the cases you are
>>>>> teaching the pattern recognizer? I think we should try to expose
>>>>> these more complicated instructions to loop optimizers.
>>>>>
>>>>
>>>> pass_optimize_widening_mul doesn't catch these cases, but I can try to
>>>> teach it instead of the vectorizer.
>>>> I am now testing
>>>>
>>>> Index: passes.c
>>>> ===================================================================
>>>> --- passes.c (revision 174391)
>>>> +++ passes.c (working copy)
>>>> @@ -870,6 +870,7 @@
>>>> NEXT_PASS (pass_split_crit_edges);
>>>> NEXT_PASS (pass_pre);
>>>> NEXT_PASS (pass_sink_code);
>>>> + NEXT_PASS (pass_optimize_widening_mul);
>>>> NEXT_PASS (pass_tree_loop);
>>>> {
>>>> struct opt_pass **p = &pass_tree_loop.pass.sub;
>>>> @@ -934,7 +935,6 @@
>>>> NEXT_PASS (pass_forwprop);
>>>> NEXT_PASS (pass_phiopt);
>>>> NEXT_PASS (pass_fold_builtins);
>>>> - NEXT_PASS (pass_optimize_widening_mul);
>>>> NEXT_PASS (pass_tail_calls);
>>>> NEXT_PASS (pass_rename_ssa_copies);
>>>> NEXT_PASS (pass_uncprop);
>>>>
>>>> to see how it affects other loop optimizations (vectorizer pattern
>>>> tests obviously fail).
>>
>> Looks like it needs copy_prop and dce as well:
>>
>> Index: passes.c
>> ===================================================================
>> --- passes.c (revision 174391)
>> +++ passes.c (working copy)
>> @@ -870,6 +870,9 @@
>> NEXT_PASS (pass_split_crit_edges);
>> NEXT_PASS (pass_pre);
>> NEXT_PASS (pass_sink_code);
>> + NEXT_PASS (pass_copy_prop);
>> + NEXT_PASS (pass_dce);
>> + NEXT_PASS (pass_optimize_widening_mul);
>> NEXT_PASS (pass_tree_loop);
>> {
>> struct opt_pass **p = &pass_tree_loop.pass.sub;
>> @@ -934,7 +937,6 @@
>> NEXT_PASS (pass_forwprop);
>> NEXT_PASS (pass_phiopt);
>> NEXT_PASS (pass_fold_builtins);
>> - NEXT_PASS (pass_optimize_widening_mul);
>> NEXT_PASS (pass_tail_calls);
>> NEXT_PASS (pass_rename_ssa_copies);
>> NEXT_PASS (pass_uncprop);
>>
>> otherwise I get (on x86_64-suse-linux)
>>
>> FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfmaddss
>> FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfmaddsd
>> FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfmsubss
>> FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfmsubsd
>> FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfnmaddss
>> FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfnmaddsd
>
> Hmm. I would have put the pass next to the sincos pass, but yes,
> in principle a copyprop & dce pass after PRE makes sense
> (the loop passes likely don't run because there are no loops in
> those testcases - both copyprop and dce should be scheduled
> more like TODOs, or even automatically by the pass manager
> via PROPs ...). Dead code can indeed confuse those matching
> passes that look for single-use vars.
>
> I'll think about a more elegant solution for this problem.
>
> Would you mind checking if the next-to-sincos position makes
> any difference?
Before sincos we have
D.2747_2 = __builtin_powf (a_1(D), 2.0e+0);
D.2746_4 = D.2747_2 + c_3(D);
which is transformed by sincos to
powmult.8_7 = a_1(D) * a_1(D);
D.2747_2 = powmult.8_7;
D.2746_4 = D.2747_2 + c_3(D);
but widening_mul is confused by D.2747_2 = powmult.8_7; and it needs
both copy_prop and dce to remove it:
powmult.8_7 = a_1(D) * a_1(D);
D.2746_4 = c_3(D) + powmult.8_7;
So moving widening_mul next to sincos doesn't help.
Maybe gimple_expand_builtin_pow() can be changed to generate the last
version by itself?
Ira
>
> Thanks,
> Richard.
>
>> Ira
>>
>>>
>>> Thanks. I would hope that we eventually can get rid of the
>>> pattern recognizer ... at least for SSE there is also always
>>> a scalar variant instruction for each vectorized one.
>>>
>>> Richard.
>>>
>>
>
next prev parent reply other threads:[~2011-06-02 11:08 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-06-01 9:23 Ira Rosen
2011-06-01 9:42 ` Richard Guenther
2011-06-01 11:37 ` Ira Rosen
2011-06-01 12:15 ` Richard Guenther
2011-06-02 8:46 ` Ira Rosen
2011-06-02 9:59 ` Richard Guenther
2011-06-02 11:08 ` Ira Rosen [this message]
2011-06-02 15:35 ` Richard Guenther
2011-06-06 13:05 ` Richard Sandiford
2011-06-06 14:28 ` Richard Guenther
2011-06-07 21:08 ` H.J. Lu
2011-06-21 0:40 ` H.J. Lu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='BANLkTikAvbDF=Kg+=ae-Cixkyz4XHfnQCQ@mail.gmail.com' \
--to=ira.rosen@linaro.org \
--cc=gcc-patches@gcc.gnu.org \
--cc=patches@linaro.org \
--cc=richard.guenther@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).