public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Guenther <richard.guenther@gmail.com>
To: Ira Rosen <ira.rosen@linaro.org>
Cc: gcc-patches@gcc.gnu.org, Patch Tracking <patches@linaro.org>
Subject: Re: [patch] Improve detection of widening multiplication in the vectorizer
Date: Thu, 02 Jun 2011 09:59:00 -0000	[thread overview]
Message-ID: <BANLkTi=KCbm4rOWFDk0qQFBvX-Ph+cjJ8Q@mail.gmail.com> (raw)
In-Reply-To: <BANLkTikSzRMm_QFL6YKKN6dxNtAEx8DNqg@mail.gmail.com>

On Thu, Jun 2, 2011 at 10:46 AM, Ira Rosen <ira.rosen@linaro.org> wrote:
> On 1 June 2011 15:14, Richard Guenther <richard.guenther@gmail.com> wrote:
>> On Wed, Jun 1, 2011 at 1:37 PM, Ira Rosen <ira.rosen@linaro.org> wrote:
>>> On 1 June 2011 12:42, Richard Guenther <richard.guenther@gmail.com> wrote:
>>>
>>>> Did you think about moving pass_optimize_widening_mul before
>>>> loop optimizations?  Does that pass catch the cases you are
>>>> teaching the pattern recognizer?  I think we should try to expose
>>>> these more complicated instructions to loop optimizers.
>>>>
>>>
>>> pass_optimize_widening_mul doesn't catch these cases, but I can try to
>>> teach it instead of the vectorizer.
>>> I am now testing
>>>
>>> Index: passes.c
>>> ===================================================================
>>> --- passes.c    (revision 174391)
>>> +++ passes.c    (working copy)
>>> @@ -870,6 +870,7 @@
>>>       NEXT_PASS (pass_split_crit_edges);
>>>       NEXT_PASS (pass_pre);
>>>       NEXT_PASS (pass_sink_code);
>>> +      NEXT_PASS (pass_optimize_widening_mul);
>>>       NEXT_PASS (pass_tree_loop);
>>>        {
>>>          struct opt_pass **p = &pass_tree_loop.pass.sub;
>>> @@ -934,7 +935,6 @@
>>>       NEXT_PASS (pass_forwprop);
>>>       NEXT_PASS (pass_phiopt);
>>>       NEXT_PASS (pass_fold_builtins);
>>> -      NEXT_PASS (pass_optimize_widening_mul);
>>>       NEXT_PASS (pass_tail_calls);
>>>       NEXT_PASS (pass_rename_ssa_copies);
>>>       NEXT_PASS (pass_uncprop);
>>>
>>> to see how it affects other loop optimizations (vectorizer pattern
>>> tests obviously fail).
>
> Looks like it needs copy_prop and dce as well:
>
> Index: passes.c
> ===================================================================
> --- passes.c    (revision 174391)
> +++ passes.c    (working copy)
> @@ -870,6 +870,9 @@
>       NEXT_PASS (pass_split_crit_edges);
>       NEXT_PASS (pass_pre);
>       NEXT_PASS (pass_sink_code);
> +      NEXT_PASS (pass_copy_prop);
> +      NEXT_PASS (pass_dce);
> +      NEXT_PASS (pass_optimize_widening_mul);
>       NEXT_PASS (pass_tree_loop);
>        {
>          struct opt_pass **p = &pass_tree_loop.pass.sub;
> @@ -934,7 +937,6 @@
>       NEXT_PASS (pass_forwprop);
>       NEXT_PASS (pass_phiopt);
>       NEXT_PASS (pass_fold_builtins);
> -      NEXT_PASS (pass_optimize_widening_mul);
>       NEXT_PASS (pass_tail_calls);
>       NEXT_PASS (pass_rename_ssa_copies);
>       NEXT_PASS (pass_uncprop);
>
> otherwise I get (on x86_64-suse-linux)
>
> FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfmaddss
> FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfmaddsd
> FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfmsubss
> FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfmsubsd
> FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfnmaddss
> FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfnmaddsd

Hmm.  I would have put the pass next to the sincos pass, but yes,
in principle a copyprop & dce pass after PRE makes sense
(the loop passes likely don't run because there are no loops in
those testcases - both copyprop and dce should be scheduled
more like TODOs, or even automatically by the pass manager
via PROPs ...).  Dead code can indeed confuse those matching
passes that look for single-use vars.

I'll think about a more elegant solution for this problem.

Would you mind checking if the next-to-sincos position makes
any difference?

Thanks,
Richard.

> Ira
>
>>
>> Thanks.  I would hope that we eventually can get rid of the
>> pattern recognizer ... at least for SSE there is also always
>> a scalar variant instruction for each vectorized one.
>>
>> Richard.
>>
>

  reply	other threads:[~2011-06-02  9:59 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-06-01  9:23 Ira Rosen
2011-06-01  9:42 ` Richard Guenther
2011-06-01 11:37   ` Ira Rosen
2011-06-01 12:15     ` Richard Guenther
2011-06-02  8:46       ` Ira Rosen
2011-06-02  9:59         ` Richard Guenther [this message]
2011-06-02 11:08           ` Ira Rosen
2011-06-02 15:35             ` Richard Guenther
2011-06-06 13:05       ` Richard Sandiford
2011-06-06 14:28         ` Richard Guenther
2011-06-07 21:08 ` H.J. Lu
2011-06-21  0:40   ` H.J. Lu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='BANLkTi=KCbm4rOWFDk0qQFBvX-Ph+cjJ8Q@mail.gmail.com' \
    --to=richard.guenther@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=ira.rosen@linaro.org \
    --cc=patches@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).