public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Guenther <richard.guenther@gmail.com>
To: Ilya Enkovich <enkovich.gnu@gmail.com>
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH Atom][PR middle-end/44382] Tree reassociation improvement
Date: Thu, 01 Sep 2011 09:24:00 -0000	[thread overview]
Message-ID: <CAFiYyc06f5m2njPh3i1kbXc=jUAviUfv3dRPZ3NFxta3feX=mA@mail.gmail.com> (raw)
In-Reply-To: <CAMbmDYYc_eYGZ2w+E+rY_zvERUVrhfXkMpnAjxCzKT9_s7-HAA@mail.gmail.com>

On Wed, Aug 31, 2011 at 2:17 PM, Ilya Enkovich <enkovich.gnu@gmail.com> wrote:
> Hello Richard,
>
> Thanks for the review!
>
>> The fact that you have to adjust gcc.dg/tree-ssa/pr38533.c looks problematic
>> to me.  Can you investigate the problem report, look at the geneated
>> code with the atom default of the param and see whether it's still
>> reasonably "fixed" (maybe you'd already done this)?
> This test was created as a regression test to the problem in
> linearize_expr_tree which moves all statements down to the first
> modified one during reassociation increasing registers pressure. Test
> has a lot of definitions which are all ORed like this:
>  def r1
>  def r2
>  s = r1 or r2
>  def r3
>  s = s or r3
>  def r4
>  s = s or r4
>  ...
> and it checks that register pressure is not increased after
> reassociation by simple scan of two sequential defs. If we use
> reassociation width higher than 1 then test will fails because we need
> to increase register pressure to get parallelism and finally get code
> like this:
>  def r1
>  def r2
>  def r3
>  t1 = r1 or r2
>  s = s or r3
>  def r4
>  def r5
>  s = s or t1
>  t1 = r4 or r5
>  ...
> So, I had to fix a test.

Ok.  Thanks for explaining.

>> +  /* Check if we may use less width and still compute sequence for
>> +     the same time.  It will allow us to reduce registers usage.  */
>> +  while (width > 1
>> +        && get_required_cycles (ops_num, width - 1) == cycles_best)
>> +    width--;
>>
>> I suppose get_required_cycles () is monotonic in width?  Would it
>> make sense to binary search the best width then (I realize the
>> default is 1 and the only target differing has 2, but ...)?  Maybe at
>> least add a comment to this effect?  Or not decrement by one
>> but instead divide by two on each iteration (which is the same for 1 and 2 ...)?
>> It's also all a mapping that is constant - we should be able to
>> pre-compute it for all values, eventually "compressing" it to a
>> much simpler formula width = f (cpu_width, ops_num)?
> I replaced sequential search with a binary search. I did not pay much
> attention to this function because do not think it is really time
> consuming compared to other parts of reassociation phase. Am I too
> optimistic? If you think it might significantly affect compile time I
> can introduce a table of pre-computed values (or make it later as a
> separate fix).

+  /* Get the minimal time required for sequence computation.  */
+  cycles_best = get_required_cycles (ops_num, width);
+
+  /* Check if we may use less width and still compute sequence for
+     the same time.  It will allow us to reduce registers usage.  */
+  width_min = 1;
+  while (width > width_min)
+    {
+      int width_mid = (width + width_min) / 2;
+
+      if (get_required_cycles (ops_num, width_mid) == cycles_best)
+       width = width_mid;
+      else if (width_min < width_mid)
+       width_min = width_mid;
+      else
+       break;
+    }

this seems to not allow cycles_best to drop with lower width, but
that it can't should be an implementation detail of get_required_cycles.
To make it not so, can you add a comment before the loop, like

  /* get_required_cycles is monotonically increasing with lower width
     so we can perform a binary search for the minimal width that still
     results in the optimal cycle count.  */

> I made all other fixes as you suggested.

With the above change the non-x86 specifc parts are ok.  Please get
approval for them from a x86 maintainer.

Thanks,
Richard.


> Bootstrapped and checked on x86_64-linux.
>
> Thanks,
> Ilya
> ---
> gcc/
>
> 2011-08-31  Enkovich Ilya  <ilya.enkovich@intel.com>
>
>        PR middle-end/44382
>        * target.def (reassociation_width): New hook.
>
>        * doc/tm.texi.in (reassociation_width): Likewise.
>
>        * doc/tm.texi (reassociation_width): Likewise.
>
>        * doc/invoke.texi (tree-reassoc-width): New param documented.
>
>        * hooks.h (hook_int_uint_mode_1): New default hook.
>
>        * hooks.c (hook_int_uint_mode_1): Likewise.
>
>        * config/i386/i386.h (ix86_tune_indices): Add
>        X86_TUNE_REASSOC_INT_TO_PARALLEL and
>        X86_TUNE_REASSOC_FP_TO_PARALLEL.
>
>        (TARGET_REASSOC_INT_TO_PARALLEL): New.
>        (TARGET_REASSOC_FP_TO_PARALLEL): Likewise.
>
>        * config/i386/i386.c (initial_ix86_tune_features): Add
>        X86_TUNE_REASSOC_INT_TO_PARALLEL and
>        X86_TUNE_REASSOC_FP_TO_PARALLEL.
>
>        (ix86_reassociation_width) implementation of
>        new hook for i386 target.
>
>        * params.def (PARAM_TREE_REASSOC_WIDTH): New param added.
>
>        * tree-ssa-reassoc.c (get_required_cycles): New function.
>        (get_reassociation_width): Likewise.
>        (swap_ops_for_binary_stmt): Likewise.
>        (rewrite_expr_tree_parallel): Likewise.
>
>        (rewrite_expr_tree): Refactored. Part of code moved into
>        swap_ops_for_binary_stmt.
>
>        (reassociate_bb): Now checks reassociation width to be used
>        and call rewrite_expr_tree_parallel instead of rewrite_expr_tree
>        if needed.
>
> gcc/testsuite/
>
> 2011-08-31  Enkovich Ilya  <ilya.enkovich@intel.com>
>
>        * gcc.dg/tree-ssa/pr38533.c (dg-options): Added option
>        --param tree-reassoc-width=1.
>
>        * gcc.dg/tree-ssa/reassoc-24.c: New test.
>        * gcc.dg/tree-ssa/reassoc-25.c: Likewise.
>

  reply	other threads:[~2011-09-01  9:24 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-07-12 12:35 Илья Энкович
2011-07-12 17:08 ` William J. Schmidt
2011-07-12 17:22   ` H.J. Lu
2011-07-13  8:28     ` Ilya Enkovich
2011-07-12 17:24   ` William J. Schmidt
2011-07-13  8:13     ` Ilya Enkovich
2011-07-13  8:30       ` Jakub Jelinek
2011-07-13  9:04         ` Ilya Enkovich
2011-07-13  9:52           ` Jakub Jelinek
2011-07-13  9:59             ` Ilya Enkovich
2011-07-13 11:01               ` Richard Guenther
2011-07-13 13:11   ` William J. Schmidt
2011-07-13 18:05 ` Michael Meissner
2011-07-13 20:18   ` Ilya Enkovich
2011-07-14  2:39 ` Michael Meissner
2011-07-14  9:40   ` Richard Guenther
2011-07-14  9:42     ` Richard Guenther
2011-07-14 10:04       ` Ilya Enkovich
2011-07-14 10:14         ` Richard Guenther
2011-07-14 11:03           ` Ilya Enkovich
2011-07-14 15:31             ` Ilya Enkovich
2011-07-19 12:10               ` Richard Guenther
2011-07-19 15:25                 ` Ilya Enkovich
2011-08-05 11:46                   ` Richard Guenther
2011-08-05 15:08                     ` Ilya Enkovich
2011-08-10 15:19                       ` Ilya Enkovich
2011-08-19  9:09                         ` Ilya Enkovich
2011-08-26 11:45                           ` Ilya Enkovich
2011-08-30 13:16                         ` Richard Guenther
2011-08-31 15:38                           ` Ilya Enkovich
2011-09-01  9:24                             ` Richard Guenther [this message]
2011-09-01 10:28                               ` Ilya Enkovich
2011-09-02 12:53                                 ` Uros Bizjak
2011-09-02 13:07                                   ` Richard Guenther
2011-09-02 13:46                                     ` Ilya Enkovich
2011-09-02 14:01                                       ` Richard Guenther
2011-09-02 14:13                                         ` Ilya Enkovich
2011-09-06 12:40                                   ` Ilya Enkovich
2011-09-06 15:53                                     ` Uros Bizjak
2011-09-06 16:18                                       ` Ilya Enkovich
2011-09-06 16:45                                         ` H.J. Lu
2011-07-26  9:54                 ` Ilya Enkovich
2011-08-03  8:35                   ` Ilya Enkovich
2011-07-21 20:14               ` Joseph S. Myers
2011-07-14 16:04       ` Michael Meissner
2011-07-14 16:08         ` Ilya Enkovich
2011-07-14 10:38   ` Ilya Enkovich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAFiYyc06f5m2njPh3i1kbXc=jUAviUfv3dRPZ3NFxta3feX=mA@mail.gmail.com' \
    --to=richard.guenther@gmail.com \
    --cc=enkovich.gnu@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).