public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Jiong Wang <jiong.wang@arm.com>
To: Richard Biener <richard.guenther@gmail.com>
Cc: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] PR 62173, re-shuffle insns for RTL loop invariant hoisting
Date: Thu, 04 Dec 2014 19:32:00 -0000	[thread overview]
Message-ID: <5480B6D6.2020201@arm.com> (raw)
In-Reply-To: <CAFiYyc0gEQt_Ci1TyCfYys=JnZMr8FmYW7dFtq+mBmqKjeuttw@mail.gmail.com>

On 04/12/14 11:07, Richard Biener wrote:

> On Thu, Dec 4, 2014 at 12:07 PM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Thu, Dec 4, 2014 at 12:00 PM, Jiong Wang <jiong.wang@arm.com> wrote:
>>
>>
>> which means re-associate the constant imm with the virtual frame pointer.
>>
>> transform
>>
>>       RA <- fixed_reg + RC
>>       RD <- MEM (RA + const_offset)
>>
>>    into:
>>
>>       RA <- fixed_reg + const_offset
>>       RD <- MEM (RA + RC)
>>
>> then RA <- fixed_reg + const_offset is actually loop invariant, so the later
>> RTL GCSE PRE pass could catch it and do the hoisting, and thus ameliorate
>> what tree
>> level ivopts could not sort out.
>> There is a LIM pass after gimple ivopts - if the invariantness is already
>> visible there why not handle it there similar to the special-cases in
>> rewrite_bittest and rewrite_reciprocal?

maybe, needs further check.

>>
>> And of course similar tricks could be applied on the RTL level to
>> RTL invariant motion?

Thanks. I just checked the code, yes, loop invariant motion pass
is the natural place to integrate such multi-insns invariant analysis trick.

those code could be integrated into loop-invariant.c cleanly, but
then I found although the invariant detected successfully but it's not moved
out of loop because of cost issue, and looks like the patch committed to fix PR33928
is trying to prevent such cheap address be hoisted to reduce register pressure.


  805       /* ??? Try to determine cheapness of address computation.  Unfortunately
  806          the address cost is only a relative measure, we can't really compare
  807          it with any absolute number, but only with other address costs.
  808          But here we don't have any other addresses, so compare with a magic
  809          number anyway.  It has to be large enough to not regress PR33928
  810          (by avoiding to move reg+8,reg+16,reg+24 invariants), but small
  811          enough to not regress 410.bwaves either (by still moving reg+reg
  812          invariants).
  813          See http://gcc.gnu.org/ml/gcc-patches/2009-10/msg01210.html .  */
  814       inv->cheap_address = address_cost (SET_SRC (set), word_mode,
  815                                          ADDR_SPACE_GENERIC, speed) < 3;


I think that maybe necessary for x86 which is short of register, while for RISC,
it may not be that necessary, especially the whole register pressure is not big.

currently, what I could think of is for this transformation below, we should
increase the costs:
  
A
==
      RA <- virtual_stack_var + RC
      RD <- MEM (RA + const_offset)

   into:

B
==
      RA <- virtual_stack_var + const_offset  <--B
      RD <- MEM (RA + RC)

because the cost is not that cheap, if there is not re-assocation of virtual_stack_var
with const_offset, then lra elimination will create another instruction to hold the
elimination result, so format A will actually be

      RT <- real_stack_pointer + elimination_offset
      RA <- RT + RC
      RD <- MEM (RA + const_offset)

so, the re-assocation and later hoisting of invariant B could actually save two instructions
in the loop, this is why there are 15% perf gap for bzip2 under some situation.

On aarch64, for Seb's testcase,

before IV hoisting
===
         b       .L4
.L37:
         sub     w19, w19, #1
.L4:
	add	x1, x29, 48
	add	x0, x1, x0, sxtw
	ldrb	w0, [x0, -16]
         bl      foo
         mov     w0, w19
         cbnz    w19, .L37
         ldr     x19, [sp, 16]
         ldp     x29, x30, [sp], 48
         ret

after this transformation, and the IV hoisting
(need one extra callee saved, x20)
===
.L3:
         add     x20, x29, 32  <-- IV hoisted
         b       .L4
.L37:
         sub     w19, w19, #1
.L4:
         ldrb    w0, [x20, w0, sxtw]
         bl      foo
         mov     w0, w19
         cbnz    w19, .L37
         ldp     x19, x20, [sp, 16]
         ldp     x29, x30, [sp], 48
         ret

> Oh, and the patch misses a testcase.

It's at the start of the email, will include in the patch.

void bar(int i) {
   char A[10];
   int d = 0;
   while (i > 0)
   A[d++] = i--;

   while (d > 0)
   foo(A[d--]);
}

>
>> Thanks,
>> Richard.
>>
>>> and this patch only tries to re-shuffle instructions within single basic
>>> block which
>>> is a inner loop which is perf critical.
>>>
>>> I am reusing the loop info in fwprop because there is loop info and it's run
>>> before
>>> GCSE.
>>>
>>> verified on aarch64 and mips64, the array base address hoisted out of loop.
>>>
>>> bootstrap ok on x86-64 and aarch64.
>>>
>>> comments?
>>>
>>> thanks.
>>>
>>> gcc/
>>>    PR62173
>>>    fwprop.c (prepare_for_gcse_pre): New function.
>>>    (fwprop_done): Call it.


  reply	other threads:[~2014-12-04 19:32 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-12-04 11:00 Jiong Wang
2014-12-04 11:07 ` Richard Biener
2014-12-04 11:07   ` Richard Biener
2014-12-04 19:32     ` Jiong Wang [this message]
2014-12-15 15:29       ` Jiong Wang
2014-12-15 15:36         ` Jiong Wang
2014-12-17 16:19           ` Richard Biener
2014-12-18 17:08             ` Jiong Wang
2014-12-18 21:16               ` Jiong Wang
2014-12-18 22:19               ` Segher Boessenkool
2014-12-19  4:06                 ` Bin.Cheng
2014-12-19 10:29                   ` Jiong Wang
2014-12-19 11:45                     ` Richard Biener
2014-12-19 15:31                       ` Kenneth Zadeck
2015-02-11 11:20                         ` Jiong Wang
2015-02-11 14:22                           ` Kenneth Zadeck
2015-02-11 18:18                             ` Jiong Wang
2015-04-14 15:06                               ` Jiong Wang
2015-04-14 16:49                                 ` Steven Bosscher
2015-04-14 17:24                                   ` Jeff Law
2015-04-14 21:49                                     ` Jiong Wang
2015-04-21 14:43                                       ` Jiong Wang
2015-04-24  1:55                                         ` Jeff Law
2015-04-24 17:05                                           ` Jiong Wang
2015-05-14 20:04                                             ` Jeff Law
2015-05-14 22:07                                               ` Jiong Wang
2015-05-14 22:24                                                 ` Jeff Law
2015-05-21 21:51                                                   ` Jiong Wang
2015-05-27 16:11                                                     ` Jeff Law
2015-09-02 13:49                                                       ` Jiong Wang
2015-09-02 20:52                                                         ` Jeff Law
2015-04-28 12:16                                         ` Jiong Wang
2015-04-28 14:00                                           ` Matthew Fortune
2015-04-28 14:31                                             ` Jiong Wang
2015-04-19 16:20                               ` Jiong Wang
2014-12-19 12:09                     ` Eric Botcazou
2014-12-19 15:21                   ` Segher Boessenkool

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5480B6D6.2020201@arm.com \
    --to=jiong.wang@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=richard.guenther@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).