From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-386531-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 6936 invoked by alias); 4 Dec 2014 11:07:33 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 6923 invoked by uid 89); 4 Dec 2014 11:07:33 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-2.1 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,RCVD_IN_DNSWL_LOW,SPF_PASS autolearn=ham version=3.3.2
X-HELO: mail-oi0-f53.google.com
Received: from mail-oi0-f53.google.com (HELO mail-oi0-f53.google.com) (209.85.218.53) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-SHA encrypted) ESMTPS; Thu, 04 Dec 2014 11:07:32 +0000
Received: by mail-oi0-f53.google.com with SMTP id x69so11853382oia.26        for <gcc-patches@gcc.gnu.org>; Thu, 04 Dec 2014 03:07:30 -0800 (PST)
MIME-Version: 1.0
X-Received: by 10.60.250.135 with SMTP id zc7mr6282373oec.54.1417691250361; Thu, 04 Dec 2014 03:07:30 -0800 (PST)
Received: by 10.76.174.2 with HTTP; Thu, 4 Dec 2014 03:07:30 -0800 (PST)
In-Reply-To: <CAFiYyc1jauY_hejCfgU88DXtaSCCSZDUMiKMb678KqQ_QrMzrQ@mail.gmail.com>
References: <54803EBE.2060607@arm.com>	<CAFiYyc1jauY_hejCfgU88DXtaSCCSZDUMiKMb678KqQ_QrMzrQ@mail.gmail.com>
Date: Thu, 04 Dec 2014 11:07:00 -0000
Message-ID: <CAFiYyc0gEQt_Ci1TyCfYys=JnZMr8FmYW7dFtq+mBmqKjeuttw@mail.gmail.com>
Subject: Re: [PATCH] PR 62173, re-shuffle insns for RTL loop invariant hoisting
From: Richard Biener <richard.guenther@gmail.com>
To: Jiong Wang <jiong.wang@arm.com>
Cc: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Content-Type: text/plain; charset=UTF-8
X-IsSubscribed: yes
X-SW-Source: 2014-12/txt/msg00360.txt.bz2

On Thu, Dec 4, 2014 at 12:07 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Thu, Dec 4, 2014 at 12:00 PM, Jiong Wang <jiong.wang@arm.com> wrote:
>> For PR62173, the ideal solution is to resolve the problem on tree level
>> ivopt pass.
>>
>> While, apart from the tree level issue, PR 62173 also exposed another two
>> RTL level issues.
>> one of them is looks like we could improve RTL level loop invariant hoisting
>> by re-shuffle insns.
>>
>> for Seb's testcase
>>
>> void bar(int i) {
>>   char A[10];
>>   int d = 0;
>>   while (i > 0)
>>   A[d++] = i--;
>>
>>   while (d > 0)
>>   foo(A[d--]);
>> }
>>
>> the insn sequences to calculate A[I]'s address looks like:
>>
>> (insn 76 75 77 22 (set (reg/f:DI 109)
>>   (plus:DI (reg/f:DI 64 sfp)
>>   (reg:DI 108 [ i ]))) seb-pop.c:8 84 {*adddi3_aarch64}
>>   (expr_list:REG_DEAD (reg:DI 108 [ i ])
>>   (nil)))
>> (insn 77 76 78 22 (set (reg:SI 110 [ D.2633 ])
>>   (zero_extend:SI (mem/j:QI (plus:DI (reg/f:DI 109)
>>   (const_int -16 [0xfffffffffffffff0])) [0 A S1 A8]))) seb-pop.c:8 76
>> {*zero_extendqisi2_aarch64}
>>   (expr_list:REG_DEAD (reg/f:DI 109)
>>   (nil)))
>>
>> while for most RISC archs, reg + reg addressing is typical, so if we
>> re-shuffle
>> the instruction sequences into the following:
>>
>> (insn 96 94 97 22 (set (reg/f:DI 129)
>>   (plus:DI (reg/f:DI 64 sfp)
>>   (const_int -16 [0xfffffffffffffff0]))) seb-pop.c:8 84 {*adddi3_aarch64}
>>   (nil))
>> (insn 97 96 98 22 (set (reg:DI 130 [ i ])
>>   (sign_extend:DI (reg/v:SI 97 [ i ]))) seb-pop.c:8 70
>> {*extendsidi2_aarch64}
>>   (expr_list:REG_DEAD (reg/v:SI 97 [ i ])
>>   (nil)))
>> (insn 98 97 99 22 (set (reg:SI 131 [ D.2633 ])
>>   (zero_extend:SI (mem/j:QI (plus:DI (reg/f:DI 129)
>>   (reg:DI 130 [ i ])) [0 A S1 A8]))) seb-pop.c:8 76
>> {*zero_extendqisi2_aarch64}
>>   (expr_list:REG_DEAD (reg:DI 130 [ i ])
>>   (expr_list:REG_DEAD (reg/f:DI 129)
>>   (nil))))
>>
>> which means re-associate the constant imm with the virtual frame pointer.
>>
>> transform
>>
>>      RA <- fixed_reg + RC
>>      RD <- MEM (RA + const_offset)
>>
>>   into:
>>
>>      RA <- fixed_reg + const_offset
>>      RD <- MEM (RA + RC)
>>
>> then RA <- fixed_reg + const_offset is actually loop invariant, so the later
>> RTL GCSE PRE pass could catch it and do the hoisting, and thus ameliorate
>> what tree
>> level ivopts could not sort out.
>
> There is a LIM pass after gimple ivopts - if the invariantness is already
> visible there why not handle it there similar to the special-cases in
> rewrite_bittest and rewrite_reciprocal?
>
> And of course similar tricks could be applied on the RTL level to
> RTL invariant motion?

Oh, and the patch misses a testcase.

> Thanks,
> Richard.
>
>> and this patch only tries to re-shuffle instructions within single basic
>> block which
>> is a inner loop which is perf critical.
>>
>> I am reusing the loop info in fwprop because there is loop info and it's run
>> before
>> GCSE.
>>
>> verified on aarch64 and mips64, the array base address hoisted out of loop.
>>
>> bootstrap ok on x86-64 and aarch64.
>>
>> comments?
>>
>> thanks.
>>
>> gcc/
>>   PR62173
>>   fwprop.c (prepare_for_gcse_pre): New function.
>>   (fwprop_done): Call it.