From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-489075-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 64951 invoked by alias); 5 Nov 2018 18:16:50 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 64820 invoked by uid 89); 5 Nov 2018 18:16:22 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=-25.9 required=5.0 tests=BAYES_00,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,KAM_LAZY_DOMAIN_SECURITY autolearn=ham version=3.3.2 spammy=
X-HELO: foss.arm.com
Received: from foss.arm.com (HELO foss.arm.com) (217.140.101.70) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 05 Nov 2018 18:16:21 +0000
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249])	by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 46C2080D;	Mon,  5 Nov 2018 10:16:19 -0800 (PST)
Received: from [10.2.206.82] (e109742-lin.cambridge.arm.com [10.2.206.82])	by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 52A073F718;	Mon,  5 Nov 2018 10:16:18 -0800 (PST)
Subject: Re: [PATCH] combine: Do not combine moves from hard registers
From: Renlin Li <renlin.li@foss.arm.com>
To: Segher Boessenkool <segher@kernel.crashing.org>
Cc: Jeff Law <law@redhat.com>, Christophe Lyon <christophe.lyon@linaro.org>, gcc Patches <gcc-patches@gcc.gnu.org>, bergner@linux.ibm.com
References: <68abf72a5400b96b9a100966331d3ad2056648e7.1540237620.git.segher@kernel.crashing.org> <CAKdteOY-201je46aEmH+28BB0e0nNn8Aonyz2BMp-TvehOUFPQ@mail.gmail.com> <20181023122855.GI5205@gate.crashing.org> <CAKdteObR3gOjwd1nxVOdD1U0GVA9CNuQ96y4OcvHLSRGJ-0ZAw@mail.gmail.com> <20181023222645.GT5205@gate.crashing.org> <CAKdteOa4RoWrn+SC=4nPcN4mxtCeWitX7gUPGASR_5UBVM_E5w@mail.gmail.com> <a27db9c8-97c9-9b62-7d62-769c49220229@foss.arm.com> <20181102230320.GV5994@gate.crashing.org> <20181102235422.GW5994@gate.crashing.org> <55082488-6f99-9ff9-b784-070495905d13@redhat.com> <c9cf3c78-0c1b-e1ea-07a3-fa18055d1cfb@foss.arm.com>
Message-ID: <c0ffbd85-67d3-8249-40ec-2ec6b34a3da6@foss.arm.com>
Date: Mon, 05 Nov 2018 18:16:00 -0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.7.0
MIME-Version: 1.0
In-Reply-To: <c9cf3c78-0c1b-e1ea-07a3-fa18055d1cfb@foss.arm.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
X-IsSubscribed: yes
X-SW-Source: 2018-11/txt/msg00277.txt.bz2


On 11/05/2018 12:35 PM, Renlin Li wrote:
> Hi Segher,
> 
> On 11/03/2018 02:34 AM, Jeff Law wrote:
>> On 11/2/18 5:54 PM, Segher Boessenkool wrote:
>>> On Fri, Nov 02, 2018 at 06:03:20PM -0500, Segher Boessenkool wrote:
>>>>> The original rtx is generated by expand_builtin_setjmp_receiver to adjust
>>>>> the frame pointer.
>>>>>
>>>>> And later in LRA, it will try to eliminate frame_pointer with hard frame
>>>>> pointer which is
>>>>> defined the ELIMINABLE_REGS.
>>>>>
>>>>> Your change split the insn into two.
>>>>> This makes it doesn't match the "from" and "to" regs defined in
>>>>> ELIMINABLE_REGS.
>>>>> The if statement to generate the adjustment insn is been skipt.
>>>>> And the original instruction is just been deleted!
>>>> I don't follow why, or what should have prevented it from being deleted.
>>>>
>>>>> Probably, we don't want to split the move rtx if they are related to
>>>>> entries defined in ELIMINABLE_REGS?
>>>> One thing I can easily do is not making an intermediate pseudo when copying
>>>> *to* a fixed reg, which sfp is.Â  Let me try if that helps the testcase I'm
>>>> looking at (setjmp-4.c).
>>> This indeed helps, see patch below.Â  Could you try that on the whole
>>> testsuite?
>>>
>>> Thanks,
>>>
>>>
>>> Segher
>>>
>>>
>>> p.s. It still is a problem in the arm backend, but this won't hurt combine,
>>> so why not.
>>>
>>>
>>> Â From 814ca23ce05384d017b3c2bff41ab61cf5446e46 Mon Sep 17 00:00:00 2001
>>> Message-Id: <814ca23ce05384d017b3c2bff41ab61cf5446e46.1541202704.git.segher@kernel.crashing.org>
>>> From: Segher Boessenkool <segher@kernel.crashing.org>
>>> Date: Fri, 2 Nov 2018 23:33:32 +0000
>>> Subject: [PATCH] combine: Don't break up copy from hard to fixed reg
>>>
>>> ---
>>> Â  gcc/combine.c | 2 ++
>>> Â  1 file changed, 2 insertions(+)
>>>
>>> diff --git a/gcc/combine.c b/gcc/combine.c
>>> index dfb0b44..15e941a 100644
>>> --- a/gcc/combine.c
>>> +++ b/gcc/combine.c
>>> @@ -14998,6 +14998,8 @@ make_more_copies (void)
>>> Â Â Â Â Â Â Â Â Â  continue;
>>> Â Â Â Â Â Â Â  if (TEST_HARD_REG_BIT (fixed_reg_set, REGNO (src)))
>>> Â Â Â Â Â Â Â Â Â  continue;
>>> +Â Â Â Â Â  if (REG_P (dest) && TEST_HARD_REG_BIT (fixed_reg_set, REGNO (dest)))
>>> +Â Â Â Â Â Â Â  continue;
>>> Â Â Â Â Â Â Â  rtx new_reg = gen_reg_rtx (GET_MODE (dest));
>>> Â Â Â Â Â Â Â  rtx_insn *new_insn = gen_move_insn (new_reg, src);
>>> -- 1.8.3.1
>> It certainly helps the armeb test results.
> 
> Yes, I can also see it helps a lot with the regression test.
> Thanks for working on it!
> 
> 
> Beside the correctness issue, there are performance regression issues as other people also reported.
> 
> I analysised a case, which is gcc.c-torture/execute/builtins/memcpy-chk.c
> In this case, two additional register moves and callee saves are emitted.
> 
> The problem is that, make_more_moves split a move into two. Ideally, the RA could figure out and
> make the best register allocation. However, in reality, scheduler in some cases will reschedule
> the instructions, and which changes the live-range of registers. And thus change the interference graph
> of pseudo registers.
> 
> This will force the RA to choose a different register for it, and make the move instruction not redundant,
> at least, not possible for RA to eliminate it.
> 
> For example,
> 
> set r102, r1
> 
> After combine:
> insn x: set r103, r1
> insn x+1: set r22, r103
> 
> After scheduler:
> insn x: set r103, r1
> ...
> ...
> ...
> insn x+1: set r102, r103
> 
> After IRA, r1 could be assigned to operands used in instructions in between insn x and x+1.
> so r23 is conflicting with r1. LRA has to assign r23 a different hard register.

Sorry, this is not correct. Instructions scheduled between x and x+1 directly use hard register r1.
It is not IRA/LRA assigning r1 to the operands.


To reproduce this particular case, you could use:
cc1  -O3 -marm -march=armv7-a -mfpu=vfpv3-d16 -mfloat-abi=softfp gcc.c-torture/execute/builtins/memcpy-chk.c

This insn is been splitted.

(insn 152 150 154 11 (set (mem/c:QI (plus:SI (reg/f:SI 266)
                 (const_int 24 [0x18])) [0 MEM[(void *)&p + 20B]+4 S1 A32])
         (reg:QI 1 r1)) "memcpy-chk-reduce.c":48:3 189 {*arm_movqi_insn}
      (expr_list:REG_DEAD (reg:QI 1 r1)
         (nil)))


Regards,
Renlin


> This cause one additional move, and probably one more callee save/restore.
> 
> Nothing is obviously wrong here. But...
> 
> One simple case probably not beneficial is to split hard register store.
> According to your comment on make_more_moves, you might want to apply the transformation only
> on hard-reg-to-pseudo-copy?
> 
> Regards,
> Renlin
> 
> 
> 
> 
>>
>> Jeff
>>