From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 64951 invoked by alias); 5 Nov 2018 18:16:50 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 64820 invoked by uid 89); 5 Nov 2018 18:16:22 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-25.9 required=5.0 tests=BAYES_00,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,KAM_LAZY_DOMAIN_SECURITY autolearn=ham version=3.3.2 spammy= X-HELO: foss.arm.com Received: from foss.arm.com (HELO foss.arm.com) (217.140.101.70) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 05 Nov 2018 18:16:21 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 46C2080D; Mon, 5 Nov 2018 10:16:19 -0800 (PST) Received: from [10.2.206.82] (e109742-lin.cambridge.arm.com [10.2.206.82]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 52A073F718; Mon, 5 Nov 2018 10:16:18 -0800 (PST) Subject: Re: [PATCH] combine: Do not combine moves from hard registers From: Renlin Li To: Segher Boessenkool Cc: Jeff Law , Christophe Lyon , gcc Patches , bergner@linux.ibm.com References: <68abf72a5400b96b9a100966331d3ad2056648e7.1540237620.git.segher@kernel.crashing.org> <20181023122855.GI5205@gate.crashing.org> <20181023222645.GT5205@gate.crashing.org> <20181102230320.GV5994@gate.crashing.org> <20181102235422.GW5994@gate.crashing.org> <55082488-6f99-9ff9-b784-070495905d13@redhat.com> Message-ID: Date: Mon, 05 Nov 2018 18:16:00 -0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.7.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit X-IsSubscribed: yes X-SW-Source: 2018-11/txt/msg00277.txt.bz2 On 11/05/2018 12:35 PM, Renlin Li wrote: > Hi Segher, > > On 11/03/2018 02:34 AM, Jeff Law wrote: >> On 11/2/18 5:54 PM, Segher Boessenkool wrote: >>> On Fri, Nov 02, 2018 at 06:03:20PM -0500, Segher Boessenkool wrote: >>>>> The original rtx is generated by expand_builtin_setjmp_receiver to adjust >>>>> the frame pointer. >>>>> >>>>> And later in LRA, it will try to eliminate frame_pointer with hard frame >>>>> pointer which is >>>>> defined the ELIMINABLE_REGS. >>>>> >>>>> Your change split the insn into two. >>>>> This makes it doesn't match the "from" and "to" regs defined in >>>>> ELIMINABLE_REGS. >>>>> The if statement to generate the adjustment insn is been skipt. >>>>> And the original instruction is just been deleted! >>>> I don't follow why, or what should have prevented it from being deleted. >>>> >>>>> Probably, we don't want to split the move rtx if they are related to >>>>> entries defined in ELIMINABLE_REGS? >>>> One thing I can easily do is not making an intermediate pseudo when copying >>>> *to* a fixed reg, which sfp is.  Let me try if that helps the testcase I'm >>>> looking at (setjmp-4.c). >>> This indeed helps, see patch below.  Could you try that on the whole >>> testsuite? >>> >>> Thanks, >>> >>> >>> Segher >>> >>> >>> p.s. It still is a problem in the arm backend, but this won't hurt combine, >>> so why not. >>> >>> >>>  From 814ca23ce05384d017b3c2bff41ab61cf5446e46 Mon Sep 17 00:00:00 2001 >>> Message-Id: <814ca23ce05384d017b3c2bff41ab61cf5446e46.1541202704.git.segher@kernel.crashing.org> >>> From: Segher Boessenkool >>> Date: Fri, 2 Nov 2018 23:33:32 +0000 >>> Subject: [PATCH] combine: Don't break up copy from hard to fixed reg >>> >>> --- >>>   gcc/combine.c | 2 ++ >>>   1 file changed, 2 insertions(+) >>> >>> diff --git a/gcc/combine.c b/gcc/combine.c >>> index dfb0b44..15e941a 100644 >>> --- a/gcc/combine.c >>> +++ b/gcc/combine.c >>> @@ -14998,6 +14998,8 @@ make_more_copies (void) >>>           continue; >>>         if (TEST_HARD_REG_BIT (fixed_reg_set, REGNO (src))) >>>           continue; >>> +      if (REG_P (dest) && TEST_HARD_REG_BIT (fixed_reg_set, REGNO (dest))) >>> +        continue; >>>         rtx new_reg = gen_reg_rtx (GET_MODE (dest)); >>>         rtx_insn *new_insn = gen_move_insn (new_reg, src); >>> -- 1.8.3.1 >> It certainly helps the armeb test results. > > Yes, I can also see it helps a lot with the regression test. > Thanks for working on it! > > > Beside the correctness issue, there are performance regression issues as other people also reported. > > I analysised a case, which is gcc.c-torture/execute/builtins/memcpy-chk.c > In this case, two additional register moves and callee saves are emitted. > > The problem is that, make_more_moves split a move into two. Ideally, the RA could figure out and > make the best register allocation. However, in reality, scheduler in some cases will reschedule > the instructions, and which changes the live-range of registers. And thus change the interference graph > of pseudo registers. > > This will force the RA to choose a different register for it, and make the move instruction not redundant, > at least, not possible for RA to eliminate it. > > For example, > > set r102, r1 > > After combine: > insn x: set r103, r1 > insn x+1: set r22, r103 > > After scheduler: > insn x: set r103, r1 > ... > ... > ... > insn x+1: set r102, r103 > > After IRA, r1 could be assigned to operands used in instructions in between insn x and x+1. > so r23 is conflicting with r1. LRA has to assign r23 a different hard register. Sorry, this is not correct. Instructions scheduled between x and x+1 directly use hard register r1. It is not IRA/LRA assigning r1 to the operands. To reproduce this particular case, you could use: cc1 -O3 -marm -march=armv7-a -mfpu=vfpv3-d16 -mfloat-abi=softfp gcc.c-torture/execute/builtins/memcpy-chk.c This insn is been splitted. (insn 152 150 154 11 (set (mem/c:QI (plus:SI (reg/f:SI 266) (const_int 24 [0x18])) [0 MEM[(void *)&p + 20B]+4 S1 A32]) (reg:QI 1 r1)) "memcpy-chk-reduce.c":48:3 189 {*arm_movqi_insn} (expr_list:REG_DEAD (reg:QI 1 r1) (nil))) Regards, Renlin > This cause one additional move, and probably one more callee save/restore. > > Nothing is obviously wrong here. But... > > One simple case probably not beneficial is to split hard register store. > According to your comment on make_more_moves, you might want to apply the transformation only > on hard-reg-to-pseudo-copy? > > Regards, > Renlin > > > > >> >> Jeff >>