Date: Mon, 05 Nov 2018 22:50:00 -0000
From: Segher Boessenkool
To: Renlin Li
Cc: Jeff Law, Christophe Lyon, gcc Patches, bergner@linux.ibm.com
Subject: Re: [PATCH] combine: Do not combine moves from hard registers
Message-ID: <20181105225004.GA5994@gate.crashing.org>

Hi!

On Mon, Nov 05, 2018 at 12:35:24PM +0000, Renlin Li wrote:
> >>--- a/gcc/combine.c
> >>+++ b/gcc/combine.c
> >>@@ -14998,6 +14998,8 @@ make_more_copies (void)
> >>       continue;
> >>     if (TEST_HARD_REG_BIT (fixed_reg_set, REGNO (src)))
> >>       continue;
> >>+    if (REG_P (dest) && TEST_HARD_REG_BIT (fixed_reg_set, REGNO (dest)))
> >>+      continue;
> >>
> >>     rtx new_reg = gen_reg_rtx (GET_MODE (dest));
> >>     rtx_insn *new_insn = gen_move_insn (new_reg, src);
> >>--
> >>1.8.3.1
> >It certainly helps the armeb test results.
>
> Yes, I can also see it helps a lot with the regression test.
> Thanks for working on it!

I committed a variant that does this only for frame_pointer_rtx, as r265821.

> Besides the correctness issue, there are performance regression issues,
> as other people have also reported.
>
> I analyzed one case, gcc.c-torture/execute/builtins/memcpy-chk.c.
> In this case, two additional register moves and callee saves are emitted.
>
> The problem is that make_more_copies splits a move into two.  Ideally,
> the RA could figure this out and make the best register allocation.
> However, in reality the scheduler will in some cases reschedule the
> instructions, which changes the live ranges of registers and thus the
> interference graph of the pseudo registers.
>
> This forces the RA to choose a different register, so the move
> instruction is no longer redundant, or at least the RA can no longer
> eliminate it.
>
> For example,
>
> set r102, r1
>
> After combine:
> insn x:   set r103, r1
> insn x+1: set r102, r103
>
> After scheduler:
> insn x:   set r103, r1
> ...
> ...
> ...
> insn x+1: set r102, r103
>
> After IRA, r1 could be assigned to operands used in instructions in
> between insn x and x+1, so r103 conflicts with r1.  LRA has to assign
> r103 a different hard register.
> This causes one additional move, and probably one more callee
> save/restore.
>
> Nothing is obviously wrong here.  But...
>
> One simple case that is probably not beneficial is splitting a hard
> register store.

You mean a store from a hard reg directly to memory?  Leaving that
constrains scheduling.

> According to your comment on make_more_copies, you might want to apply
> the transformation only to hard-reg-to-pseudo copies?

hard-reg-to-anything really.  Actually making it do this only for pseudos
caused a lot of degradation for some targets iirc.  I can do more tests.

Almost all reported degradations are cases where RA does not make the
best decision.  There are other cases where this combine change makes
better code than before.  (If this was not true, a greedy RA would work
well.)


Segher
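
The r102/r103 example quoted above can be made concrete with a small sketch in C.  This is a toy model, not GCC code: the names toy_insn and can_share_hard_reg, and the filler registers r110 and r111, are made up for illustration, and the check is far cruder than what IRA and LRA really do.  It only asks whether anything writes the hard register while the new pseudo is still live, which is the situation the scheduler can create.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model, NOT GCC's RTL: every instruction is "set dest, src".
   Register numbers below 32 stand for hard registers here (an
   assumption of this sketch), everything else is a pseudo.  */
struct toy_insn {
    int dest;
    int src;
};

/* Return true if PSEUDO could be given hard register HARD, that is,
   if no instruction strictly between PSEUDO's definition and its last
   use overwrites HARD.  A hypothetical helper, not a GCC function.  */
bool can_share_hard_reg(const struct toy_insn *insns, size_t n,
                        int pseudo, int hard)
{
    size_t def = n, last_use = n;

    for (size_t i = 0; i < n; i++) {
        if (insns[i].dest == pseudo && def == n)
            def = i;            /* first definition */
        if (insns[i].src == pseudo)
            last_use = i;       /* keep the latest use */
    }
    if (def == n || last_use == n)
        return true;            /* never defined or never used */

    for (size_t i = def + 1; i < last_use; i++)
        if (insns[i].dest == hard)
            return false;       /* HARD is clobbered while PSEUDO is live */
    return true;
}

/* Right after combine: the two copies are adjacent.  */
const struct toy_insn after_combine[] = {
    { 103, 1 },     /* insn x:   set r103, r1   */
    { 102, 103 },   /* insn x+1: set r102, r103 */
};

/* After scheduling: unrelated work that reuses r1 sits in between.
   r110 and r111 are made-up pseudos.  */
const struct toy_insn after_sched[] = {
    { 103, 1 },     /* insn x:   set r103, r1   */
    { 1, 110 },     /* r1 reused for something else */
    { 111, 1 },
    { 102, 103 },   /* insn x+1: set r102, r103 */
};
```

Called on after_combine with pseudo 103 and hard register 1, the helper returns true: r103 can simply live in r1 and the extra copy disappears.  On after_sched it returns false, because r1 is overwritten while r103 is still live, so r103 needs some other register and the move stays.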