From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 13228 invoked by alias); 6 Dec 2013 00:47:33 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 13218 invoked by uid 89); 6 Dec 2013 00:47:32 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.2
X-HELO: relay1.mentorg.com
Received: from Unknown (HELO relay1.mentorg.com) (192.94.38.131) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 06 Dec 2013 00:47:31 +0000
Received: from svr-orw-fem-01.mgc.mentorg.com ([147.34.98.93]) by relay1.mentorg.com with esmtp id 1VojZb-00015M-Eb from Tom_deVries@mentor.com ; Thu, 05 Dec 2013 16:47:15 -0800
Received: from SVR-IES-FEM-02.mgc.mentorg.com ([137.202.0.106]) by svr-orw-fem-01.mgc.mentorg.com over TLS secured channel with Microsoft SMTPSVC(6.0.3790.4675); Thu, 5 Dec 2013 16:47:15 -0800
Received: from [127.0.0.1] (137.202.0.76) by SVR-IES-FEM-02.mgc.mentorg.com (137.202.0.106) with Microsoft SMTP Server id 14.2.247.3; Fri, 6 Dec 2013 00:47:12 +0000
Message-ID: <52A11E8E.8090103@mentor.com>
Date: Fri, 06 Dec 2013 00:47:00 -0000
From: Tom de Vries
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.1.1
MIME-Version: 1.0
To: Vladimir Makarov
CC:
Subject: Re: [PATCH][IRA] Analysis of register usage of functions for usage by IRA.
References: <510282FE.1060809@mentor.com> <5102A694.5010000@redhat.com> <5113FC6B.7090702@mentor.com> <511C1538.308@redhat.com> <514199BC.9070608@mentor.com>
In-Reply-To: <514199BC.9070608@mentor.com>
Content-Type: multipart/mixed; boundary="------------040207020206020102050906"
X-SW-Source: 2013-12/txt/msg00586.txt.bz2

--------------040207020206020102050906
Content-Type: text/plain; charset="ISO-8859-1"; format=flowed
Content-Transfer-Encoding: 7bit
Content-length: 2241

On 14-03-13 10:34, Tom de Vries wrote:
>> I thought about implementing your optimization for LRA by myself. But it
>> >is ok if you decide to work on it. At least, I am not going to start
>> >this work for a month.
>>> >>I'm also currently looking at how to use the analysis in LRA.
>>> >>AFAIU, in lra-constraints.c we do a backward scan over the insns, and keep track
>>> >>of how many calls we've seen (calls_num), and mark insns with that number. Then
>>> >>when looking at a live-range segment consisting of a def or use insn a and a
>>> >>following use insn b, we can compare the number of calls seen for each insn, and
>>> >>if they're not equal there is at least one call between the 2 insns, and if the
>>> >>corresponding hard register is clobbered by calls, we spill after insn a and
>>> >>restore before insn b.
>>> >>
>>> >>That is too coarse-grained to use with our analysis, since we need to know which
>>> >>calls occur in between insn a and insn b, and more precisely which registers
>>> >>those calls clobbered.
>> >
>>> >>I wonder though if we can do something similar: we keep an array
>>> >>call_clobbers_num[FIRST_PSEUDO_REG], initialized at 0 when we start scanning.
>>> >>When encountering a call, we increase the call_clobbers_num entries for the hard
>>> >>registers clobbered by the call.
>>> >>When encountering a use, we set the call_clobbers_num field of the use to
>>> >>call_clobbers_num[reg_renumber[original_regno]].
>>> >>And when looking at a live-range segment, we compare the clobbers_num field of
>>> >>insn a and insn b, and if it is not equal, the hard register was clobbered by at
>>> >>least one call between insn a and insn b.
>>> >>Would that work? WDYT?
>>> >>
>> >As I understand you looked at live-range splitting code in
>> >lra-constraints.c. To get necessary info you should look at ira-lives.c.
> Unfortunately I haven't been able to find time to work further on the LRA part.
> So if you're still willing to pick up that part, that would be great.

Vladimir,

I gave this a try. The attached patch works for the included test-case for x86_64.

I've bootstrapped and reg-tested the patch (in combination with the other
patches from the series) on x86_64.

OK for stage1?

Thanks,
- Tom

--------------040207020206020102050906
Content-Type: text/x-patch; name="fuse-caller-save-lra.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="fuse-caller-save-lra.patch"
Content-length: 6862

2013-12-04  Tom de Vries

	* lra-int.h (struct lra_reg): Add field actual_call_used_reg_set.
	* lra.c (initialize_lra_reg_info_element): Add init of
	actual_call_used_reg_set field.
	(lra): Call lra_create_live_ranges before lra_inheritance for
	-fuse-caller-save.
	* lra-assigns.c (lra_assign): Allow call_used_regs to cross calls for
	-fuse-caller-save.
	* lra-constraints.c (need_for_call_save_p): Use actual_call_used_reg_set
	instead of call_used_reg_set for -fuse-caller-save.
	* lra-lives.c (process_bb_lives): Calculate actual_call_used_reg_set.

	* gcc.target/i386/fuse-caller-save.c: New test.
	* gcc.dg/ira-shrinkwrap-prep-1.c: Run with -fno-use-caller-save.

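For readers who want the gist of the ChangeLog above without the GCC internals, here is a minimal standalone sketch. It is not GCC code. It assumes hard register sets can be modelled as 32-bit masks, and the names reg_set, pseudo_info, abi_call_used, record_call and need_save_p are hypothetical stand-ins for HARD_REG_SET, struct lra_reg, call_used_reg_set, the accumulation in process_bb_lives and the check in need_for_call_save_p. The partial-clobber case (HARD_REGNO_CALL_PART_CLOBBERED) is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: one bit per hard register (at most 32).  */
typedef uint32_t reg_set;

/* Registers that any call may clobber under the ABI; an assumed
   stand-in for call_used_reg_set.  Here: registers 0..3.  */
static const reg_set abi_call_used = 0x0000000fu;

struct pseudo_info
{
  int hard_regno;            /* Assigned hard register, or -1.  */
  reg_set actual_call_used;  /* Union of clobbers of all calls crossed.  */
};

/* A call clobbering CLOBBERS is seen while the pseudos flagged in LIVE
   are live: fold its clobber set into each live pseudo.  */
static void
record_call (struct pseudo_info *pseudos, const bool *live, int n,
             reg_set clobbers)
{
  for (int i = 0; i < n; i++)
    if (live[i])
      pseudos[i].actual_call_used |= clobbers;
}

/* Return true if pseudo P needs a save/restore around the calls it
   crosses.  When nothing was recorded, fall back to the conservative
   ABI set.  */
static bool
need_save_p (const struct pseudo_info *p)
{
  if (p->hard_regno < 0)
    return false;
  assert (p->hard_regno < 32);
  reg_set used = p->actual_call_used != 0 ? p->actual_call_used
                                          : abi_call_used;
  return ((used >> p->hard_regno) & 1u) != 0;
}
```

In this model a pseudo sitting in register 1 can stay there across a call whose callee is known to clobber only register 2, even though register 1 is call-used under the ABI.
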
diff --git a/gcc/lra-assigns.c b/gcc/lra-assigns.c
index 88fc693..943b349 100644
--- a/gcc/lra-assigns.c
+++ b/gcc/lra-assigns.c
@@ -1413,6 +1413,7 @@ lra_assign (void)
   bitmap_head insns_to_process;
   bool no_spills_p;
   int max_regno = max_reg_num ();
+  unsigned int call_used_reg_crosses_call = 0;
 
   timevar_push (TV_LRA_ASSIGN);
   init_lives ();
@@ -1425,14 +1426,22 @@ lra_assign (void)
   bitmap_initialize (&all_spilled_pseudos, &reg_obstack);
   create_live_range_start_chains ();
   setup_live_pseudos_and_spill_after_risky_transforms (&all_spilled_pseudos);
-#ifdef ENABLE_CHECKING
   for (i = FIRST_PSEUDO_REGISTER; i < max_regno; i++)
     if (lra_reg_info[i].nrefs != 0 && reg_renumber[i] >= 0
         && lra_reg_info[i].call_p
         && overlaps_hard_reg_set_p (call_used_reg_set,
                                     PSEUDO_REGNO_MODE (i), reg_renumber[i]))
-      gcc_unreachable ();
-#endif
+      {
+        if (!flag_use_caller_save)
+          gcc_unreachable ();
+        call_used_reg_crosses_call++;
+      }
+  if (lra_dump_file
+      && call_used_reg_crosses_call > 0)
+    fprintf (lra_dump_file,
+             "Found %u pseudo(s) with a call used reg crossing a call.\n"
+             "Allowing due to -fuse-caller-save\n",
+             call_used_reg_crosses_call);
   /* Setup insns to process on the next constraint pass. */
   bitmap_initialize (&changed_pseudo_bitmap, &reg_obstack);
   init_live_reload_and_inheritance_pseudos ();
diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c
index bb5242a..d0939dc 100644
--- a/gcc/lra-constraints.c
+++ b/gcc/lra-constraints.c
@@ -4438,7 +4438,10 @@ need_for_call_save_p (int regno)
   lra_assert (regno >= FIRST_PSEUDO_REGISTER && reg_renumber[regno] >= 0);
   return (usage_insns[regno].calls_num < calls_num
           && (overlaps_hard_reg_set_p
-              (call_used_reg_set,
+              ((flag_use_caller_save &&
+                ! hard_reg_set_empty_p (lra_reg_info[regno].actual_call_used_reg_set))
+               ? lra_reg_info[regno].actual_call_used_reg_set
+               : call_used_reg_set,
                PSEUDO_REGNO_MODE (regno), reg_renumber[regno])
               || HARD_REGNO_CALL_PART_CLOBBERED (reg_renumber[regno],
                                                  PSEUDO_REGNO_MODE (regno))));
diff --git a/gcc/lra-int.h b/gcc/lra-int.h
index 6d8d80f..f2b8079 100644
--- a/gcc/lra-int.h
+++ b/gcc/lra-int.h
@@ -77,6 +77,10 @@ struct lra_reg
   /* The following fields are defined only for pseudos. */
   /* Hard registers with which the pseudo conflicts. */
   HARD_REG_SET conflict_hard_regs;
+  /* Call used registers with which the pseudo conflicts, taking into account
+     the registers used by functions called from calls which cross the
+     pseudo. */
+  HARD_REG_SET actual_call_used_reg_set;
   /* We assign hard registers to reload pseudos which can occur in few
      places. So two hard register preferences are enough for them.
      The following fields define the preferred hard registers. If
diff --git a/gcc/lra-lives.c b/gcc/lra-lives.c
index efc19f2..774d6c2 100644
--- a/gcc/lra-lives.c
+++ b/gcc/lra-lives.c
@@ -624,6 +624,17 @@ process_bb_lives (basic_block bb, int &curr_point)
 
       if (call_p)
         {
+          if (flag_use_caller_save)
+            {
+              HARD_REG_SET this_call_used_reg_set;
+              get_call_reg_set_usage (curr_insn, &this_call_used_reg_set,
+                                      call_used_reg_set);
+
+              EXECUTE_IF_SET_IN_SPARSESET (pseudos_live, j)
+                IOR_HARD_REG_SET (lra_reg_info[j].actual_call_used_reg_set,
+                                  this_call_used_reg_set);
+            }
+
           sparseset_ior (pseudos_live_through_calls,
                          pseudos_live_through_calls, pseudos_live);
           if (cfun->has_nonlocal_label
diff --git a/gcc/lra.c b/gcc/lra.c
index d0d9bcb..599f95a 100644
--- a/gcc/lra.c
+++ b/gcc/lra.c
@@ -1427,6 +1427,7 @@ initialize_lra_reg_info_element (int i)
   lra_reg_info[i].no_stack_p = false;
 #endif
   CLEAR_HARD_REG_SET (lra_reg_info[i].conflict_hard_regs);
+  CLEAR_HARD_REG_SET (lra_reg_info[i].actual_call_used_reg_set);
   lra_reg_info[i].preferred_hard_regno1 = -1;
   lra_reg_info[i].preferred_hard_regno2 = -1;
   lra_reg_info[i].preferred_hard_regno_profit1 = 0;
@@ -2343,7 +2344,18 @@ lra (FILE *f)
           lra_eliminate (false, false);
           /* Do inheritance only for regular algorithms. */
           if (! lra_simple_p)
-            lra_inheritance ();
+            {
+              if (flag_use_caller_save)
+                {
+                  if (live_p)
+                    lra_clear_live_ranges ();
+                  /* As a side-effect of lra_create_live_ranges, we calculate
+                     actual_call_used_reg_set, which is needed during
+                     lra_inheritance. */
+                  lra_create_live_ranges (true);
+                }
+              lra_inheritance ();
+            }
           if (live_p)
             lra_clear_live_ranges ();
           /* We need live ranges for lra_assign -- so build them. */
diff --git a/gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-1.c b/gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-1.c
index 54d3e76..a386fab 100644
--- a/gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-1.c
+++ b/gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile { target { { x86_64-*-* && lp64 } || { powerpc*-*-* && lp64 } } } } */
-/* { dg-options "-O3 -fdump-rtl-ira -fdump-rtl-pro_and_epilogue" } */
+/* { dg-options "-O3 -fdump-rtl-ira -fdump-rtl-pro_and_epilogue -fno-use-caller-save" } */
 
 long __attribute__((noinline, noclone))
 foo (long a)
diff --git a/gcc/testsuite/gcc.target/i386/fuse-caller-save.c b/gcc/testsuite/gcc.target/i386/fuse-caller-save.c
new file mode 100644
index 0000000..c5d620c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/fuse-caller-save.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fuse-caller-save -fdump-rtl-reload" } */
+/* Testing -fuse-caller-save optimization option. */
+
+static int __attribute__((noinline))
+bar (int x)
+{
+  return x + 3;
+}
+
+int __attribute__((noinline))
+foo (int y)
+{
+  return y + bar (y);
+}
+
+int
+main (void)
+{
+  return !(foo (5) == 13);
+}
+
+/* { dg-final { scan-rtl-dump-times "Found 1 pseudo.* with a call used reg crossing a call" 1 "reload" } } */
+/* { dg-final { scan-rtl-dump-times "Found .* pseudo.* with a call used reg crossing a call" 1 "reload" } } */
+/* { dg-final { scan-rtl-dump-times "Allowing due to -fuse-caller-save" 1 "reload" } } */
+/* { dg-final { cleanup-rtl-dump "reload" } } */

--------------040207020206020102050906--
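
The per-register counter idea from the quoted discussion (the call_clobbers_num array) was only described in prose, so here is a rough standalone sketch of how it might look. This is not what the attached patch implements, and none of it is GCC code. NUM_HARD_REGS, struct clobber_scan and the scan_* helpers are hypothetical, and clobber sets are again assumed to fit in a 32-bit mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NUM_HARD_REGS 16   /* Hypothetical register file size.  */

/* One clobber counter per hard register, as in the proposed
   call_clobbers_num array.  */
struct clobber_scan
{
  unsigned counts[NUM_HARD_REGS];
};

/* All counters start at 0 when the scan begins.  */
static void
scan_init (struct clobber_scan *s)
{
  memset (s->counts, 0, sizeof s->counts);
}

/* On a call, bump the counter of every hard register it clobbers.  */
static void
scan_call (struct clobber_scan *s, uint32_t clobber_mask)
{
  for (int r = 0; r < NUM_HARD_REGS; r++)
    if (clobber_mask & (UINT32_C (1) << r))
      s->counts[r]++;
}

/* On a def or use, return the stamp to record for that insn: the current
   counter of the hard register assigned to the pseudo.  */
static unsigned
scan_use (const struct clobber_scan *s, int hard_regno)
{
  assert (hard_regno >= 0 && hard_regno < NUM_HARD_REGS);
  return s->counts[hard_regno];
}

/* Two insns delimiting a live-range segment: differing stamps mean at
   least one call in between clobbered the hard register.  */
static bool
clobbered_between (unsigned stamp_a, unsigned stamp_b)
{
  return stamp_a != stamp_b;
}
```

The stamp comparison only says whether some intervening call clobbered the register, which is all the spill/restore decision in the quoted description needs.
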